
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE> Token Scanner - Macro preprocessor and C Parser </TITLE>
</HEAD>

<BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
 TEXT="#000000" LINK="#0033CC"
 VLINK="#CC0033" ALINK="#CC0099">

<P>
<IMG ALIGN="right" SRC="../../images/agrsl6c.gif" ALT="AnaGram"
         WIDTH=124 HEIGHT=30 >
<BR CLEAR="all">
Back to :
<A HREF="../../index.html">Index</A> |
<A HREF="index.html">Macro preprocessor overview</A>
<P>
<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
        WIDTH=1010 HEIGHT=2  >
<P>

<H1> Token Scanner - Macro preprocessor and C Parser   </H1>
<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
        WIDTH=1010 HEIGHT=2  >
<P>
<BR>

<H2>Introduction</H2>

          The token scanner module, <tt>ts.syn</tt>, accomplishes the following
          tasks:
<OL>
   <LI>          It reads the raw input, gathers tokens and identifies
               them. </LI>
   <LI>          It analyzes conditional compilation directives and
               skips over text that is to be omitted. </LI>
   <LI>         It analyzes macro definitions and maintains the macro
               tables. </LI>
    <LI>         It identifies macro calls in the input stream and calls
               the <tt>macro_expand()</tt> function to expand them. </LI>
    <LI>         It recognizes <tt>#include</tt> statements and calls itself
               recursively to parse the include file. </LI>
</OL>

          The token scanner parser, <tt>ts()</tt>, is called from a shell
          function, <tt>scan_input(char *)</tt>, which takes the name
          of a file
          as an argument. <tt>scan_input()</tt> opens the file, calls
          <tt>ts()</tt>, and
          closes the file. <tt>scan_input()</tt> is called recursively by
          <tt>include_file()</tt> when an <tt>#include</tt> statement
          is found in the
          input.
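<P>
          As a rough illustration of this open, scan, close cycle, the
          C sketch below recurses on <tt>#include</tt> lines. It is not the
          code in <tt>ts.syn</tt>: <tt>scan_stream()</tt> is a hypothetical
          stand-in for <tt>ts()</tt> that only counts lines, the
          <tt>lines_scanned</tt> counter and the <tt>MAX_INCLUDE_DEPTH</tt>
          limit are assumptions made for the example, and only the quoted
          form of <tt>#include</tt> is handled.

```c
#include <stdio.h>
#include <string.h>

#define MAX_INCLUDE_DEPTH 16   /* assumed limit, guards against runaway recursion */

static int include_depth;
int lines_scanned;             /* stands in for "tokens sent to the sink" */

int scan_input(const char *path);

/* Hypothetical stand-in for ts(): reads lines, recursing on #include "file". */
static int scan_stream(FILE *fp) {
    char line[256];
    char name[200];
    while (fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, " # include \"%199[^\"]\"", name) == 1) {
            if (scan_input(name) != 0) return -1;
        } else {
            lines_scanned++;
        }
    }
    return 0;
}

/* Open the file, scan it, close it. Returns 0 on success, -1 on failure. */
int scan_input(const char *path) {
    if (include_depth == MAX_INCLUDE_DEPTH) return -1;
    FILE *fp = fopen(path, "r");
    if (fp == NULL) return -1;
    include_depth++;
    int status = scan_stream(fp);
    include_depth--;
    fclose(fp);
    return status;
}
```

<P>
          Each call keeps its own stream and line buffer on the C stack,
          so returning from the nested call resumes the outer file where
          it left off. In the real scanner the state that must survive an
          include is the information held in <tt>file_descriptor</tt>.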
<P>
          Output from the token scanner is directed to a token_sink
          pointed to by the <tt>scanner_sink</tt> global variable. The main
          program may set <tt>scanner_sink</tt> to point to either a
          <tt>token_translator</tt> or a <tt>c_parser</tt>. During the
          course of
          processing, the token scanner redirects output to a token
          accumulator or to the conditional expression evaluator, as
          necessary, by temporarily changing the value of
          <tt>scanner_sink</tt>.
<P>
          The token scanner module contains two syntax error
          diagnostic procedures: <tt>syntax_error(char *)</tt> and
          <tt>syntax_error_scanning(char *)</tt>. The former is set up to
          provide correct line and column numbers for functions called
          from reduction procedures in the token scanner. The latter
          is set up to provide line and column numbers for errors
          discovered in the scanner itself. Both functions accept a
          pointer to an error message.
<P>
<BR>

<H2>     Theory of Operation  </H2>

          The primary purpose of the token scanner is to identify the
          C language tokens in the input file and pass them on to
          another module for further processing. In order to package
          them for transmission, the token scanner maintains a "token
          dictionary", <tt>td</tt>, which enables it to characterize each
          distinct input token with a single number. The token scanner
          also classifies tokens according to the definitions of the C
          language. The "token" that it passes on for further
          processing is a pair consisting of an id field, and a value
          field. The id field is defined by the <tt>token_id</tt>
          enumeration
          in <tt>token.h</tt>. The value field is the index of the
          token in the
          token dictionary, <tt>td</tt>.
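<P>
          The id/value pairing can be pictured with a minimal C sketch.
          Everything named here is an assumption made for illustration:
          the three enumerators stand in for the real <tt>token_id</tt>
          enumeration in <tt>token.h</tt>, and <tt>td_identify()</tt>,
          <tt>make_token()</tt> and <tt>TD_MAX</tt> are hypothetical. The
          linear search is chosen for brevity and says nothing about how
          the actual dictionary is organized.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical subset of a token_id enumeration; the real one is in token.h. */
enum token_id { NAME, DECconstant, STRINGliteral };

/* A token as handed to the sink: a classification plus a dictionary index. */
struct token {
    enum token_id id;
    unsigned value;
};

#define TD_MAX 1024            /* assumed capacity */
static char *td_strings[TD_MAX];
static unsigned td_count;

/* Return the index of text in the dictionary, adding it on first sight.
   Returns TD_MAX if the dictionary is full or memory runs out. */
unsigned td_identify(const char *text) {
    for (unsigned i = 0; i < td_count; i++)
        if (strcmp(td_strings[i], text) == 0) return i;
    if (td_count == TD_MAX) return TD_MAX;
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);
    if (copy == NULL) return TD_MAX;
    memcpy(copy, text, n);
    td_strings[td_count] = copy;
    return td_count++;
}

/* Package a spelling and its classification as one token. */
struct token make_token(enum token_id id, const char *text) {
    struct token t = { id, td_identify(text) };
    return t;
}
```

<P>
          Two occurrences of the same spelling receive the same value, so
          later stages can compare tokens by number instead of by string.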
<P>
          To support its primary purpose, the token scanner deals with
          several other problems. First, it identifies preprocessor
          control lines which control conditional compilation and
          skips input appropriately. Second, it fields <tt>#include</tt>
          statements, and recurses to process include files. Third, it
          fields <tt>#define</tt> statements and manages the macro definition
          tables. Finally, it checks the tokens it identifies and
          calls the macro/argument expansion module to expand them if
          they turn out to be macros.
<P>
          The conditional compilation logic in the token scanner is
          carried out in its entirety by syntactic means. The only C
          code involved deals with evaluating conditional statements.
          <tt>#ifdef</tt> and <tt>#ifndef</tt> are quite
          straightforward. <tt>#if</tt> is another
          matter. To deal with the generality of this statement, token
          scanner output is diverted to the expression evaluator
          module, <tt>ex.syn</tt>, where the expression is evaluated. The
          outcome of the calculation is then used to control a
          semantically determined production in the token scanner.
<P>
          Processing <tt>#include</tt> statements is reasonably
          straightforward. Token scanner output is diverted to the
          token accumulator, <tt>ta</tt>. The content of the token accumulator
          is then translated back to ASCII string form. This takes
          care of macro calls in the <tt>#include</tt> statement. Once the file
          has been identified, <tt>scan_input()</tt> is called recursively to
          deal with it.
<P>
          The only complication with macro definitions is that the
          tokens which comprise the body of a macro must not be
          expanded until the macro is invoked. For that reason, there
          are two different definitions of token in the token scanner:
          "simple token" and "expanded token". The difference is that
          simple tokens are not checked for macro calls. When a macro
          definition is encountered, the token scanner output is
          diverted to the token accumulator, so that the body of the
          macro can be captured and stored.
<P>
          When a macro call is recognized, the token scanner must pick
          up the arguments for the macro. There are three
          complications here: First, the tokens must not be scanned
          for macros; second, the scan must distinguish the commas
          that separate arguments from commas that may be contained
          inside balanced parentheses within an argument; and finally,
          leading white space tokens do not count as argument tokens.
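<P>
          The second and third complications can be sketched in C as
          follows. This is a simplification, not the grammar in
          <tt>ts.syn</tt>: it works on characters rather than tokens, it
          ignores string literals and character constants, and the names
          <tt>gather_args()</tt>, <tt>MAX_ARGS</tt> and <tt>MAX_ARG_LEN</tt>
          are hypothetical. The input is assumed to start just after the
          opening parenthesis of the macro call.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 8        /* assumed limits for the sketch */
#define MAX_ARG_LEN 64

/* Split the text following a macro's opening '(' into arguments.
   Returns the argument count, or -1 on unbalanced input or overflow. */
int gather_args(const char *p, char args[MAX_ARGS][MAX_ARG_LEN]) {
    int count = 0;      /* completed arguments */
    int depth = 0;      /* nesting depth of parentheses inside the argument */
    size_t len = 0;     /* characters stored in the current argument */
    int started = 0;    /* has a nonblank character been seen yet? */

    for (; *p != '\0'; p++) {
        char c = *p;
        if (depth == 0 && (c == ',' || c == ')')) {
            /* "()" is an empty list, not one empty argument */
            if (c == ')' && count == 0 && !started) return 0;
            if (count == MAX_ARGS) return -1;
            args[count][len] = '\0';
            count++;
            len = 0;
            started = 0;
            if (c == ')') return count;
            continue;
        }
        if (!started && (c == ' ' || c == '\t')) continue;  /* leading space */
        started = 1;
        if (c == '(') depth++;
        else if (c == ')') depth--;
        if (count == MAX_ARGS || len + 1 >= MAX_ARG_LEN) return -1;
        args[count][len++] = c;
    }
    return -1;  /* ran out of input before the closing ')' */
}
```

<P>
          Given the text <tt>a, f(b, c), d)</tt>, the sketch reports three
          arguments; the comma inside <tt>f(b, c)</tt> is protected by the
          nested parentheses and does not split the second argument.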
<P>
<BR>

<H2>     Elements of the Token Scanner  </H2>

          The remainder of this document describes the macro
          definitions, the structure definitions, the static data
          definitions, the configuration parameter settings, and the
          non-terminal parsing tokens used in the token scanner. In
          <tt>ts.syn</tt> itself, each function definition is
          preceded by a short explanation of its purpose.
<P>
<BR>

<H2>     Macro definitions  </H2>
<DL>
<DT>     <tt>GET_CONTEXT</tt>
   <DD>       The <tt>GET_CONTEXT</tt> macro provides the parser with context
          information for the input character. (Instead of writing a
          <tt>GET_CONTEXT</tt> macro, the context information could be stored
          as part of <tt>GET_INPUT</tt>.)

<DT>     <tt>GET_INPUT</tt>
    <DD>      The <tt>GET_INPUT</tt> macro provides the next input
          character for
          the parser. If the parser used <b>pointer input</b> or <b>event
          driven</b> input, a <tt>GET_INPUT</tt> macro would not be
          necessary. The
          default for <tt>GET_INPUT</tt> would read <tt>stdin</tt> and
          so is not
          satisfactory for this parser.

<DT>     <tt>PCB</tt>
   <DD>       Since the <b>declare pcb</b> switch has been turned off, AnaGram
          will not define <tt>PCB</tt>, so <tt>ts.syn</tt> defines it. Making
          the parser control block part of the file descriptor structure
          simplifies saving and restoring the pcb for nested
          <tt>#include</tt> files.

<DT>     <tt>SYNTAX_ERROR</tt>
   <DD>       <tt>ts.syn</tt> defines the <tt>SYNTAX_ERROR</tt> macro,
          since otherwise the
          generated parser would use the default definition of
          <tt>SYNTAX_ERROR</tt>, which would not provide the name of the file
          currently being read.
</DL>
<P>
<BR>

<H2>     Local Structure Definitions </H2>
<DL><DT>     <tt>location</tt>
   <DD>       <tt>location</tt> is a structure which records a line
          number and a
          column number. It is handed to AnaGram with the context type
          statement found in the configuration segment. AnaGram then
          declares two member fields of type <tt>location</tt> in the parser
          control block: <tt>input_context</tt> and a stack, <tt>cs</tt>. In
          <tt>scan_input()</tt>, the <tt>input_context</tt> variable
	  is set explicitly
          with the current line and column number. In <tt>syntax_error()</tt>
          the <tt>CONTEXT</tt> macro is used to extract the line and column
          number at which the rule currently being reduced started.

<DT>     <tt>file_descriptor</tt>
   <DD>       <tt>file_descriptor</tt> contains the information that
          needs to be
          saved and restored when nested include files are processed.
</DL>
<P>
<BR>

<H2>     Static Variables  </H2>
<DL><DT>     <tt>error_modifier</tt>
   <DD>       Type: <tt>char *</tt><BR>

          The string identified by <tt>error_modifier</tt> is added to the
          error diagnostic printed by <tt>syntax_error()</tt>. Normally it is
          an empty string; however, when macros are being expanded it
          is set so that the diagnostic will specify that the error
          was found inside a macro expansion.

<DT>     <tt>input</tt>
    <DD>      Type: <tt>file_descriptor</tt><BR>

          <tt>input</tt> provides the name and stream pointer for the
          currently active
          input file.

<DT>     <tt>save_sink</tt>
    <DD>      Type: <tt>stack&lt;token_sink *&gt;</tt><BR>

          This stack provides for saving and restoring <tt>scanner_sink</tt>
          when it is necessary to divert the scanner output for
          dealing with conditional expressions, macro definitions and
          macro arguments. Actually, a stack is not necessary, since
          such diversions never nest more than one level deep, but it
          seems clearer to use a stack.
</DL>
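<P>
          The save and restore discipline around <tt>scanner_sink</tt> might
          look like the following C sketch. The source declares
          <tt>save_sink</tt> as a <tt>stack&lt;token_sink *&gt;</tt>; the fixed
          array used here, the simplified <tt>token_sink</tt> structure, and
          the functions <tt>divert_output()</tt>, <tt>restore_output()</tt>,
          <tt>emit()</tt> and <tt>count_token()</tt> are all assumptions made
          for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for the token_sink type: anything that accepts tokens. */
typedef struct token_sink {
    void (*put)(struct token_sink *self, int token);
    int received;   /* count of tokens seen, for illustration only */
} token_sink;

/* A trivial sink that only counts what it is given. */
void count_token(token_sink *self, int token) {
    (void)token;
    self->received++;
}

token_sink *scanner_sink;          /* where the scanner currently sends output */

#define SAVE_DEPTH 4               /* assumed; the text notes one level suffices */
static token_sink *save_sink[SAVE_DEPTH];
static size_t save_top;

/* Redirect scanner output, remembering the previous sink. Returns 0 on success. */
int divert_output(token_sink *temporary) {
    if (save_top == SAVE_DEPTH) return -1;
    save_sink[save_top++] = scanner_sink;
    scanner_sink = temporary;
    return 0;
}

/* Undo the most recent diversion. Returns 0 on success, -1 if none is active. */
int restore_output(void) {
    if (save_top == 0) return -1;
    scanner_sink = save_sink[--save_top];
    return 0;
}

/* Send one token to whichever sink is current. */
void emit(int token) {
    scanner_sink->put(scanner_sink, token);
}
```

<P>
          Tokens emitted between <tt>divert_output()</tt> and
          <tt>restore_output()</tt> reach only the temporary sink, which is
          how a macro body or a conditional expression can be captured
          without disturbing the main output.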
<P>
<BR>

<H2>     Configuration Parameters </H2>
<DL><DT>     <tt>~allow macros</tt>
   <DD>       This statement turns off the <b>allow macros</b> switch so that
          AnaGram implements all reduction procedures as explicit
          function definitions. This simplifies debugging at the cost
          of a slight performance degradation.

<DT>     <tt>auto resynch</tt>
    <DD>      This switch turns on automatic resynchronization in case a
          syntax error is encountered by the token scanner.

<DT>     <tt>context type = location</tt>
   <DD>       This statement specifies that the generated parser is to
          track context automatically. The context variables have type
          <tt>location</tt>. <tt>location</tt> is defined elsewhere to
          consist of two
          fields: line number and column number.

<DT>     <tt>~declare pcb</tt>
   <DD>       This statement tells AnaGram not to declare a parser control
          block for the parser. The parser control block is declared
          later as part of the <tt>file_descriptor</tt> structure.

<DT>     <tt>~error frame</tt>
   <DD>       This turns off the error frame portion of the automatic
          syntax error diagnostic generator, since the context of the
          error in the scanner syntax is of little interest. If an
          error frame were to be used in diagnostics, that of the C
          parser would be more appropriate.

<DT>     <tt>error trace</tt>
  <DD>        This turns on the <b>error trace</b> functionality, so
          that if the token
          scanner encounters a syntax error it will write an <tt>.etr</tt>
          file.

<DT>     <tt>line numbers</tt>
   <DD>       This statement causes AnaGram to include <tt>#line</tt>
          statements in
          the parser file so that your compiler can provide
          diagnostics keyed to your syntax file.

<DT>     <tt>subgrammar</tt>
   <DD>       The basic token grammar for C is usually implemented using
          some sort of regular expression parser, such as <tt>lex</tt>, which
          always looks for the longest match to the regular
          expression. In no case does the regular expression parser
          use what follows a match to determine the nature of the
          match. An LALR parser generator, on the other hand, normally
          looks not only at the content of a token but also looks
          ahead. The subgrammar declaration tells AnaGram not to look
          ahead but to parse these tokens based only on their internal
          structure. Thus the conflicts that would normally be
          detected are not seen. To see what happens if lookahead is
          allowed, simply comment out any one of these subgrammar
          statements and look at the conflicts that result.

<DT>     <tt>~test range</tt>
   <DD>       This statement tells AnaGram not to check input characters
          to see if they are within allowable limits. This checking is
          not necessary since the token scanner is reading a text file
          and cannot possibly get an out of range token.
</DL>
<P>
<BR>

<H2>     Scanner Tokens, in alphabetical order  </H2>
<DL><DT>     any text
    <DD>      These productions are used when skipping over text. "any
          text" consists of all characters other than eof, newline and
          backslash, as well as any character (including newline and
          backslash) that is quoted with a preceding backslash
          character.

<DT>     arg element
    <DD>      An "arg element" is a token in the argument list of a macro.
          It is essentially the same as "simple token" except that
          commas must be detected as separators and nested parentheses
          must be recognized. An "arg element" is either a space or an
          "initial arg element".

<DT>    character constant
   <DD>       A "character constant" is a quoted character or escape
          sequence. The token scanner does not inquire closely into
          the internal nature of the character constant.

<DT>     comment
   <DD>       A "comment" consists of a comment head followed by the
          closing "*/".

<DT>     comment head
   <DD>       A "comment head" consists of the entire comment up to the
          closing "*/". If a complete comment is found following a
          comment head, its treatment depends on whether one believes,
          with ANSI, that comments should not be nested, or whether
          one prefers to allow nested comments. Followers of the ANSI
          principle will want "comment head, comment" to reduce to
          "comment". Believers in nested comments will want to finish
          the comment that was in progress when the nested comment was
          encountered, so they will want "comment head, comment" to
          reduce to "comment head", which will allow the search for
          "*/" to continue.

<DT>     conditional block
   <DD>       A "conditional block" is an #if, #ifdef, or #ifndef line and
          all following lines through the terminating #endif. If the
          initial condition turns out to be true, then everything has
          to be skipped following an #elif or #else line. If the
          initial condition is false, everything has to be skipped
          until a true #elif condition or an #else line is found.

 <DT>    confusion
   <DD>       This token is designed to deal with a curious anomaly of C.
          Integers which begin with a zero are octal, but floating
          point numbers may have leading zeroes without losing their
          fundamental decimal nature. "confusion" is an octal integer
          that is followed by an eight or a nine. This will become
          legitimate if eventually a decimal point or an exponent
          field is encountered.

<DT>     control line
    <DD>      "control line" consists of any preprocessor control line
          other than those associated with conditional compilation.

<DT>     decimal constant
    <DD>      A "decimal constant" is a "decimal integer" and any
          following qualifiers.

<DT>     decimal integer
   <DD>       The digits which comprise the integer are pushed onto the
          string accumulator. When the integer is complete, the string
          will be entered into the token dictionary and subsequently
          it will be described by its index in the token dictionary.

<DT>    defined
   <DD>       See "expanded word". id_macro will recognize "defined" only
          when the if_clause switch is set.

<DT>     eof
  <DD>        end of file: equal to the null character.

<DT>     eol
   <DD>       end of line: a newline and all immediately following white
          space or newline characters. eol is declared to be a
          subgrammar since it is used in circumstances where space can
          legitimately follow, according to the syntax as written.

 <DT>    else if header
  <DD>       This production is simply a portion of the rule for the
          #elif statement. It is separated out in order to provide a
          hook on which to hang the call to init_condition(), which
          diverts scanner output to the expression evaluator, which
          will calculate the value of the conditional expression.

<DT>     else section
  <DD>        An "else section" is an #else line and all immediately
          following complete sections. An "else section" and a "skip
          else section" are the same except that in an "else section"
          tokens are sent to the scanner output and in a "skip else
          section" they are discarded.

 <DT>    endif line
  <DD>        An "endif line" is simply a line that begins with #endif.

<DT>    expanded token
  <DD>        The word "token" is used here in the sense of Kernighan and
          Ritchie, 2nd Edition, Appendix A, p. 191. In this program a
          "simple token" is one which is simply passed on without
          regard to macro processing. An "expanded token" is one which
          has been checked to see if it is a macro identifier and, if
          so, expanded. "simple tokens" are recognized only in the
          bodies of macro definitions. Therefore spaces and '#'
          characters are passed on. For "expanded tokens" they are
          discarded.

<DT>     expanded word
   <DD>       This is the treatment of a simple identifier as an "expanded
          token". "variable", "simple macro", "macro", and "defined"
          are the various outcomes of semantic analysis of "name
          string" performed by id_macro(). In this case reserved words
          and identifiers which are not the names of macros are
          subsumed under the rubric "variable". These tokens are
          simply passed on to the scanner output.
<P>
          The distinction between "macro" and "simple macro" depends
          on whether the macro was defined with or without following
          parentheses. A "simple macro" is expanded by calling
          expand(). expand() simply serves as a local interface to the
          expand_text() function defined in <tt>mas.syn</tt>.
<P>
          If a "macro" was defined with parentheses but appears bereft
          of an argument list, it is treated as a simple identifier
          and passed on to the output.  Otherwise the argument tokens
          for the macro are gathered and stacked on the token
          accumulator, using "macro arg list". Finally, the macro is
          expanded in the same way as a "simple macro". Note that
          "macro arg list" provides a count of the number of arguments
          found inside the balanced parentheses.
<P>
          If "if_clause" is set, it means that the conditional
          expression of an #if or #elif line is being evaluated. In
          this case, the pseudo-function defined() must be recognized
          to determine whether a macro has or has not been defined.
          The defined() function returns a "1" or "0" token depending
          on whether the macro has been defined.

<DT>     exponent
 <DD>         This is simply the exponent field on a floating point number
          with optional sign.


<DT>     false condition
  <DD>        The "true condition" and "false condition" tokens are
          semantically determined. They consist of #if, #ifdef, or
          #ifndef lines. If the result of the test is true the
          reduction token is "true condition", otherwise it is "false
          condition".

<DT>     false else condition
  <DD>        The "true else condition" and "false else condition" tokens
          are semantically determined. They consist of an #elif line.
          If the value of the conditional expression is true the
          reduction token is "true else condition", otherwise it is
          "false else condition".

<DT>     false if section
   <DD>       A "false if section" is a #if, #ifdef, or #ifndef condition
          that turns out to be false followed by any number, including
          zero, of complete sections or false #elif condition lines.
          All of the text within a "false if section" is discarded.
<DT>     floating qualifier
   <DD>       These productions are simply the optional qualifiers to
          specify that a constant is to be treated as a float or as a
          long double.

<DT>     hex constant
  <DD>        A "hex constant" is simply a "hex integer" plus any
          following qualifiers.

<DT>     hex integer
   <DD>       The digits which comprise the integer are pushed onto the
          string accumulator. When the integer is complete, the string
          will be entered into the token dictionary and subsequently
          it will be described by its index in the token dictionary.

<DT>    if header
  <DD>        This production is simply a portion of the rule for the #if
          statement. It is separated out in order to provide a hook on
          which to hang the call to init_condition(), which diverts
          scanner output to the expression evaluator which will
          calculate the value of the conditional expression.

<DT>     initial arg element
  <DD>        In gathering macro arguments, spaces must not be confused
          with a true argument. Therefore, the arg element token is
          broken down into two pieces so that each argument begins
          with a nonblank token.

<DT>     include header
  <DD>        "include header" simply represents the initial portion of an
          #include line and provides a hook for a reduction procedure
          which diverts scanner output to the token accumulator. This
          diversion allows the text which follows #include to be
          scanned for macros and accumulated. The include_file()
          function will be called to actually identify and scan the
          specified file.

 <DT>    input file
   <DD>       This is the grammar, or start token. It describes the entire
          file as alternating sections and eols, terminated by an eof.

<DT>     integer constant
   <DD>       These productions simply gather together the varieties of
          integer constants under one umbrella.

<DT>     integer qualifier
   <DD>       These productions are simply the optional qualifiers to
          specify that an "integer constant" is to be treated as
          unsigned, long, or both.

<DT>     macro
  <DD>        See "expanded word". id_macro specifies "macro" or "simple
          macro" depending on whether the named macro was defined with
          or without following parentheses.

<DT>     macro arg list
  <DD>        A "macro arg list" can be either empty or can consist of any
          number of token sequences separated by commas. Commas that
          are protected by nested parentheses do not separate
          arguments. Argument strings are accumulated on the token
          accumulator and counted by "macro args".

 <DT>    macro args
 <DD>         Each argument to a macro is gathered on a separate level of
          the token accumulator, so the token accumulator level is
          incremented before each argument, and the arguments are
          counted.

<DT>     macro definition header
  <DD>        The "macro definition header" consists of the #define line
          up to the beginning of the body text of the macro. It serves
          as a hook to call init_macro_def() which begins the macro
          definition and diverts scanner output to the token
          accumulator. The macro definition will be completed by the
          save_macro_body() function once the entire macro body has
          been accumulated. Note that the tokens for the macro body
          are not examined for macro calls.

<DT>     name string
   <DD>       "name string" is simply an accumulation on the string
          accumulator of the characters which make up an identifier.

<DT>     nested elements
  <DD>       "nested elements" are "arg elements" that are found inside
          nested parentheses.

<DT>     not control mark
  <DD>        This consists of any input character excepting eof, newline,
          backslash and '#', but including any of these if preceded by
          a backslash. It serves, at the beginning of a line, to
          distinguish ordinary lines of text from preprocessor control
          lines.

<DT>     octal integer
  <DD>        The digits which comprise the integer are pushed onto the
          string accumulator. When the integer is complete, the string
          will be entered into the token dictionary and subsequently
          it will be described by its index in the token dictionary.

<DT>     operator
 <DD>        This is simply an inventory of all the multi-character
          operators in C.

<DT>     parameter list
  <DD>        "parameter list" is simply a wrapper about "names" which
          allows for empty parentheses. Note that both the "names"
          token and the "parameter list" tokens provide the count of
          the number of parameter names found inside the parentheses.
          The names themselves have been stacked on the string
          accumulator.

<DT>     qualified real
  <DD>        This production exists to allow the "floating qualifier" to
          be appended to a "real constant".
<DT>     real
  <DD>        These productions itemize the various ways of writing a
          floating point number with and without decimal points and
          with and without exponent fields.

 <DT>    real constant
  <DD>        This production is simply an envelope to contain "real" and
          write the output code once instead of four times.

<DT>     section
 <DD>         This is a logical block of input. It is either a single line
          of ordinary code, a control line such as #define or #undef,
          or an entire conditional compilation block, i.e., everything
          from the #if to the closing #endif. Notice that the eol that
          terminates a "section" is not part of the "section". The
          only difference between a "section" and a "skip section" is
          that in a "section", all tokens are sent to the scanner
          output while in a "skip section", all input is discarded.

<DT>     separator
  <DD>        This is simply a gathering together of all the tokens that
          are neither white space nor identifiers, since they are
          treated uniformly throughout the grammar.

 <DT>    simple macro
  <DD>        See "expanded word".

<DT>     simple real
  <DD>       A "simple real" is one which has a decimal point and has
          digits on at least one side of the decimal point.
          Unaccompanied decimal points will be turned away at the
          door.
<DT>     simple token
 <DD>        The word "token" is used here in the sense of Kernighan and
          Ritchie, 2nd Edition, Appendix A, p. 191. In this program a
          "simple token" is one which is simply passed on without
          regard to macro processing. An "expanded token" is one which
          has been checked to see if it is a macro identifier and, if
          so, expanded. "simple tokens" are
          recognized only in the bodies of macro definitions.
          Therefore spaces and '#' characters are passed on. For
          "expanded tokens" they are discarded.

<DT>     skip else line
  <DD>        For purposes of skipping over complete conditional sections
          #elif and #else lines are equivalent.

<DT>    skip else section
  <DD>        A "skip else section" consists of the #else or #elif line
          following a satisfied conditional and all subsequent
          sections and #elif and #else lines. All input in the "skip
          else section" is discarded.

<DT>     skip if section
 <DD>         A "skip if section" consists of an #if, #ifdef, or #ifndef
          line, and all following complete "sections" (represented as
          "skip sections", so their content will be ignored) and #else
          and #elif lines.

 <DT>    skip line
   <DD>       When skipping text, we have to distinguish between lines
          which begin with the control mark ('#') and those which
          don't so that we deal correctly with nested #endif
          statements. We wouldn't want to terminate a block of
          uncompiled code with the wrong #endif.

<DT>     skip section
  <DD>        A "skip section" is simply a "section" that follows an
          unsatisfied conditional. In a "skip section", all input is
          discarded.

<DT>     space
  <DD>        space consists of either a blank or a comment. If a comment
          is found, it is replaced with a blank.
<DT>     simple chars
   <DD>       "simple chars" consists of the body of a character constant
          up to but not including the final quote.

<DT>     string chars
  <DD>        "string chars" consists of the body of a string literal up
          to but not including the final double quote.

<DT>     string literal
   <DD>       A "string literal" is simply a quoted string. It is
          accumulated on the string accumulator.

<DT>     true condition
  <DD>        The "true condition" and "false condition" tokens are
          semantically determined. They consist of #if, #ifdef, or
          #ifndef lines. If the result of the test is true the
          reduction token is "true condition", otherwise it is "false
          condition".

<DT>     true else condition
   <DD>       The "true else condition" and "false else condition" tokens
          are semantically determined. They consist of an #elif line.
          If the value of the conditional expression is true the
          reduction token is "true else condition", otherwise it is
          "false else condition".

 <DT>    true if section
   <DD>       A "true if section" is a true #if, #ifdef, or #ifndef,
          followed by any number of complete sections, including zero.
          Alternatively, it could be a "false if section" that is
          followed by a true #elif condition, followed by any number
          of complete "sections". All input in a "true if section"
          subsequent to the true condition is passed on to the scanner
          output.

<DT>     word
  <DD>        This is the treatment of a simple identifier as a "simple
          token". The name_token() procedure is called to pop the name
          string from the string accumulator, identify it in the token
          dictionary and assign a token_id to it by checking to see if
          it is a reserved word.

<DT>     variable
  <DD>        See "expanded word".

<DT>     ws
   <DD>       The definition for ws as space... simply allows a briefer
          reference in those places in the grammar where it is
          necessary to skip over white space.
</DL>
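<P>
          The "confusion" entry above can be made concrete with a small
          C sketch that classifies a numeric literal after the fact. The
          real scanner resolves this through its grammar, not through a
          function like this one; <tt>classify_number()</tt> and the
          <tt>number_kind</tt> enumeration are hypothetical, and hex
          constants and the integer and floating qualifiers are left out.

```c
#include <ctype.h>

enum number_kind { NUM_OCTAL, NUM_DECIMAL, NUM_REAL, NUM_ERROR };

/* Classify a literal made of digits, at most one '.', and an optional exponent. */
enum number_kind classify_number(const char *s) {
    int leading_zero = (s[0] == '0');
    int confusion = 0;   /* saw 8 or 9 after a leading zero */
    int digits = 0, is_real = 0;

    for (; isdigit((unsigned char)*s); s++, digits++)
        if (leading_zero && *s >= '8') confusion = 1;

    if (*s == '.') {
        is_real = 1;
        for (s++; isdigit((unsigned char)*s); s++, digits++)
            ;
    }
    if (digits == 0) return NUM_ERROR;      /* a bare "." is not a number */

    if (*s == 'e' || *s == 'E') {
        is_real = 1;
        s++;
        if (*s == '+' || *s == '-') s++;
        if (!isdigit((unsigned char)*s)) return NUM_ERROR;
        while (isdigit((unsigned char)*s)) s++;
    }
    if (*s != '\0') return NUM_ERROR;       /* trailing characters */

    if (is_real) return NUM_REAL;
    if (confusion) return NUM_ERROR;        /* e.g. "089" never became a real */
    return leading_zero ? NUM_OCTAL : NUM_DECIMAL;
}
```

<P>
          Under these assumptions <tt>017</tt> is octal, <tt>089.5</tt> is a
          real, and <tt>089</tt> on its own is rejected, which mirrors the
          rule that "confusion" becomes legitimate only when a decimal
          point or an exponent field follows.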
<P>
<BR>


<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
      WIDTH=1010 HEIGHT=2 >
<P>
<IMG ALIGN="right" SRC="../../images/pslrb6d.gif" ALT="Parsifal Software"
                WIDTH=181 HEIGHT=25>
<BR CLEAR="right">

<P>
Back to :
<A HREF="../../index.html">Index</A> |
<A HREF="index.html">Macro preprocessor overview</A>
<P>

<ADDRESS><FONT SIZE="-1">
                  AnaGram parser generator - examples<BR>
                  Token Scanner - Macro preprocessor and C Parser <BR>
                  Copyright &copy; 1993-1999, Parsifal Software. <BR>
                  All Rights Reserved.<BR>
</FONT></ADDRESS>

</BODY>
</HTML>
author	David A. Holland
date	Mon, 13 Jun 2022 00:06:39 -0400