view doc/misc/html/examples/mpp/token.html @ 24:a4899cdfc2d6 default tip

Obfuscate the regexps to strip off the IBM compiler's copyright banners. I don't want bots scanning github to think they're real copyright notices because that could cause real problems.
author David A. Holland
date Mon, 13 Jun 2022 00:40:23 -0400

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE> Token Definitions - Macro preprocessor and C Parser </TITLE>
</HEAD>


<BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
 TEXT="#000000" LINK="#0033CC"
 VLINK="#CC0033" ALINK="#CC0099">

<P>
<IMG ALIGN="right" SRC="../../images/agrsl6c.gif" ALT="AnaGram"
         WIDTH=124 HEIGHT=30 >
<BR CLEAR="all">
Back to :
<A HREF="../../index.html">Index</A> |
<A HREF="index.html">Macro preprocessor overview</A>
<P>
<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
        WIDTH=1010 HEIGHT=2  >
<P>

<H1> Token Definitions - Macro preprocessor and C Parser   </H1>
<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
        WIDTH=1010 HEIGHT=2  >
<P>
<BR>

<H2>Introduction</H2>
<P>

          The token module, <tt>token.cpp</tt>, and its header file,
          <tt>token.h</tt>, provide
          detailed support to the macro preprocessor for manipulating and
          transmitting tokens between processes.
<P>
<BR>

<H2>     Type and Class Definitions </H2>
<H4>     <tt>token</tt>, <tt>token_id</tt> </H4>

          The macro preprocessor includes a token scanner module,
          <tt>ts.syn</tt>, which identifies legitimate C tokens in the input
          stream and then uses a string dictionary, <tt>td</tt>, also known as
          the "token dictionary", to keep track of the particular
          sequence of characters which make up each token. The actual
          unit of data which is passed from function to function is
          also called a token and is a pair consisting of the
          dictionary index and a type field. An <tt>enum</tt> statement in
          <tt>token.h</tt> provides definitions for the various
          types. The <tt>enum</tt> type is called <tt>token_id</tt>.
<P>
          There are several things to note about the type definition.
          First, all single character punctuation tokens are given an
          ID corresponding to their ASCII character code.
          Multi-character punctuation tokens are given codes in the range
          normally occupied by the capital letters. Reserved words,
          numeric constants, and alphabetic tokens are given
          identifiers above 128 so they cannot be confused with ASCII
          characters.
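<P>
          The numbering scheme can be sketched as follows. This is an
          illustration only, not the contents of <tt>token.h</tt>: apart from
          <tt>token_id</tt> and <tt>END_OF_FILE</tt>, every enumerator name,
          every numeric value, the <tt>handle</tt> field name, and the
          <tt>classify</tt> helper are assumptions made for the example.

```cpp
#include <cctype>

// Illustrative sketch only: apart from token_id and END_OF_FILE, every name
// and every numeric value below is an assumption, not the content of token.h.
enum token_id {
  END_OF_FILE = 0,        // assumed value

  // Single-character punctuation: the ID is the ASCII character code.
  PLUS = '+',
  MINUS = '-',
  SEMICOLON = ';',

  // Multi-character punctuation: codes taken from the capital-letter range.
  INCREMENT = 'A',        // "++"
  DECREMENT = 'B',        // "--"
  ARROW = 'C',            // "->"

  // Reserved words, numeric constants, and alphabetic tokens: above 128.
  NAME = 129,
  NUMERIC_CONSTANT,
  RESERVED_IF
};

// A token pairs a type with the token dictionary index of its spelling.
struct token {
  token_id id;
  unsigned handle;        // hypothetical field name for the dictionary index
};

// Hypothetical helper: recovers the category of an ID from its numeric range.
enum token_kind {
  SINGLE_CHAR_PUNCTUATION,
  MULTI_CHAR_PUNCTUATION,
  WORD_OR_CONSTANT,
  OTHER
};

token_kind classify(int id) {
  if (id > 128) return WORD_OR_CONSTANT;
  if (id >= 'A' && id <= 'Z') return MULTI_CHAR_PUNCTUATION;
  if (id > 0 && id < 128 && std::ispunct(id)) return SINGLE_CHAR_PUNCTUATION;
  return OTHER;
}
```

          Since <tt>PLUS</tt> has the same value as the character
          <tt>'+'</tt>, code that receives such a token can compare its ID
          against the character literal directly.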
<P>
          The syntax files which accept
          token input (<tt>ex.syn</tt>, <tt>mas.syn</tt>,
          <tt>jrc.syn</tt>) have matching <tt>enum</tt> statements. Because
          punctuation characters are identified by their ASCII codes,
          these parsers can simply use the character representation
          for such tokens.
<P>

<H4>     <tt>token_sink</tt>  </H4>
          A token sink is an abstract class much like a
          <A HREF="../../oldclasslib/charsink.html">character_sink</A>.
          Its derived classes are the
          <tt>token_accumulator</tt> class, the
          <tt>token_translator</tt> class, the
          <tt>c_parser</tt> class, and the
          <tt>expression_evaluator</tt> class.  The
          macro substitution parser, <tt>mas</tt>, could also have been
          configured as a <tt>token_sink</tt>; however, the point of this
          particular example is to show a number of different ways of
          interfacing parsers, so a different technique was used for
          <tt>mas</tt>.
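<P>
          A minimal sketch of what such an abstract class might look like is
          given below. The use of <tt>operator&lt;&lt;</tt> as the interface,
          the layout of <tt>token</tt>, and the <tt>counting_sink</tt> and
          <tt>send_all</tt> names are assumptions for illustration; the
          actual declarations are in <tt>token.h</tt>.

```cpp
#include <cstddef>

// Assumed token layout: a type code plus a token dictionary index.
struct token {
  int id;
  unsigned handle;
};

// Sketch of an abstract token sink: anything that can accept tokens
// one at a time. The operator<< interface is an assumption.
class token_sink {
public:
  virtual ~token_sink() {}
  virtual token_sink &operator<<(token t) = 0;
};

// Hypothetical derived class, not part of the example program:
// it only counts the tokens it receives.
class counting_sink : public token_sink {
  std::size_t received;
public:
  counting_sink() : received(0) {}
  token_sink &operator<<(token) {
    ++received;
    return *this;
  }
  std::size_t count() const { return received; }
};

// A producer needs to know only the abstract interface, so the same
// code can feed an accumulator, a translator, or a parser.
void send_all(token_sink &sink, const token *tokens, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) sink << tokens[i];
}
```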
<P>

<H4>     <tt>token_accumulator</tt>  </H4>
          The <tt>token_accumulator</tt> class is derived from the
          <tt>token_sink</tt>
          class. It is identical in function to the
          <tt>string_accumulator</tt>
          class except that it uses strings of tokens instead of
          strings of characters. It uses the <tt>END_OF_FILE</tt> token as a
          string terminator.
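<P>
          The idea of a token string terminated by <tt>END_OF_FILE</tt> can
          be sketched as below, by analogy with a NUL-terminated character
          string. The storage in a <tt>std::vector</tt>, the member names
          <tt>contents</tt> and <tt>size</tt>, the helper
          <tt>token_string_length</tt>, and the enumerator values are all
          assumptions made for this example, not the actual class.

```cpp
#include <cstddef>
#include <vector>

enum token_id { END_OF_FILE = 0, NAME = 129 };  // assumed values

struct token {
  token_id id;
  unsigned handle;  // hypothetical name for the dictionary index
};

class token_sink {
public:
  virtual ~token_sink() {}
  virtual token_sink &operator<<(token t) = 0;
};

// Sketch: the buffer always ends with an END_OF_FILE token, just as a
// C string always ends with a NUL character.
class token_accumulator : public token_sink {
  std::vector<token> buffer;
  static token terminator() {
    token t = { END_OF_FILE, 0 };
    return t;
  }
public:
  token_accumulator() { buffer.push_back(terminator()); }
  token_sink &operator<<(token t) {
    buffer.back() = t;               // overwrite the old terminator
    buffer.push_back(terminator());  // and append a fresh one
    return *this;
  }
  const token *contents() const { return &buffer[0]; }
  std::size_t size() const { return buffer.size() - 1; }
};

// Token-string counterpart of strlen: count up to the terminator.
std::size_t token_string_length(const token *s) {
  std::size_t n = 0;
  while (s[n].id != END_OF_FILE) ++n;
  return n;
}
```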
<P>

<H4>     <tt>token_translator</tt>  </H4>
          The <tt>token_translator</tt> class is similar to the
          <tt>output_file</tt>
          class, except that it takes a pointer to a character sink as
          an argument to the constructor. This allows the
          <tt>token_translator</tt> to be used to output a token sequence to a
          file, or to translate a token sequence into a string in a
          <tt>string_accumulator</tt>.
<P>
          The <tt>space_required</tt> flag is used to insert blank characters
          where necessary to separate alphanumeric tokens from each
          other.
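<P>
          The following sketch shows one way such a flag could work. It
          assumes a minimal <tt>character_sink</tt> interface, a
          hypothetical <tt>string_collector</tt> standing in for a
          <tt>string_accumulator</tt>, and a <tt>std::vector</tt> of strings
          standing in for the token dictionary; none of these are the
          actual declarations used by the example program.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Assumed minimal character sink interface.
class character_sink {
public:
  virtual ~character_sink() {}
  virtual character_sink &operator<<(int c) = 0;
};

// Hypothetical stand-in for a string_accumulator.
class string_collector : public character_sink {
  std::string collected;
public:
  character_sink &operator<<(int c) {
    collected += static_cast<char>(c);
    return *this;
  }
  const std::string &text() const { return collected; }
};

struct token {
  int id;
  unsigned handle;  // index into the stand-in dictionary below
};

static bool is_word_char(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Sketch of a translator: writes each token's spelling to the sink and
// inserts a blank only when two alphanumeric tokens would run together.
class token_translator {
  character_sink *out;
  const std::vector<std::string> *dictionary;  // stand-in for td
  bool space_required;
public:
  token_translator(character_sink *sink, const std::vector<std::string> *td)
      : out(sink), dictionary(td), space_required(false) {}
  token_translator &operator<<(token t) {
    const std::string &spelling = (*dictionary)[t.handle];
    if (spelling.empty()) return *this;
    if (space_required && is_word_char(spelling[0])) *out << ' ';
    for (std::size_t i = 0; i < spelling.size(); ++i) *out << spelling[i];
    // Remember whether the next alphanumeric token needs a separator.
    space_required = is_word_char(spelling[spelling.size() - 1]);
    return *this;
  }
};
```

          Because the constructor takes a pointer to the abstract sink, the
          same translator code could write to a file or to a string,
          depending on which derived sink is supplied.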
<P>
<BR>

<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
      WIDTH=1010 HEIGHT=2 >
<P>
<IMG ALIGN="right" SRC="../../images/pslrb6d.gif" ALT="Parsifal Software"
                WIDTH=181 HEIGHT=25>
<BR CLEAR="right">

<P>
Back to :
<A HREF="../../index.html">Index</A> |
<A HREF="index.html">Macro preprocessor overview</A>
<P>

<ADDRESS><FONT SIZE="-1">
                  AnaGram parser generator - examples<BR>
                  Token Definitions - Macro preprocessor and C Parser <BR>
                  Copyright &copy; 1993-1999, Parsifal Software. <BR>
                  All Rights Reserved.<BR>
</FONT></ADDRESS>

</BODY>
</HTML>