view doc/misc/html/examples/mpp/token.html @ 24:a4899cdfc2d6 default tip
author | David A. Holland |
---|---|
date | Mon, 13 Jun 2022 00:40:23 -0400 |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> <HTML> <HEAD> <TITLE> Token Definitions - Macro preprocessor and C Parser </TITLE> </HEAD> <BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif" TEXT="#000000" LINK="#0033CC" VLINK="#CC0033" ALINK="#CC0099"> <P> <IMG ALIGN="right" SRC="../../images/agrsl6c.gif" ALT="AnaGram" WIDTH=124 HEIGHT=30 > <BR CLEAR="all"> Back to : <A HREF="../../index.html">Index</A> | <A HREF="index.html">Macro preprocessor overview</A> <P> <IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------" WIDTH=1010 HEIGHT=2 > <P> <H1> Token Definitions - Macro preprocessor and C Parser </H1> <IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------" WIDTH=1010 HEIGHT=2 > <P> <BR> <H2>Introduction</H2> <P> The token module, <tt>token.cpp</tt>, and its header file, <tt>token.h</tt>, provide detailed support to the macro preprocessor for manipulating and transmitting tokens between processes. <P> <BR> <H2> Type and Class Definitions </H2> <H4> <tt>token</tt>, <tt>token_id</tt> </H4> The macro preprocessor includes a token scanner module, <tt>ts.syn</tt>, which identifies legitimate C tokens in the input stream and then uses a string dictionary, <tt>td</tt>, also known as the "token dictionary", to keep track of the particular sequence of characters that make up each token. The actual unit of data that is passed from function to function is also called a token and is a pair consisting of the dictionary index and a type field. An <tt>enum</tt> statement in <tt>token.h</tt> provides definitions for the various types. The <tt>enum</tt> type is called <tt>token_id</tt>. <P> There are several things to note about the type definition. First, all single-character punctuation tokens are given an ID corresponding to their ASCII character code. Multi-character punctuation tokens are given codes in the range normally occupied by the capital letters. 
Reserved words, numeric constants, and alphabetic tokens are given identifiers above 128 so they cannot be confused with ASCII characters. <P> The syntax files which accept token input (<tt>ex.syn</tt>, <tt>mas.syn</tt>, <tt>jrc.syn</tt>) have matching <tt>enum</tt> statements. Because punctuation characters are identified by their ASCII codes, these parsers can simply use the character representation for such tokens. <P> <H4> <tt>token_sink</tt> </H4> A token sink is an abstract class much like a <A HREF="../../oldclasslib/charsink.html">character_sink</A>. Its derived classes, however, are the <tt>token_accumulator</tt> class, the <tt>token_translator</tt> class, the <tt>c_parser</tt> class, and the <tt>expression_evaluator</tt> class. The macro substitution parser, <tt>mas</tt>, could also have been configured as a <tt>token_sink</tt>; however, the point of this particular example is to show a number of different ways of interfacing parsers, so a different technique was used for <tt>mas</tt>. <P> <H4> <tt>token_accumulator</tt> </H4> The <tt>token_accumulator</tt> class is derived from the <tt>token_sink</tt> class. It is identical in function to the <tt>string_accumulator</tt> class except that it uses strings of tokens instead of strings of characters. It uses the <tt>END_OF_FILE</tt> token as a string terminator. <P> <H4> <tt>token_translator</tt> </H4> The <tt>token_translator</tt> class is similar to the <tt>output_file</tt> class, except that it takes a pointer to a character sink as an argument to the constructor. This allows the <tt>token_translator</tt> to be used to output a token sequence to a file, or to translate a token sequence into a string in a <tt>string_accumulator</tt>. <P> The <tt>space_required</tt> flag is used to insert blank characters where necessary to separate alphanumeric tokens from each other. 
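<P> The relationships described above can be sketched roughly as follows. This is an illustration only, not the code in <tt>token.h</tt> or <tt>token.cpp</tt>: the member names (<tt>id</tt>, <tt>handle</tt>, <tt>put</tt>, <tt>copy</tt>), the <tt>dictionary</tt> stand-in for <tt>td</tt>, the enumerator names and values other than the use of ASCII codes for punctuation, the use of a <tt>std::string</tt> pointer in place of a character sink, and the <tt>demo</tt> helper are all assumptions made for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical subset of token_id; the real enum is in token.h.
// Single-character punctuation uses its ASCII code, so it needs no name.
enum token_id {
  END_OF_FILE = 0,  // assumed value; terminates a string of tokens
  ANDAND = 'A',     // assumed name: multi-character punctuation "&&"
  NAME = 129,       // assumed name: alphabetic token, above 128
  NUMBER = 130      // assumed name: numeric constant, above 128
};

// A token is a pair: a type field and an index into the dictionary.
struct token {
  int id;              // token_id value or ASCII punctuation code
  std::size_t handle;  // dictionary index (member name assumed)
};

// Minimal stand-in for the token dictionary td.
struct dictionary {
  std::vector<std::string> strings;
  std::size_t add(const std::string &s) {
    for (std::size_t i = 0; i < strings.size(); ++i)
      if (strings[i] == s) return i;  // reuse an existing entry
    strings.push_back(s);
    return strings.size() - 1;
  }
  const std::string &text(std::size_t i) const {
    assert(i < strings.size());
    return strings[i];
  }
};

// Abstract destination for tokens; put() is a hypothetical interface.
class token_sink {
public:
  virtual ~token_sink() {}
  virtual void put(const token &t) = 0;
};

// Collects tokens; copy() appends END_OF_FILE as the terminator.
class token_accumulator : public token_sink {
  std::vector<token> buffer;
public:
  void put(const token &t) override { buffer.push_back(t); }
  std::vector<token> copy() const {
    std::vector<token> out = buffer;
    out.push_back(token{END_OF_FILE, 0});
    return out;
  }
};

static bool is_word_char(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Writes token text to a character destination, inserting a blank only
// when two alphanumeric tokens would otherwise run together.
class token_translator : public token_sink {
  const dictionary &td;
  std::string *out;     // stand-in for a pointer to a character sink
  bool space_required;  // true when the last output ended in a word char
public:
  token_translator(const dictionary &d, std::string *sink)
      : td(d), out(sink), space_required(false) {}
  void put(const token &t) override {
    if (t.id == END_OF_FILE) return;
    const std::string &s = td.text(t.handle);
    if (s.empty()) return;
    if (space_required && is_word_char(s.front())) out->push_back(' ');
    out->append(s);
    space_required = is_word_char(s.back());
  }
};

// Accumulates the tokens of "int x=42;" and translates them back to text.
std::string demo() {
  dictionary td;
  token_accumulator acc;
  acc.put(token{NAME, td.add("int")});
  acc.put(token{NAME, td.add("x")});
  acc.put(token{'=', td.add("=")});
  acc.put(token{NUMBER, td.add("42")});
  acc.put(token{';', td.add(";")});
  std::string text;
  token_translator tr(td, &text);
  for (const token &t : acc.copy()) tr.put(t);
  return text;
}
```

<P> In this sketch a blank appears only between <tt>int</tt> and <tt>x</tt>, since those are the only adjacent tokens that would otherwise merge into a single identifier; punctuation tokens are written without surrounding blanks.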
<P> <BR> <IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------" WIDTH=1010 HEIGHT=2 > <P> <IMG ALIGN="right" SRC="../../images/pslrb6d.gif" ALT="Parsifal Software" WIDTH=181 HEIGHT=25> <BR CLEAR="right"> <P> Back to : <A HREF="../../index.html">Index</A> | <A HREF="index.html">Macro preprocessor overview</A> <P> <ADDRESS><FONT SIZE="-1"> AnaGram parser generator - examples<BR> Token Definitions - Macro preprocessor and C Parser <BR> Copyright © 1993-1999, Parsifal Software. <BR> All Rights Reserved.<BR> </FONT></ADDRESS> </BODY> </HTML>