
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE>Token Classifier - Macro preprocessor and C Parser </TITLE>
</HEAD>


<BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
 TEXT="#000000" LINK="#0033CC"
 VLINK="#CC0033" ALINK="#CC0099">

<P>
<IMG ALIGN="right" SRC="../../images/agrsl6c.gif" ALT="AnaGram"
         WIDTH=124 HEIGHT=30 >
<BR CLEAR="all">
Back to :
<A HREF="../../index.html">Index</A> |
<A HREF="index.html">Macro preprocessor overview</A>
<P>


<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
        WIDTH=1010 HEIGHT=2  >
<P>

<H1>Token Classifier - Macro preprocessor and C Parser   </H1>

<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
        WIDTH=1010 HEIGHT=2  >
<P>
<BR>
<H2>Introduction</H2>
<P>
          The token classifier module, <tt>ct.syn</tt>, exists to solve
          one specific problem: when two tokens are spliced together by
          the "##" operator during a macro expansion, the syntactic type
          of the resulting token must be determined. Although the token
          scanner could conceivably do the job, it was simpler to
          extract the relevant syntax from <tt>ts.syn</tt> into a
          separate module. <tt>ct.syn</tt> is the result.
<P>
          The parser for the token classifier is <tt>ct()</tt>. It uses pointer
          mode input. The interface function for the token classifier,
          <tt>classify_token</tt>, takes a string pointer as its argument and
          returns a <tt>token_id</tt>:
<PRE>
      token_id classify_token(char *);
</PRE>
<BR>
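<P>
          As a rough illustration of how a macro expander might use this
          interface after a "##" splice, here is a minimal sketch. It is
          not the generated parser: <tt>classify_token</tt> is replaced by
          a hand-written stand-in, <tt>paste_and_classify</tt> is a
          hypothetical helper, and every <tt>token_id</tt> value other than
          <tt>UNRECOGNIZED</tt> is an assumed name.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical subset of token_id: only UNRECOGNIZED is named in the text. */
typedef enum { UNRECOGNIZED, NAME, INTEGER, OPERATOR } token_id;

/* Hand-written stand-in for the generated classifier. The real
   classify_token runs the ct() parser over the string instead. */
token_id classify_token(char *s) {
  static const char *operators[] = {
    "+", "++", "+=", "-", "--", "->", "<", "<<", "<=", "=="
  };
  size_t i;

  if (isalpha((unsigned char)s[0]) || s[0] == '_') {
    /* A name: letters, digits and underscores only. */
    for (i = 1; s[i] != '\0'; i++)
      if (!isalnum((unsigned char)s[i]) && s[i] != '_') return UNRECOGNIZED;
    return NAME;
  }
  if (isdigit((unsigned char)s[0])) {
    /* A plain run of digits; other constant forms are left out here. */
    for (i = 1; s[i] != '\0'; i++)
      if (!isdigit((unsigned char)s[i])) return UNRECOGNIZED;
    return INTEGER;
  }
  for (i = 0; i < sizeof operators / sizeof operators[0]; i++)
    if (strcmp(s, operators[i]) == 0) return OPERATOR;
  return UNRECOGNIZED;
}

/* Hypothetical helper: splice two token spellings, as "##" does,
   and classify the result. */
token_id paste_and_classify(const char *left, const char *right) {
  char spliced[128];

  if (strlen(left) + strlen(right) >= sizeof spliced) return UNRECOGNIZED;
  strcpy(spliced, left);
  strcat(spliced, right);
  return classify_token(spliced);
}
```

<P>
          With this stand-in, splicing <tt>x</tt> and <tt>1</tt> gives a
          name, splicing <tt>+</tt> and <tt>+</tt> gives an operator, and
          splicing <tt>+</tt> and <tt>x</tt> gives <tt>UNRECOGNIZED</tt>.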

<H2>     Theory of Operation </H2>

          The syntax found in <tt>ct.syn</tt> is simply the token recognition
          syntax of <tt>ts.syn</tt>. The reduction procedures are limited to
          those needed to identify the type of token presented.
<P>
          The parser is set up to use pointer input, since the string
          to be scanned is always in memory. The address of the input
          is saved in <tt>input_string</tt> in case it is necessary to check
          for a reserved word.
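<P>
          The role of the saved address can be sketched as follows. This is
          an assumption-laden illustration, not the code in
          <tt>ct.syn</tt>: the scan is written by hand, the reserved word
          table, <tt>classify_name</tt> and <tt>name_or_reserved_word</tt>
          are hypothetical, the type of <tt>input_string</tt> is assumed,
          and every <tt>token_id</tt> value other than
          <tt>UNRECOGNIZED</tt> is an assumed name.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical token_id values: only UNRECOGNIZED is named in the text. */
typedef enum { UNRECOGNIZED, NAME, IF_WORD, RETURN_WORD, WHILE_WORD } token_id;

/* Start of the string being classified, saved before scanning begins.
   The const char * type is an assumption. */
static const char *input_string;

/* Hypothetical reserved word table. */
static const struct { const char *word; token_id id; } reserved_words[] = {
  { "if", IF_WORD }, { "return", RETURN_WORD }, { "while", WHILE_WORD }
};

/* Plays the role of a reduction procedure for a name. By the time it
   runs, the scan pointer has reached the end of the input, so the
   spelling is looked up through the saved start address. */
static token_id name_or_reserved_word(void) {
  size_t i;

  for (i = 0; i < sizeof reserved_words / sizeof reserved_words[0]; i++)
    if (strcmp(input_string, reserved_words[i].word) == 0)
      return reserved_words[i].id;
  return NAME;
}

/* Hand-written stand-in for scanning a name with pointer input. */
token_id classify_name(const char *string) {
  const char *pointer = string;

  input_string = string;              /* remember where the token starts */
  if (!isalpha((unsigned char)*pointer) && *pointer != '_')
    return UNRECOGNIZED;
  while (isalnum((unsigned char)*pointer) || *pointer == '_')
    pointer++;                        /* advance through the name */
  if (*pointer != '\0') return UNRECOGNIZED;
  return name_or_reserved_word();
}
```

<P>
          Because the scan advances the pointer as it goes, the lookup
          cannot start from the current position. Keeping the original
          address lets it compare the whole spelling against the table.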
<P>
          There are no error diagnostics. If the token is not
          recognizable, it is given a <tt>token_id</tt> of <tt>UNRECOGNIZED</tt>.
<P>
          The default token type parameter is set to <tt>token_id</tt> so
          that a type need not be specified for each relevant token. Note
          that "name" is declared with type <tt>void</tt> to avoid a compiler
          error: the reduction procedures return
          <tt>character_sink</tt>, so without the <tt>void</tt>
          declaration the compiler would try to convert
          <tt>character_sink</tt> to <tt>token_id</tt>.

<P>
<BR>


<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
      WIDTH=1010 HEIGHT=2 >
<P>
<IMG ALIGN="right" SRC="../../images/pslrb6d.gif" ALT="Parsifal Software"
                WIDTH=181 HEIGHT=25>
<BR CLEAR="right">

<P>
Back to :
<A HREF="../../index.html">Index</A> |
<A HREF="index.html">Macro preprocessor overview</A>

<P>
<ADDRESS><FONT SIZE="-1">
                  AnaGram parser generator - examples<BR>
                  Token Classifier - Macro preprocessor and C Parser <BR>
                  Copyright &copy; 1993-1999, Parsifal Software. <BR>
                  All Rights Reserved.<BR>
</FONT></ADDRESS>

</BODY>
</HTML>
author	David A. Holland
date	Sat, 22 Dec 2007 17:52:45 -0500