view doc/misc/html/examples/mpp/parsers.html @ 0:13d2b8934445

Import AnaGram (near-)release tree into Mercurial.
author David A. Holland
date Sat, 22 Dec 2007 17:52:45 -0500

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

<HTML>
<HEAD>
<TITLE>C Parsers - Macro preprocessor and C Parser </TITLE>
</HEAD>


<BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
 TEXT="#000000" LINK="#0033CC"
 VLINK="#CC0033" ALINK="#CC0099">

<P>
<IMG ALIGN="right" SRC="../../images/agrsl6c.gif" ALT="AnaGram"
         WIDTH=124 HEIGHT=30 >
<BR CLEAR="all">
Back to :
<A HREF="../../index.html">Index</A> |
<A HREF="index.html">Macro preprocessor overview</A>
<P>


<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
        WIDTH=1010 HEIGHT=2  >
<P>

<H1>C Parsers - Macro preprocessor and C Parser   </H1>

<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
        WIDTH=1010 HEIGHT=2  >
<P>
<BR>

          Two C grammars are provided with this example. <tt>jrc.syn</tt> is a
          translation of a <tt>yacc</tt> grammar for C created by James A.
          Roskind. <tt>krc.syn</tt> is a grammar adapted from Kernighan and
          Ritchie, <I>The C Programming Language</I>, Section A13.
<P>
          Both grammars use semantically determined productions to
          distinguish between typedef names and identifiers. However,
          neither grammar has been adapted to take advantage of this
          capability to resolve many of the problems associated with
          typedef names.
<P>
          To parse even simple examples successfully, each grammar
          needed a rudimentary ability to recognize typedef names,
          and such logic has been incorporated into both. Note that
          the handling of scope in this logic is inadequate for
          production use.
<P>
          In <tt>jrc.syn</tt>, the typedef token causes a flag to be set. When
          the identifier in a declaration is encountered, the flag is
          tested to decide whether the identifier is a typedef name.
          The complication is that field names within a struct or
          union may intervene and could be inadvertently marked as
          typedef names. To avoid this, the flag is saved on
          encountering a struct or union and restored when the danger
          is past.
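<P>
          The flag handling described for <tt>jrc.syn</tt> can be sketched in
          plain C roughly as follows. This is an illustration only, not
          code taken from the grammar: every name in it
          (<tt>typedef_flag</tt>, <tt>enter_struct_or_union()</tt>,
          <tt>classify()</tt> and so on) is hypothetical. It also assumes
          that the flag is cleared while a struct or union body is
          parsed, and that saved flags are kept on a small stack so
          that nested structs behave sensibly.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout; none of this is taken from jrc.syn. */
enum token_kind { IDENTIFIER, TYPEDEF_NAME };

#define MAX_NAMES 64
#define MAX_DEPTH 16

/* Names recorded as typedef names. The strings are assumed to outlive the table. */
static const char *typedef_names[MAX_NAMES];
static int n_typedef_names = 0;

static int typedef_flag = 0;        /* set when the typedef keyword is seen */
static int saved_flags[MAX_DEPTH];  /* flags saved at struct/union entry    */
static int n_saved = 0;

/* The typedef keyword has been scanned. */
void on_typedef_keyword(void) { typedef_flag = 1; }

/* Entering a struct or union: save the flag, then clear it (an assumption)
   so that field names are not recorded as typedef names. */
void enter_struct_or_union(void) {
    assert(n_saved < MAX_DEPTH);
    saved_flags[n_saved++] = typedef_flag;
    typedef_flag = 0;
}

/* Leaving the struct or union: restore the saved flag. */
void leave_struct_or_union(void) {
    assert(n_saved > 0);
    typedef_flag = saved_flags[--n_saved];
}

/* An identifier in a declaration: test the flag. */
void on_declared_identifier(const char *name) {
    if (typedef_flag) {
        assert(n_typedef_names < MAX_NAMES);
        typedef_names[n_typedef_names++] = name;
    }
}

/* The declaration is complete: reset the flag. */
void end_declaration(void) { typedef_flag = 0; }

/* Decide which token a name should be treated as. */
enum token_kind classify(const char *name) {
    int i;
    for (i = 0; i < n_typedef_names; i++) {
        if (strcmp(typedef_names[i], name) == 0) return TYPEDEF_NAME;
    }
    return IDENTIFIER;
}
```

<P>
          For a declaration such as
          <tt>typedef struct { int x; } point;</tt> the calls would run in
          the order <tt>on_typedef_keyword()</tt>,
          <tt>enter_struct_or_union()</tt>,
          <tt>on_declared_identifier("x")</tt>,
          <tt>leave_struct_or_union()</tt>,
          <tt>on_declared_identifier("point")</tt>,
          <tt>end_declaration()</tt>. Afterwards only <tt>point</tt> is
          classified as a typedef name. Like the grammar itself, the
          sketch takes no account of scope.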
<P>
          In <tt>krc.syn</tt>, the treatment of typedefs is marginally less
          inelegant. Storage class attributes are accumulated in a bit
          mask. As identifiers are encountered in the declaration
          statement they are pushed onto a multilevel stack, <tt>id_stack</tt>.
          When a struct or union declaration is encountered, the level
          of <tt>id_stack</tt> is incremented. When the danger is past, the
          level is decremented so that identifiers stacked in the
          course of parsing the struct or union are discarded.
          Eventually, when the statement is complete, all the
          identifiers that remain are popped from <tt>id_stack</tt>
          and given the appropriate storage class attribute.
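<P>
          The multilevel stack described for <tt>krc.syn</tt> might look
          something like the following C sketch. Only the name
          <tt>id_stack</tt> comes from the grammar. The layout of the
          stack, the bit values, the symbol table and all function
          names are assumptions made for illustration, and the real
          implementation may differ considerably.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical storage class bits, accumulated in a mask. */
#define SC_TYPEDEF 0x1u
#define SC_STATIC  0x2u
#define SC_EXTERN  0x4u

#define MAX_IDS     64
#define MAX_LEVELS  16
#define MAX_SYMBOLS 64

/* A toy symbol table; names are assumed to outlive it. */
struct symbol { const char *name; unsigned storage_class; };
static struct symbol symbols[MAX_SYMBOLS];
static int n_symbols = 0;

/* id_stack: one array of names plus the stack height at each level entry. */
static const char *id_stack[MAX_IDS];
static int n_ids = 0;
static int level_base[MAX_LEVELS];
static int level = 0;

static unsigned storage_mask = 0;

/* A storage class keyword has been seen. */
void add_storage_class(unsigned bits) { storage_mask |= bits; }

/* An identifier has been seen in the declaration. */
void push_id(const char *name) {
    assert(n_ids < MAX_IDS);
    id_stack[n_ids++] = name;
}

/* Entering a struct or union: increment the level. */
void begin_level(void) {
    assert(level < MAX_LEVELS);
    level_base[level++] = n_ids;
}

/* Leaving it: decrement the level, discarding names pushed inside. */
void end_level(void) {
    assert(level > 0);
    n_ids = level_base[--level];
}

/* Statement complete: pop every remaining name and give it the mask. */
void end_statement(void) {
    assert(level == 0);
    while (n_ids > 0) {
        assert(n_symbols < MAX_SYMBOLS);
        symbols[n_symbols].name = id_stack[--n_ids];
        symbols[n_symbols].storage_class = storage_mask;
        n_symbols++;
    }
    storage_mask = 0;
}

/* Return the recorded storage class bits, or 0 if the name is unknown. */
unsigned lookup_storage_class(const char *name) {
    int i;
    for (i = 0; i < n_symbols; i++) {
        if (strcmp(symbols[i].name, name) == 0) return symbols[i].storage_class;
    }
    return 0;
}
```

<P>
          For <tt>typedef struct { int x; } point, *point_ptr;</tt> the
          field name <tt>x</tt> is pushed after <tt>begin_level()</tt> and
          dropped again by <tt>end_level()</tt>, so
          <tt>end_statement()</tt> records only <tt>point</tt> and
          <tt>point_ptr</tt> with the typedef bit set.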
<P>
          Both grammars are set up so that the name of the parser is
          <tt>cc()</tt>. Each syntax file contains member function
          definitions for the class <tt>c_parser</tt>. The definitions in
          the two files are identical except for the initialization of
          the typedef logic. The definition of the class
          <tt>c_parser</tt> itself is found in <tt>mpp.h</tt>.
<P>
          <B>Note:</B> Until AnaGram 2.40, the parsers were set up
          so that both would emit the files <tt>cc.cpp</tt> and
          <tt>cc.h</tt>. This caused problems for AnaGram 2.40's more
          automated build and test system. As of this writing,
          <tt>krc.syn</tt> is used by default. To use <tt>jrc.syn</tt>
          instead, you must change the <tt>#include</tt> near the top of
          <tt>mpp.h</tt> to include <tt>jrc.h</tt>. With luck, this
          annoyance will be cleared up in a future release.
<P>
          The KRC grammar has at least one bug. It will not recognize
          function declarations that omit the return type, such as:
<PRE>
            foo(int);
</PRE>
          It will, however, recognize
<PRE>
            int foo(int);
</PRE>
          This problem appears to be inherent in the grammar published
          in K &amp; R.
<P>
<BR>


<IMG ALIGN="bottom" SRC="../../images/rbline6j.gif" ALT="----------------------"
      WIDTH=1010 HEIGHT=2 >
<P>
<IMG ALIGN="right" SRC="../../images/pslrb6d.gif" ALT="Parsifal Software"
                WIDTH=181 HEIGHT=25>
<BR CLEAR="right">

<P>
Back to :
<A HREF="../../index.html">Index</A> |
<A HREF="index.html">Macro preprocessor overview</A>

<P>
<ADDRESS><FONT SIZE="-1">
                  AnaGram parser generator - examples<BR>
                  C Parsers - Macro preprocessor and C Parser <BR>
                  Copyright &copy; 1993-1999, Parsifal Software. <BR>
                  All Rights Reserved.<BR>
</FONT></ADDRESS>

</BODY>
</HTML>