diff doc/misc/html/examples/ffcex.html @ 0:13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
author | David A. Holland |
---|---|
date | Sat, 22 Dec 2007 17:52:45 -0500 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/doc/misc/html/examples/ffcex.html Sat Dec 22 17:52:45 2007 -0500 @@ -0,0 +1,314 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> +<HTML> +<HEAD> +<TITLE>Four Function Calculator</TITLE> + + +</HEAD> + +<BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif" + TEXT="#000000" LINK="#0033CC" + VLINK="#CC0033" ALINK="#CC0099"> + +<P> +<IMG ALIGN="right" SRC="../images/agrsl6c.gif" ALT="AnaGram" + WIDTH=124 HEIGHT=30> + +<BR CLEAR="all"> + +Back to <A HREF="../index.html">Index</A> +<P> +<IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------" + WIDTH=1010 HEIGHT=2 > +</P> + + +<H2>Four Function Calculator:<BR>An Annotated AnaGram Example</H2> +<IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------" + WIDTH=1010 HEIGHT=2 > + + +<P>The following example is a complete program: The output produced by AnaGram +from this example can be compiled, linked and run without any support modules +other than the standard run-time library provided by any C compiler. In the +interest of brevity, the example has been kept very simple. +</P> +<P>FFCALC.SYN implements a simple four function calculator which reads its +input from stdin. The calculator has 52 registers, labeled 'a' through 'z' and +'A' through 'Z'. FFCALC evaluates arithmetic expressions and assignment +statements and prints the results to stdout. The expressions may contain '+', +'-', '*', and '/' operators as well as parentheses. In addition, FFCALC supports +the free use of white space and C style comments in the input. It also contains +complete error handling, including syntax error diagnostics and +<A HREF="../gloss.html#Resynchronization">resynchronization</A> after syntax errors. +</P> +<P> <STRONG>For purposes of annotation, line numbers have been inserted at the left +margin.</STRONG> The line numbers are not part of the AnaGram syntax. 
+Immediately following the example are some brief explanatory notes keyed to the +line numbers. +</P> +<PRE> +<A HREF="#Note1" NAME="Line1">Line 1:</A> {/* FOUR FUNCTION CALCULATOR: FFCALC.SYN */} + +Line 2: // -- CONFIGURATION SECTION ---------------------------- +<A HREF="#Note3" NAME="Line3">Line 3:</A> [ +<A HREF="#Note4" NAME="Line4">Line 4:</A> default token type = double +<A HREF="#Note5" NAME="Line5">Line 5:</A> disregard white space +<A HREF="#Note6" NAME="Line6">Line 6:</A> lexeme { real} +<A HREF="#Note7" NAME="Line7">Line 7:</A> // You could specify traditional engine here +Line 8: ] + +<A HREF="#Note9" NAME="Line9">Line 9:</A> // -- FOUR FUNCTION CALCULATOR ------------------------- +<A HREF="#Note10" NAME="Line10">Line 10:</A> (void) calculator $ +<A HREF="#Note11" NAME="Line11">Line 11:</A> -> [calculation?, '\n']..., eof + +Line 12: (void) calculation +<A HREF="#Note13" NAME="Line13">Line 13:</A> -> expression:x =printf("%g\n",x); +<A HREF="#Note14" NAME="Line14">Line 14:</A> -> name:n, '=', expression:x ={ +Line 15: printf("%c = %g\n",n+'A',value[n]=x);} +<A HREF="#Note16" NAME="Line16">Line 16:</A> -> error + +Line 17: expression +<A HREF="#Note18" NAME="Line18">Line 18:</A> -> term +<A HREF="#Note19" NAME="Line19">Line 19:</A> -> expression:x, '+', term:t = x+t; +Line 20: -> expression:x, '-', term:t = x-t; + +Line 21: term +Line 22: -> factor +Line 23: -> term:t, '*', factor:f = t*f; +<A HREF="#Note24" NAME="Line24">Line 24:</A> -> term:t, '/', factor:f = t/f; + +Line 25: factor +Line 26: -> name:n = value[n]; +Line 27: -> real +Line 28: -> '(', expression:x, ')' = x; +Line 29: -> '-', factor:f = -f; + +Line 30: // -- LEXICAL UNITS ------------------------------------ +<A HREF="#Note31" NAME="Line31">Line 31:</A> digit = '0-9' +<A HREF="#Note32" NAME="Line32">Line 32:</A> eof = -1 + +<A HREF="#Note33" NAME="Line33">Line 33:</A> (void) white space +<A HREF="#Note34" NAME="Line34">Line 34:</A> -> ' ' + '\t' + '\r' + '\f' + '\v' +<A HREF="#Note35" 
NAME="Line35">Line 35:</A> -> "/*", ~eof?..., "*/" // C style comment + +<A HREF="#Note36" NAME="Line36">Line 36:</A> (int) name +<A HREF="#Note37" NAME="Line37">Line 37:</A> -> 'a-z' + 'A-Z':c = c-'A'; + +<A NAME="Line38">Line 38:</A> real +Line 39: -> integer part:i, '.', fraction part:f = i+f; +Line 40: -> integer part, '.'? +Line 41: -> '.', fraction part:f = f; + +Line 42: integer part +Line 43: -> digit:d = d-'0'; +Line 44: -> integer part:x, digit:d = 10*x + d-'0'; + +Line 45: fraction part +Line 46: -> digit:d = (d-'0')/10.; +Line 47: -> digit:d, fraction part:f = (d-'0' + f)/10.; + +Line 48: { /* -- EMBEDDED C ---------------------------------- */ +Line 49: double value[64]; /* registers */ +<A HREF="#Note50" NAME="Line50">Line 50:</A> void main(void) { +<A HREF="#Note51" NAME="Line51">Line 51:</A> ffcalc(); +Line 52: } +Line 53: } // -- END OF EMBEDDED C ------------------------------ +</PRE> +<H3>Notes to example</H3> +<P>General note: When an AnaGram <A HREF="../gloss.html#Grammar">grammar</A> is written to use direct character +input, the <A HREF="../gloss.html#Terminal">terminal tokens</A> are written as <A HREF="../gloss.html#CharacterSets">character sets</A>. A single character is +construed to be the set consisting only of the character itself. Otherwise +character sets can be defined by ranges, e.g., 'a-z', or by set expressions +using +, -, &, or ~ to represent <A HREF="../gloss.html#SetUnion">union</A>, +<A HREF="../gloss.html#SetDifference">difference</A>, <A HREF="../gloss.html#SetIntersection">intersection</A>, or +<A HREF="../gloss.html#SetComplement">complement</A> respectively. If the sets used in the grammar are not pairwise +disjoint, and they seldom are, AnaGram calculates a disjoint covering of the +<A HREF="../gloss.html#Universe">character universe</A>, and extends the grammar appropriately. 
The semantic value of +a terminal token is the ASCII character code, so that semantic distinctions may +still be made even when characters are syntactically equivalent. +</P> +<P><A HREF="#Line1" NAME="Note1">Line 1.</A> Braces { } are used +to denote embedded C or C++ code that should be passed unchanged to the <A HREF="../gloss.html#Parser">parser</A>. +Embedded C at the very beginning of the syntax file is placed at the beginning +of the parser file. All other embedded C is placed following a set of +definitions and declarations AnaGram needs for the code it generates. AnaGram +saves up all the <A HREF="../gloss.html#ReductionProcedure">reduction procedures</A>, or semantic actions, and places them +after all the embedded C. +</P> +<P><A HREF="#Line3" NAME="Note3">Line 3.</A> Brackets [ ] are used to denote +configuration sections. Configuration sections contain settings for +configuration parameters and switches, and a number of attribute statements that +provide metasyntactic information. +</P> +<P><A HREF="#Line4" NAME="Note4">Line 4.</A> This statement sets the default +token type for <A HREF="../gloss.html#Nonterminal">nonterminal tokens</A> to double. The default value for "default +token type" is void. You can override the type for a particular token using +an explicit cast. See +<A HREF="#Line10">line 10.</A> The default type for <A HREF="../gloss.html#Terminal">terminal tokens</A> is int. +AnaGram uses the token type declarations to set up calls and definitions of +<A HREF="../gloss.html#ReductionProcedure">reduction procedures</A> and also to set up the parser value stack. +</P> +<P><A HREF="#Line5" NAME="Note5">Line 5.</A> The disregard statement tells +AnaGram to extend the <A HREF="../gloss.html#Grammar">grammar</A> so that the generated <A HREF="../gloss.html#Parser">parser</A> will skip all +instances of white space which are not contained within lexemes. "White +space" is a token defined at <A HREF="#Line33">line 33</A>. 
There is +nothing magic about the name. Any other name could have been used. +</P> +<P><A HREF="#Line6" NAME="Note6">Line 6.</A> The lexeme statement identifies a +list of <A HREF="../gloss.html#Nonterminal">nonterminal tokens</A> within which the "disregard" statement is +inoperative. real is defined at <A HREF="#Line38">line 38</A>. +</P> +<P><A HREF="#Line7" NAME="Note7">Line 7.</A> "traditional engine" is +a configuration switch. Simply asserting it turns it on. You can also write: +traditional engine = ON. To turn off a switch use ~: thus ~traditional engine +would guarantee the switch is off, whatever its default value. Alternatively set +traditional engine = OFF. +</P> +<P>AnaGram <A HREF="../gloss.html#Parser">parsers</A> normally use a parsing engine with more than the standard +four parsing <A HREF="../gloss.html#ParserAction">actions</A>: shift, reduce, error and accept. The extra actions are +compound actions. The result of using these actions is to speed up the parser +and to reduce the size of the state table by about fifty per cent. The +traditional engine switch turns this optimization off, so the parser will only +use the four traditional actions. This is usually only done for clarity when +using the File Trace or Grammar Trace options described below. +</P> +<P><A HREF="#Line9" NAME="Note9">Line 9. </A>AnaGram supports both C and C++ +style comments. Nesting of C comments is controlled by the "nest comments" +switch. +</P> +<P><A HREF="#Line10" NAME="Note10">Line 10.</A> An explicit cast can be used +to override the token type for <A HREF="../gloss.html#Nonterminal">nonterminal tokens</A>. Types can be just about any C +or C++ type, including template types. Basically, the only exceptions are types +containing <CODE>( )</CODE> or <CODE>[ ]</CODE>. +</P> +<P>The simplest way to specify the <A HREF="../gloss.html#GrammarToken">goal token</A> for a <A HREF="../gloss.html#Grammar">grammar</A> is to mark it with +a dollar sign. 
You can also simply name it "grammar", or set the "grammar +token" parameter in a configuration section. +</P> +<P><A HREF="#Line11" NAME="Note11">Line 11.</A> For "<CODE>-></CODE>" + read "produces". +A question mark following a token name makes it optional. Tokens in a rule +are separated by commas. Multiple rules with the same left side can also be +separated with the vertical bar, '<CODE>|</CODE>'. +</P> +<P>The rules for character constants are the same as for C. Brackets [] +indicate the rule is optional. Braces { } would be used if the rule were not +optional. Brackets and braces can include multiple rules separated by |. The +ellipsis ... indicates unlimited repetition. These constructs are referred to as +<A HREF="../gloss.html#VirtualProduction">"virtual productions"</A>. eof is defined at <A HREF="#Line32">line 32</A>. + +</P> +<P><A HREF="#Line10">Lines 10 and 11</A> taken together specify that this +<A HREF="../gloss.html#Grammar">grammar</A> describes a possibly empty sequence of lines terminated with an eof +character. Each line contains an optional "calculation" followed by a +newline character. +</P> +<P><A HREF="#Line13" NAME="Note13">Line 13</A>. To assign the value of a token +(stored on the parser value stack) to a C variable for use in a semantic action, +or <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A>, simply follow the token name with a colon and the name +of the variable. +</P> +<P>Short form reduction procedures are simple C or C++ expressions terminated +with a semicolon. They cannot include a newline character. The name of the C +variable is local to this particular procedure. Normally the value of the +reduction procedure is assigned to the token on the left side of the <A HREF="../gloss.html#Production">production</A>. +In this case, since calculation is of type "void", the result of the +printf call is discarded. 
+</P> +<P><A HREF="#Line14" NAME="Note14">Line 14.</A> When <A HREF="../gloss.html#ReductionProcedure">reduction procedures</A> +won't fit on a single line or are more complex than a single expression, they +can be enclosed in braces { }. Use a return statement to return a value. +</P> +<P><A HREF="#Line16" NAME="Note16">Line 16.</A> The error token can be used +to <A HREF="../gloss.html#Resynchronization">resynchronize</A> a parser after +encountering a syntax error. It works more or +less like the error token in YACC. In this case it matches any portion of a "calculation" +up to a syntax error and then everything up to the next newline, as determined +by the <A HREF="../gloss.html#Production">production</A> on <A HREF="#Line11">line 11</A>. AnaGram also provides an +alternative form of error continuation called "automatic resynchronization" +which uses a heuristic approach derived from the <A HREF="../gloss.html#Grammar">grammar</A>. By default, AnaGram +<A HREF="../gloss.html#Parser">parsers</A> provide syntax error diagnostics. The user may provide his own if he +wishes. +</P> +<P><A HREF="#Line18" NAME="Note18">Line 18.</A> If a <A HREF="../gloss.html#GrammarRule">grammar rule</A> does not +have a <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A>, the value of the first token in the rule is assigned +to the token on the left side of the <A HREF="../gloss.html#Production">production</A>. +</P> +<P><A HREF="#Line19" NAME="Note19">Line 19.</A> Since the default type +specification given on <A HREF="#Line4">line 4</A> was "double", x +and t have type double, and the <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A> returns their sum, also +double. +</P> +<P><A HREF="#Line24" NAME="Note24">Line 24.</A> Note that in the interest of +simplicity, this <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A> omits any provision for divide by zero +errors. 
+</P> +<P><A HREF="#Line31" NAME="Note31">Line 31.</A> Definition statements may be +used to provide shorthand names. '0-9' is a character range, as discussed above. + +</P> +<P><A HREF="#Line32" NAME="Note32">Line 32. </A> Input characters can also be +defined using decimal, octal or hex notation. They are not limited to any +particular range, so that it is possible to define the end of file token as the +standard stream I/O end of file value. +</P> +<P><A HREF="#Line33" NAME="Note33">Line 33.</A> Note that AnaGram permits +embedded blanks in token names. +</P> +<P><A HREF="#Line34" NAME="Note34">Line 34.</A> The set consisting of blank, +tab, return, form feed or vertical tab. +</P> +<P><A HREF="#Line35" NAME="Note35">Line 35.</A> Keywords are strings of +characters enclosed in double quotes. Standard C rules apply for literal +strings. Keywords stand outside the character space and are recognized in +preference to individual characters. +</P> +<P> +~ indicates the <A HREF="../gloss.html#SetComplement">complement</A> of a character set, so that ~eof is any character +except end of file. The <A HREF="../gloss.html#Universe">character universe</A> is the set of characters on the range +0..255 unless there are characters outside this range, in which case it is +extended to the smallest contiguous range which includes the outside characters. + ?... allows zero or more comment characters. This rule describes a standard C +comment (no nesting allowed). +</P> +<P><A HREF="#Line36" NAME="Note36">Line 36.</A> The value of name is an int, +an index into the value table. +</P> +<P><A HREF="#Line37" NAME="Note37">Line 37.</A> The '+' is <A HREF="../gloss.html#SetUnion">set union</A>. +Therefore c is any alphabetic character. +</P> +<P><A HREF="#Line50" NAME="Note50">Line 50.</A> If you don't have any embedded +C in your syntax file, AnaGram will create a main program automatically. 
Since +there was already embedded C at line 1, AnaGram won't automatically create a +main program, so we need to define one explicitly. +</P> +<P><A HREF="#Line51" NAME="Note51">Line 51.</A> The default function name for +the <A HREF="../gloss.html#Parser">parser</A> is taken from the file name, in lower case. There is a configuration +parameter available to set it to something else if necessary. Lacking any +contrary specification, the parser will read its input from stdin. +</P> + + +<P> +<BR> + +<IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------" + WIDTH=1010 HEIGHT=2 > +<P> +<IMG ALIGN="right" SRC="../images/pslrb6d.gif" ALT="Parsifal Software" + WIDTH=181 HEIGHT=25> +<BR CLEAR="right"> +<P> + +Back to <A HREF="../index.html">Index</A> +<P> +<ADDRESS><FONT SIZE="-1"> + AnaGram parser generator - examples<BR> + Annotated four function calculator<BR> + Copyright © 1993-1999, Parsifal Software. <BR> + All Rights Reserved.<BR> +</FONT></ADDRESS> + +</BODY> +</HTML>