comparison doc/misc/html/examples/ffcex.html @ 0:13d2b8934445

Import AnaGram (near-)release tree into Mercurial.
author David A. Holland
date Sat, 22 Dec 2007 17:52:45 -0500
-1:000000000000 0:13d2b8934445
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
2 <HTML>
3 <HEAD>
4 <TITLE>Four Function Calculator</TITLE>
5
6
7 </HEAD>
8
9 <BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
10 TEXT="#000000" LINK="#0033CC"
11 VLINK="#CC0033" ALINK="#CC0099">
12
13 <P>
14 <IMG ALIGN="right" SRC="../images/agrsl6c.gif" ALT="AnaGram"
15 WIDTH=124 HEIGHT=30>
16
17 <BR CLEAR="all">
18
19 Back to <A HREF="../index.html">Index</A>
20 <P>
21 <IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------"
22 WIDTH=1010 HEIGHT=2 >
23 </P>
24
25
26 <H2>Four Function Calculator:<BR>An Annotated AnaGram Example</H2>
27 <IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------"
28 WIDTH=1010 HEIGHT=2 >
29
30
31 <P>The following example is a complete program: the output produced by AnaGram
32 from this example can be compiled, linked and run without any support modules
33 other than the standard run-time library provided by any C compiler. In the
34 interest of brevity, the example has been kept very simple.
35 </P>
36 <P>FFCALC.SYN implements a simple four function calculator which reads its
37 input from stdin. The calculator has 52 registers, labeled 'a' through 'z' and
38 'A' through 'Z'. FFCALC evaluates arithmetic expressions and assignment
39 statements and prints the results to stdout. The expressions may contain '+',
40 '-', '*', and '/' operators as well as parentheses. In addition, FFCALC supports
41 the free use of white space and C style comments in the input. It also contains
42 complete error handling, including syntax error diagnostics and
43 <A HREF="../gloss.html#Resynchronization">resynchronization</A> after syntax errors.
44 </P>
45 <P> <STRONG>For purposes of annotation, line numbers have been inserted at the left
46 margin.</STRONG> The line numbers are not part of the AnaGram syntax.
47 Immediately following the example are some brief explanatory notes keyed to the
48 line numbers.
49 </P>
50 <PRE>
51 <A HREF="#Note1" NAME="Line1">Line 1:</A> {/* FOUR FUNCTION CALCULATOR: FFCALC.SYN */}
52
53 Line 2: // -- CONFIGURATION SECTION ----------------------------
54 <A HREF="#Note3" NAME="Line3">Line 3:</A> [
55 <A HREF="#Note4" NAME="Line4">Line 4:</A> default token type = double
56 <A HREF="#Note5" NAME="Line5">Line 5:</A> disregard white space
57 <A HREF="#Note6" NAME="Line6">Line 6:</A> lexeme { real}
58 <A HREF="#Note7" NAME="Line7">Line 7:</A> // You could specify traditional engine here
59 Line 8: ]
60
61 <A HREF="#Note9" NAME="Line9">Line 9:</A> // -- FOUR FUNCTION CALCULATOR -------------------------
62 <A HREF="#Note10" NAME="Line10">Line 10:</A> (void) calculator $
63 <A HREF="#Note11" NAME="Line11">Line 11:</A> -&gt; [calculation?, '\n']..., eof
64
65 Line 12: (void) calculation
66 <A HREF="#Note13" NAME="Line13">Line 13:</A> -&gt; expression:x =printf("%g\n",x);
67 <A HREF="#Note14" NAME="Line14">Line 14:</A> -&gt; name:n, '=', expression:x ={
68 Line 15: printf("%c = %g\n",n+'A',value[n]=x);}
69 <A HREF="#Note16" NAME="Line16">Line 16:</A> -&gt; error
70
71 Line 17: expression
72 <A HREF="#Note18" NAME="Line18">Line 18:</A> -&gt; term
73 <A HREF="#Note19" NAME="Line19">Line 19:</A> -&gt; expression:x, '+', term:t = x+t;
74 Line 20: -&gt; expression:x, '-', term:t = x-t;
75
76 Line 21: term
77 Line 22: -&gt; factor
78 Line 23: -&gt; term:t, '*', factor:f = t*f;
79 <A HREF="#Note24" NAME="Line24">Line 24:</A> -&gt; term:t, '/', factor:f = t/f;
80
81 Line 25: factor
82 Line 26: -&gt; name:n = value[n];
83 Line 27: -&gt; real
84 Line 28: -&gt; '(', expression:x, ')' = x;
85 Line 29: -&gt; '-', factor:f = -f;
86
87 Line 30: // -- LEXICAL UNITS ------------------------------------
88 <A HREF="#Note31" NAME="Line31">Line 31:</A> digit = '0-9'
89 <A HREF="#Note32" NAME="Line32">Line 32:</A> eof = -1
90
91 <A HREF="#Note33" NAME="Line33">Line 33:</A> (void) white space
92 <A HREF="#Note34" NAME="Line34">Line 34:</A> -&gt; ' ' + '\t' + '\r' + '\f' + '\v'
93 <A HREF="#Note35" NAME="Line35">Line 35:</A> -&gt; "/*", ~eof?..., "*/" // C style comment
94
95 <A HREF="#Note36" NAME="Line36">Line 36:</A> (int) name
96 <A HREF="#Note37" NAME="Line37">Line 37:</A> -&gt; 'a-z' + 'A-Z':c = c-'A';
97
98 <A NAME="Line38">Line 38:</A> real
99 Line 39: -&gt; integer part:i, '.', fraction part:f = i+f;
100 Line 40: -&gt; integer part, '.'?
101 Line 41: -&gt; '.', fraction part:f = f;
102
103 Line 42: integer part
104 Line 43: -&gt; digit:d = d-'0';
105 Line 44: -&gt; integer part:x, digit:d = 10*x + d-'0';
106
107 Line 45: fraction part
108 Line 46: -&gt; digit:d = (d-'0')/10.;
109 Line 47: -&gt; digit:d, fraction part:f = (d-'0' + f)/10.;
110
111 Line 48: { /* -- EMBEDDED C ---------------------------------- */
112 Line 49: double value[64]; /* registers */
113 <A HREF="#Note50" NAME="Line50">Line 50:</A> int main(void) {
114 <A HREF="#Note51" NAME="Line51">Line 51:</A> ffcalc();
115 Line 52: }
116 Line 53: } // -- END OF EMBEDDED C ------------------------------
117 </PRE>
118 <H3>Notes to example</H3>
119 <P>General note: When an AnaGram <A HREF="../gloss.html#Grammar">grammar</A> is written to use direct character
120 input, the <A HREF="../gloss.html#Terminal">terminal tokens</A> are written as <A HREF="../gloss.html#CharacterSets">character sets</A>. A single character is
121 construed to be the set consisting only of the character itself. Otherwise
122 character sets can be defined by ranges, e.g., 'a-z', or by set expressions
123 using +, -, &amp;, or ~ to represent <A HREF="../gloss.html#SetUnion">union</A>,
124 <A HREF="../gloss.html#SetDifference">difference</A>, <A HREF="../gloss.html#SetIntersection">intersection</A>, or
125 <A HREF="../gloss.html#SetComplement">complement</A> respectively. If the sets used in the grammar are not pairwise
126 disjoint, and they seldom are, AnaGram calculates a disjoint covering of the
127 <A HREF="../gloss.html#Universe">character universe</A>, and extends the grammar appropriately. The semantic value of
128 a terminal token is its ASCII character code, so that semantic distinctions may
129 still be made even when characters are syntactically equivalent.
130 </P>
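<P>The set operators can be pictured with a small C sketch. It assumes a
256-character universe stored as one membership flag per character code. The
<CODE>charset</CODE> type and the <CODE>cs_</CODE> function names are
hypothetical, chosen for illustration only. They are not part of AnaGram and
say nothing about how AnaGram represents sets internally.
</P>

```c
#include <string.h>

/* Hypothetical illustration: a character set over the universe 0..255,
   stored as one membership flag per character code. */
typedef struct { unsigned char member[256]; } charset;

/* A range such as 'a-z'. Assumes 0 <= lo <= hi <= 255. */
static charset cs_range(int lo, int hi) {
    charset s;
    memset(&s, 0, sizeof s);
    for (int c = lo; c <= hi; c++) s.member[c] = 1;
    return s;
}

/* '+' in a set expression: union. */
static charset cs_union(charset a, charset b) {
    for (int c = 0; c < 256; c++) a.member[c] = a.member[c] || b.member[c];
    return a;
}

/* '-' in a set expression: difference. */
static charset cs_difference(charset a, charset b) {
    for (int c = 0; c < 256; c++) a.member[c] = a.member[c] && !b.member[c];
    return a;
}

/* '&' in a set expression: intersection. */
static charset cs_intersection(charset a, charset b) {
    for (int c = 0; c < 256; c++) a.member[c] = a.member[c] && b.member[c];
    return a;
}

/* '~' in a set expression: complement with respect to the universe. */
static charset cs_complement(charset a) {
    for (int c = 0; c < 256; c++) a.member[c] = !a.member[c];
    return a;
}
```

<P>In these terms, the set on line 37 of the example corresponds to
<CODE>cs_union(cs_range('a', 'z'), cs_range('A', 'Z'))</CODE>.
</P>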
131 <P><A HREF="#Line1" NAME="Note1">Line 1.</A> Braces { } are used
132 to denote embedded C or C++ code that should be passed unchanged to the <A HREF="../gloss.html#Parser">parser</A>.
133 Embedded C at the very beginning of the syntax file is placed at the beginning
134 of the parser file. All other embedded C is placed following a set of
135 definitions and declarations AnaGram needs for the code it generates. AnaGram
136 saves up all the <A HREF="../gloss.html#ReductionProcedure">reduction procedures</A>, or semantic actions, and places them
137 after all the embedded C.
138 </P>
139 <P><A HREF="#Line3" NAME="Note3">Line 3.</A> Brackets [ ] are used to denote
140 configuration sections. Configuration sections contain settings for
141 configuration parameters and switches, and a number of attribute statements that
142 provide metasyntactic information.
143 </P>
144 <P><A HREF="#Line4" NAME="Note4">Line 4.</A> This statement sets the default
145 token type for <A HREF="../gloss.html#Nonterminal">nonterminal tokens</A> to double. The default value for "default
146 token type" is void. You can override the type for a particular token using
147 an explicit cast. See
148 <A HREF="#Line10">line 10.</A> The default type for <A HREF="../gloss.html#Terminal">terminal tokens</A> is int.
149 AnaGram uses the token type declarations to set up calls and definitions of
150 <A HREF="../gloss.html#ReductionProcedure">reduction procedures</A> and also to set up the parser value stack.
151 </P>
152 <P><A HREF="#Line5" NAME="Note5">Line 5.</A> The disregard statement tells
153 AnaGram to extend the <A HREF="../gloss.html#Grammar">grammar</A> so that the generated <A HREF="../gloss.html#Parser">parser</A> will skip all
154 instances of white space which are not contained within lexemes. "White
155 space" is a token defined at <A HREF="#Line33">line 33</A>. There is
156 nothing magic about the name. Any other name could have been used.
157 </P>
158 <P><A HREF="#Line6" NAME="Note6">Line 6.</A> The lexeme statement identifies a
159 list of <A HREF="../gloss.html#Nonterminal">nonterminal tokens</A> within which the "disregard" statement is
160 inoperative. real is defined at <A HREF="#Line38">line 38</A>.
161 </P>
162 <P><A HREF="#Line7" NAME="Note7">Line 7.</A> "traditional engine" is
163 a configuration switch. Simply asserting it turns it on. You can also write:
164 traditional engine = ON. To turn off a switch use ~: thus ~traditional engine
165 would guarantee the switch is off, whatever its default value. Alternatively set
166 traditional engine = OFF.
167 </P>
168 <P>AnaGram <A HREF="../gloss.html#Parser">parsers</A> normally use a parsing engine with more than the standard
169 four parsing <A HREF="../gloss.html#ParserAction">actions</A>: shift, reduce, error and accept. The extra actions are
170 compound actions. The result of using these actions is to speed up the parser
171 and to reduce the size of the state table by about fifty per cent. The
172 traditional engine switch turns this optimization off, so the parser will only
173 use the four traditional actions. This is usually done only for clarity when
174 using the File Trace or Grammar Trace options.
175 </P>
176 <P><A HREF="#Line9" NAME="Note9">Line 9.</A> AnaGram supports both C and C++
177 style comments. Nesting of C comments is controlled by the "nest comments"
178 switch.
179 </P>
180 <P><A HREF="#Line10" NAME="Note10">Line 10.</A> An explicit cast can be used
181 to override the token type for <A HREF="../gloss.html#Nonterminal">nonterminal tokens</A>. Types can be just about any C
182 or C++ type, including template types. Basically, the only exceptions are types
183 containing <CODE>( )</CODE> or <CODE>[ ]</CODE>.
184 </P>
185 <P>The simplest way to specify the <A HREF="../gloss.html#GrammarToken">goal token</A> for a <A HREF="../gloss.html#Grammar">grammar</A> is to mark it with
186 a dollar sign. You can also simply name it "grammar", or set the "grammar
187 token" parameter in a configuration section.
188 </P>
189 <P><A HREF="#Line11" NAME="Note11">Line 11.</A> For "<CODE>-&gt;</CODE>"
190 read "produces".
191 A question mark following a token name makes it optional. Tokens in a rule
192 are separated by commas. Multiple rules with the same left side can also be
193 separated with the vertical bar, '<CODE>|</CODE>'.
194 </P>
195 <P>The rules for character constants are the same as for C. Brackets []
196 indicate the rule is optional. Braces { } would be used if the rule were not
197 optional. Brackets and braces can include multiple rules separated by |. The
198 ellipsis ... indicates unlimited repetition. These constructs are referred to as
199 <A HREF="../gloss.html#VirtualProduction">"virtual productions"</A>. eof is defined at <A HREF="#Line32">line 32</A>.
200
201 </P>
202 <P><A HREF="#Line10">Lines 10 and 11</A> taken together specify that this
203 <A HREF="../gloss.html#Grammar">grammar</A> describes a possibly empty sequence of lines terminated with an eof
204 character. Each line contains an optional "calculation" followed by a
205 newline character.
206 </P>
207 <P><A HREF="#Line13" NAME="Note13">Line 13</A>. To assign the value of a token
208 (stored on the parser value stack) to a c variable for use in a semantic action,
209 or <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A>, simply follow the token name with a colon and the name
210 of the variable.
211 </P>
212 <P>Short form reduction procedures are simple C or C++ expressions terminated
213 with a semicolon. They cannot include a newline character. The name of the C
214 variable is local to this particular procedure. Normally the value of the
215 reduction procedure is assigned to the token on the left side of the <A HREF="../gloss.html#Production">production</A>.
216 In this case, since calculation is of type "void", the result of the
217 printf call is discarded.
218 </P>
219 <P><A HREF="#Line14" NAME="Note14">Line 14.</A> When <A HREF="../gloss.html#ReductionProcedure">reduction procedures</A>
220 won't fit on a single line or are more complex than a single expression, they
221 can be enclosed in braces { }. Use a return statement to return a value.
222 </P>
223 <P><A HREF="#Line16" NAME="Note16">Line 16.</A> The error token can be used
224 to <A HREF="../gloss.html#Resynchronization">resynchronize</A> a parser after
225 encountering a syntax error. It works more or
226 less like the error token in YACC. In this case it matches any portion of a "calculation"
227 up to a syntax error and then everything up to the next newline, as determined
228 by the <A HREF="../gloss.html#Production">production</A> on <A HREF="#Line11">line 11</A>. AnaGram also provides an
229 alternative form of error continuation called "automatic resynchronization"
230 which uses a heuristic approach derived from the <A HREF="../gloss.html#Grammar">grammar</A>. By default, AnaGram
231 <A HREF="../gloss.html#Parser">parsers</A> provide syntax error diagnostics. You may provide your own if you
232 wish.
233 </P>
234 <P><A HREF="#Line18" NAME="Note18">Line 18.</A> If a <A HREF="../gloss.html#GrammarRule">grammar rule</A> does not
235 have a <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A>, the value of the first token in the rule is assigned
236 to the token on the left side of the <A HREF="../gloss.html#Production">production</A>.
237 </P>
238 <P><A HREF="#Line19" NAME="Note19">Line 19.</A> Since the default type
239 specification given on <A HREF="#Line4">line 4</A> was "double", x
240 and t have type double, and the <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A> returns their sum, also
241 double.
242 </P>
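<P>One way to picture the short form procedure on line 19 is as the body of a
small C function whose parameters are the named token values. The sketch below
is only an illustration: the function name <CODE>reduce_sum</CODE> is
hypothetical, and the code AnaGram actually generates may be organized quite
differently.
</P>

```c
/* Hypothetical sketch of the reduction procedure for
     expression -> expression:x, '+', term:t = x+t;
   The parameter and return types follow the default token type, double. */
static double reduce_sum(double x, double t) {
    return x + t;
}
```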
243 <P><A HREF="#Line24" NAME="Note24">Line 24.</A> Note that in the interest of
244 simplicity, this <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A> omits any provision for divide by zero
245 errors.
246 </P>
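<P>A guarded division might look like the following sketch. The helper
<CODE>checked_divide</CODE> is hypothetical and not part of FFCALC.SYN, and
returning 0 after reporting the error is an arbitrary choice made for this
illustration.
</P>

```c
#include <stdio.h>

/* Hypothetical helper, not part of FFCALC.SYN: divide t by f, reporting
   a zero divisor instead of performing the division. */
static double checked_divide(double t, double f) {
    if (f == 0.0) {
        fprintf(stderr, "divide by zero\n");
        return 0.0;   /* arbitrary placeholder result */
    }
    return t / f;
}
```

<P>Since a short form reduction procedure is a C expression, the procedure on
line 24 could presumably call such a helper in place of <CODE>t/f</CODE>,
provided the helper is defined in embedded C.
</P>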
247 <P><A HREF="#Line31" NAME="Note31">Line 31.</A> Definition statements may be
248 used to provide shorthand names. '0-9' is a character range, as discussed above.
249
250 </P>
251 <P><A HREF="#Line32" NAME="Note32">Line 32.</A> Input characters can also be
252 defined using decimal, octal or hex notation. They are not limited to any
253 particular range, so that it is possible to define the end of file token as the
254 standard stream I/O end of file value.
255 </P>
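<P>The C standard requires only that <CODE>EOF</CODE> be a negative
<CODE>int</CODE>. The value -1 used on line 32 is what most C libraries
define, but that is an assumption about the compiler in use. The sketch below
shows one way to check it. The names <CODE>GRAMMAR_EOF</CODE> and
<CODE>eof_matches_grammar</CODE> are hypothetical.
</P>

```c
#include <stdio.h>

/* The value given to the eof token on line 32 of the example. */
enum { GRAMMAR_EOF = -1 };

/* Returns 1 if this C library's EOF equals the value the grammar
   assumes, and 0 otherwise. */
static int eof_matches_grammar(void) {
    return EOF == GRAMMAR_EOF;
}
```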
256 <P><A HREF="#Line33" NAME="Note33">Line 33.</A> Note that AnaGram permits
257 embedded blanks in token names.
258 </P>
259 <P><A HREF="#Line34" NAME="Note34">Line 34.</A> This is the set consisting of blank,
260 tab, return, form feed, and vertical tab.
261 </P>
262 <P><A HREF="#Line35" NAME="Note35">Line 35.</A> Keywords are strings of
263 characters enclosed in double quotes. Standard C rules apply for literal
264 strings. Keywords stand outside the character space and are recognized in
265 preference to individual characters.
266 </P>
267 <P>
268 ~ indicates the <A HREF="../gloss.html#SetComplement">complement</A> of a character set, so that ~eof is any character
269 except end of file. The <A HREF="../gloss.html#Universe">character universe</A> is the set of characters in the range
270 0..255 unless the grammar uses characters outside this range, in which case it is
271 extended to the smallest contiguous range which includes the outside characters.
272 ?... allows zero or more comment characters. This rule describes a standard C
273 comment (no nesting allowed).
274 </P>
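<P>The non-nesting behavior can be illustrated in plain C: the comment ends at
the first occurrence of the closing keyword, whatever came before it. The
helper <CODE>skip_c_comment</CODE> is hypothetical and operates on a string
rather than on parser input. It is a sketch of the rule's effect, not of how
an AnaGram parser works internally.
</P>

```c
#include <string.h>

/* Hypothetical helper: given text that begins with a comment opener,
   return a pointer just past the first comment closer, or NULL if the
   text does not start with a comment or the comment is unterminated. */
static const char *skip_c_comment(const char *s) {
    if (strncmp(s, "/*", 2) != 0) return NULL;
    const char *end = strstr(s + 2, "*/");   /* first closer wins */
    return end ? end + 2 : NULL;
}
```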
275 <P><A HREF="#Line36" NAME="Note36">Line 36.</A> The value of name is an int,
276 an index into the value table.
277 </P>
278 <P><A HREF="#Line37" NAME="Note37">Line 37.</A> The '+' is <A HREF="../gloss.html#SetUnion">set union</A>.
279 Therefore c is any alphabetic character.
280 </P>
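<P>The index computation can be checked in isolation. The sketch below assumes
an ASCII character set, in which the upper case letters map to 0 through 25
and the lower case letters to 32 through 57. That would explain why the value
table on line 49 has 64 entries although only 52 registers are used. The
function name <CODE>register_index</CODE> is hypothetical.
</P>

```c
/* Mirror of the reduction procedure on line 37: map a letter to its
   index in the value table. Assumes ASCII character codes. */
static int register_index(int c) {
    return c - 'A';
}
```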
281 <P><A HREF="#Line50" NAME="Note50">Line 50.</A> If you don't have any embedded
282 C in your syntax file, AnaGram will create a main program automatically. Since
283 there was already embedded C at line 1, AnaGram won't automatically create a
284 main program, so we need to define one explicitly.
285 </P>
286 <P><A HREF="#Line51" NAME="Note51">Line 51.</A> The default function name for
287 the <A HREF="../gloss.html#Parser">parser</A> is taken from the file name, in lower case. There is a configuration
288 parameter available to set it to something else if necessary. Lacking any
289 contrary specification, the parser will read its input from stdin.
290 </P>
291
292
293 <P>
294 <BR>
295
296 <IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------"
297 WIDTH=1010 HEIGHT=2 >
298 <P>
299 <IMG ALIGN="right" SRC="../images/pslrb6d.gif" ALT="Parsifal Software"
300 WIDTH=181 HEIGHT=25>
301 <BR CLEAR="right">
302 <P>
303
304 Back to <A HREF="../index.html">Index</A>
305 <P>
306 <ADDRESS><FONT SIZE="-1">
307 AnaGram parser generator - examples<BR>
308 Annotated four function calculator<BR>
309 Copyright &copy; 1993-1999, Parsifal Software. <BR>
310 All Rights Reserved.<BR>
311 </FONT></ADDRESS>
312
313 </BODY>
314 </HTML>