comparison doc/misc/html/examples/ffcex.html @ 0:13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
author | David A. Holland |
---|---|
date | Sat, 22 Dec 2007 17:52:45 -0500 |
parents | |
children |
-1:000000000000 | 0:13d2b8934445 |
---|---|
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> | |
2 <HTML> | |
3 <HEAD> | |
4 <TITLE>Four Function Calculator</TITLE> | |
5 | |
6 | |
7 </HEAD> | |
8 | |
9 <BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif" | |
10 TEXT="#000000" LINK="#0033CC" | |
11 VLINK="#CC0033" ALINK="#CC0099"> | |
12 | |
13 <P> | |
14 <IMG ALIGN="right" SRC="../images/agrsl6c.gif" ALT="AnaGram" | |
15 WIDTH=124 HEIGHT=30> | |
16 | |
17 <BR CLEAR="all"> | |
18 | |
19 Back to <A HREF="../index.html">Index</A> | |
20 <P> | |
21 <IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------" | |
22 WIDTH=1010 HEIGHT=2 > | |
23 </P> | |
24 | |
25 | |
26 <H2>Four Function Calculator:<BR>An Annotated AnaGram Example</H2> | |
27 <IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------" | |
28 WIDTH=1010 HEIGHT=2 > | |
29 | |
30 | |
31 <P>The following example is a complete program: the output produced by AnaGram | |
32 from this example can be compiled, linked and run without any support modules | |
33 other than the standard run-time library provided by any C compiler. In the | |
34 interest of brevity, the example has been kept very simple. | |
35 </P> | |
36 <P>FFCALC.SYN implements a simple four function calculator which reads its | |
37 input from stdin. The calculator has 52 registers, labeled 'a' through 'z' and | |
38 'A' through 'Z'. FFCALC evaluates arithmetic expressions and assignment | |
39 statements and prints the results to stdout. The expressions may contain '+', | |
40 '-', '*', and '/' operators as well as parentheses. In addition, FFCALC supports | |
41 the free use of white space and C style comments in the input. It also contains | |
42 complete error handling, including syntax error diagnostics and | |
43 <A HREF="../gloss.html#Resynchronization">resynchronization</A> after syntax errors. | |
44 </P> | |
45 <P> <STRONG>For purposes of annotation, line numbers have been inserted at the left | |
46 margin.</STRONG> The line numbers are not part of the AnaGram syntax. | |
47 Immediately following the example are some brief explanatory notes keyed to the | |
48 line numbers. | |
49 </P> | |
50 <PRE> | |
51 <A HREF="#Note1" NAME="Line1">Line 1:</A> {/* FOUR FUNCTION CALCULATOR: FFCALC.SYN */} | |
52 | |
53 Line 2: // -- CONFIGURATION SECTION ---------------------------- | |
54 <A HREF="#Note3" NAME="Line3">Line 3:</A> [ | |
55 <A HREF="#Note4" NAME="Line4">Line 4:</A> default token type = double | |
56 <A HREF="#Note5" NAME="Line5">Line 5:</A> disregard white space | |
57 <A HREF="#Note6" NAME="Line6">Line 6:</A> lexeme { real} | |
58 <A HREF="#Note7" NAME="Line7">Line 7:</A> // You could specify traditional engine here | |
59 Line 8: ] | |
60 | |
61 <A HREF="#Note9" NAME="Line9">Line 9:</A> // -- FOUR FUNCTION CALCULATOR ------------------------- | |
62 <A HREF="#Note10" NAME="Line10">Line 10:</A> (void) calculator $ | |
63 <A HREF="#Note11" NAME="Line11">Line 11:</A> -> [calculation?, '\n']..., eof | |
64 | |
65 Line 12: (void) calculation | |
66 <A HREF="#Note13" NAME="Line13">Line 13:</A> -> expression:x =printf("%g\n",x); | |
67 <A HREF="#Note14" NAME="Line14">Line 14:</A> -> name:n, '=', expression:x ={ | |
68 Line 15: printf("%c = %g\n",n+'A',value[n]=x);} | |
69 <A HREF="#Note16" NAME="Line16">Line 16:</A> -> error | |
70 | |
71 Line 17: expression | |
72 <A HREF="#Note18" NAME="Line18">Line 18:</A> -> term | |
73 <A HREF="#Note19" NAME="Line19">Line 19:</A> -> expression:x, '+', term:t = x+t; | |
74 Line 20: -> expression:x, '-', term:t = x-t; | |
75 | |
76 Line 21: term | |
77 Line 22: -> factor | |
78 Line 23: -> term:t, '*', factor:f = t*f; | |
79 <A HREF="#Note24" NAME="Line24">Line 24:</A> -> term:t, '/', factor:f = t/f; | |
80 | |
81 Line 25: factor | |
82 Line 26: -> name:n = value[n]; | |
83 Line 27: -> real | |
84 Line 28: -> '(', expression:x, ')' = x; | |
85 Line 29: -> '-', factor:f = -f; | |
86 | |
87 Line 30: // -- LEXICAL UNITS ------------------------------------ | |
88 <A HREF="#Note31" NAME="Line31">Line 31:</A> digit = '0-9' | |
89 <A HREF="#Note32" NAME="Line32">Line 32:</A> eof = -1 | |
90 | |
91 <A HREF="#Note33" NAME="Line33">Line 33:</A> (void) white space | |
92 <A HREF="#Note34" NAME="Line34">Line 34:</A> -> ' ' + '\t' + '\r' + '\f' + '\v' | |
93 <A HREF="#Note35" NAME="Line35">Line 35:</A> -> "/*", ~eof?..., "*/" // C style comment | |
94 | |
95 <A HREF="#Note36" NAME="Line36">Line 36:</A> (int) name | |
96 <A HREF="#Note37" NAME="Line37">Line 37:</A> -> 'a-z' + 'A-Z':c = c-'A'; | |
97 | |
98 <A NAME="Line38">Line 38:</A> real | |
99 Line 39: -> integer part:i, '.', fraction part:f = i+f; | |
100 Line 40: -> integer part, '.'? | |
101 Line 41: -> '.', fraction part:f = f; | |
102 | |
103 Line 42: integer part | |
104 Line 43: -> digit:d = d-'0'; | |
105 Line 44: -> integer part:x, digit:d = 10*x + d-'0'; | |
106 | |
107 Line 45: fraction part | |
108 Line 46: -> digit:d = (d-'0')/10.; | |
109 Line 47: -> digit:d, fraction part:f = (d-'0' + f)/10.; | |
110 | |
111 Line 48: { /* -- EMBEDDED C ---------------------------------- */ | |
112 Line 49: double value[64]; /* registers */ | |
113 <A HREF="#Note50" NAME="Line50">Line 50:</A> int main(void) { | |
114 <A HREF="#Note51" NAME="Line51">Line 51:</A> ffcalc(); return 0; | |
115 Line 52: } | |
116 Line 53: } // -- END OF EMBEDDED C ------------------------------ | |
117 </PRE> | |
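<P>As a rough illustration only, the arithmetic performed by the reduction procedures on lines 42 to 47 can be reproduced in plain C. The function names <CODE>integer_part</CODE> and <CODE>fraction_part</CODE> are hypothetical helpers written for this sketch, not code that AnaGram generates, and they assume the input is a non-empty string of decimal digits.</P>

```c
#include <assert.h>
#include <ctype.h>

/* Mirrors lines 43-44 (left recursion): value = 10*x + (d - '0'). */
double integer_part(const char *digits) {
    double x = 0.0;
    for (; *digits != '\0'; ++digits) {
        assert(isdigit((unsigned char)*digits));
        x = 10 * x + (*digits - '0');
    }
    return x;
}

/* Mirrors lines 46-47 (right recursion): value = (d - '0' + f) / 10. */
double fraction_part(const char *digits) {
    assert(isdigit((unsigned char)digits[0]));
    if (digits[1] == '\0')
        return (digits[0] - '0') / 10.;                          /* line 46 */
    return (digits[0] - '0' + fraction_part(digits + 1)) / 10.;  /* line 47 */
}
```

<P>Under these assumptions, the input text 42.25 corresponds to <CODE>integer_part("42") + fraction_part("25")</CODE>, which is the sum formed on line 39.</P>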
118 <H3>Notes to example</H3> | |
119 <P>General note: When an AnaGram <A HREF="../gloss.html#Grammar">grammar</A> is written to use direct character | |
120 input, the <A HREF="../gloss.html#Terminal">terminal tokens</A> are written as <A HREF="../gloss.html#CharacterSets">character sets</A>. A single character is | |
121 construed to be the set consisting only of the character itself. Otherwise | |
122 character sets can be defined by ranges, e.g., 'a-z', or by set expressions | |
123 using +, -, &, or ~ to represent <A HREF="../gloss.html#SetUnion">union</A>, | |
124 <A HREF="../gloss.html#SetDifference">difference</A>, <A HREF="../gloss.html#SetIntersection">intersection</A>, or | |
125 <A HREF="../gloss.html#SetComplement">complement</A> respectively. If the sets used in the grammar are not pairwise | |
126 disjoint, and they seldom are, AnaGram calculates a disjoint covering of the | |
127 <A HREF="../gloss.html#Universe">character universe</A>, and extends the grammar appropriately. The semantic value of | |
128 a terminal token is the ASCII character code, so that semantic distinctions may | |
129 still be made even when characters are syntactically equivalent. | |
130 </P> | |
131 <P><A HREF="#Line1" NAME="Note1">Line 1.</A> Braces { } are used | |
132 to denote embedded C or C++ code that should be passed unchanged to the <A HREF="../gloss.html#Parser">parser</A>. | |
133 Embedded C at the very beginning of the syntax file is placed at the beginning | |
134 of the parser file. All other embedded C is placed following a set of | |
135 definitions and declarations AnaGram needs for the code it generates. AnaGram | |
136 saves up all the <A HREF="../gloss.html#ReductionProcedure">reduction procedures</A>, or semantic actions, and places them | |
137 after all the embedded C. | |
138 </P> | |
139 <P><A HREF="#Line3" NAME="Note3">Line 3.</A> Brackets [ ] are used to denote | |
140 configuration sections. Configuration sections contain settings for | |
141 configuration parameters and switches, and a number of attribute statements that | |
142 provide metasyntactic information. | |
143 </P> | |
144 <P><A HREF="#Line4" NAME="Note4">Line 4.</A> This statement sets the default | |
145 token type for <A HREF="../gloss.html#Nonterminal">nonterminal tokens</A> to double. The default value for "default | |
146 token type" is void. You can override the type for a particular token using | |
147 an explicit cast. See | |
148 <A HREF="#Line10">line 10.</A> The default type for <A HREF="../gloss.html#Terminal">terminal tokens</A> is int. | |
149 AnaGram uses the token type declarations to set up calls and definitions of | |
150 <A HREF="../gloss.html#ReductionProcedure">reduction procedures</A> and also to set up the parser value stack. | |
151 </P> | |
152 <P><A HREF="#Line5" NAME="Note5">Line 5.</A> The disregard statement tells | |
153 AnaGram to extend the <A HREF="../gloss.html#Grammar">grammar</A> so that the generated <A HREF="../gloss.html#Parser">parser</A> will skip all | |
154 instances of white space which are not contained within lexemes. "White | |
155 space" is a token defined at <A HREF="#Line33">line 33</A>. There is | |
156 nothing magic about the name. Any other name could have been used. | |
157 </P> | |
158 <P><A HREF="#Line6" NAME="Note6">Line 6.</A> The lexeme statement identifies a | |
159 list of <A HREF="../gloss.html#Nonterminal">nonterminal tokens</A> within which the "disregard" statement is | |
160 inoperative. real is defined at <A HREF="#Line38">line 38</A>. | |
161 </P> | |
162 <P><A HREF="#Line7" NAME="Note7">Line 7.</A> "traditional engine" is | |
163 a configuration switch. Simply asserting it turns it on. You can also write: | |
164 traditional engine = ON. To turn off a switch use ~: thus ~traditional engine | |
165 would guarantee the switch is off, whatever its default value. Alternatively set | |
166 traditional engine = OFF. | |
167 </P> | |
168 <P>AnaGram <A HREF="../gloss.html#Parser">parsers</A> normally use a parsing engine with more than the standard | |
169 four parsing <A HREF="../gloss.html#ParserAction">actions</A>: shift, reduce, error and accept. The extra actions are | |
170 compound actions. The result of using these actions is to speed up the parser | |
171 and to reduce the size of the state table by about fifty per cent. The | |
172 traditional engine switch turns this optimization off, so the parser will only | |
173 use the four traditional actions. This is usually only done for clarity when | |
174 using the File Trace or Grammar Trace options described below. | |
175 </P> | |
176 <P><A HREF="#Line9" NAME="Note9">Line 9. </A>AnaGram supports both C and C++ | |
177 style comments. Nesting of C comments is controlled by the "nest comments" | |
178 switch. | |
179 </P> | |
180 <P><A HREF="#Line10" NAME="Note10">Line 10.</A> An explicit cast can be used | |
181 to override the token type for <A HREF="../gloss.html#Nonterminal">nonterminal tokens</A>. Types can be just about any C | |
182 or C++ type, including template types. Basically, the only exceptions are types | |
183 containing <CODE>( )</CODE> or <CODE>[ ]</CODE>. | |
184 </P> | |
185 <P>The simplest way to specify the <A HREF="../gloss.html#GrammarToken">goal token</A> for a <A HREF="../gloss.html#Grammar">grammar</A> is to mark it with | |
186 a dollar sign. You can also simply name it "grammar", or set the "grammar | |
187 token" parameter in a configuration section. | |
188 </P> | |
189 <P><A HREF="#Line11" NAME="Note11">Line 11.</A> For "<CODE>-></CODE>" | |
190 read "produces". | |
191 A question mark following a token name makes it optional. Tokens in a rule | |
192 are separated by commas. Multiple rules with the same left side can also be | |
193 separated with the vertical bar, '<CODE>|</CODE>'. | |
194 </P> | |
195 <P>The rules for character constants are the same as for C. Brackets [] | |
196 indicate the rule is optional. Braces { } would be used if the rule were not | |
197 optional. Brackets and braces can include multiple rules separated by |. The | |
198 ellipsis ... indicates unlimited repetition. These constructs are referred to as | |
199 <A HREF="../gloss.html#VirtualProduction">"virtual productions"</A>. eof is defined at <A HREF="#Line32">line 32</A>. | |
200 | |
201 </P> | |
202 <P><A HREF="#Line10">Lines 10 and 11</A> taken together specify that this | |
203 <A HREF="../gloss.html#Grammar">grammar</A> describes a possibly empty sequence of lines terminated with an eof | |
204 character. Each line contains an optional "calculation" followed by a | |
205 newline character. | |
206 </P> | |
207 <P><A HREF="#Line13" NAME="Note13">Line 13</A>. To assign the value of a token | |
208 (stored on the parser value stack) to a C variable for use in a semantic action, | |
209 or <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A>, simply follow the token name with a colon and the name | |
210 of the variable. | |
211 </P> | |
212 <P>Short form reduction procedures are simple C or C++ expressions terminated | |
213 with a semicolon. They cannot include a newline character. The name of the C | |
214 variable is local to this particular procedure. Normally the value of the | |
215 reduction procedure is assigned to the token on the left side of the <A HREF="../gloss.html#Production">production</A>. | |
216 In this case, since calculation is of type "void", the result of the | |
217 printf call is discarded. | |
218 </P> | |
219 <P><A HREF="#Line14" NAME="Note14">Line 14.</A> When <A HREF="../gloss.html#ReductionProcedure">reduction procedures</A> | |
220 won't fit on a single line or are more complex than a single expression, they | |
221 can be enclosed in braces { }. Use a return statement to return a value. | |
222 </P> | |
223 <P><A HREF="#Line16" NAME="Note16">Line 16.</A> The error token can be used | |
224 to <A HREF="../gloss.html#Resynchronization">resynchronize</A> a parser after | |
225 encountering a syntax error. It works more or | |
226 less like the error token in YACC. In this case it matches any portion of a "calculation" | |
227 up to a syntax error and then everything up to the next newline, as determined | |
228 by the <A HREF="../gloss.html#Production">production</A> on <A HREF="#Line11">line 11</A>. AnaGram also provides an | |
229 alternative form of error continuation called "automatic resynchronization" | |
230 which uses a heuristic approach derived from the <A HREF="../gloss.html#Grammar">grammar</A>. By default, AnaGram | |
231 <A HREF="../gloss.html#Parser">parsers</A> provide syntax error diagnostics. You may provide your own if you | |
232 wish. | |
233 </P> | |
234 <P><A HREF="#Line18" NAME="Note18">Line 18.</A> If a <A HREF="../gloss.html#GrammarRule">grammar rule</A> does not | |
235 have a <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A>, the value of the first token in the rule is assigned | |
236 to the token on the left side of the <A HREF="../gloss.html#Production">production</A>. | |
237 </P> | |
238 <P><A HREF="#Line19" NAME="Note19">Line 19.</A> Since the default type | |
239 specification given on <A HREF="#Line4">line 4</A> was "double", x | |
240 and t have type double, and the <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A> returns their sum, also | |
241 double. | |
242 </P> | |
243 <P><A HREF="#Line24" NAME="Note24">Line 24.</A> Note that in the interest of | |
244 simplicity, this <A HREF="../gloss.html#ReductionProcedure">reduction procedure</A> omits any provision for divide by zero | |
245 errors. | |
246 </P> | |
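<P>One possible way to add the missing check is sketched here in plain C. The helper <CODE>checked_divide</CODE> is a hypothetical name introduced for this illustration. How it would be called from a reduction procedure, and how the error would be reported, are left open.</P>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: divides t by f unless f is zero.
   Returns 1 and stores the quotient on success; returns 0 and leaves
   *result untouched when f is zero. */
int checked_divide(double t, double f, double *result) {
    assert(result != NULL);
    if (f == 0.0)
        return 0;          /* the caller decides how to report the error */
    *result = t / f;
    return 1;
}
```

<P>Returning a status instead of printing a message keeps the helper independent of whatever diagnostic scheme the surrounding program uses.</P>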
247 <P><A HREF="#Line31" NAME="Note31">Line 31.</A> Definition statements may be | |
248 used to provide shorthand names. '0-9' is a character range, as discussed above. | |
249 | |
250 </P> | |
251 <P><A HREF="#Line32" NAME="Note32">Line 32. </A> Input characters can also be | |
252 defined using decimal, octal or hex notation. They are not limited to any | |
253 particular range, so that it is possible to define the end of file token as the | |
254 standard stream I/O end of file value. | |
255 </P> | |
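<P>The value -1 matches the stdio <CODE>EOF</CODE> macro on common C implementations, although the C standard only requires <CODE>EOF</CODE> to be a negative int. The following minimal sketch shows the usual reading loop that relies on this. The function <CODE>count_until_eof</CODE> is a hypothetical example, not part of the generated parser.</P>

```c
#include <assert.h>
#include <stdio.h>

/* Counts the characters read from a stream before end of file. */
long count_until_eof(FILE *in) {
    long n = 0;
    int c;                 /* int, not char, so EOF stays distinguishable */
    assert(in != NULL);
    while ((c = getc(in)) != EOF)
        ++n;
    return n;
}
```

<P>Because the end of file value lies outside the 0..255 range of ordinary characters, it can be given its own token without colliding with any input character.</P>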
256 <P><A HREF="#Line33" NAME="Note33">Line 33.</A> Note that AnaGram permits | |
257 embedded blanks in token names. | |
258 </P> | |
259 <P><A HREF="#Line34" NAME="Note34">Line 34.</A> This rule matches the set consisting of blank, | |
260 tab, carriage return, form feed and vertical tab. | |
261 </P> | |
262 <P><A HREF="#Line35" NAME="Note35">Line 35.</A> Keywords are strings of | |
263 characters enclosed in double quotes. Standard C rules apply for literal | |
264 strings. Keywords stand outside the character space and are recognized in | |
265 preference to individual characters. | |
266 </P> | |
267 <P> | |
268 ~ indicates the <A HREF="../gloss.html#SetComplement">complement</A> of a character set, so that ~eof is any character | |
269 except end of file. The <A HREF="../gloss.html#Universe">character universe</A> is the set of characters on the range | |
270 0..255 unless there are characters outside this range, in which case it is | |
271 extended to the smallest contiguous range which includes the outside characters. | |
272 ?... allows zero or more comment characters. This rule describes a standard C | |
273 comment (no nesting allowed). | |
274 </P> | |
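<P>As a sketch of what "no nesting" means in practice, the following plain C function scans a string in the same spirit: the first closing delimiter ends the comment, even if another opening delimiter appeared inside it. The name <CODE>skip_c_comment</CODE> is hypothetical, and the function works on a string in memory rather than on parser input.</P>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* If s begins with a C style comment, returns a pointer just past its
   closing delimiter. Returns NULL if s does not begin with a comment
   or the comment is never closed. */
const char *skip_c_comment(const char *s) {
    const char *end;
    assert(s != NULL);
    if (strncmp(s, "/*", 2) != 0)
        return NULL;
    end = strstr(s + 2, "*/");    /* first terminator wins: no nesting */
    return end != NULL ? end + 2 : NULL;
}
```

<P>Searching from <CODE>s + 2</CODE> ensures that the star of the opening delimiter cannot also serve as the star of the closing one.</P>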
275 <P><A HREF="#Line36" NAME="Note36">Line 36.</A> The value of name is an int, | |
276 an index into the value table. | |
277 </P> | |
278 <P><A HREF="#Line37" NAME="Note37">Line 37.</A> The '+' is <A HREF="../gloss.html#SetUnion">set union</A>. | |
279 Therefore c is any alphabetic character. | |
280 </P> | |
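<P>The mapping from a register name to an index in the value table can be checked with a small C sketch. It assumes ASCII character codes, and <CODE>register_index</CODE> and <CODE>REGISTER_COUNT</CODE> are hypothetical names used only for this illustration.</P>

```c
#include <assert.h>
#include <ctype.h>

#define REGISTER_COUNT 64   /* same size as value[] on line 49 */

/* Same arithmetic as line 37: index = c - 'A'.
   Assumes ASCII, where 'A'..'Z' are 65..90 and 'a'..'z' are 97..122. */
int register_index(int c) {
    assert(isalpha(c));
    return c - 'A';
}
```

<P>Under ASCII, 'A' through 'Z' map to 0 through 25 and 'a' through 'z' map to 32 through 57, so every index fits in the 64 element table and slots 26 through 31 are simply unused. Adding 'A' back to the index, as line 15 does, recovers the register name.</P>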
281 <P><A HREF="#Line50" NAME="Note50">Line 50.</A> If you don't have any embedded | |
282 C in your syntax file, AnaGram will create a main program automatically. Since | |
283 there was already embedded C at line 1, AnaGram won't automatically create a | |
284 main program, so we need to define one explicitly. | |
285 </P> | |
286 <P><A HREF="#Line51" NAME="Note51">Line 51.</A> The default function name for | |
287 the <A HREF="../gloss.html#Parser">parser</A> is taken from the file name, in lower case. There is a configuration | |
288 parameter available to set it to something else if necessary. Lacking any | |
289 contrary specification, the parser will read its input from stdin. | |
290 </P> | |
291 | |
292 | |
293 <P> | |
294 <BR> | |
295 | |
296 <IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------" | |
297 WIDTH=1010 HEIGHT=2 > | |
298 <P> | |
299 <IMG ALIGN="right" SRC="../images/pslrb6d.gif" ALT="Parsifal Software" | |
300 WIDTH=181 HEIGHT=25> | |
301 <BR CLEAR="right"> | |
302 <P> | |
303 | |
304 Back to <A HREF="../index.html">Index</A> | |
305 <P> | |
306 <ADDRESS><FONT SIZE="-1"> | |
307 AnaGram parser generator - examples<BR> | |
308 Annotated four function calculator<BR> | |
309 Copyright © 1993-1999, Parsifal Software. <BR> | |
310 All Rights Reserved.<BR> | |
311 </FONT></ADDRESS> | |
312 | |
313 </BODY> | |
314 </HTML> |