comparison doc/misc/html/summary.html @ 0:13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
author | David A. Holland |
---|---|
date | Sat, 22 Dec 2007 17:52:45 -0500 |
parents | |
children |
-1:000000000000 | 0:13d2b8934445 |
---|---|
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> | |
2 <HTML> | |
3 <HEAD> | |
4 <TITLE>Summary of AnaGram Notation</TITLE> | |
5 </HEAD> | |
6 | |
7 <BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif" | |
8 TEXT="#000000" LINK="#0033CC" | |
9 VLINK="#CC0033" ALINK="#CC0099"> | |
10 | |
11 <P> | |
12 | |
13 <IMG ALIGN="right" SRC="images/agrsl6c.gif" ALT="AnaGram" | |
14 WIDTH=124 HEIGHT=30> | |
15 <BR CLEAR="all"> | |
16 Back to <A HREF="index.html">Index</A> | |
17 <P> | |
18 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------" | |
19 WIDTH=1010 HEIGHT=2 > | |
20 | |
21 <BR CLEAR="all"> | |
22 | |
23 | |
24 <H1 ALIGN="LEFT">Summary of AnaGram Notation</H1> | |
25 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------" | |
26 WIDTH=1010 HEIGHT=2 > | |
27 </P> | |
28 <BR> | |
29 | |
30 The rules for using AnaGram are given in Chapters 8 and 9 of | |
31 the AnaGram User's Guide. This page contains a brief summary. | |
32 Section headings and terms which appear in <b> bold face</b> can be found | |
33 in the online <b> Help Topics</b>. | |
34 <P> | |
35 <BR> | |
36 | |
37 <H2>Lexical Conventions</H2> | |
38 AnaGram allows the free use of spaces, tabs and <b> comments</b>. | |
39 Both C style and C++ style comments are allowed. Blank lines are | |
40 allowed, but only <i>between</i> statements. | |
41 <p> | |
42 AnaGram statements may continue onto following lines as long | |
43 as they are clearly incomplete. Normally this rule is satisfied by | |
44 dangling punctuation or open parentheses, brackets, or braces. In no | |
45 case can a statement continue over a blank line. | |
46 <P> | |
47 <BR> | |
48 | |
49 <H2>Names</H2> | |
50 Symbol names must begin with a letter or underscore, and may | |
51 contain letters, digits, or underscores. They may also contain embedded spaces, tabs, and comments. Any sequence of embedded | |
52 white space, however, is replaced by a single blank character. | |
53 <p> | |
54 The names <b> eof</b>, <b> error</b>, and <b> grammar</b> have special meanings. | |
55 <P> | |
56 <BR> | |
57 | |
58 <H2>Character Representations</H2> | |
59 You may represent a character using the same rules as for | |
60 <b> character constants</b> in C. You may also use signed integers, using | |
61 decimal, octal, or hexadecimal format, again following the | |
62 rules for C. You may specify control characters using ^, e.g., ^C. | |
63 <P> | |
64 <BR> | |
65 | |
66 <H2>Character Ranges</H2> | |
67 Character ranges may be specified either in the form 'a-z' or | |
68 with two simple characters separated by "..", e.g., 32..255. | |
69 <P> | |
70 <BR> | |
71 | |
72 <H2>Character Sets</H2> | |
73 Use the following operators for more complex character sets: | |
74 <DL> | |
75 <DT>Set <b> union</b></DT><DD>A + B</DD> | |
76 <DT>Set <b> difference</b></DT><DD>A - B</DD> | |
77 <DT>Set <b> intersection</b></DT><DD>A & B</DD> | |
78 <DT>Set <b> complement</b></DT><DD>~A</DD> | |
79 </DL> | |
80 | |
81 AnaGram interprets a single character to mean the set containing | |
82 only the character itself. | |
83 <P> | |
84 <BR> | |
85 | |
86 <H2>Keywords</H2> | |
87 A character string enclosed in double quotes is a <B>keyword</B>. The | |
88 rules for writing keyword strings are the same as for literal strings | |
89 in C. AnaGram parsers use special lookahead logic to recognize | |
90 keywords. A keyword is therefore <i>not</i> | |
91 equivalent to the corresponding sequence of single characters. | |
92 <P> | |
93 <BR> | |
94 | |
95 <H2>Tokens</H2> The units of a grammar are called <A | |
96 HREF="gloss.html#Token">tokens</A>. <A HREF="gloss.html#Terminal"> | |
97 Terminal tokens </A>may be <b> character sets</b>, <b> keywords</b>, | |
98 <b> immediate actions</b>, or <b> virtual productions</b>. <A | |
99 HREF="gloss.html#Nonterminal"> Nonterminal tokens</A> are defined | |
100 in terms of other tokens by means of productions. <P> <BR> | |
101 | |
102 <H2>Productions</H2> | |
103 A <A HREF="gloss.html#Production">production</A> consists of one or more | |
104 token names on the left, an arrow ( <CODE>-></CODE> ), and a | |
105 <A HREF="gloss.html#GrammarRule"> | |
106 grammar rule</A> on the right. A production with more than one name on | |
107 the left is called a <A HREF="gloss.html#SemanticallyDetermined"> | |
108 semantically determined production</A>. Additional productions with | |
109 the same left side may be joined by using | or another arrow. The | |
110 arrow, if used, must start a new line. | |
111 | |
112 <P> If the token on the left | |
113 side of a production is called <i>grammar</i> or is tagged with a | |
114 following dollar sign, it is taken to be the <A | |
115 HREF="gloss.html#GrammarToken"> grammar token</A>, or goal token for | |
116 the grammar. | |
117 | |
118 <P> The names on the left side of a | |
119 production may be preceded by a type cast indicating the data type of | |
120 the <A HREF="gloss.html#SemanticValue"> semantic value</A> of the | |
121 named tokens. | |
122 | |
123 <P> A grammar rule is a sequence of <A HREF="gloss.html#RuleElement"> | |
124 rule elements</A> joined by commas. The rule elements may be <b> | |
125 character sets</b>, <b> keywords</b>, <b> token names</b>, <b> virtual | |
126 productions</b>, or <b> immediate actions</b>. | |
127 | |
128 <P> A <A HREF="gloss.html#VirtualProduction">virtual | |
129 production</A> is a token name or character set expression followed by | |
130 ? or ?..., or a sequence of one or more rules, joined by vertical bars | |
131 ( | ) , inside brackets or braces and optionally followed by an | |
132 ellipsis (...). The ? indicates an optional token. Braces indicate a | |
133 choice among the listed rules. Brackets indicate an optional choice. | |
134 The ellipsis represents unlimited repetition. <P> <BR> | |
135 | |
136 <H2>Reduction Procedures</H2> | |
137 A <A HREF="gloss.html#ReductionProcedure">reduction procedure</A> is a | |
138 piece of C or C++ code following a grammar rule that is to be executed | |
139 when the rule is recognized in the parser's input stream. Reduction | |
140 procedures may be short form: a single expression followed by a | |
141 semicolon, or long form: a block of code enclosed in braces. In either | |
142 case they are preceded by an equal sign. Short form procedures may not | |
143 continue onto another line. | |
144 | |
145 <P> Reduction procedures may access the | |
146 <A HREF="gloss.html#SemanticValue"> semantic values</A> of tokens in | |
147 the grammar rule to which they are attached. After each token whose value | |
148 is needed, append a colon and the variable name used for the token value | |
149 in the reduction procedure. In a short form reduction procedure, the | |
150 value of the expression is assigned to the <A | |
151 HREF="gloss.html#ReductionToken"> reduction token</A>, the token on | |
152 the left side of the production. In a long form procedure, use the | |
153 return statement to assign a value to the token on the left side of the | |
154 production. <p> An <b> immediate action</b> differs from a reduction | |
155 procedure in that it may occur in the middle of a grammar rule. To | |
156 distinguish it from a reduction procedure, it begins with an | |
157 exclamation point rather than an equal sign. <P> <BR> | |
158 | |
159 <H2>Definitions</H2> | |
160 You may assign names to frequently used character sets, virtual | |
161 productions, keywords, or immediate actions by using a definition | |
162 statement consisting of a name, an equal sign and the entity to be | |
163 named. | |
164 <P> | |
165 <BR> | |
166 | |
167 <H2>Configuration Section</H2> | |
168 A configuration section is a block of special statements enclosed | |
169 in brackets. These statements are either <b> attribute statements</b> or assignments of values | |
170 to <b> configuration parameters</b> or switches, all of which are | |
171 described in online help windows. | |
172 <P> | |
173 <BR> | |
174 | |
175 <H2>Embedded C</H2> | |
176 You may include C or C++ code to support your reduction | |
177 procedures at any point in your grammar by enclosing it in braces. | |
178 The opening brace must start a new line, and no other statement | |
179 may follow on the same line as the closing brace. A | |
180 block of embedded C at the very beginning of a <b> syntax file</b> is | |
181 called the <b> C prologue</b>. | |
182 | |
183 <P> | |
184 <BR> | |
185 | |
186 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------" | |
187 WIDTH=1010 HEIGHT=2 > | |
188 <P> | |
189 <IMG ALIGN="right" SRC="images/pslrb6d.gif" ALT="Parsifal Software" | |
190 WIDTH=181 HEIGHT=25> | |
191 <BR CLEAR="right"> | |
192 <P> | |
193 Back to <A HREF="index.html">Index</A> | |
194 <P> | |
195 <ADDRESS><FONT SIZE="-1"> | |
196 AnaGram parser generator - documentation<BR> | |
197 Summary of AnaGram Notation<BR> | |
198 Copyright © 1993-1999, Parsifal Software. <BR> | |
199 All Rights Reserved.<BR> | |
200 </FONT></ADDRESS> | |
201 | |
202 </BODY> | |
203 </HTML> |
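
The set operators listed under "Character Sets" (A + B, A - B, A & B, ~A) follow ordinary set algebra over character codes. The C sketch below models that algebra only; it is not AnaGram's implementation. The `CharSet` type and every `charset_*` function name are hypothetical, and the 0..255 code range is an assumption taken from the 32..255 example in the summary.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model: one membership flag per character code 0..255. */
typedef struct { bool has[256]; } CharSet;

/* Range such as 32..255 or 'a-z': every code from lo to hi inclusive. */
static CharSet charset_range(int lo, int hi) {
    CharSet s;
    assert(0 <= lo && lo <= hi && hi <= 255);
    memset(&s, 0, sizeof s);
    for (int c = lo; c <= hi; c++) s.has[c] = true;
    return s;
}

/* A single character means the set containing only that character. */
static CharSet charset_single(int c) { return charset_range(c, c); }

/* A + B: codes that are in either set. */
static CharSet charset_union(CharSet a, CharSet b) {
    for (int c = 0; c < 256; c++) a.has[c] = a.has[c] || b.has[c];
    return a;
}

/* A - B: codes that are in A but not in B. */
static CharSet charset_difference(CharSet a, CharSet b) {
    for (int c = 0; c < 256; c++) a.has[c] = a.has[c] && !b.has[c];
    return a;
}

/* A & B: codes that are in both sets. */
static CharSet charset_intersection(CharSet a, CharSet b) {
    for (int c = 0; c < 256; c++) a.has[c] = a.has[c] && b.has[c];
    return a;
}

/* ~A: every code in 0..255 that is not in A. */
static CharSet charset_complement(CharSet a) {
    for (int c = 0; c < 256; c++) a.has[c] = !a.has[c];
    return a;
}

/* Membership test for a single character code. */
static bool charset_contains(const CharSet *s, int c) {
    return 0 <= c && c <= 255 && s->has[c];
}
```

Under this model, an expression such as 32..255 - 'a-z' corresponds to `charset_difference(charset_range(32, 255), charset_range('a', 'z'))`, and `~'\n'` corresponds to `charset_complement(charset_single('\n'))`.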