comparison doc/misc/html/summary.html @ 0:13d2b8934445

Import AnaGram (near-)release tree into Mercurial.
author David A. Holland
date Sat, 22 Dec 2007 17:52:45 -0500
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
2 <HTML>
3 <HEAD>
4 <TITLE>Summary of AnaGram Notation</TITLE>
5 </HEAD>
6
7 <BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
8 TEXT="#000000" LINK="#0033CC"
9 VLINK="#CC0033" ALINK="#CC0099">
10
11 <P>
12
13 <IMG ALIGN="right" SRC="images/agrsl6c.gif" ALT="AnaGram"
14 WIDTH=124 HEIGHT=30>
15 <BR CLEAR="all">
16 Back to <A HREF="index.html">Index</A>
17 <P>
18 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
19 WIDTH=1010 HEIGHT=2 >
20
21 <BR CLEAR="all">
22
23
24 <H1 ALIGN="LEFT">Summary of AnaGram Notation</H1>
25 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
26 WIDTH=1010 HEIGHT=2 >
27 </P>
28 <BR>
29
30 The rules for using AnaGram are given in Chapters 8 and 9 of
31 the AnaGram User's Guide. This page contains a brief summary.
32 Section headings and terms which appear in <b> bold face</b> can be found
33 in the online <b> Help Topics</b>.
34 <P>
35 <BR>
36
37 <H2>Lexical Conventions</H2>
38 AnaGram allows the free use of spaces, tabs and <b> comments</b>.
39 Both C style and C++ style comments are allowed. Blank lines are
40 allowed, but only <i>between</i> statements.
41 <p>
42 AnaGram statements may continue onto following lines as long
43 as they are clearly incomplete. Normally this rule is satisfied by
44 dangling punctuation or open parentheses, brackets, or braces. In no
45 case can a statement continue over a blank line.
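<P>
A minimal sketch of these conventions, using the production notation summarized under Productions. The token names are hypothetical and assumed to be defined elsewhere in the syntax file:
<PRE>
/* A C style comment */
// A C++ style comment

assignment
 -&gt; name, '=',          // the dangling comma leaves the statement incomplete,
    expression, ';'     // so it continues onto this line
</PRE>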
46 <P>
47 <BR>
48
49 <H2>Names</H2>
50 Symbol names must begin with a letter or underscore, and may
51 contain letters, digits, or underscores. They may also contain embedded spaces, tabs, and comments. Any sequence of embedded
52 white space, however, is replaced by a single blank character.
53 <p>
54 The names <b> eof</b>, <b> error</b>, and <b> grammar</b> have special meanings.
55 <P>
56 <BR>
57
58 <H2>Character Representations</H2>
59 You may represent a character using the same rules as for
60 <b> character constants</b> in C. You may also use signed integers in
61 decimal, octal, or hexadecimal format, again following the
62 rules for C. You may specify control characters using ^, e.g., ^C.
63 <P>
64 <BR>
65
66 <H2>Character Ranges</H2>
67 Character ranges may be specified either in the form 'a-z' or
68 as two simple characters separated by "..", e.g., 32..255.
69 <P>
70 <BR>
71
72 <H2>Character Sets</H2>
73 Use the following operators for more complex character sets:
74 <DL>
75 <DT>Set <b> union</b></DT><DD>A + B</DD>
76 <DT>Set <b> difference</b></DT><DD>A - B</DD>
77 <DT>Set <b> intersection</b></DT><DD>A &amp; B</DD>
78 <DT>Set <b> complement</b></DT><DD>~A</DD>
79 </DL>
80
81 AnaGram interprets a single character to mean the set containing
82 only the character itself.
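<P>
The following sketch combines the character representations, ranges, and set operators described above, using definition statements (see Definitions). The names are illustrative only, and each expression uses a single operator so that it does not depend on operator precedence:
<PRE>
digit       = '0-9'               // range in quoted form
printable   = 32..126             // range given by two integer character codes
letter      = 'a-z' + 'A-Z'       // union
not digit   = printable - digit   // difference; the name contains an embedded space
upper case  = letter &amp; 'A-Z'      // intersection
unprintable = ~printable          // complement
newline     = '\n'                // a single character stands for a one-element set
interrupt   = ^C                  // control character
</PRE>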
83 <P>
84 <BR>
85
86 <H2>Keywords</H2>
87 A character string enclosed in double quotes is a <B>keyword</B>. The
88 rules for writing keyword strings are the same as for literal strings
89 in C. AnaGram parsers use special lookahead logic to recognize
90 keywords, so a keyword is <i>not</i>
91 equivalent to the corresponding sequence of single characters.
92 <P>
93 <BR>
94
95 <H2>Tokens</H2> The units of a grammar are called <A
96 HREF="gloss.html#Token">tokens</A>. <A HREF="gloss.html#Terminal">
97 Terminal tokens </A>may be <b> character sets</b>, <b> keywords</b>,
98 <b> immediate actions</b>, or <b> virtual productions</b>. <A
99 HREF="gloss.html#Nonterminal"> Nonterminal tokens</A> are defined
100 in terms of other tokens by means of productions. <P> <BR>
101
102 <H2>Productions</H2>
103 A <A HREF="gloss.html#Production">production</A> consists of one or more
104 token names on the left, an arrow ( <CODE>-&gt;</CODE> ), and a
105 <A HREF="gloss.html#GrammarRule">
106 grammar rule</A> on the right. A production with more than one name on
107 the left is called a <A HREF="gloss.html#SemanticallyDetermined">
108 semantically determined production</A>. Additional productions with
109 the same left side may be joined by using | or another arrow. The
110 arrow, if used, must start a new line.
111
112 <P> If the token on the left
113 side of a production is called <i>grammar</i> or is tagged with a
114 following dollar sign, it is taken to be the <A
115 HREF="gloss.html#GrammarToken"> grammar token</A>, or goal token for
116 the grammar.
117
118 <P> The names on the left side of a
119 production may be preceded by a type cast indicating the data type of
120 the <A HREF="gloss.html#SemanticValue"> semantic value</A> of the
121 named tokens.
122
123 <P> A grammar rule is a sequence of <A HREF="gloss.html#RuleElement">
124 rule elements</A> joined by commas. The rule elements may be <b>
125 character sets</b>, <b> keywords</b>, <b> token names</b>, <b> virtual
126 productions</b>, or <b> immediate actions</b>.
127
128 <P> A <A HREF="gloss.html#VirtualProduction">virtual
129 production</A> is a token name or character set expression followed by
130 ? or ?..., or a sequence of one or more rules, joined by vertical bars
131 ( | ), inside brackets or braces and optionally followed by an
132 ellipsis (...). The ? indicates an optional token. Braces indicate a
133 choice among the listed rules. Brackets indicate an optional choice.
134 The ellipsis represents unlimited repetition. <P> <BR>
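As a hedged illustration of the notation above, the fragment below sketches a small grammar. All token names are hypothetical; <CODE>eof</CODE>, <CODE>letter</CODE>, and <CODE>digit</CODE> are assumed to be character sets defined elsewhere, and <CODE>expression</CODE> is assumed to be defined by further productions:
<PRE>
grammar                              // goal token, by virtue of its name
 -&gt; statement?..., eof               // optional, repeated: zero or more statements

statement
 -&gt; name, '=', expression, ';'
 -&gt; "print", expression, ';'         // "print" is a keyword

name
 -&gt; letter, [letter | digit]...      // brackets plus ellipsis: optional, repeated

sign
 -&gt; {'+' | '-'}                      // braces: a choice among the listed rules
</PRE>
<P>
<BR>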
135
136 <H2>Reduction Procedures</H2>
137 A <A HREF="gloss.html#ReductionProcedure">reduction procedure</A> is a
138 piece of C or C++ code following a grammar rule that is to be executed
139 when the rule is recognized in the parser's input stream. Reduction
140 procedures take one of two forms. The short form is a single expression followed by a
141 semicolon; the long form is a block of code enclosed in braces. In either
142 case the procedure is preceded by an equal sign. Short form procedures may not
143 continue onto another line.
144
145 <P> Reduction procedures may access the
146 <A HREF="gloss.html#SemanticValue"> semantic values</A> of tokens in
147 the grammar rule to which they are attached. To use a token's value,
148 append a colon and a variable name to the token; the reduction procedure
149 refers to the value by that name. In a short form reduction procedure, the
150 value of the expression is assigned to the <A
151 HREF="gloss.html#ReductionToken"> reduction token</A>, the token on
152 the left side of the production. In a long form procedure, use the
153 return statement to assign a value to the token on the left side of the
154 production. <p> An <b> immediate action</b> differs from a reduction
155 procedure in that it may occur in the middle of a grammar rule. To
156 distinguish it from a reduction procedure, it begins with an
157 exclamation point rather than an equal sign. <P> <BR>
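The following sketch shows both forms of reduction procedure under the rules above. The token and variable names are hypothetical, and <CODE>digit</CODE> is assumed to be a character set defined elsewhere:
<PRE>
(int) integer
 -&gt; digit:d                    =d - '0';
 -&gt; integer:n, digit:d         =10*n + d - '0';

(long) sum
 -&gt; integer:i                  =i;
 -&gt; sum:s, '+', integer:i      ={
      long total = s + i;
      return total;
    }
</PRE>
The type casts <CODE>(int)</CODE> and <CODE>(long)</CODE> give the data types of the semantic values. The first three procedures are short form, so each fits on one line; the last is long form and uses a return statement to set the value of <CODE>sum</CODE>.
<P>
<BR>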
158
159 <H2>Definitions</H2>
160 You may assign names to frequently used character sets, virtual
161 productions, keywords, or immediate actions by using a definition
162 statement consisting of a name, an equal sign and the entity to be
163 named.
164 <P>
165 <BR>
166
167 <H2>Configuration Section</H2>
168 A configuration section is a block of special statements enclosed
169 in brackets. These statements either are <b> attribute statements</b> or assign values
170 to <b> configuration parameters</b> or switches, all of which are
171 described in the online help windows.
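<P>
A configuration section might look like the sketch below. The statement and parameter names shown are recalled from the AnaGram documentation rather than taken from this page, so check them against the online help; <CODE>white space</CODE> and <CODE>integer</CODE> are hypothetical token names:
<PRE>
[
  disregard white space          // attribute statement
  lexeme {integer}               // attribute statement
  parser file name = "#.c"       // configuration parameter
  case sensitive                 // configuration switch
]
</PRE>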
172 <P>
173 <BR>
174
175 <H2>Embedded C</H2>
176 You may include C or C++ code to support your reduction
177 procedures at any point in your grammar by enclosing it in braces.
178 The beginning brace must be on a fresh line, and no other statement
179 may follow on the same line as the terminating brace. A
180 block of embedded C at the very beginning of a <b> syntax file</b> is
181 called the <b> C prologue</b>.
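<P>
A minimal sketch of a C prologue follows. The include, variable, and function are hypothetical examples of support code. Note that the opening brace starts a fresh line and nothing follows the closing brace:
<PRE>
{
  #include &lt;stdio.h&gt;
  static int line_count = 0;    /* available to reduction procedures */
  static void report(const char *msg) {
    fprintf(stderr, "line %d: %s\n", line_count, msg);
  }
}
</PRE>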
182
183 <P>
184 <BR>
185
186 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
187 WIDTH=1010 HEIGHT=2 >
188 <P>
189 <IMG ALIGN="right" SRC="images/pslrb6d.gif" ALT="Parsifal Software"
190 WIDTH=181 HEIGHT=25>
191 <BR CLEAR="right">
192 <P>
193 Back to <A HREF="index.html">Index</A>
194 <P>
195 <ADDRESS><FONT SIZE="-1">
196 AnaGram parser generator - documentation<BR>
197 Summary of AnaGram Notation<BR>
198 Copyright &copy; 1993-1999, Parsifal Software. <BR>
199 All Rights Reserved.<BR>
200 </FONT></ADDRESS>
201
202 </BODY>
203 </HTML>