annotate doc/manual/gl.tex @ 10:5b21f127e957

silence random gcc warning
author David A. Holland
date Mon, 30 May 2022 23:58:14 -0400
parents 13d2b8934445
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
1 \chapter{Glossary}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
2
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
3 % XXX Add:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
4 % parameter assignment (?)
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
5 % parser (state) stack
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
6 % parser value stack
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
7 % (may be defined in some other version of the glossary)
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
8 % (and appear in the help)
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
9 % (there are apparently dangling links to these in some version)
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
10
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
11 \index{Action}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
12 \index{Parser action}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
13 \index{Parsing engine}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
14 \paragraph{Action.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
15 A ``parser action'' is one of the basic executable elements of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
16 parsing engine. In a traditional parsing engine there are four
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
17 actions: the shift action, the reduce action, the accept action, and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
18 the error action. The parsing engines which AnaGram generates use a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
19 substantially greater number of actions, in order to provide certain
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
20 short cuts in the interest of greater speed and greater compactness of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
21 the tables which control the action of the parsing engine.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
22
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
23 \paragraph{Accept Action.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
24 The accept action is one of the four actions of a traditional parsing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
25 engine. The accept action is performed when the parser has succeeded
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
26 in identifying the goal token for the grammar. When the parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
27 executes the accept action, it returns to the calling program.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
28 % , returning the semantic value of the goal token as its value.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
29 The accept action is thus the last action of the parsing engine and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
30 occurs only once for each successful execution of the parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
31
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
32 \paragraph{Associativity.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
33 This is a term used to describe binary operators (q.v.). It describes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
34 how you interpret a sequence of operations which all involve the same
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
35 operator. Consider $a - b - c$, for instance. Here the convention is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
36 that we subtract $b$ from $a$, and then $c$ from the difference. We
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
37 say that subtraction is left associative, since if there is a sequence
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
38 of subtraction operations, the leftmost one is to be performed
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
39 first. As a second example, consider exponentiation. FORTRAN uses
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
40 \agcode{**} as an operator to indicate exponentiation. In order to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
41 conform as closely as possible to ordinary mathematical notation, the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
42 expression \agcode{a**b**c} is interpreted to mean that first
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
43 \agcode{b} is raised to the power \agcode{c}, and then the resultant
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
44 value is used as the power to which \agcode{a} is raised. We say that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
45 exponentiation is right associative since the rightmost of a sequence
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
46 of exponentiation operations is to be performed first. If a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
47 programming language does not allow for unparenthesized sequences of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
48 particular operator, the operator is said to be non-associative.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
49
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
50 Another way to view left and right associativity is as implied
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
51 parenthesization. Left associative operators are parenthesized from
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
52 the left. Right associative operators are parenthesized from the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
53 right. Thus $a - b - c = ((a-b) - c)$ and \agcode{a**b**c = (a**(b**c))}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
54
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
55 AnaGram offers two ways to specify associativity of binary
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
56 operators. The first way is to use conventional Backus-Naur Form
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
57 syntax descriptions. You would use recursion to describe both
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
58 subtraction and exponentiation. Subtraction, since it is left
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
59 associative, uses left recursion, and exponentiation, being right
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
60 associative, uses right recursion. Thus
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
61 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
62 difference -> value | difference, '-', value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
63 exponential -> value | value, "**", exponential
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
64 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
65 could be used to describe differences and exponents.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
66
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
67 The second way to specify associativity is to use an ambiguous grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
68 and precedence declarations. (See Chapter 9.)
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
69
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
70 \paragraph{Backtracking.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
71 In order to make parse tables more compact and parsers faster, it is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
72 common to use default reductions. In case of error, it is necessary to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
73 undo default reductions before diagnostics can be properly
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
74 determined. In AnaGram, this undo operation is called backtracking.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
75
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
76 \index{Backus-Naur form}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
77 \index{BNF}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
78 \paragraph{Backus-Naur Form.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
79 Backus-Naur Form, or BNF, is a conventional notation for describing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
80 context free grammars. AnaGram uses an extended notation for grammars,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
81 which, except for semantically determined productions, can be shown to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
82 be equivalent to BNF. The term ``BNF'' is often used colloquially to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
83 refer to a grammar specification.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
84 % XXX: shouldn't this say ``any grammar specification''?
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
85
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
86 AnaGram's syntax specification language differs from BNF in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
87 following respects:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
88
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
89 \begin{enumerate}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
90
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
91 \item
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
92 In conventional BNF, a symbol not enclosed in angle brackets ($< >$)
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
93 is taken to represent itself literally. In AnaGram, literal character
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
94 representations must be enclosed in single quotes and literal strings
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
95 within double quotes.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
96
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
97 \item
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
98 In BNF, the right side of a production is simply a list of symbols
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
99 without any delimiters or punctuation. AnaGram requires that the rule
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
100 elements which make up a grammar rule, or right side, be joined by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
101 commas.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
102
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
103 \item
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
104 BNF makes no provision for identifying reduction procedures or their
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
105 arguments. AnaGram provides both reduction procedures, introduced by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
106 an ``\agcode{=}'' at the end of the production, and named arguments,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
107 introduced by a ``\agcode{:}'' at the end of any token on the right
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
108 side of the production.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
109
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
110 \item
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
111 AnaGram allows virtual productions to be used freely.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
112
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
113 \item
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
114 BNF is ``pure'' in that if you wish to define a nonterminal token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
115 called ``digit'' to represent the digits from zero to nine, you must
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
116 provide ten explicit productions to define it. AnaGram treats the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
117 concept of ``terminal token'' as used in language theory as an
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
118 abstraction, and interposes character set functionality between actual
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
119 character input and the terminal tokens of BNF. You can define digit
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
120 to be the character range \agcode{'0-9'}, for instance, and AnaGram
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
121 will determine and define precisely the minimum number of productions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
122 necessary, taking into account any other usage of the characters
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
123 \agcode{'0'} through \agcode{'9'} in your grammar. This makes your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
124 grammar more compact, more manageable, and easier to modify.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
125
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
126 \item
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
127 AnaGram allows for semantically determined productions, which provide
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
128 a significant increment in the ease with which semantic analysis may
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
129 be joined to syntactic analysis.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
130
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
131 \end{enumerate}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
132
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
133 \paragraph{Binary operator.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
134 A binary operator is an operator that works on two operands to create
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
135 a result. It is often written between its operands and is sometimes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
136 called an infix operator. In ordinary programming languages
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
137 ``\agcode{+}'' and ``\agcode{-}'' are binary operators.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
138
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
139 \paragraph{Binding.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
140 Most programming languages, rather than executing arithmetic operators
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
141 in simple left to right order, conform to the traditional conventions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
142 of ordinary algebra, which dictate that, except where parenthesis
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
143 indicate otherwise, exponentiation is done before multiplication,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
144 multiplication before addition, and addition before comparison. One
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
145 says that exponentiation is ``more tightly binding'' than
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
146 multiplication, multiplication is more tightly binding than addition,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
147 and so on. The sense of the word here is that the operator binds
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
148 together its operands into a single entity. An operand which falls
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
149 between two different operators is ``bound'' by the more tightly
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
150 binding operator. An operator that is more tightly binding than
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
151 another is also said to have higher precedence.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
152
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
153 \index{Character sets}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
154 \paragraph{Character sets.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
155 One of the traditional problems with syntax directed programming is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
156 that caused by the fact that most formal language specifications have
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
157 productions of the form
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
158 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
159 letter -> a | b | c | d ... | z
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
160 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
161
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
162 Since the letters are not often distinguished in the grammar, this
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
163 large number of essentially identical productions causes a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
164 correspondingly large number of states in the parser. This problem is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
165 often attacked by using a ``lexical scanner'' which simply specifies a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
166 ``letter token'' whose value indicates precisely which letter is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
167 intended. This works fine as long as there is nothing in the grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
168 which distinguishes any letter at all. If there is, and there is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
169 usually some special use of \agcode{h}, \agcode{x}, \agcode{q},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
170 \agcode{o}, \agcode{e}, or even \agcode{a-f}, the situation gets more
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
171 complicated. Often the result is to abandon the use of syntax directed
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
172 programming for elemental units of the language and use a more complex
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
173 lexical scanner which identifies names, numbers, key words and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
174 operators. Such lexical scanners are often built using conventional
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
175 programming languages or simple pattern recognizing languages such as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
176 \agfile{lex}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
177
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
178 AnaGram avoids this problem by incorporation the notion of character
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
179 set into its input specifications. Briefly, the way it works is the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
180 following: You specify the set of characters which make up any
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
181 ostensibly terminal token. AnaGram then analyzes the overlaps among
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
182 these definitions and creates a minimal set of tokens which match your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
183 specifications exactly. For instance, suppose you have the following
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
184 definitions:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
185
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
186 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
187 letter = 'a-z' + 'A-Z'
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
188 hex digit = '0-9' + 'a-f' + 'A-F'
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
189 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
190
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
191 AnaGram will define a token for letter which has two productions:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
192
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
193 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
194 letter -> 'a-f' + 'A-F' | 'g-z' + 'G-Z'
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
195 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
196
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
197 With this technique, you have the freedom to specify your grammar in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
198 the easiest possible manner, without the penalty of an absurdly large,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
199 slow parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
200
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
201 Character sets may be specified in terms of ranges of characters, as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
202 in the above example, as unions, denoted by ``\agcode{+}'', as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
203 intersections, denoted by ``\agcode{\&}'', as differences, denoted by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
204 ``\agcode{-}'', and as complements, using ``\agcode{\~{}}''. You may
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
205 also specify a set consisting of a single character either with a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
206 literal character or with a numeric specification. If you specify a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
207 character numerically, you may use either decimal, octal or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
208 hexadecimal notation, as in C.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
209
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
210 \index{Character universe}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
211 \index{Universe}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
212 \paragraph{Character Universe.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
213 In ordinary set theory, the ``universe'' is the set of all entities
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
214 which may be elements of a set. It must be defined so that the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
215 complement of a set can mean something unambiguous. In AnaGram, the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
216 character universe normally comprises the ASCII characters and the IBM
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
217 extended character set, that is, all characters in the range from 0
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
218 through 255. If, however, you have defined input characters outside
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
219 this range, the character universe will be extended to the least
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
220 character you have used and to the greatest character you have used in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
221 your grammar.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
222 % XXX this should mention character sets, and it should mention that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
223 % it's limited to character sets of size 65536.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
224
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
225 \index{Characteristic rules}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
226 \index{Rule}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
227 \paragraph{Characteristic Rule.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
228 Each parser state is characterized by a particular set of grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
229 rules, and for each such rule, a marked token which is the next token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
230 expected. These rules are called the characteristic rules for the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
231 state. In the course of doing grammar analysis, AnaGram determines the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
232 characteristic rules for each parser state. After analyzing your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
233 grammar, you may inspect the \agwindow{State Definition Table} to see
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
234 the characteristic rules for any state in your parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
235
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
236 \index{Characteristic token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
237 \index{Token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
238 \paragraph{Characteristic Token.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
239 Every state in a parser, except state zero, can be characterized by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
240 the one, unique token which causes a jump to that state. That token is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
241 called the characteristic token of the state, because to get to that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
242 parser state you must have just seen precisely that token in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
243 input. Note that several states could have the same characteristic
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
244 token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
245 % XXX s/one,/one/
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
246
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
247 When you have a list of states, such as is given by the parser state
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
248 stack, it is equivalent to a list of characteristic tokens. This list
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
249 of tokens is the list of tokens that have been recognized so far by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
250 the parser. Some of these tokens, of course, may be nonterminal tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
251 and may thus represent the result of reducing a sequence of previously
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
252 recognized tokens.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
253
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
254 \index{Complement}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
255 \paragraph{Complement of a set.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
256 In set theory, the complement of a set, $S$, is the collection of all
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
257 elements of the character universe which are not elements of $S$. Using
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
258 AnaGram's notation, the complement of \agcode{S} is written
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
259 \agcode{\~{}S}. The complement operator has higher precedence than the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
260 difference, intersection, or union operators.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
261
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
262 \index{Rule}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
263 \index{Completed rule}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
264 \paragraph{Completed Rule.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
265 A ``completed rule'' is a characteristic rule whose index is pointing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
266 beyond the last rule element. In other words, it has been completely
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
267 matched and will be reduced by the next input.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
268
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
269 If a state has more than one completed rule, the decision as to which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
270 to reduce is made based on the next input token. If there is only one
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
271 completed rule in a state, it will be reduced by default unless the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
272 default reductions switch has been reset, i.e., turned off.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
273
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
274 \index{Conflicts}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
275 \paragraph{Conflict.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
276 Conflicts arise during a grammar analysis when AnaGram cannot
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
277 determine how to treat a given input token. There are two sorts of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
278 conflicts: \agterm{shift-reduce} conflicts and \agterm{reduce-reduce}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
279 conflicts. Conflicts may arise either because the grammar is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
280 inherently ambiguous, or simply because the grammar analyzer cannot
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
281 look far enough ahead to resolve the conflict. In the latter case, it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
282 is often possible to rewrite the grammar in such a way as to eliminate
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
283 the conflict. Null productions are a common source of conflict.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
284
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
285 There are a number of ways to deal with conflicts. If you understand
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
286 the conflict well, you may simply choose to ignore it. When AnaGram
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
287 encounters a shift-reduce conflict while building parse tables it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
288 resolves it by choosing the shift action. When AnaGram encounters a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
289 reduce-reduce conflict while building parse tables, it resolves it by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
290 selecting the grammar rule which occurred first in the grammar.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
291
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
292 A second way to deal with conflicts is to set operator precedence
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
293 parameters. If you set these parameters, AnaGram will use them
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
294 preferentially to resolve conflicts. Any conflicts so resolved will be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
295 listed in the Resolved Conflicts window.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
296
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
297 A third way to resolve a conflict is to declare some tokens as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
298 \agparam{sticky} or as \agparam{subgrammar}s. This is particularly
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
299 useful for productions whose sole purpose is to skip over
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
300 uninteresting input.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
301
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
302 The fourth way to deal with conflicts is to rewrite the grammar to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
303 eliminate them. Many people prefer this approach since it yields the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
304 highest level of confidence in the resulting program.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
305
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
306 \index{Context free grammar}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
307 \paragraph{Context Free Grammars.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
308 Context free grammars are grammars wherein the definition of a grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
309 unit, or nonterminal token, does not depend on the context in which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
310 the nonterminal is used. That means that if you define ``widget'' in a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
311 certain way, that definition is valid in all contexts in which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
312 ``widget'' might occur. Context free grammars can be represented in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
313 Backus-Naur Form. AnaGram's syntax specification language has no
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
314 facility for representing grammars which are not context free.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
315
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
316 \index{Default reductions}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
317 \index{Reduction}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
318 \paragraph{Default reductions.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
319 A ``default reduction'' is a parser action which may be used in your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
320 parser in any state which has precisely one completed rule.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
321
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
322 If a given parser state has, among its characteristic rules, exactly
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
323 one completed rule, it is usually faster to reduce it on any input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
324 than to check specifically for correct input before reducing it. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
325 only time this default reduction causes trouble is in the event of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
326 syntax error. In this situation you may get an erroneous reduction.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
327 Normally when you are parsing a file, this is inconsequential because
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
328 you are not going to continue semantic action in the presence of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
329 error. But, if you are using your parser to handle real-time
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
330 interactive input, you have to be able to continue semantic processing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
331 after notifying your user that he has entered erroneous input. In this
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
332 case you would want \agparam{default reductions} to have been turned
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
333 off so that productions are reduced only when there is correct input.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
334
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
335 \index{Difference}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
336 % XXX: \index{Set difference} ?
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
337 \paragraph{Difference of two sets.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
338 In set theory, the difference of two sets, $A$ and $B$, is defined to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
339 be the set of all elements of $A$ that are not elements of $B$. In an
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
340 AnaGram syntax file, you represent the difference of two character
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
341 sets by using the ``\agcode{-}'' operator. Thus the difference of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
342 \agcode{A} and \agcode{B} is \agcode{A - B}. The difference operator
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
343 is left associative.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
344
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
345 \paragraph{Error Action.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
346 The error action is one of the four actions of a traditional parsing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
347 engine. The error action is performed when the parser has encountered
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
348 an input token which is not admissible in the current state. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
349 further behavior of a traditional parser is undefined.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
350
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
351 \index{Expansion rule}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
352 \index{Rule}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
353 \paragraph{Expansion Rule.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
354 In analyzing a grammar, we are often interested in the full range of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
355 input that can be expected at a certain point. The expansion of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
356 token or state shows us all the expected input. An expansion yields a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
357 set of marked rules. Looking at the marked token shows us what input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
358 to expect.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
359
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
360 The set of expansion rules of a (nonterminal) token shows all the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
361 expected input that can occur whenever the token appears in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
362 grammar. The set consists of all the grammar rules produced by the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
363 token, plus all the rules produced by the first token of any rule in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
364 the set. The marked tokens for the expansion rules of a token are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
365 always the first element in the rule.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
366
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
367 The expansion of a state consists of its characteristic rules plus the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
368 expansion rules of the marked token in each characteristic rule.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
369
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
370 \index{Grammar}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
371 \paragraph{Grammar.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
372 Traditionally, a grammar is a set of productions which taken together
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
373 specify precisely a set of acceptable input sequences in terms of an
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
374 abstract set of terminal tokens. The set of acceptable input sequences
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
375 is often called the ``language'' defined by the grammar.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
376
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
377 In AnaGram, the term grammar also includes configuration sections,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
378 definitions of character sets, virtual productions, etc. that augment
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
379 the collection of productions. A grammar is often called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
380 ``syntax'', and the rules of the grammar are often called syntactic
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
381 rules.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
382
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
383 % XXX you can't say that step 1 (of 4) of analysis is analysis.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
384 \paragraph{Grammar Analysis.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
385 The major function of AnaGram is the analysis of context free grammars
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
386 written in a particular variant of Backus-Naur Form.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
387
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
388 The analysis of a grammar proceeds in four stages. In the first, the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
389 input grammar is analyzed and a number of tables are built which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
390 describe all of the productions and components of the grammar.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
391
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
392 In the second stage, AnaGram analyzes all of the character sets
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
393 defined in the grammar, and where necessary, defines auxiliary tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
394 and productions.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
395
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
396 In the third stage, AnaGram identifies all of the states of the parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
397 and builds the go-to table for the parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
398
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
399 In the fourth stage, Anagram identifies reduction tokens for each
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
400 completed grammar rule in each state and checks for conflicts.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
401
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
402 \index{Grammar rule}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
403 \paragraph{Grammar Rule.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
404 A ``grammar rule'' is the right side of a production. It consists of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
405 sequence of rule elements. Each rule element identifies some token,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
406 which can be either a terminal token or nonterminal token. It is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
407 ``matched'' by a corresponding sequence of tokens in the input stream
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
408 to the parser. The left side of the production identifies one or more
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
409 nonterminal tokens, or reduction tokens, to which the rule reduces
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
410 when matched. If there is more than one reduction token, there should
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
411 be a reduction procedure to make the choice.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
412
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
413 \index{Grammar token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
414 \index{Token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
415 \paragraph{Grammar Token.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
416 The grammar token is the token which represents the ``top level'' in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
417 your grammar. Some people refer to it as the ``goal'' or ``goal
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
418 token'' and others as the ``start token''. Whatever it is called, it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
419 is the single token which describes the complete input to your parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
420
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
421 \index{Intersection}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
422 \paragraph{Intersection.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
423 In set theory, the intersection of two sets, $A$ and $B$, is defined
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
424 to be the set of all elements of $A$ which are also elements of $B$.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
425 In an AnaGram syntax file, the intersection of two character sets is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
426 represented with the ``\agcode{\&}'' operator. The intersection
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
427 operator has lower precedence than the complement operator, but higher
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
428 precedence than the union and difference operators. The intersection
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
429 operator is left associative.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
430
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
431 \index{LALR parsing}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
432 \paragraph{LALR parsing.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
433 LALR, or ``lookahead LR'' parsing, is a somewhat less powerful variant
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
434 of LR parsing which produces much smaller parsing tables. Thus an LALR
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
435 parser has very significant advantages in size over its LR
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
436 counterpart. It is possible to construct grammars which can be parsed
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
437 by an LR parser but cannot be parsed by an LALR parser. That is, the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
438 LALR parser will find these grammars to be ambiguous and the LR parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
439 will not. In practice, however, such grammars seem to be rare and the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
440 ambiguities can often be eliminated by minor revisions.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
441
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
442 AnaGram generates LALR parsers and, in addition, uses LALR parsing to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
443 interpret syntax files.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
444
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
445 \index{Lexical scanner}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
446 \paragraph{Lexical Scanner.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
447 A lexical scanner is a program or procedure used as a preprocessor for
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
448 a parser. It scans input characters and lumps them together into
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
449 tokens. Many systems for generating parsers, most notably
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
450 \agfile{yacc}, can scarcely be used without a lexical scanner.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
451
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
452 AnaGram, although it can support the use of a lexical scanner, does
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
453 not need a lexical scanner for most practical problems. AnaGram avoids
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
454 the need for a lexical scanner by using character sets and disregard
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
455 and lexeme statements.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
456
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
457 If your problem does in fact need a lexical scanner, you can use
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
458 AnaGram itself to write it, so you don't need to know different
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
459 languages for the scanner and for the parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
460
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
461 \index{Lookahead token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
462 \index{Token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
463 \paragraph{Lookahead Token.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
464 The current input token to a parser is often called the ``lookahead''
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
465 token. In many situations, it is not shifted into the input buffer
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
466 immediately, but acts like a catalyst to cause a number of rules to be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
467 reduced before it is eventually shifted in.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
468
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
469 \index{LR parsing}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
470 \paragraph{LR Parsing.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
471 An LR($k$) parser is a type of deterministic bottom-up shift-reduce
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
472 parser using at most $k$ lookahead symbols to determine its next
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
473 action at any point in the parse. If ($k$) is omitted, then $k$ is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
474 assumed to be 1. Discussion of the technical details of LR($k$)
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
475 parsing is beyond the scope of this manual. The interested reader may
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
476 consult any of a variety of works covering the subject\footnote{ See,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
477 for example, Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
478 \textit{Compilers: Principles, Techniques, and Tools}. (Reading,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
479 Massachusetts: Addison-Wesley, 1988).}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
480
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
481 AnaGram produces parsers which employ a variant of LR parsing known as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
482 LALR parsing. It is not necessary to understand the details of LR and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
483 LALR parsing theory in order to use AnaGram.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
484
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
485 \index{Marked rule}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
486 \index{Rule}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
487 \paragraph{Marked Rule.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
488 A ``marked rule'' is a grammar rule together with a ``marked token''
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
489 that indicates how much of the rule has already been matched and what
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
490 input should be expected if the remainder of the rule is to be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
491 matched. Each parser state is defined by a small number of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
492 characteristic rules which indicate what matches have already been
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
493 made and what input can be expected. Note that when a state has more
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
494 than one characteristic rule, any two characteristic rules with the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
495 same number of tokens match identically up to the marked token. Even
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
496 if one has fewer tokens to the left than the other, its tokens match
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
497 the other exactly up to the marked token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
498
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
499 When marked rules are displayed in AnaGram windows, the marked token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
500 is displayed in a distinctive font which the developer can select.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
501
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
502 \paragraph{Non-associative.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
503 A binary operator, say \agcode{\#}, is said to be non-associative if
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
504 the sequence \agcode{x \# y \# z} is not permissible. If this sequence
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
505 is to be interpreted as \agcode{(x \# y) \# z}, the operator is said
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
506 to be left associative. If the sequence is to be interpreted as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
507 \agcode{x \# (y \# z)}, the operator is said to be right
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
508 associative. (See \agterm{associativity}.)
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
509
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
510 \index{Nonterminal token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
511 \index{Token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
512 \paragraph{Nonterminal Token.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
513 A ``nonterminal token'' is a token which results from reducing the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
514 sequence of tokens which match a grammar rule and replacing them with
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
515 the appropriate reduction token. Nonterminal tokens are to be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
516 distinguished from terminal tokens or input tokens.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
517
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
518 \index{Null productions}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
519 \index{Production}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
520 \paragraph{Null Production.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
521 A ``null production'' is one that has no tokens on the right side
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
522 whatsoever. Null productions essentially are identified by the first
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
523 following input token. Null productions are extremely convenient
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
524 syntactic elements when you wish to make some input optional. For
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
525 example, suppose that you wish to allow an optional semicolon at some
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
526 point in your grammar. You could write the following pair of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
527 productions:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
528
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
529 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
530 optional semicolon -> | ';'
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
531 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
532
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
533 Note that a null production can never follow a ``\agcode{|}''.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
534
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
535 The above could also be written on multiple lines thus:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
536
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
537 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
538 optional semicolon
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
539 ->
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
540 -> ';'
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
541 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
542
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
543 You can always rewrite your grammar to eliminate null productions if
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
544 you wish, but you usually pay a price in conciseness and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
545 clarity. Sometimes, however, it is necessary to do such a rewrite in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
546 order to avoid conflicts, to which null productions are especially
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
547 prone.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
548
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
549 If you have a null production with no reduction procedure specified,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
550 your parser will automatically assign the value zero to the reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
551 token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
552
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
553 Null productions can be generated by virtual productions.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
554
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
555 \index{Parser}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
556 \paragraph{Parser.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
557 A parser is a program, or more likely a procedure within a program,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
558 which scans a sequence of input characters or input tokens and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
559 accumulates them in an input buffer or stack as determined by a set of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
560 productions which constitute a grammar. When the parser discovers a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
561 sequence of tokens as defined by a grammar rule, or right side of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
562 production, it ``reduces'' the sequence to a single reduction token as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
563 defined by the left side of the grammar rule. This nonterminal token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
564 now replaces the tokens which matched the grammar rule and the search
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
565 for matches continues. If an input token is encountered which will not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
566 yield a match for any rule, it is considered a syntax error and some
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
567 kind of error recovery may be required to continue. If a match, or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
568 reduction action, yields the grammar token, sometimes called the goal
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
569 token or start token, the parser deems its work complete and returns
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
570 to whatever procedure may have called it.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
571
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
572 Tokens may have semantic values. If the input values configuration
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
573 switch is on, your parser will expect semantic values to be provided
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
574 by the input process along with the token identification code. If the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
575 \agparam{input values} switch is off, your parser will take the ASCII
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
576 value of the input character, that is, the actual input code, as the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
577 value of the character. When the parser reduces a production, it can
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
578 call a reduction procedure or semantic action to analyze the values of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
579 the constituent tokens. This reduction procedure can then return a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
580 value which characterizes the reduced token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
581
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
582 \paragraph{Parser Generator.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
583 A parser generator, such as AnaGram, is a program that converts a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
584 grammar, a rule-based description of the input to a program, into a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
585 conventional, procedural module called a parser. The parsers AnaGram
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
586 generates are simple C modules which can be compiled on almost any
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
587 platform. AnaGram parsers are also compatible with C++.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
588
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
589 \index{Parsing engine}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
590 \paragraph{Parsing Engine.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
591 A parser consists of three basic components: A set of syntax tables, a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
592 set of reduction procedures and a parsing engine. The parsing engine
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
593 is the body of code that interprets the parsing table, invokes input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
594 functions, and calls the reduction procedures. The \agmenu{Build
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
595 Parser} command configures a parsing engine according to the implicit
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
596 requirements of the syntax specification and according to the explicit
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
597 values of the configuration parameters.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
598
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
599 The parsing engine itself is a simple automaton, characterized by a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
600 set of states and a set of inputs. The inputs are the tokens of your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
601 grammar. Each state is represented by a list of tokens which are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
602 admissible in that state and for each token an action to perform and a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
603 parameter which further defines the action. In a traditional LR
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
604 parser, there are only four actions: the shift action, the reduce
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
605 action, the accept action and the error action. AnaGram, in doing its
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
606 grammar analysis, identifies a number of special cases, and creates a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
607 number of extra actions which make for faster processing, but which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
608 can be represented as combinations of these primitive actions.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
609
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
610 When a shift action is performed, the current state number is pushed
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
611 onto the parser state stack and the new state number is determined by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
612 the current state number and the current lookahead token. Different
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
613 tokens cause different new states.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
614
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
615 When a reduce action is performed, the length of the rule being
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
616 reduced is subtracted from the parser stack index and the new state
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
617 number is read from the top of the parser state stack. The reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
618 token for the rule being reduced is then used as an input token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
619
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
620 Each state in the grammar, with the exception of state zero, has a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
621 characteristic token which must have been recognized in order to jump
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
622 to that state. Therefore, the parser state stack, which is essentially
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
623 a list of state numbers, can also be thought of as a list of token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
624 numbers. This is the list of tokens that have been seen so far in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
625 parse of your input stream. Some of these tokens, of course, may be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
626 nonterminal tokens which have resulted from reducing other sequences
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
627 of tokens.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
628
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
629 \index{Partition}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
630 \paragraph{Partition.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
631 If you use character sets in your grammar, AnaGram will compute a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
632 ``partition'' of the character universe. This partition is a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
633 collection of non-overlapping character sets such that every one of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
634 the sets you have defined can be written as a union of partition
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
635 sets. Each partition set is identified by a unique reference number
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
636 called the partition set number.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
637
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
638 Each partition set is assigned a unique token. If one of your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
639 character sets requires more than one partition set to represent it,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
640 AnaGram will create appropriate productions and add them to your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
641 grammar so your parser can make the necessary distinctions.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
642
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
643 \index{Precedence}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
644 \paragraph{Precedence.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
645 ``Precedence'' is an attribute of binary operators and unary operators
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
646 which can be used to resolve conflicts. Operator precedence is also
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
647 referred to as \agterm{binding}. Suppose that \agcode{\#} and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
648 \agcode{@} are two binary operators. The question is how to interpret
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
649 \agcode{x \# y @ z}. If \agcode{\#} has higher precedence than
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
650 \agcode{@}, it is interpreted as \agcode{(x \# y) @ z}. On the other
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
651 hand if \agcode{\#} has lower precedence than \agcode{@}, it would be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
652 \agcode{x \# (y @ z)}. If \agcode{\#} and \agcode{@} have the same
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
653 precedence then the question must be resolved by associativity.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
654
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
655 Note that all operators at the same precedence level must have the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
656 same associativity.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
657
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
658 % XXX ``neither ... and ...?'' maybe you mean ``neither ... nor ...''
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
659 The situation is somewhat simpler for unary operators. If
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
660 \agcode{\#} and \agcode{@} were both
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
661 prefix operators, or both suffix operators, there would be no
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
662 ambiguity in interpretation, since neither \agcode{\# @ x} and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
663 \agcode{x \# @} offer more than one possible interpretation. The only
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
664 difficulty arises when both a prefix and a suffix operator are applied
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
665 to the same operand.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
666
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
667 % XXX also ``if ... has ... would'' is wrong; it should be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
668 % either ``if ... had ... would'' or ``if ... has ... will''.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
669 Suppose that \agcode{\#} is a prefix operator and \agcode{@} is a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
670 suffix operator. The question then is how to interpret \agcode{\# x @}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
671 If \agcode{\#} has higher precedence than \agcode{@}, it would be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
672 interpreted as \agcode{(\# x) @}. On the other hand, if \agcode{\#}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
673 has lower precedence than \agcode{@}, it would be interpreted as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
674 \agcode{\# (x @)}. If \agcode{\#} and \agcode{@} have the same
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
675 precedence then the question must be resolved by associativity.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
676
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
677 \index{Production}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
678 \paragraph{Production.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
679 Productions are the mechanism used to describe how complex input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
680 structures are built up out of simpler ones. Each production has a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
681 left side and a right side. The right side, or grammar rule, is a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
682 sequence of rule elements, which may represent either terminal tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
683 or nonterminal tokens. The left side is a list of reduction tokens. In
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
684 most cases there would be only a single reduction token. Productions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
685 with more than one token on the left side are called semantically
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
686 determined productions. The ``\agcode{->}'' symbol is used to separate
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
687 the left side from the right side.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
688
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
689 \paragraph{Reduce Action.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
690 The reduce action is one of the four actions of a traditional parsing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
691 engine. The reduce action is performed when the parser has succeeded
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
692 in matching all the elements of a grammar rule and the next input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
693 token is not erroneous. Reducing the grammar rule amounts to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
694 subtracting the length of the rule from the parser stack index,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
695 identifying the reduction token, stacking its semantic value and then
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
696 doing a shift action with the reduction token as though it had been
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
697 input directly.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
698
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
699 \index{Conflicts}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
700 \index{Reduce-reduce conflicts}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
701 \paragraph{Reduce-Reduce Conflict.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
702 A grammar has a ``reduce-reduce'' conflict at some state if a single
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
703 token turns out to be a reducing token for more than one completed
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
704 rule.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
705 % XXX it would be nice to expand this. And given an example.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
706
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
707 \index{Reducing token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
708 \index{Token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
709 \paragraph{Reducing Token.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
710 In a state with more than one completed rule, your parser must be able
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
711 to determine which one was actually found. AnaGram deals with this
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
712 problem by looking at all the states the parser will branch to once
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
713 each rule is reduced. The acceptable input tokens for those states are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
714 the ``reducing tokens'' for the completed rules in the state under
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
715 investigation. If there is a single token which is a reducing token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
716 for more than one rule, then the grammar is said to have a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
717 reduce-reduce conflict at that state. If in a particular state there
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
718 is both a shift action and a reduce action for the same token the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
719 grammar is said to have a shift-reduce conflict in that state.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
720
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
721 A reducing token is not the same as a reduction token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
722
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
723 \index{Reduction procedure}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
724 \paragraph{Reduction Procedure.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
725 A ``reduction procedure'' is a function you write which your parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
726 executes when it has identified the grammar rule to which the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
727 reduction procedure is attached in your grammar. There are two formats
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
728 for reduction procedures, depending on the size and complexity of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
729 procedure.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
730
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
731 \index{Reduction token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
732 \index{Token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
733 \paragraph{Reduction Token.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
734 A token which appears on the left side of a production is called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
735 reduction token. It is so called because when the grammar rule on the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
736 right side of the production is matched in the input stream, your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
737 parser will ``reduce'' the sequence of tokens which matches the rule
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
738 by replacing the sequence of tokens with the reduction token. Note
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
739 that, if more than one reduction token is specified, your reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
740 procedure should choose the exact one. If it does not, your parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
741 will use the leftmost syntactically correct one in the list as the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
742 default.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
743
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
744 A reduction token is not the same as a reducing token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
745
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
746 \index{Resynchronization}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
747 \paragraph{Resynchronization.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
748 ``Resynchronization'' is the process of getting your parser back in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
749 step with its input after encountering a syntax error. As such, it is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
750 one method of error recovery. Of course, you would resynchronize only
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
751 if it is necessary to continue after the error. There are several
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
752 options available when using AnaGram. You could use the \agparam{auto
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
753 resynch} option, which causes AnaGram to incorporate an automatic
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
754 resynchronizing procedure into your parser, or you could use the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
755 \agparam{error resynch} option, which is similar to the technique used
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
756 by \agfile{yacc} programmers.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
757 % XXX That should be ``error token'', not ``error resynch''.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
758
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
759 \index{Rule elements}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
760 \paragraph{Rule Element.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
761 A grammar rule is a list of ``rule elements'', separated by commas.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
762 Rule elements may be token names, character sets, keywords, immediate
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
763 actions, or virtual productions. When AnaGram encounters a rule
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
764 element for which no token presently exists, it creates one.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
765
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
766 \index{Semantic action}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
767 \index{Action}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
768 \paragraph{Semantic Action.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
769 A semantic action is a piece of C or C++ code that is executed when a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
770 grammar rule has been identified by a parser. Semantic actions are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
771 also called reduction procedures, since they are executed when a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
772 grammar rule is reduced to the token on the left side of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
773 production.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
774
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
775 \index{Semantic value}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
776 \index{Value}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
777 \paragraph{Semantic Value.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
778 Tokens, whether terminal or nonterminal, may have a semantic value. In
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
779 the case of terminal tokens, this may be a value assigned by the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
780 lexical scanner or, if the parser is using direct character input, it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
781 will be the ASCII value of the character itself. The values of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
782 nonterminal tokens are created by reduction procedures. As a parse
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
783 progresses, token values are shifted onto the stack, so that in a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
784 reduction procedure, the values of the tokens that comprise the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
785 grammar rule that is being reduced are available for inspection.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
786
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
787 \index{Semantically determined production}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
788 \index{Production}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
789 \paragraph{Semantically Determined Production.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
790 A ``semantically determined production'' is one which has more than
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
791 one reduction token specified on the left side of the production. You
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
792 would write such a production when the reduction tokens are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
793 syntactically indistinguishable. The reduction procedure may then
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
794 specify which of the listed reduction tokens the grammar rule is to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
795 reduce to based on semantic considerations. If there is no reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
796 procedure, or the reduction procedure does not specify a reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
797 token, the parser will use the leftmost syntactically correct
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
798 reduction token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
799
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
800 \index{Set expressions}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
801 \paragraph{Set Expression.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
802 A set expression is an algebraic expression used to define a character
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
803 set in terms of individual characters, ranges of characters, and/or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
804 other sets of characters as constructed using complements, unions,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
805 intersections, and differences.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
806
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
807 \paragraph{Shift Action.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
808 The shift action is one of the four actions of a traditional parsing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
809 engine. The shift action is performed when the input token matches one
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
810 of the acceptable input tokens for the current parser state. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
811 semantic value of the token and the current state number are stacked,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
812 the parser stack index is incremented and the state number is set to a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
813 value determined by the previous state and the input token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
814
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
815 \index{Shift-reduce conflict}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
816 \index{Conflicts}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
817 \paragraph{Shift-Reduce Conflict.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
818 A ``shift-reduce'' conflict occurs if in some state there is a single
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
819 token that should be shifted, because it is legitimate input for one
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
820 of the rules of the state, but should also be used to reduce some
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
821 other rule because it is a reducing token for that rule.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
822 % XXX again, should expand this and give an example.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
823
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
824 \index{Parsing}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
825 \index{Syntax directed parsing}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
826 \paragraph{Syntax Directed Parsing.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
827 Syntax directed parsing, or formal parsing, is an approach to building
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
828 parsers based on formal language theory. Given a suitable description
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
829 of a language, called a grammar, there are algorithms which can be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
830 used to create parsers for the language automatically. In this
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
831 context, the set of all possible inputs to a program may be considered
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
832 to constitute a language, and the rules for formulating the input to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
833 the program constitute the grammar for the language.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
834
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
835 The parsers built from a grammar have the advantage that they can
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
836 recognize any input that conforms to the rules, and can reject as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
837 erroneous any input that fails to conform.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
838
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
839 Since the program logic necessary to parse input is often extremely
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
840 intricate, programs which use formal parsing are usually much more
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
841 reliable than those built by hand. They are also much easier to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
842 maintain, since it is much easier to modify a grammar specification
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
843 than it is to modify complex program logic.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
844
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
845 \index{Terminal token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
846 \index{Token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
847 \paragraph{Terminal Token.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
848 A ``terminal token'' is a token which does not appear on the left side
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
849 of a production. It represents, therefore, a basic unit of input to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
850 your parser. If the input to your parser consists of ASCII
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
851 characters, you may define terminal tokens explicitly as ASCII
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
852 characters or as sets of ASCII characters. If you have an input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
853 procedure which produces numeric codes, you may define the terminal
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
854 tokens directly in terms of these numeric codes.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
855
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
856 \index{Token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
857 \paragraph{Token.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
858 Tokens are the units with which your parser works. There are two kinds
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
859 of tokens: terminal tokens and nonterminal tokens. These latter are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
860 identified by the parser as sequences of tokens. The grouping of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
861 tokens into more complex tokens is governed by the grammar rules, or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
862 productions in your grammar. In your grammar, tokens may be denoted by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
863 token names, by virtual productions, by immediate actions, by explicit
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
864 character representations, or by expressions which yield character
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
865 sets.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
866
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
867 A ``token'' should be thought of more or less as being something like
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
868 a cloakroom token. Imagine you had taken a kindergarten class to the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
869 museum and all the kids checked their coats. Each one gets a token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
870 What is the probability of losing one? You take all of the tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
871 from the children, put them in a paper bag, tie it up, and check
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
872 it. You get one token in return. We would say that the kids' coats
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
873 represent the actual input to the system, their individual tokens are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
874 the terminal tokens, and your one token which enables you to redeem
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
875 the paper bag is a nonterminal token. Nonterminal tokens are just
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
876 single token numbers to represent the result of compacting a sequence
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
877 of more primitive tokens.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
878 %Thus we sometimes refer to the input character as the input even though
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
879 %there is a translation process, token conversion, required to turn the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
880 %input character, the winter coat, into a token number, or numbered brass
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
881 %slug.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
882
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
883 Actually, the word ``token'' is commonly used to refer to a number of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
884 apparently different things. The tokens you use in a grammar are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
885 abstract. The tokens identified by a parser are concrete instances of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
886 the abstract tokens in the grammar. Furthermore, we have to identify
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
887 tokens in some manner, so we have token names and token numbers with
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
888 which to refer to particular tokens. As a result the word ``token'',
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
889 in any particular context, may refer directly to either an abstract
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
890 token, or a concrete instance of an abstract token, or may refer only
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
891 indirectly to the token by means of a token name or token number.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
892 %
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
893 % concrete whether concrete or abstract, can be confused
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
894 %
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
895 %we can easily confuse the identification with the token itself, whether
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
896 %concrete or abstract.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
897 %
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
898 %
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
899 %the token, whether concrete or abstract, can be confused with its
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
900 %identification.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
901 %
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
902 %"token" is sometimes used to refer to the token identification rather than
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
903 %the token itself. The correct meaning is usually clear from the context, at
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
904 %least to an experienced reader.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
905 %
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
906
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
907 As an example, consider a C compiler which has a lexical scanner to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
908 identify input tokens. According to Kernighan and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
909 Ritchie\footnote{Brian W. Kernighan and Dennis M. Ritchie. \textit{The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
910 C Programming Language}, 2nd ed. (Englewood Cliffs, New Jersey:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
911 Prentice-Hall, 1988), p. 191.}, there are six classes of C tokens:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
912 identifiers, keywords, constants, string literals, operators, and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
913 other separators. Consider a particular token, a hexadecimal constant,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
914 \agcode{0XF00}. When the lexical scanner hands this over to the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
915 parser, it has to provide an identification code which says what kind
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
916 of token it is, a hexadecimal constant, as well as the ``semantic
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
917 value'' of the token, 3840. The identification code is usually an
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
918 integer value, determined by the designer of the lexical scanner,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
919 which by agreement with the designer of the parser uniquely represents
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
920 a hexadecimal constant. This identification code can be thought of as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
921 an external token number.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
922
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
923 The grammar for the parser, on the other hand, has a token,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
924 \agcode{HEXconstant}, let us say, to represent the abstract usage of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
925 hexadecimal constants in the C language. When AnaGram, or any other
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
926 parser generator for that matter, analyzes the grammar, it assigns its
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
927 own reference number to identify \agcode{HEXconstant}. This reference
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
928 number can be thought of as an internal token number to contrast it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
929 with the external token number provided by the lexical scanner. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
930 parsers AnaGram builds contain a table, \agcode{ag{\us}tcv}, which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
931 provides a translation from the external to the internal token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
932 number. Both of these token numbers, of course, differ from the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
933 semantic value of the token, in this case, 3840.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
934
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
935 Even when an AnaGram parser is using simple character input, the word
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
936 ``token'' may represent several different things. Suppose a grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
937 has a token \agcode{'A-Z'}, which represents the set of upper case
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
938 letters. There will be an internal token number, assigned by AnaGram,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
939 which identifies the (abstract) token in parser tables. The actual
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
940 (concrete) input token to a parser will, however, be a single letter,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
941 say ``Q''. In this case, the ASCII value of the character ``Q'', 81,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
942 is used both as the external token number and as the semantic value of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
943 the token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
944
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
945 \paragraph{Unary Operator.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
946 A ``unary operator'' is a token which represents some sort of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
947 operation which operates on a single entity to produce a new resultant
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
948 entity. It is normally represented by a symbol which is written either
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
949 preceding the operand or following the operand. In the first case it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
950 is called a ``prefix operator'' and in the second case a ``suffix
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
951 operator''.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
952
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
953 \index{Union}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
954 \paragraph{Union.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
955 The union of two sets is the set of all elements that are to be found
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
956 in one or another of the two sets. In an AnaGram syntax file the union
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
957 of two character sets \agcode{A} and \agcode{B} is represented using
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
958 the plus sign, as in \agcode{A + B}. The union operator has the same
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
959 precedence as the difference operator: lower than that of intersection
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
960 and complement. The union operator is left associative.