annotate doc/manual/isdp.tex @ 21:1c9dac05d040

Add lint-style FALLTHROUGH annotations to fallthrough cases. (in the parse engine and thus the output code) Document this, because the old output causes warnings with gcc10.
author David A. Holland
date Mon, 13 Jun 2022 00:04:38 -0400
parents 13d2b8934445
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
1 \chapter{Introduction to Syntax Directed Parsing}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
2
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
3 Every programmer has to deal with input data. The units of data may
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
4 consist of single characters from a keyboard or file, unit records
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
5 from a file, mouse events in a windowing system, or even more complex
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
6 data constructs. When the processing of a unit of data depends only on
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
7 the value of that data, it is usually straightforward. Usually,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
8 however, the processing depends also on what has preceded, and often
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
9 even on what follows, the input under consideration. Keeping track of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
10 these dependencies in order to know how to process the data is called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
11 \index{Parsing}\agterm{parsing}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
12
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
13 It is often relatively easy to keep track of simple dependencies when
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
14 first writing a program. But as the program develops, as new features
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
15 are added, as bugs are fixed, the dependencies often stop being
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
16 simple. Then input processing becomes a headache, since it is hard to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
17 keep track of or even identify all the particular cases. Changes to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
18 the program cause unexpected problems. Program maintenance threatens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
19 to get out of control.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
20
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
21 Syntax directed parsing is a technique for dealing with input data
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
22 streams which keeps you in control of the problem, gives you a high
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
23 degree of confidence in your program, makes bugs less likely and makes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
24 program maintenance and the addition of new features an easy task.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
25
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
26 To use syntax directed parsing you create a high level description of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
27 the structure of your input data, called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
28 \index{Grammar}\agterm{grammar}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
29 A file which contains a grammar is called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
30 \index{Syntax file}\index{File}\agterm{syntax file}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
31 A
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
32 \index{Parser generator}\agterm{parser generator},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
33 such as AnaGram,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
34 can create, from a syntax file, a function (or program) called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
35 \index{Parser}\agterm{parser},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
36 written in C or C++. The parser keeps track of all the dependencies
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
37 in your input, and calls certain functions,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
38 \index{Reduction procedure}\agterm{reduction procedures},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
39 to deal with specific units or sequences of data as they are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
40 encountered.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
41
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
42 Reduction procedures are functions you write to process your data.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
43 They are linked, in an obvious way in the grammar, to the structures
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
44 in your input, so that your parser will call them at precisely the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
45 right times with precisely the right data. When you write reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
46 procedures, you can concentrate entirely on what you have to do with
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
47 the data. You don't have to encumber your code with switches and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
48 tests to determine the structure of your input.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
49
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
50 This chapter describes in a general way how you write a grammar and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
51 how a parser works. In particular, it defines a number of terms which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
52 are useful in talking about grammars and syntax directed parsing. It
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
53 then introduces a number of special features of AnaGram. More
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
54 detailed treatment of writing grammars is given in Chapter 8.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
55 Techniques for building a complete functioning parser are discussed in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
56 Chapter 9.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
57
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
58
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
59 \section{Describing an Input Sequence}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
60
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
61 Writing a grammar consists of describing the acceptable input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
62 sequences for your program. The vehicle for describing an input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
63 sequence is called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
64 \index{Production}\agterm{production}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
65 Productions show how a logical component of the input can be made up
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
66 of a sequence of more elementary components. A production that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
67 describes a date might be written:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
68
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
69 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
70 date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
71 -> name of month, day, comma, year
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
72 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
73
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
74 The components of the input are called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
75 \index{Token}\agterm{tokens}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
76 The sequence of tokens on the \index{Right side}right side of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
77 production is called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
78 \index{Grammar rule}\agterm{grammar rule},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
79 or \index{Rule}\agterm{rule} for short.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
80 The individual tokens on the right side of the rule are also called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
81 \index{Rule elements}\index{Rule}\agterm{rule elements}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
82 The \agterm{rule length} is the number of elements in the rule.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
83
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
84 The token on the \index{Left side}left side of the production is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
85 called the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
86 \index{Reduction token}\index{Token}\agterm{reduction token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
87 for the rule. \index{Token}Tokens may have
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
88 \index{Value}\index{Semantic value}\agterm{semantic values},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
89 as distinguished from \index{Value}\agterm{syntactic values}, which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
90 you can use in your reduction procedures. For instance, the value of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
91 \agcode{name of month} could be an integer in the range zero to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
92 eleven, or it could be a pointer to an ASCII string. The value of day
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
93 could be an integer in the range one to thirty-one.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
94
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
95 A grammar consists of a number of such productions, each of which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
96 describes some component of the input in terms of other components.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
97 It does not take very many productions to describe quite complex input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
98 streams. A grammar for the C language, for instance, requires about
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
99 two hundred productions.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
100
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
101 Many people find the term \agterm{production} quite confusing. It
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
102 comes from theoretical linguistics where it is used to describe how
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
103 one may produce sequences which correspond to a set of grammatical
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
104 rules. Ironically, the major usage of the idea has been in parsing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
105 where the interest is not so much in creating sequences which satisfy
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
106 the grammatical rules as in decoding and analyzing such sequences.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
107 Nonetheless, it is convenient, in the above example, to say that the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
108 token \agcode{date} \agterm{produces} a sequence of tokens consisting
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
109 of \agcode{name of month}, \agcode{day}, \agcode{comma}, and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
110 \agcode{year}. We also say that the sequence \agterm{reduces} to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
111 \agcode{date}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
112
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
113 There may be more than one production to describe a given component,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
114 if there is more than one way it may be represented. For instance,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
115
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
116 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
117 date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
118 -> day, name of month, year
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
119 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
120
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
121 describes another common way of writing a date. In other words, a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
122 reduction token may produce a number of different grammar rules.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
123
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
124 Tokens which appear on the left side of one or more productions are called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
125 \index{Token}\index{Nonterminal token}\agterm{nonterminal tokens}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
126 Those which appear \emph{only} on the right sides of productions are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
127 called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
128 \index{Token}\index{Terminal token}\agterm{terminal tokens}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
129 Terminal tokens are the units which actually appear physically in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
130 input. Nonterminal tokens are identified when a sequence of tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
131 that matches the right side of a production is seen in the input.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
132 When AnaGram analyzes a grammar, it assigns a unique
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
133 \index{Token number}\index{Number}\index{Token}\agterm{token number}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
134 to each token it finds in the grammar.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
135
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
136 Nonterminal tokens, such as \agcode{date} in the example above, may
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
137 appear in any grammar rule just as though they were input tokens. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
138 token on the left side of a production can even appear on the right
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
139 side as well. Such a production is called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
140 \index{Production}\index{Recursive productions}\agterm{recursive}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
141 production. When a nonterminal token appears on the right side of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
142 production, it may be represented in this context by \emph{any} of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
143 the grammar rules it produces. Grammars described in this manner are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
144 called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
145 \index{Context free grammar}\index{Grammar}\agterm{context free grammars}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
146 since there is no contextual constraint on which of the rules that a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
147 token produces can appear in any given context.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
148
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
149 Recursive productions may be either left recursive or right recursive.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
150 \agterm{Left recursive} productions are those where the recursively
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
151 defined nonterminal appears as the first element in the recursive
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
152 rule. \agterm{Right recursive} productions are those where it is the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
153 last element. If it appears anywhere between, the production is said
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
154 to be \agterm{center recursive}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
155
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
156 Any nonterminal token which has a recursive production must also have
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
157 at least one simple, non-recursive production. Otherwise, it is not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
158 possible to create a finite sequence of terminal tokens from the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
159 nonterminal token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
160
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
161 Recursion may also occur implicitly in a grammar when one of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
162 tokens on the right side of a production itself has a production
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
163 involving the token on the left. Such implicit recursion sometimes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
164 may involve numerous levels of productions. Implicit recursion occurs
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
165 most commonly in describing constructs such as arithmetic expressions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
166 or the block structure of programming languages.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
167
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
168 Clearly, grammars can accommodate multiple levels of structure in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
169 input sequences they describe. There must, at the top, be a single
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
170 token which encompasses the entire input. This special token is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
171 variously called the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
172 \index{Grammar token}\index{Configuration parameters}\index{Token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
173 \agterm{grammar token},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
174 the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
175 \index{Goal token}\agterm{goal token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
176 or the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
177 \index{Start token}\agterm{start token}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
178
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
179 In AnaGram grammars, terminal tokens need not be declared or otherwise
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
180 defined. Any token name that appears only on the right side of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
181 production is taken to be a terminal token, and AnaGram presumes that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
182 the input procedure that provides tokens to the parser will be able to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
183 identify them to the parser appropriately. The details on how to do
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
184 this are explained in Chapter 9.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
185
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
186 On the other hand, AnaGram allows you to specify terminal tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
187 explicitly as ASCII characters, or even as sets of ASCII characters,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
188 right in the grammar. Thus, you may write \agcode{'0-9'} to represent
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
189 the set of ASCII digits, or \agcode{'A-Z'} to represent the set of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
190 upper case letters. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
191 \index{Semantic value}\index{Value}\index{Token}semantic value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
192 of such a token is the ASCII \index{Character codes}character code
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
193 that actually appears in the input stream. The rules for representing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
194 \index{Character sets}character sets are given in Chapter 8. If the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
195 various sets you use in your grammar overlap, they may not properly
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
196 represent terminal tokens. In this case, AnaGram automatically
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
197 extends your grammar appropriately. This ``set partition'' logic is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
198 described in Chapter 6.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
199
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
200
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
201 \section{How a Parser Works}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
202
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
203 The aim of a \index{Parser}parser is to match its input with the full
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
204 syntactic structure specified by the productions which make up the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
205 grammar. The primary component of a parser is an input buffer,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
206 sometimes thought of as a stack, into which tokens are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
207 \agterm{shifted}, or stored sequentially, as they are encountered in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
208 the input. At the same time that a token is shifted into the input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
209 buffer, its
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
210 \index{Semantic value}\index{Token}\index{Value}semantic value is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
211 pushed onto the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
212 \index{Parser value stack}\index{Value stack}\index{Stack}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
213 \agterm{value stack}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
214 A token is not shifted into the buffer unless it ``makes sense'', that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
215 is, unless it is consistent with the rules of the grammar and with the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
216 input that has preceded it. If a token does not make sense, the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
217 parser signals a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
218 \index{Syntax error}\index{Errors}\agterm{syntax error}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
219 In order to determine whether a token makes sense, the parser has a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
220 sort of decision table which provides a list of acceptable tokens for
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
221 each of a number of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
222 \index{Parser}\index{State}\agterm{states}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
223 The table also specifies what the parser is to do with each acceptable
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
224 token. When the table indicates that a token is to be shifted into
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
225 the buffer, it also specifies a new state. The parser stacks the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
226 current state on a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
227 \index{Parser state stack}\index{State stack}\index{Stack}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
228 \agterm{state stack}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
229 and jumps to the new state. Thus every time a token is shifted into
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
230 the input buffer, a state number is pushed onto the state stack. For
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
231 each state of the parser, excepting only the initial state, there is a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
232 unique token which will cause a jump to that state. This token is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
233 called the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
234 \index{Characteristic token}\index{Token}\agterm{characteristic token}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
235 for the state.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
236
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
237 When the rightmost, or most recent, tokens in the input buffer match
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
238 the right side of a production precisely, the parser \emph{may}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
239 replace the tokens that match the rule with a single token, the token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
240 on the left side of the production. This process of replacing a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
241 sequence of tokens with a single token is called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
242 \index{Reduction}\agterm{reduction}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
243 The token that replaces the sequence of tokens is called the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
244 \index{Reduction token}\index{Token}\agterm{reduction token}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
245
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
246 The actual mechanism of the reduction is quite important. At the same
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
247 time that input tokens are removed from the input buffer, state
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
248 numbers are popped from the state stack, so that when all input tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
249 matching the rule have been removed, the parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
250 \index{Parser}\index{State}state has been restored to the value it had
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
251 at the time the first token in the rule was seen. As state numbers
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
252 are popped from the state stack, token values are popped from the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
253 value stack. If the rule has a reduction procedure, temporary
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
254 variables are loaded with the values popped from the stack and the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
255 reduction procedure is called. The reduction token is now shifted
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
256 into the input buffer just as though it were an input token. If the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
257 reduction procedure returned a result it is shifted into the value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
258 stack as the value of the reduction token. The parser stacks the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
259 current state again and jumps to a new state as determined by the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
260 parser tables.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
261
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
262 If there are no errors in your input, when the last token has been
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
263 read from the input, shifted into the input buffer, and reductions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
264 performed, there will be precisely one token in the input buffer: the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
265 \index{Grammar token}\index{Goal token}grammar, or goal, token which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
266 describes your entire input. At this point your parser declares
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
267 itself finished.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
268
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
269 Reductions do not necessarily occur every time a rule matches the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
270 tokens in the input buffer. If the reduction token does not make
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
271 sense, that is, if it is not consistent with the rules of the grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
272 and with the input that has preceded it, the parser will not perform
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
273 the reduction. Suppose there are tokens in the input which match one
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
274 of the rules given above for \agcode{date}. A reduction will not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
275 occur unless \agcode{date} is one of the tokens the parser is actually
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
276 looking for at that stage in the input.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
277
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
278 Even when the reduction token would make sense, there are still
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
279 situations where the reduction would not take place. Suppose a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
280 grammar includes the two productions given above for \agcode{date} as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
281 well as the following:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
282
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
283 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
284 date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
285 -> name of month, day
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
286 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
287
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
288 This production is the same as the first, but with no year
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
289 specification. We often write dates without specifying the year
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
290 explicitly. The year is usually understood from context. When a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
291 parser directed by this grammar has encountered \agcode{name of month}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
292 and \agcode{day}, it can't tell without looking further whether it has
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
293 a short form or a long form date. In such a circumstance, the parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
294 looks at the next following token, which is called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
295 \index{Lookahead token}\index{Token}\agterm{lookahead token}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
296 If the lookahead token is a comma, then, in the absence of other
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
297 productions, the input is a long form date. If the lookahead token is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
298 not a comma, then the input is certainly not a long form date, and it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
299 is proper to reduce the short form production.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
300
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
301 Suppose the lookahead token were a comma and the grammar were also to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
302 contain the following production:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
303
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
304 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
305 appointment
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
306 -> date, comma, time
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
307 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
308
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
309 Since a comma can follow date, according to this rule, and can also
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
310 follow day according to the first production, it is impossible to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
311 determine, simply by looking at the lookahead token, whether the date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
312 was being given in short form or long form. One would have to look
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
313 beyond the comma to see if what follows the comma matches the rules
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
314 for time or for year. Although it is possible to build parsers which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
315 can do this, it is not generally feasible. This situation is called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
316 \index{Shift-reduce conflict}\index{Conflicts}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
317 \agterm{shift-reduce conflict},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
318 because the parser cannot decide simply on the basis of the lookahead
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
319 token whether to reduce the short form rule or to ignore it and shift
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
320 the comma into the input buffer.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
321
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
322 AnaGram finds and warns you about the conflicts in your grammar when
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
323 it analyzes your grammar. To do this, it checks all states to find
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
324 all completed rules. A
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
325 \index{Completed rule}\index{Rule}\agterm{completed rule}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
326 is a rule that is completely matched by the rightmost tokens in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
327 input buffer. For each such rule, AnaGram determines all of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
328 terminal tokens which could follow once that rule has been reduced.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
329 These tokens are sometimes called the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
330 \index{Token}\agterm{reducing tokens}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
331 for the given rule in the given state. (Note that ``reducing tokens''
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
332 are quite different from ``reduction tokens''.) AnaGram checks each
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
333 state for shift-reduce conflicts by comparing the set of tokens which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
334 can be shifted against all sets of reducing tokens for that state. If
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
335 there is an overlap, there is a conflict. When AnaGram finds a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
336 conflict, it prepares a detailed report on the conflict. This
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
337 information is then available in the \agwindow{Conflicts} window. A
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
338 shift-reduce conflict does not prevent you from building a parser. If
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
339 you build a parser for a grammar which has such a shift-reduce
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
340 conflict, the parser will resolve the issue by shifting the lookahead
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
341 token into the input buffer and ignoring the reduction.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
342
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
343 There is another type of conflict involving reductions. Suppose a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
344 given state has more than one
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
345 \index{Completed rule}\index{Rule}completed rule.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
346 AnaGram then checks the sets of reducing tokens for all the completed
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
347 rules. If there is an overlap, there is no way for AnaGram to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
348 determine which rule should be reduced simply on the basis of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
349 \index{Token}\index{Lookahead token}lookahead token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
350 This situation is called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
351 \index{Reduce-reduce conflicts}\index{Conflicts}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
352 \agterm{reduce-reduce conflict}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
353 AnaGram will also diagnose reduce-reduce conflicts in the Conflicts
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
354 window. If you decide to ignore a reduce-reduce conflict and ask
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
355 AnaGram to build a parser, the parser will choose the rule AnaGram
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
356 encountered first when it read your grammar.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
357
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
358 \index{Conflicts}Conflicts are not necessarily errors. Sometimes you
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
359 can build a more compact, faster parser by deliberately introducing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
360 conflicts. You can rely on AnaGram's default resolution of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
361 conflicts, or you may use operator precedence to resolve the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
362 conflicts. The use of operator precedence is described in Chapter 9.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
363
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
364 AnaGram has extensive facilities for identifying and correcting
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
365 unwanted conflicts. These facilities are described in Chapter 7.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
366
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
367 Traditionally, parsers are built using only shift and reduce actions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
368 as described above. The parsers AnaGram builds normally use a number
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
369 of more complex actions, each of which is equivalent to a number of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
370 shift and reduce actions. These complex actions permit AnaGram to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
371 build faster, more compact parsers. If you wish, however, you may
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
372 restrict AnaGram to using only shift and reduce actions by setting the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
373 \index{Traditional engine}\index{Configuration switches}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
374 \agparam{traditional engine} configuration switch. See Appendix A,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
375 Configuration Parameters.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
376
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
377
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
378 \section{A Note on Notation}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
379
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
380 \index{Context free grammar}\index{Grammar}Context free grammars have
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
381 been traditionally represented in the literature using
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
382 \index{Backus-Naur Form}\agterm{Backus-Naur Form}, or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
383 \index{BNF}\agterm{BNF}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
384 In Backus-Naur Form, certain characters, called metacharacters, are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
385 used to punctuate productions and all other printable characters are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
386 taken to represent themselves literally. Named tokens are denoted by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
387 enclosing the name within angle brackets ($< >$). The left side of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
388 production is distinguished from the right side by the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
389 % XXX s/characters/symbol/
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
390 characters $::=$.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
391 If several productions have the same left side, the vertical
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
392 bar $|$ is used to separate them. The elements of a grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
393 rule are simply juxtaposed to indicate that one follows another.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
394 Blanks are ignored. Thus, in BNF, the first production given for
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
395 \agcode{date}, above, would be:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
396
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
397 % ~ is a hard space
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
398 $$
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
399 <date>~~::=~~<name~of~month>~~<day>~~,~~<year>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
400 $$
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
401
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
402 AnaGram uses a notation more consonant with ordinary programming
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
403 usage. Thus token names need not be bracketed and literal characters
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
404 must be appropriately quoted. The elements of rules are joined by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
405 commas. Using this approach, there is no need for metacharacters and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
406 it becomes possible to make a number of useful extensions to the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
407 notation. The detailed rules for writing grammars using AnaGram's
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
408 notation are given in Chapter 8.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
409
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
410
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
411 \section{Reduction Procedures}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
412 \index{Reduction procedure}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
413
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
414 Of course, the reason for parsing an input stream is to interpret the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
415 data in the stream and to process it in some useful manner. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
416 primary tool for doing this is the \agterm{reduction procedure}. A
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
417 reduction procedure is a piece of C code that is executed when a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
418 particular grammar rule is reduced. Often, a reduction procedure
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
419 calculates a value which becomes the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
420 \index{Value}\index{Semantic value}semantic value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
421 of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
422 \index{Token}\index{Reduction token}reduction token.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
423 The input to the reduction procedure consists of the values of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
424 tokens that make up the grammar rule.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
425
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
426 AnaGram allows you to assign C variable names to the tokens in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
427 grammar rule, so that you can refer to them in the reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
428 procedure. To assign a C variable name, simply follow the token in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
429 the rule with a colon and the C variable name. Simple reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
430 procedures can be written as C or C++ expressions. At the end of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
431 rule, write an equal sign and an expression, terminated by a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
432 semicolon. For example:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
433
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
434 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
435 (int) hex digit
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
436 -> '0-9':d = d-'0';
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
437 -> 'a-f':d = d-'a'+10;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
438 -> 'A-F':d = d-'A'+10;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
439 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
440
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
441 When any one of these rules is matched, the value of the token in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
442 rule is assigned to the temporary variable \agcode{d}. The expression
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
443 to the right of the equal sign is evaluated and the value of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
444 expression is stored as the value of \agcode{hex digit}, which has
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
445 been declared to be an \agcode{int}. These productions define
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
446 hexadecimal digits as ASCII characters, and calculate the binary value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
447 of the digit in terms of the ASCII character code, \agcode{d}. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
448 binary value becomes the value of \agcode{hex digit}. Hexadecimal
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
449 digits can be combined to make hexadecimal numbers by writing the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
450 following productions:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
451
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
452 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
453 (int) hex number
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
454 -> hex digit
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
455 -> hex number:n, hex digit:d = 16*n+d;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
456 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
457
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
458 %In this example, note that if you do not specify a reduction procedure for
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
459 %a grammar rule, the value of the reduction token will be taken to be the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
460 %value of the first token in the rule. Thus, it is not necessary to provide
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
461 %a reduction procedure in the first production for hex number. The second
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
462 %production recursively defines the value of hexadecimal numbers that have
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
463 %more than one digit.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
464
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
465 There are several important points to notice in this example. First,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
466 reduction procedures are executed ``from the bottom up''. That is,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
467 the reduction procedure for \agcode{hex digit} is executed before any
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
468 reduction procedure for \agcode{hex number}. Second, if there is no
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
469 reduction procedure for a production, the value of the first token in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
470 the rule is assigned to the reduction token. Thus, it is not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
471 necessary to provide a reduction procedure in the first production for
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
472 \agcode{hex number}. Third, the reduction procedures for recursive
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
473 productions are always executed \emph{after} the reduction procedure,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
474 if any, for the non-recursive production which begins the recursion.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
475 Finally, when an input sequence is described using left recursion, as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
476 in this example, the elements of the sequence are processed left to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
477 right.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
478
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
479 %
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
480 % XXX this is a bad example as you never want to read floats this way!
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
481 % Can we think of something else?
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
482 %
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
483 If you wish to process the elements of a sequence right to left, you
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
484 may use right recursion. For example, it is sometimes convenient to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
485 define the fraction part of a decimal number thus:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
486
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
487 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
488 (double) fraction part
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
489 -> '0-9':d = (d - '0')/10.0;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
490 -> '0-9':d, fraction part:f = (d - '0' + f)/10.0;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
491 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
492
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
493 In this case the leading digits are stored temporarily in the parse
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
494 stack, and then the fraction part is evaluated right to left only when
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
495 the last digit has been found.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
496
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
497 Reduction procedures can be more complex than simple expressions. After the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
498 equal sign you may include an arbitrary block of C or C++ code, enclosed in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
499 braces, \agcode{\bra \ket}. To return a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
500 \index{Semantic value}\index{Token}\index{Value}semantic value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
501 for the reduction token simply use a return statement. Of course,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
502 reduction procedures have the full resources of C or C++ at their
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
503 disposal. They may set and interrogate global variables and may call
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
504 functions freely. Reduction procedures are usually implemented as C
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
505 functions. Very simple procedures may be implemented as macros,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
506 unless you have disabled this option by turning off the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
507 \index{Macros}\index{Allow macros}\agparam{allow macros} configuration
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
508 switch. See Chapter 8 for the rules for writing reduction procedures.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
509 When AnaGram builds a parser it copies all of the reduction procedures
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
510 you have defined to the parser file, and includes code to call them at
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
511 the right time.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
512
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
513 Since the reduction procedures you write will probably need some
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
514 support code, such as \agcode{\#include} statements and declarations,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
515 you may incorporate C or C++ code into your syntax file at any point.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
516 You need only enclose it in braces, \agcode{\bra \ket}. Such code
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
517 is called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
518 \index{Embedded C}\agterm{embedded C}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
519 All embedded C code is also copied to the \index{File}parser file, and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
520 \emph{precedes} all of your reduction procedures.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
521
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
522
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
523 \section{Building a Parser}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
524 \index{Building a Parser}\index{Parser}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
525
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
526 In order to round out a parser into a functioning program it needs
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
527 input procedures, as well as error diagnosis and recovery
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
528 capabilities. AnaGram has a number of options available which give
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
529 you a high degree of flexibility in configuring a parser to suit your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
530 particular needs. All of the options are provided with reasonable
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
531 defaults, so that you can safely disregard any option until you need
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
532 the features it provides. The following paragraphs summarize the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
533 options AnaGram provides. These topics are discussed in greater
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
534 detail in Chapter 9.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
535
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
536 \paragraph{Invoking a Parser.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
537 \index{Parser}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
538 Normally, AnaGram configures parsers as functions which you can call
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
539 from elsewhere in your program. In this situation, you call the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
540 parser, it processes its input, and returns either when it has
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
541 finished or cannot proceed because of errors. Alternatively, if you
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
542 set the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
543 \index{Event driven}\agparam{event driven}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
544 configuration switch, your parser will be configured so that you have
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
545 two procedures to call: an \index{Initializer}\agterm{initializer} and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
546 a \index{Parser}\agterm{parser}. In the event driven configuration
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
547 you start the parse by calling the initializer and then you call the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
548 parser once for each unit of input. Using the event driven mode makes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
549 it quite easy to configure a parser as a filter and to chain several
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
550 parsers together so that the output from one parser is the input to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
551 the next. Such
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
552 \index{Parsing}\index{Multi-stage parsing}\agterm{multi-stage parsing}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
553 is a convenient way to deal with complex input that is not context free.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
554
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
555 \paragraph{Communicating with a Parser.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
556 The complete status of your parser is contained in a single data
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
557 structure called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
558 \index{Parser control block}\index{Control block}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
559 \agterm{parser control block}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
560 All communications with a parser take place via the parser control
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
561 block. \index{Input procedures}Input procedures must place input data
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
562 in the appropriate field in the parser control block. When the parse
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
563 is complete or encounters an error, the results of the parse may be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
564 found in the parser control block. When AnaGram builds a parser it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
565 includes, in the \index{File}\index{Header file}header file it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
566 generates, a \agcode{typedef} statement which defines the structure of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
567 the parser control block.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
568
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
569 \paragraph{Parser Input.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
570 \index{Input}\index{Parser}\index{Input procedures}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
571 The input to your parser may be either characters read directly from
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
572 an input stream, or tokens accumulated by a pre-processor or lexical
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
573 scanner. The way you provide input to your parser depends on how your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
574 grammar defines input tokens and also on whether or not you have
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
575 requested an event driven parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
576
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
577 If your parser is event driven, you provide its input by storing the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
578 input code and the input value, if any, into the parser control block
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
579 and calling the parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
580
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
581 If you have set the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
582 \index{Pointer input}\agparam{pointer input}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
583 \index{Configuration switches}configuration switch
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
584 in your syntax file, you simply initialize the pointer field in your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
585 parser control block before you call your parser. Your parser will
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
586 then read its input directly from memory by simply incrementing the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
587 pointer as necessary.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
588
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
589 Otherwise, your parser will invoke a macro called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
590 \index{\agcode{GET{\us}INPUT}}\index{Macros}\agcode{GET{\us}INPUT} every
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
591 time it needs more input. You may define \agcode{GET{\us}INPUT}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
592 according to your needs. You can define it so that it calls an input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
593 function, or you can define it so that it executes in-line code each
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
594 time it is invoked. Your \agcode{GET{\us}INPUT} macro should store its
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
595 input code in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
596 \index{\agcode{input{\us}code}}\agcode{input{\us}code}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
597 field of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
598 \index{PCB}parser control block.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
599 If you do not write a \agcode{GET{\us}INPUT} macro, AnaGram will provide
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
600 one which will read characters from \agcode{stdin}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
601
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
602 If your grammar does not define terminal tokens in terms of ASCII
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
603 characters or external token numbers, your \agcode{GET{\us}INPUT} will
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
604 have to determine the appropriate internal
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
605 \index{Token number}\index{Token}\index{Number}token number
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
606 for each input token. To assist you in determining these token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
607 numbers AnaGram provides a \agcode{typedef enum} statement in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
608 \index{File}\index{Header file}header file. You can then use named
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
609 constants to specify the internal token numbers for the input tokens.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
610
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
611 \paragraph{Error Handling.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
612 \index{Error handling}\index{Error diagnosis}\index{Error recovery}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
613 Your parser must be prepared to deal with erroneous input. There are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
614 two aspects to error handling: diagnosing the error, and recovering
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
615 from the error. On encountering an error, your parser will invoke a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
616 macro called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
617 \index{\agcode{SYNTAX{\us}ERROR}}\index{Macros}\agcode{SYNTAX{\us}ERROR}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
618 If you do not provide a definition for \agcode{SYNTAX{\us}ERROR}, AnaGram
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
619 will provide a simple error diagnostic. AnaGram can also provide
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
620 automatic error diagnoses which pinpoint the location of the error.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
621
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
622 \index{Resynchronization}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
623 AnaGram provides two options for error recovery:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
624 \agterm{error token resynchronization} and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
625 \agterm{automatic resynchronization}. These are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
626 techniques for getting your parser back in synch with its input so it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
627 can proceed after encountering an error. Normally, if you do not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
628 select one of these recovery techniques, your parser will terminate
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
629 when it encounters an error; however, you may override this default if
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
630 you wish and provide your own recovery technique.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
631
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
632
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
633 \section{Special Features of AnaGram}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
634
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
635 AnaGram provides a number of special features not generally available
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
636 in other parsing systems. These features extend the applicability of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
637 syntax directed parsing, make it easier for you to write and debug
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
638 your grammars, and make it easier to interface your parser to other
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
639 parsers and to your main program. The following paragraphs provide a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
640 brief introduction to these features. Their use is described more
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
641 fully in the appropriate chapters in this manual.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
642
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
643 \paragraph{Configuration Parameters.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
644 \index{Configuration parameters}\index{Parameters}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
645 AnaGram has a number of configuration switches and parameters which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
646 you can use to control the way AnaGram works in various circumstances.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
647 Since programmers often have different requirements, AnaGram strives
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
648 to be as flexible as possible.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
649
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
650 Configuration parameters may be set in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
651 \index{Configuration file}\index{File}configuration files, or in your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
652 syntax file. The rules for writing configuration parameters are given
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
653 in Chapter 8. A summary of all configuration parameters is given in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
654 Appendix A. The use of particular configuration parameters is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
655 described as appropriate throughout this manual.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
656
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
657 \paragraph{Character Sets}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
658 \index{Character sets}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
659 In your grammar, if you write terminal tokens explicitly as characters
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
660 or as sets of ASCII characters, AnaGram will set up your parser to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
661 accept ASCII characters directly, so you will not need to build a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
662 \index{Lexical scanner}lexical scanner and interface it to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
663 your parser. Character sets may be specified in terms of ranges of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
664 characters, such as \agcode{'a-z'}, as unions, intersections or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
665 differences of sets, denoted by \agcode{+}, \agcode{\&}, or \agcode{-}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
666 respectively, or as the complement of a set, denoted by \agcode{\~{}}.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
667 You may also specify a set consisting of a single character either
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
668 with a literal character or with a numeric specification. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
669 detailed rules for character set expressions are given in Chapter 8.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
670 Some useful character sets are:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
671
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
672 \begin{itemize}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
673 \item Letters: \agcode{'a-z' + 'A-Z'}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
674 \item Digits: \agcode{'0-9'}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
675 \item DOS end of file: \agcode{-1 + \^{}Z}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
676 \item Neither end of file nor end of line:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
677 \agcode{\~{}(-1 + \^{}Z
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
678 + '{\bs}n')}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
679 \end{itemize}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
680
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
681 \paragraph{Keywords.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
682 \index{Keywords}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
683 AnaGram makes special provision for the use of keywords in the parsers
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
684 it builds. In your grammar, you may specify a keyword as a quoted
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
685 string. AnaGram will incorporate special string matching logic in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
686 your parser to recognize keywords in the parser input. When a keyword
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
687 is recognized, it is treated as an individual token, completely
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
688 independent of the characters which make up the keyword. Some
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
689 examples of keywords: \agcode{"let"}, \agcode{".EQ."},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
690 \agcode{"<>"}, \agcode{"/*"}. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
691 rules for the use of keyword strings are given in Chapter 8.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
692
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
693 Keywords are a powerful and useful tool, but they are not completely
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
694 consistent with the assumptions underlying the use of context-free
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
695 grammars. Thus under certain circumstances it is possible for a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
696 parser built from a grammar that uses keywords to fail
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
697 % XXX s/parse correctly/correctly parse/
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
698 to parse correctly text that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
699 appears to be consistent with the grammar. Such a situation is called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
700 \index{Keyword anomalies}\index{Anomaly}keyword anomaly.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
701 AnaGram detects and diagnoses keyword anomalies in your grammar, and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
702 provides tools to help you find and correct the problem. Many such
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
703 apparent anomalies, however, are completely innocuous. Keyword
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
704 anomalies are discussed in greater detail in Chapter 7.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
705
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
706 \paragraph{Virtual Productions.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
707 \index{Virtual productions}\index{Production}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
708 Virtual productions are a powerful, shorthand notation for specifying
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
709 optional or repetitive input. Using a virtual production can often
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
710 spare you the trouble of writing out four or five regular productions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
711 and naming them.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
712
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
713 Most of the notation for virtual productions should be familiar since
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
714 it has been used in programming manuals for years. The basic forms
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
715 are these:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
716
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
717 \index{\agcode{[]}}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
718 \index{\_opb\_clb} % {}, apparently. XXX?
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
719 \index{\agcode{...}}\index{Ellipsis}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
720 \index{\agcode{?}}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
721 \index{\agcode{?...}}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
722 \index{\agcode{/...}}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
723
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
724 % XXX figure out how to not have to encode the line breaks manually.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
725 % I think we need one of the fancier extension forms of tabular.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
726
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
727 \begin{tabular}[t]{ll}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
728
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
729 \agcode{[]}&enclosing a list of one or more grammar rules separated
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
730 by \agcode{|} characters denotes an optional\\
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
731 \phantom{}&choice of one of the rules.\\
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
732
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
733 \agcode{\bra \ket}&enclosing a list of one or more grammar rules separated by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
734 \agcode{|} characters indicates a required\\
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
735 \phantom{}&choice of one of the rules.\\
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
736
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
737 \agcode{...}&following a token, or following \agcode{[]} or \agcode{\bra \ket},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
738 indicates arbitrary repetition of the previous choice or\\
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
739 \phantom{}&token.\\
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
740
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
741 \agcode{?}&following a token or set expression makes the token optional.\\
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
742
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
743 \agcode{?...}&following a token or set expression denotes zero or more
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
744 repetitions of the token.\\
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
745
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
746 \agcode{/...}&following \agcode{[]} or \agcode{\bra \ket} indicates arbitrary
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
747 repetition of the choice subject to the constraint that\\
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
748 \phantom{}&choices must alternate.\\
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
749
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
750 \end{tabular}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
751
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
752 \index{Production}\index{Virtual productions}Virtual productions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
753 are simply syntactic devices. When AnaGram encounters a virtual
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
754 production it translates the virtual production into conventional
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
755 productions which show up in your grammar tables. Virtual productions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
756 are described in Chapter 8.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
757
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
758 \paragraph{Definition Statements.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
759 \index{Statement}\index{Definition statement}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
760 AnaGram provides a mechanism for naming particular character sets,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
761 keywords or virtual productions in your syntax file. The name you
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
762 define can be used freely in place of the thing defined. Definition
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
763 statements have the form:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
764
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
765 % XXX why the hell is the first line offset by a small amount?
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
766 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
767 name = \codemeta{character set}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
768 name = \codemeta{virtual production}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
769 name = \codemeta{keyword}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
770 name = \codemeta{immediate action}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
771 name = \codemeta{token name}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
772 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
773
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
774 The name may be any name acceptable to AnaGram. The name can then be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
775 used anywhere you might have used the expression on the right side.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
776 For example:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
777
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
778 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
779 upper case letter = 'A-Z'
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
780 lower case letter = 'a-z'
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
781 letter = upper case letter + lower case letter
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
782 statement list = statement?...
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
783 while = "WHILE"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
784 dos eof = \^{}Z
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
785 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
786
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
787 \paragraph{Disregard Statements.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
788 \index{Disregard statement}\index{Statement}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
789 If you wish your parser generally to skip over certain characters or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
790 constructs in its input, such as blanks, newlines, or comments, you
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
791 may use disregard statements to specify precisely what is to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
792 be skipped and under what circumstances. You may use any number of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
793 disregard statements and you may cause your parser to disregard any
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
794 number of tokens, of whatever complexity. Any \agparam{disregard}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
795 statements must
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
796 be placed in a configuration section. They may appear anywhere in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
797 your syntax file.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
798
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
799 The format of the \agparam{disregard} statement is as follows:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
800
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
801 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
802 disregard \codemeta{token name}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
803 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
804
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
805 The rules for using disregard statements are given in Chapter 8.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
806
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
807 \paragraph{Lexeme Statements.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
808 \index{Lexeme statement}\index{Statement}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
809 The lexeme statement is used to fine tune the disregard statement.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
810 Any \agparam{lexeme} statements must be placed in a configuration
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
811 section. They may appear anywhere in your syntax file.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
812
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
813 The format of the \agparam{lexeme} statement is as follows:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
814
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
815 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
816 lexeme \bra \codemeta{token list} \ket
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
817 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
818
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
819 where \textit{token list} is a list of nonterminal tokens, separated
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
820 by commas.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
821
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
822 The lexeme statement allows you to specify that for the purposes of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
823 the disregard statement, the listed tokens are to be treated as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
824 % XXX make this: indivisible lexical units
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
825 lexical units, so that the disregard statement will be inoperative
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
826 within the tokens specified. Lexeme statements are discussed in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
827 Chapter 8.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
828
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
829 \paragraph{Semantically Determined Productions.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
830 \index{Semantically determined production}\index{Production}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
831 Traditionally, syntax directed parsing systems have enforced a rigid
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
832 distinction between syntactic and semantic information. In these
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
833 systems, it is not possible to use the knowledge that, for example, a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
834 symbol is a function name and not a variable name in the syntax of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
835 programming language. AnaGram provides semantically determined
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
836 productions as a mechanism to allow semantic information to control
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
837 syntactic analysis.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
838
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
839 A semantically determined production has more than one
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
840 \index{Token}\index{Reduction token}reduction token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
841 on the left side. The reduction procedure determines which reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
842 token the parser should use. For example:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
843
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
844 \begin{indentingcode}{0.4in}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
845 variable name, function name
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
846 -> symbol:s = identify{\us}symbol(s);
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
847 \end{indentingcode}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
848
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
849 In this situation, the parser will set the reduction token to the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
850 leftmost token, \agcode{variable name}, before calling the reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
851 procedure. If the reduction procedure finds that \agcode{s} in fact
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
852 identifies a function name, it can change the reduction token to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
853 \agcode{function name}. Using this capability, you can write your grammar in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
854 terms of \agcode{variable name} and \agcode{function name}, just as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
855 though they were syntactically distinguishable. Further discussion of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
856 semantically determined productions may be found in Chapter 9.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
857
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
858 \paragraph{Event Driven Parsers.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
859 \index{Parser}\index{Event driven parser}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
860 Traditionally, parsers have been constructed as subroutines which are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
861 called to analyze a stream of input and return only when they have
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
862 finished parsing the input or have encountered an unrecoverable error.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
863 Such a parser itself calls a subroutine to obtain each unit of input.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
864 There are a number of circumstances where this traditional
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
865 architecture is inconvenient. The most obvious is in situations where
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
866 programs must be event-driven, as in many popular windowing systems.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
867 A somewhat less obvious situation is where one parser, often called a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
868 \index{Lexical scanner}\agterm{lexical scanner}, provides input to a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
869 second. In these situations it is convenient to invert the role of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
870 calling and the called programs.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
871
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
872 AnaGram provides the capability of specifying that a parser it builds
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
873 should be event-driven. In this case, the parser consists of an
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
874 initialization procedure and an event handler. The main program calls
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
875 the initialization procedure and then calls the event handler with
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
876 each input token in turn. The event handler returns every time the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
877 parser needs more input. Details on event-driven parsers are given in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
878 Chapter 9.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
879
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
880 \paragraph{Context tracking.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
881 \index{Context tracking}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
882 When you are writing a reduction procedure for a particular grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
883 rule, you often need to know the value one or another of your program
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
884 variables had at the time the first token in the rule was encountered.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
885 Examples of such variables are:
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
886
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
887 \begin{itemize}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
888 \item Line or column number
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
889 \item Index in an input file
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
890 \item Index into an array
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
891 \item Counters, as of symbols defined, etc.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
892 \end{itemize}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
893
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
894 Such variables can be thought of as representing the ``context'' of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
895 the rule you are reducing. Sometimes it is possible to incorporate
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
896 the values of such variables into the values of reduction tokens, but
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
897 this can become quite cumbersome. AnaGram provides a feature known as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
898 ``context tracking'' to deal with this problem. Details on context
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
899 tracking can be found in Chapter 9.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
900
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
901 \paragraph{Coverage Analysis.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
902 \index{Coverage analysis}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
903 AnaGram has facilities to help you determine the adequacy of your test
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
904 suites. If you set appropriate configuration switches, AnaGram will
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
905 include code in your parser to count the number of times each rule in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
906 your grammar is reduced or the number of times each reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
907 procedure in your grammar is executed. You can use these counts to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
908 determine the completeness of your testing. The use of these
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
909 facilities is discussed in Chapter 9.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
910
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
911 \paragraph{Immediate Actions.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
912 \index{Action}\index{Immediate action}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
913 AnaGram provides for executing particular functions, called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
914 ``immediate actions'', at any point in a grammar rule. Thus,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
915 computation does not need to wait until the rule is complete.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
916 Immediate actions are most useful when using AnaGram to control an
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
917 interactive process. Immediate actions are discussed in Chapter 8.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
918
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
919 \paragraph{Grammar Trace.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
920 \index{Grammar Trace}\index{Window}\index{Trace}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
921 The \agwindow{Grammar Trace} facility of AnaGram allows you to examine
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
922 the workings of your parser in detail. You can set up a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
923 representation of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
924 \index{Parser state stack}\index{State stack}\index{Stack}parser state stack
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
925 and parser state as they might appear in the course of execution of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
926 your parser. You can then examine the possible inputs and see how the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
927 state and the state stack change in response to any input you choose.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
928 You can back up and try other options. You can have several grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
929 traces active simultaneously so you can compare the results of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
930 different sequences of input. For any configuration of the state
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
931 stack, you can see what grammar rules the parser is in the process of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
932 matching.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
933
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
934 The use of the Grammar Trace is described in Chapter 5.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
935
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
936 \paragraph{File Trace.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
937 \index{Trace}\index{File Trace}\index{Window}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
938 The \agwindow{File Trace} is a facility which allows you to see how
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
939 your parser will work on real data. File Trace is an interactive,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
940 interpretive parser governed by the rules in your grammar. It lets
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
941 you see precisely how your grammar parses a test file. You can see
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
942 the contents of the parse stack at any point in the parse. You can
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
943 watch the parse progress at any level of detail you choose. If you
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
944 wish, you may back up and try again. When AnaGram runs the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
945 \agwindow{File Trace} or \agwindow{Grammar Trace} it also builds a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
946 \index{Coverage}\index{Trace Coverage}\index{Window}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
947 \agwindow{Trace Coverage} table.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
948
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
949 Note that AnaGram normally uses a number of short-cut parsing actions.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
950 To make the parse look like a textbook parse, set the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
951 \index{Traditional engine}\agparam{traditional engine}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
952 \index{Configuration parameters}configuration parameter
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
953 in your syntax file.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
954
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
955 \paragraph{Aids to Debugging.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
956 AnaGram provides a number of facilities to help you debug your parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
957 It provides numerous tables that summarize your grammar and a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
958 windowing environment to assist you in inspecting and comparing this
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
959 data. To assist in identifying a conflict in your grammar it can
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
960 provide a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
961 \index{Trace}\index{Conflict Trace}\index{Window}\agwindow{Conflict Trace},
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
962 a pre-built grammar trace showing a sequence of input which will
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
963 trigger the conflict. If your parser encounters a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
964 \index{Syntax error}\index{Errors}syntax error,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
965 you can arrange to have AnaGram build an \agwindow{Error Trace}, a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
966 pre-built grammar trace showing the sequence of input tokens that led
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
967 to the syntax error. These facilities are described in Chapter 5.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
968
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
969 \paragraph{Conflict Resolution.}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
970 \index{Conflict}
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
971 In order to help you deal with conflicts in your grammar, AnaGram
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
972 provides several methods for resolving conflicts. You may include
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
973 \agterm{precedence declarations} to assist in the construction of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
974 simple precedence grammars, you may declare tokens to be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
975 \agparam{sticky}, or you may declare tokens to be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
976 \agterm{subgrammar} tokens.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
977 These techniques are described in Chapter 7.