Mercurial > ~dholland > hg > ag > index.cgi
annotate doc/manual/isdp.tex @ 19:db7ff952e01e
both mansupps seem to be html
author | David A. Holland |
---|---|
date | Tue, 31 May 2022 02:06:45 -0400 |
parents | 13d2b8934445 |
children |
rev | line source |
---|---|
0
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
1 \chapter{Introduction to Syntax Directed Parsing} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
2 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
3 Every programmer has to deal with input data. The units of data may |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
4 consist of single characters from a keyboard or file, unit records |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
5 from a file, mouse events in a windowing system, or even more complex |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
6 data constructs. When the processing of a unit of data depends only on |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
7 the value of that data, it is usually straightforward. Usually, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
8 however, the processing depends also on what has preceded, and often |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
9 even on what follows, the input under consideration. Keeping track of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
10 these dependencies in order to know how to process the data is called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
11 \index{Parsing}\agterm{parsing}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
12 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
13 It is often relatively easy to keep track of simple dependencies when |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
14 first writing a program. But as the program develops, as new features |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
15 are added, as bugs are fixed, the dependencies often stop being |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
16 simple. Then input processing becomes a headache, since it is hard to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
17 keep track of or even identify all the particular cases. Changes to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
18 the program cause unexpected problems. Program maintenance threatens |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
19 to get out of control. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
20 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
21 Syntax directed parsing is a technique for dealing with input data |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
22 streams which keeps you in control of the problem, gives you a high |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
23 degree of confidence in your program, makes bugs less likely and makes |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
24 program maintenance and the addition of new features an easy task. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
25 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
26 To use syntax directed parsing you create a high level description of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
27 the structure of your input data, called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
28 \index{Grammar}\agterm{grammar}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
29 A file which contains a grammar is called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
30 \index{Syntax file}\index{File}\agterm{syntax file}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
31 A |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
32 \index{Parser generator}\agterm{parser generator}, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
33 such as AnaGram, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
34 can create, from a syntax file, a function (or program) called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
35 \index{Parser}\agterm{parser}, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
36 written in C or C++. The parser keeps track of all the dependencies |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
37 in your input, and calls certain functions, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
38 \index{Reduction procedure}\agterm{reduction procedures}, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
39 to deal with specific units or sequences of data as they are |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
40 encountered. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
41 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
42 Reduction procedures are functions you write to process your data. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
43 They are linked, in an obvious way in the grammar, to the structures |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
44 in your input, so that your parser will call them at precisely the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
45 right times with precisely the right data. When you write reduction |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
46 procedures, you can concentrate entirely on what you have to do with |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
47 the data. You don't have to encumber your code with switches and |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
48 tests to determine the structure of your input. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
49 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
50 This chapter describes in a general way how you write a grammar and |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
51 how a parser works. In particular, it defines a number of terms which |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
52 are useful in talking about grammars and syntax directed parsing. It |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
53 then introduces a number of special features of AnaGram. More |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
54 detailed treatment of writing grammars is given in Chapter 8. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
55 Techniques for building a complete functioning parser are discussed in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
56 Chapter 9. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
57 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
58 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
59 \section{Describing an Input Sequence} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
60 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
61 Writing a grammar consists of describing the acceptable input |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
62 sequences for your program. The vehicle for describing an input |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
63 sequence is called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
64 \index{Production}\agterm{production}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
65 Productions show how a logical component of the input can be made up |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
66 of a sequence of more elementary components. A production that |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
67 describes a date might be written: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
68 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
69 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
70 date |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
71 -> name of month, day, comma, year |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
72 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
73 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
74 The components of the input are called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
75 \index{Token}\agterm{tokens}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
76 The sequence of tokens on the \index{Right side}right side of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
77 production is called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
78 \index{Grammar rule}\agterm{grammar rule}, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
79 or \index{Rule}\agterm{rule} for short. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
80 The individual tokens on the right side of the rule are also called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
81 \index{Rule elements}\index{Rule}\agterm{rule elements}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
82 The \agterm{rule length} is the number of elements in the rule. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
83 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
84 The token on the \index{Left side}left side of the production is |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
85 called the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
86 \index{Reduction token}\index{Token}\agterm{reduction token} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
87 for the rule. \index{Token}Tokens may have |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
88 \index{Value}\index{Semantic value}\agterm{semantic values}, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
89 as distinguished from \index{Value}\agterm{syntactic values}, which |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
90 you can use in your reduction procedures. For instance, the value of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
91 \agcode{name of month} could be an integer in the range zero to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
92 eleven, or it could be a pointer to an ASCII string. The value of day |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
93 could be an integer in the range one to thirty-one. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
94 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
95 A grammar consists of a number of such productions, each of which |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
96 describes some component of the input in terms of other components. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
97 It does not take very many productions to describe quite complex input |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
98 streams. A grammar for the C language, for instance, requires about |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
99 two hundred productions. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
100 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
101 Many people find the term \agterm{production} quite confusing. It |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
102 comes from theoretical linguistics where it is used to describe how |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
103 one may produce sequences which correspond to a set of grammatical |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
104 rules. Ironically, the major usage of the idea has been in parsing |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
105 where the interest is not so much in creating sequences which satisfy |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
106 the grammatical rules as in decoding and analyzing such sequences. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
107 Nonetheless, it is convenient, in the above example, to say that the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
108 token \agcode{date} \agterm{produces} a sequence of tokens consisting |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
109 of \agcode{name of month}, \agcode{day}, \agcode{comma}, and |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
110 \agcode{year}. We also say that the sequence \agterm{reduces} to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
111 \agcode{date}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
112 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
113 There may be more than one production to describe a given component, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
114 if there is more than one way it may be represented. For instance, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
115 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
116 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
117 date |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
118 -> day, name of month, year |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
119 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
120 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
121 describes another common way of writing a date. In other words, a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
122 reduction token may produce a number of different grammar rules. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
123 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
124 Tokens which appear on the left side of one or more productions are called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
125 \index{Token}\index{Nonterminal token}\agterm{nonterminal tokens}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
126 Those which appear \emph{only} on the right sides of productions are |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
127 called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
128 \index{Token}\index{Terminal token}\agterm{terminal tokens}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
129 Terminal tokens are the units which actually appear physically in the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
130 input. Nonterminal tokens are identified when a sequence of tokens |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
131 that matches the right side of a production is seen in the input. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
132 When AnaGram analyzes a grammar, it assigns a unique |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
133 \index{Token number}\index{Number}\index{Token}\agterm{token number} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
134 to each token it finds in the grammar. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
135 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
136 Nonterminal tokens, such as \agcode{date} in the example above, may |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
137 appear in any grammar rule just as though they were input tokens. The |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
138 token on the left side of a production can even appear on the right |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
139 side as well. Such a production is called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
140 \index{Production}\index{Recursive productions}\agterm{recursive} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
141 production. When a nonterminal token appears on the right side of a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
142 production, it may be represented in this context by \emph{any} of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
143 the grammar rules it produces. Grammars described in this manner are |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
144 called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
145 \index{Context free grammar}\index{Grammar}\agterm{context free grammars} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
146 since there is no contextual constraint on which of the rules that a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
147 token produces can appear in any given context. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
148 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
149 Recursive productions may be either left recursive or right recursive. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
150 \agterm{Left recursive} productions are those where the recursively |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
151 defined nonterminal appears as the first element in the recursive |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
152 rule. \agterm{Right recursive} productions are those where it is the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
153 last element. If it appears anywhere between, the production is said |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
154 to be \agterm{center recursive}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
155 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
156 Any nonterminal token which has a recursive production must also have |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
157 at least one simple, non-recursive production. Otherwise, it is not |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
158 possible to create a finite sequence of terminal tokens from the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
159 nonterminal token. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
160 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
161 Recursion may also occur implicitly in a grammar when one of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
162 tokens on the right side of a production itself has a production |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
163 involving the token on the left. Such implicit recursion sometimes |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
164 may involve numerous levels of productions. Implicit recursion occurs |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
165 most commonly in describing constructs such as arithmetic expressions |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
166 or the block structure of programming languages. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
167 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
168 Clearly, grammars can accommodate multiple levels of structure in the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
169 input sequences they describe. There must, at the top, be a single |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
170 token which encompasses the entire input. This special token is |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
171 variously called the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
172 \index{Grammar token}\index{Configuration parameters}\index{Token} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
173 \agterm{grammar token}, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
174 the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
175 \index{Goal token}\agterm{goal token} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
176 or the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
177 \index{Start token}\agterm{start token}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
178 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
179 In AnaGram grammars, terminal tokens need not be declared or otherwise |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
180 defined. Any token name that appears only on the right side of a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
181 production is taken to be a terminal token, and AnaGram presumes that |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
182 the input procedure that provides tokens to the parser will be able to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
183 identify them to the parser appropriately. The details on how to do |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
184 this are explained in Chapter 9. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
185 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
186 On the other hand, AnaGram allows you to specify terminal tokens |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
187 explicitly as ASCII characters, or even as sets of ASCII characters, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
188 right in the grammar. Thus, you may write \agcode{'0-9'} to represent |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
189 the set of ASCII digits, or \agcode{'A-Z'} to represent the set of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
190 upper case letters. The |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
191 \index{Semantic value}\index{Value}\index{Token}semantic value |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
192 of such a token is the ASCII \index{Character codes}character code |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
193 that actually appears in the input stream. The rules for representing |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
194 \index{Character sets}character sets are given in Chapter 8. If the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
195 various sets you use in your grammar overlap, they may not properly |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
196 represent terminal tokens. In this case, AnaGram automatically |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
197 extends your grammar appropriately. This ``set partition'' logic is |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
198 described in Chapter 6. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
199 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
200 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
201 \section{How a Parser Works} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
202 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
203 The aim of a \index{Parser}parser is to match its input with the full |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
204 syntactic structure specified by the productions which make up the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
205 grammar. The primary component of a parser is an input buffer, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
206 sometimes thought of as a stack, into which tokens are |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
207 \agterm{shifted}, or stored sequentially, as they are encountered in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
208 the input. At the same time that a token is shifted into the input |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
209 buffer, its |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
210 \index{Semantic value}\index{Token}\index{Value}semantic value is |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
211 pushed onto the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
212 \index{Parser value stack}\index{Value stack}\index{Stack} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
213 \agterm{value stack}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
214 A token is not shifted into the buffer unless it ``makes sense'', that |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
215 is, unless it is consistent with the rules of the grammar and with the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
216 input that has preceded it. If a token does not make sense, the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
217 parser signals a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
218 \index{Syntax error}\index{Errors}\agterm{syntax error}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
219 In order to determine whether a token makes sense, the parser has a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
220 sort of decision table which provides a list of acceptable tokens for |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
221 each of a number of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
222 \index{Parser}\index{State}\agterm{states}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
223 The table also specifies what the parser is to do with each acceptable |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
224 token. When the table indicates that a token is to be shifted into |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
225 the buffer, it also specifies a new state. The parser stacks the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
226 current state on a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
227 \index{Parser state stack}\index{State stack}\index{Stack} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
228 \agterm{state stack} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
229 and jumps to the new state. Thus every time a token is shifted into |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
230 the input buffer, a state number is pushed onto the state stack. For |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
231 each state of the parser, excepting only the initial state, there is a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
232 unique token which will cause a jump to that state. This token is |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
233 called the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
234 \index{Characteristic token}\index{Token}\agterm{characteristic token} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
235 for the state. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
236 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
237 When the rightmost, or most recent, tokens in the input buffer match |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
238 the right side of a production precisely, the parser \emph{may} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
239 replace the tokens that match the rule with a single token, the token |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
240 on the left side of the production. This process of replacing a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
241 sequence of tokens with a single token is called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
242 \index{Reduction}\agterm{reduction}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
243 The token that replaces the sequence of tokens is called the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
244 \index{Reduction token}\index{Token}\agterm{reduction token}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
245 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
246 The actual mechanism of the reduction is quite important. At the same |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
247 time that input tokens are removed from the input buffer, state |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
248 numbers are popped from the state stack, so that when all input tokens |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
249 matching the rule have been removed, the parser |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
250 \index{Parser}\index{State}state has been restored to the value it had |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
251 at the time the first token in the rule was seen. As state numbers |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
252 are popped from the state stack, token values are popped from the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
253 value stack. If the rule has a reduction procedure, temporary |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
254 variables are loaded with the values popped from the stack and the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
255 reduction procedure is called. The reduction token is now shifted |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
256 into the input buffer just as though it were an input token. If the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
257 reduction procedure returned a result it is shifted into the value |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
258 stack as the value of the reduction token. The parser stacks the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
259 current state again and jumps to a new state as determined by the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
260 parser tables. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
261 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
262 If there are no errors in your input, when the last token has been |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
263 read from the input, shifted into the input buffer, and reductions |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
264 performed, there will be precisely one token in the input buffer: the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
265 \index{Grammar token}\index{Goal token}grammar, or goal, token which |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
266 describes your entire input. At this point your parser declares |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
267 itself finished. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
268 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
269 Reductions do not necessarily occur every time a rule matches the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
270 tokens in the input buffer. If the reduction token does not make |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
271 sense, that is, if it is not consistent with the rules of the grammar |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
272 and with the input that has preceded it, the parser will not perform |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
273 the reduction. Suppose there are tokens in the input which match one |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
274 of the rules given above for \agcode{date}. A reduction will not |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
275 occur unless \agcode{date} is one of the tokens the parser is actually |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
276 looking for at that stage in the input. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
277 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
278 Even when the reduction token would make sense, there are still |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
279 situations where the reduction would not take place. Suppose a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
280 grammar includes the two productions given above for \agcode{date} as |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
281 well as the following: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
282 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
283 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
284 date |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
285 -> name of month, day |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
286 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
287 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
288 This production is the same as the first, but with no year |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
289 specification. We often write dates without specifying the year |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
290 explicitly. The year is usually understood from context. When a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
291 parser directed by this grammar has encountered \agcode{name of month} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
292 and \agcode{day}, it can't tell without looking further whether it has |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
293 a short form or a long form date. In such a circumstance, the parser |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
294 looks at the next following token, which is called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
295 \index{Lookahead token}\index{Token}\agterm{lookahead token}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
296 If the lookahead token is a comma, then, in the absence of other |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
297 productions, the input is a long form date. If the lookahead token is |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
298 not a comma, then the input is certainly not a long form date, and it |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
299 is proper to reduce the short form production. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
300 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
301 Suppose the lookahead token were a comma and the grammar were also to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
302 contain the following production: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
303 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
304 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
305 appointment |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
306 -> date, comma, time |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
307 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
308 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
309 Since a comma can follow date, according to this rule, and can also |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
310 follow day according to the first production, it is impossible to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
311 determine, simply by looking at the lookahead token, whether the date |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
312 was being given in short form or long form. One would have to look |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
313 beyond the comma to see if what follows the comma matches the rules |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
314 for time or for year. Although it is possible to build parsers which |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
315 can do this, it is not generally feasible. This situation is called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
316 \index{Shift-reduce conflict}\index{Conflicts} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
317 \agterm{shift-reduce conflict}, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
318 because the parser cannot decide simply on the basis of the lookahead |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
319 token whether to reduce the short form rule or to ignore it and shift |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
320 the comma into the input buffer. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
321 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
322 AnaGram finds and warns you about the conflicts in your grammar when |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
323 it analyzes your grammar. To do this, it checks all states to find |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
324 all completed rules. A |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
325 \index{Completed rule}\index{Rule}\agterm{completed rule} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
326 is a rule that is completely matched by the rightmost tokens in the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
327 input buffer. For each such rule, AnaGram determines all of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
328 terminal tokens which could follow once that rule has been reduced. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
329 These tokens are sometimes called the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
330 \index{Token}\agterm{reducing tokens} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
331 for the given rule in the given state. (Note that ``reducing tokens'' |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
332 are quite different from ``reduction tokens''.) AnaGram checks each |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
333 state for shift-reduce conflicts by comparing the set of tokens which |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
334 can be shifted against all sets of reducing tokens for that state. If |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
335 there is an overlap, there is a conflict. When AnaGram finds a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
336 conflict, it prepares a detailed report on the conflict. This |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
337 information is then available in the \agwindow{Conflicts} window. A |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
338 shift-reduce conflict does not prevent you from building a parser. If |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
339 you build a parser for a grammar which has such a shift-reduce |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
340 conflict, the parser will resolve the issue by shifting the lookahead |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
341 token into the input buffer and ignoring the reduction. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
342 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
343 There is another type of conflict involving reductions. Suppose a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
344 given state has more than one |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
345 \index{Completed rule}\index{Rule}completed rule. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
346 AnaGram then checks the sets of reducing tokens for all the completed |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
347 rules. If there is an overlap, there is no way for AnaGram to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
348 determine which rule should be reduced simply on the basis of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
349 \index{Token}\index{Lookahead token}lookahead token. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
350 This situation is called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
351 \index{Reduce-reduce conflicts}\index{Conflicts} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
352 \agterm{reduce-reduce conflict}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
353 AnaGram will also diagnose reduce-reduce conflicts in the Conflicts |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
354 window. If you decide to ignore a reduce-reduce conflict and ask |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
355 AnaGram to build a parser, the parser will choose the rule AnaGram |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
356 encountered first when it read your grammar. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
357 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
358 \index{Conflicts}Conflicts are not necessarily errors. Sometimes you |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
359 can build a more compact, faster parser by deliberately introducing |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
360 conflicts. You can rely on AnaGram's default resolution of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
361 conflicts, or you may use operator precedence to resolve the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
362 conflicts. The use of operator precedence is described in Chapter 9. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
363 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
364 AnaGram has extensive facilities for identifying and correcting |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
365 unwanted conflicts. These facilities are described in Chapter 7. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
366 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
367 Traditionally, parsers are built using only shift and reduce actions |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
368 as described above. The parsers AnaGram builds normally use a number |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
369 of more complex actions, each of which is equivalent to a number of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
370 shift and reduce actions. These complex actions permit AnaGram to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
371 build faster, more compact parsers. If you wish, however, you may |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
372 restrict AnaGram to using only shift and reduce actions by setting the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
373 \index{Traditional engine}\index{Configuration switches} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
374 \agparam{traditional engine} configuration switch. See Appendix A, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
375 Configuration Parameters. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
376 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
377 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
378 \section{A Note on Notation} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
379 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
380 \index{Context free grammar}\index{Grammar}Context free grammars have |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
381 been traditionally represented in the literature using |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
382 \index{Backus-Naur Form}\agterm{Backus-Naur Form}, or |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
383 \index{BNF}\agterm{BNF}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
384 In Backus-Naur Form, certain characters, called metacharacters, are |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
385 used to punctuate productions and all other printable characters are |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
386 taken to represent themselves literally. Named tokens are denoted by |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
387 enclosing the name within angle brackets ($< >$). The left side of a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
388 production is distinguished from the right side by the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
389 % XXX s/characters/symbol/ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
390 characters $::=$. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
391 If several productions have the same left side, the vertical |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
392 bar $|$ is used to separate them. The elements of a grammar |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
393 rule are simply juxtaposed to indicate that one follows another. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
394 Blanks are ignored. Thus, in BNF, the first production given for |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
395 \agcode{date}, above, would be: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
396 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
397 % ~ is a hard space |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
398 $$ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
399 <date>~~::=~~<name~of~month>~~<day>~~,~~<year> |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
400 $$ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
401 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
402 AnaGram uses a notation more consonant with ordinary programming |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
403 usage. Thus token names need not be bracketed and literal characters |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
404 must be appropriately quoted. The elements of rules are joined by |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
405 commas. Using this approach, there is no need for metacharacters and |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
406 it becomes possible to make a number of useful extensions to the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
407 notation. The detailed rules for writing grammars using AnaGram's |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
408 notation are given in Chapter 8. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
409 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
410 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
411 \section{Reduction Procedures} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
412 \index{Reduction procedure} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
413 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
414 Of course, the reason for parsing an input stream is to interpret the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
415 data in the stream and to process it in some useful manner. The |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
416 primary tool for doing this is the \agterm{reduction procedure}. A |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
417 reduction procedure is a piece of C code that is executed when a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
418 particular grammar rule is reduced. Often, a reduction procedure |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
419 calculates a value which becomes the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
420 \index{Value}\index{Semantic value}semantic value |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
421 of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
422 \index{Token}\index{Reduction token}reduction token. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
423 The input to the reduction procedure consists of the values of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
424 tokens that make up the grammar rule. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
425 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
426 AnaGram allows you to assign C variable names to the tokens in the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
427 grammar rule, so that you can refer to them in the reduction |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
428 procedure. To assign a C variable name, simply follow the token in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
429 the rule with a colon and the C variable name. Simple reduction |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
430 procedures can be written as C or C++ expressions. At the end of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
431 rule, write an equal sign and an expression, terminated by a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
432 semicolon. For example: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
433 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
434 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
435 (int) hex digit |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
436 -> '0-9':d = d-'0'; |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
437 -> 'a-f':d = d-'a'+10; |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
438 -> 'A-F':d = d-'A'+10; |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
439 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
440 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
441 When any one of these rules is matched, the value of the token in the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
442 rule is assigned to the temporary variable \agcode{d}. The expression |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
443 to the right of the equal sign is evaluated and the value of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
444 expression is stored as the value of \agcode{hex digit}, which has |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
445 been declared to be an \agcode{int}. These productions define |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
446 hexadecimal digits as ASCII characters, and calculate the binary value |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
447 of the digit in terms of the ASCII character code, \agcode{d}. The |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
448 binary value becomes the value of \agcode{hex digit}. Hexadecimal |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
449 digits can be combined to make hexadecimal numbers by writing the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
450 following productions: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
451 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
452 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
453 (int) hex number |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
454 -> hex digit |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
455 -> hex number:n, hex digit:d = 16*n+d; |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
456 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
457 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
458 %In this example, note that if you do not specify a reduction procedure for |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
459 %a grammar rule, the value of the reduction token will be taken to be the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
460 %value of the first token in the rule. Thus, it is not necessary to provide |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
461 %a reduction procedure in the first production for hex number. The second |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
462 %production recursively defines the value of hexadecimal numbers that have |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
463 %more than one digit. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
464 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
465 There are several important points to notice in this example. First, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
466 reduction procedures are executed ``from the bottom up''. That is, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
467 the reduction procedure for \agcode{hex digit} is executed before any |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
468 reduction procedure for \agcode{hex number}. Second, if there is no |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
469 reduction procedure for a production, the value of the first token in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
470 the rule is assigned to the reduction token. Thus, it is not |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
471 necessary to provide a reduction procedure in the first production for |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
472 \agcode{hex number}. Third, the reduction procedures for recursive |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
473 productions are always executed \emph{after} the reduction procedure, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
474 if any, for the non-recursive production which begins the recursion. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
475 Finally, when an input sequence is described using left recursion, as |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
476 in this example, the elements of the sequence are processed left to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
477 right. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
478 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
479 % |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
480 % XXX this is a bad example as you never want to read floats this way! |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
481 % Can we think of something else? |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
482 % |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
483 If you wish to process the elements of a sequence right to left, you |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
484 may use right recursion. For example, it is sometimes convenient to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
485 define the fraction part of a decimal number thus: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
486 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
487 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
488 (double) fraction part |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
489 -> '0-9':d = (d - '0')/10.0; |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
490 -> '0-9':d, fraction part:f = (d - '0' + f)/10.0; |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
491 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
492 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
493 In this case the leading digits are stored temporarily in the parse |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
494 stack, and then the fraction part is evaluated right to left only when |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
495 the last digit has been found. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
496 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
497 Reduction procedures can be more complex than simple expressions. After the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
498 equal sign you may include an arbitrary block of C or C++ code, enclosed in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
499 braces, \agcode{\bra \ket}. To return a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
500 \index{Semantic value}\index{Token}\index{Value}semantic value |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
501 for the reduction token simply use a return statement. Of course, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
502 reduction procedures have the full resources of C or C++ at their |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
503 disposal. They may set and interrogate global variables and may call |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
504 functions freely. Reduction procedures are usually implemented as C |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
505 functions. Very simple procedures may be implemented as macros, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
506 unless you have disabled this option by turning off the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
507 \index{Macros}\index{Allow macros}\agparam{allow macros} configuration |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
508 switch. See Chapter 8 for the rules for writing reduction procedures. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
509 When AnaGram builds a parser it copies all of the reduction procedures |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
510 you have defined to the parser file, and includes code to call them at |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
511 the right time. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
512 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
513 Since the reduction procedures you write will probably need some |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
514 support code, such as \agcode{\#include} statements and declarations, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
515 you may incorporate C or C++ code into your syntax file at any point. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
516 You need only enclose it in braces, \agcode{\bra \ket}. Such code |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
517 is called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
518 \index{Embedded C}\agterm{embedded C}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
519 All embedded C code is also copied to the \index{File}parser file, and |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
520 \emph{precedes} all of your reduction procedures. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
521 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
522 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
523 \section{Building a Parser} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
524 \index{Building a Parser}\index{Parser} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
525 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
526 In order to round out a parser into a functioning program it needs |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
527 input procedures, as well as error diagnosis and recovery |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
528 capabilities. AnaGram has a number of options available which give |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
529 you a high degree of flexibility in configuring a parser to suit your |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
530 particular needs. All of the options are provided with reasonable |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
531 defaults, so that you can safely disregard any option until you need |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
532 the features it provides. The following paragraphs summarize the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
533 options AnaGram provides. These topics are discussed in greater |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
534 detail in Chapter 9. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
535 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
536 \paragraph{Invoking a Parser.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
537 \index{Parser} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
538 Normally, AnaGram configures parsers as functions which you can call |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
539 from elsewhere in your program. In this situation, you call the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
540 parser, it processes its input, and returns either when it has |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
541 finished or cannot proceed because of errors. Alternatively, if you |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
542 set the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
543 \index{Event driven}\agparam{event driven} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
544 configuration switch, your parser will be configured so that you have |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
545 two procedures to call: an \index{Initializer}\agterm{initializer} and |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
546 a \index{Parser}\agterm{parser}. In the event driven configuration |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
547 you start the parse by calling the initializer and then you call the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
548 parser once for each unit of input. Using the event driven mode makes |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
549 it quite easy to configure a parser as a filter and to chain several |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
550 parsers together so that the output from one parser is the input to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
551 the next. Such |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
552 \index{Parsing}\index{Multi-stage parsing}\agterm{multi-stage parsing} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
553 is a convenient way to deal with complex input that is not context free. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
554 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
555 \paragraph{Communicating with a Parser.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
556 The complete status of your parser is contained in a single data |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
557 structure called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
558 \index{Parser control block}\index{Control block} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
559 \agterm{parser control block}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
560 All communications with a parser take place via the parser control |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
561 block. \index{Input procedures}Input procedures must place input data |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
562 in the appropriate field in the parser control block. When the parse |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
563 is complete or encounters an error, the results of the parse may be |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
564 found in the parser control block. When AnaGram builds a parser it |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
565 includes, in the \index{File}\index{Header file}header file it |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
566 generates, a \agcode{typedef} statement which defines the structure of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
567 the parser control block. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
568 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
569 \paragraph{Parser Input.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
570 \index{Input}\index{Parser}\index{Input procedures} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
571 The input to your parser may be either characters read directly from |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
572 an input stream, or tokens accumulated by a pre-processor or lexical |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
573 scanner. The way you provide input to your parser depends on how your |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
574 grammar defines input tokens and also on whether or not you have |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
575 requested an event driven parser. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
576 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
577 If your parser is event driven, you provide its input by storing the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
578 input code and the input value, if any, into the parser control block |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
579 and calling the parser. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
580 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
581 If you have set the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
582 \index{Pointer input}\agparam{pointer input} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
583 \index{Configuration switches}configuration switch |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
584 in your syntax file, you simply initialize the pointer field in your |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
585 parser control block before you call your parser. Your parser will |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
586 then read its input directly from memory by simply incrementing the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
587 pointer as necessary. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
588 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
589 Otherwise, your parser will invoke a macro called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
590 \index{\agcode{GET{\us}INPUT}}\index{Macros}\agcode{GET{\us}INPUT} every |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
591 time it needs more input. You may define \agcode{GET{\us}INPUT} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
592 according to your needs. You can define it so that it calls an input |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
593 function, or you can define it so that it executes in-line code each |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
594 time it is invoked. Your \agcode{GET{\us}INPUT} macro should store its |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
595 input code in the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
596 \index{\agcode{input{\us}code}}\agcode{input{\us}code} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
597 field of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
598 \index{PCB}parser control block. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
599 If you do not write a \agcode{GET{\us}INPUT} macro, AnaGram will provide |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
600 one which will read characters from \agcode{stdin}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
601 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
602 If your grammar does not define terminal tokens in terms of ASCII |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
603 characters or external token numbers, your \agcode{GET{\us}INPUT} will |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
604 have to determine the appropriate internal |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
605 \index{Token number}\index{Token}\index{Number}token number |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
606 for each input token. To assist you in determining these token |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
607 numbers AnaGram provides a \agcode{typedef enum} statement in the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
608 \index{File}\index{Header file}header file. You can then use named |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
609 constants to specify the internal token numbers for the input tokens. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
610 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
611 \paragraph{Error Handling.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
612 \index{Error handling}\index{Error diagnosis}\index{Error recovery} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
613 Your parser must be prepared to deal with erroneous input. There are |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
614 two aspects to error handling: diagnosing the error, and recovering |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
615 from the error. On encountering an error, your parser will invoke a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
616 macro called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
617 \index{\agcode{SYNTAX{\us}ERROR}}\index{Macros}\agcode{SYNTAX{\us}ERROR}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
618 If you do not provide a definition for \agcode{SYNTAX{\us}ERROR}, AnaGram |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
619 will provide a simple error diagnostic. AnaGram can also provide |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
620 automatic error diagnoses which pinpoint the location of the error. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
621 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
622 \index{Resynchronization} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
623 AnaGram provides two options for error recovery: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
624 \agterm{error token resynchronization} and |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
625 \agterm{automatic resynchronization}. These are |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
626 techniques for getting your parser back in synch with its input so it |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
627 can proceed after encountering an error. Normally, if you do not |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
628 select one of these recovery techniques, your parser will terminate |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
629 when it encounters an error; however, you may override this default if |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
630 you wish and provide your own recovery technique. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
631 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
632 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
633 \section{Special Features of AnaGram} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
634 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
635 AnaGram provides a number of special features not generally available |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
636 in other parsing systems. These features extend the applicability of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
637 syntax directed parsing, make it easier for you to write and debug |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
638 your grammars, and make it easier to interface your parser to other |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
639 parsers and to your main program. The following paragraphs provide a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
640 brief introduction to these features. Their use is described more |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
641 fully in the appropriate chapters in this manual. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
642 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
643 \paragraph{Configuration Parameters.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
644 \index{Configuration parameters}\index{Parameters} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
645 AnaGram has a number of configuration switches and parameters which |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
646 you can use to control the way AnaGram works in various circumstances. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
647 Since programmers often have different requirements, AnaGram strives |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
648 to be as flexible as possible. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
649 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
650 Configuration parameters may be set in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
651 \index{Configuration file}\index{File}configuration files, or in your |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
652 syntax file. The rules for writing configuration parameters are given |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
653 in Chapter 8. A summary of all configuration parameters is given in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
654 Appendix A. The use of particular configuration parameters is |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
655 described as appropriate throughout this manual. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
656 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
657 \paragraph{Character Sets}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
658 \index{Character sets} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
659 In your grammar, if you write terminal tokens explicitly as characters |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
660 or as sets of ASCII characters, AnaGram will set up your parser to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
661 accept ASCII characters directly, so you will not need to build a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
662 \index{Lexical scanner}lexical scanner and interface it to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
663 your parser. Character sets may be specified in terms of ranges of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
664 characters, such as \agcode{'a-z'}, as unions, intersections or |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
665 differences of sets, denoted by \agcode{+}, \agcode{\&}, or \agcode{-} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
666 respectively, or as the complement of a set, denoted by \agcode{\~{}}. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
667 You may also specify a set consisting of a single character either |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
668 with a literal character or with a numeric specification. The |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
669 detailed rules for character set expressions are given in Chapter 8. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
670 Some useful character sets are: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
671 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
672 \begin{itemize} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
673 \item Letters: \agcode{'a-z' + 'A-Z'} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
674 \item Digits: \agcode{'0-9'} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
675 \item DOS end of file: \agcode{-1 + \^{}Z} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
676 \item Neither end of file nor end of line: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
677 \agcode{\~{}(-1 + \^{}Z |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
678 + '{\bs}n')} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
679 \end{itemize} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
680 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
681 \paragraph{Keywords.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
682 \index{Keywords} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
683 AnaGram makes special provision for the use of keywords in the parsers |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
684 it builds. In your grammar, you may specify a keyword as a quoted |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
685 string. AnaGram will incorporate special string matching logic in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
686 your parser to recognize keywords in the parser input. When a keyword |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
687 is recognized, it is treated as an individual token, completely |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
688 independent of the characters which make up the keyword. Some |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
689 examples of keywords: \agcode{"let"}, \agcode{".EQ."}, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
690 \agcode{"<>"}, \agcode{"/*"}. The |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
691 rules for the use of keyword strings are given in Chapter 8. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
692 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
693 Keywords are a powerful and useful tool, but they are not completely |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
694 consistent with the assumptions underlying the use of context-free |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
695 grammars. Thus under certain circumstances it is possible for a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
696 parser built from a grammar that uses keywords to fail |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
697 % XXX s/parse correctly/correctly parse/ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
698 to parse correctly text that |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
699 appears to be consistent with the grammar. Such a situation is called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
700 \index{Keyword anomalies}\index{Anomaly}keyword anomaly. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
701 AnaGram detects and diagnoses keyword anomalies in your grammar, and |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
702 provides tools to help you find and correct the problem. Many such |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
703 apparent anomalies, however, are completely innocuous. Keyword |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
704 anomalies are discussed in greater detail in Chapter 7. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
705 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
706 \paragraph{Virtual Productions.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
707 \index{Virtual productions}\index{Production} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
708 Virtual productions are a powerful, shorthand notation for specifying |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
709 optional or repetitive input. Using a virtual production can often |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
710 spare you the trouble of writing out four or five regular productions |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
711 and naming them. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
712 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
713 Most of the notation for virtual productions should be familiar since |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
714 it has been used in programming manuals for years. The basic forms |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
715 are these: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
716 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
717 \index{\agcode{[]}} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
718 \index{\_opb\_clb} % {}, apparently. XXX? |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
719 \index{\agcode{...}}\index{Ellipsis} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
720 \index{\agcode{?}} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
721 \index{\agcode{?...}} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
722 \index{\agcode{/...}} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
723 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
724 % XXX figure out how to not have to encode the line breaks manually. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
725 % I think we need one of the fancier extension forms of tabular. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
726 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
727 \begin{tabular}[t]{ll} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
728 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
729 \agcode{[]}&enclosing a list of one or more grammar rules separated |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
730 by \agcode{|} characters denotes an optional\\ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
731 \phantom{}&choice of one of the rules.\\ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
732 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
733 \agcode{\bra \ket}&enclosing a list of one or more grammar rules separated by |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
734 \agcode{|} characters indicates a required\\ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
735 \phantom{}&choice of one of the rules.\\ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
736 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
737 \agcode{...}&following a token, or following \agcode{[]} or \agcode{\bra \ket}, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
738 indicates arbitrary repetition of the previous choice or\\ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
739 \phantom{}&token.\\ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
740 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
741 \agcode{?}&following a token or set expression makes the token optional.\\ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
742 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
743 \agcode{?...}&following a token or set expression denotes zero or more |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
744 repetitions of the token.\\ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
745 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
746 \agcode{/...}&following \agcode{[]} or \agcode{\bra \ket} indicates arbitrary |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
747 repetition of the choice subject to the constraint that\\ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
748 \phantom{}&choices must alternate.\\ |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
749 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
750 \end{tabular} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
751 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
752 \index{Production}\index{Virtual productions}Virtual productions |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
753 are simply syntactic devices. When AnaGram encounters a virtual |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
754 production it translates the virtual production into conventional |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
755 productions which show up in your grammar tables. Virtual productions |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
756 are described in Chapter 8. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
757 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
758 \paragraph{Definition Statements.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
759 \index{Statement}\index{Definition statement} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
760 AnaGram provides a mechanism for naming particular character sets, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
761 keywords or virtual productions in your syntax file. The name you |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
762 define can be used freely in place of the thing defined. Definition |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
763 statements have the form: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
764 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
765 % XXX why the hell is the first line offset by a small amount? |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
766 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
767 name = \codemeta{character set} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
768 name = \codemeta{virtual production} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
769 name = \codemeta{keyword} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
770 name = \codemeta{immediate action} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
771 name = \codemeta{token name} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
772 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
773 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
774 The name may be any name acceptable to AnaGram. The name can then be |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
775 used anywhere you might have used the expression on the right side. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
776 For example: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
777 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
778 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
779 upper case letter = 'A-Z' |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
780 lower case letter = 'a-z' |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
781 letter = upper case letter + lower case letter |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
782 statement list = statement?... |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
783 while = "WHILE" |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
784 dos eof = \^{}Z |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
785 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
786 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
787 \paragraph{Disregard Statements.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
788 \index{Disregard statement}\index{Statement} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
789 If you wish your parser generally to skip over certain characters or |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
790 constructs in its input, such as blanks, newlines, or comments, you |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
791 may use disregard statements to specify precisely what is to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
792 be skipped and under what circumstances. You may use any number of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
793 disregard statements and you may cause your parser to disregard any |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
794 number of tokens, of whatever complexity. Any \agparam{disregard} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
795 statements must |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
796 be placed in a configuration section. They may appear anywhere in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
797 your syntax file. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
798 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
799 The format of the \agparam{disregard} statement is as follows: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
800 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
801 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
802 disregard \codemeta{token name} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
803 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
804 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
805 The rules for using disregard statements are given in Chapter 8. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
806 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
807 \paragraph{Lexeme Statements.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
808 \index{Lexeme statement}\index{Statement} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
809 The lexeme statement is used to fine tune the disregard statement. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
810 Any \agparam{lexeme} statements must be placed in a configuration |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
811 section. They may appear anywhere in your syntax file. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
812 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
813 The format of the \agparam{lexeme} statement is as follows: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
814 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
815 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
816 lexeme \bra \codemeta{token list} \ket |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
817 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
818 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
819 where \textit{token list} is a list of nonterminal tokens, separated |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
820 by commas. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
821 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
822 The lexeme statement allows you to specify that for the purposes of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
823 the disregard statement, the listed tokens are to be treated as |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
824 % XXX make this: indivisible lexical units |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
825 lexical units, so that the disregard statement will be inoperative |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
826 within the tokens specified. Lexeme statements are discussed in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
827 Chapter 8. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
828 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
829 \paragraph{Semantically Determined Productions.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
830 \index{Semantically determined production}\index{Production} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
831 Traditionally, syntax directed parsing systems have enforced a rigid |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
832 distinction between syntactic and semantic information. In these |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
833 systems, it is not possible to use the knowledge that, for example, a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
834 symbol is a function name and not a variable name in the syntax of a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
835 programming language. AnaGram provides semantically determined |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
836 productions as a mechanism to allow semantic information to control |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
837 syntactic analysis. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
838 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
839 A semantically determined production has more than one |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
840 \index{Token}\index{Reduction token}reduction token |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
841 on the left side. The reduction procedure determines which reduction |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
842 token the parser should use. For example: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
843 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
844 \begin{indentingcode}{0.4in} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
845 variable name, function name |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
846 -> symbol:s = identify{\us}symbol(s); |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
847 \end{indentingcode} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
848 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
849 In this situation, the parser will set the reduction token to the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
850 leftmost token, \agcode{variable name}, before calling the reduction |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
851 procedure. If the reduction procedure finds that \agcode{s} in fact |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
852 identifies a function name, it can change the reduction token to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
853 \agcode{function name}. Using this capability, you can write your grammar in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
854 terms of \agcode{variable name} and \agcode{function name}, just as |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
855 though they were syntactically distinguishable. Further discussion of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
856 semantically determined productions may be found in Chapter 9. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
857 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
858 \paragraph{Event Driven Parsers.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
859 \index{Parser}\index{Event driven parser} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
860 Traditionally, parsers have been constructed as subroutines which are |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
861 called to analyze a stream of input and return only when they have |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
862 finished parsing the input or have encountered an unrecoverable error. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
863 Such a parser itself calls a subroutine to obtain each unit of input. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
864 There are a number of circumstances where this traditional |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
865 architecture is inconvenient. The most obvious is in situations where |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
866 programs must be event-driven, as in many popular windowing systems. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
867 A somewhat less obvious situation is where one parser, often called a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
868 \index{Lexical scanner}\agterm{lexical scanner}, provides input to a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
869 second. In these situations it is convenient to invert the role of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
870 calling and the called programs. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
871 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
872 AnaGram provides the capability of specifying that a parser it builds |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
873 should be event-driven. In this case, the parser consists of an |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
874 initialization procedure and an event handler. The main program calls |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
875 the initialization procedure and then calls the event handler with |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
876 each input token in turn. The event handler returns every time the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
877 parser needs more input. Details on event-driven parsers are given in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
878 Chapter 9. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
879 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
880 \paragraph{Context tracking.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
881 \index{Context tracking} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
882 When you are writing a reduction procedure for a particular grammar |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
883 rule, you often need to know the value one or another of your program |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
884 variables had at the time the first token in the rule was encountered. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
885 Examples of such variables are: |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
886 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
887 \begin{itemize} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
888 \item Line or column number |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
889 \item Index in an input file |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
890 \item Index into an array |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
891 \item Counters, as of symbols defined, etc. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
892 \end{itemize} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
893 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
894 Such variables can be thought of as representing the ``context'' of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
895 the rule you are reducing. Sometimes it is possible to incorporate |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
896 the values of such variables into the values of reduction tokens, but |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
897 this can become quite cumbersome. AnaGram provides a feature known as |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
898 ``context tracking'' to deal with this problem. Details on context |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
899 tracking can be found in Chapter 9. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
900 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
901 \paragraph{Coverage Analysis.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
902 \index{Coverage analysis} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
903 AnaGram has facilities to help you determine the adequacy of your test |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
904 suites. If you set appropriate configuration switches, AnaGram will |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
905 include code in your parser to count the number of times each rule in |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
906 your grammar is reduced or the number of times each reduction |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
907 procedure in your grammar is executed. You can use these counts to |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
908 determine the completeness of your testing. The use of these |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
909 facilities is discussed in Chapter 9. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
910 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
911 \paragraph{Immediate Actions.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
912 \index{Action}\index{Immediate action} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
913 AnaGram provides for executing particular functions, called |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
914 ``immediate actions'', at any point in a grammar rule. Thus, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
915 computation does not need to wait until the rule is complete. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
916 Immediate actions are most useful when using AnaGram to control an |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
917 interactive process. Immediate actions are discussed in Chapter 8. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
918 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
919 \paragraph{Grammar Trace.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
920 \index{Grammar Trace}\index{Window}\index{Trace} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
921 The \agwindow{Grammar Trace} facility of AnaGram allows you to examine |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
922 the workings of your parser in detail. You can set up a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
923 representation of the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
924 \index{Parser state stack}\index{State stack}\index{Stack}parser state stack |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
925 and parser state as they might appear in the course of execution of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
926 your parser. You can then examine the possible inputs and see how the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
927 state and the state stack change in response to any input you choose. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
928 You can back up and try other options. You can have several grammar |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
929 traces active simultaneously so you can compare the results of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
930 different sequences of input. For any configuration of the state |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
931 stack, you can see what grammar rules the parser is in the process of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
932 matching. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
933 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
934 The use of the Grammar Trace is described in Chapter 5. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
935 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
936 \paragraph{File Trace.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
937 \index{Trace}\index{File Trace}\index{Window} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
938 The \agwindow{File Trace} is a facility which allows you to see how |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
939 your parser will work on real data. File Trace is an interactive, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
940 interpretive parser governed by the rules in your grammar. It lets |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
941 you see precisely how your grammar parses a test file. You can see |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
942 the contents of the parse stack at any point in the parse. You can |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
943 watch the parse progress at any level of detail you choose. If you |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
944 wish, you may back up and try again. When AnaGram runs the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
945 \agwindow{File Trace} or \agwindow{Grammar Trace} it also builds a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
946 \index{Coverage}\index{Trace Coverage}\index{Window} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
947 \agwindow{Trace Coverage} table. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
948 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
949 Note that AnaGram normally uses a number of short-cut parsing actions. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
950 To make the parse look like a textbook parse, set the |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
951 \index{Traditional engine}\agparam{traditional engine} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
952 \index{Configuration parameters}configuration parameter |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
953 in your syntax file. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
954 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
955 \paragraph{Aids to Debugging.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
956 AnaGram provides a number of facilities to help you debug your parser. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
957 It provides numerous tables that summarize your grammar and a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
958 windowing environment to assist you in inspecting and comparing this |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
959 data. To assist in identifying a conflict in your grammar it can |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
960 provide a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
961 \index{Trace}\index{Conflict Trace}\index{Window}\agwindow{Conflict Trace}, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
962 a pre-built grammar trace showing a sequence of input which will |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
963 trigger the conflict. If your parser encounters a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
964 \index{Syntax error}\index{Errors}syntax error, |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
965 you can arrange to have AnaGram build an \agwindow{Error Trace}, a |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
966 pre-built grammar trace showing the sequence of input tokens that led |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
967 to the syntax error. These facilities are described in Chapter 5. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
968 |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
969 \paragraph{Conflict Resolution.} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
970 \index{Conflict} |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
971 In order to help you deal with conflicts in your grammar, AnaGram |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
972 provides several methods for resolving conflicts. You may include |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
973 \agterm{precedence declarations} to assist in the construction of |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
974 simple precedence grammars, you may declare tokens to be |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
975 \agparam{sticky}, or you may declare tokens to be |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
976 \agterm{subgrammar} tokens. |
13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff
changeset
|
977 These techniques are described in Chapter 7. |