annotate doc/misc/html/isdp.html @ 21:1c9dac05d040

Add lint-style FALLTHROUGH annotations to fallthrough cases. (in the parse engine and thus the output code) Document this, because the old output causes warnings with gcc10.
author David A. Holland
date Mon, 13 Jun 2022 00:04:38 -0400
parents 13d2b8934445
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
2 <HTML>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
3 <HEAD>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
4 <TITLE>Introduction to Syntax Directed Parsing</TITLE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
5 </HEAD>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
6
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
7 <BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
8 TEXT="#000000" LINK="#0033CC"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
9 VLINK="#CC0033" ALINK="#CC0099">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
10
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
11 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
12
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
13 <IMG ALIGN="right" SRC="images/agrsl6c.gif" ALT="AnaGram"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
14 WIDTH=124 HEIGHT=30>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
15 <BR CLEAR="all">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
16 Back to <A HREF="index.html">Index</A>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
17 </P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
18 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
19 WIDTH=1010 HEIGHT=2 >
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
20
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
21 <BR CLEAR="all">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
22
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
23
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
24 <H1 ALIGN="LEFT">Introduction to Syntax Directed Parsing</H1>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
25 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
26 WIDTH=1010 HEIGHT=2 >
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
27 </P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
28
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
29 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
30 (Adapted from Chapter 4, AnaGram User's Guide)</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
31 </P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
32 <UL>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
33 <LI SRC="#What"><A HREF="#What">What is Syntax Directed Parsing?</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
34 <LI><A HREF="#Describing">Describing an Input Sequence</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
35 <LI><A HREF="#How">How a Parser Works</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
36 <LI><A HREF="#Note">A Note on Notation</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
37 <LI><A HREF="#Reduction">Reduction Procedures</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
38 <LI><A HREF="#Building">Building a Parser</A>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
39 <UL>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
40 <LI SRC="#Invoking"><A HREF="#Invoking">Invoking a Parser</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
41 <LI><A HREF="#Communicating">Communicating with a Parser</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
42 <LI><A HREF="#Parser">Parser Input</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
43 <LI><A HREF="#Error">Error Handling</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
44 </UL>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
45 </LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
46 </UL>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
47 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
48
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
49 <H2><A NAME="What">What is Syntax Directed Parsing?</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
50 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
51 Every programmer has to deal with input data. Usually processing input data
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
52 depends on what has preceded, and often even on what follows, the input under
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
53 consideration. Keeping track of these dependencies in order to know how to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
54 process the data is called parsing. It is often relatively easy to keep track of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
55 simple dependencies when first writing a program. As the program develops, as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
56 new features are added, as bugs are fixed, the dependencies often stop being
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
57 simple. Input processing becomes a headache, since it is hard to keep track of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
58 or even identify all the particular cases. Changes to the program cause
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
59 unexpected problems and program maintenance threatens to get out of control.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
60 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
61 Syntax directed parsing is a technique for addressing these difficulties. In
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
62 syntax directed parsing, the input part of a program is constructed
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
63 automatically, by means of a standard algorithm, from a high level description
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
64 of the input data structure. Code to perform any required processing of the data
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
65 is attached to the description in a convenient way.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
66 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
67 The description, being non-procedural, is generally easier to write and to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
68 modify than the equivalent program code, and much less likely to harbor bugs. It
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
69 is easier to read, and easier to maintain. It is easy to re-use in other
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
70 programs requiring the same input, thus encouraging uniform interfaces. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
71 technique also simplifies the overall program by decoupling the input and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
72 processing components and providing a natural, modular structure.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
73 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
74 To use syntax directed parsing you first write the input data description,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
75 called a <A HREF="gloss.html#Grammar">grammar</A>. A file which contains a grammar is called a syntax
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
76 file. A parser generator, such as AnaGram, then can create, from the syntax
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
77 file, a function (or program) called a <A HREF="gloss.html#Parser">parser</A>, written in C or C++. The parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
78 keeps track of all the dependencies in your input, and calls certain functions,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
79 <A HREF="gloss.html#ReductionProcedure">reduction procedures</A>, to deal with specific units or sequences of data as they
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
80 are encountered.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
81 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
82 Reduction procedures are the code you write to process your data. In your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
83 grammar they are linked to the structures in your input, so that your parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
84 will call them at precisely the right times with precisely the right data. Note
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
85 that with this technique you need only provide a non-procedural <I>description</I>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
86 of the input. The details of flow of control are handled entirely by the parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
87 When you write reduction procedures, you can concentrate entirely on what you
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
88 have to do with the data. You don't have to encumber your code with switches and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
89 tests to determine the structure of your input.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
90 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
91 The parsers you build using a parser generator such as AnaGram may be complete
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
92 stand-alone programs, or they may serve as input routines for a more extensive
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
93 program. Some programs may even use more than one parser.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
94 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
95
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
96 <H2><A NAME="Describing">Describing an Input Sequence</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
97 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
98 Writing a grammar consists of describing the acceptable input sequences for your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
99 program. The vehicle for describing an input sequence is called a <A HREF="gloss.html#Production">production</A>.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
100 Productions show how a logical component of the input can be made up of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
101 sequence of more elementary components. A production that describes a date might
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
102 be written:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
103 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
104 date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
105 -&gt; name of month, day, comma, year </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
106 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
107 The components of the input are called <A HREF="gloss.html#Token">tokens</A>. The sequence of tokens on the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
108 side of the production is called a <A HREF="gloss.html#GrammarRule">grammar rule</A>, or rule for short. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
109 individual tokens on the right side of the rule are also called <A HREF="gloss.html#RuleElement">rule elements</A>.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
110 </P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
111 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
112 The token on the left side of the production is called the <A HREF="gloss.html#ReductionToken">reduction token</A> for
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
113 the rule. Tokens may have semantic values, as distinguished from syntactic
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
114 values, which you can use in your <A HREF="gloss.html#ReductionProcedure">reduction procedures</A>. For instance, the value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
115 of name of month could be an integer in the range zero to eleven, or it could be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
116 a pointer to an ascii string. The value of day could be an integer in the range
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
117 one to thirty-one.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
118 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
119 A <A HREF="gloss.html#Grammar">grammar</A> consists of a number of such <A HREF="gloss.html#Production">productions</A>, each of which describes some
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
120 component of the input in terms of other components. It does not take very many
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
121 productions to describe quite complex input streams. A grammar for the C
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
122 language, for instance, requires about two hundred productions.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
123 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
124 Many people find the term "production" quite confusing. It comes from theoretical
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
125 linguistics where it is used to describe how one may produce sequences which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
126 correspond to a set of grammatical rules. Ironically, the major usage of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
127 idea has been in parsing where the interest is not so much in creating sequences
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
128 which satisfy the grammatical rules as in decoding and analyzing such sequences.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
129 Nonetheless, it is convenient, in the above example, to say that the token date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
130 produces a sequence of tokens consisting of name of month, day, comma, and year.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
131 We also say that the sequence reduces to date.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
132 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
133 There may be more than one production to describe a given component, if there is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
134 more than one way it may be represented. For instance,</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
135 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
136 date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
137 -&gt; day, name of month, year </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
138 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
139 describes another common way of writing a date. In other words, a <A HREF="gloss.html#ReductionToken">reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
140 token</A> may produce a number of different <A HREF="gloss.html#GrammarRule">grammar rules</A>.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
141 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
142 Tokens which appear on the left side of one or more productions are called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
143 <A HREF="gloss.html#Nonterminal">nonterminal tokens</A>. Those which appear <I>only</I> on the right sides of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
144 productions are called <A HREF="gloss.html#Terminal">terminal tokens</A>. Terminal tokens are the units which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
145 actually appear physically in the input. Nonterminal tokens are identified when
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
146 a sequence of tokens that matches the right side of a production is seen in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
147 input. When AnaGram analyzes a <A HREF="gloss.html#Grammar">grammar</A>, it assigns a unique token number to each
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
148 token it finds in the grammar.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
149 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
150 Nonterminal tokens, such as date in the example above, may appear in any grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
151 rule just as though they were input tokens. The token on the left side of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
152 production can even appear on the right side as well. Such a production is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
153 called a recursive production. When a nonterminal token appears on the right
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
154 side of a production, it may be represented in this context by <I>any</I> of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
155 the <A HREF="gloss.html#GrammarRule">grammar rules</A> it produces. Grammars described in this manner are called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
156 <A HREF="gloss.html#ContextFreeGrammar">context free grammars</A> since there is no contextual constraint on which of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
157 rules that a token produces can appear in any given context.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
158 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
159 Recursive productions may be either left recursive or right recursive. Left
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
160 recursive productions are those where the recursively defined <A HREF="gloss.html#Nonterminal">nonterminal</A>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
161 appears as the first <A HREF="gloss.html#RuleElement">element</A> in the recursive rule. Right recursive productions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
162 are those where it is the last element. If it appears anywhere between, the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
163 production is said to be center recursive.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
164 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
165 Any nonterminal token which has a recursive production must also have at least
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
166 one simple, non-recursive production. Otherwise, it is not possible to create a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
167 finite sequence of terminal tokens from the nonterminal token.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
168 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
169 Recursion may also occur implicitly in a <A HREF="gloss.html#Grammar">grammar</A> when one of the tokens on the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
170 right side of a production itself has a production involving the token on the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
171 left. Such implicit recursion sometimes may involve numerous levels of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
172 productions. Implicit recursion occurs most commonly in describing constructs
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
173 such as arithmetic expressions or the block structure of programming languages.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
174 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
175 Clearly, grammars can accommodate multiple levels of structure in the input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
176 sequences they describe. There must, at the top, be a single token which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
177 encompasses the entire input. This special token is variously called the <A HREF="gloss.html#GrammarToken">grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
178 token</A>, the goal token or the start token.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
179 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
180 AnaGram allows you to specify <A HREF="gloss.html#Terminal">terminal tokens</A> explicitly as ascii characters, or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
181 even as <A HREF="gloss.html#CharacterSets">sets of ascii characters</A>, right in the grammar. Thus, you may write
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
182 '0-9' to represent the set of ascii digits, or 'A-Z' to represent the set of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
183 upper case letters. The semantic value of such a <A HREF="gloss.html#Token">token</A> is the ascii character
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
184 code that actually appears in the input stream. If the various sets you use in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
185 your grammar overlap, they may not properly represent terminal tokens. In this
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
186 case, AnaGram automatically extends your <A HREF="gloss.html#Grammar">grammar</A> appropriately. </P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
187 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
188
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
189 <H2><A NAME="How">How a Parser Works</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
190 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
191 The aim of a <A HREF="gloss.html#Parser">parser</A> is to match its input with the full syntactic structure
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
192 specified by the <A HREF="gloss.html#Production">productions</A> which make up the grammar. The primary component of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
193 a parser is an input buffer, sometimes thought of as a stack, into which tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
194 are shifted, or stored sequentially, as they are encountered in the input. At
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
195 the same time that a token is shifted into the input buffer, its semantic value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
196 is pushed onto the value stack. A token is not shifted into the buffer unless it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
197 "makes sense", that is, unless it is consistent with the rules of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
198 <A HREF="gloss.html#Grammar">grammar</A> and with the input that has preceded it. If a token does not make sense,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
199 the parser signals a syntax error. In order to determine whether a token makes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
200 sense, the parser has a sort of decision table which provides a list of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
201 acceptable tokens for each of a number of states. The table also specifies what
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
202 the parser is to do with each acceptable token. When the table indicates that a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
203 token is to be shifted into the buffer, it also specifies a new state. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
204 parser stacks the current state on a state stack and jumps to the new state.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
205 Thus every time a token is shifted into the input buffer, a state number is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
206 pushed onto the state stack. For each state of the parser, excepting only the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
207 initial state, there is a unique token which will cause a jump to that state.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
208 This token is called the <A HREF="gloss.html#CharacteristicToken">characteristic token</A> for the state.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
209 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
210 When the rightmost, or most recent, tokens in the input buffer match the right
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
211 side of a production precisely, the parser <I>may</I> replace the tokens that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
212 match the rule with a single token, the token on the left side of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
213 production. This process of replacing a sequence of tokens with a single token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
214 is called reduction. The token that replaces the sequence of tokens is called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
215 the <A HREF="gloss.html#ReductionToken">reduction token</A>.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
216 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
217 The actual mechanism of the reduction is quite important. At the same time that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
218 input tokens are removed from the input buffer, state numbers are popped from
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
219 the state stack, so that when all input tokens matching the rule have been
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
220 removed, the parser state has been restored to the value it had at the time the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
221 first token in the rule was seen. As state numbers are popped from the state
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
222 stack, token values are popped from the value stack. If the rule has a <A HREF="gloss.html#ReductionProcedure">reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
223 procedure</A>, temporary variables are loaded with the values popped from the stack
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
224 and the reduction procedure is called. The reduction token is now shifted into
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
225 the input buffer just as though it were an input token. If the reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
226 procedure returned a result it is shifted into the value stack as the value of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
227 the reduction token. The parser stacks the current state again and jumps to a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
228 new state as determined by the parser tables.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
229 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
230 If there are no errors in your input, when the last token has been read from the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
231 input, shifted into the input buffer, and reductions performed, there will be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
232 precisely one token in the input buffer: the <A HREF="gloss.html#GrammarToken">grammar, or goal, token</A> which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
233 describes your entire input. At this point your parser declares itself finished.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
234 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
235 Reductions do not necessarily occur every time a rule matches the tokens in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
236 input buffer. If the reduction token does not make sense, that is, if it is not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
237 consistent with the rules of the <A HREF="gloss.html#GrammarToken">grammar</A> and with the input that has preceded
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
238 it, the parser will not perform the reduction. Suppose there are tokens in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
239 input which match one of the rules given above for date. A reduction will not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
240 occur unless date is one of the tokens the parser is actually looking for at
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
241 that stage in the input.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
242 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
243 Even when the reduction token would make sense, there are still situations where
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
244 the reduction would not take place. Suppose a grammar includes the two
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
245 <A HREF="gloss.html#Production">productions</A> given above for date as well as the following:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
246 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
247
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
248 date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
249 -&gt; name of month, day </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
250 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
251 This production is the same as the first, but with no year specification. When a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
252 parser directed by this grammar has encountered name of month and day, it can't
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
253 tell without looking further whether it has a short form or a long form date. In
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
254 such a circumstance, the parser looks at the next following token, which is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
255 called a <A HREF="gloss.html#Lookahead">lookahead token</A>. If the lookahead token is a comma, then, in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
256 absence of other productions, the input is a long form date. If the lookahead
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
257 token is not a comma, then the input is certainly not a long form date, and it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
258 is proper to reduce the short form production.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
259 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
260 Suppose the lookahead token were a comma and the grammar were also to contain
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
261 the following production:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
262 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
263
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
264 appointment
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
265 -&gt; date, comma, time </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
266 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
267 Since a comma can follow date, according to this rule, and can also follow day
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
268 according to the first production, it is impossible to determine, simply by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
269 looking at the <A HREF="gloss.html#Lookahead">lookahead token</A>, whether the date was being given in short form
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
270 or long form. One would have to look beyond the comma to see if what follows the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
271 comma matches the rules for time or for year. Although it is possible to build
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
272 parsers which can do this, it is not generally feasible. Situations of this sort
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
273 are called <A HREF="gloss.html#Conflict">conflicts</A>. AnaGram warns you about the conflicts in your <A HREF="gloss.html#Grammar">grammar</A> when
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
274 it analyzes your grammar, and provides numerous facilities to help you correct
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
275 them.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
276 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
277
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
278 <H2><A NAME="Note">A Note on Notation</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
279 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
280 <A HREF="gloss.html#ContextFreeGrammar">Context free grammars</A> have been traditionally represented in the literature
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
281 using <A HREF="gloss.html#BNF">Backus-Naur Form</A>, or BNF. In Backus-Naur Form, certain characters, called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
282 metacharacters, are used to punctuate <A HREF="gloss.html#Production">productions</A> and all other printable
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
283 characters are taken to represent themselves literally. Named tokens are denoted
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
284 by enclosing the name within angle brackets <CODE>&lt; &gt;</CODE> . The left side of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
285 production is distinguished from the right side by the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
286 characters <CODE> ::= </CODE>.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
287 If several productions have the same left side, the vertical
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
288 bar <CODE> | </CODE> is used to separate them. The <A HREF="gloss.html#RuleElement">elements</A> of a <A HREF="gloss
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
289 .html#GrammarRule">grammar rule</A> are simply juxtaposed
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
290 to indicate that one follows another. Blanks are ignored. Thus, in BNF, the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
291 first production given for date, above, would be:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
292 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
293 &lt;date&gt; ::= &lt;name of month&gt; &lt;day&gt;, &lt;year&gt;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
294 </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
295 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
296 AnaGram uses a notation more consonant with ordinary programming usage. Thus
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
297 token names need not be bracketed and literal characters must be appropriately
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
298 quoted. The elements of rules are joined by commas. Using this approach, there
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
299 is no need for metacharacters and it becomes possible to make a number of useful
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
300 extensions to the notation.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
301 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
302
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
303 <H2><A NAME="Reduction">Reduction Procedures</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
304 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
305 Of course, the reason for parsing an input stream is to interpret the data in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
306 the stream and to process it in some useful manner. The primary tool for doing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
307 this is the <A HREF="gloss.html#ReductionProcedure">reduction procedure</A>. A reduction procedure is a piece of C code that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
308 is executed when a particular <A HREF="gloss.html#GrammarRule">grammar rule</A> is reduced. Often, a reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
309 procedure calculates a value which becomes the semantic value of the reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
310 token. The input to the reduction procedure consists of the values of the tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
311 that make up the grammar rule.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
312 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
313 AnaGram allows you to assign C variable names to the tokens in a grammar rule,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
314 so that you can refer to them in the reduction procedure. To assign a C variable
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
315 name, simply follow the token in the rule with a colon and the C variable name.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
316 Simple reduction procedures can be written as C or C++ expressions. At the end
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
317 of the rule, write an equal sign and an expression, terminated by a semicolon.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
318 For example:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
319 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
320 (int) hex digit
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
321 -&gt; '0-9':d =d-'0';
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
322 -&gt; 'a-f':d =d-'a'+10;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
323 -&gt; 'A-F':d =d-'A'+10; </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
324 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
325 When any one of these rules is matched, the value of the token in the rule is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
326 assigned to the temporary variable. The expression to the right of the equal
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
327 sign is evaluated and the value of the expression is stored as the value of hex
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
328 digit, which has been declared to be an int. These <A HREF="gloss.html#Production">productions</A> define
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
329 hexadecimal digits as ascii characters, and calculate the binary value of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
330 digit in terms of the ascii character code, d. The binary value becomes the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
331 value of hex digit. Hexadecimal digits can be combined to make hexadecimal
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
332 numbers by writing the following productions:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
333 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
334 (int) hex number
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
335 -&gt; hex digit
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
336 -&gt; hex number:n, hex digit:d =16*n+d; </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
337 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
338 There are several important points to notice in this example. First, <A HREF="gloss.html#ReductionProcedure">reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
339 procedures</A> are executed "from the bottom up". That is, the reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
340 procedure for hex digit is executed before any reduction procedure for hex
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
341 number. Second, if there is no reduction procedure for a production, the value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
342 of the first token in the rule is assigned to the reduction token. Thus, it is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
343 not necessary to provide a reduction procedure in the first production for hex
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
344 number. Third, the reduction procedures for recursive productions are always
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
345 executed <I>after</I> the reduction procedure, if any, for the non-recursive
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
346 production which begins the recursion. Finally, when an input sequence is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
347 described using left recursion, as in this example, the <A HREF="gloss.html#RuleElement">elements</A> of the sequence
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
348 are processed left to right.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
349 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
350 If you wish to process the elements of a sequence right to left, you may use
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
351 right recursion. For example, it is sometimes convenient to define the fraction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
352 part of a decimal number thus:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
353 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
354 (double) fraction part
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
355 -&gt; '0-9':d =(d-'0')/10.;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
356 -&gt; '0-9':d,fraction part:f =(d-'0'+f)/10.;</PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
357 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
358 In this case the leading digits are stored temporarily in the parse stack, and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
359 then the fraction part is evaluated right to left only when the last digit has
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
360 been found.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
361 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
362 Reduction procedures can be more complex than simple expressions. After the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
363 equal sign you may include an arbitrary block of C or C++ code, enclosed in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
364 braces, { }. To return a semantic value for the reduction token simply use a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
365 return statement. Of course, reduction procedures have the full resources of C
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
366 or C++ at their disposal. They may set and interrogate global variables and may
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
367 call functions freely.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
368 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
369 Since the reduction procedures you write will probably need some support code,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
370 such as #include statements and declarations, you may incorporate C or C++ code
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
371 into your syntax file at any point. You need only enclose it in braces ({}).
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
372 Such code is called embedded C. All embedded C code is also copied to the parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
373 file, and <I>precedes</I> all of your reduction procedures.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
374 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
375
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
376 <H2><A NAME="Building">Building a Parser</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
377 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
378 In order to round out a <A HREF="gloss.html#Parser">parser</A> into a functioning program it needs input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
379 procedures, as well as error diagnosis and recovery capabilities. AnaGram has a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
380 number of options available which give you a high degree of flexibility in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
381 configuring a parser to suit your particular needs. All of the options are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
382 provided with reasonable defaults, so that you can safely disregard any option
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
383 until you need the features it provides.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
384 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
385
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
386 <H4><A NAME="Invoking">Invoking a Parser</A></H4>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
387 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
388 Normally, AnaGram configures <A HREF="gloss.html#Parser">parsers</A> as functions which you can call from
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
389 elsewhere in your program. In this situation, you call the parser, it processes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
390 its input, and returns either when it has finished or cannot proceed because of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
391 errors. Alternatively, if you set the event driven configuration switch, your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
392 parser will be configured so that you have two procedures to call: an
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
393 initializer and a parser. In the event driven configuration you start the parse
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
394 by calling the initializer and then you call the parser once for each unit of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
395 input. Using the event driven mode makes it quite easy to configure a parser as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
396 a filter and to chain several parsers together so that the output from one
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
397 parser is the input to the next. Such multi-stage parsing is a convenient way to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
398 deal with complex input that is not context free.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
399 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
400
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
401 <H4><A NAME="Communicating">Communicating with a Parser</A></H4>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
402 <P>The complete status of your parser is contained in a single data structure
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
403 called a parser control block. All communications with a parser take place via
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
404 the parser control block. Input procedures must place input data in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
405 appropriate field in the parser control block. When the parse is complete or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
406 encounters an error, the results of the parse may be found in the parser control
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
407 block. When AnaGram builds a parser it includes, in the header file it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
408 generates, a typedef statement which defines the structure of the parser control
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
409 block.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
410 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
411
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
412 <H4><A NAME="Parser">Parser Input</A></H4>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
413 <P>The input to your <A HREF="gloss.html#Parser">parser</A> may be either characters read directly from an
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
414 input stream, or tokens accumulated by a pre-processor or <A HREF="gloss.html#LexicalScanner">lexical scanner</A>. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
415 way you provide input to your parser depends on how your <A HREF="gloss.html#Grammar">grammar</A> defines input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
416 tokens and also on whether or not you have requested an event driven parser.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
417 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
418 If your parser is event driven, you provide its input by storing the input code
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
419 and the input value, if any, into the parser control block and calling the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
420 parser.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
421 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
422 If you have set the pointer input configuration switch in your syntax file, you
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
423 simply initialize the pointer field in your parser control block before you call
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
424 your parser. Your parser will then read its input directly from memory by simply
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
425 incrementing the pointer as necessary.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
426 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
427 Otherwise, your parser will invoke a macro called GET_INPUT every time it needs
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
428 more input. You may define GET_INPUT according to your needs. You can define it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
429 so that it calls an input function, or you can define it so that it executes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
430 in-line code each time it is invoked. Your GET_INPUT macro should store its
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
431 input code in the input_code field of the parser control block. If you do not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
432 write a GET_ INPUT macro, AnaGram will provide one which will read characters
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
433 from stdin.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
434 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
435 If your <A HREF="gloss.html#Grammar">grammar</A> does not define <A HREF="gloss.html#Terminal">terminal tokens</A> in terms of ascii characters or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
436 external token numbers, your GET_INPUT will have to determine the appropriate
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
437 internal token number for each input token. To assist you in determining these
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
438 token numbers AnaGram provides a typedef enum statement in the header file. You
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
439 can then use named constants to specify the internal token numbers for the input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
440 tokens.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
441 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
442
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
443 <H4><A NAME="Error">Error Handling</A></H4>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
444 <P>Your <A HREF="gloss.html#Parser">parser</A> must be prepared to deal with erroneous input. There are two
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
445 aspects to error handling: diagnosing the error, and recovering from the error.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
446 On encountering an error, your parser will invoke a macro called SYNTAX_ ERROR.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
447 If you do not provide a definition for SYNTAX_ERROR, AnaGram will provide a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
448 simple error diagnostic. AnaGram can also provide automatic error diagnoses
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
449 which pinpoint the location of the error.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
450 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
451 AnaGram provides two options for error recovery: error
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
452 token <A HREF="gloss.html#Resynchronization">resynchronization</A>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
453 and automatic resynchronization. These are techniques for getting your parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
454 back in synch with its input so it can proceed after encountering an error.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
455 Normally, if you do not select one of these recovery techniques, your parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
456 will terminate when it encounters an error; however, you may override this
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
457 default if you wish and provide your own recovery technique.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
458
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
459
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
460 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
461 WIDTH=1010 HEIGHT=2 >
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
462 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
463 <IMG ALIGN="right" SRC="images/pslrb6d.gif" ALT="Parsifal Software"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
464 WIDTH=181 HEIGHT=25>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
465 <BR CLEAR="right">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
466
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
467 Back to <A HREF="index.html">Index</A>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
468 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
469 <ADDRESS><FONT SIZE="-1">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
470 AnaGram parser generator - documentation<BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
471 Introduction to Syntax Directed Parsing<BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
472 Copyright &copy; 1993-1999, Parsifal Software. <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
473 All Rights Reserved.<BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
474 </FONT></ADDRESS>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
475
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
476 </BODY>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
477 </HTML>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
478
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
479