annotate doc/misc/html/isdp.html @ 0:13d2b8934445

Import AnaGram (near-)release tree into Mercurial.
author David A. Holland
date Sat, 22 Dec 2007 17:52:45 -0500
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
2 <HTML>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
3 <HEAD>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
4 <TITLE>Introduction to Syntax Directed Parsing</TITLE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
5 </HEAD>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
6
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
7 <BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
8 TEXT="#000000" LINK="#0033CC"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
9 VLINK="#CC0033" ALINK="#CC0099">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
10
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
11 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
12
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
13 <IMG ALIGN="right" SRC="images/agrsl6c.gif" ALT="AnaGram"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
14 WIDTH=124 HEIGHT=30>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
15 <BR CLEAR="all">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
16 Back to <A HREF="index.html">Index</A>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
17 </P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
18 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
19 WIDTH=1010 HEIGHT=2 >
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
20
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
21 <BR CLEAR="all">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
22
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
23
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
24 <H1 ALIGN="LEFT">Introduction to Syntax Directed Parsing</H1>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
25 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
26 WIDTH=1010 HEIGHT=2 >
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
27 </P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
28
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
29 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
30 (Adapted from Chapter 4, AnaGram User's Guide)</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
31 </P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
32 <UL>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
33 <LI SRC="#What"><A HREF="#What">What is Syntax Directed Parsing?</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
34 <LI><A HREF="#Describing">Describing an Input Sequence</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
35 <LI><A HREF="#How">How a Parser Works</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
36 <LI><A HREF="#Note">A Note on Notation</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
37 <LI><A HREF="#Reduction">Reduction Procedures</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
38 <LI><A HREF="#Building">Building a Parser</A>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
39 <UL>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
40 <LI SRC="#Invoking"><A HREF="#Invoking">Invoking a Parser</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
41 <LI><A HREF="#Communicating">Communicating with a Parser</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
42 <LI><A HREF="#Parser">Parser Input</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
43 <LI><A HREF="#Error">Error Handling</A></LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
44 </UL>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
45 </LI>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
46 </UL>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
47 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
48
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
49 <H2><A NAME="What">What is Syntax Directed Parsing?</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
50 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
51 Every programmer has to deal with input data. Usually processing input data
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
52 depends on what has preceded, and often even on what follows, the input under
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
53 consideration. Keeping track of these dependencies in order to know how to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
54 process the data is called parsing. It is often relatively easy to keep track of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
55 simple dependencies when first writing a program. As the program develops, as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
56 new features are added, as bugs are fixed, the dependencies often stop being
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
57 simple. Input processing becomes a headache, since it is hard to keep track of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
58 or even identify all the particular cases. Changes to the program cause
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
59 unexpected problems and program maintenance threatens to get out of control.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
60 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
61 Syntax directed parsing is a technique for addressing these difficulties. In
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
62 syntax directed parsing, the input part of a program is constructed
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
63 automatically, by means of a standard algorithm, from a high level description
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
64 of the input data structure. Code to perform any required processing of the data
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
65 is attached to the description in a convenient way.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
66 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
67 The description, being non-procedural, is generally easier to write and to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
68 modify than the equivalent program code, and much less likely to harbor bugs. It
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
69 is easier to read, and easier to maintain. It is easy to re-use in other
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
70 programs requiring the same input, thus encouraging uniform interfaces. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
71 technique also simplifies the overall program by decoupling the input and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
72 processing components and providing a natural, modular structure.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
73 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
74 To use syntax directed parsing you first write the input data description,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
75 called a <A HREF="gloss.html#Grammar">grammar</A>. A file which contains a grammar is called a syntax
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
76 file. A parser generator, such as AnaGram, then can create, from the syntax
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
77 file, a function (or program) called a <A HREF="gloss.html#Parser">parser</A>, written in C or C++. The parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
78 keeps track of all the dependencies in your input, and calls certain functions,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
79 <A HREF="gloss.html#ReductionProcedure">reduction procedures</A>, to deal with specific units or sequences of data as they
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
80 are encountered.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
81 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
82 Reduction procedures are the code you write to process your data. In your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
83 grammar they are linked to the structures in your input, so that your parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
84 will call them at precisely the right times with precisely the right data. Note
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
85 that with this technique you need only provide a non-procedural <I>description</I>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
86 of the input. The details of flow of control are handled entirely by the parser.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
87 When you write reduction procedures, you can concentrate entirely on what you
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
88 have to do with the data. You don't have to encumber your code with switches and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
89 tests to determine the structure of your input.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
90 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
91 The parsers you build using a parser generator such as AnaGram may be complete
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
92 stand-alone programs, or they may serve as input routines for a more extensive
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
93 program. Some programs may even use more than one parser.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
94 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
95
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
96 <H2><A NAME="Describing">Describing an Input Sequence</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
97 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
98 Writing a grammar consists of describing the acceptable input sequences for your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
99 program. The vehicle for describing an input sequence is called a <A HREF="gloss.html#Production">production</A>.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
100 Productions show how a logical component of the input can be made up of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
101 sequence of more elementary components. A production that describes a date might
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
102 be written:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
103 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
104 date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
105 -&gt; name of month, day, comma, year </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
106 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
107 The components of the input are called <A HREF="gloss.html#Token">tokens</A>. The sequence of tokens on the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
108 side of the production is called a <A HREF="gloss.html#GrammarRule">grammar rule</A>, or rule for short. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
109 individual tokens on the right side of the rule are also called <A HREF="gloss.html#RuleElement">rule elements</A>.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
110 </P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
111 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
112 The token on the left side of the production is called the <A HREF="gloss.html#ReductionToken">reduction token</A> for
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
113 the rule. Tokens may have semantic values, as distinguished from syntactic
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
114 values, which you can use in your <A HREF="gloss.html#ReductionProcedure">reduction procedures</A>. For instance, the value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
115 of name of month could be an integer in the range zero to eleven, or it could be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
116 a pointer to an ascii string. The value of day could be an integer in the range
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
117 one to thirty-one.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
118 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
119 A <A HREF="gloss.html#Grammar">grammar</A> consists of a number of such <A HREF="gloss.html#Production">productions</A>, each of which describes some
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
120 component of the input in terms of other components. It does not take very many
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
121 productions to describe quite complex input streams. A grammar for the C
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
122 language, for instance, requires about two hundred productions.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
123 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
124 Many people find the term "production" quite confusing. It comes from theoretical
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
125 linguistics where it is used to describe how one may produce sequences which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
126 correspond to a set of grammatical rules. Ironically, the major usage of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
127 idea has been in parsing where the interest is not so much in creating sequences
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
128 which satisfy the grammatical rules as in decoding and analyzing such sequences.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
129 Nonetheless, it is convenient, in the above example, to say that the token date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
130 produces a sequence of tokens consisting of name of month, day, comma, and year.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
131 We also say that the sequence reduces to date.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
132 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
133 There may be more than one production to describe a given component, if there is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
134 more than one way it may be represented. For instance,</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
135 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
136 date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
137 -&gt; day, name of month, year </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
138 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
139 describes another common way of writing a date. In other words, a <A HREF="gloss.html#ReductionToken">reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
140 token</A> may produce a number of different <A HREF="gloss.html#GrammarRule">grammar rules</A>.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
141 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
142 Tokens which appear on the left side of one or more productions are called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
143 <A HREF="gloss.html#Nonterminal">nonterminal tokens</A>. Those which appear <I>only</I> on the right sides of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
144 productions are called <A HREF="gloss.html#Terminal">terminal tokens</A>. Terminal tokens are the units which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
145 actually appear physically in the input. Nonterminal tokens are identified when
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
146 a sequence of tokens that matches the right side of a production is seen in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
147 input. When AnaGram analyzes a <A HREF="gloss.html#Grammar">grammar</A>, it assigns a unique token number to each
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
148 token it finds in the grammar.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
149 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
150 Nonterminal tokens, such as date in the example above, may appear in any grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
151 rule just as though they were input tokens. The token on the left side of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
152 production can even appear on the right side as well. Such a production is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
153 called a recursive production. When a nonterminal token appears on the right
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
154 side of a production, it may be represented in this context by <I>any</I> of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
155 the <A HREF="gloss.html#GrammarRule">grammar rules</A> it produces. Grammars described in this manner are called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
156 <A HREF="gloss.html#ContextFreeGrammar">context free grammars</A> since there is no contextual constraint on which of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
157 rules that a token produces can appear in any given context.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
158 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
159 Recursive productions may be either left recursive or right recursive. Left
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
160 recursive productions are those where the recursively defined <A HREF="gloss.html#Nonterminal">nonterminal</A>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
161 appears as the first <A HREF="gloss.html#RuleElement">element</A> in the recursive rule. Right recursive productions
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
162 are those where it is the last element. If it appears anywhere between, the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
163 production is said to be center recursive.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
164 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
165 Any nonterminal token which has a recursive production must also have at least
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
166 one simple, non-recursive production. Otherwise, it is not possible to create a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
167 finite sequence of terminal tokens from the nonterminal token.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
168 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
169 Recursion may also occur implicitly in a <A HREF="gloss.html#Grammar">grammar</A> when one of the tokens on the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
170 right side of a production itself has a production involving the token on the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
171 left. Such implicit recursion sometimes may involve numerous levels of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
172 productions. Implicit recursion occurs most commonly in describing constructs
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
173 such as arithmetic expressions or the block structure of programming languages.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
174 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
175 Clearly, grammars can accommodate multiple levels of structure in the input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
176 sequences they describe. There must, at the top, be a single token which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
177 encompasses the entire input. This special token is variously called the <A HREF="gloss.html#GrammarToken">grammar
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
178 token</A>, the goal token or the start token.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
179 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
180 AnaGram allows you to specify <A HREF="gloss.html#Terminal">terminal tokens</A> explicitly as ascii characters, or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
181 even as <A HREF="gloss.html#CharacterSets">sets of ascii characters</A>, right in the grammar. Thus, you may write
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
182 '0-9' to represent the set of ascii digits, or 'A-Z' to represent the set of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
183 upper case letters. The semantic value of such a <A HREF="gloss.html#Token">token</A> is the ascii character
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
184 code that actually appears in the input stream. If the various sets you use in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
185 your grammar overlap, they may not properly represent terminal tokens. In this
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
186 case, AnaGram automatically extends your <A HREF="gloss.html#Grammar">grammar</A> appropriately. </P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
187 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
188
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
189 <H2><A NAME="How">How a Parser Works</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
190 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
191 The aim of a <A HREF="gloss.html#Parser">parser</A> is to match its input with the full syntactic structure
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
192 specified by the <A HREF="gloss.html#Production">productions</A> which make up the grammar. The primary component of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
193 a parser is an input buffer, sometimes thought of as a stack, into which tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
194 are shifted, or stored sequentially, as they are encountered in the input. At
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
195 the same time that a token is shifted into the input buffer, its semantic value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
196 is pushed onto the value stack. A token is not shifted into the buffer unless it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
197 "makes sense", that is, unless it is consistent with the rules of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
198 <A HREF="gloss.html#Grammar">grammar</A> and with the input that has preceded it. If a token does not make sense,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
199 the parser signals a syntax error. In order to determine whether a token makes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
200 sense, the parser has a sort of decision table which provides a list of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
201 acceptable tokens for each of a number of states. The table also specifies what
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
202 the parser is to do with each acceptable token. When the table indicates that a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
203 token is to be shifted into the buffer, it also specifies a new state. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
204 parser stacks the current state on a state stack and jumps to the new state.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
205 Thus every time a token is shifted into the input buffer, a state number is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
206 pushed onto the state stack. For each state of the parser, excepting only the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
207 initial state, there is a unique token which will cause a jump to that state.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
208 This token is called the <A HREF="gloss.html#CharacteristicToken">characteristic token</A> for the state.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
209 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
210 When the rightmost, or most recent, tokens in the input buffer match the right
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
211 side of a production precisely, the parser <I>may</I> replace the tokens that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
212 match the rule with a single token, the token on the left side of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
213 production. This process of replacing a sequence of tokens with a single token
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
214 is called reduction. The token that replaces the sequence of tokens is called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
215 the <A HREF="gloss.html#ReductionToken">reduction token</A>.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
216 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
217 The actual mechanism of the reduction is quite important. At the same time that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
218 input tokens are removed from the input buffer, state numbers are popped from
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
219 the state stack, so that when all input tokens matching the rule have been
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
220 removed, the parser state has been restored to the value it had at the time the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
221 first token in the rule was seen. As state numbers are popped from the state
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
222 stack, token values are popped from the value stack. If the rule has a <A HREF="gloss.html#ReductionProcedure">reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
223 procedure</A>, temporary variables are loaded with the values popped from the stack
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
224 and the reduction procedure is called. The reduction token is now shifted into
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
225 the input buffer just as though it were an input token. If the reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
226 procedure returned a result it is shifted into the value stack as the value of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
227 the reduction token. The parser stacks the current state again and jumps to a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
228 new state as determined by the parser tables.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
229 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
230 If there are no errors in your input, when the last token has been read from the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
231 input, shifted into the input buffer, and reductions performed, there will be
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
232 precisely one token in the input buffer: the <A HREF="gloss.html#GrammarToken">grammar, or goal, token</A> which
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
233 describes your entire input. At this point your parser declares itself finished.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
234 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
235 Reductions do not necessarily occur every time a rule matches the tokens in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
236 input buffer. If the reduction token does not make sense, that is, if it is not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
237 consistent with the rules of the <A HREF="gloss.html#GrammarToken">grammar</A> and with the input that has preceded
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
238 it, the parser will not perform the reduction. Suppose there are tokens in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
239 input which match one of the rules given above for date. A reduction will not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
240 occur unless date is one of the tokens the parser is actually looking for at
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
241 that stage in the input.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
242 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
243 Even when the reduction token would make sense, there are still situations where
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
244 the reduction would not take place. Suppose a grammar includes the two
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
245 <A HREF="gloss.html#Production">productions</A> given above for date as well as the following:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
246 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
247
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
248 date
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
249 -&gt; name of month, day </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
250 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
251 This production is the same as the first, but with no year specification. When a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
252 parser directed by this grammar has encountered name of month and day, it can't
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
253 tell without looking further whether it has a short form or a long form date. In
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
254 such a circumstance, the parser looks at the next following token, which is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
255 called a <A HREF="gloss.html#Lookahead">lookahead token</A>. If the lookahead token is a comma, then, in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
256 absence of other productions, the input is a long form date. If the lookahead
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
257 token is not a comma, then the input is certainly not a long form date, and it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
258 is proper to reduce the short form production.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
259 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
260 Suppose the lookahead token were a comma and the grammar were also to contain
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
261 the following production:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
262 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
263
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
264 appointment
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
265 -&gt; date, comma, time </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
266 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
267 Since a comma can follow date, according to this rule, and can also follow day
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
268 according to the first production, it is impossible to determine, simply by
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
269 looking at the <A HREF="gloss.html#Lookahead">lookahead token</A>, whether the date was being given in short form
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
270 or long form. One would have to look beyond the comma to see if what follows the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
271 comma matches the rules for time or for year. Although it is possible to build
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
272 parsers which can do this, it is not generally feasible. Situations of this sort
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
273 are called <A HREF="gloss.html#Conflict">conflicts</A>. AnaGram warns you about the conflicts in your <A HREF="gloss.html#Grammar">grammar</A> when
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
274 it analyzes your grammar, and provides numerous facilities to help you correct
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
275 them.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
276 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
277
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
278 <H2><A NAME="Note">A Note on Notation</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
279 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
280 <A HREF="gloss.html#ContextFreeGrammar">Context free grammars</A> have been traditionally represented in the literature
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
281 using <A HREF="gloss.html#BNF">Backus-Naur Form</A>, or BNF. In Backus-Naur Form, certain characters, called
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
282 metacharacters, are used to punctuate <A HREF="gloss.html#Production">productions</A> and all other printable
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
283 characters are taken to represent themselves literally. Named tokens are denoted
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
284 by enclosing the name within angle brackets <CODE>&lt; &gt;</CODE> . The left side of a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
285 production is distinguished from the right side by the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
286 characters <CODE> ::= </CODE>.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
287 If several productions have the same left side, the vertical
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
288 bar <CODE> | </CODE> is used to separate them. The <A HREF="gloss.html#RuleElement">elements</A> of a <A HREF="gloss
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
289 .html#GrammarRule">grammar rule</A> are simply juxtaposed
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
290 to indicate that one follows another. Blanks are ignored. Thus, in BNF, the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
291 first production given for date, above, would be:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
292 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
293 &lt;date&gt; ::= &lt;name of month&gt; &lt;day&gt;, &lt;year&gt;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
294 </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
295 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
296 AnaGram uses a notation more consonant with ordinary programming usage. Thus
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
297 token names need not be bracketed and literal characters must be appropriately
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
298 quoted. The elements of rules are joined by commas. Using this approach, there
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
299 is no need for metacharacters and it becomes possible to make a number of useful
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
300 extensions to the notation.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
301 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
302
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
303 <H2><A NAME="Reduction">Reduction Procedures</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
304 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
305 Of course, the reason for parsing an input stream is to interpret the data in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
306 the stream and to process it in some useful manner. The primary tool for doing
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
307 this is the <A HREF="gloss.html#ReductionProcedure">reduction procedure</A>. A reduction procedure is a piece of C code that
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
308 is executed when a particular <A HREF="gloss.html#GrammarRule">grammar rule</A> is reduced. Often, a reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
309 procedure calculates a value which becomes the semantic value of the reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
310 token. The input to the reduction procedure consists of the values of the tokens
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
311 that make up the grammar rule.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
312 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
313 AnaGram allows you to assign C variable names to the tokens in a grammar rule,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
314 so that you can refer to them in the reduction procedure. To assign a C variable
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
315 name, simply follow the token in the rule with a colon and the C variable name.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
316 Simple reduction procedures can be written as C or C++ expressions. At the end
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
317 of the rule, write an equal sign and an expression, terminated by a semicolon.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
318 For example:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
319 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
320 (int) hex digit
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
321 -&gt; '0-9':d =d-'0';
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
322 -&gt; 'a-f':d =d-'a'+10;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
323 -&gt; 'A-F':d =d-'A'+10; </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
324 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
325 When any one of these rules is matched, the value of the token in the rule is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
326 assigned to the temporary variable. The expression to the right of the equal
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
327 sign is evaluated and the value of the expression is stored as the value of hex
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
328 digit, which has been declared to be an int. These <A HREF="gloss.html#Production">productions</A> define
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
329 hexadecimal digits as ascii characters, and calculate the binary value of the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
330 digit in terms of the ascii character code, d. The binary value becomes the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
331 value of hex digit. Hexadecimal digits can be combined to make hexadecimal
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
332 numbers by writing the following productions:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
333 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
334 (int) hex number
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
335 -&gt; hex digit
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
336 -&gt; hex number:n, hex digit:d =16*n+d; </PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
337 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
338 There are several important points to notice in this example. First, <A HREF="gloss.html#ReductionProcedure">reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
339 procedures</A> are executed "from the bottom up". That is, the reduction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
340 procedure for hex digit is executed before any reduction procedure for hex
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
341 number. Second, if there is no reduction procedure for a production, the value
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
342 of the first token in the rule is assigned to the reduction token. Thus, it is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
343 not necessary to provide a reduction procedure in the first production for hex
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
344 number. Third, the reduction procedures for recursive productions are always
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
345 executed <I>after</I> the reduction procedure, if any, for the non-recursive
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
346 production which begins the recursion. Finally, when an input sequence is
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
347 described using left recursion, as in this example, the <A HREF="gloss.html#RuleElement">elements</A> of the sequence
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
348 are processed left to right.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
349 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
350 If you wish to process the elements of a sequence right to left, you may use
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
351 right recursion. For example, it is sometimes convenient to define the fraction
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
352 part of a decimal number thus:</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
353 <PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
354 (double) fraction part
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
355 -&gt; '0-9':d =(d-'0')/10.;
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
356 -&gt; '0-9':d,fraction part:f =(d-'0'+f)/10.;</PRE>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
357 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
358 In this case the leading digits are stored temporarily in the parse stack, and
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
359 then the fraction part is evaluated right to left only when the last digit has
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
360 been found.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
361 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
362 Reduction procedures can be more complex than simple expressions. After the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
363 equal sign you may include an arbitrary block of C or C++ code, enclosed in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
364 braces, { }. To return a semantic value for the reduction token simply use a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
365 return statement. Of course, reduction procedures have the full resources of C
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
366 or C++ at their disposal. They may set and interrogate global variables and may
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
367 call functions freely.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
368 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
369 Since the reduction procedures you write will probably need some support code,
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
370 such as #include statements and declarations, you may incorporate C or C++ code
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
371 into your syntax file at any point. You need only enclose it in braces ({}).
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
372 Such code is called embedded C. All embedded C code is also copied to the parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
373 file, and <I>precedes</I> all of your reduction procedures.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
374 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
375
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
376 <H2><A NAME="Building">Building a Parser</A></H2>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
377 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
378 In order to round out a <A HREF="gloss.html#Parser">parser</A> into a functioning program it needs input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
379 procedures, as well as error diagnosis and recovery capabilities. AnaGram has a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
380 number of options available which give you a high degree of flexibility in
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
381 configuring a parser to suit your particular needs. All of the options are
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
382 provided with reasonable defaults, so that you can safely disregard any option
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
383 until you need the features it provides.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
384 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
385
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
386 <H4><A NAME="Invoking">Invoking a Parser</A></H4>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
387 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
388 Normally, AnaGram configures <A HREF="gloss.html#Parser">parsers</A> as functions which you can call from
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
389 elsewhere in your program. In this situation, you call the parser, it processes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
390 its input, and returns either when it has finished or cannot proceed because of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
391 errors. Alternatively, if you set the event driven configuration switch, your
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
392 parser will be configured so that you have two procedures to call: an
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
393 initializer and a parser. In the event driven configuration you start the parse
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
394 by calling the initializer and then you call the parser once for each unit of
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
395 input. Using the event driven mode makes it quite easy to configure a parser as
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
396 a filter and to chain several parsers together so that the output from one
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
397 parser is the input to the next. Such multi-stage parsing is a convenient way to
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
398 deal with complex input that is not context free.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
399 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
400
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
401 <H4><A NAME="Communicating">Communicating with a Parser</A></H4>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
402 <P>The complete status of your parser is contained in a single data structure
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
403 called a parser control block. All communications with a parser take place via
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
404 the parser control block. Input procedures must place input data in the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
405 appropriate field in the parser control block. When the parse is complete or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
406 encounters an error, the results of the parse may be found in the parser control
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
407 block. When AnaGram builds a parser it includes, in the header file it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
408 generates, a typedef statement which defines the structure of the parser control
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
409 block.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
410 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
411
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
412 <H4><A NAME="Parser">Parser Input</A></H4>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
413 <P>The input to your <A HREF="gloss.html#Parser">parser</A> may be either characters read directly from an
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
414 input stream, or tokens accumulated by a pre-processor or <A HREF="gloss.html#LexicalScanner">lexical scanner</A>. The
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
415 way you provide input to your parser depends on how your <A HREF="gloss.html#Grammar">grammar</A> defines input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
416 tokens and also on whether or not you have requested an event driven parser.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
417 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
418 If your parser is event driven, you provide its input by storing the input code
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
419 and the input value, if any, into the parser control block and calling the
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
420 parser.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
421 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
422 If you have set the pointer input configuration switch in your syntax file, you
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
423 simply initialize the pointer field in your parser control block before you call
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
424 your parser. Your parser will then read its input directly from memory by simply
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
425 incrementing the pointer as necessary.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
426 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
427 Otherwise, your parser will invoke a macro called GET_INPUT every time it needs
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
428 more input. You may define GET_INPUT according to your needs. You can define it
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
429 so that it calls an input function, or you can define it so that it executes
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
430 in-line code each time it is invoked. Your GET_INPUT macro should store its
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
431 input code in the input_code field of the parser control block. If you do not
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
432 write a GET_ INPUT macro, AnaGram will provide one which will read characters
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
433 from stdin.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
434 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
435 If your <A HREF="gloss.html#Grammar">grammar</A> does not define <A HREF="gloss.html#Terminal">terminal tokens</A> in terms of ascii characters or
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
436 external token numbers, your GET_INPUT will have to determine the appropriate
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
437 internal token number for each input token. To assist you in determining these
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
438 token numbers AnaGram provides a typedef enum statement in the header file. You
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
439 can then use named constants to specify the internal token numbers for the input
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
440 tokens.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
441 <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
442
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
443 <H4><A NAME="Error">Error Handling</A></H4>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
444 <P>Your <A HREF="gloss.html#Parser">parser</A> must be prepared to deal with erroneous input. There are two
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
445 aspects to error handling: diagnosing the error, and recovering from the error.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
446 On encountering an error, your parser will invoke a macro called SYNTAX_ ERROR.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
447 If you do not provide a definition for SYNTAX_ERROR, AnaGram will provide a
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
448 simple error diagnostic. AnaGram can also provide automatic error diagnoses
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
449 which pinpoint the location of the error.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
450 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
451 AnaGram provides two options for error recovery: error
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
452 token <A HREF="gloss.html#Resynchronization">resynchronization</A>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
453 and automatic resynchronization. These are techniques for getting your parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
454 back in synch with its input so it can proceed after encountering an error.
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
455 Normally, if you do not select one of these recovery techniques, your parser
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
456 will terminate when it encounters an error; however, you may override this
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
457 default if you wish and provide your own recovery technique.</P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
458
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
459
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
460 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
461 WIDTH=1010 HEIGHT=2 >
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
462 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
463 <IMG ALIGN="right" SRC="images/pslrb6d.gif" ALT="Parsifal Software"
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
464 WIDTH=181 HEIGHT=25>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
465 <BR CLEAR="right">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
466
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
467 Back to <A HREF="index.html">Index</A>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
468 <P>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
469 <ADDRESS><FONT SIZE="-1">
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
470 AnaGram parser generator - documentation<BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
471 Introduction to Syntax Directed Parsing<BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
472 Copyright &copy; 1993-1999, Parsifal Software. <BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
473 All Rights Reserved.<BR>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
474 </FONT></ADDRESS>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
475
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
476 </BODY>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
477 </HTML>
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
478
13d2b8934445 Import AnaGram (near-)release tree into Mercurial.
David A. Holland
parents:
diff changeset
479