comparison doc/misc/html/isdp.html @ 0:13d2b8934445

Import AnaGram (near-)release tree into Mercurial.
author David A. Holland
date Sat, 22 Dec 2007 17:52:45 -0500
parents
children
comparison
equal deleted inserted replaced
-1:000000000000 0:13d2b8934445
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
2 <HTML>
3 <HEAD>
4 <TITLE>Introduction to Syntax Directed Parsing</TITLE>
5 </HEAD>
6
7 <BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
8 TEXT="#000000" LINK="#0033CC"
9 VLINK="#CC0033" ALINK="#CC0099">
10
11 <P>
12
13 <IMG ALIGN="right" SRC="images/agrsl6c.gif" ALT="AnaGram"
14 WIDTH=124 HEIGHT=30>
15 <BR CLEAR="all">
16 Back to <A HREF="index.html">Index</A>
17 </P>
18 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
19 WIDTH=1010 HEIGHT=2 >
20
21 <BR CLEAR="all">
22
23
24 <H1 ALIGN="LEFT">Introduction to Syntax Directed Parsing</H1>
25 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
26 WIDTH=1010 HEIGHT=2 >
27 </P>
28
29 <P>
30 (Adapted from Chapter 4, AnaGram User's Guide)</P>
31 </P>
32 <UL>
33 <LI SRC="#What"><A HREF="#What">What is Syntax Directed Parsing?</A></LI>
34 <LI><A HREF="#Describing">Describing an Input Sequence</A></LI>
35 <LI><A HREF="#How">How a Parser Works</A></LI>
36 <LI><A HREF="#Note">A Note on Notation</A></LI>
37 <LI><A HREF="#Reduction">Reduction Procedures</A></LI>
38 <LI><A HREF="#Building">Building a Parser</A>
39 <UL>
40 <LI SRC="#Invoking"><A HREF="#Invoking">Invoking a Parser</A></LI>
41 <LI><A HREF="#Communicating">Communicating with a Parser</A></LI>
42 <LI><A HREF="#Parser">Parser Input</A></LI>
43 <LI><A HREF="#Error">Error Handling</A></LI>
44 </UL>
45 </LI>
46 </UL>
47 <BR>
48
49 <H2><A NAME="What">What is Syntax Directed Parsing?</A></H2>
50 <P>
51 Every programmer has to deal with input data. Usually processing input data
52 depends on what has preceded, and often even on what follows, the input under
53 consideration. Keeping track of these dependencies in order to know how to
54 process the data is called parsing. It is often relatively easy to keep track of
55 simple dependencies when first writing a program. As the program develops, as
56 new features are added, as bugs are fixed, the dependencies often stop being
57 simple. Input processing becomes a headache, since it is hard to keep track of
58 or even identify all the particular cases. Changes to the program cause
59 unexpected problems and program maintenance threatens to get out of control.</P>
60 <P>
61 Syntax directed parsing is a technique for addressing these difficulties. In
62 syntax directed parsing, the input part of a program is constructed
63 automatically, by means of a standard algorithm, from a high level description
64 of the input data structure. Code to perform any required processing of the data
65 is attached to the description in a convenient way.</P>
66 <P>
67 The description, being non-procedural, is generally easier to write and to
68 modify than the equivalent program code, and much less likely to harbor bugs. It
69 is easier to read, and easier to maintain. It is easy to re-use in other
70 programs requiring the same input, thus encouraging uniform interfaces. The
71 technique also simplifies the overall program by decoupling the input and
72 processing components and providing a natural, modular structure.</P>
73 <P>
74 To use syntax directed parsing you first write the input data description,
75 called a <A HREF="gloss.html#Grammar">grammar</A>. A file which contains a grammar is called a syntax
76 file. A parser generator, such as AnaGram, then can create, from the syntax
77 file, a function (or program) called a <A HREF="gloss.html#Parser">parser</A>, written in C or C++. The parser
78 keeps track of all the dependencies in your input, and calls certain functions,
79 <A HREF="gloss.html#ReductionProcedure">reduction procedures</A>, to deal with specific units or sequences of data as they
80 are encountered.</P>
81 <P>
82 Reduction procedures are the code you write to process your data. In your
83 grammar they are linked to the structures in your input, so that your parser
84 will call them at precisely the right times with precisely the right data. Note
85 that with this technique you need only provide a non-procedural <I>description</I>
86 of the input. The details of flow of control are handled entirely by the parser.
87 When you write reduction procedures, you can concentrate entirely on what you
88 have to do with the data. You don't have to encumber your code with switches and
89 tests to determine the structure of your input.</P>
90 <P>
91 The parsers you build using a parser generator such as AnaGram may be complete
92 stand-alone programs, or they may serve as input routines for a more extensive
93 program. Some programs may even use more than one parser.</P>
94 <BR>
95
96 <H2><A NAME="Describing">Describing an Input Sequence</A></H2>
97 <P>
98 Writing a grammar consists of describing the acceptable input sequences for your
99 program. The vehicle for describing an input sequence is called a <A HREF="gloss.html#Production">production</A>.
100 Productions show how a logical component of the input can be made up of a
101 sequence of more elementary components. A production that describes a date might
102 be written:</P>
103 <PRE>
104 date
105 -&gt; name of month, day, comma, year </PRE>
106 <P>
107 The components of the input are called <A HREF="gloss.html#Token">tokens</A>. The sequence of tokens on the
108 side of the production is called a <A HREF="gloss.html#GrammarRule">grammar rule</A>, or rule for short. The
109 individual tokens on the right side of the rule are also called <A HREF="gloss.html#RuleElement">rule elements</A>.
110 </P>
111 <P>
112 The token on the left side of the production is called the <A HREF="gloss.html#ReductionToken">reduction token</A> for
113 the rule. Tokens may have semantic values, as distinguished from syntactic
114 values, which you can use in your <A HREF="gloss.html#ReductionProcedure">reduction procedures</A>. For instance, the value
115 of name of month could be an integer in the range zero to eleven, or it could be
116 a pointer to an ascii string. The value of day could be an integer in the range
117 one to thirty-one.</P>
118 <P>
119 A <A HREF="gloss.html#Grammar">grammar</A> consists of a number of such <A HREF="gloss.html#Production">productions</A>, each of which describes some
120 component of the input in terms of other components. It does not take very many
121 productions to describe quite complex input streams. A grammar for the C
122 language, for instance, requires about two hundred productions.</P>
123 <P>
124 Many people find the term "production" quite confusing. It comes from theoretical
125 linguistics where it is used to describe how one may produce sequences which
126 correspond to a set of grammatical rules. Ironically, the major usage of the
127 idea has been in parsing where the interest is not so much in creating sequences
128 which satisfy the grammatical rules as in decoding and analyzing such sequences.
129 Nonetheless, it is convenient, in the above example, to say that the token date
130 produces a sequence of tokens consisting of name of month, day, comma, and year.
131 We also say that the sequence reduces to date.</P>
132 <P>
133 There may be more than one production to describe a given component, if there is
134 more than one way it may be represented. For instance,</P>
135 <PRE>
136 date
137 -&gt; day, name of month, year </PRE>
138 <P>
139 describes another common way of writing a date. In other words, a <A HREF="gloss.html#ReductionToken">reduction
140 token</A> may produce a number of different <A HREF="gloss.html#GrammarRule">grammar rules</A>.</P>
141 <P>
142 Tokens which appear on the left side of one or more productions are called
143 <A HREF="gloss.html#Nonterminal">nonterminal tokens</A>. Those which appear <I>only</I> on the right sides of
144 productions are called <A HREF="gloss.html#Terminal">terminal tokens</A>. Terminal tokens are the units which
145 actually appear physically in the input. Nonterminal tokens are identified when
146 a sequence of tokens that matches the right side of a production is seen in the
147 input. When AnaGram analyzes a <A HREF="gloss.html#Grammar">grammar</A>, it assigns a unique token number to each
148 token it finds in the grammar.</P>
149 <P>
150 Nonterminal tokens, such as date in the example above, may appear in any grammar
151 rule just as though they were input tokens. The token on the left side of a
152 production can even appear on the right side as well. Such a production is
153 called a recursive production. When a nonterminal token appears on the right
154 side of a production, it may be represented in this context by <I>any</I> of
155 the <A HREF="gloss.html#GrammarRule">grammar rules</A> it produces. Grammars described in this manner are called
156 <A HREF="gloss.html#ContextFreeGrammar">context free grammars</A> since there is no contextual constraint on which of the
157 rules that a token produces can appear in any given context.</P>
158 <P>
159 Recursive productions may be either left recursive or right recursive. Left
160 recursive productions are those where the recursively defined <A HREF="gloss.html#Nonterminal">nonterminal</A>
161 appears as the first <A HREF="gloss.html#RuleElement">element</A> in the recursive rule. Right recursive productions
162 are those where it is the last element. If it appears anywhere between, the
163 production is said to be center recursive.</P>
164 <P>
165 Any nonterminal token which has a recursive production must also have at least
166 one simple, non-recursive production. Otherwise, it is not possible to create a
167 finite sequence of terminal tokens from the nonterminal token.</P>
168 <P>
169 Recursion may also occur implicitly in a <A HREF="gloss.html#Grammar">grammar</A> when one of the tokens on the
170 right side of a production itself has a production involving the token on the
171 left. Such implicit recursion sometimes may involve numerous levels of
172 productions. Implicit recursion occurs most commonly in describing constructs
173 such as arithmetic expressions or the block structure of programming languages.</P>
174 <P>
175 Clearly, grammars can accommodate multiple levels of structure in the input
176 sequences they describe. There must, at the top, be a single token which
177 encompasses the entire input. This special token is variously called the <A HREF="gloss.html#GrammarToken">grammar
178 token</A>, the goal token or the start token.</P>
179 <P>
180 AnaGram allows you to specify <A HREF="gloss.html#Terminal">terminal tokens</A> explicitly as ascii characters, or
181 even as <A HREF="gloss.html#CharacterSets">sets of ascii characters</A>, right in the grammar. Thus, you may write
182 '0-9' to represent the set of ascii digits, or 'A-Z' to represent the set of
183 upper case letters. The semantic value of such a <A HREF="gloss.html#Token">token</A> is the ascii character
184 code that actually appears in the input stream. If the various sets you use in
185 your grammar overlap, they may not properly represent terminal tokens. In this
186 case, AnaGram automatically extends your <A HREF="gloss.html#Grammar">grammar</A> appropriately. </P>
187 <BR>
188
189 <H2><A NAME="How">How a Parser Works</A></H2>
190 <P>
191 The aim of a <A HREF="gloss.html#Parser">parser</A> is to match its input with the full syntactic structure
192 specified by the <A HREF="gloss.html#Production">productions</A> which make up the grammar. The primary component of
193 a parser is an input buffer, sometimes thought of as a stack, into which tokens
194 are shifted, or stored sequentially, as they are encountered in the input. At
195 the same time that a token is shifted into the input buffer, its semantic value
196 is pushed onto the value stack. A token is not shifted into the buffer unless it
197 "makes sense", that is, unless it is consistent with the rules of the
198 <A HREF="gloss.html#Grammar">grammar</A> and with the input that has preceded it. If a token does not make sense,
199 the parser signals a syntax error. In order to determine whether a token makes
200 sense, the parser has a sort of decision table which provides a list of
201 acceptable tokens for each of a number of states. The table also specifies what
202 the parser is to do with each acceptable token. When the table indicates that a
203 token is to be shifted into the buffer, it also specifies a new state. The
204 parser stacks the current state on a state stack and jumps to the new state.
205 Thus every time a token is shifted into the input buffer, a state number is
206 pushed onto the state stack. For each state of the parser, excepting only the
207 initial state, there is a unique token which will cause a jump to that state.
208 This token is called the <A HREF="gloss.html#CharacteristicToken">characteristic token</A> for the state.</P>
209 <P>
210 When the rightmost, or most recent, tokens in the input buffer match the right
211 side of a production precisely, the parser <I>may</I> replace the tokens that
212 match the rule with a single token, the token on the left side of the
213 production. This process of replacing a sequence of tokens with a single token
214 is called reduction. The token that replaces the sequence of tokens is called
215 the <A HREF="gloss.html#ReductionToken">reduction token</A>.</P>
216 <P>
217 The actual mechanism of the reduction is quite important. At the same time that
218 input tokens are removed from the input buffer, state numbers are popped from
219 the state stack, so that when all input tokens matching the rule have been
220 removed, the parser state has been restored to the value it had at the time the
221 first token in the rule was seen. As state numbers are popped from the state
222 stack, token values are popped from the value stack. If the rule has a <A HREF="gloss.html#ReductionProcedure">reduction
223 procedure</A>, temporary variables are loaded with the values popped from the stack
224 and the reduction procedure is called. The reduction token is now shifted into
225 the input buffer just as though it were an input token. If the reduction
226 procedure returned a result it is shifted into the value stack as the value of
227 the reduction token. The parser stacks the current state again and jumps to a
228 new state as determined by the parser tables.</P>
229 <P>
230 If there are no errors in your input, when the last token has been read from the
231 input, shifted into the input buffer, and reductions performed, there will be
232 precisely one token in the input buffer: the <A HREF="gloss.html#GrammarToken">grammar, or goal, token</A> which
233 describes your entire input. At this point your parser declares itself finished.</P>
234 <P>
235 Reductions do not necessarily occur every time a rule matches the tokens in the
236 input buffer. If the reduction token does not make sense, that is, if it is not
237 consistent with the rules of the <A HREF="gloss.html#GrammarToken">grammar</A> and with the input that has preceded
238 it, the parser will not perform the reduction. Suppose there are tokens in the
239 input which match one of the rules given above for date. A reduction will not
240 occur unless date is one of the tokens the parser is actually looking for at
241 that stage in the input.</P>
242 <P>
243 Even when the reduction token would make sense, there are still situations where
244 the reduction would not take place. Suppose a grammar includes the two
245 <A HREF="gloss.html#Production">productions</A> given above for date as well as the following:</P>
246 <PRE>
247
248 date
249 -&gt; name of month, day </PRE>
250 <P>
251 This production is the same as the first, but with no year specification. When a
252 parser directed by this grammar has encountered name of month and day, it can't
253 tell without looking further whether it has a short form or a long form date. In
254 such a circumstance, the parser looks at the next following token, which is
255 called a <A HREF="gloss.html#Lookahead">lookahead token</A>. If the lookahead token is a comma, then, in the
256 absence of other productions, the input is a long form date. If the lookahead
257 token is not a comma, then the input is certainly not a long form date, and it
258 is proper to reduce the short form production.</P>
259 <P>
260 Suppose the lookahead token were a comma and the grammar were also to contain
261 the following production:</P>
262 <PRE>
263
264 appointment
265 -&gt; date, comma, time </PRE>
266 <P>
267 Since a comma can follow date, according to this rule, and can also follow day
268 according to the first production, it is impossible to determine, simply by
269 looking at the <A HREF="gloss.html#Lookahead">lookahead token</A>, whether the date was being given in short form
270 or long form. One would have to look beyond the comma to see if what follows the
271 comma matches the rules for time or for year. Although it is possible to build
272 parsers which can do this, it is not generally feasible. Situations of this sort
273 are called <A HREF="gloss.html#Conflict">conflicts</A>. AnaGram warns you about the conflicts in your <A HREF="gloss.html#Grammar">grammar</A> when
274 it analyzes your grammar, and provides numerous facilities to help you correct
275 them.</P>
276 <BR>
277
278 <H2><A NAME="Note">A Note on Notation</A></H2>
279 <P>
280 <A HREF="gloss.html#ContextFreeGrammar">Context free grammars</A> have been traditionally represented in the literature
281 using <A HREF="gloss.html#BNF">Backus-Naur Form</A>, or BNF. In Backus-Naur Form, certain characters, called
282 metacharacters, are used to punctuate <A HREF="gloss.html#Production">productions</A> and all other printable
283 characters are taken to represent themselves literally. Named tokens are denoted
284 by enclosing the name within angle brackets <CODE>&lt; &gt;</CODE> . The left side of a
285 production is distinguished from the right side by the
286 characters <CODE> ::= </CODE>.
287 If several productions have the same left side, the vertical
288 bar <CODE> | </CODE> is used to separate them. The <A HREF="gloss.html#RuleElement">elements</A> of a <A HREF="gloss
289 .html#GrammarRule">grammar rule</A> are simply juxtaposed
290 to indicate that one follows another. Blanks are ignored. Thus, in BNF, the
291 first production given for date, above, would be:</P>
292 <PRE>
293 &lt;date&gt; ::= &lt;name of month&gt; &lt;day&gt;, &lt;year&gt;
294 </PRE>
295 <P>
296 AnaGram uses a notation more consonant with ordinary programming usage. Thus
297 token names need not be bracketed and literal characters must be appropriately
298 quoted. The elements of rules are joined by commas. Using this approach, there
299 is no need for metacharacters and it becomes possible to make a number of useful
300 extensions to the notation.</P>
301 <BR>
302
303 <H2><A NAME="Reduction">Reduction Procedures</A></H2>
304 <P>
305 Of course, the reason for parsing an input stream is to interpret the data in
306 the stream and to process it in some useful manner. The primary tool for doing
307 this is the <A HREF="gloss.html#ReductionProcedure">reduction procedure</A>. A reduction procedure is a piece of C code that
308 is executed when a particular <A HREF="gloss.html#GrammarRule">grammar rule</A> is reduced. Often, a reduction
309 procedure calculates a value which becomes the semantic value of the reduction
310 token. The input to the reduction procedure consists of the values of the tokens
311 that make up the grammar rule.</P>
312 <P>
313 AnaGram allows you to assign C variable names to the tokens in a grammar rule,
314 so that you can refer to them in the reduction procedure. To assign a C variable
315 name, simply follow the token in the rule with a colon and the C variable name.
316 Simple reduction procedures can be written as C or C++ expressions. At the end
317 of the rule, write an equal sign and an expression, terminated by a semicolon.
318 For example:</P>
319 <PRE>
320 (int) hex digit
321 -&gt; '0-9':d =d-'0';
322 -&gt; 'a-f':d =d-'a'+10;
323 -&gt; 'A-F':d =d-'A'+10; </PRE>
324 <P>
325 When any one of these rules is matched, the value of the token in the rule is
326 assigned to the temporary variable. The expression to the right of the equal
327 sign is evaluated and the value of the expression is stored as the value of hex
328 digit, which has been declared to be an int. These <A HREF="gloss.html#Production">productions</A> define
329 hexadecimal digits as ascii characters, and calculate the binary value of the
330 digit in terms of the ascii character code, d. The binary value becomes the
331 value of hex digit. Hexadecimal digits can be combined to make hexadecimal
332 numbers by writing the following productions:</P>
333 <PRE>
334 (int) hex number
335 -&gt; hex digit
336 -&gt; hex number:n, hex digit:d =16*n+d; </PRE>
337 <P>
338 There are several important points to notice in this example. First, <A HREF="gloss.html#ReductionProcedure">reduction
339 procedures</A> are executed "from the bottom up". That is, the reduction
340 procedure for hex digit is executed before any reduction procedure for hex
341 number. Second, if there is no reduction procedure for a production, the value
342 of the first token in the rule is assigned to the reduction token. Thus, it is
343 not necessary to provide a reduction procedure in the first production for hex
344 number. Third, the reduction procedures for recursive productions are always
345 executed <I>after</I> the reduction procedure, if any, for the non-recursive
346 production which begins the recursion. Finally, when an input sequence is
347 described using left recursion, as in this example, the <A HREF="gloss.html#RuleElement">elements</A> of the sequence
348 are processed left to right.</P>
349 <P>
350 If you wish to process the elements of a sequence right to left, you may use
351 right recursion. For example, it is sometimes convenient to define the fraction
352 part of a decimal number thus:</P>
353 <PRE>
354 (double) fraction part
355 -&gt; '0-9':d =(d-'0')/10.;
356 -&gt; '0-9':d,fraction part:f =(d-'0'+f)/10.;</PRE>
357 <P>
358 In this case the leading digits are stored temporarily in the parse stack, and
359 then the fraction part is evaluated right to left only when the last digit has
360 been found.</P>
361 <P>
362 Reduction procedures can be more complex than simple expressions. After the
363 equal sign you may include an arbitrary block of C or C++ code, enclosed in
364 braces, { }. To return a semantic value for the reduction token simply use a
365 return statement. Of course, reduction procedures have the full resources of C
366 or C++ at their disposal. They may set and interrogate global variables and may
367 call functions freely.</P>
368 <P>
369 Since the reduction procedures you write will probably need some support code,
370 such as #include statements and declarations, you may incorporate C or C++ code
371 into your syntax file at any point. You need only enclose it in braces ({}).
372 Such code is called embedded C. All embedded C code is also copied to the parser
373 file, and <I>precedes</I> all of your reduction procedures.</P>
374 <BR>
375
376 <H2><A NAME="Building">Building a Parser</A></H2>
377 <P>
378 In order to round out a <A HREF="gloss.html#Parser">parser</A> into a functioning program it needs input
379 procedures, as well as error diagnosis and recovery capabilities. AnaGram has a
380 number of options available which give you a high degree of flexibility in
381 configuring a parser to suit your particular needs. All of the options are
382 provided with reasonable defaults, so that you can safely disregard any option
383 until you need the features it provides.</P>
384 <BR>
385
386 <H4><A NAME="Invoking">Invoking a Parser</A></H4>
387 <P>
388 Normally, AnaGram configures <A HREF="gloss.html#Parser">parsers</A> as functions which you can call from
389 elsewhere in your program. In this situation, you call the parser, it processes
390 its input, and returns either when it has finished or cannot proceed because of
391 errors. Alternatively, if you set the event driven configuration switch, your
392 parser will be configured so that you have two procedures to call: an
393 initializer and a parser. In the event driven configuration you start the parse
394 by calling the initializer and then you call the parser once for each unit of
395 input. Using the event driven mode makes it quite easy to configure a parser as
396 a filter and to chain several parsers together so that the output from one
397 parser is the input to the next. Such multi-stage parsing is a convenient way to
398 deal with complex input that is not context free.</P>
399 <BR>
400
401 <H4><A NAME="Communicating">Communicating with a Parser</A></H4>
402 <P>The complete status of your parser is contained in a single data structure
403 called a parser control block. All communications with a parser take place via
404 the parser control block. Input procedures must place input data in the
405 appropriate field in the parser control block. When the parse is complete or
406 encounters an error, the results of the parse may be found in the parser control
407 block. When AnaGram builds a parser it includes, in the header file it
408 generates, a typedef statement which defines the structure of the parser control
409 block.</P>
410 <BR>
411
412 <H4><A NAME="Parser">Parser Input</A></H4>
413 <P>The input to your <A HREF="gloss.html#Parser">parser</A> may be either characters read directly from an
414 input stream, or tokens accumulated by a pre-processor or <A HREF="gloss.html#LexicalScanner">lexical scanner</A>. The
415 way you provide input to your parser depends on how your <A HREF="gloss.html#Grammar">grammar</A> defines input
416 tokens and also on whether or not you have requested an event driven parser.</P>
417 <P>
418 If your parser is event driven, you provide its input by storing the input code
419 and the input value, if any, into the parser control block and calling the
420 parser.</P>
421 <P>
422 If you have set the pointer input configuration switch in your syntax file, you
423 simply initialize the pointer field in your parser control block before you call
424 your parser. Your parser will then read its input directly from memory by simply
425 incrementing the pointer as necessary.</P>
426 <P>
427 Otherwise, your parser will invoke a macro called GET_INPUT every time it needs
428 more input. You may define GET_INPUT according to your needs. You can define it
429 so that it calls an input function, or you can define it so that it executes
430 in-line code each time it is invoked. Your GET_INPUT macro should store its
431 input code in the input_code field of the parser control block. If you do not
432 write a GET_ INPUT macro, AnaGram will provide one which will read characters
433 from stdin.</P>
434 <P>
435 If your <A HREF="gloss.html#Grammar">grammar</A> does not define <A HREF="gloss.html#Terminal">terminal tokens</A> in terms of ascii characters or
436 external token numbers, your GET_INPUT will have to determine the appropriate
437 internal token number for each input token. To assist you in determining these
438 token numbers AnaGram provides a typedef enum statement in the header file. You
439 can then use named constants to specify the internal token numbers for the input
440 tokens.</P>
441 <BR>
442
443 <H4><A NAME="Error">Error Handling</A></H4>
444 <P>Your <A HREF="gloss.html#Parser">parser</A> must be prepared to deal with erroneous input. There are two
445 aspects to error handling: diagnosing the error, and recovering from the error.
446 On encountering an error, your parser will invoke a macro called SYNTAX_ ERROR.
447 If you do not provide a definition for SYNTAX_ERROR, AnaGram will provide a
448 simple error diagnostic. AnaGram can also provide automatic error diagnoses
449 which pinpoint the location of the error.</P>
450 <P>
451 AnaGram provides two options for error recovery: error
452 token <A HREF="gloss.html#Resynchronization">resynchronization</A>
453 and automatic resynchronization. These are techniques for getting your parser
454 back in synch with its input so it can proceed after encountering an error.
455 Normally, if you do not select one of these recovery techniques, your parser
456 will terminate when it encounters an error; however, you may override this
457 default if you wish and provide your own recovery technique.</P>
458
459
460 <IMG ALIGN="bottom" SRC="images/rbline6j.gif" ALT="----------------------"
461 WIDTH=1010 HEIGHT=2 >
462 <P>
463 <IMG ALIGN="right" SRC="images/pslrb6d.gif" ALT="Parsifal Software"
464 WIDTH=181 HEIGHT=25>
465 <BR CLEAR="right">
466
467 Back to <A HREF="index.html">Index</A>
468 <P>
469 <ADDRESS><FONT SIZE="-1">
470 AnaGram parser generator - documentation<BR>
471 Introduction to Syntax Directed Parsing<BR>
472 Copyright &copy; 1993-1999, Parsifal Software. <BR>
473 All Rights Reserved.<BR>
474 </FONT></ADDRESS>
475
476 </BODY>
477 </HTML>
478
479