comparison doc/manual/cfp.tex @ 0:13d2b8934445

Import AnaGram (near-)release tree into Mercurial.
author David A. Holland
date Sat, 22 Dec 2007 17:52:45 -0500
1 \chapter{Configuration Parameters}
2 \index{Configuration parameters}\index{Parameters}
3
4 \agterm{Configuration parameters} are named constants that control the
5 way AnaGram works. AnaGram ignores case\index{Case sensitivity} when
6 it looks up the names of configuration parameters, so that
7 \agcode{parser name} and \agcode{Parser Name} both refer to the same
8 parameter. Configuration parameters that have only true/false or
9 on/off values are often referred to as
10 \index{Configuration switches}\agterm{configuration switches}.
11
12 Configuration parameters are used to control:
13
14 \begin{itemize}
15 \item Comment nesting
16 \item Grammar analysis
17 \item Parser generation
18 \end{itemize}
19
20 Every configuration parameter has a default value which has been
21 chosen to correspond to a standard if it exists, customary usage if
22 such can be determined, or otherwise to the most likely usage.
23
24 Configuration parameters may be specified either in
25 \index{Configuration file}\index{File}\agterm{configuration files},
26 always named \agfile{AnaGram.cfg}, or in a syntax file. A
27 configuration file is a normal ASCII file containing parameter
28 specifications. The syntax of a configuration file is the same as
29 that of a configuration section within a syntax file, except that a
30 configuration file does not have the brackets ( \agcode{[ ]} ) that
31 enclose a configuration section in a syntax file. You may comment the
32 configuration file freely, just as though it were a syntax file.
33 % XXX ``configuration section'' is a forward reference (it is defined
34 % later in this chapter) and we should rearrange all this so it
35 % isn't.
36
37 % Parameters can be set in either a configuration file or in your syntax
38 % file.
39 Apart from the \agparam{nest comments} switch, if a parameter
40 is specified more than once, only the last value is used (see below).
41 The \agparam{nest comments} switch, which affects the way AnaGram
42 reads your configuration and syntax files, takes effect as soon as
43 AnaGram encounters it in a file and stays in effect unless it is later
44 turned off.
45
46 % XXX this should be belabored less. Also, good practice dictates that
47 % if you ship a project or a grammar it should compile in someone
48 % else's environment, and we shouldn't encourage people to do things
49 % like put \agparam{pointer input} in a systemwide AnaGram.cfg.
50 %
51 % XXX also in the Unix world it ought to read
52 % /usr/local/etc/AnaGram.cfg and then also ~/.AnaGram.cfg - or
53 % something like that. And it ought to be possible to set params
54 % on the agcl command line. We need to think about this. (Well,
55 % there's not really any valid use for either, so perhaps it
56 % doesn't matter.)
57 %
58 % How about something like
59 %
60 % Support for a global configuration file dates from the DOS-based
61 % AnaGram 1.x, where the same configuration mechanism was used to
62 % establish user interface preferences. AnaGram 2.0 and above handle
63 % preferences separately, and the configuration system is only used
64 % for code-related options. Since good practice dictates that code
65 % should continue to work if exported outside of one's personal
66 % environment, there are few or no legitimate uses of the global
67 % configuration file and support for it will likely be removed in a
68 % future AnaGram release.
69 %
70 % (But there really should be support for params on the agcl command
71 % line; if nothing else it would make it a lot easier to test
72 % combinations of settings.)
73 %
74 On initialization, AnaGram checks the directory that contains the
75 AnaGram executable file. If it finds \agfile{AnaGram.cfg}, it reads it
76 and sets internal parameters accordingly. It then looks for
77 \agfile{AnaGram.cfg} in your working directory and, if it finds it, reads
78 it in turn. If any parameter is set in both files, the last setting
79 wins. The effect of this two-stage process is to allow you to set
80 your standard preferences in the principal directory, with specific
81 overrides in your working directories. You may also put configuration
82 parameters in your syntax file, which override the settings in the
83 configuration files. Note that neither configuration file is
84 necessary.
85
86 Before executing an Analyze Grammar or Build Parser command, AnaGram
87 resets configuration parameters to their initial values, as determined
88 by the built-in defaults and the configuration files read at program
89 initialization.
90
91 There are, therefore, four levels at which parameters may be set. At
92 the first level, there are the settings built into AnaGram. If you
93 don't like some of these, you can override them with a configuration
94 file at the second level, the tools directory where you installed
95 AnaGram. If a particular project needs overrides, you can put them in
96 a configuration file at the third level, the working directory for
97 this project. And if you have specific configuration requirements for
98 a particular parser, the best place for them is the fourth level, the
99 syntax file for the parser.
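
As a sketch of how the four levels combine, suppose (hypothetically)
that the \agfile{AnaGram.cfg} in the AnaGram directory contains

\begin{indentingcode}{0.4in}
line numbers
allow macros = off
\end{indentingcode}
and the \agfile{AnaGram.cfg} in the working directory contains

\begin{indentingcode}{0.4in}
\~{}line numbers
\end{indentingcode}
Then, for a syntax file with no configuration sections, \agparam{line
numbers} is off (the working directory file is read last),
\agparam{allow macros} is off (set in the AnaGram directory and not
overridden), and every other parameter keeps its built-in default. A
configuration section in the syntax file that turns \agparam{allow
macros} back on would override both files for that parser only.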
100
101 For all of this flexibility, some people prefer to set every
102 configuration parameter explicitly in their syntax files so there is
103 no question as to what setting is being used. AnaGram is set up so
104 you can do it whichever way you prefer.
105
106 If you are uncertain as to the actual parameters that AnaGram is using
107 at any time, the
108 \index{Configuration Parameters}\index{Window}
109 \agwindow{Configuration Parameters} window listed in the
110 \agmenu{Windows} menu will show you the current state of all
111 parameters.
112
113 The different varieties of configuration parameters are described
114 below. Each definition of a parameter must start on a new line. A
115 configuration file is just a sequence of parameter definitions, each
116 on a separate line. Blank lines can be used as separators where you
117 please, and comments may be used as described for syntax files.
118 Case\index{Case sensitivity} is ignored for parameter names (but not
119 for the whole definition). In a syntax file, each set of definitions
120 must be enclosed with brackets ( \agcode{[ ]} ), forming a
121 \index{Configuration section}\agterm{configuration section}, one of
122 the four kinds of AnaGram statements. Configuration sections can be
123 scattered throughout a syntax file, but each section should begin on a
124 new line, and following statements should also of course start on new
125 lines. There is no restriction on the number of sections, or on the
126 number of times a parameter appears. The last setting of a parameter
127 wins.
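
For instance, a syntax file might contain a configuration section such
as the following. This is only an illustrative sketch; the particular
settings are arbitrary and use parameters described in this chapter.

\begin{indentingcode}{0.4in}
{[}
  \~{}allow macros
  error trace = on
  parser stack size = 50
{]}
\end{indentingcode}
If a later section in the same file contains \agcode{parser stack size
= 100}, the parser is built with the later value, since the last
setting of a parameter wins.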
128
129 The first variety of configuration parameter is a simple
130 \index{Switches}\index{Configuration switches}switch that controls
131 one of the various features of AnaGram. Such parameters are also called
132 \agterm{configuration switches}. They need simply be stated to set the
133 condition (turn it on) or negated with the tilde (\agcode{\~{}}) to
134 reset the condition (turn it off). Thus
135
136 \begin{indentingcode}{0.4in}
137 nest comments
138 \end{indentingcode}
139 causes AnaGram to allow nested comments, and
140
141 \begin{indentingcode}{0.4in}
142 \~{}nest comments
143 \end{indentingcode}
144 causes AnaGram to disallow nested comments.
145
146 You may also set or reset configuration switches with explicit on or
147 off values:
148
149 \begin{indentingcode}{0.4in}
150 nest comments = on
151 nest comments = off
152 \end{indentingcode}
153
154 A second variety of configuration parameter takes a value which is the
155 name of a token. Thus
156
157 \begin{indentingcode}{0.4in}
158 grammar token = c grammar
159 \end{indentingcode}
160 specifies that \agcode{c grammar} is the grammar token, the token that
161 AnaGram should use as the starting point for analyzing your grammar.
162
163 A third variety of configuration parameter takes a value which is a C
164 or C++ data type. Thus
165
166 \begin{indentingcode}{0.4in}
167 default token type = unsigned char *
168 \end{indentingcode}
169 signifies that the value of a token, unless otherwise specified, is a
170 pointer to an \agcode{unsigned char}. AnaGram does not accept the
171 full panoply of C and C++ \index{Data type}data types. The
172 restrictions are that AnaGram does not allow specification of array or
173 function types, nor explicit structure types. Types that are defined
174 with typedef statements, structure definitions, or class definitions,
175 including template classes, in your embedded C or C++ are acceptable.
176 If you have more complex data types, you should define a simple name
177 using a typedef statement.
178
179 A fourth variety of configuration parameter takes a string value to
180 set an ASCII string used by AnaGram. Thus
181
182 \begin{indentingcode}{0.4in}
183 header file name = "widget.h"
184 \end{indentingcode}
185 signifies that the header file created by AnaGram should be called
186 \agfile{widget.h}. In
187 those strings which are used to name the parser or files which AnaGram
188 builds, the character ``\agcode{\#}'' is used to indicate that AnaGram
189 should substitute the name of your syntax file. In strings used to
190 determine the names of program variables or functions, ``\agcode{\$}''
191 is used to indicate that AnaGram should substitute the name of your
192 parser. When building enumeration constants for the names of the
193 tokens in your grammar, ``\agcode{\%}'' will be replaced by the name
194 of the token.
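
As an illustration of the ``\agcode{\#}'' substitution, assume a syntax
file named \agfile{calc.syn} (a hypothetical name). With the settings

\begin{indentingcode}{0.4in}
header file name = "\#tok.h"
coverage file name = "\#.nrc"
\end{indentingcode}
AnaGram substitutes \agcode{calc} for each ``\agcode{\#}'', so the
header file is \agfile{calctok.h} and the coverage file is
\agfile{calc.nrc}.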
195
196 The final variety of configuration parameter takes a numeric value.
197 The value may be decimal, octal or hexadecimal, following the C
198 conventions, and may have an optional sign. Thus
199
200 \begin{indentingcode}{0.4in}
201 parser stack size = 50
202 \end{indentingcode}
203 tells AnaGram to allocate space for at least fifty stack entries when
204 it creates your parser.
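
Since the C conventions apply, a leading \agcode{0} marks an octal
value and a leading \agcode{0x} a hexadecimal one. As a worked
example, each of the following three lines specifies the same value,
fifty:

\begin{indentingcode}{0.4in}
parser stack size = 50
parser stack size = 062
parser stack size = 0x32
\end{indentingcode}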
205
206 If AnaGram does not recognize a parameter, it will give you a warning
207 with line number, column number, and the message ``no such
208 parameter''. If the value for a parameter is inappropriate, such as a
209 string value for a parameter which should have a numeric value, the
210 message will be ``inappropriate value''. If the error occurs in the
211 configuration file found in the AnaGram directory, AnaGram will prefix
212 the warning with the complete path name for the file. If the error
213 occurs in the configuration file in your working directory, AnaGram
214 will prefix the warning with ``AnaGram.cfg:''. If AnaGram encounters a
215 syntax error while reading a configuration file, it will honor the
216 parameter settings it found before the syntax error, but will ignore
217 everything that follows the error.
218
219 \section{Alphabetic Listing of Configuration Parameters}
220
221 \index{Configuration switches}\index{Allow macros}\index{Macros}
222 \agparamheading{allow macros}{switch, default on}
223
224 When this switch is set, i.e., on, reduction procedures will be
225 implemented as macros if they are sufficiently simple. This makes
226 your parser somewhat more compact and faster but makes it somewhat
227 more difficult to debug. It's a good idea to turn this switch off for
228 debugging.
229
230 \index{Configuration switches}\index{Auto init}
231 \agparamheading{auto init}{switch, default on}
232
233 This switch controls the initialization of any parser that is not
234 \agparam{event driven}. When it is on, the
235 \index{Initializer}initializer for your parser is automatically called
236 every time the parser is called.
237 This is the normal situation. On occasion, however, it
238 is desirable to call a parser several times without reinitializing it.
239 In this case, you may set the \agparam{auto init} parameter to off.
240 Should you do this, you must call the initializer yourself whenever
241 appropriate.
242 % XXX characterize the occasion...
243
244 When \agparam{event driven} is set, \agparam{auto init} has no effect.
245
246 \index{Configuration switches}\index{Auto resynch}
247 \agparamheading{auto resynch}{switch, default off}
248
249 Setting this switch causes AnaGram to include an automatic
250 resynchronization procedure in the parser. The resynchronization
251 procedure will be invoked upon encountering a syntax error and will
252 skip over input until it finds input characters or tokens consistent
253 with its state at the time of the error. The purpose of the
254 resynchronization procedure is to provide a simple way for your parser
255 to proceed in the event of syntax errors so that it can find more than
256 one syntax error on a given pass. The resynchronization procedure
257 uses a heuristic based on your own syntax. AnaGram itself uses this
258 technique to resynchronize after syntax errors in its input.
259
260 A disadvantage to using this resynchronization technique is that the
261 resynchronization procedure turns off all reduction procedures. The
262 reason is that the resynchronization may cause a number of reduction
263 procedures to be skipped. This means that the parameters for any
264 reduction procedures that might be called later would be suspect and
265 could cause serious problems. It seems more prudent simply to shut
266 them down. Semantically determined productions will subsequently, of
267 course, always use the default reduction token.
268
269 If you have a
270 \index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR}
271 macro, it will be called \emph{before} the resynchronization
272 process. It will also be called on subsequent syntax errors, so your
273 program will not lose control entirely.
274
275 If you use the auto resynchronization procedure, you must also specify
276 the \agparam{eof token} configuration parameter (see below) so that
277 the synchronizer doesn't inadvertently try to pass over the end of
278 file.
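
A minimal sketch of the two settings together, assuming the grammar
defines a terminal token named \agcode{end of file} (a hypothetical
name chosen for this example):

\begin{indentingcode}{0.4in}
auto resynch
eof token = end of file
\end{indentingcode}
If the grammar already has a terminal token called \agcode{eof}, the
second line is unnecessary.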
279
280 For other methods of recovering from syntax errors, see Chapter 9.
281
282 \index{Configuration switches}\index{Backtrack}
283 \agparamheading{backtrack}{switch, default on}
284
285 If your parser does not continue after encountering a syntax error,
286 you can speed up your parser and make it a little smaller by turning
287 off the \agparam{backtrack} switch. If \agparam{backtrack} is on,
288 AnaGram configures your parser so that in case of syntax error it can
289 undo any default reductions it might have made as a consequence of the
290 erroneous input. The purpose of such an undo function is to identify
291 the proper error frame and to maximize the probability of being able
292 to recover gracefully.
293
294 % XXX shouldn't these be indexed as ``obsolete parameters'' or
295 % something, with xrefs so if you look up ``Bottom margin'' in the
296 % index it says ``see ``obsolete parameters''''?
297 %
298 % Also, shouldn't the various obsolete parameters be described with
299 % the same text?
300 %
301 \index{Configuration parameters}\index{Bottom margin}
302 \agparamheading{bottom margin}{integer value, default = 3}
303
304 This is an obsolete parameter which was used in the DOS version of
305 AnaGram. It is no longer used, but is still recognized for the sake
306 of compatibility.
307
308 \index{Configuration switches}\index{Bright background}
309 \agparamheading{bright background}{switch, default on}
310
311 This configuration switch is not used in AnaGram 2.0. It is retained
312 for compatibility with configuration files used with the DOS versions
313 of AnaGram.
314
315 \index{Configuration switches}\index{Case sensitive}
316 \index{Case sensitivity}
317 \agparamheading{case sensitive}{switch, default on}
318
319 Use this switch to control how your parser deals with distinctions
320 between upper and lower case. When \agparam{case sensitive} is on,
321 AnaGram builds a parser which distinguishes upper from lower case.
322 When this switch is off, AnaGram builds a parser which ignores case
323 for all input. The parser still preserves case in token values,
324 however: although 'a' and 'A' would map to the same token, the value
325 of the token would still be the lower or upper case character
326 respectively.
327
328 % XXX the last bit could be explained more clearly. (something like
329 % ``parsers still preserve case'')
330
331 % XXX this should discuss character sets, locales, and other such
332 % garbage.
333
334 \index{Configuration parameters}\index{Compile command}
335 \agparamheading{compile command}{string, default = \agcode{NULL}}
336
337 This parameter is retained only for compatibility with the DOS version
338 of AnaGram. It is ignored in the Windows version.
339
340 \index{Configuration switches}\index{Const data}
341 \agparamheading{const data}{switch, default on}
342
343 The \agparam{const data} switch controls the use of \agcode{const}
344 qualifiers in generated C code. If the switch is on, all fixed data
345 arrays in the parser file will be qualified as \agcode{const}. The
346 \agparam{const data} switch is ignored if the \agparam{old style}
347 switch is set.
348
349 \index{Configuration parameters}\index{Context type}
350 %XXX: \index{context tracking} ?
351 \agparamheading{context type}{c data type, no default}
352
353 By default, \agparam{context type} is undefined. If you assign the
354 name of a C data type, AnaGram will implement ``context tracking'' in
355 your parser. See Chapter 9. The data type name can be either a
356 standard, pre-defined data type or one which you create with a
357 \agcode{typedef} statement.
358
359 \index{Configuration parameters}\index{Coverage file name}
360 \index{File extension}\index{nrc}
361 \agparamheading{coverage file name}{string, default = \agcode{"\#.nrc"}}
362
363 If you set the \agparam{rule coverage} configuration switch, AnaGram
364 will provide functions in your parser to read and write rule counts to
365 a file. The name of the file will be determined by \agparam{coverage
366 file name}. The name of your syntax file will be substituted for the
367 ``\agcode{\#}'' character.
368
369 \index{Configuration switches}\index{Declare pcb}
370 % XXX \index{Parser control block} ?
371 \agparamheading{declare pcb}{switch, default on}
372
373 When AnaGram builds a parser, it checks the status of the
374 \agparam{declare pcb} switch. If it is on, AnaGram declares a parser
375 control block for you. AnaGram creates the name of the control block
376 variable by appending \agcode{{\us}pcb} to the name of your parser.
377 AnaGram will also code an \agcode{\#include} statement to include your
378 parser header file, and will define the \agcode{PCB} macro for you.
379 If you wish to declare the parser control block yourself you should
380 turn this switch off.
381
382 \index{Configuration parameters}\index{Default input type}
383 \index{Input type}
384 % XXX: \index{Types} ?
385 \agparamheading{default input type}{c data type, default = \agcode{int}}
386
387 This parameter tells AnaGram what data type to assume for terminal
388 tokens if they are not explicitly declared. Normally, you would
389 explicitly declare terminal tokens only when you have set the
390 \agparam{input values} configuration switch. The default type for
391 nonterminal tokens is given by \agparam{default token type}.
392
393 \index{Configuration switches}\index{Default reductions}\index{Reduction}
394 \agparamheading{default reductions}{switch, default on}
395
396 If in a given parser state there is only one production that could be
397 possibly reduced, it is usually faster to reduce it on any input than
398 to check specifically for correct input before reducing it. The only
399 time this default reduction causes trouble is in the event of
400 erroneous input. In this situation you may get an erroneous
401 reduction. Normally when you are parsing a file, this is
402 inconsequential because you are not going to continue semantic action
403 in the presence of error. But, if you are using your parser to handle
404 real-time interactive input, you have to be able to continue semantic
405 processing after notifying your user that he has entered erroneous
406 input. In this case you would want to turn the \agparam{default
407 reductions} switch off so that productions are reduced only when there
408 is correct input.
409
410 \index{Configuration parameters}\index{Default token type}\index{token}
411 % XXX \index{Types} ?
412 \agparamheading{default token type}{c data type, default = \agcode{void}}
413
414 This parameter takes a C data type as its value. It is used to set
415 the data type for the semantic values of nonterminal tokens whose type
416 is not explicitly specified in the grammar. To set the default type
417 for terminal tokens use \agparam{default input type}.
418
419 \index{Diagnose errors}\index{Configuration switches}
420 \agparamheading{diagnose errors}{switch, default on}
421
422 If you set this switch, AnaGram will include a syntax error diagnostic
423 procedure in your parser. This procedure will be called before your
424 \index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro is
425 called. It will store a pointer to a string in the
426 \agcode{error{\us}message} field of your parser control
427 block. The string will contain a diagnostic message. If there is
428 only one syntactically correct input, x, for example, the message will
429 be ``Missing x''. Otherwise it will be ``Unexpected x'' if the input
430 is recognizable but incorrect and ``Unexpected input'' otherwise. If
431 the \agparam{error frame} switch has been set, the
432 \agcode{error{\us}frame{\us}ssx} and
433 \agcode{error{\us}frame{\us}token} fields
434 in the parser control block will be set as described in Chapter 9.
435
436 % XXX say: diagnose errors causes the token_names[] array to be
437 % included in the parser. and index token_names[]...
438
439 \index{Distinguish lexemes}\index{Configuration switches}
440 % XXX \index{Disregard} ?
441 \agparamheading{distinguish lexemes}{switch, default off}
442
443 The \agparam{distinguish lexemes} switch has no effect unless a
444 disregard token has been defined. Normally, the disregard token
445 (usually white space) is optional between lexemes. This may lead to
446 apparent shift-reduce conflicts if the characters that comprise the
447 second of two successive lexemes can be construed as part of the first
448 lexeme. In this situation, turning on the \agparam{distinguish
449 lexemes} switch effectively requires a disregard token to separate the
450 two lexemes.
451
452 \index{Edit command}\index{Configuration parameters}
453 \index{File extension}\index{syn}
454 \agparamheading{edit command}{string, default = \agcode{"ed \#.syn"}}
455
456 This parameter is no longer used and is retained only for file
457 compatibility with the DOS version of AnaGram.
458
459 \index{Enable mouse}\index{Configuration switches}
460 \agparamheading{enable mouse}{switch, default on}
461
462 This parameter is no longer used and is retained only for file
463 compatibility with the DOS version of AnaGram.
464
465 \index{Enum constant name}\index{Configuration parameters}
466 \agparamheading{enum constant name}{string,
467 default = \agcode{"\${\us}\%{\us}token"}}
468
469 Use the \agparam{enum constant name} parameter to control the names
470 AnaGram uses for the enumeration constants it defines in the
471 header file for your parser. The value of \agparam{enum constant
472 name} should be a string containing the ``\agcode{\%}'' character.
473 AnaGram will substitute each token name in turn for the
474 ``\agcode{\%}'' character in this template as it creates the list of
475 enumeration constants. If it finds a ``\agcode{\$}'' character it
476 will substitute the name of your parser.
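
As a worked example, assume a parser named \agcode{calc} with tokens
named \agcode{number} and \agcode{name} (all three names are
hypothetical). The default template,
\agcode{"\${\us}\%{\us}token"}, yields the enumeration constants
\agcode{calc{\us}number{\us}token} and
\agcode{calc{\us}name{\us}token}. With the setting

\begin{indentingcode}{0.4in}
enum constant name = "T\%"
\end{indentingcode}
the constants would instead be \agcode{Tnumber} and \agcode{Tname}.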
477
478 \index{Eof token}\index{Configuration parameters}\index{Token}
479 \agparamheading{eof token}{token name, no default}
480
481 If you use the auto resynchronization capability of AnaGram, you must
482 specify an end of file token explicitly. You can do this either by
483 specifying a terminal token in your grammar called \agcode{eof} or by
484 using the \agparam{eof token} parameter to identify some other
485 terminal token to be used as the end of file marker. You would do
486 this only if you must use the name \agcode{eof} for some other
487 purpose.
488
489 \index{Error frame}\index{Error frame}\index{Configuration switches}
490 \agparamheading{error frame}{switch, default off}
491
492 AnaGram uses the \agparam{error frame} switch in conjunction with the
493 \index{Diagnose errors}\index{Configuration switches}\agparam{diagnose errors}
494 switch. If both are set, when your parser encounters a syntax error,
495 before invoking the
496 \index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro,
497 your parser will determine the frame in which the error occurred, that
498 is, the production the parser was trying to match at the time of the
499 error.
500
501 % XXX: See chapter (dd.tex) for a complete discussion.
502
503 \index{Configuration parameters}\index{Error token}\index{Token}
504 \agparamheading{error token}{token name, no default}
505
506 One of your options for error recovery after a syntax error is a
507 technique similar to that provided in \agfile{yacc}. You include a
508 terminal token called \agcode{error} in your grammar. When the parser
509 encounters an error in the input it backs up the state stack to the
510 most recent state in which \agcode{error} was an acceptable input. It
511 then shifts to the new state as though it had seen an actual
512 \agcode{error} token. At this point, it skips over any character in
513 the input which is not an acceptable input character for this state.
514 Once it does find an acceptable input character, it continues
515 processing as though nothing had happened. If you wish to use this
516 approach and for some reason you wish to use the name \agcode{error}
517 for some other token in your grammar, you may use the \agparam{error
518 token} parameter to identify some other terminal token in your grammar
519 as the ``error token''.
520
521 \index{Configuration switches}\index{Error trace}\index{Trace}
522 \index{Window}
523 \agparamheading{error trace}{switch, default off}
524
525 If you turn the \agparam{error trace} switch on, AnaGram will include
526 code in your parser so that when it encounters a syntax error it will
527 write the contents of the \index{Parser state stack}\index{State
528 stack}\index{Stack}parser state stack to a file. The name of the file
529 is the same as the name of your syntax file but with the extension
530 \index{File extension}\index{etr}\agfile{.etr}. You may override this
531 definition by defining
532 \index{AG{\us}TRACE{\us}FILE{\us}NAME}\index{Macros}\agcode{AG{\us}TRACE{\us}FILE{\us}NAME}
533 in your embedded C.
534
535 The \agmenu{Error Trace} option in the \agmenu{Action} menu can then
536 read this information and prepare a pre-built \agwindow{Grammar Trace}
537 showing you the status of the parse at the time of the syntax error.
538 You would use this switch primarily when you are first checking out
539 your grammar to make sure it accurately represents the input you
540 desire to handle. You would also use it any time your parser
541 encounters a syntax error you don't understand. For more information,
542 see Chapter 5.
543
544 \index{Escape backslashes}\index{Configuration switches}
545 \agparamheading{escape backslashes}{switch, default off}
546
547 \agparam{Escape backslashes} is used only in conjunction with the
548 \agparam{line numbers} option. When turned on, it causes the
549 backslashes in the pathname generated by the \agparam{line numbers}
550 option to be doubled. This switch has been provided because C and C++
551 compilers are not consistent in their handling of backslashes in path
552 names.
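
As a sketch of the effect, assume a syntax file whose full path is
\agfile{C:\textbackslash{}proj\textbackslash{}calc.syn}; the path and
the line number shown here are hypothetical, and the exact layout of
the generated directive may differ. With \agparam{line numbers} on and
\agparam{escape backslashes} off, a directive would take the form

\begin{indentingcode}{0.4in}
\#line 12 "C:\textbackslash{}proj\textbackslash{}calc.syn"
\end{indentingcode}
whereas with \agparam{escape backslashes} on, each backslash is
doubled:

\begin{indentingcode}{0.4in}
\#line 12 "C:\textbackslash{}\textbackslash{}proj\textbackslash{}\textbackslash{}calc.syn"
\end{indentingcode}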
553
554 \index{Event driven}\index{Configuration switches}
555 % XXX \index{AG{\us}RUNNING{\us}CODE} ?
556 % XXX \index{exit{\us}flag} ?
557 \agparamheading{event driven}{switch, default off}
558
559 If you turn the \agparam{event driven} switch on, when you build a
560 parser, it will be configured as an ``event driven'' parser. This
561 means that after calling its initializer function, you call it once
562 with each discrete unit of input. The parser proceeds until it
563 needs more input, finishes the grammar, or encounters an error. It
564 then returns. The \agcode{exit{\us}flag} field in the parser control
565 block is equal to \agcode{AG{\us}RUNNING{\us}CODE} if more input is needed.
566 Other values indicate other reasons for termination.
567 % XXX crossreference the discussion of exit codes?
568
569 When \agparam{event driven} is on, \agparam{auto init} has no effect;
570 you must always call the initializer function yourself.
571
572 \index{Far tables}\index{Configuration switches}
573 \agparamheading{far tables}{switch, default = off}
574
575 If \agparam{far tables} is on when AnaGram builds a parser, it will
576 declare the larger tables it builds as \agcode{far}. This can be a
577 convenience when using some memory models of the 8086 architecture.
578
579 \index{Grammar token}\index{Configuration parameters}\index{Token}
580 \agparamheading{grammar token}{token name, no default}
581
582 The \agparam{grammar token} parameter may be used to specify the
583 grammar, or ``goal'', token for the syntax analyzer portion of
584 AnaGram. An alternative method is to append a ``\$'' to the goal
585 token when you define it. You may also simply use the name
586 \agcode{grammar} to identify the grammar token.
587
588 \index{Header file name}\index{Configuration parameters}\index{File name}
589 \agparamheading{header file name}{string, default = \agcode{"\#.h"}}
590
591 This parameter names the parser header file AnaGram generates. The
592 contents of the header file are described in Chapter 9. When AnaGram
593 creates the file, it copies the value of \agparam{header file name},
594 substituting the name of your syntax file for the ``\agcode{\#}''
595 character, in order to create the pathname and extension for the file.
596 You can therefore use this parameter to give the header file a
597 particular name, independent of the syntax file name, or to specify a
598 particular drive or directory where you want the header file to
599 reside. Note that if you include a full DOS/Windows pathname,
600 backslash characters must be quoted.
601
602 \index{Input values}\index{Configuration switches}
603 \agparamheading{input values}{switch, default off}
604
605 % XXX this shouldn't say ASCII because it's true even if the
606 % characters are some other character set...
607 If the input to your parser includes explicit token values which are
608 not simply the ASCII values of corresponding ASCII input characters,
609 you must set the \agparam{input values} switch to inform AnaGram.
610 Unless your parser is \agparam{event driven}, you must also provide
611 your own \agcode{GET{\us}INPUT} macro.
612
613 \index{Line length}\index{Configuration parameters}
614 \agparamheading{line length}{integer value, default = 80}
615
616 \agparam{Line length} is an obsolete configuration parameter, recognized
617 for the sake of compatibility with configuration files prepared for
618 the DOS version of AnaGram. It is ignored in AnaGram 2.0.
619
620 \index{Line numbers}\index{configuration switches}
621 \agparamheading{line numbers}{switch, default off}
622
623 If \agparam{line numbers} is set, AnaGram will put syntax file line
624 numbers into the generated C code file using the
625 \index{\#line}\agcode{\#line}
626 directive so that your compiler diagnostics will refer to lines in the
627 syntax file rather than in the generated C code file. If
628 \agparam{line numbers} is off, AnaGram will put syntax file line
629 numbers in comments. The
630 \index{Line numbers path}\index{Configuration parameters}
631 \agparam{line numbers path} and
632 \index{Escape backslashes}\index{Configuration switches}
633 \agparam{escape backslashes}
634 parameters may be used to control the generation of the line number
635 directives.
636
637 \index{Line numbers path}\index{Configuration parameters}
638 \agparamheading{line numbers path}{string, default = \agcode{NULL}}
639
640 When you have set the \agparam{line numbers} switch and
641 \agparam{line numbers path} is not \agcode{NULL}, AnaGram uses its value in the
642 \agcode{\#line} directive in place of the full path name of your
643 syntax file.
644 % XXX update for unix where we (maybe) don't generate full pathnames
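
As a rough sketch of how these settings fit together, a configuration
section might contain the statements below. This assumes the usual
AnaGram convention that naming a switch by itself turns it on; the
file name \agfile{ana.syn} is only an illustration.
\begin{indentingcode}{0.4in}
/* assumed syntax: naming a switch by itself turns it on */
line numbers
line numbers path = "ana.syn"
\end{indentingcode}
With settings like these, the \agcode{\#line} directives would name
\agfile{ana.syn} rather than the full path name of the syntax file.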
645
646 \index{Lines and columns}\index{Configuration switches}
647 \agparamheading{lines and columns}{switch, default on}
648
649 If this switch is set, AnaGram will incorporate code into your parser
650 to track line numbers and column numbers in its input. At all times,
651 the \agcode{line} and \agcode{column} fields in your parser control
652 block will mark the location of the current lookahead character. The
653 treatment of tab characters is controlled by the
654 \index{TAB{\us}SPACING}\index{Macros}\agcode{TAB{\us}SPACING} macro.
655
656 \index{Main program}\index{Configuration switches}
657 \agparamheading{main program}{switch, default on}
658
659 The \agparam{main program} switch determines what AnaGram does if you
660 invoke the Build Parser command, but have no embedded C in your syntax
661 file. If the switch is on, AnaGram creates a main program which does
662 nothing but call your parser. The switch is ignored if your parser
663 uses \agparam{pointer input} or is \agparam{event driven}.
664
665 \index{Max conflicts}\index{Configuration parameters}\index{Conflicts}
666 \agparamheading{max conflicts}{integer value, default = 50}
667
668 \agparam{Max conflicts} limits the number of conflicts AnaGram will
669 record. Sometimes, a simple editing error in your syntax file can
670 cause hundreds of conflicts, which you don't need to see in gory
671 detail. If you have a grammar that is in serious trouble and you want
672 to see more conflicts, you may change \agparam{max conflicts} to suit
673 your needs.
674
675 \index{Near functions}\index{Configuration switches}
676 \agparamheading{near functions}{switch, default off}
677
678 \agparam{Near functions} controls the use of the \agcode{near} keyword
679 for static functions in your parser. If your parser is to run on a
680 16-bit 80x86 processor you would want to turn it on. If you are
681 going to run your parser on some other processor or use a C compiler
682 that does not support the \agcode{near} keyword you should leave
683 \agparam{near functions} off.
684
685 \index{Configuration switches}\index{Nest comments}\index{Comments}
686 \agparamheading{nest comments}{switch, default off}
687
688 Use this switch to allow nested comments in your syntax or
689 configuration files. It defaults to off, in accordance with the ANSI
690 standard for C. Note that AnaGram scans comments in any embedded C
691 code as well as in the grammar specification. You may turn this
692 switch on and off as many times as necessary in a single file.
693
694 \index{Old style}\index{Configuration switches}
695 \agparamheading{old style}{switch, default off}
696
697 \agparam{Old style} controls the function definitions in the code
698 AnaGram generates. When \agparam{old style} is off, AnaGram generates
699 ANSI style calling sequences with prototypes as necessary. When
700 \agparam{old style} is on, it generates old style function definitions,
701 and no prototypes. It also causes the
702 \index{Const data}\index{Configuration switches}\agparam{const data}
703 switch to be ignored.
704
705 \index{Page length}\index{Configuration parameters}
706 \agparamheading{page length}{integer value, default = 66}
707
708 \agparam{Page length} is an obsolete configuration parameter,
709 recognized for the sake of compatibility with configuration files
710 prepared for the DOS version of AnaGram. It is ignored in AnaGram
711 2.0.
712
713 \index{Parser file name}\index{Configuration parameters}\index{File name}
714 \agparamheading{parser file name}{string, default = \agcode{"\#.c"}}
715
716 AnaGram creates a parser which consists of all the embedded C code in
717 your syntax file, the syntax tables created by the syntax analyzer,
718 and a parsing engine configured to your requirements. This code is
719 written to a file whose name is given by this parameter. When AnaGram
720 creates your parser file, it copies the value of the \agparam{parser
721 file name} parameter, substituting the name of your syntax file for
722 the ``\agcode{\#}'' character, in order to create the pathname and
723 extension for the file. You can therefore use this parameter to give
724 the parser file a particular name, independent of the syntax file
725 name, or to specify a particular drive or directory where you want the
726 parser file to reside. Note that if you include a full DOS/Windows
727 pathname, you must quote the backslash characters. If you are writing
728 a C++ parser, you can use this parameter to set the output filename suffix.
729
730 \index{Parser}\index{Parser name}\index{Configuration parameters}
731 \agparamheading{parser name}{string, default = \agcode{"\$"}}
732
733 % XXX This should say something other than ``name your parser''
734 AnaGram uses the value of \agparam{parser name} to name your parser,
735 substituting the name (not including the extension) of your syntax
736 file for a ``\agcode{\$}'' character. If you accept the default value of
737 \agparam{parser name} and have a syntax file called \agfile{ana.syn},
738 AnaGram will name your parser \agcode{ana}.
739
740 The \index{Initializer}initializer for your parser will have the same
741 name preceded by \agcode{init{\us}}. In the above example, the
742 initializer would be called \agcode{init{\us}ana}.
743
744 \index{Configuration parameters}\index{Stack}\index{Parser stack alignment}
745 \agparamheading{parser stack alignment}{c data type, default = \agcode{int}}
746
747 \agparam{Parser stack alignment} is used to control byte alignment of
748 the parser stack, \agcode{PCB.vs}. AnaGram normally adds a field of
749 the specified data type to the \agcode{union} declaration that defines
750 the data type for the parser stack. This parameter can be used to
751 deal with byte alignment problems when a parser is to be run on a
752 processor with byte alignment restrictions. For instance, if your
753 grammar has tokens of type \agcode{double} and your processor requires
754 double precision variables to be properly aligned, you can include the
755 following statement in a configuration section in your grammar or in
756 your configuration file:
757 \begin{indentingcode}{0.4in}
758 parser stack alignment = double
759 \end{indentingcode}
760 If the data type is \agcode{void}, no alignment declaration will be
761 made.
762 % You will not need to change this parameter if your parser is to
763 % run on a PC or compatible processor.
764 %
765 % XXX this really ought to be updated for the century of the fruitbat
766
767 \index{Configuration parameters}\index{Parser stack size}
768 \agparamheading{parser stack size}{integer value, default = 32}
769
770 \agparam{Parser stack size} is used to set the sizes of the parser
771 stacks in your parser control block. When AnaGram analyzes your
772 grammar, it determines the minimum amount of stack space required for
773 the deepest left recursion. To this depth it adds one half the value
774 of the \agparam{parser stack size} parameter. It then sets the actual
775 stack size to the larger of this value and the \agparam{parser stack
776 size} parameter. If you find the default of 32 wastefully large or
777 dangerously small, you can set the parameter to suit the needs of your particular parser.
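
As a worked example of this rule, write $d$ for the stack depth AnaGram
computes from the grammar and $s$ for the value of
\agparam{parser stack size}. Both symbols, and the depths used here,
are introduced only for illustration. The actual stack size is then
\[
\max\left(d + \frac{s}{2},\; s\right).
\]
With the default $s = 32$, a grammar needing $d = 10$ gets a stack
size of $\max(10 + 16,\, 32) = 32$, while one needing $d = 20$ gets
$\max(20 + 16,\, 32) = 36$.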
778
779 \index{Pointer input}\index{Configuration switches}
780 \agparamheading{pointer input}{switch, default off}
781
782 When you turn \agparam{pointer input} on you tell AnaGram that the
783 input to your parser is in memory and can be scanned simply by
784 incrementing a pointer. Before calling your parser you should make
785 sure that the \agcode{pointer} field in your parser control block is
786 properly initialized to point to the first character or token in your
787 input.
788
789 Use the parameter
790 \index{Pointer type}\index{Configuration parameters}\agparam{pointer type}
791 to specify the type of the pointer. The default value of
792 \agparam{pointer type} is \agcode{unsigned char *}.
793
794 \index{Pointer type}\index{Configuration parameters}
795 \agparamheading{pointer type}{c data type, default = \agcode{unsigned char *}}
796
797 If you have set the \agparam{pointer input} switch, AnaGram will use
798 the value of the \agparam{pointer type} parameter to declare the
799 \agcode{pointer} field in your parser control block.
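
A minimal sketch of the corresponding configuration statements,
assuming that naming a switch by itself turns it on and that the input
is held in an ordinary array of \agcode{char}:
\begin{indentingcode}{0.4in}
/* assumed syntax: naming a switch by itself turns it on */
pointer input
pointer type = char *
\end{indentingcode}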
800
801 \index{Print file name}\index{Configuration parameters}\index{File name}
802 \agparamheading{print file name}{string, default = \agcode{"LPT1"}}
803
804 \agparam{Print file name} is an obsolete configuration parameter,
805 recognized for the sake of compatibility with configuration files
806 prepared for the DOS version of AnaGram. It is ignored by AnaGram
807 2.0.
808
809 \index{Quick reference}\index{Configuration switches}
810 \agparamheading{quick reference}{switch, default off}
811
812 The \agparam{quick reference} switch is no longer used, but is still
813 recognized for compatibility's sake. In future versions of AnaGram it
814 may no longer be recognized.
815
816 \index{Configuration switches}\index{Reduction choices}
817 \agparamheading{reduction choices}{switch, default off}
818
819 If the \agparam{reduction choices} switch is set when AnaGram builds a
820 parser, it will include in your parser file a function which can
821 identify the acceptable choices for the reduction token in the current
822 state. You would use this switch only if you were using semantically
823 determined productions in your grammar and if there were states in
824 which not all the tokens on the left side of the production were valid
825 reduction tokens.
826
827 \index{Rule coverage}\index{Configuration switches}\index{Coverage}
828 \agparamheading{rule coverage}{switch, default off}
829
830 If you set the \agparam{rule coverage} switch, AnaGram will include
831 code in your parser to count the number of times your parser identifies
832 each rule in your grammar. To maintain the counts, AnaGram declares,
833 at the beginning of your parser, an integer array, whose name is
834 created by appending \agcode{{\us}nrc} to the name of your parser. The
835 array contains one counter for each rule you have defined in your
836 grammar. There are no entries for the auxiliary rules that AnaGram
837 creates to deal with set overlaps or disregard statements. In order
838 to identify every rule that the parser reduces in the course of
839 execution, AnaGram
840 has to turn off certain optimization features in your parser.
841 Therefore, a parser that has the \agparam{rule coverage} switch
842 enabled will run slightly slower than one with the switch off. An
843 entry on the \agmenu{Browse} menu allows you to view the coverage data.
844 % XXX See Chapter ???.
845
846 \index{Tab spacing}\index{Configuration parameters}
847 \agparamheading{tab spacing}{integer value, default = 8}
848
849 \agparam{Tab spacing} controls the expansion of tabs when AnaGram
850 displays your syntax file or the \agwindow{File Trace} test file.
851
852 The value of \agparam{tab spacing} is also used to set the default
853 value of the \index{TAB{\us}SPACING}\index{Macros}\agcode{TAB{\us}SPACING}
854 macro in your parser.
855
856 The default value of \agparam{tab spacing} is 8. If you prefer a
857 different value, you should probably include an appropriate statement
858 in your configuration file. For example:
859
860 \begin{indentingcode}{0.4in}
861 tab spacing = 2
862 \end{indentingcode}
863
864 \index{Test file binary}\index{Configuration switches}
865 \agparamheading{test file binary}{switch, default off}
866
867 \agparam{Test file binary} causes \agwindow{File Trace} to read test
868 files in binary mode. When \agwindow{File Trace} reads a test file,
869 it normally reads it in text mode, which in Windows causes carriage return
870 characters to be stripped out. Occasionally it is necessary to test a
871 grammar where carriage return characters are important and should not
872 be stripped. In this situation, set \agparam{test file binary} to on,
873 and the carriage return characters will not be discarded.
874 % XXX rewrite the second half of this paragraph?
875
876 \index{Test file mask}\index{Configuration parameters}
877 \agparamheading{test file mask}{string, default = \agcode{"*.*"}}
878
879 % XXX default should be ``*'' on unix
880 AnaGram uses \agparam{test file mask} to filter the pick list of test
881 files when you use the
882 \index{File Trace}\index{Trace}\index{Window}\agwindow{File Trace}
883 feature.
884 You may set it to any value you wish, including a pathname.
885 % XXX: test this
886 For instance, if you know that all your test files are in the directory
887 \agfile{C:{\bs}PROJECT{\bs}SOURCE} and have the
888 extension \agfile{.FOO}, you could set \agparam{test file mask} to
889 \agcode{"C:{\bs\bs}PROJECT{\bs\bs}SOURCE{\bs\bs}*.FOO"}.
890 Note that, as in any string literal, backslash characters must be
891 escaped.
892
893 \index{Test range}\index{Configuration switches}\index{Range}
894 \agparamheading{test range}{switch, default off}
895 % XXX should this really default to off?
896
897 When \agparam{test range} is on, AnaGram will insert code in your
898 parser to make sure all input characters or token identifiers are
899 within the range specified in your grammar. If you do not turn this
900 switch on, your parser will run slightly faster, but its behavior will
901 be undefined if it gets input outside the range you have specified
902 in your grammar.
903
904 \index{Token names}\index{Configuration switches}
905 \agparamheading{token names}{switch, default off}
906
907 When \agparam{token names} is set, AnaGram includes a static array of
908 ASCII strings in your parser containing the names of your tokens. The
909 name of this array is \agcode{\#{\us}token{\us}names} where the
910 ``\agcode{\#}'' character is replaced with the name of your parser.
911 The entry for tokens which do not have names is an empty string:
912 \agcode{""}.
913
914 \index{Top margin}\index{Configuration parameters}
915 \agparamheading{top margin}{integer value, default = 3}
916
917 \agparam{Top margin} is an obsolete configuration parameter,
918 recognized for the sake of compatibility with configuration files
919 prepared for the DOS version of AnaGram. It is ignored by AnaGram
920 2.0.
921
922 \index{Traditional engine}\index{Configuration switches}
923 \agparamheading{traditional engine}{switch, default off}
924
925 Traditional LALR-1 parsers use a parsing engine which has only four
926 actions: shift, reduce, accept, and error. AnaGram, in the interests
927 of faster execution and more compact tables, uses a parsing engine
928 with a number of short-cut actions. The \agparam{traditional engine}
929 switch tells AnaGram not to use the short-cut actions.
930
931 You would set this switch primarily in conjunction with use of the
932 \index{Grammar Trace}\index{Trace}\index{Window}\agwindow{Grammar Trace}
933 in order to have a clearer idea of what is happening. AnaGram will
934 then be using the same parsing actions as textbook parsers. Note that
935 if a lookahead token has already been selected, AnaGram will display
936 it on the last line of the \agwindow{Parser Stack} pane in the
937 \agwindow{Grammar Trace} window.
938 % XXX what is this note doing here?
939
940 You should turn this switch back off when you have finished debugging,
941 or your parser will be larger and slower than necessary.
942
943 % XXX: say that in production code traditional engine is not useful
944 % and only serves to slow things down.
945
946 \index{Video mode}\index{Configuration parameters}
947 \agparamheading{video mode}{integer value, default = $-$1}
948
949 \agparam{Video mode} is an obsolete configuration parameter,
950 recognized for the sake of compatibility with configuration files
951 prepared for the DOS version of AnaGram. It is ignored by AnaGram
952 2.0.