author | David A. Holland |
---|---|
date | Tue, 31 May 2022 01:00:55 -0400 (2022-05-31) |
\chapter{Configuration Parameters}
\index{Configuration parameters}\index{Parameters}

\agterm{Configuration parameters} are named constants that control the
way AnaGram works. AnaGram ignores case\index{Case sensitivity} when
it looks up the names of configuration parameters, so that
\agcode{parser name} and \agcode{Parser Name} both refer to the same
parameter. Configuration parameters that have only true/false or
on/off values are often referred to as
\index{Configuration switches}\agterm{configuration switches}.

Configuration parameters are used to control:

\begin{itemize}
\item Comment nesting
\item Grammar analysis
\item Parser generation
\end{itemize}

Every configuration parameter has a default value, chosen to
correspond to a standard if one exists, to customary usage if such can
be determined, or otherwise to the most likely usage.

Configuration parameters may be specified either in
\index{Configuration file}\index{File}\agparam{configuration files},
always named \agfile{AnaGram.cfg}, or in a syntax file. A
configuration file is a normal ASCII file containing parameter
specifications. The syntax of a configuration file is the same as that
of a configuration section within a syntax file, except that a
configuration file does not have the brackets ( \agcode{[ ]} ) that
enclose a configuration section in a syntax file. You may comment the
configuration file freely, just as though it were a syntax file.

% XXX ``configuration section'' is a forward reference and we should
% rearrange all this so it isn't.

% Parameters can be set in either a configuration file or in your syntax
% file.

Apart from the \agparam{nest comments} switch, if a parameter is
specified more than once, only the last value is used (see below).
The \agparam{nest comments} switch, which affects the way AnaGram
reads your configuration and syntax files, takes effect as soon as
AnaGram encounters it in a file and stays in effect unless it is later
turned off.

% XXX this should be belabored less. Also, good practice dictates that
% if you ship a project or a grammar it should compile in someone
% else's environment, and we shouldn't encourage people to do things
% like put \agparam{pointer input} in a systemwide AnaGram.cfg.
%
% XXX also in the Unix world it ought to read
% /usr/local/etc/AnaGram.cfg and then also ~/.AnaGram.cfg - or
% something like that. And it ought to be possible to set params
% on the agcl command line. We need to think about this. (Well,
% there's not really any valid use for either, so perhaps it
% doesn't matter.)
%
% How about something like
%
% Support for a global configuration file dates from the DOS-based
% AnaGram 1.x, where the same configuration mechanism was used to
% establish user interface preferences. AnaGram 2.0 and above handle
% preferences separately, and the configuration system is only used
% for code-related options. Since good practice dictates that code
% should continue to work if exported outside of one's personal
% environment, there are few or no legitimate uses of the global
% configuration file and support for it will likely be removed in a
% future AnaGram release.
%
% (But there really should be support for params on the agcl command
% line; if nothing else it would make it a lot easier to test
% combinations of settings.)
%

On initialization, AnaGram checks the directory that contains the
AnaGram executable file. If it finds \agfile{AnaGram.cfg}, it reads it
and sets internal parameters accordingly. It then looks for
\agfile{AnaGram.cfg} in your working directory and, if it finds it,
reads it in turn. If any parameter is set in both files, the last
setting wins.
The effect of this two-stage process is to allow you to set your
standard preferences in the principal directory, with specific
overrides in your working directories. You may also put configuration
parameters in your syntax file, which override the settings in the
configuration files. Note that neither configuration file is
necessary.

Before executing an Analyze Grammar or Build Parser command, AnaGram
resets configuration parameters to their initial values, as determined
by the built-in defaults and the configuration files read at program
initialization.

There are, therefore, four levels at which parameters may be set. At
the first level, there are the settings built into AnaGram. If you
don't like some of these, you can override them with a configuration
file at the second level, the tools directory where you installed
AnaGram. If a particular project needs overrides, you can put them in
a configuration file at the third level, the working directory for
this project. And if you have specific configuration requirements for
a particular parser, the best place for them is the fourth level, the
syntax file for the parser.

For all of this flexibility, some people prefer to set every
configuration parameter explicitly in their syntax files so there is
no question as to what setting is being used. AnaGram is set up so you
can do it whichever way you prefer.

If you are uncertain as to the actual parameters that AnaGram is using
at any time, the \index{Configuration Parameters}\index{Window}
\agwindow{Configuration Parameters} window listed in the
\agmenu{Windows} menu will show you the current state of all
parameters.

The different varieties of configuration parameters are described
below. Each definition of a parameter must start on a new line. A
configuration file is just a sequence of parameter definitions, each
on a separate line. Blank lines can be used as separators where you
please, and comments may be used as described for syntax files.
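
As a minimal sketch of these rules, a working-directory
\agfile{AnaGram.cfg} for a project might contain the lines below. The
parameter names are ones described in this chapter; the values chosen,
and the idea of collecting debugging settings this way, are
illustrative assumptions rather than recommendations. The comment
assumes the C-style comment syntax used in syntax files.
\begin{indentingcode}{0.4in}
/* hypothetical project overrides (no brackets in a .cfg file) */
\~{}allow macros
line numbers
max conflicts = 100
\end{indentingcode}
Since settings in a syntax file override those in the configuration
files, a syntax file in the same directory could still change any of
these for one particular parser.
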
Case\index{Case sensitivity} is ignored for parameter names (but not
for the whole definition).

In a syntax file, each set of definitions must be enclosed with
brackets ( \agcode{[ ]} ), forming a
\index{Configuration section}\agterm{configuration section}, one of
the four kinds of AnaGram statements. Configuration sections can be
scattered throughout a syntax file, but each section should begin on a
new line, and following statements should also of course start on new
lines. There is no restriction on the number of sections, or on the
number of times a parameter appears. The last setting of a parameter
wins.

The first variety of configuration parameter is a simple
\index{Switches}\index{Configuration switches}switch that controls one
of the various features of AnaGram. Such parameters are also called
\agterm{configuration switches}. They need simply be stated to set the
condition (turn it on) or negated with the tilde (\agcode{\~{}}) to
reset the condition (turn it off). Thus
\begin{indentingcode}{0.4in}
nest comments
\end{indentingcode}
causes AnaGram to allow nested comments, and
\begin{indentingcode}{0.4in}
\~{}nest comments
\end{indentingcode}
causes AnaGram to disallow nested comments.

You may also set or reset configuration switches with explicit on or
off values:
\begin{indentingcode}{0.4in}
nest comments = on
nest comments = off
\end{indentingcode}

A second variety of configuration parameter takes a value which is the
name of a token. Thus
\begin{indentingcode}{0.4in}
grammar token = c grammar
\end{indentingcode}
specifies that the token \agcode{c grammar} is the grammar token that
AnaGram should use as the starting point for analyzing your grammar.

A third variety of configuration parameter takes a value which is a C
or C++ data type. Thus
\begin{indentingcode}{0.4in}
default token type = unsigned char *
\end{indentingcode}
signifies that the value of a token, unless otherwise specified, is a
pointer to an \agcode{unsigned char}.
AnaGram does not accept the full panoply of C and C++
\index{Data type}data types. The restrictions are that AnaGram does
not allow specification of array or function types, nor explicit
structure types. Types that are defined with typedef statements,
structure definitions, or class definitions, including template
classes, in your embedded C or C++ are acceptable. If you have more
complex data types, you should define a simple name using a typedef
statement.

A fourth variety of configuration parameter takes a string value to
set an ASCII string used by AnaGram. Thus
\begin{indentingcode}{0.4in}
header file name = "widget.h"
\end{indentingcode}
signifies that the header file created by AnaGram should be called
\agfile{widget.h}.

In those strings which are used to name the parser or files which
AnaGram builds, the character ``\agcode{\#}'' is used to indicate that
AnaGram should substitute the name of your syntax file. In strings
used to determine the names of program variables or functions,
``\agcode{\$}'' is used to indicate that AnaGram should substitute the
name of your parser. When building enumeration constants for the names
of the tokens in your grammar, ``\agcode{\%}'' will be replaced by the
name of the token.

The final variety of configuration parameter takes a numeric value.
The value may be decimal, octal or hexadecimal, following the C
conventions, and may have an optional sign. Thus
\begin{indentingcode}{0.4in}
parser stack size = 50
\end{indentingcode}
tells AnaGram to allocate space for at least fifty stack entries when
it creates your parser.

If AnaGram does not recognize a parameter, it will give you a warning
with line number, column number, and the message ``no such
parameter''. If the value for a parameter is inappropriate, such as a
string value for a parameter which should have a numeric value, the
message will be ``inappropriate value''.
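
To illustrate the two warnings, consider the following sketch of a
configuration section. The misspelled name and the string value are
deliberate, hypothetical mistakes. Going by the description above, the
second definition would be expected to draw ``no such parameter'' and
the third ``inappropriate value'', while the first is accepted.
\begin{indentingcode}{0.4in}
{[}
  parser stack size = 50
  parser stak size = 50
  parser stack size = "fifty"
]
\end{indentingcode}
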
If the error occurs in the configuration file found in the AnaGram
directory, AnaGram will prefix the warning with the complete path name
for the file. If the error occurs in the configuration file in your
working directory, AnaGram will prefix the warning with
``AnaGram.cfg:''. If AnaGram encounters a syntax error while reading a
configuration file, it will honor the parameter settings it found
before the syntax error, but will ignore everything that follows the
error.

\section{Alphabetic Listing of Configuration Parameters}

\index{Configuration switches}\index{Allow macros}\index{Macros}
\agparamheading{allow macros}{switch, default on}

When this switch is set, i.e., on, reduction procedures will be
implemented as macros if they are sufficiently simple. This makes your
parser somewhat more compact and faster but makes it somewhat more
difficult to debug. It's a good idea to turn this switch off for
debugging.

\index{Configuration switches}\index{Auto init}
\agparamheading{auto init}{switch, default on}

This switch controls the initialization of any parser that is not
\agparam{event driven}. When it is on, the
\index{Initializer}initializer for your parser is automatically called
every time the parser is called. This is the normal situation. On
occasion, however, it is desirable to call a parser several times
without reinitializing it. In this case, you may set the
\agparam{auto init} parameter to off. Should you do this, you must
call the initializer yourself whenever appropriate.
% XXX characterize the occasion...

When \agparam{event driven} is set, \agparam{auto init} has no effect.

\index{Configuration switches}\index{Auto resynch}
\agparamheading{auto resynch}{switch, default off}

Setting this switch causes AnaGram to include an automatic
resynchronization procedure in the parser.
The resynchronization procedure will be invoked upon encountering a
syntax error and will skip over input until it finds input characters
or tokens consistent with its state at the time of the error. The
purpose of the resynchronization procedure is to provide a simple way
for your parser to proceed in the event of syntax errors so that it
can find more than one syntax error on a given pass. The
resynchronization procedure uses a heuristic based on your own syntax.
AnaGram itself uses this technique to resynchronize after syntax
errors in its input.

A disadvantage to using this resynchronization technique is that the
resynchronization procedure turns off all reduction procedures. The
reason is that the resynchronization may cause a number of reduction
procedures to be skipped. This means that the parameters for any
reduction procedures that might be called later would be suspect and
could cause serious problems. It seems more prudent simply to shut
them down. Semantically determined productions will subsequently, of
course, always use the default reduction token.

If you have a
\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro,
it will be called \emph{before} the resynchronization process. It will
also be called on subsequent syntax errors, so your program will not
lose control entirely.

If you use the auto resynchronization procedure, you must also specify
the \agparam{eof token} configuration parameter (see below) so that
the synchronizer doesn't inadvertently try to pass over the end of
file.

For other methods of recovering from syntax errors, see Chapter 9.

\index{Configuration switches}\index{Backtrack}
\agparamheading{backtrack}{switch, default on}

If your parser does not continue after encountering a syntax error,
you can speed up your parser and make it a little smaller by turning
off the \agparam{backtrack} switch.
If \agparam{backtrack} is on, AnaGram configures your parser so that
in case of syntax error it can undo any default reductions it might
have made as a consequence of the erroneous input. The purpose of such
an undo function is to identify the proper error frame and to maximize
the probability of being able to recover gracefully.

% XXX shouldn't these be indexed as ``obsolete parameters'' or
% something, with xrefs so if you look up ``Bottom margin'' in the
% index it says ``see ``obsolete parameters''''?
%
% Also, shouldn't the various obsolete parameters be described with
% the same text?
%

\index{Configuration parameters}\index{Bottom margin}
\agparamheading{bottom margin}{integer value, default = 3}

This is an obsolete parameter which was used in the DOS version of
AnaGram. It is no longer used, but is still recognized for the sake of
compatibility.

\index{Configuration switches}\index{Bright background}
\agparamheading{bright background}{switch, default on}

This configuration switch is not used in AnaGram 2.0. It is retained
for compatibility with configuration files used with the DOS versions
of AnaGram.

\index{Configuration switches}\index{Case sensitive}
\index{Case sensitivity}
\agparamheading{case sensitive}{switch, default on}

Use this switch to control how your parser deals with distinctions
between upper and lower case. When \agparam{case sensitive} is on,
AnaGram builds a parser which distinguishes upper from lower case.
When this switch is off, AnaGram builds a parser which ignores case
for all input. This does not mean that the values of character set
tokens are not case sensitive. Although 'a' and 'A' would map to the
same token, the values would still be lower and upper case
respectively.

% XXX the last bit could be explained more clearly. (something like
% ``parsers still preserve case'')
% XXX this should discuss character sets, locales, and other such
% garbage.

\index{Configuration parameters}\index{Compile command}
\agparamheading{compile command}{string, default = \agcode{NULL}}

This parameter is retained only for compatibility with the DOS version
of AnaGram. It is ignored in the Windows version.

\index{Configuration switches}\index{Const data}
\agparamheading{const data}{switch, default on}

The \agparam{const data} switch controls the use of \agcode{const}
qualifiers in generated C code. If the switch is on, all fixed data
arrays in the parser file will be qualified as \agcode{const}. The
\agparam{const data} switch is ignored if the \agparam{old style}
switch is set.

\index{Configuration parameters}\index{Context type}
%XXX: \index{context tracking} ?
\agparamheading{context type}{c data type, no default}

By default, \agparam{context type} is undefined. If you assign the
name of a C data type, AnaGram will implement ``context tracking'' in
your parser. See Chapter 9. The data type name can be either a
standard, pre-defined data type or one which you create with a
\agcode{typedef} statement.

\index{Configuration parameters}\index{Coverage file name}
\index{File extension}\index{nrc}
\agparamheading{coverage file name}{string, default = \agcode{"\#.nrc"}}

If you set the \agparam{rule coverage} configuration switch, AnaGram
will provide functions in your parser to read and write rule counts to
a file. The name of the file will be determined by
\agparam{coverage file name}. The name of your syntax file will be
substituted for the ``\agcode{\#}'' character.

\index{Configuration switches}\index{Declare pcb}
% XXX \index{Parser control block} ?
\agparamheading{declare pcb}{switch, default on}

When AnaGram builds a parser, it checks the status of the
\agparam{declare pcb} switch. If it is on, AnaGram declares a parser
control block for you. AnaGram creates the name of the control block
variable by appending \agcode{{\us}pcb} to the name of your parser.
AnaGram will also code a \agcode{\#include} statement to include your
parser header file, and will define the \agcode{PCB} macro for you. If
you wish to declare the parser control block yourself you should turn
this switch off.

\index{Configuration parameters}\index{Default input type}
\index{Input type}
% XXX: \index{Types} ?
\agparamheading{default input type}{c data type, default = \agcode{int}}

This parameter tells AnaGram what data type to assume for terminal
tokens if they are not explicitly declared. Normally, you would
explicitly declare terminal tokens only when you have set the
\agparam{input values} configuration switch. The default type for
nonterminal tokens is given by \agparam{default token type}.

\index{Configuration switches}\index{Default reductions}\index{Reduction}
\agparamheading{default reductions}{switch, default on}

If in a given parser state there is only one production that could
possibly be reduced, it is usually faster to reduce it on any input
than to check specifically for correct input before reducing it. The
only time this default reduction causes trouble is in the event of
erroneous input. In this situation you may get an erroneous reduction.
Normally when you are parsing a file, this is inconsequential because
you are not going to continue semantic action in the presence of
error. But, if you are using your parser to handle real-time
interactive input, you have to be able to continue semantic processing
after notifying your user that the input was erroneous. In this case
you would want to turn the \agparam{default reductions} switch off so
that productions are reduced only when there is correct input.

\index{Configuration parameters}\index{Default token type}\index{token}
% XXX \index{Types} ?
\agparamheading{default token type}{c data type, default = \agcode{void}}

This parameter takes a C data type as its value.
It is used to set the data type for the semantic values of nonterminal
tokens whose type is not explicitly specified in the grammar. To set
the default type for terminal tokens use \agparam{default input type}.

\index{Diagnose errors}\index{Configuration switches}
\agparamheading{diagnose errors}{switch, default on}

If you set this switch, AnaGram will include a syntax error diagnostic
procedure in your parser. This procedure will be called before your
\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro
is called. It will store a pointer to a string in the
\agcode{error{\us}message} field of your parser control block. The
string will contain a diagnostic message. If there is only one
syntactically correct input, x, for example, the message will be
``Missing x''. Otherwise it will be ``Unexpected x'' if the input is
recognizable but incorrect and ``Unexpected input'' otherwise.

If the \agparam{error frame} switch has been set, the
\agcode{error{\us}frame{\us}ssx} and \agcode{error{\us}frame{\us}token}
fields in the parser control block will be set as described in
Chapter 9.

% XXX say: diagnose errors causes the token_names[] array to be
% included in the parser. and index token_names[]...

\index{Distinguish lexemes}\index{Configuration switches}
% XXX \index{Disregard} ?
\agparamheading{distinguish lexemes}{switch, default off}

The \agparam{distinguish lexemes} switch has no effect unless a
disregard token has been defined. Normally, the disregard token
(usually white space) is optional between lexemes. This may lead to
apparent shift-reduce conflicts if the characters that comprise the
second of two successive lexemes can be construed as part of the first
lexeme. In this situation, turning on the
\agparam{distinguish lexemes} switch effectively requires a disregard
token to separate the two lexemes.

\index{Edit command}\index{Configuration parameters}
\index{File extension}\index{syn}
\agparamheading{edit command}{string, default = \agcode{"ed \#.syn"}}

This parameter is no longer used and is retained only for file
compatibility with the DOS version of AnaGram.

\index{Enable mouse}\index{Configuration switches}
\agparamheading{enable mouse}{switch, default on}

This parameter is no longer used and is retained only for file
compatibility with the DOS version of AnaGram.

\index{Enum constant name}\index{Configuration parameters}
\agparamheading{enum constant name}{string, default = \agcode{"\${\us}\%{\us}token"}}

Use the \agparam{enum constant name} parameter to control the names
AnaGram uses for the enumeration constants it defines in the header
file for your parser. The value of \agparam{enum constant name} should
be a string containing the ``\agcode{\%}'' character. AnaGram will
substitute each token name in turn for the ``\agcode{\%}'' character
in this template as it creates the list of enumeration constants. If
it finds a ``\agcode{\$}'' character it will substitute the name of
your parser.

\index{Eof token}\index{Configuration parameters}\index{Token}
\agparamheading{eof token}{token name, no default}

If you use the auto resynchronization capability of AnaGram, you must
specify an end of file token explicitly. You can do this either by
specifying a terminal token in your grammar called \agcode{eof} or by
using the \agparam{eof token} parameter to identify some other
terminal token to be used as the end of file marker. You would do this
only if you must use the name \agcode{eof} for some other purpose.

\index{Error frame}\index{Configuration switches}
\agparamheading{error frame}{switch, default off}

AnaGram uses the \agparam{error frame} switch in conjunction with the
\index{Diagnose errors}\index{Configuration switches}\agparam{diagnose errors}
switch.
If both are set, when your parser encounters a syntax error, before
invoking the
\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro,
your parser will determine the frame in which the error occurred, that
is, the production the parser was trying to match at the time of the
error.
% XXX: See chapter (dd.tex) for a complete discussion.

\index{Configuration parameters}\index{Error token}\index{Token}
\agparamheading{error token}{token name, no default}

One of your options for error recovery after a syntax error is a
technique similar to that provided in \agfile{yacc}. You include a
terminal token called \agcode{error} in your grammar. When the parser
encounters an error in the input it backs up the state stack to the
most recent state in which \agcode{error} was an acceptable input. It
then shifts to the new state as though it had seen an actual
\agcode{error} token. At this point, it skips over any character in
the input which is not an acceptable input character for this state.
Once it does find an acceptable input character, it continues
processing as though nothing had happened.

If you wish to use this approach and for some reason you wish to use
the name \agcode{error} for some other token in your grammar, you may
use the \agparam{error token} parameter to identify some other
terminal token in your grammar as the ``error token''.

\index{Configuration switches}\index{Error trace}\index{Trace}
\index{Window}
\agparamheading{error trace}{switch, default off}

If you turn the \agparam{error trace} switch on, AnaGram will include
code in your parser so that when it encounters a syntax error it will
write the contents of the
\index{Parser state stack}\index{State stack}\index{Stack}parser state
stack to a file. The name of the file is the same as the name of your
syntax file but with the extension
\index{File extension}\index{etr}\agfile{.etr}.
You may override this definition by defining
\index{AG{\us}TRACE{\us}FILE{\us}NAME}\index{Macros}\agcode{AG{\us}TRACE{\us}FILE{\us}NAME}
in your embedded C. The \agmenu{Error Trace} option in the
\agmenu{Action} menu can then read this information and prepare a
pre-built \agwindow{Grammar Trace} showing you the status of the parse
at the time of the syntax error.

You would use this switch primarily when you are first checking out
your grammar to make sure it accurately represents the input you
desire to handle. You would also use it any time your parser
encounters a syntax error you don't understand. For more information,
see Chapter 5.

\index{Escape backslashes}\index{Configuration switches}
\agparamheading{escape backslashes}{switch, default off}

\agparam{Escape backslashes} is used only in conjunction with the
\agparam{line numbers} option. When turned on, it causes the
backslashes in the pathname generated by the \agparam{line numbers}
option to be doubled. This switch has been provided because C and C++
compilers are not consistent in their handling of backslashes in path
names.

\index{Event driven}\index{Configuration switches}
% XXX \index{AG{\us}RUNNING{\us}CODE} ?
% XXX \index{exit{\us}flag} ?
\agparamheading{event driven}{switch, default off}

If you turn the \agparam{event driven} switch on, when you build a
parser, it will be configured as an ``event driven'' parser. This
means that after calling its initializer function, you call it once
with each discrete unit of input. The parser proceeds until it needs
more input, finishes the grammar, or encounters an error. It then
returns. The \agcode{exit{\us}flag} field in the parser control block
is equal to \agcode{AG{\us}RUNNING{\us}CODE} if more input is needed.
Other values indicate other reasons for termination.
% XXX crossreference the discussion of exit codes?

When \agparam{event driven} is on, \agparam{auto init} has no effect;
you must always call the initializer function yourself.

\index{Far tables}\index{Configuration switches}
\agparamheading{far tables}{switch, default off}

If \agparam{far tables} is on when AnaGram builds a parser, it will
declare the larger tables it builds as \agcode{far}. This can be a
convenience when using some memory models of the 8086 architecture.

\index{Grammar token}\index{Configuration parameters}\index{Token}
\agparamheading{grammar token}{token name, no default}

The \agparam{grammar token} parameter may be used to specify the
grammar, or ``goal'', token for the syntax analyzer portion of
AnaGram. An alternative method is to append a ``\$'' to the goal token
when you define it. You may also simply use the name \agcode{grammar}
to identify the grammar token.

\index{Header file name}\index{Configuration parameters}\index{File name}
\agparamheading{header file name}{string, default = \agcode{"\#.h"}}

This parameter names the parser header file AnaGram generates. The
contents of the header file are described in Chapter 9. When AnaGram
creates the file, it copies the value of \agparam{header file name},
substituting the name of your syntax file for the ``\agcode{\#}''
character, in order to create the pathname and extension for the file.
You can therefore use this parameter to give the header file a
particular name, independent of the syntax file name, or to specify a
particular drive or directory where you want the header file to
reside. Note that if you include a full DOS/Windows pathname,
backslash characters must be quoted.

\index{Input values}\index{Configuration switches}
\agparamheading{input values}{switch, default off}

% XXX this shouldn't say ASCII because it's true even if the
% characters are some other character set...
If the input to your parser includes explicit token values which are
not simply the ASCII values of corresponding ASCII input characters,
you must set the \agparam{input values} switch to inform AnaGram.
Unless your parser is \agparam{event driven}, you must also provide
your own \agcode{GET{\us}INPUT} macro.

\index{Line length}\index{Configuration parameters}
\agparamheading{line length}{integer value, default = 80}

\agparam{Line length} is an obsolete configuration parameter,
recognized for the sake of compatibility with configuration files
prepared for the DOS version of AnaGram. It is ignored in AnaGram 2.0.

\index{Line numbers}\index{Configuration switches}
\agparamheading{line numbers}{switch, default off}

If \agparam{line numbers} is set, AnaGram will put syntax file line
numbers into the generated C code file using the
\index{\#line}\agcode{\#line} directive so that your compiler
diagnostics will refer to lines in the syntax file rather than in the
generated C code file. If \agparam{line numbers} is off, AnaGram will
put syntax file line numbers in comments. The
\index{Line numbers path}\index{Configuration parameters}
\agparam{line numbers path} and
\index{Escape backslashes}\index{Configuration switches}
\agparam{escape backslashes} parameters may be used to control the
generation of the line number directives.

\index{Line numbers path}\index{Configuration parameters}
\agparamheading{line numbers path}{string, default = \agcode{NULL}}

When you have set the \agparam{line numbers} switch and
\agparam{line numbers path} is not NULL, AnaGram uses it in the
\agcode{\#line} directive in place of the full path name of your
syntax file.
% XXX update for unix where we (maybe) don't generate full pathnames

\index{Lines and columns}\index{Configuration switches}
\agparamheading{lines and columns}{switch, default on}

If this switch is set, AnaGram will incorporate code into your parser
to track line numbers and column numbers in its input. At all times,
the \agcode{line} and \agcode{column} fields in your parser control
block will mark the location of the current lookahead character.
The treatment of tab characters is controlled by the
\index{TAB{\us}SPACING}\index{Macros}\agcode{TAB{\us}SPACING} macro.

\index{Main program}\index{Configuration switches}
\agparamheading{main program}{switch, default on}

The \agparam{main program} switch determines what AnaGram does if you
invoke the Build Parser command, but have no embedded C in your syntax
file. If the switch is on, AnaGram creates a main program which does
nothing but call your parser. The switch is ignored if your parser
uses \agparam{pointer input} or is \agparam{event driven}.

\index{Max conflicts}\index{Configuration parameters}\index{Conflicts}
\agparamheading{max conflicts}{integer value, default = 50}

\agparam{Max conflicts} limits the number of conflicts AnaGram will
record. Sometimes, a simple editing error in your syntax file can
cause hundreds of conflicts, which you don't need to see in gory
detail. If you have a grammar that is in serious trouble and you want
to see more conflicts, you may change \agparam{max conflicts} to suit
your needs.

\index{Near functions}\index{Configuration switches}
\agparamheading{near functions}{switch, default off}

\agparam{Near functions} controls the use of the \agcode{near} keyword
for static functions in your parser. If your parser is to run on a
16-bit 80x86 processor you would want to turn it on. If you are going
to run your parser on some other processor or use a C compiler that
does not support the \agcode{near} keyword you should leave
\agparam{near functions} off.

\index{Configuration switches}\index{Nest comments}\index{Comments}
\agparamheading{nest comments}{switch, default off}

Use this switch to allow nested comments in your syntax or
configuration files. It defaults to off, in accordance with the ANSI
standard for C. Note that AnaGram scans comments in any embedded C
code as well as in the grammar specification. You may turn this switch
on and off as many times as necessary in a single file.
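
A sketch of toggling the switch within a single syntax file follows.
It assumes C-style comment delimiters, as the reference to ANSI C
suggests; the comment text is purely illustrative.
\begin{indentingcode}{0.4in}
{[}
  nest comments
]
/* outer comment /* inner comment */ still inside the outer one */
{[}
  \~{}nest comments
]
\end{indentingcode}
With the switch left off, the ANSI C rule would apply: the first
\agcode{*/} would end the comment, and the words after it would be
read as part of the syntax file.
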
\index{Old style}\index{Configuration switches}
\agparamheading{old style}{switch, default off}
\agparam{Old style} controls the function definitions in the code AnaGram generates. When \agparam{old style} is off, AnaGram generates ANSI style calling sequences with prototypes as necessary. When \agparam{old style} is on, it generates old style function definitions, and no prototypes. It also causes the \index{Const data}\index{Configuration switch}\agparam{const data} switch to be ignored.

\index{Page length}\index{Configuration parameters}
\agparamheading{page length}{integer value, default = 66}
\agparam{Page length} is an obsolete configuration parameter, recognized for the sake of compatibility with configuration files prepared for the DOS version of AnaGram. It is ignored in AnaGram 2.0.

\index{Parser file name}\index{Configuration parameters}\index{File name}
\agparamheading{parser file name}{string, default = \agcode{"\#.c"}}
AnaGram creates a parser which consists of all the embedded C code in your syntax file, the syntax tables created by the syntax analyzer, and a parsing engine configured to your requirements. This code is written to a file whose name is given by this parameter. When AnaGram creates your parser file, it copies the value of the \agparam{parser file name} parameter, substituting the name of your syntax file for the ``\agcode{\#}'' character, in order to create the pathname and extension for the file. You can therefore use this parameter to give the parser file a particular name, independent of the syntax file name, or to specify a particular drive or directory where you want the parser file to reside. Note that if you include a full DOS/Windows pathname, you must escape the backslash characters. If writing a C++ parser you would use this parameter to set the output filename suffix.
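For example, a configuration file for a C++ project might contain a line like the following. The file name \agfile{myparser.cpp} is hypothetical and only illustrates that the value sets both the base name and the suffix of the generated file.
\begin{indentingcode}{0.4in}
parser file name = "myparser.cpp"
\end{indentingcode}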
\index{Parser}\index{Parser name}\index{Configuration parameters}
\agparamheading{parser name}{string, default = \agcode{"\$"}}
% XXX This should say something other than ``name your parser''
AnaGram uses the value of \agparam{parser name} to name your parser, substituting the name (not including the extension) of your syntax file for a ``\agcode{\$}'' character. If you accept the default value of \agparam{parser name} and have a syntax file called \agfile{ana.syn}, AnaGram will name your parser \agcode{ana}. The \index{Initializer}initializer for your parser will have the same name preceded by \agcode{init{\us}}. In the above example, the initializer would be called \agcode{init{\us}ana}.

\index{Configuration parameters}\index{Stack}\index{Parser stack alignment}
\agparamheading{parser stack alignment}{c data type, default = \agcode{int}}
\agparam{Parser stack alignment} is used to control byte alignment of the parser stack, \agcode{PCB.vs}. AnaGram normally adds a field of the specified data type to the \agcode{union} declaration that defines the data type for the parser stack. This parameter can be used to deal with byte alignment problems when a parser is to be run on a processor with byte alignment restrictions. For instance, if your grammar has tokens of type \agcode{double} and your processor requires double precision variables to be properly aligned, you can include the following statement in a configuration section in your grammar or in your configuration file:
\begin{indentingcode}{0.4in}
parser stack alignment = double
\end{indentingcode}
If the data type is \agcode{void}, no alignment declaration will be made.
% You will not need to change this parameter if your parser is to
% run on a PC or compatible processor.
%
% XXX this really ought to be updated for the century of the fruitbat

\index{Configuration parameters}\index{Parser stack size}
\agparamheading{parser stack size}{integer value, default = 32}
\agparam{Parser stack size} is used to set the sizes of the parser stacks in your parser control block. When AnaGram analyzes your grammar, it determines the minimum amount of stack space required for the deepest left recursion. To this depth it adds one half the value of the \agparam{parser stack size} parameter. It then sets the actual stack size to the larger of this value and the \agparam{parser stack size} parameter. If you find 32 wastefully large or dangerously small, you can define it to suit the needs of your particular parser.

\index{Pointer input}\index{Configuration switches}
\agparamheading{pointer input}{switch, default off}
When you turn \agparam{pointer input} on you tell AnaGram that the input to your parser is in memory and can be scanned simply by incrementing a pointer. Before calling your parser you should make sure that the \agcode{pointer} field in your parser control block is properly initialized to point to the first character or token in your input. Use the parameter \index{Pointer type}\index{Configuration parameters}\agparam{pointer type} to specify the type of the pointer. The default value of \agparam{pointer type} is \agcode{unsigned char *}.

\index{Pointer type}\index{Configuration parameters}
\agparamheading{pointer type}{c data type, default = \agcode{unsigned char *}}
If you have set the \agparam{pointer input} switch, AnaGram will use the value of the \agparam{pointer type} parameter to declare the \agcode{pointer} field in your parser control block.
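As an illustrative sketch, a parser that scans a plain \agcode{char} buffer in memory might combine the two parameters as follows. This assumes that a switch is turned on simply by naming it; the choice of \agcode{char *} is only an example.
\begin{indentingcode}{0.4in}
pointer input
pointer type = char *
\end{indentingcode}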
\index{Print file name}\index{Configuration parameters}\index{File name}
\agparamheading{print file name}{string, default = \agcode{"LPT1"}}
\agparam{Print file name} is an obsolete configuration parameter, recognized for the sake of compatibility with configuration files prepared for the DOS version of AnaGram. It is ignored by AnaGram 2.0.

\index{Quick reference}\index{Configuration switches}
\agparamheading{quick reference}{switch, default off}
The \agparam{quick reference} switch is no longer used, but is still recognized for compatibility's sake. In future versions of AnaGram it may no longer be recognized.

\index{Configuration switches}\index{Reduction choices}
\agparamheading{reduction choices}{switch, default off}
If the \agparam{reduction choices} switch is set when AnaGram builds a parser, it will include in your parser file a function which can identify the acceptable choices for the reduction token in the current state. You would use this switch only if you were using semantically determined productions in your grammar and if there were states in which not all the tokens on the left side of the production were valid reduction tokens.

\index{Rule coverage}\index{Configuration switches}\index{Coverage}
\agparamheading{rule coverage}{switch, default off}
If you set the \agparam{rule coverage} switch, AnaGram will include code in your parser to count the number of times your parser identifies each rule in your grammar. To maintain the counts, AnaGram declares, at the beginning of your parser, an integer array, whose name is created by appending \agcode{{\us}nrc} to the name of your parser. The array contains one counter for each rule you have defined in your grammar. There are no entries for the auxiliary rules that AnaGram creates to deal with set overlaps or disregard statements. In order to identify every rule that the parser reduces in the course of execution, AnaGram has to turn off certain optimization features in your parser.
Therefore, a parser that has the \agparam{rule coverage} switch enabled will run slightly slower than one with the switch off. An entry on the \agmenu{Browse} menu allows you to view the coverage data.
% XXX See Chapter ???.

\index{Tab spacing}\index{Configuration parameters}
\agparamheading{tab spacing}{integer value, default = 8}
\agparam{Tab spacing} controls the expansion of tabs when AnaGram displays your syntax file or the \agwindow{File Trace} test file. The value of \agparam{tab spacing} is also used to set the default value of the \index{TAB{\us}SPACING}\index{Macros}\agcode{TAB{\us}SPACING} macro in your parser. The default value of \agparam{tab spacing} is 8. If you prefer a different value, you should probably include an appropriate statement in your configuration file. For example:
\begin{indentingcode}{0.4in}
tab spacing = 2
\end{indentingcode}

\index{Test file binary}\index{Configuration switch}
\agparamheading{test file binary}{switch, default off}
\agparam{Test file binary} causes \agwindow{File Trace} to read test files in binary mode. When \agwindow{File Trace} reads a test file, it normally reads it in text mode, which in Windows causes carriage return characters to be stripped out. Occasionally it is necessary to test a grammar where carriage return characters are important and should not be stripped. In this situation, set \agparam{test file binary} to on, and the carriage return characters will not be discarded.
% XXX rewrite the second half of this paragraph?

\index{Test file mask}\index{Configuration parameters}
\agparamheading{test file mask}{string, default = \agcode{"*.*"}}
% XXX default should be ``*'' on unix
AnaGram uses \agparam{test file mask} to filter the pick list of test files when you use the \index{File Trace}\index{Trace}\index{Window}\agwindow{File Trace} feature. You may set it to any value you wish, including a pathname.
% XXX: test this
For instance, if you know that all your test files are in the directory \agfile{C:{\bs}PROJECT{\bs}SOURCE} and have extension \agfile{.FOO} you could set \agparam{test file mask} to \agcode{"C:{\bs\bs}PROJECT{\bs\bs}SOURCE{\bs\bs}*.FOO"}. Note that, as in any string literal, backslash characters must be escaped.

\index{Test range}\index{Configuration switches}\index{Range}
\agparamheading{test range}{switch, default off}
% XXX should this really default to off?
When \agparam{test range} is on, AnaGram will insert code in your parser to make sure all input characters or token identifiers are within the range specified in your grammar. If you do not turn this switch on, your parser will run slightly faster, but its behavior will be undefined if it gets input outside the range you have specified in your grammar.

\index{Token names}\index{Configuration switches}
\agparamheading{token names}{switch, default off}
When \agparam{token names} is set, AnaGram includes a static array of ASCII strings in your parser containing the names of your tokens. The name of this array is \agcode{\#{\us}token{\us}names} where the ``\agcode{\#}'' character is replaced with the name of your parser. The entry for tokens which do not have names is an empty string: \agcode{""}.

\index{Top margin}\index{Configuration parameters}
\agparamheading{top margin}{integer value, default = 3}
\agparam{Top margin} is an obsolete configuration parameter, recognized for the sake of compatibility with configuration files prepared for the DOS version of AnaGram. It is ignored by AnaGram 2.0.

\index{Traditional engine}\index{Configuration switches}
\agparamheading{traditional engine}{switch, default off}
Traditional LALR-1 parsers use a parsing engine which has only four actions: shift, reduce, accept, and error. AnaGram, in the interests of faster execution and more compact tables, uses a parsing engine with a number of short-cut actions.
The \agparam{traditional engine} switch tells AnaGram not to use the short-cut actions. You would set this switch primarily in conjunction with use of the \index{Grammar Trace}\index{Trace}\index{Window}\agwindow{Grammar Trace} in order to have a clearer idea of what is happening. AnaGram will then be using the same parsing actions as textbook parsers. Note that if a lookahead token has already been selected, AnaGram will display it on the last line of the \agwindow{Parser Stack} pane in the \agwindow{Grammar Trace} window.
% XXX what is this note doing here?
You should turn this switch back off when you have finished debugging or your parser will be larger and slower than necessary.
% XXX: say that in production code traditional engine is not useful
% and only serves to slow things down.

\index{Video mode}\index{Configuration parameters}
\agparamheading{video mode}{integer value, default = $-$1}
\agparam{Video mode} is an obsolete configuration parameter, recognized for the sake of compatibility with configuration files prepared for the DOS version of AnaGram. It is ignored by AnaGram 2.0.