diff doc/manual/cfp.tex @ 0:13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
author | David A. Holland |
---|---|
date | Sat, 22 Dec 2007 17:52:45 -0500 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/doc/manual/cfp.tex Sat Dec 22 17:52:45 2007 -0500 @@ -0,0 +1,952 @@ +\chapter{Configuration Parameters} +\index{Configuration parameters}\index{Parameters} + +\agterm{Configuration parameters} are named constants that control the +way AnaGram works. AnaGram ignores case\index{Case sensitivity} when +it looks up the names of configuration parameters, so that +\agcode{parser name} and \agcode{Parser Name} both refer to the same +parameter. Configuration parameters that have only true/false or +on/off values are often referred to as +\index{Configuration switches}\agterm{configuration switches}. + +Configuration parameters are used to control: + +\begin{itemize} +\item Comment nesting +\item Grammar analysis +\item Parser generation +\end{itemize} + +Every configuration parameter has a default value which has been +chosen to correspond to a standard if it exists, customary usage if +such can be determined, or otherwise to the most likely usage. + +Configuration parameters may be specified either in +\index{Configuration file}\index{File}\agparam{configuration files}, +always named \agfile{AnaGram.cfg}, or in a syntax file. A +configuration file is a normal ASCII file containing parameter +specifications. The syntax of a configuration file is the same as +that of a configuration segment within a syntax file, except that a +configuration file does not have the brackets ( \agcode{[ ]} ) that +enclose a configuration segment in a syntax file. You may comment the +configuration file freely, just as though it were a syntax file. +% XXX ``configuration segment'' is a forward reference and we should +% rearrange all this so it isn't. Also, the forward reference is +% ``configuration section''. Sigh. + +% Parameters can be set in either a configuration file or in your syntax +% file. +Apart from the \agparam{nest comments} switch, if a parameter +is specified more than once, only the last value is used (see below). 
+The \agparam{nest comments} switch, which affects the way AnaGram +reads your configuration and syntax files, takes effect as soon as +AnaGram encounters it in a file and stays in effect unless it is later +turned off. + +% XXX this should be belabored less. Also, good practice dictates that +% if you ship a project or a grammar it should compile in someone +% else's environment, and we shouldn't encourage people to do things +% like put \agparam{pointer input} in a systemwide AnaGram.cfg. +% +% XXX also in the Unix world it ought to read +% /usr/local/etc/AnaGram.cfg and then also ~/.AnaGram.cfg - or +% something like that. And it ought to be possible to set params +% on the agcl command line. We need to think about this. (Well, +% there's not really any valid use for either, so perhaps it +% doesn't matter.) +% +% How about something like +% +% Support for a global configuration file dates from the DOS-based +% AnaGram 1.x, where the same configuration mechanism was used to +% establish user interface preferences. AnaGram 2.0 and above handle +% preferences separately, and the configuration system is only used +% for code-related options. Since good practice dictates that code +% should continue to work if exported outside of one's personal +% environment, there are few or no legitimate uses of the global +% configuration file and support for it will likely be removed in a +% future AnaGram release. +% +% (But there really should be support for params on the agcl command +% line; if nothing else it would make it a lot easier to test +% combinations of settings.) +% +On initialization, AnaGram checks the directory that contains the +AnaGram executable file. If it finds \agfile{AnaGram.cfg}, it reads it +and sets internal parameters accordingly. It then looks for +\agfile{AnaGram.cfg} in your working directory and, if it finds it, reads +it in turn. If any parameter is set in both files, the last setting +wins. 
The effect of this two stage process is to allow you to set +your standard preferences in the principal directory, with specific +overrides in your working directories. You may also put configuration +parameters in your syntax file, which override the settings in the +configuration files. Note that neither configuration file is +necessary. + +Before executing an Analyze Grammar or Build Parser command, AnaGram +resets configuration parameters to their initial values, as determined +by the built in defaults and the configuration files read at program +initialization. + +There are, therefore, four levels at which parameters may be set. At +the first level, there are the settings built into AnaGram. If you +don't like some of these, you can override them with a configuration +file at the second level, the tools directory where you installed +AnaGram. If a particular project needs overrides, you can put them in +a configuration file at the third level, the working directory for +this project. And if you have specific configuration requirements for +a particular parser, the best place for them is the fourth level, the +syntax file for the parser. + +For all of this flexibility, some people prefer to set every +configuration parameter explicitly in their syntax files so there is +no question as to what setting is being used. AnaGram is set up so +you can do it whichever way you prefer. + +If you are uncertain as to the actual parameters that AnaGram is using +at any time, the +\index{Configuration Parameters}\index{Window} +\agwindow{Configuration Parameters} window listed in the +\agmenu{Windows} menu will show you the current state of all +parameters. + +The different varieties of configuration parameters are described +below. Each definition of a parameter must start on a new line. A +configuration file is just a sequence of parameter definitions, each +on a separate line. 
Blank lines can be used as separators where you +please, and comments may be used as described for syntax files. +Case\index{Case sensitivity} is ignored for parameter names (but not +for the whole definition). In a syntax file, each set of definitions +must be enclosed with brackets ( \agcode{[ ]} ), forming a +\index{Configuration section}\agterm{configuration section}, one of +the four kinds of AnaGram statements. Configuration sections can be +scattered throughout a syntax file, but each section should begin on a +new line, and following statements should also of course start on new +lines. There is no restriction on the number of sections, or on the +number of times a parameter appears. The last setting of a parameter +wins. + +The first variety of configuration parameter is a simple +\index{Switches}\index{Configuration switches}switch that controls +one of the various features of AnaGram. Such parameters are also called +\agterm{configuration switches}. They need simply be stated to set the +condition (turn it on) or negated with the tilde (\agcode{\~{}}) to +reset the condition (turn it off). Thus + +\begin{indentingcode}{0.4in} +nest comments +\end{indentingcode} +causes AnaGram to allow nested comments, and + +\begin{indentingcode}{0.4in} +\~{}nest comments +\end{indentingcode} +causes AnaGram to disallow nested comments. + +You may also set or reset configuration switches with explicit on or +off values: + +\begin{indentingcode}{0.4in} +nest comments = on +nest comments = off +\end{indentingcode} + +A second variety of configuration parameter takes a value which is the +name of a token. Thus + +\begin{indentingcode}{0.4in} +grammar token = c grammar +\end{indentingcode} +specifies that the token \agcode{c grammar} is the grammar that +AnaGram should use as the starting point for analyzing your grammar. + +A third variety of configuration parameter takes a value which is a C +or C++ data type. 
Thus + +\begin{indentingcode}{0.4in} +default token type = unsigned char * +\end{indentingcode} +signifies that the value of a token, unless otherwise specified, is a +pointer to an \agcode{unsigned char}. AnaGram does not accept the +full panoply of C and C++ \index{Data type}data types. The +restrictions are that AnaGram does not allow specification of array or +function types, nor explicit structure types. Types that are defined +with typedef statements, structure definitions, or class definitions, +including template classes, in your embedded C or C++ are acceptable. +If you have more complex data types, you should define a simple name +using a typedef statement. + +A fourth variety of configuration parameter takes a string value to +set an ASCII string used by AnaGram. Thus + +\begin{indentingcode}{0.4in} +header file name = "widget.h" +\end{indentingcode} +signifies that the header file created by AnaGram should be called +\agfile{widget.h}. In +those strings which are used to name the parser or files which AnaGram +builds, the character ``\agcode{\#}'' is used to indicate that AnaGram +should substitute the name of your syntax file. In strings used to +determine the names of program variables or functions, ``\agcode{\$}'' +is used to indicate that AnaGram should substitute the name of your +parser. When building enumeration constants for the names of the +tokens in your grammar, ``\agcode{\%}'' will be replaced by the name +of the token. + +The final variety of configuration parameter takes a numeric value. +The value may be decimal, octal or hexadecimal, following the C +conventions, and may have an optional sign. Thus + +\begin{indentingcode}{0.4in} +parser stack size = 50 +\end{indentingcode} +tells AnaGram to allocate space for at least fifty stack entries when +it creates your parser. + +If AnaGram does not recognize a parameter, it will give you a warning +with line number, column number, and the message ``no such +parameter''. 
If the value for a parameter is inappropriate, such as a +string value for a parameter which should have a numeric value, the +message will be ``inappropriate value''. If the error occurs in the +configuration file found in the AnaGram directory, AnaGram will prefix +the warning with the complete path name for the file. If the error +occurs in the configuration file in your working directory, AnaGram +will prefix the warning with ``AnaGram.cfg:''. If AnaGram encounters a +syntax error while reading a configuration file, it will honor the +parameter settings it found before the syntax error, but will ignore +everything that follows the error. + +\section{Alphabetic Listing of Configuration Parameters} + +\index{Configuration switches}\index{Allow macros}\index{Macros} +\agparamheading{allow macros}{switch, default on} + +When this switch is set, i.e., on, reduction procedures will be +implemented as macros if they are sufficiently simple. This makes +your parser somewhat more compact and faster but makes it somewhat +more difficult to debug. It's a good idea to turn this switch off for +debugging. + +\index{Configuration switches}\index{Auto init} +\agparamheading{auto init}{switch, default on} + +This switch controls the initialization of any parser that is not +\agparam{event driven}. When it is on, the +\index{Initializer}initializer for your parser is automatically called +every time the parser is called. +This is the normal situation. On occasion, however, it +is desirable to call a parser several times without reinitializing it. +In this case, you may set the \agparam{auto init} parameter to off. +Should you do this, you must call the initializer yourself whenever +appropriate. +% XXX characterize the occasion... + +When \agparam{event driven} is set, \agparam{auto init} has no effect.
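+
+As a minimal sketch of the second case, assume a parser that is not
+\agparam{event driven} and that is named \agcode{ana} (a hypothetical
+name used only for illustration). The switch could then be turned off
+in a configuration section of the syntax file:
+
+\begin{indentingcode}{0.4in}
+[
+\~{}auto init   /* assumes the parser is not event driven */
+]
+\end{indentingcode}
+With this setting the calling code would be responsible for invoking
+the initializer, \agcode{init{\us}ana} under the naming assumption
+above, before the first call to the parser and again whenever the
+parser should start afresh.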
+ +\index{Configuration switches}\index{Auto resynch} +\agparamheading{auto resynch}{switch, default off} + +Setting this switch causes AnaGram to include an automatic +resynchronization procedure in the parser. The resynchronization +procedure will be invoked upon encountering a syntax error and will +skip over input until it finds input characters or tokens consistent +with its state at the time of the error. The purpose of the +resynchronization procedure is to provide a simple way for your parser +to proceed in the event of syntax errors so that it can find more than +one syntax error on a given pass. The resynchronization procedure +uses a heuristic based on your own syntax. AnaGram itself uses this +technique to resynchronize after syntax errors in its input. + +A disadvantage to using this resynchronization technique is that the +resynchronization procedure turns off all reduction procedures. The +reason is that the resynchronization may cause a number of reduction +procedures to be skipped. This means that the parameters for any +reduction procedures that might be called later would be suspect and +could cause serious problems. It seems more prudent simply to shut +them down. Semantically determined productions will subsequently, of +course, always use the default reduction token. + +If you have a +\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} +macro, it will be called \emph{before} the resynchronization +process. It will also be called on subsequent syntax errors, so your +program will not lose control entirely. + +If you use the auto resynchronization procedure, you must also specify +the \agparam{eof token} configuration parameter (see below) so that +the synchronizer doesn't inadvertently try to pass over the end of +file. + +For other methods of recovering from syntax errors, see Chapter 9. 
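+
+A configuration section along the following lines would enable
+automatic resynchronization. It is only a sketch:
+\agcode{end of input} is a hypothetical terminal token assumed to be
+defined elsewhere in the grammar, and the \agparam{eof token} line
+would be unnecessary if the grammar already had a terminal token
+named \agcode{eof}.
+
+\begin{indentingcode}{0.4in}
+[
+auto resynch
+eof token = end of input   /* hypothetical token name */
+]
+\end{indentingcode}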
+ +\index{Configuration switches}\index{Backtrack} +\agparamheading{backtrack}{switch, default on} + +If your parser does not continue after encountering a syntax error, +you can speed up your parser and make it a little smaller by turning +off the \agparam{backtrack} switch. If \agparam{backtrack} is on, +AnaGram configures your parser so that in case of syntax error it can +undo any default reductions it might have made as a consequence of the +erroneous input. The purpose of such an undo function is to identify +the proper error frame and to maximize the probability of being able +to recover gracefully. + +% XXX shouldn't these be indexed as ``obsolete parameters'' or +% something, with xrefs so if you look up ``Bottom margin'' in the +% index it says ``see ``obsolete parameters''''? +% +% Also, shouldn't the various obsolete parameters be described with +% the same text? +% +\index{Configuration parameters}\index{Bottom margin} +\agparamheading{bottom margin}{integer value, default = 3} + +This is an obsolete parameter which was used in the DOS version of +AnaGram. It is no longer used, but is still recognized for the sake +of compatibility. + +\index{Configuration switches}\index{Bright background} +\agparamheading{bright background}{switch, default on} + +This configuration switch is not used in AnaGram 2.0. It is retained +for compatibility with configuration files used with the DOS versions +of AnaGram. + +\index{Configuration switches}\index{Case sensitive} +\index{Case sensitivity} +\agparamheading{case sensitive}{switch, default on} + +Use this switch to control how your parser deals with distinctions +between upper and lower case. When \agparam{case sensitive} is on, +AnaGram builds a parser which distinguishes upper from lower case. +When this switch is off, AnaGram builds a parser which ignores case +for all input. This does not mean that the values of character set +tokens are not case sensitive. 
Although 'a' and 'A' would map to the +same token, the values would still be lower and upper case +respectively. + +% XXX the last bit could be explained more clearly. (something like +% ``parsers still preserve case'') + +% XXX this should discuss character sets, locales, and other such +% garbage. + +\index{Configuration parameters}\index{Compile command} +\agparamheading{compile command}{string, default = \agcode{NULL}} + +This parameter is retained only for compatibility with the DOS version +of AnaGram. It is ignored in the Windows version. + +\index{Configuration switches}\index{Const data} +\agparamheading{const data}{switch, default on} + +The \agparam{const data} switch controls the use of \agcode{const} +qualifiers in generated C code. If the switch is on, all fixed data +arrays in the parser file will be qualified as \agcode{const}. The +\agparam{const data} switch is ignored if the \agparam{old style} +switch is set. + +\index{Configuration parameters}\index{Context type} +%XXX: \index{context tracking} ? +\agparamheading{context type}{c data type, no default} + +By default, \agparam{context type} is undefined. If you assign the +name of a C data type, AnaGram will implement ``context tracking'' in +your parser. See Chapter 9. The data type name can be either a +standard, pre-defined data type or one which you create with a +\agcode{typedef} statement. + +\index{Configuration parameters}\index{Coverage file name} +\index{File extension}\index{nrc} +\agparamheading{coverage file name}{string, default = \agcode{"\#.nrc"}} + +If you set the \agparam{rule coverage} configuration switch, AnaGram +will provide functions in your parser to read and write rule counts to +a file. The name of the file will be determined by \agparam{coverage +file name}. The name of your syntax file will be substituted for the +``\agcode{\#}'' character. + +\index{Configuration switches}\index{Declare pcb} +% XXX \index{Parser control block} ? 
+\agparamheading{declare pcb}{switch, default on} + +When AnaGram builds a parser, it checks the status of the +\agparam{declare pcb} switch. If it is on, AnaGram declares a parser +control block for you. AnaGram creates the name of the control block +variable by appending \agcode{{\us}pcb} to the name of your parser. +AnaGram will also code an \agcode{\#include} statement to include your +parser header file, and will define the \agcode{PCB} macro for you. +If you wish to declare the parser control block yourself you should +turn this switch off. + +\index{Configuration parameters}\index{Default input type} +\index{Input type} +% XXX: \index{Types} ? +\agparamheading{default input type}{c data type, default = \agcode{int}} + +This parameter tells AnaGram what data type to assume for terminal +tokens if they are not explicitly declared. Normally, you would +explicitly declare terminal tokens only when you have set the +\agparam{input values} configuration switch. The default type for +nonterminal tokens is given by \agparam{default token type}. + +\index{Configuration switches}\index{Default reductions}\index{Reduction} +\agparamheading{default reductions}{switch, default on} + +If in a given parser state there is only one production that could be +possibly reduced, it is usually faster to reduce it on any input than +to check specifically for correct input before reducing it. The only +time this default reduction causes trouble is in the event of +erroneous input. In this situation you may get an erroneous +reduction. Normally when you are parsing a file, this is +inconsequential because you are not going to continue semantic action +in the presence of error. But, if you are using your parser to handle +real-time interactive input, you have to be able to continue semantic +processing after notifying your user that he has entered erroneous +input. 
In this case you would want to turn the \agparam{default +reductions} switch off so that productions are reduced only when there +is correct input. + +\index{Configuration parameters}\index{Default token type}\index{token} +% XXX \index{Types} ? +\agparamheading{default token type}{c data type, default = \agcode{void}} + +This parameter takes a C data type as its value. It is used to set +the data type for the semantic values of nonterminal tokens whose type +is not explicitly specified in the grammar. To set the default type +for terminal tokens use \agparam{default input type}. + +\index{Diagnose errors}\index{Configuration switches} +\agparamheading{diagnose errors}{switch, default on} + +If you set this switch, AnaGram will include a syntax error diagnostic +procedure in your parser. This procedure will be called before your +\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro is +called. It will store a pointer to a string in the +\agcode{error{\us}message} field of your parser control +block. The string will contain a diagnostic message. If there is +only one syntactically correct input, x, for example, the message will +be ``Missing x''. Otherwise it will be ``Unexpected x'' if the input +is recognizable but incorrect and ``Unexpected input'' otherwise. If +the \agparam{error frame} switch has been set, the +\agcode{error{\us}frame{\us}ssx} and +\agcode{error{\us}frame{\us}token} fields +in the parser control block will be set as described in Chapter 9. + +% XXX say: diagnose errors causes the token_names[] array to be +% included in the parser. and index token_names[]... + +\index{Distinguish lexemes}\index{Configuration switches} +% XXX \index{Disregard} ? +\agparamheading{distinguish lexemes}{switch, default off} + +The \agparam{distinguish lexemes} switch has no effect unless a +disregard token has been defined. Normally, the disregard token +(usually white space) is optional between lexemes. 
This may lead to +apparent shift-reduce conflicts if the characters that comprise the +second of two successive lexemes can be construed as part of the first +lexeme. In this situation, turning on the \agparam{distinguish +lexemes} switch effectively requires a disregard token to separate the +two lexemes. + +\index{Edit command}\index{Configuration parameters} +\index{File extension}\index{syn} +\agparamheading{edit command}{string, default = \agcode{"ed \#.syn"}} + +This parameter is no longer used and is retained only for file +compatibility with the DOS version of AnaGram. + +\index{Enable mouse}\index{Configuration switches} +\agparamheading{enable mouse}{switch, default on} + +This parameter is no longer used and is retained only for file +compatibility with the DOS version of AnaGram. + +\index{Enum constant name}\index{Configuration parameters} +\agparamheading{enum constant name}{string, +default = \agcode{"\${\us}\%{\us}token"}} + +Use the \agparam{enum constant name} parameter to control the names +AnaGram uses for the enumeration constants it defines in the +header file for your parser. The value of \agparam{enum constant +name} should be a string containing the ``\agcode{\%}'' character. +AnaGram will substitute each token name in turn for the +``\agcode{\%}'' character in this template as it creates the list of +enumeration constants. If it finds a ``\agcode{\$}'' character it +will substitute the name of your parser. + +\index{Eof token}\index{Configuration parameters}\index{Token} +\agparamheading{eof token}{token name, no default} + +If you use the auto resynchronization capability of AnaGram, you must +specify an end of file token explicitly. You can do this either by +specifying a terminal token in your grammar called \agcode{eof} or by +using the \agparam{eof token} parameter to identify some other +terminal token to be used as the end of file marker. You would use +the parameter only if you must use the name \agcode{eof} for some other +purpose.
+ +\index{Error frame}\index{Error frame}\index{Configuration switches} +\agparamheading{error frame}{switch, default off} + +AnaGram uses the \agparam{error frame} switch in conjunction with the +\index{Diagnose errors}\index{Configuration switches}\agparam{diagnose errors} +switch. If both are set, when your parser encounters a syntax error, +before invoking the +\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro, +your parser will determine the frame in which the error occurred, that +is, the production the parser was trying to match at the time of the +error. + +% XXX: See chapter (dd.tex) for a complete discussion. + +\index{Configuration parameters}\index{Error token}\index{Token} +\agparamheading{error token}{token name, no default} + +One of your options for error recovery after a syntax error is a +technique similar to that provided in \agfile{yacc}. You include a +terminal token called \agcode{error} in your grammar. When the parser +encounters an error in the input it backs up the state stack to the +most recent state in which \agcode{error} was an acceptable input. It +then shifts to the new state as though it had seen an actual +\agcode{error} token. At this point, it skips over any character in +the input which is not an acceptable input character for this state. +Once it does find an acceptable input character, it continues +processing as though nothing had happened. If you wish to use this +approach and for some reason you wish to use the name \agcode{error} +for some other token in your grammar, you may use the \agparam{error +token} parameter to identify some other terminal token in your grammar +as the ``error token''. 
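+
+For instance, if the name \agcode{error} were already taken, a
+setting of the following form could nominate a different token. The
+token name \agcode{bad input} is hypothetical and would have to be a
+terminal token defined in the grammar.
+
+\begin{indentingcode}{0.4in}
+[
+error token = bad input   /* hypothetical token name */
+]
+\end{indentingcode}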
+ +\index{Configuration switches}\index{Error trace}\index{Trace} +\index{Window} +\agparamheading{error trace}{switch, default off} + +If you turn the \agparam{error trace} switch on, AnaGram will include +code in your parser so that when it encounters a syntax error it will +write the contents of the \index{Parser state stack}\index{State +stack}\index{Stack}parser state stack to a file. The name of the file +is the same as the name of your syntax file but with the extension +\index{File extension}\index{etr}\agfile{.etr}. You may override this +definition by defining +\index{AG{\us}TRACE{\us}FILE{\us}NAME}\index{Macros}\agcode{AG{\us}TRACE{\us}FILE{\us}NAME} +in your embedded C. + +The \agmenu{Error Trace} option in the \agmenu{Action} menu can then +read this information and prepare a pre-built \agwindow{Grammar Trace} +showing you the status of the parse at the time of the syntax error. +You would use this switch primarily when you are first checking out +your grammar to make sure it accurately represents the input you +desire to handle. You would also use it any time your parser +encounters a syntax error you don't understand. For more information, +see Chapter 5. + +\index{Escape backslashes}\index{Configuration switches} +\agparamheading{escape backslashes}{switch, default off} + +\agparam{Escape backslashes} is used only in conjunction with the +\agparam{line numbers} option. When turned on, it causes the +backslashes in the pathname generated by the \agparam{line numbers} +option to be doubled. This switch has been provided because C and C++ +compilers are not consistent in their handling of backslashes in path +names. + +\index{Event driven}\index{Configuration switches} +% XXX \index{AG{\us}RUNNING{\us}CODE} ? +% XXX \index{exit{\us}flag} ? +\agparamheading{event driven}{switch, default off} + +If you turn the \agparam{event driven} switch on, when you build a +parser, it will be configured as an ``event driven'' parser. 
This +means that after calling its initializer function, you call it once +with each discrete unit of input. The parser proceeds until it +needs more input, finishes the grammar, or encounters an error. It +then returns. The \agcode{exit{\us}flag} field in the parser control +block is equal to \agcode{AG{\us}RUNNING{\us}CODE} if more input is needed. +Other values indicate other reasons for termination. +% XXX crossreference the discussion of exit codes? + +When \agparam{event driven} is on, \agparam{auto init} has no effect; +you must always call the initializer function yourself. + +\index{Far tables}\index{Configuration switches} +\agparamheading{far tables}{switch, default = off} + +If \agparam{far tables} is on when AnaGram builds a parser, it will +declare the larger tables it builds as \agcode{far}. This can be a +convenience when using some memory models of the 8086 architecture. + +\index{Grammar token}\index{Configuration parameters}\index{Token} +\agparamheading{grammar token}{token name, no default} + +The \agparam{grammar token} parameter may be used to specify the +grammar, or ``goal'', token for the syntax analyzer portion of +AnaGram. An alternative method is to append a ``\$'' to the goal +token when you define it. You may also simply use the name +\agcode{grammar} to identify the grammar token. + +\index{Header file name}\index{Configuration parameters}\index{File name} +\agparamheading{header file name}{string, default = \agcode{"\#.h"}} + +This parameter names the parser header file AnaGram generates. The +contents of the header file are described in Chapter 9. When AnaGram +creates the file, it copies the value of \agparam{header file name}, +substituting the name of your syntax file for the ``\agcode{\#}'' +character, in order to create the pathname and extension for the file. 
+You can therefore use this parameter to give the header file a +particular name, independent of the syntax file name, or to specify a +particular drive or directory where you want the header file to +reside. Note that if you include a full DOS/Windows pathname, +backslash characters must be quoted. + +\index{Input values}\index{Configuration switches} +\agparamheading{input values}{switch, default off} + +% XXX this shouldn't say ASCII because it's true even if the +% characters are some other character set... +If the input to your parser includes explicit token values which are +not simply the ASCII values of corresponding ASCII input characters, +you must set the \agparam{input values} switch to inform AnaGram. +Unless your parser is \agparam{event driven}, you must also provide +your own \agcode{GET{\us}INPUT} macro. + +\index{Line length}\index{Configuration parameters} +\agparamheading{line length}{integer value, default = 80} + +\agparam{Line length} is an obsolete configuration parameter, recognized +for the sake of compatibility with configuration files prepared for +the DOS version of AnaGram. It is ignored in AnaGram 2.0. + +\index{Line numbers}\index{Configuration switches} +\agparamheading{line numbers}{switch, default off} + +If \agparam{line numbers} is set, AnaGram will put syntax file line +numbers into the generated C code file using the +\index{\#line}\agcode{\#line} +directive so that your compiler diagnostics will refer to lines in the +syntax file rather than in the generated C code file. If +\agparam{line numbers} is off, AnaGram will put syntax file line +numbers in comments. The +\index{Line numbers path}\index{Configuration parameters} +\agparam{line numbers path} and +\index{Escape backslashes}\index{Configuration switches} +\agparam{escape backslashes} +parameters may be used to control the generation of the line number +directives.
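+
+As a hedged illustration, \agparam{line numbers} might be combined
+with \agparam{line numbers path} as follows. The file name
+\agfile{ana.syn} is hypothetical; it is chosen without any directory
+part so that the question of quoting backslashes does not arise.
+
+\begin{indentingcode}{0.4in}
+[
+line numbers
+line numbers path = "ana.syn"   /* hypothetical file name */
+]
+\end{indentingcode}
+Under these assumptions the \agcode{\#line} directives in the
+generated code would name \agfile{ana.syn} instead of the full path
+name of the syntax file.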
+ +\index{Line numbers path}\index{Configuration parameters} +\agparamheading{line numbers path}{string, default = \agcode{NULL}} + +When you have set the \agparam{line numbers} switch and +\agparam{line numbers path} is not NULL, AnaGram uses it in the +\agcode{\#line} directive in place of the full path name of your +syntax file. +% XXX update for unix where we (maybe) don't generate full pathnames + +\index{Lines and columns}\index{Configuration switches} +\agparamheading{lines and columns}{switch, default on} + +If this switch is set, AnaGram will incorporate code into your parser +to track line numbers and column numbers in its input. At all times, +the \agcode{line} and \agcode{column} fields in your parser control +block will mark the location of the current lookahead character. The +treatment of tab characters is controlled by the +\index{TAB{\us}SPACING}\index{Macros}\agcode{TAB{\us}SPACING} macro. + +\index{Main program}\index{Configuration switches} +\agparamheading{main program}{switch, default on} + +The \agparam{main program} switch determines what AnaGram does if you +invoke the Build Parser command, but have no embedded C in your syntax +file. If the switch is on, AnaGram creates a main program which does +nothing but call your parser. The switch is ignored if your parser +uses \agparam{pointer input} or is \agparam{event driven}. + +\index{Max conflicts}\index{Configuration parameters}\index{Conflicts} +\agparamheading{max conflicts}{integer value, default = 50} + +\agparam{Max conflicts} limits the number of conflicts AnaGram will +record. Sometimes, a simple editing error in your syntax file can +cause hundreds of conflicts, which you don't need to see in gory +detail. If you have a grammar that is in serious trouble and you want +to see more conflicts, you may change \agparam{max conflicts} to suit +your needs. 
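+
+For example, a grammar under heavy revision might temporarily raise
+the limit. The figure of 200 is arbitrary and chosen only for
+illustration.
+
+\begin{indentingcode}{0.4in}
+[
+max conflicts = 200   /* arbitrary illustrative value */
+]
+\end{indentingcode}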
+ +\index{Near functions}\index{Configuration switches} +\agparamheading{near functions}{switch, default off} + +\agparam{Near functions} controls the use of the \agcode{near} keyword +for static functions in your parser. If your parser is to run on a +16-bit 80x86 processor you would want to turn it on. If you are +going to run your parser on some other processor or use a C compiler +that does not support the \agcode{near} keyword you should leave +\agparam{near functions} off. + +\index{Configuration switches}\index{Nest comments}\index{Comments} +\agparamheading{nest comments}{switch, default off} + +Use this switch to allow nested comments in your syntax or +configuration files. It defaults to off, in accordance with the ANSI +standard for C. Note that AnaGram scans comments in any embedded C +code as well as in the grammar specification. You may turn this +switch on and off as many times as necessary in a single file. + +\index{Old style}\index{Configuration switches} +\agparamheading{old style}{switch, default off} + +\agparam{Old style} controls the function definitions in the code +AnaGram generates. When \agparam{old style} is off, AnaGram generates +ANSI style calling sequences with prototypes as necessary. When +\agparam{old style} is on, it generates old style function definitions, +and no prototypes. It also causes the +\index{Const data}\index{Configuration switch}\agparam{const data} +switch to be ignored. + +\index{Page length}\index{Configuration parameters} +\agparamheading{page length}{integer value, default = 66} + +\agparam{Page length} is an obsolete configuration parameter, +recognized for the sake of compatibility with configuration files +prepared for the DOS version of AnaGram. It is ignored in AnaGram +2.0. 
+ +\index{Parser file name}\index{Configuration parameters}\index{File name} +\agparamheading{parser file name}{string, default = \agcode{"\#.c"}} + +AnaGram creates a parser which consists of all the embedded C code in +your syntax file, the syntax tables created by the syntax analyzer, +and a parsing engine configured to your requirements. This code is +written to a file whose name is given by this parameter. When AnaGram +creates your parser file, it copies the value of the \agparam{parser +file name} parameter, substituting the name of your syntax file for +the ``\agcode{\#}'' character, in order to create the pathname and +extension for the file. You can therefore use this parameter to give +the parser file a particular name, independent of the syntax file +name, or to specify a particular drive or directory where you want the +parser file to reside. Note that if you include a full DOS/Windows +pathname, you must quote the backslash characters. If you are writing +a C++ parser, you can use this parameter to give the output file an +appropriate suffix. + +\index{Parser}\index{Parser name}\index{Configuration parameters} +\agparamheading{parser name}{string, default = \agcode{"\$"}} + +% XXX This should say something other than ``name your parser'' +AnaGram uses the value of \agparam{parser name} to name your parser, +substituting the name (not including the extension) of your syntax +file for a ``\agcode{\$}'' character. If you accept the default value of +\agparam{parser name} and have a syntax file called \agfile{ana.syn}, +AnaGram will name your parser \agcode{ana}. + +The \index{Initializer}initializer for your parser will have the same +name preceded by \agcode{init{\us}}. In the above example, the +initializer would be called \agcode{init{\us}ana}.
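The substitution used by \agparam{parser file name} and
\agparam{parser name} can be sketched as a small C function. This is a
hypothetical illustration, not AnaGram's own code: the function
\agcode{expand{\us}name} is invented here, and it assumes that every
occurrence of the marker character is replaced by the syntax file name
without its extension.

```c
#include <stddef.h>
#include <string.h>

/* Copy tmpl into out, replacing each occurrence of marker with stem.
   Returns 0 on success, or -1 if out (outsz bytes) is too small.
   Hypothetical helper; replacing every occurrence is an assumption. */
static int expand_name(const char *tmpl, char marker, const char *stem,
                       char *out, size_t outsz)
{
    size_t used = 0;
    size_t stem_len = strlen(stem);

    for (; *tmpl != '\0'; tmpl++) {
        if (*tmpl == marker) {
            if (used + stem_len >= outsz)
                return -1;
            memcpy(out + used, stem, stem_len);
            used += stem_len;
        } else {
            if (used + 1 >= outsz)
                return -1;
            out[used++] = *tmpl;
        }
    }
    if (outsz == 0)
        return -1;
    out[used] = '\0';
    return 0;
}
```

With the stem \agcode{ana}, the template \agcode{"\#.c"} and marker
``\agcode{\#}'' give \agcode{ana.c}, and the template
\agcode{"init{\us}\$"} and marker ``\agcode{\$}'' give the initializer
name \agcode{init{\us}ana} from the example above.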
+ +\index{Configuration parameters}\index{Stack}\index{Parser stack alignment} +\agparamheading{parser stack alignment}{c data type, default = \agcode{int}} + +\agparam{Parser stack alignment} is used to control byte alignment of +the parser stack, \agcode{PCB.vs}. AnaGram normally adds a field of +the specified data type to the \agcode{union} declaration that defines +the data type for the parser stack. This parameter can be used to +deal with byte alignment problems when a parser is to be run on a +processor with byte alignment restrictions. For instance, if your +grammar has tokens of type \agcode{double} and your processor requires +double precision variables to be properly aligned, you can include the +following statement in a configuration section in your grammar or in +your configuration file: +\begin{indentingcode}{0.4in} +parser stack alignment = double +\end{indentingcode} +If the data type is \agcode{void}, no alignment declaration will be +made. +% You will not need to change this parameter if your parser is to +% run on a PC or compatible processor. +% +% XXX this really ought to be updated for the century of the fruitbat + +\index{Configuration parameters}\index{Parser stack size} +\agparamheading{parser stack size}{integer value, default = 32} + +\agparam{Parser stack size} is used to set the sizes of the parser +stacks in your parser control block. When AnaGram analyzes your +grammar, it determines the minimum amount of stack space required for +the deepest left recursion. To this depth it adds one half the value +of the \agparam{parser stack size} parameter. It then sets the actual +stack size to the larger of this value and the \agparam{parser stack +size} parameter. If you find the default of 32 wastefully large or +dangerously small, you can set \agparam{parser stack size} to suit the +needs of your particular parser.
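The stack size rule just described amounts to a short calculation,
sketched here in C. The function name
\agcode{actual{\us}stack{\us}size} and its parameter names are
hypothetical, and the use of integer division for ``one half'' is an
assumption that matters only for odd parameter values.

```c
/* Stack size chosen for a grammar whose deepest left recursion needs
   min_depth entries, given the parser stack size parameter.
   Hypothetical helper; integer halving is an assumption. */
static int actual_stack_size(int min_depth, int parser_stack_size)
{
    int padded = min_depth + parser_stack_size / 2;
    return padded > parser_stack_size ? padded : parser_stack_size;
}
```

With the default of 32, a grammar needing a depth of 10 gets a stack of
32 entries, while one needing a depth of 20 gets 36. The parameter
therefore acts both as a floor and, through its half, as headroom above
the minimum depth.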
+ +\index{Pointer input}\index{Configuration switches} +\agparamheading{pointer input}{switch, default off} + +When you turn \agparam{pointer input} on, you tell AnaGram that the +input to your parser is in memory and can be scanned simply by +incrementing a pointer. Before calling your parser you should make +sure that the \agcode{pointer} field in your parser control block is +properly initialized to point to the first character or token in your +input. + +Use the parameter +\index{Pointer type}\index{Configuration parameters}\agparam{pointer type} +to specify the type of the pointer. The default value of +\agparam{pointer type} is \agcode{unsigned char *}. + +\index{Pointer type}\index{Configuration parameters} +\agparamheading{pointer type}{c data type, default = \agcode{unsigned char *}} + +If you have set the \agparam{pointer input} switch, AnaGram will use +the value of the \agparam{pointer type} parameter to declare the +\agcode{pointer} field in your parser control block. + +\index{Print file name}\index{Configuration parameters}\index{File name} +\agparamheading{print file name}{string, default = \agcode{"LPT1"}} + +\agparam{Print file name} is an obsolete configuration parameter, +recognized for the sake of compatibility with configuration files +prepared for the DOS version of AnaGram. It is ignored by AnaGram +2.0. + +\index{Quick reference}\index{Configuration switches} +\agparamheading{quick reference}{switch, default off} + +The \agparam{quick reference} switch is no longer used, but is still +recognized for compatibility's sake. In future versions of AnaGram it +may no longer be recognized. + +\index{Configuration switches}\index{Reduction choices} +\agparamheading{reduction choices}{switch, default off} + +If the \agparam{reduction choices} switch is set when AnaGram builds a +parser, it will include in your parser file a function which can +identify the acceptable choices for the reduction token in the current +state.
You would use this switch only if you were using semantically +determined productions in your grammar and if there were states in +which not all the tokens on the left side of the production were valid +reduction tokens. + +\index{Rule coverage}\index{Configuration switches}\index{Coverage} +\agparamheading{rule coverage}{switch, default off} + +If you set the \agparam{rule coverage} switch, AnaGram will include +code in your parser to count the number of times your parser identifies +each rule in your grammar. To maintain the counts, AnaGram declares, +at the beginning of your parser, an integer array, whose name is +created by appending \agcode{{\us}nrc} to the name of your parser. The +array contains one counter for each rule you have defined in your +grammar. There are no entries for the auxiliary rules that AnaGram +creates to deal with set overlaps or disregard statements. In order +to identify every rule that the parser reduces in the course of +execution, AnaGram +has to turn off certain optimization features in your parser. +Therefore, a parser that has the \agparam{rule coverage} switch +enabled will run slightly slower than one with the switch off. An +entry on the \agmenu{Browse} menu allows you to view the coverage data. +% XXX See Chapter ???. + +\index{Tab spacing}\index{Configuration parameters} +\agparamheading{tab spacing}{integer value, default = 8} + +\agparam{Tab spacing} controls the expansion of tabs when AnaGram +displays your syntax file or the \agwindow{File Trace} test file. + +The value of \agparam{tab spacing} is also used to set the default +value of the \index{TAB{\us}SPACING}\index{Macros}\agcode{TAB{\us}SPACING} +macro in your parser. + +The default value of \agparam{tab spacing} is 8. If you prefer a +different value, you should probably include an appropriate statement +in your configuration file. 
For example: + +\begin{indentingcode}{0.4in} +tab spacing = 2 +\end{indentingcode} + +\index{Test file binary}\index{Configuration switch} +\agparamheading{test file binary}{switch, default off} + +\agparam{Test file binary} causes \agwindow{File Trace} to read test +files in binary mode. When \agwindow{File Trace} reads a test file, +it normally reads it in text mode, which in Windows causes carriage return +characters to be stripped out. Occasionally it is necessary to test a +grammar where carriage return characters are important and should not +be stripped. In this situation, set \agparam{test file binary} to on, +and the carriage return characters will not be discarded. +% XXX rewrite the second half of this paragraph? + +\index{Test file mask}\index{Configuration parameters} +\agparamheading{test file mask}{string, default = \agcode{"*.*"}} + +% XXX default should be ``*'' on unix +AnaGram uses \agparam{test file mask} to filter the pick list of test +files when you use the +\index{File Trace}\index{Trace}\index{Window}\agwindow{File Trace} +feature. +You may set it to any value you wish, including a pathname. +% XXX: test this +For instance, if you know that all your test files are in the directory +\agfile{C:{\bs}PROJECT{\bs}SOURCE} and have +extension \agfile{.FOO} you could set test file mask to +\agcode{"C:{\bs\bs}PROJECT{\bs\bs}SOURCE{\bs\bs}*.FOO"}. +Note that, as in any string literal, backslash characters must be +escaped. + +\index{Test range}\index{Configuration switches}\index{Range} +\agparamheading{test range}{switch, default off} +% XXX should this really default to off? + +When \agparam{test range} is on, AnaGram will insert code in your +parser to make sure all input characters or token identifiers are +within the range specified in your grammar. If you do not turn this +switch on, your parser will run slightly faster, but its behavior will +be undefined if it gets input outside the range you have specified +in your grammar. 
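To see why out-of-range input leads to undefined behavior when
\agparam{test range} is off, consider a table-driven lookup of the kind
a parser might use to classify input characters. The following C
fragment is a hypothetical sketch, not the code AnaGram generates: the
range 32 to 126, the token codes, and the function \agcode{classify}
are all invented for illustration, and the character codes are assumed
to be ASCII.

```c
/* Hypothetical input range and token codes, for illustration only. */
enum { FIRST_CHAR = 32, LAST_CHAR = 126 };
enum { TOKEN_OUT_OF_RANGE = -1, TOKEN_OTHER = 0, TOKEN_PLUS = 1, TOKEN_MINUS = 2 };

/* One entry per character in the range; unlisted entries are TOKEN_OTHER. */
static const signed char token_table[LAST_CHAR - FIRST_CHAR + 1] = {
    ['+' - FIRST_CHAR] = TOKEN_PLUS,
    ['-' - FIRST_CHAR] = TOKEN_MINUS
};

/* Classify one input character. The range test is the kind of guard the
   test range switch is described as adding; without it, a character
   outside the range would index past the ends of token_table. */
static int classify(int c)
{
    if (c < FIRST_CHAR || c > LAST_CHAR)
        return TOKEN_OUT_OF_RANGE;
    return token_table[c - FIRST_CHAR];
}
```

The guard costs one comparison pair per input character, which is
consistent with the slight speed difference mentioned above.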
+ +\index{Token names}\index{Configuration switches} +\agparamheading{token names}{switch, default off} + +When \agparam{token names} is set, AnaGram includes a static array of +ASCII strings in your parser containing the names of your tokens. The +name of this array is \agcode{\#{\us}token{\us}names} where the +``\agcode{\#}'' character is replaced with the name of your parser. +The entry for tokens which do not have names is an empty string: +\agcode{""}. + +\index{Top margin}\index{Configuration parameters} +\agparamheading{top margin}{integer value, default = 3} + +\agparam{Top margin} is an obsolete configuration parameter, +recognized for the sake of compatibility with configuration files +prepared for the DOS version of AnaGram. It is ignored by AnaGram +2.0. + +\index{Traditional engine}\index{Configuration switches} +\agparamheading{traditional engine}{switch, default off} + +Traditional LALR-1 parsers use a parsing engine which has only four +actions: shift, reduce, accept, and error. AnaGram, in the interests +of faster execution and more compact tables, uses a parsing engine +with a number of short-cut actions. The \agparam{traditional engine} +switch tells AnaGram not to use the short-cut actions. + +You would set this switch primarily in conjunction with use of the +\index{Grammar Trace}\index{Trace}\index{Window}\agwindow{Grammar Trace} +in order to have a clearer idea of what is happening. AnaGram will +then be using the same parsing actions as textbook parsers. Note that +if a lookahead token has already been selected, AnaGram will display +it on the last line of the \agwindow{Parser Stack} pane in the +\agwindow{Grammar Trace} window. +% XXX what is this note doing here? + +You should turn this switch back off when you have finished debugging +or your parser will be larger and slower than necessary. + +% XXX: say that in production code traditional engine is not useful +% and only serves to slow things down. 
+ +\index{Video mode}\index{Configuration parameters} +\agparamheading{video mode}{integer value, default = $-$1} + +\agparam{Video mode} is an obsolete configuration parameter, +recognized for the sake of compatibility with configuration files +prepared for the DOS version of AnaGram. It is ignored by AnaGram +2.0.