diff doc/manual/cfp.tex @ 0:13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
author: David A. Holland
date: Sat, 22 Dec 2007 17:52:45 -0500
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/manual/cfp.tex	Sat Dec 22 17:52:45 2007 -0500
@@ -0,0 +1,952 @@
+\chapter{Configuration Parameters}
+\index{Configuration parameters}\index{Parameters}
+
+\agterm{Configuration parameters} are named constants that control the
+way AnaGram works.  AnaGram ignores case\index{Case sensitivity} when
+it looks up the names of configuration parameters, so that
+\agcode{parser name} and \agcode{Parser Name} both refer to the same
+parameter.  Configuration parameters that have only true/false or
+on/off values are often referred to as
+\index{Configuration switches}\agterm{configuration switches}.
+
+Configuration parameters are used to control:
+
+\begin{itemize}
+\item Comment nesting
+\item Grammar analysis
+\item Parser generation
+\end{itemize}
+
+Every configuration parameter has a default value which has been
+chosen to correspond to a standard if it exists, customary usage if
+such can be determined, or otherwise to the most likely usage.
+
+Configuration parameters may be specified either in
+\index{Configuration file}\index{File}\agterm{configuration files},
+always named \agfile{AnaGram.cfg}, or in a syntax file.  A
+configuration file is a normal ASCII file containing parameter
+specifications.  The syntax of a configuration file is the same as
+that of a configuration section within a syntax file, except that a
+configuration file does not have the brackets ( \agcode{[ ]} ) that
+enclose a configuration section in a syntax file.  You may comment the
+configuration file freely, just as though it were a syntax file.
+% XXX ``configuration section'' is a forward reference and we should
+% rearrange all this so it isn't.
+
+% Parameters can be set in either a configuration file or in your syntax
+% file.
+Apart from the \agparam{nest comments} switch, if a parameter
+is specified more than once, only the last value is used (see below).
+The \agparam{nest comments} switch, which affects the way AnaGram
+reads your configuration and syntax files, takes effect as soon as
+AnaGram encounters it in a file and stays in effect unless it is later
+turned off.
+
+% XXX this should be belabored less. Also, good practice dictates that
+% if you ship a project or a grammar it should compile in someone
+% else's environment, and we shouldn't encourage people to do things
+% like put \agparam{pointer input} in a systemwide AnaGram.cfg.
+%
+% XXX also in the Unix world it ought to read
+% /usr/local/etc/AnaGram.cfg and then also ~/.AnaGram.cfg - or
+% something like that. And it ought to be possible to set params
+% on the agcl command line. We need to think about this. (Well, 
+% there's not really any valid use for either, so perhaps it 
+% doesn't matter.)
+%
+% How about something like
+%
+% Support for a global configuration file dates from the DOS-based
+% AnaGram 1.x, where the same configuration mechanism was used to
+% establish user interface preferences. AnaGram 2.0 and above handle
+% preferences separately, and the configuration system is only used
+% for code-related options. Since good practice dictates that code
+% should continue to work if exported outside of one's personal
+% environment, there are few or no legitimate uses of the global
+% configuration file and support for it will likely be removed in a
+% future AnaGram release.
+%
+% (But there really should be support for params on the agcl command
+% line; if nothing else it would make it a lot easier to test
+% combinations of settings.)
+%
+On initialization, AnaGram checks the directory that contains the
+AnaGram executable file.  If it finds \agfile{AnaGram.cfg}, it reads it
+and sets internal parameters accordingly.  It then looks for 
+\agfile{AnaGram.cfg} in your working directory and, if it finds it, reads
+it in turn.  If any parameter is set in both files, the last setting
+wins.  The effect of this two-stage process is to allow you to set
+your standard preferences in the principal directory, with specific
+overrides in your working directories.  You may also put configuration
+parameters in your syntax file, which override the settings in the
+configuration files.  Note that neither configuration file is
+necessary.
+
+Before executing an Analyze Grammar or Build Parser command, AnaGram
+resets configuration parameters to their initial values, as determined
+by the built-in defaults and the configuration files read at program
+initialization.
+
+There are, therefore, four levels at which parameters may be set. At
+the first level, there are the settings built into AnaGram.  If you
+don't like some of these, you can override them with a configuration
+file at the second level, the tools directory where you installed
+AnaGram.  If a particular project needs overrides, you can put them in
+a configuration file at the third level, the working directory for
+this project.  And if you have specific configuration requirements for
+a particular parser, the best place for them is the fourth level, the
+syntax file for the parser.
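+
+As an illustrative sketch of these levels (the values are
+hypothetical, chosen only to show the precedence), suppose the
+\agfile{AnaGram.cfg} in the AnaGram directory contains
+
+\begin{indentingcode}{0.4in}
+parser stack size = 100
+\end{indentingcode}
+the \agfile{AnaGram.cfg} in the working directory contains
+
+\begin{indentingcode}{0.4in}
+parser stack size = 200
+\end{indentingcode}
+and the syntax file contains, enclosed in brackets,
+
+\begin{indentingcode}{0.4in}
+{[}
+parser stack size = 400
+{]}
+\end{indentingcode}
+A parser built from this syntax file would then use 400.  A second
+syntax file in the same working directory that does not mention
+\agparam{parser stack size} would use 200, and the built-in default
+would apply only if none of the three files set the parameter.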
+
+For all of this flexibility, some people prefer to set every
+configuration parameter explicitly in their syntax files so there is
+no question as to what setting is being used.  AnaGram is set up so
+you can do it whichever way you prefer.
+
+If you are uncertain as to the actual parameters that AnaGram is using
+at any time, the
+\index{Configuration Parameters}\index{Window}
+\agwindow{Configuration Parameters} window listed in the
+\agmenu{Windows} menu will show you the current state of all
+parameters.
+
+The different varieties of configuration parameters are described
+below.  Each definition of a parameter must start on a new line.  A
+configuration file is just a sequence of parameter definitions, each
+on a separate line.  Blank lines can be used as separators where you
+please, and comments may be used as described for syntax files.
+Case\index{Case sensitivity} is ignored for parameter names (but not
+for the whole definition).  In a syntax file, each set of definitions
+must be enclosed with brackets ( \agcode{[ ]} ), forming a
+\index{Configuration section}\agterm{configuration section}, one of
+the four kinds of AnaGram statements.  Configuration sections can be
+scattered throughout a syntax file, but each section should begin on a
+new line, and following statements should also of course start on new
+lines.  There is no restriction on the number of sections, or on the
+number of times a parameter appears.  The last setting of a parameter
+wins.
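+
+As a minimal sketch of these rules, with hypothetical values, a
+syntax file might begin with the configuration section
+
+\begin{indentingcode}{0.4in}
+{[}
+parser stack size = 50
+header file name = "widget.h"
+{]}
+\end{indentingcode}
+and, further down, contain a second section
+
+\begin{indentingcode}{0.4in}
+{[}
+parser stack size = 100
+{]}
+\end{indentingcode}
+In this case the parser would be built with a stack size of 100,
+since the later setting wins, while the header file name from the
+first section remains in force.  The same definitions, written
+without the brackets, would form a valid configuration file.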
+
+The first variety of configuration parameter is a simple
+\index{Switches}\index{Configuration switches}switch that controls
+one of the various features of AnaGram.  Such parameters are also called
+\agterm{configuration switches}.  They need simply be stated to set the
+condition (turn it on) or negated with the tilde (\agcode{\~{}}) to
+reset the condition (turn it off). Thus
+
+\begin{indentingcode}{0.4in}
+nest comments
+\end{indentingcode}
+causes AnaGram to allow nested comments, and
+
+\begin{indentingcode}{0.4in}
+\~{}nest comments
+\end{indentingcode}
+causes AnaGram to disallow nested comments.
+
+You may also set or reset configuration switches with explicit on or
+off values:
+
+\begin{indentingcode}{0.4in}
+nest comments = on
+nest comments = off
+\end{indentingcode}
+
+A second variety of configuration parameter takes a value which is the
+name of a token.  Thus
+
+\begin{indentingcode}{0.4in}
+grammar token = c grammar
+\end{indentingcode}
+specifies that the token \agcode{c grammar} is the grammar that
+AnaGram should use as the starting point for analyzing your grammar.
+
+A third variety of configuration parameter takes a value which is a C
+or C++ data type.  Thus
+
+\begin{indentingcode}{0.4in}
+default token type = unsigned char *
+\end{indentingcode}
+signifies that the value of a token, unless otherwise specified, is a
+pointer to an \agcode{unsigned char}.  AnaGram does not accept the
+full panoply of C and C++ \index{Data type}data types.  The
+restrictions are that AnaGram does not allow specification of array or
+function types, nor explicit structure types.  Types that are defined
+with typedef statements, structure definitions, or class definitions,
+including template classes, in your embedded C or C++ are acceptable.
+If you have more complex data types, you should define a simple name
+using a typedef statement.
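+
+For instance, assuming your embedded C contains a typedef such as
+the following (the names \agcode{node} and \agcode{NodePtr} are
+hypothetical),
+
+\begin{indentingcode}{0.4in}
+typedef struct node *NodePtr;
+\end{indentingcode}
+you could then refer to the pointer type by its simple name:
+
+\begin{indentingcode}{0.4in}
+default token type = NodePtr
+\end{indentingcode}
+Writing \agcode{struct node *} directly as the value would run afoul
+of the restriction on explicit structure types.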
+
+A fourth variety of configuration parameter takes a string value to
+set an ASCII string used by AnaGram.  Thus
+
+\begin{indentingcode}{0.4in}
+header file name = "widget.h"
+\end{indentingcode}
+signifies that the header file created by AnaGram should be called
+\agfile{widget.h}.  In
+those strings which are used to name the parser or files which AnaGram
+builds, the character ``\agcode{\#}'' is used to indicate that AnaGram
+should substitute the name of your syntax file.  In strings used to
+determine the names of program variables or functions, ``\agcode{\$}''
+is used to indicate that AnaGram should substitute the name of your
+parser. When building enumeration constants for the names of the
+tokens in your grammar, ``\agcode{\%}'' will be replaced by the name
+of the token.
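+
+As a sketch of the ``\agcode{\#}'' substitution, assume a syntax file
+with the hypothetical name \agfile{widget.syn}.  The setting
+
+\begin{indentingcode}{0.4in}
+header file name = "\#.hpp"
+\end{indentingcode}
+would then cause the header file to be called \agfile{widget.hpp},
+and the same line would yield a different header name for each
+syntax file it is applied to.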
+
+The final variety of configuration parameter takes a numeric value.
+The value may be decimal, octal or hexadecimal, following the C
+conventions, and may have an optional sign.  Thus
+
+\begin{indentingcode}{0.4in}
+parser stack size = 50
+\end{indentingcode}
+tells AnaGram to allocate space for at least fifty stack entries when
+it creates your parser.
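+
+Because the C conventions apply, a leading \agcode{0} marks an octal
+value and a leading \agcode{0x} marks a hexadecimal value.  As a
+short illustration, each of the following lines specifies the same
+value, fifty:
+
+\begin{indentingcode}{0.4in}
+parser stack size = 50
+parser stack size = 062
+parser stack size = 0x32
+\end{indentingcode}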
+
+If AnaGram does not recognize a parameter, it will give you a warning
+with line number, column number, and the message ``no such
+parameter''.  If the value for a parameter is inappropriate, such as a
+string value for a parameter which should have a numeric value, the
+message will be ``inappropriate value''.  If the error occurs in the
+configuration file found in the AnaGram directory, AnaGram will prefix
+the warning with the complete path name for the file.  If the error
+occurs in the configuration file in your working directory, AnaGram
+will prefix the warning with ``AnaGram.cfg:''. If AnaGram encounters a
+syntax error while reading a configuration file, it will honor the
+parameter settings it found before the syntax error, but will ignore
+everything that follows the error.
+
+\section{Alphabetic Listing of Configuration Parameters}
+
+\index{Configuration switches}\index{Allow macros}\index{Macros}
+\agparamheading{allow macros}{switch, default on}
+
+When this switch is set, i.e., on, reduction procedures will be
+implemented as macros if they are sufficiently simple.  This makes
+your parser somewhat more compact and faster but makes it somewhat
+more difficult to debug.  It's a good idea to turn this switch off for
+debugging.
+
+\index{Configuration switches}\index{Auto init}
+\agparamheading{auto init}{switch, default on}
+
+This switch controls the initialization of any parser that is not
+\agparam{event driven}.  When it is on, the
+\index{Initializer}initializer for your parser is automatically called
+every time the parser is called.
+This is the normal situation.  On occasion, however, it
+is desirable to call a parser several times without reinitializing it.
+In this case, you may set the \agparam{auto init} parameter to off.
+Should you do this, you must call the initializer yourself whenever
+appropriate.
+% XXX characterize the occasion...
+
+When \agparam{event driven} is set, \agparam{auto init} has no effect.
+
+\index{Configuration switches}\index{Auto resynch}
+\agparamheading{auto resynch}{switch, default off}
+
+Setting this switch causes AnaGram to include an automatic
+resynchronization procedure in the parser.  The resynchronization
+procedure will be invoked upon encountering a syntax error and will
+skip over input until it finds input characters or tokens consistent
+with its state at the time of the error.  The purpose of the
+resynchronization procedure is to provide a simple way for your parser
+to proceed in the event of syntax errors so that it can find more than
+one syntax error on a given pass.  The resynchronization procedure
+uses a heuristic based on your own syntax.  AnaGram itself uses this
+technique to resynchronize after syntax errors in its input.
+
+A disadvantage to using this resynchronization technique is that the
+resynchronization procedure turns off all reduction procedures.  The
+reason is that the resynchronization may cause a number of reduction
+procedures to be skipped.  This means that the parameters for any
+reduction procedures that might be called later would be suspect and
+could cause serious problems.  It seems more prudent simply to shut
+them down. Semantically determined productions will subsequently, of
+course, always use the default reduction token.
+
+If you have a
+\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR}
+macro, it will be called \emph{before} the resynchronization
+process. It will also be called on subsequent syntax errors, so your
+program will not lose control entirely.
+
+If you use the auto resynchronization procedure, you must also specify
+the \agparam{eof token} configuration parameter (see below) so that
+the synchronizer doesn't inadvertently try to pass over the end of
+file.
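+
+A minimal sketch of such a configuration follows.  The token name
+\agcode{end of input} is hypothetical and stands for whatever
+terminal token marks the end of file in your grammar:
+
+\begin{indentingcode}{0.4in}
+auto resynch
+eof token = end of input
+\end{indentingcode}
+If your grammar already has a terminal token called \agcode{eof},
+the second line should not be needed.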
+
+For other methods of recovering from syntax errors, see Chapter 9.
+
+\index{Configuration switches}\index{Backtrack}
+\agparamheading{backtrack}{switch, default on}
+
+If your parser does not continue after encountering a syntax error,
+you can speed up your parser and make it a little smaller by turning
+off the \agparam{backtrack} switch.  If \agparam{backtrack} is on,
+AnaGram configures your parser so that in case of syntax error it can
+undo any default reductions it might have made as a consequence of the
+erroneous input.  The purpose of such an undo function is to identify
+the proper error frame and to maximize the probability of being able
+to recover gracefully.
+
+% XXX shouldn't these be indexed as ``obsolete parameters'' or
+% something, with xrefs so if you look up ``Bottom margin'' in the
+% index it says ``see ``obsolete parameters''''?
+%
+% Also, shouldn't the various obsolete parameters be described with
+% the same text?
+%
+\index{Configuration parameters}\index{Bottom margin}
+\agparamheading{bottom margin}{integer value, default = 3}
+
+This is an obsolete parameter which was used in the DOS version of
+AnaGram.  It is no longer used, but is still recognized for the sake
+of compatibility.
+
+\index{Configuration switches}\index{Bright background}
+\agparamheading{bright background}{switch, default on}
+
+This configuration switch is not used in AnaGram 2.0.  It is retained
+for compatibility with configuration files used with the DOS versions
+of AnaGram.
+
+\index{Configuration switches}\index{Case sensitive}
+\index{Case sensitivity}
+\agparamheading{case sensitive}{switch, default on}
+
+Use this switch to control how your parser deals with distinctions
+between upper and lower case.  When \agparam{case sensitive} is on,
+AnaGram builds a parser which distinguishes upper from lower case.
+When this switch is off, AnaGram builds a parser which ignores case
+for all input.  This does not mean that the values of character set
+tokens are not case sensitive.  Although 'a' and 'A' would map to the
+same token, the values would still be lower and upper case
+respectively.
+
+% XXX the last bit could be explained more clearly. (something like
+% ``parsers still preserve case'')
+
+% XXX this should discuss character sets, locales, and other such
+% garbage.
+
+\index{Configuration parameters}\index{Compile command}
+\agparamheading{compile command}{string, default = \agcode{NULL}}
+
+This parameter is retained only for compatibility with the DOS version
+of AnaGram.  It is ignored in the Windows version.
+
+\index{Configuration switches}\index{Const data}
+\agparamheading{const data}{switch, default on}
+
+The \agparam{const data} switch controls the use of \agcode{const}
+qualifiers in generated C code.  If the switch is on, all fixed data
+arrays in the parser file will be qualified as \agcode{const}.  The
+\agparam{const data} switch is ignored if the \agparam{old style}
+switch is set.
+
+\index{Configuration parameters}\index{Context type}
+%XXX: \index{context tracking} ?
+\agparamheading{context type}{c data type, no default}
+
+By default, \agparam{context type} is undefined.  If you assign the
+name of a C data type, AnaGram will implement ``context tracking'' in
+your parser.  See Chapter 9.  The data type name can be either a
+standard, pre-defined data type or one which you create with a
+\agcode{typedef} statement.
+
+\index{Configuration parameters}\index{Coverage file name}
+\index{File extension}\index{nrc}
+\agparamheading{coverage file name}{string, default = \agcode{"\#.nrc"}}
+
+If you set the \agparam{rule coverage} configuration switch, AnaGram
+will provide functions in your parser to read and write rule counts to
+a file.  The name of the file will be determined by \agparam{coverage
+file name}.  The name of your syntax file will be substituted for the
+``\agcode{\#}'' character.
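+
+As a sketch, assuming a syntax file with the hypothetical name
+\agfile{widget.syn}, the settings
+
+\begin{indentingcode}{0.4in}
+rule coverage
+coverage file name = "\#.cov"
+\end{indentingcode}
+would direct the rule counts to \agfile{widget.cov} instead of the
+default \agfile{widget.nrc}.  The extension \agfile{.cov} is
+arbitrary and used here only for illustration.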
+
+\index{Configuration switches}\index{Declare pcb}
+% XXX \index{Parser control block} ?
+\agparamheading{declare pcb}{switch, default on}
+
+When AnaGram builds a parser, it checks the status of the
+\agparam{declare pcb} switch. If it is on, AnaGram declares a parser
+control block for you.  AnaGram creates the name of the control block
+variable by appending \agcode{{\us}pcb} to the name of your parser.
+AnaGram will also code an \agcode{\#include} statement to include your
+parser header file, and will define the \agcode{PCB} macro for you.
+If you wish to declare the parser control block yourself you should
+turn this switch off.
+
+\index{Configuration parameters}\index{Default input type}
+\index{Input type}
+% XXX: \index{Types} ?
+\agparamheading{default input type}{c data type, default = \agcode{int}}
+
+This parameter tells AnaGram what data type to assume for terminal
+tokens if they are not explicitly declared.  Normally, you would
+explicitly declare terminal tokens only when you have set the
+\agparam{input values} configuration switch.  The default type for
+nonterminal tokens is given by \agparam{default token type}.
+
+\index{Configuration switches}\index{Default reductions}\index{Reduction}
+\agparamheading{default reductions}{switch, default on}
+
+If in a given parser state there is only one production that could
+possibly be reduced, it is usually faster to reduce it on any input than
+to check specifically for correct input before reducing it.  The only
+time this default reduction causes trouble is in the event of
+erroneous input.  In this situation you may get an erroneous
+reduction.  Normally when you are parsing a file, this is
+inconsequential because you are not going to continue semantic action
+in the presence of error.  But, if you are using your parser to handle
+real-time interactive input, you have to be able to continue semantic
+processing after notifying your user that he has entered erroneous
+input.  In this case you would want to turn the \agparam{default
+reductions} switch off so that productions are reduced only when there
+is correct input.
+
+\index{Configuration parameters}\index{Default token type}\index{Token}
+% XXX \index{Types} ?
+\agparamheading{default token type}{c data type, default = \agcode{void}}
+
+This parameter takes a C data type as its value.  It is used to set
+the data type for the semantic values of nonterminal tokens whose type
+is not explicitly specified in the grammar.  To set the default type
+for terminal tokens use \agparam{default input type}.
+
+\index{Diagnose errors}\index{Configuration switches}
+\agparamheading{diagnose errors}{switch, default on}
+
+If you set this switch, AnaGram will include a syntax error diagnostic
+procedure in your parser. This procedure will be called before your
+\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro is
+called.  It will store a pointer to a string in the
+\agcode{error{\us}message} field of your parser control
+block.  The string will contain a diagnostic message.  If there is
+only one syntactically correct input, x, for example, the message will
+be ``Missing x''.  Otherwise it will be ``Unexpected x'' if the input
+is recognizable but incorrect and ``Unexpected input'' otherwise.  If
+the \agparam{error frame} switch has been set, the
+\agcode{error{\us}frame{\us}ssx} and
+\agcode{error{\us}frame{\us}token} fields
+in the parser control block will be set as described in Chapter 9.
+
+% XXX say: diagnose errors causes the token_names[] array to be
+% included in the parser. and index token_names[]...
+
+\index{Distinguish lexemes}\index{Configuration switches}
+% XXX \index{Disregard} ?
+\agparamheading{distinguish lexemes}{switch, default off}
+
+The \agparam{distinguish lexemes} switch has no effect unless a
+disregard token has been defined.  Normally, the disregard token
+(usually white space) is optional between lexemes.  This may lead to
+apparent shift-reduce conflicts if the characters that comprise the
+second of two successive lexemes can be construed as part of the first
+lexeme.  In this situation, turning on the \agparam{distinguish
+lexemes} switch effectively requires a disregard token to separate the
+two lexemes.
+
+\index{Edit command}\index{Configuration parameters}
+\index{File extension}\index{syn}
+\agparamheading{edit command}{string, default = \agcode{"ed \#.syn"}}
+
+This parameter is no longer used and is retained only for file
+compatibility with the DOS version of AnaGram.
+
+\index{Enable mouse}\index{Configuration switches}
+\agparamheading{enable mouse}{switch, default on}
+
+This parameter is no longer used and is retained only for file
+compatibility with the DOS version of AnaGram.
+
+\index{Enum constant name}\index{Configuration parameters}
+\agparamheading{enum constant name}{string,
+default = \agcode{"\${\us}\%{\us}token"}}
+
+Use the \agparam{enum constant name} parameter to control the names
+AnaGram uses for the enumeration constants it defines in the
+header file for your parser.  The value of \agparam{enum constant
+name} should be a string containing the ``\agcode{\%}'' character.
+AnaGram will substitute each token name in turn for the
+``\agcode{\%}'' character in this template as it creates the list of
+enumeration constants.  If it finds a ``\agcode{\$}'' character it
+will substitute the name of your parser.
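+
+As an illustration, assume a parser named \agcode{calc} with tokens
+named \agcode{number} and \agcode{eof} (all three names are
+hypothetical).  Following the substitution rules above, the default
+template would produce the constants
+\agcode{calc{\us}number{\us}token} and
+\agcode{calc{\us}eof{\us}token}, whereas
+
+\begin{indentingcode}{0.4in}
+enum constant name = "tok{\us}\%"
+\end{indentingcode}
+would produce \agcode{tok{\us}number} and \agcode{tok{\us}eof}.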
+
+\index{Eof token}\index{Configuration parameters}\index{Token}
+\agparamheading{eof token}{token name, no default}
+
+If you use the auto resynchronization capability of AnaGram, you must
+specify an end of file token explicitly.  You can do this either by
+specifying a terminal token in your grammar called \agcode{eof} or by
+using the \agparam{eof token} parameter to identify some other
+terminal token to be used as the end of file marker.  You would do
+this only if you must use the name \agcode{eof} for some other
+purpose.
+
+\index{Error frame}\index{Configuration switches}
+\agparamheading{error frame}{switch, default off}
+
+AnaGram uses the \agparam{error frame} switch in conjunction with the
+\index{Diagnose errors}\index{Configuration switches}\agparam{diagnose errors}
+switch.  If both are set, when your parser encounters a syntax error,
+before invoking the
+\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro,
+your parser will determine the frame in which the error occurred, that
+is, the production the parser was trying to match at the time of the
+error.
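+
+Since \agparam{error frame} is off by default, a configuration that
+requests error frame information might look like the following
+sketch.  The first line merely restates the default for
+\agparam{diagnose errors}, to make the dependency explicit:
+
+\begin{indentingcode}{0.4in}
+diagnose errors
+error frame
+\end{indentingcode}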
+
+% XXX: See chapter (dd.tex) for a complete discussion.
+
+\index{Configuration parameters}\index{Error token}\index{Token}
+\agparamheading{error token}{token name, no default}
+
+One of your options for error recovery after a syntax error is a
+technique similar to that provided in \agfile{yacc}.  You include a
+terminal token called \agcode{error} in your grammar.  When the parser
+encounters an error in the input it backs up the state stack to the
+most recent state in which \agcode{error} was an acceptable input.  It
+then shifts to the new state as though it had seen an actual
+\agcode{error} token.  At this point, it skips over any character in
+the input which is not an acceptable input character for this state.
+Once it does find an acceptable input character, it continues
+processing as though nothing had happened.  If you wish to use this
+approach and for some reason you wish to use the name \agcode{error}
+for some other token in your grammar, you may use the \agparam{error
+token} parameter to identify some other terminal token in your grammar
+as the ``error token''.
+
+\index{Configuration switches}\index{Error trace}\index{Trace}
+\index{Window}
+\agparamheading{error trace}{switch, default off}
+
+If you turn the \agparam{error trace} switch on, AnaGram will include
+code in your parser so that when it encounters a syntax error it will
+write the contents of the \index{Parser state stack}\index{State
+stack}\index{Stack}parser state stack to a file.  The name of the file
+is the same as the name of your syntax file but with the extension
+\index{File extension}\index{etr}\agfile{.etr}.  You may override this
+definition by defining
+\index{AG{\us}TRACE{\us}FILE{\us}NAME}\index{Macros}\agcode{AG{\us}TRACE{\us}FILE{\us}NAME}
+in your embedded C.
+
+The \agmenu{Error Trace} option in the \agmenu{Action} menu can then
+read this information and prepare a pre-built \agwindow{Grammar Trace}
+showing you the status of the parse at the time of the syntax error.
+You would use this switch primarily when you are first checking out
+your grammar to make sure it accurately represents the input you
+desire to handle.  You would also use it any time your parser
+encounters a syntax error you don't understand.  For more information,
+see Chapter 5.
+
+\index{Escape backslashes}\index{Configuration switches}
+\agparamheading{escape backslashes}{switch, default off}
+
+\agparam{Escape backslashes} is used only in conjunction with the
+\agparam{line numbers} option.  When turned on, it causes the
+backslashes in the pathname generated by the \agparam{line numbers}
+option to be doubled.  This switch has been provided because C and C++
+compilers are not consistent in their handling of backslashes in path
+names.
+
+\index{Event driven}\index{Configuration switches}
+% XXX \index{AG{\us}RUNNING{\us}CODE} ?
+% XXX \index{exit{\us}flag} ?
+\agparamheading{event driven}{switch, default off}
+
+If you turn the \agparam{event driven} switch on, when you build a
+parser, it will be configured as an ``event driven'' parser.  This
+means that after calling its initializer function, you call it once
+with each discrete unit of input.  The parser proceeds until it
+needs more input, finishes the grammar, or encounters an error.  It
+then returns.  The \agcode{exit{\us}flag} field in the parser control
+block is equal to \agcode{AG{\us}RUNNING{\us}CODE} if more input is needed.
+Other values indicate other reasons for termination.
+% XXX crossreference the discussion of exit codes?
+
+When \agparam{event driven} is on, \agparam{auto init} has no effect;
+you must always call the initializer function yourself.
+
+\index{Far tables}\index{Configuration switches}
+\agparamheading{far tables}{switch, default off}
+
+If \agparam{far tables} is on when AnaGram builds a parser, it will
+declare the larger tables it builds as \agcode{far}.  This can be a
+convenience when using some memory models of the 8086 architecture.
+
+\index{Grammar token}\index{Configuration parameters}\index{Token}
+\agparamheading{grammar token}{token name, no default}
+
+The \agparam{grammar token} parameter may be used to specify the
+grammar, or ``goal'', token for the syntax analyzer portion of
+AnaGram.  An alternative method is to append a ``\$'' to the goal
+token when you define it.  You may also simply use the name
+\agcode{grammar} to identify the grammar token.
+
+\index{Header file name}\index{Configuration parameters}\index{File name}
+\agparamheading{header file name}{string, default = \agcode{"\#.h"}}
+
+This parameter names the parser header file AnaGram generates.  The
+contents of the header file are described in Chapter 9.  When AnaGram
+creates the file, it copies the value of \agparam{header file name},
+substituting the name of your syntax file for the ``\agcode{\#}''
+character, in order to create the pathname and extension for the file.
+You can therefore use this parameter to give the header file a
+particular name, independent of the syntax file name, or to specify a
+particular drive or directory where you want the header file to
+reside.  Note that if you include a full DOS/Windows pathname,
+backslash characters must be quoted.
+
+\index{Input values}\index{Configuration switches}
+\agparamheading{input values}{switch, default off}
+
+% XXX this shouldn't say ASCII because it's true even if the
+% characters are some other character set...
+If the input to your parser includes explicit token values which are
+not simply the ASCII values of corresponding ASCII input characters,
+you must set the \agparam{input values} switch to inform AnaGram.
+Unless your parser is \agparam{event driven}, you must also provide
+your own \agcode{GET{\us}INPUT} macro.
+
+\index{Line length}\index{Configuration parameters}
+\agparamheading{line length}{integer value, default = 80}
+
+\agparam{Line length} is an obsolete configuration parameter, recognized
+for the sake of compatibility with configuration files prepared for
+the DOS version of AnaGram.  It is ignored in AnaGram 2.0.
+
+\index{Line numbers}\index{Configuration switches}
+\agparamheading{line numbers}{switch, default off}
+
+If \agparam{line numbers} is set, AnaGram will put syntax file line
+numbers into the generated C code file using the
+\index{\#line}\agcode{\#line}
+directive so that your compiler diagnostics will refer to lines in the
+syntax file rather than in the generated C code file.  If
+\agparam{line numbers} is off, AnaGram will put syntax file line
+numbers in comments.  The
+\index{Line numbers path}\index{Configuration parameters}
+\agparam{line numbers path} and 
+\index{Escape backslashes}\index{Configuration switches}
+\agparam{escape backslashes}
+parameters may be used to control the generation of the line number
+directives.
+
+\index{Line numbers path}\index{Configuration parameters}
+\agparamheading{line numbers path}{string, default = \agcode{NULL}}
+
+When you have set the \agparam{line numbers} switch and
+\agparam{line numbers path} is not NULL, AnaGram uses it in the 
+\agcode{\#line} directive in place of the full path name of your
+syntax file.
+% XXX update for unix where we (maybe) don't generate full pathnames
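+
+As a sketch, assuming a syntax file with the hypothetical name
+\agfile{widget.syn}, the settings
+
+\begin{indentingcode}{0.4in}
+line numbers
+line numbers path = "widget.syn"
+\end{indentingcode}
+would cause the \agcode{\#line} directives in the generated code to
+name \agfile{widget.syn} rather than the full path name of the
+syntax file, which can be convenient if the generated code is
+compiled on a different machine or from a different directory.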
+
+\index{Lines and columns}\index{Configuration switches}
+\agparamheading{lines and columns}{switch, default on}
+
+If this switch is set, AnaGram will incorporate code into your parser
+to track line numbers and column numbers in its input.  At all times,
+the \agcode{line} and \agcode{column} fields in your parser control
+block will mark the location of the current lookahead character.  The
+treatment of tab characters is controlled by the
+\index{TAB{\us}SPACING}\index{Macros}\agcode{TAB{\us}SPACING} macro.
+
+\index{Main program}\index{Configuration switches}
+\agparamheading{main program}{switch, default on}
+
+The \agparam{main program} switch determines what AnaGram does if you
+invoke the Build Parser command, but have no embedded C in your syntax
+file.  If the switch is on, AnaGram creates a main program which does
+nothing but call your parser.  The switch is ignored if your parser
+uses \agparam{pointer input} or is \agparam{event driven}.
+
+\index{Max conflicts}\index{Configuration parameters}\index{Conflicts}
+\agparamheading{max conflicts}{integer value, default = 50}
+
+\agparam{Max conflicts} limits the number of conflicts AnaGram will
+record.  Sometimes, a simple editing error in your syntax file can
+cause hundreds of conflicts, which you don't need to see in gory
+detail.  If you have a grammar that is in serious trouble and you want
+to see more conflicts, you may change \agparam{max conflicts} to suit
+your needs.
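+
+For example, to record up to 200 conflicts instead of the default 50,
+a statement such as the following could be used.  The value 200 is
+arbitrary and chosen only for illustration.
+\begin{indentingcode}{0.4in}
+max conflicts = 200
+\end{indentingcode}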
+
+\index{Near functions}\index{Configuration switches}
+\agparamheading{near functions}{switch, default off}
+
+\agparam{Near functions} controls the use of the \agcode{near} keyword
+for static functions in your parser.  If your parser is to run on a
+16-bit 80x86 processor you would want to turn it on.  If you are
+going to run your parser on some other processor or use a C compiler
+that does not support the \agcode{near} keyword you should leave
+\agparam{near functions} off.
+
+\index{Configuration switches}\index{Nest comments}\index{Comments}
+\agparamheading{nest comments}{switch, default off}
+
+Use this switch to allow nested comments in your syntax or
+configuration files.  It defaults to off, in accordance with the ANSI
+standard for C.  Note that AnaGram scans comments in any embedded C
+code as well as in the grammar specification.  You may turn this
+switch on and off as many times as necessary in a single file.
+
+\index{Old style}\index{Configuration switches}
+\agparamheading{old style}{switch, default off}
+
+\agparam{Old style} controls the function definitions in the code
+AnaGram generates.  When \agparam{old style} is off, AnaGram generates
+ANSI style calling sequences with prototypes as necessary.  When
+\agparam{old style} is on, it generates old style function definitions,
+and no prototypes.  It also causes the
+\index{Const data}\index{Configuration switches}\agparam{const data}
+switch to be ignored.
+
+\index{Page length}\index{Configuration parameters}
+\agparamheading{page length}{integer value, default = 66}
+
+\agparam{Page length} is an obsolete configuration parameter,
+recognized for the sake of compatibility with configuration files
+prepared for the DOS version of AnaGram.  It is ignored in AnaGram
+2.0.
+
+\index{Parser file name}\index{Configuration parameters}\index{File name}
+\agparamheading{parser file name}{string, default = \agcode{"\#.c"}}
+
+AnaGram creates a parser which consists of all the embedded C code in
+your syntax file, the syntax tables created by the syntax analyzer,
+and a parsing engine configured to your requirements.  This code is
+written to a file whose name is given by this parameter.  When AnaGram
+creates your parser file, it copies the value of the \agparam{parser
+file name} parameter, substituting the name of your syntax file for
+the ``\agcode{\#}'' character, in order to create the pathname and
+extension for the file.  You can therefore use this parameter to give
+the parser file a particular name, independent of the syntax file
+name, or to specify a particular drive or directory where you want the
+parser file to reside.  Note that if you include a full DOS/Windows
+pathname, you must quote the backslash characters.  If writing a C++
+parser you would use this parameter to set the output filename suffix.
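+
+As a sketch, the following statement would give the parser file a
+fixed name with a C++ suffix.  The name \agfile{myparser.cpp} is
+hypothetical.
+\begin{indentingcode}{0.4in}
+parser file name = "myparser.cpp"
+\end{indentingcode}
+Because this value contains no ``\agcode{\#}'' character, the name of
+the syntax file plays no part in the name of the parser file.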
+
+\index{Parser}\index{Parser name}\index{Configuration parameters}
+\agparamheading{parser name}{string, default = \agcode{"\$"}}
+
+% XXX This should say something other than ``name your parser''
+AnaGram uses the value of \agparam{parser name} to name your parser,
+substituting the name (not including the extension) of your syntax
+file for a ``\agcode{\$}'' character.  If you accept the default value of
+\agparam{parser name} and have a syntax file called \agfile{ana.syn},
+AnaGram will name your parser \agcode{ana}.
+
+The \index{Initializer}initializer for your parser will have the same
+name preceded by \agcode{init{\us}}. In the above example, the
+initializer would be called \agcode{init{\us}ana}.
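+
+For instance, assuming a parser that should be called \agcode{calc}
+(a hypothetical name) regardless of the name of the syntax file, the
+configuration statement might read:
+\begin{indentingcode}{0.4in}
+parser name = "calc"
+\end{indentingcode}
+The parser would then be named \agcode{calc} and its initializer
+\agcode{init{\us}calc}.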
+
+\index{Configuration parameters}\index{Stack}\index{Parser stack alignment}
+\agparamheading{parser stack alignment}{c data type, default = \agcode{int}}
+
+\agparam{Parser stack alignment} is used to control byte alignment of
+the parser stack, \agcode{PCB.vs}.  AnaGram normally adds a field of
+the specified data type to the \agcode{union} declaration that defines
+the data type for the parser stack.  This parameter can be used to
+deal with byte alignment problems when a parser is to be run on a
+processor with byte alignment restrictions.  For instance, if your
+grammar has tokens of type \agcode{double} and your processor requires
+double precision variables to be properly aligned, you can include the
+following statement in a configuration section in your grammar or in
+your configuration file:
+\begin{indentingcode}{0.4in}
+parser stack alignment = double
+\end{indentingcode}
+If the data type is \agcode{void}, no alignment declaration will be
+made.
+% You will not need to change this parameter if your parser is to
+% run on a PC or compatible processor.
+%
+% XXX this really ought to be updated for the century of the fruitbat
+
+\index{Configuration parameters}\index{Parser stack size}
+\agparamheading{parser stack size}{integer value, default = 32}
+
+\agparam{Parser stack size} is used to set the sizes of the parser
+stacks in your parser control block.  When AnaGram analyzes your
+grammar, it determines the minimum amount of stack space required for
+the deepest left recursion.  To this depth it adds one half the value
+of the \agparam{parser stack size} parameter.  It then sets the actual
+stack size to the larger of this value and the \agparam{parser stack
+size} parameter.  If you find the default value of 32 wastefully large
+or dangerously small, you can set \agparam{parser stack size} to suit
+the needs of your particular parser.
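+
+Stated as a formula, with $d$ standing for the minimum depth AnaGram
+computes from the grammar and $s$ for the value of
+\agparam{parser stack size} (both symbols are introduced here only
+for illustration), the rule above amounts to
+\[
+\mbox{actual stack size} = \max\left(d + \frac{s}{2},\; s\right).
+\]
+With the default $s = 32$, a grammar for which $d = 10$ gets a stack
+size of $\max(26, 32) = 32$, while one for which $d = 20$ gets
+$\max(36, 32) = 36$.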
+
+\index{Pointer input}\index{Configuration switches}
+\agparamheading{pointer input}{switch, default off}
+
+When you turn \agparam{pointer input} on you tell AnaGram that the
+input to your parser is in memory and can be scanned simply by
+incrementing a pointer.  Before calling your parser you should make
+sure that the \agcode{pointer} field in your parser control block is
+properly initialized to point to the first character or token in your
+input.
+
+Use the parameter
+\index{Pointer type}\index{Configuration parameters}\agparam{pointer type}
+to specify the type of the pointer.  The default value of
+\agparam{pointer type} is \agcode{unsigned char *}.
+
+\index{Pointer type}\index{Configuration parameters}
+\agparamheading{pointer type}{c data type, default = \agcode{unsigned char *}}
+
+If you have set the \agparam{pointer input} switch, AnaGram will use
+the value of the \agparam{pointer type} parameter to declare the
+\agcode{pointer} field in your parser control block.
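+
+A minimal sketch of the two parameters used together follows.  It
+assumes that a switch may be turned on with an explicit \agcode{on}
+value, and the choice of \agcode{char *} is purely illustrative.
+\begin{indentingcode}{0.4in}
+pointer input = on
+pointer type = char *
+\end{indentingcode}
+With these settings the \agcode{pointer} field in the parser control
+block would be declared as \agcode{char *}.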
+
+\index{Print file name}\index{Configuration parameters}\index{File name}
+\agparamheading{print file name}{string, default = \agcode{"LPT1"}}
+
+\agparam{Print file name} is an obsolete configuration parameter,
+recognized for the sake of compatibility with configuration files
+prepared for the DOS version of AnaGram.  It is ignored by AnaGram
+2.0.
+
+\index{Quick reference}\index{Configuration switches}
+\agparamheading{quick reference}{switch, default off}
+
+The \agparam{quick reference} switch is no longer used, but is still
+recognized for compatibility's sake.  In future versions of AnaGram it
+may no longer be recognized.
+
+\index{Configuration switches}\index{Reduction choices}
+\agparamheading{reduction choices}{switch, default off}
+
+If the \agparam{reduction choices} switch is set when AnaGram builds a
+parser, it will include in your parser file a function which can
+identify the acceptable choices for the reduction token in the current
+state.  You would use this switch only if you were using semantically
+determined productions in your grammar and if there were states in
+which not all the tokens on the left side of the production were valid
+reduction tokens.
+
+\index{Rule coverage}\index{Configuration switches}\index{Coverage}
+\agparamheading{rule coverage}{switch, default off}
+
+If you set the \agparam{rule coverage} switch, AnaGram will include
+code in your parser to count the number of times your parser identifies
+each rule in your grammar.  To maintain the counts, AnaGram declares,
+at the beginning of your parser, an integer array, whose name is
+created by appending \agcode{{\us}nrc} to the name of your parser.  The
+array contains one counter for each rule you have defined in your
+grammar.  There are no entries for the auxiliary rules that AnaGram
+creates to deal with set overlaps or disregard statements.  In order
+to identify every rule that the parser reduces in the course of
+execution, AnaGram
+has to turn off certain optimization features in your parser.
+Therefore, a parser that has the \agparam{rule coverage} switch
+enabled will run slightly slower than one with the switch off.  An
+entry on the \agmenu{Browse} menu allows you to view the coverage data.
+% XXX See Chapter ???.
+
+\index{Tab spacing}\index{Configuration parameters}
+\agparamheading{tab spacing}{integer value, default = 8}
+
+\agparam{Tab spacing} controls the expansion of tabs when AnaGram
+displays your syntax file or the \agwindow{File Trace} test file.
+
+The value of \agparam{tab spacing} is also used to set the default
+value of the \index{TAB{\us}SPACING}\index{Macros}\agcode{TAB{\us}SPACING}
+macro in your parser.
+
+The default value of \agparam{tab spacing} is 8.  If you prefer a
+different value, you should probably include an appropriate statement
+in your configuration file. For example:
+
+\begin{indentingcode}{0.4in}
+tab spacing = 2
+\end{indentingcode}
+
+\index{Test file binary}\index{Configuration switches}
+\agparamheading{test file binary}{switch, default off}
+
+\agparam{Test file binary} causes \agwindow{File Trace} to read test
+files in binary mode.  When \agwindow{File Trace} reads a test file,
+it normally reads it in text mode, which in Windows causes carriage return
+characters to be stripped out.  Occasionally it is necessary to test a
+grammar where carriage return characters are important and should not
+be stripped.  In this situation, set \agparam{test file binary} to on,
+and the carriage return characters will not be discarded.
+% XXX rewrite the second half of this paragraph?
+
+\index{Test file mask}\index{Configuration parameters}
+\agparamheading{test file mask}{string, default = \agcode{"*.*"}}
+
+% XXX default should be ``*'' on unix
+AnaGram uses \agparam{test file mask} to filter the pick list of test
+files when you use the
+\index{File Trace}\index{Trace}\index{Window}\agwindow{File Trace}
+feature.
+You may set it to any value you wish, including a pathname.
+% XXX: test this
+For instance, if you know that all your test files are in the directory
+\agfile{C:{\bs}PROJECT{\bs}SOURCE} and have
+extension \agfile{.FOO} you could set \agparam{test file mask} to
+\agcode{"C:{\bs\bs}PROJECT{\bs\bs}SOURCE{\bs\bs}*.FOO"}.
+Note that, as in any string literal, backslash characters must be
+escaped.
+
+\index{Test range}\index{Configuration switches}\index{Range}
+\agparamheading{test range}{switch, default off}
+% XXX should this really default to off?
+
+When \agparam{test range} is on, AnaGram will insert code in your
+parser to make sure all input characters or token identifiers are
+within the range specified in your grammar.  If you do not turn this
+switch on, your parser will run slightly faster, but its behavior will
+be undefined if it gets input outside the range you have specified
+in your grammar.
+
+\index{Token names}\index{Configuration switches}
+\agparamheading{token names}{switch, default off}
+
+When \agparam{token names} is set, AnaGram includes a static array of
+ASCII strings in your parser containing the names of your tokens.  The
+name of this array is \agcode{\#{\us}token{\us}names} where the
+``\agcode{\#}'' character is replaced with the name of your parser.
+The entry for tokens which do not have names is an empty string:
+\agcode{""}.
+
+\index{Top margin}\index{Configuration parameters}
+\agparamheading{top margin}{integer value, default = 3}
+
+\agparam{Top margin} is an obsolete configuration parameter,
+recognized for the sake of compatibility with configuration files
+prepared for the DOS version of AnaGram.  It is ignored by AnaGram
+2.0.
+
+\index{Traditional engine}\index{Configuration switches}
+\agparamheading{traditional engine}{switch, default off}
+
+Traditional LALR-1 parsers use a parsing engine which has only four
+actions: shift, reduce, accept, and error.  AnaGram, in the interests
+of faster execution and more compact tables, uses a parsing engine
+with a number of short-cut actions.  The \agparam{traditional engine}
+switch tells AnaGram not to use the short-cut actions.
+
+You would set this switch primarily in conjunction with use of the
+\index{Grammar Trace}\index{Trace}\index{Window}\agwindow{Grammar Trace}
+in order to have a clearer idea of what is happening.  AnaGram will
+then be using the same parsing actions as textbook parsers.  Note that
+if a lookahead token has already been selected, AnaGram will display
+it on the last line of the \agwindow{Parser Stack} pane in the
+\agwindow{Grammar Trace} window.
+% XXX what is this note doing here?
+
+You should turn this switch back off when you have finished debugging
+or your parser will be larger and slower than necessary.
+
+% XXX: say that in production code traditional engine is not useful
+% and only serves to slow things down.
+
+\index{Video mode}\index{Configuration parameters}
+\agparamheading{video mode}{integer value, default = $-$1}
+
+\agparam{Video mode} is an obsolete configuration parameter,
+recognized for the sake of compatibility with configuration files
+prepared for the DOS version of AnaGram.  It is ignored by AnaGram
+2.0.