\chapter{Configuration Parameters}
\index{Configuration parameters}\index{Parameters}

\agterm{Configuration parameters} are named constants that control the
way AnaGram works.  AnaGram ignores case\index{Case sensitivity} when
it looks up the names of configuration parameters, so that
\agcode{parser name} and \agcode{Parser Name} both refer to the same
parameter.  Configuration parameters that have only true/false or
on/off values are often referred to as
\index{Configuration switches}\agterm{configuration switches}.

Configuration parameters are used to control:

\begin{itemize}
\item Comment nesting
\item Grammar analysis
\item Parser generation
\end{itemize}

Every configuration parameter has a default value which has been
chosen to correspond to a standard if it exists, customary usage if
such can be determined, or otherwise to the most likely usage.

Configuration parameters may be specified either in
\index{Configuration file}\index{File}\agterm{configuration files},
always named \agfile{AnaGram.cfg}, or in a syntax file.  A
configuration file is a normal ASCII file containing parameter
specifications.  The syntax of a configuration file is the same as
that of a configuration section within a syntax file, except that a
configuration file does not have the brackets ( \agcode{[ ]} ) that
enclose a configuration section in a syntax file.  You may comment the
configuration file freely, just as though it were a syntax file.
% XXX ``configuration section'' is a forward reference and we should
% rearrange all this so it isn't.

% Parameters can be set in either a configuration file or in your syntax
% file.
Apart from the \agparam{nest comments} switch, if a parameter
is specified more than once, only the last value is used (see below).
The \agparam{nest comments} switch, which affects the way AnaGram
reads your configuration and syntax files, takes effect as soon as
AnaGram encounters it in a file and stays in effect unless it is later
turned off.

% XXX this should be belabored less. Also, good practice dictates that
% if you ship a project or a grammar it should compile in someone
% else's environment, and we shouldn't encourage people to do things
% like put \agparam{pointer input} in a systemwide AnaGram.cfg.
%
% XXX also in the Unix world it ought to read
% /usr/local/etc/AnaGram.cfg and then also ~/.AnaGram.cfg - or
% something like that. And it ought to be possible to set params
% on the agcl command line. We need to think about this. (Well,
% there's not really any valid use for either, so perhaps it
% doesn't matter.)
%
% How about something like
%
% Support for a global configuration file dates from the DOS-based
% AnaGram 1.x, where the same configuration mechanism was used to
% establish user interface preferences. AnaGram 2.0 and above handle
% preferences separately, and the configuration system is only used
% for code-related options. Since good practice dictates that code
% should continue to work if exported outside of one's personal
% environment, there are few or no legitimate uses of the global
% configuration file and support for it will likely be removed in a
% future AnaGram release.
%
% (But there really should be support for params on the agcl command
% line; if nothing else it would make it a lot easier to test
% combinations of settings.)
%
On initialization, AnaGram checks the directory that contains the
AnaGram executable file.  If it finds \agfile{AnaGram.cfg}, it reads it
and sets internal parameters accordingly.  It then looks for
\agfile{AnaGram.cfg} in your working directory and, if it finds it, reads
it in turn.  If any parameter is set in both files, the last setting
wins.  The effect of this two-stage process is to allow you to set
your standard preferences in the principal directory, with specific
overrides in your working directories.  You may also put configuration
parameters in your syntax file, which override the settings in the
configuration files.  Note that neither configuration file is
necessary.

Before executing an Analyze Grammar or Build Parser command, AnaGram
resets configuration parameters to their initial values, as determined
by the built-in defaults and the configuration files read at program
initialization.

There are, therefore, four levels at which parameters may be set. At
the first level, there are the settings built into AnaGram.  If you
don't like some of these, you can override them with a configuration
file at the second level, the tools directory where you installed
AnaGram.  If a particular project needs overrides, you can put them in
a configuration file at the third level, the working directory for
this project.  And if you have specific configuration requirements for
a particular parser, the best place for them is the fourth level, the
syntax file for the parser.
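
As a hedged illustration of these levels, a project-specific
\agfile{AnaGram.cfg} in a working directory might contain nothing more
than the overrides that project needs.  The particular choices shown
here are hypothetical, not recommendations:

\begin{indentingcode}{0.4in}
/* AnaGram.cfg in the project's working directory (illustrative only) */
\~{}allow macros
parser stack size = 100
\end{indentingcode}
Any parameter not mentioned keeps the value set in the
\agfile{AnaGram.cfg} in the AnaGram directory, if there is one, or else
the built-in default.  A configuration section in the syntax file can
still override any of these settings.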

For all of this flexibility, some people prefer to set every
configuration parameter explicitly in their syntax files so there is
no question as to what setting is being used.  AnaGram is set up so
you can do it whichever way you prefer.

If you are uncertain as to the actual parameters that AnaGram is using
at any time, the
\index{Configuration Parameters}\index{Window}
\agwindow{Configuration Parameters} window listed in the
\agmenu{Windows} menu will show you the current state of all
parameters.

The different varieties of configuration parameters are described
below.  Each definition of a parameter must start on a new line.  A
configuration file is just a sequence of parameter definitions, each
on a separate line.  Blank lines can be used as separators where you
please, and comments may be used as described for syntax files.
Case\index{Case sensitivity} is ignored for parameter names (but not
for the whole definition).  In a syntax file, each set of definitions
must be enclosed with brackets ( \agcode{[ ]} ), forming a
\index{Configuration section}\agterm{configuration section}, one of
the four kinds of AnaGram statements.  Configuration sections can be
scattered throughout a syntax file, but each section should begin on a
new line, and following statements should also of course start on new
lines.  There is no restriction on the number of sections, or on the
number of times a parameter appears.  The last setting of a parameter
wins.
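
For example, a configuration file along the following lines (the values
are chosen purely for illustration) sets \agparam{parser stack size}
twice, the second time with different capitalization.  By the rules
just stated, both lines refer to the same parameter, and the second
setting is the one that takes effect:

\begin{indentingcode}{0.4in}
parser stack size = 50
line numbers
/* a later definition replaces the earlier one */
Parser Stack Size = 200
\end{indentingcode}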

The first variety of configuration parameter is a simple
\index{Switches}\index{Configuration switches}switch that controls
one of the various features of AnaGram.  Such parameters are also called
\agterm{configuration switches}.  They need simply be stated to set the
condition (turn it on) or negated with the tilde (\agcode{\~{}}) to
reset the condition (turn it off). Thus

\begin{indentingcode}{0.4in}
nest comments
\end{indentingcode}
causes AnaGram to allow nested comments, and

\begin{indentingcode}{0.4in}
\~{}nest comments
\end{indentingcode}
causes AnaGram to disallow nested comments.

You may also set or reset configuration switches with explicit on or
off values:

\begin{indentingcode}{0.4in}
nest comments = on
nest comments = off
\end{indentingcode}

A second variety of configuration parameter takes a value which is the
name of a token.  Thus

\begin{indentingcode}{0.4in}
grammar token = c grammar
\end{indentingcode}
specifies that the token \agcode{c grammar} is the grammar token, that
is, the token AnaGram should use as the starting point for analyzing
your grammar.

A third variety of configuration parameter takes a value which is a C
or C++ data type.  Thus

\begin{indentingcode}{0.4in}
default token type = unsigned char *
\end{indentingcode}
signifies that the value of a token, unless otherwise specified, is a
pointer to an \agcode{unsigned char}.  AnaGram does not accept the
full panoply of C and C++ \index{Data type}data types.  The
restrictions are that AnaGram does not allow specification of array or
function types, nor explicit structure types.  Types that are defined
with typedef statements, structure definitions, or class definitions,
including template classes, in your embedded C or C++ are acceptable.
If you have more complex data types, you should define a simple name
using a typedef statement.
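
As a sketch of the typedef approach, suppose your embedded C declares a
function pointer type, here given the hypothetical name
\agcode{Handler}.  Written out in full, such a type would fall under
the restrictions described above:

\begin{indentingcode}{0.4in}
typedef int (*Handler)(int);
\end{indentingcode}
The simple name can then be used wherever a data type is expected, for
instance:

\begin{indentingcode}{0.4in}
default token type = Handler
\end{indentingcode}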

A fourth variety of configuration parameter takes a string value to
set an ASCII string used by AnaGram.  Thus

\begin{indentingcode}{0.4in}
header file name = "widget.h"
\end{indentingcode}
signifies that the header file created by AnaGram should be called
\agfile{widget.h}.  In
those strings which are used to name the parser or files which AnaGram
builds, the character ``\agcode{\#}'' is used to indicate that AnaGram
should substitute the name of your syntax file.  In strings used to
determine the names of program variables or functions, ``\agcode{\$}''
is used to indicate that AnaGram should substitute the name of your
parser. When building enumeration constants for the names of the
tokens in your grammar, ``\agcode{\%}'' will be replaced by the name
of the token.
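
To illustrate the ``\agcode{\#}'' substitution, assume a syntax file
named \agfile{calc.syn} (a hypothetical name).  With the settings

\begin{indentingcode}{0.4in}
parser file name = "\#{\us}parse.c"
header file name = "\#{\us}parse.h"
\end{indentingcode}
AnaGram would be expected to write the parser to
\agfile{calc{\us}parse.c} and the header file to
\agfile{calc{\us}parse.h}.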

The final variety of configuration parameter takes a numeric value.
The value may be decimal, octal or hexadecimal, following the C
conventions, and may have an optional sign.  Thus

\begin{indentingcode}{0.4in}
parser stack size = 50
\end{indentingcode}
tells AnaGram to allocate space for at least fifty stack entries when
it creates your parser.
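
Because numeric values follow the C conventions, the same number can be
written in any of the three bases.  Under those conventions, each of
the following lines specifies a stack size of 64:

\begin{indentingcode}{0.4in}
parser stack size = 64
parser stack size = 0100
parser stack size = 0x40
\end{indentingcode}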

If AnaGram does not recognize a parameter, it will give you a warning
with line number, column number, and the message ``no such
parameter''.  If the value for a parameter is inappropriate, such as a
string value for a parameter which should have a numeric value, the
message will be ``inappropriate value''.  If the error occurs in the
configuration file found in the AnaGram directory, AnaGram will prefix
the warning with the complete path name for the file.  If the error
occurs in the configuration file in your working directory, AnaGram
will prefix the warning with ``AnaGram.cfg:''. If AnaGram encounters a
syntax error while reading a configuration file, it will honor the
parameter settings it found before the syntax error, but will ignore
everything that follows the error.

\section{Alphabetic Listing of Configuration Parameters}

\index{Configuration switches}\index{Allow macros}\index{Macros}
\agparamheading{allow macros}{switch, default on}

When this switch is set, i.e., on, reduction procedures will be
implemented as macros if they are sufficiently simple.  This makes
your parser somewhat more compact and faster but makes it somewhat
more difficult to debug.  It's a good idea to turn this switch off for
debugging.

\index{Configuration switches}\index{Auto init}
\agparamheading{auto init}{switch, default on}

This switch controls the initialization of any parser that is not
\agparam{event driven}.  When it is on, the
\index{Initializer}initializer for your parser is automatically called
every time the parser is called.
This is the normal situation.  On occasion, however, it
is desirable to call a parser several times without reinitializing it.
In this case, you may set the \agparam{auto init} parameter to off.
Should you do this, you must call the initializer yourself whenever
appropriate.
% XXX characterize the occasion...

When \agparam{event driven} is set, \agparam{auto init} has no effect.

\index{Configuration switches}\index{Auto resynch}
\agparamheading{auto resynch}{switch, default off}

Setting this switch causes AnaGram to include an automatic
resynchronization procedure in the parser.  The resynchronization
procedure will be invoked upon encountering a syntax error and will
skip over input until it finds input characters or tokens consistent
with its state at the time of the error.  The purpose of the
resynchronization procedure is to provide a simple way for your parser
to proceed in the event of syntax errors so that it can find more than
one syntax error on a given pass.  The resynchronization procedure
uses a heuristic based on your own syntax.  AnaGram itself uses this
technique to resynchronize after syntax errors in its input.

A disadvantage to using this resynchronization technique is that the
resynchronization procedure turns off all reduction procedures.  The
reason is that the resynchronization may cause a number of reduction
procedures to be skipped.  This means that the parameters for any
reduction procedures that might be called later would be suspect and
could cause serious problems.  It seems more prudent simply to shut
them down. Semantically determined productions will subsequently, of
course, always use the default reduction token.

If you have a
\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR}
macro, it will be called \emph{before} the resynchronization
process. It will also be called on subsequent syntax errors, so your
program will not lose control entirely.

If you use the auto resynchronization procedure, you must also specify
the \agparam{eof token} configuration parameter (see below) so that
the synchronizer doesn't inadvertently try to pass over the end of
file.
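
A minimal sketch of the two settings together, assuming your grammar
defines a terminal token named \agcode{end of file} (a hypothetical
name):

\begin{indentingcode}{0.4in}
auto resynch
eof token = end of file
\end{indentingcode}
If the end of file token in your grammar is simply called
\agcode{eof}, the second line is unnecessary.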

For other methods of recovering from syntax errors, see Chapter 9.

\index{Configuration switches}\index{Backtrack}
\agparamheading{backtrack}{switch, default on}

If your parser does not continue after encountering a syntax error,
you can speed up your parser and make it a little smaller by turning
off the \agparam{backtrack} switch.  If \agparam{backtrack} is on,
AnaGram configures your parser so that in case of syntax error it can
undo any default reductions it might have made as a consequence of the
erroneous input.  The purpose of such an undo function is to identify
the proper error frame and to maximize the probability of being able
to recover gracefully.

% XXX shouldn't these be indexed as ``obsolete parameters'' or
% something, with xrefs so if you look up ``Bottom margin'' in the
% index it says ``see ``obsolete parameters''''?
%
% Also, shouldn't the various obsolete parameters be described with
% the same text?
%
\index{Configuration parameters}\index{Bottom margin}
\agparamheading{bottom margin}{integer value, default = 3}

This is an obsolete parameter which was used in the DOS version of
AnaGram.  It is no longer used, but is still recognized for the sake
of compatibility.

\index{Configuration switches}\index{Bright background}
\agparamheading{bright background}{switch, default on}

This configuration switch is not used in AnaGram 2.0.  It is retained
for compatibility with configuration files used with the DOS versions
of AnaGram.

\index{Configuration switches}\index{Case sensitive}
\index{Case sensitivity}
\agparamheading{case sensitive}{switch, default on}

Use this switch to control how your parser deals with distinctions
between upper and lower case.  When \agparam{case sensitive} is on,
AnaGram builds a parser which distinguishes upper from lower case.
When this switch is off, AnaGram builds a parser which ignores case
for all input.  This does not mean that the values of character set
tokens are not case sensitive.  Although 'a' and 'A' would map to the
same token, the values would still be lower and upper case
respectively.

% XXX the last bit could be explained more clearly. (something like
% ``parsers still preserve case'')

% XXX this should discuss character sets, locales, and other such
% garbage.

\index{Configuration parameters}\index{Compile command}
\agparamheading{compile command}{string, default = \agcode{NULL}}

This parameter is retained only for compatibility with the DOS version
of AnaGram.  It is ignored in the Windows version.

\index{Configuration switches}\index{Const data}
\agparamheading{const data}{switch, default on}

The \agparam{const data} switch controls the use of \agcode{const}
qualifiers in generated C code.  If the switch is on, all fixed data
arrays in the parser file will be qualified as \agcode{const}.  The
\agparam{const data} switch is ignored if the \agparam{old style}
switch is set.

\index{Configuration parameters}\index{Context type}
%XXX: \index{context tracking} ?
\agparamheading{context type}{c data type, no default}

By default, \agparam{context type} is undefined.  If you assign the
name of a C data type, AnaGram will implement ``context tracking'' in
your parser.  See Chapter 9.  The data type name can be either a
standard, pre-defined data type or one which you create with a
\agcode{typedef} statement.

\index{Configuration parameters}\index{Coverage file name}
\index{File extension}\index{nrc}
\agparamheading{coverage file name}{string, default = \agcode{"\#.nrc"}}

If you set the \agparam{rule coverage} configuration switch, AnaGram
will provide functions in your parser to read and write rule counts to
a file.  The name of the file will be determined by \agparam{coverage
file name}.  The name of your syntax file will be substituted for the
``\agcode{\#}'' character.

\index{Configuration switches}\index{Declare pcb}
% XXX \index{Parser control block} ?
\agparamheading{declare pcb}{switch, default on}

When AnaGram builds a parser, it checks the status of the
\agparam{declare pcb} switch. If it is on, AnaGram declares a parser
control block for you.  AnaGram creates the name of the control block
variable by appending \agcode{{\us}pcb} to the name of your parser.
AnaGram will also code an \agcode{\#include} statement to include your
parser header file, and will define the \agcode{PCB} macro for you.
If you wish to declare the parser control block yourself you should
turn this switch off.

\index{Configuration parameters}\index{Default input type}
\index{Input type}
% XXX: \index{Types} ?
\agparamheading{default input type}{c data type, default = \agcode{int}}

This parameter tells AnaGram what data type to assume for terminal
tokens if they are not explicitly declared.  Normally, you would
explicitly declare terminal tokens only when you have set the
\agparam{input values} configuration switch.  The default type for
nonterminal tokens is given by \agparam{default token type}.

\index{Configuration switches}\index{Default reductions}\index{Reduction}
\agparamheading{default reductions}{switch, default on}

If in a given parser state there is only one production that could
possibly be reduced, it is usually faster to reduce it on any input than
to check specifically for correct input before reducing it.  The only
time this default reduction causes trouble is in the event of
erroneous input.  In this situation you may get an erroneous
reduction.  Normally when you are parsing a file, this is
inconsequential because you are not going to continue semantic action
in the presence of error.  But, if you are using your parser to handle
real-time interactive input, you have to be able to continue semantic
processing after notifying your user of the erroneous input.  In this
case you would want to turn the \agparam{default
reductions} switch off so that productions are reduced only when there
is correct input.
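
For a parser that handles interactive input, this amounts to a single
line in a configuration file or configuration section:

\begin{indentingcode}{0.4in}
\~{}default reductions
\end{indentingcode}
Following the general rule for switches, the explicit form
\agcode{default reductions = off} should be equivalent.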

\index{Configuration parameters}\index{Default token type}\index{Token}
% XXX \index{Types} ?
\agparamheading{default token type}{c data type, default = \agcode{void}}

This parameter takes a C data type as its value.  It is used to set
the data type for the semantic values of nonterminal tokens whose type
is not explicitly specified in the grammar.  To set the default type
for terminal tokens use \agparam{default input type}.

\index{Diagnose errors}\index{Configuration switches}
\agparamheading{diagnose errors}{switch, default on}

If you set this switch, AnaGram will include a syntax error diagnostic
procedure in your parser. This procedure will be called before your
\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro is
called.  It will store a pointer to a string in the
\agcode{error{\us}message} field of your parser control
block.  The string will contain a diagnostic message.  If there is
only one syntactically correct input, x, for example, the message will
be ``Missing x''.  Otherwise it will be ``Unexpected x'' if the input
is recognizable but incorrect and ``Unexpected input'' otherwise.  If
the \agparam{error frame} switch has been set, the
\agcode{error{\us}frame{\us}ssx} and
\agcode{error{\us}frame{\us}token} fields
in the parser control block will be set as described in Chapter 9.
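
Since \agparam{diagnose errors} is on by default, enabling the error
frame information only requires the second switch.  Stating both, as in
this sketch, simply makes the dependency explicit:

\begin{indentingcode}{0.4in}
diagnose errors
error frame
\end{indentingcode}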

% XXX say: diagnose errors causes the token_names[] array to be
% included in the parser. and index token_names[]...

\index{Distinguish lexemes}\index{Configuration switches}
% XXX \index{Disregard} ?
\agparamheading{distinguish lexemes}{switch, default off}

The \agparam{distinguish lexemes} switch has no effect unless a
disregard token has been defined.  Normally, the disregard token
(usually white space) is optional between lexemes.  This may lead to
apparent shift-reduce conflicts if the characters that comprise the
second of two successive lexemes can be construed as part of the first
lexeme.  In this situation, turning on the \agparam{distinguish
lexemes} switch effectively requires a disregard token to separate the
two lexemes.

\index{Edit command}\index{Configuration parameters}
\index{File extension}\index{syn}
\agparamheading{edit command}{string, default = \agcode{"ed \#.syn"}}

This parameter is no longer used and is retained only for file
compatibility with the DOS version of AnaGram.

\index{Enable mouse}\index{Configuration switches}
\agparamheading{enable mouse}{switch, default on}

This parameter is no longer used and is retained only for file
compatibility with the DOS version of AnaGram.

\index{Enum constant name}\index{Configuration parameters}
\agparamheading{enum constant name}{string,
default = \agcode{"\${\us}\%{\us}token"}}

Use the \agparam{enum constant name} parameter to control the names
AnaGram uses for the enumeration constants it defines in the
header file for your parser.  The value of \agparam{enum constant
name} should be a string containing the ``\agcode{\%}'' character.
AnaGram will substitute each token name in turn for the
``\agcode{\%}'' character in this template as it creates the list of
enumeration constants.  If it finds a ``\agcode{\$}'' character it
will substitute the name of your parser.
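
As an illustration, the setting

\begin{indentingcode}{0.4in}
enum constant name = "TOK{\us}\%"
\end{indentingcode}
would be expected to turn a token named \agcode{number} (a hypothetical
token) into the enumeration constant \agcode{TOK{\us}number}.  With the
default template, and a parser named \agcode{calc} (also hypothetical),
the same token would instead yield a name of the form
\agcode{calc{\us}number{\us}token}.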

\index{Eof token}\index{Configuration parameters}\index{Token}
\agparamheading{eof token}{token name, no default}

If you use the auto resynchronization capability of AnaGram, you must
specify an end of file token explicitly.  You can do this either by
specifying a terminal token in your grammar called \agcode{eof} or by
using the \agparam{eof token} parameter to identify some other
terminal token to be used as the end of file marker.  You would do
this only if you must use the name \agcode{eof} for some other
purpose.

\index{Error frame}\index{Configuration switches}
\agparamheading{error frame}{switch, default off}

AnaGram uses the \agparam{error frame} switch in conjunction with the
\index{Diagnose errors}\index{Configuration switches}\agparam{diagnose errors}
switch.  If both are set, when your parser encounters a syntax error,
before invoking the
\index{SYNTAX{\us}ERROR}\index{Macros}\agcode{SYNTAX{\us}ERROR} macro,
your parser will determine the frame in which the error occurred, that
is, the production the parser was trying to match at the time of the
error.

% XXX: See chapter (dd.tex) for a complete discussion.

\index{Configuration parameters}\index{Error token}\index{Token}
\agparamheading{error token}{token name, no default}

One of your options for error recovery after a syntax error is a
technique similar to that provided in \agfile{yacc}.  You include a
terminal token called \agcode{error} in your grammar.  When the parser
encounters an error in the input it backs up the state stack to the
most recent state in which \agcode{error} was an acceptable input.  It
then shifts to the new state as though it had seen an actual
\agcode{error} token.  At this point, it skips over any character in
the input which is not an acceptable input character for this state.
Once it does find an acceptable input character, it continues
processing as though nothing had happened.  If you wish to use this
approach and for some reason you wish to use the name \agcode{error}
for some other token in your grammar, you may use the \agparam{error
token} parameter to identify some other terminal token in your grammar
as the ``error token''.

\index{Configuration switches}\index{Error trace}\index{Trace}
\index{Window}
\agparamheading{error trace}{switch, default off}

If you turn the \agparam{error trace} switch on, AnaGram will include
code in your parser so that when it encounters a syntax error it will
write the contents of the \index{Parser state stack}\index{State
stack}\index{Stack}parser state stack to a file.  The name of the file
is the same as the name of your syntax file but with the extension
\index{File extension}\index{etr}\agfile{.etr}.  You may override this
definition by defining
\index{AG{\us}TRACE{\us}FILE{\us}NAME}\index{Macros}\agcode{AG{\us}TRACE{\us}FILE{\us}NAME}
in your embedded C.

The \agmenu{Error Trace} option in the \agmenu{Action} menu can then
read this information and prepare a pre-built \agwindow{Grammar Trace}
showing you the status of the parse at the time of the syntax error.
You would use this switch primarily when you are first checking out
your grammar to make sure it accurately represents the input you
desire to handle.  You would also use it any time your parser
encounters a syntax error you don't understand.  For more information,
see Chapter 5.

\index{Escape backslashes}\index{Configuration switches}
\agparamheading{escape backslashes}{switch, default off}

\agparam{Escape backslashes} is used only in conjunction with the
\agparam{line numbers} option.  When turned on, it causes the
backslashes in the pathname generated by the \agparam{line numbers}
option to be doubled.  This switch has been provided because C and C++
compilers are not consistent in their handling of backslashes in path
names.

\index{Event driven}\index{Configuration switches}
% XXX \index{AG{\us}RUNNING{\us}CODE} ?
% XXX \index{exit{\us}flag} ?
\agparamheading{event driven}{switch, default off}

If you turn the \agparam{event driven} switch on, when you build a
parser, it will be configured as an ``event driven'' parser.  This
means that after calling its initializer function, you call it once
with each discrete unit of input.  The parser proceeds until it
needs more input, finishes the grammar, or encounters an error.  It
then returns.  The \agcode{exit{\us}flag} field in the parser control
block is equal to \agcode{AG{\us}RUNNING{\us}CODE} if more input is needed.
Other values indicate other reasons for termination.
% XXX crossreference the discussion of exit codes?

When \agparam{event driven} is on, \agparam{auto init} has no effect;
you must always call the initializer function yourself.

\index{Far tables}\index{Configuration switches}
\agparamheading{far tables}{switch, default off}

If \agparam{far tables} is on when AnaGram builds a parser, it will
declare the larger tables it builds as \agcode{far}.  This can be a
convenience when using some memory models of the 8086 architecture.

\index{Grammar token}\index{Configuration parameters}\index{Token}
\agparamheading{grammar token}{token name, no default}

The \agparam{grammar token} parameter may be used to specify the
grammar, or ``goal'', token for the syntax analyzer portion of
AnaGram.  An alternative method is to append a ``\$'' to the goal
token when you define it.  You may also simply use the name
\agcode{grammar} to identify the grammar token.

\index{Header file name}\index{Configuration parameters}\index{File name}
\agparamheading{header file name}{string, default = \agcode{"\#.h"}}

This parameter names the parser header file AnaGram generates.  The
contents of the header file are described in Chapter 9.  When AnaGram
creates the file, it copies the value of \agparam{header file name},
substituting the name of your syntax file for the ``\agcode{\#}''
character, in order to create the pathname and extension for the file.
You can therefore use this parameter to give the header file a
particular name, independent of the syntax file name, or to specify a
particular drive or directory where you want the header file to
reside.  Note that if you include a full DOS/Windows pathname,
backslash characters must be quoted.

\index{Input values}\index{Configuration switches}
\agparamheading{input values}{switch, default off}

% XXX this shouldn't say ASCII because it's true even if the
% characters are some other character set...
If the input to your parser includes explicit token values which are
not simply the ASCII values of corresponding ASCII input characters,
you must set the \agparam{input values} switch to inform AnaGram.
Unless your parser is \agparam{event driven}, you must also provide
your own \agcode{GET{\us}INPUT} macro.

\index{Line length}\index{Configuration parameters}
\agparamheading{line length}{integer value, default = 80}

\agparam{Line length} is an obsolete configuration parameter, recognized
for the sake of compatibility with configuration files prepared for
the DOS version of AnaGram.  It is ignored in AnaGram 2.0.

\index{Line numbers}\index{Configuration switches}
\agparamheading{line numbers}{switch, default off}

If \agparam{line numbers} is set, AnaGram will put syntax file line
numbers into the generated C code file using the
\index{\#line}\agcode{\#line}
directive so that your compiler diagnostics will refer to lines in the
syntax file rather than in the generated C code file.  If
\agparam{line numbers} is off, AnaGram will put syntax file line
numbers in comments.  The
\index{Line numbers path}\index{Configuration parameters}
\agparam{line numbers path} and
\index{Escape backslashes}\index{Configuration switches}
\agparam{escape backslashes}
switches may be used to control the generation of the line number
directives.

\index{Line numbers path}\index{Configuration parameters}
\agparamheading{line numbers path}{string, default = \agcode{NULL}}

When you have set the \agparam{line numbers} switch and
\agparam{line numbers path} is not NULL, AnaGram uses it in the
\agcode{\#line} directive in place of the full path name of your
syntax file.
% XXX update for unix where we (maybe) don't generate full pathnames
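
For instance, to have the \agcode{\#line} directives mention only the
bare file name instead of a full path, you might write the following,
with \agfile{calc.syn} standing in for the name of your own syntax
file:

\begin{indentingcode}{0.4in}
line numbers
line numbers path = "calc.syn"
\end{indentingcode}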

\index{Lines and columns}\index{Configuration switches}
\agparamheading{lines and columns}{switch, default on}

If this switch is set, AnaGram will incorporate code into your parser
to track line numbers and column numbers in its input.  At all times,
the \agcode{line} and \agcode{column} fields in your parser control
block will mark the location of the current lookahead character.  The
treatment of tab characters is controlled by the
\index{TAB{\us}SPACING}\index{Macros}\agcode{TAB{\us}SPACING} macro.

\index{Main program}\index{Configuration switches}
\agparamheading{main program}{switch, default on}

The \agparam{main program} switch determines what AnaGram does if you
invoke the Build Parser command, but have no embedded C in your syntax
file.  If the switch is on, AnaGram creates a main program which does
nothing but call your parser.  The switch is ignored if your parser
uses \agparam{pointer input} or is \agparam{event driven}.

\index{Max conflicts}\index{Configuration parameters}\index{Conflicts}
\agparamheading{max conflicts}{integer value, default = 50}

\agparam{Max conflicts} limits the number of conflicts AnaGram will
record.  Sometimes, a simple editing error in your syntax file can
cause hundreds of conflicts, which you don't need to see in gory
detail.  If you have a grammar that is in serious trouble and you want
to see more conflicts, you may change \agparam{max conflicts} to suit
your needs.

\index{Near functions}\index{Configuration switches}
\agparamheading{near functions}{switch, default off}

\agparam{Near functions} controls the use of the \agcode{near} keyword
for static functions in your parser.  If your parser is to run on a
16-bit 80x86 processor you would want to turn it on.  If you are
going to run your parser on some other processor or use a C compiler
that does not support the \agcode{near} keyword you should leave
\agparam{near functions} off.

\index{Configuration switches}\index{Nest comments}\index{Comments}
\agparamheading{nest comments}{switch, default off}

Use this switch to allow nested comments in your syntax or
configuration files.  It defaults to off, in accordance with the ANSI
standard for C.  Note that AnaGram scans comments in any embedded C
code as well as in the grammar specification.  You may turn this
switch on and off as many times as necessary in a single file.
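
A minimal sketch of the effect, assuming the \agcode{= on} form of
setting a switch:

\begin{indentingcode}{0.4in}
nest comments = on
/* outer /* inner */ still inside outer */
\end{indentingcode}

With the switch on, the comment should extend to the final
\agcode{*/}.  With the switch off, it would end at the first
\agcode{*/}, and the remaining text would be read as part of the file.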

\index{Old style}\index{Configuration switches}
\agparamheading{old style}{switch, default off}

\agparam{Old style} controls the function definitions in the code
AnaGram generates.  When \agparam{old style} is off, AnaGram generates
ANSI style calling sequences with prototypes as necessary.  When
\agparam{old style} is on, it generates old style function definitions,
and no prototypes.  It also causes the
\index{Const data}\index{Configuration switches}\agparam{const data}
switch to be ignored.

\index{Page length}\index{Configuration parameters}
\agparamheading{page length}{integer value, default = 66}

\agparam{Page length} is an obsolete configuration parameter,
recognized for the sake of compatibility with configuration files
prepared for the DOS version of AnaGram.  It is ignored by AnaGram
2.0.

\index{Parser file name}\index{Configuration parameters}\index{File name}
\agparamheading{parser file name}{string, default = \agcode{"\#.c"}}

AnaGram creates a parser which consists of all the embedded C code in
your syntax file, the syntax tables created by the syntax analyzer,
and a parsing engine configured to your requirements.  This code is
written to a file whose name is given by this parameter.  When AnaGram
creates your parser file, it copies the value of the \agparam{parser
file name} parameter, substituting the name of your syntax file for
the ``\agcode{\#}'' character, in order to create the pathname and
extension for the file.  You can therefore use this parameter to give
the parser file a particular name, independent of the syntax file
name, or to specify a particular drive or directory where you want the
parser file to reside.  Note that if you include a full DOS/Windows
pathname, you must escape each backslash character by doubling it.  If
you are writing a C++ parser, you can use this parameter to set the
suffix of the output file name.
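
For instance, to give a C++ parser a fixed file name that does not
depend on the name of the syntax file, a configuration file could
contain a statement such as the following.  The name
\agfile{calc.cpp} is hypothetical:

\begin{indentingcode}{0.4in}
parser file name = "calc.cpp"
\end{indentingcode}

Because this value contains no ``\agcode{\#}'' character, no
substitution of the syntax file name should take place.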

\index{Parser}\index{Parser name}\index{Configuration parameters}
\agparamheading{parser name}{string, default = \agcode{"\$"}}

% XXX This should say something other than ``name your parser''
AnaGram uses the value of \agparam{parser name} to name your parser,
substituting the name (not including the extension) of your syntax
file for a ``\agcode{\$}'' character.  If you accept the default value of
\agparam{parser name} and have a syntax file called \agfile{ana.syn},
AnaGram will name your parser \agcode{ana}.

The \index{Initializer}initializer for your parser will have the same
name preceded by \agcode{init{\us}}. In the above example, the
initializer would be called \agcode{init{\us}ana}.

\index{Configuration parameters}\index{Stack}\index{Parser stack alignment}
\agparamheading{parser stack alignment}{c data type, default = \agcode{int}}

\agparam{Parser stack alignment} is used to control byte alignment of
the parser stack, \agcode{PCB.vs}.  AnaGram normally adds a field of
the specified data type to the \agcode{union} declaration that defines
the data type for the parser stack.  This parameter can be used to
deal with byte alignment problems when a parser is to be run on a
processor with byte alignment restrictions.  For instance, if your
grammar has tokens of type \agcode{double} and your processor requires
double precision variables to be properly aligned, you can include the
following statement in a configuration section in your grammar or in
your configuration file:
\begin{indentingcode}{0.4in}
parser stack alignment = double
\end{indentingcode}
If the data type is \agcode{void}, no alignment declaration will be
made.
% You will not need to change this parameter if your parser is to
% run on a PC or compatible processor.
%
% XXX this really ought to be updated for the century of the fruitbat

\index{Configuration parameters}\index{Parser stack size}
\agparamheading{parser stack size}{integer value, default = 32}

\agparam{Parser stack size} is used to set the sizes of the parser
stacks in your parser control block.  When AnaGram analyzes your
grammar, it determines the minimum amount of stack space required for
the deepest left recursion.  To this depth it adds one half the value
of the \agparam{parser stack size} parameter.  It then sets the actual
stack size to the larger of this value and the \agparam{parser stack
size} parameter.  If you find the default of 32 wastefully large or
dangerously small, you can set the parameter to suit the needs of your
particular parser.
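
Read literally, the rule above can be written as follows, where $d$ is
the minimum depth AnaGram determines from the grammar and $s$ is the
value of \agparam{parser stack size}.  The symbols $d$ and $s$ are
introduced here only for illustration:
\[
  \mbox{actual stack size} = \max\left(d + \frac{s}{2},\; s\right)
\]
With the default $s = 32$, a grammar with $d = 10$ gives
$\max(26, 32) = 32$, while a grammar with $d = 40$ gives
$\max(56, 32) = 56$.  Under this reading, the parameter acts as a lower
bound on the stack size, and half its value acts as headroom above the
computed depth.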

\index{Pointer input}\index{Configuration switches}
\agparamheading{pointer input}{switch, default off}

When you turn \agparam{pointer input} on you tell AnaGram that the
input to your parser is in memory and can be scanned simply by
incrementing a pointer.  Before calling your parser you should make
sure that the \agcode{pointer} field in your parser control block is
properly initialized to point to the first character or token in your
input.

Use the parameter
\index{Pointer type}\index{Configuration parameters}\agparam{pointer type}
to specify the type of the pointer.  The default value of
\agparam{pointer type} is \agcode{unsigned char *}.

\index{Pointer type}\index{Configuration parameters}
\agparamheading{pointer type}{c data type, default = \agcode{unsigned char *}}

If you have set the \agparam{pointer input} switch, AnaGram will use
the value of the \agparam{pointer type} parameter to declare the
\agcode{pointer} field in your parser control block.
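
A sketch of the two parameters used together, assuming that the input
is held in an ordinary \agcode{char} buffer and that a switch can be
set with the \agcode{= on} form:

\begin{indentingcode}{0.4in}
pointer input = on
pointer type = char *
\end{indentingcode}

With these settings, the \agcode{pointer} field in the parser control
block would be declared as \agcode{char *}, and it must still be
initialized before the parser is called.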

\index{Print file name}\index{Configuration parameters}\index{File name}
\agparamheading{print file name}{string, default = \agcode{"LPT1"}}

\agparam{Print file name} is an obsolete configuration parameter,
recognized for the sake of compatibility with configuration files
prepared for the DOS version of AnaGram.  It is ignored by AnaGram
2.0.

\index{Quick reference}\index{Configuration switches}
\agparamheading{quick reference}{switch, default off}

The \agparam{quick reference} switch is no longer used, but is still
recognized for compatibility's sake.  In future versions of AnaGram it
may no longer be recognized.

\index{Configuration switches}\index{Reduction choices}
\agparamheading{reduction choices}{switch, default off}

If the \agparam{reduction choices} switch is set when AnaGram builds a
parser, it will include in your parser file a function which can
identify the acceptable choices for the reduction token in the current
state.  You would use this switch only if you were using semantically
determined productions in your grammar and if there were states in
which not all the tokens on the left side of the production were valid
reduction tokens.

\index{Rule coverage}\index{Configuration switches}\index{Coverage}
\agparamheading{rule coverage}{switch, default off}

If you set the \agparam{rule coverage} switch, AnaGram will include
code in your parser to count the number of times your parser identifies
each rule in your grammar.  To maintain the counts, AnaGram declares,
at the beginning of your parser, an integer array, whose name is
created by appending \agcode{{\us}nrc} to the name of your parser.  The
array contains one counter for each rule you have defined in your
grammar.  There are no entries for the auxiliary rules that AnaGram
creates to deal with set overlaps or disregard statements.  In order
to identify every rule that the parser reduces in the course of
execution, AnaGram
has to turn off certain optimization features in your parser.
Therefore, a parser that has the \agparam{rule coverage} switch
enabled will run slightly slower than one with the switch off.  An
entry on the \agmenu{Browse} menu allows you to view the coverage data.
% XXX See Chapter ???.

\index{Tab spacing}\index{Configuration parameters}
\agparamheading{tab spacing}{integer value, default = 8}

\agparam{Tab spacing} controls the expansion of tabs when AnaGram
displays your syntax file or the \agwindow{File Trace} test file.

The value of \agparam{tab spacing} is also used to set the default
value of the \index{TAB{\us}SPACING}\index{Macros}\agcode{TAB{\us}SPACING}
macro in your parser.

The default value of \agparam{tab spacing} is 8.  If you prefer a
different value, you should probably include an appropriate statement
in your configuration file. For example:

\begin{indentingcode}{0.4in}
tab spacing = 2
\end{indentingcode}

\index{Test file binary}\index{Configuration switches}
\agparamheading{test file binary}{switch, default off}

\agparam{Test file binary} causes \agwindow{File Trace} to read test
files in binary mode.  When \agwindow{File Trace} reads a test file,
it normally reads it in text mode, which in Windows causes carriage return
characters to be stripped out.  Occasionally it is necessary to test a
grammar where carriage return characters are important and should not
be stripped.  In this situation, set \agparam{test file binary} to on,
and the carriage return characters will not be discarded.
% XXX rewrite the second half of this paragraph?

\index{Test file mask}\index{Configuration parameters}
\agparamheading{test file mask}{string, default = \agcode{"*.*"}}

% XXX default should be ``*'' on unix
AnaGram uses \agparam{test file mask} to filter the pick list of test
files when you use the
\index{File Trace}\index{Trace}\index{Window}\agwindow{File Trace}
feature.
You may set it to any value you wish, including a pathname.
% XXX: test this
For instance, if you know that all your test files are in the directory
\agfile{C:{\bs}PROJECT{\bs}SOURCE} and have
extension \agfile{.FOO}, you could set \agparam{test file mask} to
\agcode{"C:{\bs\bs}PROJECT{\bs\bs}SOURCE{\bs\bs}*.FOO"}.
Note that, as in any string literal, backslash characters must be
escaped.

\index{Test range}\index{Configuration switches}\index{Range}
\agparamheading{test range}{switch, default off}
% XXX should this really default to off?

When \agparam{test range} is on, AnaGram will insert code in your
parser to make sure all input characters or token identifiers are
within the range specified in your grammar.  If you do not turn this
switch on, your parser will run slightly faster, but its behavior will
be undefined if it gets input outside the range you have specified
in your grammar.

\index{Token names}\index{Configuration switches}
\agparamheading{token names}{switch, default off}

When \agparam{token names} is set, AnaGram includes a static array of
ASCII strings in your parser containing the names of your tokens.  The
name of this array is \agcode{\#{\us}token{\us}names} where the
``\agcode{\#}'' character is replaced with the name of your parser.
The entry for tokens which do not have names is an empty string:
\agcode{""}.

\index{Top margin}\index{Configuration parameters}
\agparamheading{top margin}{integer value, default = 3}

\agparam{Top margin} is an obsolete configuration parameter,
recognized for the sake of compatibility with configuration files
prepared for the DOS version of AnaGram.  It is ignored by AnaGram
2.0.

\index{Traditional engine}\index{Configuration switches}
\agparamheading{traditional engine}{switch, default off}

Traditional LALR-1 parsers use a parsing engine which has only four
actions: shift, reduce, accept, and error.  AnaGram, in the interests
of faster execution and more compact tables, uses a parsing engine
with a number of short-cut actions.  The \agparam{traditional engine}
switch tells AnaGram not to use the short-cut actions.

You would set this switch primarily in conjunction with use of the
\index{Grammar Trace}\index{Trace}\index{Window}\agwindow{Grammar Trace}
in order to have a clearer idea of what is happening.  AnaGram will
then be using the same parsing actions as textbook parsers.  Note that
if a lookahead token has already been selected, AnaGram will display
it on the last line of the \agwindow{Parser Stack} pane in the
\agwindow{Grammar Trace} window.
% XXX what is this note doing here?

You should turn this switch back off when you have finished debugging
or your parser will be larger and slower than necessary.
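
During a debugging session, the setting might look like the following
sketch, which assumes the \agcode{= on} form of setting a switch.  The
comment is there only as a reminder to remove the statement afterwards:

\begin{indentingcode}{0.4in}
/* debugging only: turn off again before release */
traditional engine = on
\end{indentingcode}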

% XXX: say that in production code traditional engine is not useful
% and only serves to slow things down.

\index{Video mode}\index{Configuration parameters}
\agparamheading{video mode}{integer value, default = $-$1}

\agparam{Video mode} is an obsolete configuration parameter,
recognized for the sake of compatibility with configuration files
prepared for the DOS version of AnaGram.  It is ignored by AnaGram
2.0.
% author: David A. Holland
% date: Mon, 30 May 2022 23:46:22 -0400
% parents: 13d2b8934445