view anagram/guisupport/helpdata.src @ 19:db7ff952e01e

both mansupps seem to be html
author David A. Holland
date Tue, 31 May 2022 02:06:45 -0400
parents 13d2b8934445
children

Accept Action

The accept action is one of the four actions of a
traditional ©parsing engineª. The accept action is
performed when the ©parserª has succeeded in identifying
the goal, or ©grammar tokenª, for the ©grammarª.  When
the parser executes the accept action, it sets the ©exit_flagª
field in the ©parser control blockª to AG_SUCCESS_CODE and returns
to the calling program. The accept action is thus the last action of
the parsing engine and occurs only once for each successful execution
of the parser.

If the grammar token has a non-void value, you may
obtain its value by calling the ©parser value functionª
whose name is given by <parser name>_value, that is,
by appending "_value" to the ©parser nameª.
##

Parser Value Function, Return Value

The value assigned to the ©grammar tokenª in your parser
may be retrieved by calling the parser value function after
the parser has finished. The name of this function is given
by <©parser nameª>_value. The return type of the function
is the type assigned to the grammar token.

If you have set the ©reentrant parserª switch, the parser
value function takes a pointer to the ©parser control blockª
as its sole argument. Otherwise, it takes no arguments. The
value function is not defined if the grammar token has type "void".
##

AG_PLACEMENT_DELETE_REQUIRED

When the ©wrapperª option is specified, the wrapper
template class that AnaGram defines uses a "placement
new" operator to construct the wrapper object on the
©parser value stackª. The MSVC++ 6.0 compiler requires,
in this situation, that a corresponding "placement
delete" operator be defined. Other C++ compilers,
notably MSVC++ 5.0, generate an error message if
they encounter the definition of a "placement delete"
operator.

Accordingly, AG_PLACEMENT_DELETE_REQUIRED is used to determine
whether a "placement delete" operator should be defined.

AG_PLACEMENT_DELETE_REQUIRED is defined to be 1 if you are using MSVC++
6.0 or greater, 0 otherwise. You can override the automatic definition of
AG_PLACEMENT_DELETE_REQUIRED by defining it in the ©C prologueª section
of your grammar. Set it to a non-zero value to force the "placement
delete" definition, zero to skip the definition.
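
The automatic definition can be pictured as a guarded preprocessor
test. The following is only a sketch, not the code AnaGram actually
generates: it assumes that the guard is an #ifndef test and that the
compiler is detected through Microsoft's _MSC_VER macro, which is 1200
for MSVC++ 6.0. The helper function is hypothetical and exists only so
the result can be inspected.

```c
/* Sketch only: assumes an #ifndef guard and detection via _MSC_VER. */
#ifndef AG_PLACEMENT_DELETE_REQUIRED
#if defined(_MSC_VER) && _MSC_VER >= 1200  /* MSVC++ 6.0 or later */
#define AG_PLACEMENT_DELETE_REQUIRED 1
#else
#define AG_PLACEMENT_DELETE_REQUIRED 0
#endif
#endif

/* Hypothetical helper: reports whether the definition is non-zero. */
int placement_delete_required(void) {
  return AG_PLACEMENT_DELETE_REQUIRED != 0;
}
```

Under this assumption, a line such as
"#define AG_PLACEMENT_DELETE_REQUIRED 0" in the C prologue would take
precedence, since the prologue is copied to the parser file ahead of
the generated code.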

##

ag_tcv

ag_tcv is an array AnaGram includes in your ©parserª.
Your parser uses ag_tcv to translate external codes to
the internal token numbers that AnaGram uses. It uses
the actual input code to index the ag_tcv array to
fetch a ©token numberª. The token number is then used
to identify the input token.
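
The lookup described above can be sketched in a few lines of C. This
is an illustration under stated assumptions, not the generated code:
the table demo_tcv, the token numbers and both function names are
hypothetical, the table covers only codes 0 to 127, and ascii
character codes are assumed.

```c
/* Hypothetical token numbers; a real parser gets these from AnaGram. */
enum { TOKEN_OTHER = 0, TOKEN_DIGIT = 1, TOKEN_LETTER = 2 };

/* Illustrative conversion table covering character codes 0..127. */
static unsigned char demo_tcv[128];

/* Fill the illustrative table: digits, letters, everything else. */
void demo_tcv_init(void) {
  int c;
  for (c = 0; c < 128; c++) demo_tcv[c] = TOKEN_OTHER;
  for (c = '0'; c <= '9'; c++) demo_tcv[c] = TOKEN_DIGIT;
  for (c = 'A'; c <= 'Z'; c++) demo_tcv[c] = TOKEN_LETTER;
  for (c = 'a'; c <= 'z'; c++) demo_tcv[c] = TOKEN_LETTER;
}

/* Translate an input code to a token number by indexing the table.
   Returns -1 for codes that fall outside the table. */
int demo_token_number(int input_code) {
  if (input_code < 0 || input_code >= 128) return -1;
  return demo_tcv[input_code];
}
```

After calling demo_tcv_init(), demo_token_number('7') yields
TOKEN_DIGIT and demo_token_number('q') yields TOKEN_LETTER.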
##

Allow macros

"Allow macros" is a ©configuration switchª which
defaults to on. When it is set, i.e., on, ©reduction
procedureªs will be implemented as macros if they are
sufficiently simple. This makes your ©parserª somewhat
more compact but makes it somewhat more difficult to
debug. It's a good idea to turn this switch off for
debugging.
##

Analyze Grammar

The Analyze Grammar command will scan and
analyze your ©syntax fileª, and create a number of
tables summarizing your grammar.

Analyze Grammar does not create any ©output filesª.
To create a ©parserª, use the ©Build Parserª command.
You would probably use Analyze Grammar, rather than Build Parser, during
initial development of your ©grammarª.

You can use ©File Traceª and ©Grammar Traceª as soon as you have
analyzed your grammar. It is not necessary to build a parser first.
##

Attribute Statement

Attribute statements are used in ©configuration
sectionsª of your ©syntax fileª to specify certain
properties for ©tokenªs, ©character setªs, or other
units of your grammar. The attribute statements
available are:
	©disregardª
	©distinguish keywordsª
	©enumª
	©extend pcbª
	©hiddenª
	©leftª
	©lexemeª
	©nonassocª
	©rename macroª
	©reserve keywordsª
	©rightª
	©stickyª
	©subgrammarª
	©wrapperª
##

Auto init

Auto init is a ©configuration switchª which defaults to
on. It controls the initialization of any ©parserª that
is not ©event drivenª. When it is set to on, your
parser is automatically initialized every time it is
called. This is the setting you will normally use. On
occasion, however, it is desirable to call a parser
several times without reinitializing it. In this case,
you may set the auto init parameter to off and then
call the ©initializerª yourself whenever it is
appropriate.
##

Auto resynch

"Auto resynch" is a ©configuration switchª which
defaults to off. You may use it to specify ©automatic
resynchronizationª as an ©error recoveryª mechanism.

Setting the "auto resynch" switch causes AnaGram to
include an automatic ©resynchronizationª procedure in
your ©parserª. The resynchronization procedure will be
invoked when your parser encounters a ©syntax errorª
and will skip over input until it finds input
characters or ©tokensª consistent with its state at the
time of the error.

An alternate technique, ©error token resynchronizationª,
uses an ©error tokenª which you include in your grammar.
##

Automatic Resynchronization

Automatic ©resynchronizationª is one of several ©error
recoveryª options available as part of parsers built by
AnaGram. You enable automatic resynchronization by
setting the ©auto resynchª ©configuration switchª. If
your parser includes automatic resynchronization it will
incorporate a heuristic procedure which will skip over
input tokens until it finds a token which makes sense
with respect to one or another of the ©productionªs
active at the time of the ©syntax errorª.

The purpose of the resynchronization procedure is to
provide a simple way for your parser to proceed in the
event of syntax errors so that it can find more than one
syntax error on a given pass. The resynchronization
procedure uses a heuristic based on your own syntax.
AnaGram itself uses this technique to resynchronize
after syntax errors in its input.

A disadvantage to using this resynchronization technique
is that the resynchronization procedure turns off all
©reduction procedureªs. Because of the error, a number
of reduction procedures, which normally would be
executed, will be skipped. The parameters for any
reduction procedures that might be called later would be
suspect and could cause serious problems. It seems more
prudent simply to shut them down.

If you use the automatic resynchronization procedure,
you must also specify an ©eof tokenª so that the
synchronizer doesn't inadvertently skip over the end of
file.

An alternative technique for resynchronization is called
©error token resynchronizationª.
##

Auxiliary Trace

An Auxiliary Trace is a pre-built grammar trace which
you may select from the ©Auxiliary Windowsª popup menu for
most windows which display parser state information.
The Auxiliary Trace provides a path to the state
specified in the highlighted line of the primary window.

When obtained from the Parser Stack pane of the ©File Traceª or
©Grammar Traceª, the Auxiliary Trace is simply a copy of the current
status of that trace, so you can explore your alternatives while still
retaining the status of the original trace for reference.
##

Auxiliary Windows

From most AnaGram windows you can pop up an Auxiliary Windows
menu by clicking the right mouse button or by pressing Shift F10.
Auxiliary Windows may
have Auxiliary Windows of their own.

 Windows with a cursor bar (highlighted line):
The windows available in the Auxiliary Windows menu depend on the
grammar elements identified by the cursor bar in the parent window. If
the cursor bar identifies a ©parser stateª, there will be windows that
describe the state. If the cursor bar identifies a ©grammar ruleª,
there will be windows that describe the rule. If the cursor bar
identifies a ©tokenª, there will be windows that describe the token. In
the case of a ©marked ruleª, token windows will describe the marked
token, if any. In some cases, specialized pre-built grammar traces
such as the ©Conflict Traceª or ©Auxiliary Traceª are on the menu.

 Help windows:
For Help windows, the Auxiliary Windows menu will show all the
available links to other ©Help topicsª from this window. ©Using Helpª
is always available.
##

Backtrack

If your ©parserª does not continue after encountering a
©syntax errorª, you can speed it up and make it a
little smaller by turning off the backtrack
©configuration switchª. If backtrack is on, AnaGram
configures your parser so that in case of syntax error
it can undo any ©default reductionsª it might have made
as a consequence of the erroneous input. The purpose of
such an undo function is to identify the proper ©error
frameª and to maximize the probability of being able to
recover gracefully.
##

Empty Recursion

This warning message tells you that the recursive step of the
specified ©recursive ruleª can be completely matched by ©zero
lengthª tokens, i.e., by nothing at all.
The result is potentially an infinite loop in the generated ©parserª.
The specified rule is an expansion rule of the specified token.

Because of the possibility of encountering an infinite loop while parsing,
AnaGram turns off its ©keyword anomalyª analysis if empty recursion is
found. The ©File Traceª function is also disabled for the same reason.

The ©circular definitionª of a token has the same effect as an
empty recursion, in that no additional input is required to match
the recursive rule.

##
Keyword Anomaly analysis aborted: empty recursion

The ©keyword anomalyª analysis has been turned off, since the presence of
©recursive ruleªs with ©empty recursionª can cause infinite loops in the analysis.

##

Keyword Anomaly analysis aborted: circular definition

The ©keyword anomalyª analysis has been turned off, since the presence of
a ©circular definitionª can cause infinite loops in the analysis.

##

File Trace disabled: empty recursion

Because of the presence of ©recursive ruleªs with  ©empty recursionª in this grammar and
the infinite loops that can ensue, the ©File Traceª function has been
disabled.

##

File Trace disabled: circular definition

Because of the presence of a ©circular definitionª in this grammar and
the infinite loops that can ensue, the ©File Traceª function has been
disabled.

##



Both Error Token Resynch and Auto Resynch Specified



This ©warningª message indicates that your ©grammarª
defines an ©error tokenª and also requests ©automatic
resynchronizationª. AnaGram will ignore the request
for automatic resynchronization and will provide ©error
token resynchronizationª. If you named a token "error"
but do not wish ©error token resynchronizationª, you can
either rename "error", or, in a ©configuration
sectionª, you may explicitly specify the error token to
be something you don't otherwise use in your grammar:
	[ error token = not used ]
##

Bottom Margin

"Bottom margin" is an ©obsolete configuration parameterª.
##

Bright Background

"Bright background" is a ©configuration switchª which
was used in the DOS version of AnaGram. It is no longer
used, but is still recognized for the sake of upward
compatibility with old ©configuration fileªs.
##

Build Parser

You use the Build Parser command to create a ©parserª based on your
©grammarª. The parser is a C file consisting of the ©embedded Cª (which
may include C++) code in your ©syntax fileª, your ©reduction
procedureªs, a number of tables derived from your grammar
specification, and a ©parsing engineª customized to your requirements.

If you only wish to investigate your grammar and do not
wish to create ©output filesª, use the ©Analyze
Grammarª command.
##

Build <file name>

This item on the ©Action Menuª is available when you have analyzed a
©grammarª but you have not yet built it. It builds the grammar
without reloading the ©syntax fileª from the disk.
##

Cannot Make Wrapper for Default Token Type

This ©warningª message occurs when a ©wrapperª statement lists a
token type that has previously been defined as the ©default token
typeª. If a wrapper is needed for a
particular type, you must specify the ©data typeª explicitly
for each relevant ©tokenª.

As a result, a wrapper class has not been created for the specified token type.
##

Token with Wrapper cannot be Default Token Type

This ©warningª message indicates that an attempt has been made
to specify a class that has previously been listed in a ©wrapperª
statement as the ©default token typeª.
If a wrapper is needed for a particular type, you must specify the
©data typeª explicitly for each relevant ©tokenª.

As a result, the default token type has not been set.
##

Case Sensitive

"Case sensitive" is a ©configuration switchª which
defaults to on. When it is on, it instructs AnaGram to
build a parser for which all input is case sensitive.
When it is off, AnaGram builds a parser which
ignores case for all input.

If the ©iso latin 1ª configuration switch is turned
off, case conversion will be limited to characters
in the normal ascii range. When it is on, case
conversion will be done for all iso latin 1 characters.

If you have other requirements for case conversion,
you may provide your own definition in your ©embedded cª for the
©CONVERT_CASEª macro which is invoked to perform case
conversion on input characters.

Note that the value of an input token is unaffected
by the case sensitive switch. When case sensitive is
off, 'a' and 'A' will be treated as the same input
token by the parser, but the ©token valueªs will
nevertheless be different.
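
The difference between ascii and iso latin 1 case conversion can be
sketched as follows. This is not the definition of CONVERT_CASE: the
function names are hypothetical, and folding to upper case is an
assumption, since the direction of the conversion is not stated here.

```c
/* Fold case for characters in the ascii range only. */
int fold_ascii(int c) {
  if (c >= 'a' && c <= 'z') return c - 'a' + 'A';
  return c;
}

/* Fold case for all ISO Latin 1 characters.
   Codes 0xE0..0xFE are lower case accented letters, except 0xF7,
   which is the division sign; their upper case forms lie 0x20 lower.
   0xDF (sharp s) and 0xFF (y with diaeresis) have no upper case
   form in ISO Latin 1 and are returned unchanged. */
int fold_latin1(int c) {
  if (c >= 0xE0 && c <= 0xFE && c != 0xF7) return c - 0x20;
  return fold_ascii(c);
}
```

In this picture the folded code would be used only to select the
token; the original character would remain the token value.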
##

C Prologue

If you include a block of ©embedded Cª code at the very
beginning of your syntax file, it is called the "C
prologue". It will be copied to your ©parser fileª
before any of the code generated by AnaGram. You can
use the C prologue to ensure that copyright notices,
#include directives, or type definitions, for example,
occur at the very beginning of your parser file.

If you specify a C or C++ type of your own definition,
you must provide a definition in the C prologue.
##

CHANGE_REDUCTION

CHANGE_REDUCTION(t) is a macro which AnaGram defines in
your ©parser fileª if your ©parserª uses ©semantically
determined productionsª. In your ©reduction procedureª,
when you need to change the ©reduction tokenª you can
easily do so by calling CHANGE_REDUCTION with the name
of the desired token as the argument. If the token name
has embedded spaces, replace the embedded spaces with
underline characters.
##

Character Constant

You may represent single characters in your ©grammarª by
using character constants. The rules for character
constants are the same as in C. The escape sequences
are as follows:
	\a		alert (bell) character
	\b		backspace
	\f		formfeed
	\n		newline
	\r		carriage return
	\t		horizontal tab
	\v		vertical tab
	\\		backslash
	\?		question mark
	\'		single quote
	\"		double quote
	\ooo	octal number
	\xhh	hexadecimal number

 AnaGram treats a single
character as a ©character setª
which contains only the specified character. Therefore you
can use a character constant in a ©set expressionª.
##

Character Map

The Character Map table shows you the mapping of input
characters to ©token numbersª. The ©ag_tcvª table in
your parser is based on the information in this table.

The fields in this table are:
	character code
	display character, if any (what Windows displays for this code)
	©partition set numberª
	©token numberª
	©token representationª

The display character will be what Windows displays for the character
code in the Data Tables font you have chosen.
##

Character Range

A "character range" is a simple way to specify a
©character setª. There are two ways to represent a
character range in an AnaGram ©syntax fileª.

The first way is like a ©character constantª: 'a-z'.

The second way allows somewhat greater freedom:
	'a'..'z'
	'a'..255
	^Z..037
	-1..0xff
Here you use two arbitrary ©character representationsª
separated by two dots. If the two characters are out of
order, AnaGram will reverse the order, but will give
you a ©warningª.

More complex ©character setsª may be specified by using
©unionª, ©differenceª, ©intersectionª, or ©complementª
operators.
##

Character Representation

In an AnaGram ©syntax fileª you may represent a
character literally with a ©character constantª or
numerically using decimal, octal or hexadecimal
representations following the conventions for C. Thus
'A', 65, 0101, and 0x41 all represent the same
character. Control characters can be represented using
the '^' character and either an upper or lower case
letter. Thus ^j and ^J are acceptable representations
of the ascii newline code. The rules for character
constants are identical to those in C, and the same
escape sequences are recognized.
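
Since the numeric forms follow the C conventions, the equivalences
above can be checked in C itself. The sketch below assumes an ascii
character set. The function names are hypothetical, and the masking
rule used for control characters is the conventional ascii one, which
is assumed here to match the '^' notation.

```c
/* Control code for a letter, as in ^J: keep the low five bits. */
int control_code(char letter) {
  return letter & 0x1F;
}

/* Returns 1 if 'A', 65, 0101 and 0x41 all denote the same code. */
int same_character(void) {
  return 'A' == 65 && 65 == 0101 && 0101 == 0x41;
}
```

With these definitions, control_code('j') and control_code('J') both
give the code for '\n'.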
##

Character Set

In AnaGram grammars you can conveniently specify whole
sets of characters at a time. This avoids
needless repetition and complexity.

Sets of characters may be defined in an AnaGram ©syntax
fileª in any of a number of ways. A single character is
taken to represent a character set consisting of a
single element. (See ©character representationª.) You
can also specify a set consisting of a range of
characters (see ©character rangeª) and perform the
familiar set operations, union, intersection, difference
and complement.

All the sets you define in your syntax file are
summarized in the ©Character Setsª window.

The ©unionª of two character sets, represented by a '+',
contains all characters that are in one or another of
the two sets. Thus, 'A-Z' + 'a-z' represents the set of
all upper and lower case letters.

The ©intersectionª of two character sets, represented
by a '&', contains all characters that are in both
sets. Thus, suppose you have the ©definitionsª
	letter = 'A-Z' + 'a-z'
	hex digit = '0-9' + 'A-F' + 'a-f'
Then (letter & hex digit) contains precisely upper and
lower case a to f.

The ©differenceª of two character sets, represented by
a '-', contains all characters that are in the first
set but not in the second set. Thus, using the same
definitions as above, (letter - hex digit) contains
precisely upper and lower case g to z.

The ©complementª of a character set, represented by a
preceding '~', represents all characters in the
©character universeª which are not in the given set.
Suppose you have defined a set, ©eofª, which consists of
the characters which represent end of file. Then, in
your grammar where you wish to accept an arbitrary
character, what you really want is anything but an end
of file character. You can define it thus:
	anything = ~eof
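
The four set operations can be modelled in C with one membership flag
per character code. This is a sketch for illustration only, not the
way AnaGram stores its sets: the CharSet type and the function names
are hypothetical, and a character universe of codes 0 to 255 is
assumed.

```c
#include <string.h>

#define UNIVERSE_SIZE 256  /* assumption: character codes 0..255 */

/* One membership flag (0 or 1) per character code. */
typedef struct {
  unsigned char member[UNIVERSE_SIZE];
} CharSet;

/* All codes from lo through hi inclusive; requires 0 <= lo, hi <= 255. */
CharSet set_range(int lo, int hi) {
  CharSet s;
  int c;
  memset(&s, 0, sizeof s);
  for (c = lo; c <= hi; c++) s.member[c] = 1;
  return s;
}

/* a + b: codes in either set. */
CharSet set_union(CharSet a, CharSet b) {
  int c;
  for (c = 0; c < UNIVERSE_SIZE; c++) a.member[c] = a.member[c] || b.member[c];
  return a;
}

/* a & b: codes in both sets. */
CharSet set_intersection(CharSet a, CharSet b) {
  int c;
  for (c = 0; c < UNIVERSE_SIZE; c++) a.member[c] = a.member[c] && b.member[c];
  return a;
}

/* a - b: codes in a but not in b. */
CharSet set_difference(CharSet a, CharSet b) {
  int c;
  for (c = 0; c < UNIVERSE_SIZE; c++) a.member[c] = a.member[c] && !b.member[c];
  return a;
}

/* ~a: codes of the universe not in a. */
CharSet set_complement(CharSet a) {
  int c;
  for (c = 0; c < UNIVERSE_SIZE; c++) a.member[c] = !a.member[c];
  return a;
}

/* Returns 1 if both sets have exactly the same members. */
int set_equal(CharSet a, CharSet b) {
  return memcmp(a.member, b.member, UNIVERSE_SIZE) == 0;
}
```

Building letter and hex digit from set_range and set_union, the
intersection comes out equal to 'A-F' + 'a-f' and the difference equal
to 'G-Z' + 'g-z', as described above.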
##

Character Sets

This window lists all of the distinct ©character setªs
which you defined, implicitly or explicitly, in your
©grammarª. Each line in the table describes one such
set.

The description takes the form of the internal set
number and the defining ©expressionª. The ©Auxiliary
Windowsª menu will allow you to see the ©Partition
Setsª which cover the character set, and the ©Set
Elementsª which it comprises, as well as the ©Token Usageª.
##

Character Universe, Universe

The character universe, or set of all expected input
characters to your parser, is defined as all characters
in the range given by a particular lower bound and a
particular upper bound, as described below.

The character universe is used for two things in
AnaGram. The first use is for calculating the
©complementª of a character set. The second use is in
the input processing of your parser. Input characters
will be used to index a ©token conversionª table to
convert character codes to token numbers. The length of
this table will be given by the size of the character
universe. If you have set the ©test rangeª
©configuration switchª your parser will verify that the
input character is within the range of the conversion
table. Otherwise, the character code will not be
checked for validity. In this case, an out-of-range
character will lead to undefined behavior.

If you have not used any characters with negative codes
in your grammar, the lower bound is zero. Otherwise, it
is the most negative such character.

If the highest character code you have used is less
than or equal to 255, the upper bound will be 255.

If you have used a character code greater than 255, the
upper bound will be the largest such code which appears
in your syntax file.
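
The rules for the two bounds can be restated as a short C sketch. The
Universe type and the function names are hypothetical, and the array
of codes stands in for whatever character codes appear in your
grammar. The range test at the end is only an illustration of the kind
of check the test range switch asks for.

```c
typedef struct {
  int lower;  /* lowest character code in the universe */
  int upper;  /* highest character code in the universe */
} Universe;

/* Compute the bounds from the character codes used in a grammar.
   The lower bound is 0 unless a negative code is used; the upper
   bound is 255 unless a code greater than 255 is used. */
Universe universe_bounds(const int *codes, int count) {
  Universe u = { 0, 255 };
  int i;
  for (i = 0; i < count; i++) {
    if (codes[i] < u.lower) u.lower = codes[i];
    if (codes[i] > u.upper) u.upper = codes[i];
  }
  return u;
}

/* Number of entries a conversion table over this universe needs. */
int universe_size(Universe u) {
  return u.upper - u.lower + 1;
}

/* Returns 1 if the code lies within the universe, otherwise 0. */
int in_universe(Universe u, int code) {
  return code >= u.lower && code <= u.upper;
}
```

For a grammar that uses only ordinary ascii characters this gives the
bounds 0 and 255 and a table of 256 entries.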
##

Characteristic Rule

Each ©parser stateª is characterized by a particular
set of ©grammar rulesª, and for each such rule, a
marked token which is the next ©tokenª expected. The
combination of a grammar rule and its marked token is often
called a ©marked ruleª. A marked rule which
characterizes a state is called a "characteristic
rule". In the course of doing ©grammar analysisª,
AnaGram determines the characteristic rules for each
©parser stateª. After analyzing your grammar, you may
inspect the ©State Definition Tableª to see the
characteristic rules for any state in your parser.
##

Characteristic Token

Every state in a ©parserª, except state 0, can be
characterized by the one, unique ©tokenª which causes a
jump to that state. That token is called the
©characteristic tokenª of the state, because to get to
that ©parser stateª you must have just seen precisely
that token in the input. Note that several states could
have the same characteristic token.

When you have a list of states, such as is given by the
©parser state stackª, it is equivalent to a list of
characteristic tokens. This list of tokens is the list
of tokens that have been recognized so far by the
parser.
##

Circular Definition

If the ©expansion ruleªs for a ©tokenª contain a ©grammar ruleª that
consists only of the token itself, the definition of the
token is circular. A circular definition is an extreme
case of ©empty recursionª.

As in cases of empty recursion, the generated parser may contain
infinite loops. When such a condition is detected, therefore,
©keyword anomalyª analysis and the ©File Traceª option are disabled.

##

column

"column" is an integer field in your ©parser control
blockª used for keeping track of the column number of
the current character in your input. Line and column
numbers are tracked only if the ©lines and columnsª
©configuration switchª has been set.
##

Command Line

If you provide the name of a syntax file on the
command line when you start AnaGram, it will open
the file and run either ©Analyze Grammarª or ©Build
Parserª depending on the setting of the ©Autobuildª
switch.
##

Command Line Version, agcl.exe

The command line version of AnaGram, agcl.exe, can be
used in make files. It takes the name of a single syntax
file on the command
line. Error and ©warningª messages are written to stdout.

Normally you would only use the command line version once you
have finished developing your ©parserª and are integrating
it with the rest of your program.

The command line version of AnaGram is not included with
trial copies.
##

Comment

You may incorporate comments in your syntax file using
either of two conventions. The first is the normal C
convention for comments which begin with "/*" and end
with "*/". Such comments may be of arbitrary length. By
setting or resetting the ©nest commentsª switch, you
may control whether they may be nested or not.

The second convention for comments is the C++ comment
convention. In this case the comment begins with "//"
and ends with a newline.

When writing a ©grammarª, you may wish to allow a user
to comment his input freely without your having to
explicitly allow for comments in your grammar. You may
accomplish this by using the ©disregardª statement.
##

Compile Command

"Compile command" is a ©configuration parameterª which
takes a string value. This parameter was used in the
DOS version of AnaGram, but is ignored in the Windows
version.
##

Complement

In set theory, the complement of a set, S, is the set
of all elements of the ©universeª which are not members
of the set S.

In AnaGram, the complement operator for ©character
setsª is given by '~' and has higher precedence than
©differenceª, ©intersectionª, or ©unionª.

In AnaGram, the most useful complement is that of the
end of file character set. For ordinary ascii files it
is often convenient to read the entire file into
memory, append a zero byte to the end, and define the
end of file set thus:
	eof = 0 + ^Z
Then, ~©eofª represents all legitimate input characters.

You can then use set differences to specify certain
useful sets without tedious enumeration. For example, a
comment that is to be terminated by the end of line
then consists of characters from the set
	comment char = ~'\n' & ~eof
This set could also be written
	comment char = ~('\n' + eof)
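
That the two forms describe the same set can be checked by brute
force. The sketch below is only an illustration: the function names
are hypothetical, a universe of codes 0 to 255 is assumed, and eof is
taken to be the set 0 + ^Z defined above, with ^Z being code 26.

```c
/* eof = 0 + ^Z */
int is_eof(int c) {
  return c == 0 || c == 26;
}

/* comment char = ~'\n' & ~eof */
int comment_char_a(int c) {
  return !(c == '\n') && !is_eof(c);
}

/* comment char = ~('\n' + eof) */
int comment_char_b(int c) {
  return !(c == '\n' || is_eof(c));
}

/* Returns 1 if both definitions agree on every code 0..255. */
int definitions_agree(void) {
  int c;
  for (c = 0; c < 256; c++) {
    if (comment_char_a(c) != comment_char_b(c)) return 0;
  }
  return 1;
}
```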
##

Completed Rule

A "completed rule" is a ©characteristic ruleª which has no ©marked
tokenª.  In other words, it has been completely matched and will be
reduced by the next input.

If there is more than one completed rule in a state,
the decision as to which to reduce is made based on the
next input token. If there is only one completed rule
in a state, it will be reduced by default unless the
©default reductionsª switch has been reset, i.e.,
turned off.
##

Configuration File

If it can find them, AnaGram reads two configuration
files to set up ©configuration parameterªs. At program
initialization, it will first attempt to read a
configuration file in the directory that contains
the AnaGram executable file you are running. Then it
will read a configuration file in your working
directory. Both files should have the name
"AnaGram.cfg" if they exist. Neither is necessary.

If a parameter is specified in both files, the
specification in the file from the working directory
takes precedence.

The effect of this two stage process is to allow you to
set your standard preferences in the principal
directory, with specific overrides in your working
directories.

The values for configuration parameters in ©syntax
filesª override those read from configuration files.

AnaGram does not save configuration parameters in
the Windows registry, nor does it provide any
mechanism for setting or changing the values of
configuration parameters within AnaGram itself.
##

Configuration Parameter

Configuration parameters may be specified either in
©configuration filesª or in your ©syntax fileª. In your
syntax files, configuration parameters are specified,
one per line, in a ©configuration sectionª.

AnaGram ignores case when identifying a configuration
parameter, so that "ALLOW MACROS", "Allow Macros", and
"allow macros" are all equivalent forms.

There may be any number of configuration sections in a
©syntax fileª. Any parameter may be specified any
number of times. Since AnaGram maintains only one value
in storage for these parameters, whenever it refers to
one it will see the most recently specified value.
Every configuration parameter has a default value which
has been chosen to correspond to a standard if it
exists, customary usage if such can be determined, or
otherwise to the most likely usage.

Before executing an Analyze Grammar or Build Parser command, AnaGram
resets configuration parameters to their initial values, as
determined by the built in defaults and the configuration files read
at program initialization.

The ©Configuration Parameters Windowª shows the current settings of all
of the configuration parameters. When this window is active you may
press ©F1ª or click with the ©help cursorª to pop up a help window
describing the parameter under the cursor bar.

There are several varieties of configuration
parameters. Some simply set or reset a condition. These
need simply be stated to set the condition or negated
with the tilde (~) to reset the condition. Thus
	[ nest comments ]
causes AnaGram to allow nested comments, and
	[ ~nest comments ]
causes AnaGram to disallow nested comments.

If you prefer you may explicitly specify a switch value as on or off:
	[ nest comments = on]

 A second kind
of configuration parameter takes a value
which is the name of a token. Thus
	[ grammar token = c grammar]
specifies that the token, c grammar, is the ©grammar
tokenª which is to be analyzed.

A third variety of configuration parameter takes a
value which is a C data type. Thus
	[ default token type = unsigned char *]
signifies that the ©semantic valueª of a token, unless
otherwise specified, is a pointer to an unsigned char.

A fourth variety of configuration parameter takes a
string value to set some ascii string used by AnaGram.
Thus
	[ header file name = "widget.h" ]
signifies that the header file created by AnaGram
should be called "widget.h".

In string-valued parameters used to specify the names
of output files or the name of your parser, you may use
the '#' character to indicate the name of your syntax
file: When the string is actually used, AnaGram will
substitute the syntax file name for the '#'.

In string-valued parameters used to specify the names
of functions or variables that AnaGram generates, you
may use '$' to specify the name of your parser. When
the string is actually used, AnaGram will substitute
the name of your parser for the '$'.

In the "©enum constant nameª" configuration parameter
you may use '%' to specify where a token name is to be
substituted.
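
The marker substitution described above can be sketched in C. This is
an illustration, not AnaGram's implementation: the function name
expand_marker and the patterns "#.h" and "$_value" are hypothetical,
and whether '#' stands for the syntax file name with or without its
extension is not specified here.

```c
#include <stddef.h>
#include <string.h>

/* Copy pattern to out, replacing every occurrence of marker with name.
   Returns 0 on success, or -1 if out is too small for the result. */
int expand_marker(const char *pattern, char marker, const char *name,
                  char *out, size_t out_size) {
  size_t used = 0;
  size_t name_len = strlen(name);

  if (out_size == 0) return -1;
  for (; *pattern != '\0'; pattern++) {
    if (*pattern == marker) {
      /* Leave room for the terminating zero byte. */
      if (used + name_len >= out_size) return -1;
      memcpy(out + used, name, name_len);
      used += name_len;
    } else {
      if (used + 1 >= out_size) return -1;
      out[used++] = *pattern;
    }
  }
  out[used] = '\0';
  return 0;
}
```

With the name "widget", the pattern "#.h" expands to "widget.h" when
'#' is the marker, and "$_value" expands to "widget_value" when '$' is
the marker.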

The final variety of configuration parameter takes a
numeric value. The value may be decimal, octal
or hexadecimal, following the C conventions, and may
have an optional sign. Thus
	[parser stack size = 50]
tells AnaGram to allocate space for at least fifty stack entries
when it creates your parser.
##

Configuration Parameters Window

The Configuration Parameters window lists the
©configuration parameterªs AnaGram accepts with their
current values, as set by the ©configuration filesª it
has read and by the most recent ©syntax fileª it has
analyzed. Configuration parameters cannot be changed
from within AnaGram.
##

Configuration Section

A configuration section is one of the main divisions of
your ©syntax fileª. It begins with a left square
bracket on a fresh line. It then contains definitions
of ©configuration parameterªs, ©configuration switchª
settings and ©attribute statementªs. These
specifications must each start on a new line. The
configuration section is closed with a right bracket.
Any further component of your syntax file, other than a
©commentª, must start on a fresh line.

There can be any number of configuration sections in a
syntax file.
##

Configuration Switch

A configuration switch is a ©configuration parameterª
which can take on only the two values true and false,
or on and off. You set a configuration switch, or turn
it on, by simply naming it in your ©configuration fileª
or in a ©configuration sectionª of your ©syntax fileª.
You turn it off, or "reset" it, by use of the tilde:
"~nest comments", for example, resets, or turns off,
the ©nest commentsª switch. If you prefer, you may
assign the value "on" to set the switch, or "off" to
reset it. For example:
	nest comments = on
##

Conflict

"Conflicts" arise during the ©grammar analysisª when
AnaGram cannot determine how to treat a given input
token. There are two sorts of conflicts: ©shift-reduce
conflictsª and ©reduce-reduce conflictsª. Conflicts may
arise either because the grammar is inherently
ambiguous, or simply because the grammar analyzer
cannot look far enough ahead to resolve the conflict.
In the latter case, it is often possible to rewrite the
grammar in such a way as to eliminate the conflict. In
particular, ©null productionsª are a common source of
conflicts.

When AnaGram analyzes your grammar, it lists all
unresolved conflicts in the ©Conflictsª window. A number
of ©Auxiliary Windowsª available from the Conflicts window
provide help in identifying the source of the conflict.

There are a number of ways to deal with conflicts. If
you understand the conflict well, you may simply choose
to ignore it. When AnaGram encounters a shift-reduce
conflict while building parse tables it resolves it by
choosing the ©shift actionª. When AnaGram encounters a
reduce-reduce conflict while building parse tables, it
resolves it by selecting the ©grammar ruleª which
occurred first in the grammar.

A second way to deal with conflicts is to set ©operator
precedenceª parameters. If you set these parameters,
AnaGram will use them preferentially to resolve
conflicts. Any conflicts so resolved will be listed in
the ©Resolved Conflictsª window.

A third way to resolve a conflict is to declare some
tokens as ©stickyª. This is particularly useful for
©productionªs whose sole purpose is to skip over
uninteresting input.

A fourth way to resolve conflicts is to declare a token
to be a ©subgrammarª. When you do this, AnaGram does
not look beyond the definition of the subgrammar token
itself for reducing tokens. This is not a particularly
selective way to resolve conflicts and should be used
only when the subgrammar token is naturally defined
only by internal criteria. The tokens identified by
lexical scanners are prime examples of this genre.

The fifth way to deal with conflicts is to rewrite the
grammar to eliminate them. Many people prefer this
approach since it yields the highest level of
confidence in the resulting program.

Please refer to the AnaGram User's Guide for more information about
dealing with conflicts.
##

Conflicts

If there are ©conflictªs in your grammar which are not
resolved by ©precedence rulesª, they will be listed in
the Conflicts window. The Conflicts window will also be
listed in the ©Browse Menuª. Conflicts which have been
resolved by ©precedence rulesª are listed in the
©Resolved Conflictsª window.

The Conflicts window lists the conflicts, or
ambiguities, which AnaGram found in your grammar. The
table identifies the ©parser statesª in which it found
conflicts, the ©conflict tokenªs for which it had more
than one option, and the ©marked rulesª for each such
option. If one of the rules for a particular conflict
has a ©marked tokenª, the conflict is
a ©shift-reduce conflictª. The marked token is the token
to be shifted. If none of the rules has a marked token the conflict is
a ©reduce-reduce conflictª.

AnaGram provides a number of ©Auxiliary Windowsª to help
you find and fix the source of the conflict. The
©Conflict Traceª window is a pre-built ©Grammar Traceª
window which shows you one of perhaps many ways to
encounter the conflict. The ©Reduction Traceª window
shows the result of reducing a particular ambiguous
rule.

In addition, the ©Rule Derivationª and ©Token
Derivationª windows show you why the conflict token is a
©reducing tokenª. They are particularly useful for
shift-reduce conflicts.

The ©Expansion Chainª window is helpful for understanding
reduce-reduce conflicts.

Other Auxiliary Windows which are often useful are the
©State Definitionª window, the ©Reduction Statesª
window, and the ©Problem Statesª window.

Please refer to the AnaGram User's Guide for more information on how to
deal with conflicts.
##

Conflicts Resolved by Precedence Rules

This ©warningª message indicates that AnaGram has
resolved conflicts in your grammar by using ©precedence
rulesª: guidelines you supplied either by explicit
©precedence declarationsª, by using a ©stickyª
statement or ©distinguish lexemesª statement, or
implicitly by using a ©disregardª statement. These
conflicts are listed in the ©Resolved Conflictsª
window, and are not listed in the ©Conflictsª window.
##

Conflict Token

In any given ©conflictª, there is a ©tokenª for which
an unambiguous ©parser actionª cannot be determined.
This token is called the "conflict token".
##

Conflict Trace

The Conflict Trace is a ready-made ©Grammar Traceª
which shows you one of perhaps many ways to get to the
state which has the ©conflictª selected by the cursor
bar. The Conflict Trace window is an option in the
©Auxiliary Windowsª menu for the ©Conflictsª window and
the ©Resolved Conflictsª window.
##

Const Data

The const data ©configuration switchª controls the use
of CONST qualifiers in generated code. If the switch is
set, all fixed data arrays in the ©parser fileª will be
qualified as CONST, unless the ©old styleª switch is
set. The default setting is ON. Other configuration
switches which control declaration qualifiers in the
parser file are ©near functionsª and ©far tablesª.
##

CONTEXT

"CONTEXT" is a macro which AnaGram defines for you if
you have defined a ©context typeª. It provides access
to the top value of the ©context stackª. Your
©GET_CONTEXTª macro may store the current context by
assigning a value to CONTEXT. Suppose your parser uses
©pointer inputª, and you wish to know the value of the
©pointerª for every production. You could define
GET_CONTEXT thus:
	#define GET_CONTEXT CONTEXT = PCB.pointer

 In ©reduction procedureªs, you may use the CONTEXT
macro to find the context for the rule you are
reducing, that is to say, the value the context
variables had when the first token in the rule was
encountered.
##

Context Stack

It is often convenient, when writing ©reduction
procedureªs, to know the actual context of the ©grammar
ruleª your procedure is reducing. To do this you need
to know the values that certain variables, such as
stack pointers, or input pointers, in your program had
at various stages as your parser matched the rule. You
can accomplish this by maintaining a context stack.

If you wish, AnaGram will keep track, on a stack, of any
context variables you wish. To do so, define a structure
which can hold all the values you need to stack. Use the
©context typeª ©configuration parameterª to tell AnaGram
how to declare the stack. Then define the ©GET_CONTEXTª
macro to gather the appropriate values and store them on
the stack. The ©CONTEXTª macro evaluates to the proper
location into which the GET_CONTEXT macro should store
the context value. AnaGram will invoke the GET_CONTEXT
macro whenever necessary to make sure the right values
are stacked. In a reduction procedure, you can then use
the macro ©RULE_CONTEXTª to find the value of the
context structure as of the beginning of each token in
the rule you are reducing.

If your parser is ©event drivenª, store the context of
the input token in PCB.input_context. The default
version of GET_CONTEXT will stack the context as
appropriate.

If your parser should encounter an error, you may use
©ERROR_CONTEXTª to determine the values of the context
variables at the beginning of the aborted grammar rule.
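
The bookkeeping described above can be pictured with a small
stand-alone sketch. Everything here is a mock: a real parser gets its
parser control block, context stack and CONTEXT macro from AnaGram, and
the names parse_context, mock_pcb, mock_push and the ssx index are
assumptions made only for illustration.

```c
#include <stddef.h>

/* Hypothetical context structure: the values we want stacked. */
typedef struct {
    const unsigned char *pointer;  /* input position when the token began */
    int line;                      /* illustrative extra field */
} parse_context;

/* Stand-in for the generated parser control block. Only the pieces
   used in this sketch are mocked. */
typedef struct {
    const unsigned char *pointer;  /* current input pointer */
    int line;                      /* current line (illustrative) */
    parse_context cs[32];          /* mock context stack */
    int ssx;                       /* mock stack index */
} mock_pcb;

static mock_pcb PCB;

/* In generated code AnaGram supplies CONTEXT; this mock simply names
   the current context stack slot. */
#define CONTEXT (PCB.cs[PCB.ssx])

/* User-defined: gather the current values into the stack slot. */
#define GET_CONTEXT (CONTEXT.pointer = PCB.pointer, CONTEXT.line = PCB.line)

/* Mock of the moment at which the parser stacks a new token:
   record the context first, then advance the stack index. */
static void mock_push(void)
{
    GET_CONTEXT;
    PCB.ssx++;
}
```

After two calls to mock_push(), PCB.cs[0] and PCB.cs[1] hold the
context as it was when the first and second tokens began, which is the
information a reduction procedure would read back.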
##

context type

"Context type" is a ©configuration parameterª whose
value is a C type name, possibly as defined by a
typedef statement. By default, "context type" is
undefined. If you define it, AnaGram will set up a
©context stackª in your ©parser control blockª so you
can track the context of ©productionªs.

Each time your parser pushes values onto the state
stack and value stack it will invoke the ©GET_CONTEXTª
macro to store the current context on the context
stack. The macro ©CONTEXTª names the current stack
location. In your GET_CONTEXT macro you can use it as
the destination for the current context. In a
©reduction procedureª, CONTEXT names the context as of
the beginning of the production. Two other macros are
available to inspect the values of the context stack.
In a reduction procedure, you may use ©RULE_CONTEXTª[k]
to determine the value of the context variable as it
was as of the (k+1)th token in the rule. In particular,
RULE_CONTEXT[0] is the value the context variable had
when the first token in the rule was seen.

If you enable the ©error frameª ©configuration switchª,
you may use ©ERROR_CONTEXTª to determine the context of
the production your parser was trying to identify at
the time of the error.
##

CONVERT_CASE

CONVERT_CASE is a user definable macro which AnaGram
invokes to convert the case of input characters when
the ©case sensitiveª switch has been turned off. If
you do not define the macro yourself, AnaGram will
provide a macro which will convert case correctly
for characters in the ASCII character range and
also for ©ISO latin 1ª characters if the corresponding
©configuration switchª is on.

##

Coverage File Name

If you have set the ©rule coverageª ©configuration
switchª to include coverage analysis in your parser,
AnaGram uses the value of the coverage file name
©configuration parameterª to find the results of your
testing. The value of the parameter is a string. The
default value is "#.nrc", where '#' represents the name
of your syntax file.
##

cs

cs is a field in a ©parser control blockª which
contains your ©context stackª. cs will be defined only
if you have defined the ©configuration parameterª
©context typeª.
##

Current Grammar

The Current Grammar is the ©grammarª you presently have
loaded. Its name is displayed on the title bar of
each AnaGram window.

A status field at the right center of the ©Control Panelª
indicates the state of processing that has been
carried out on the grammar.

"Loaded" means that the ©syntax fileª has been read
into memory, but that syntax errors have been found.

"Parsed" means that AnaGram has tried to analyze the
grammar, but got into some kind of difficulty and did
not complete the job. The explanation should be
apparent from the messages in the ©Warningsª window.

"Analyzed" means that a ©grammar analysisª has been
completed, but no ©output filesª have been written.

"Built" means that an analysis has been completed and
output files have been written.
##

Data Type

The ©tokensª in your ©parserª usually have ©semantic
valuesª. The data types for these values will be
determined by the ©default input typeª and ©default
token typeª ©configuration parameterªs unless you
explicitly provide ©token declarationsª in your grammar.
You may also define the data type for any ©nonterminalª
token by preceding the token name with an ordinary C
cast when you write a production. For example:

	(int) integer
		-> '0-9':d                =d - '0';
		-> integer:n, '0-9':d     =10*n + d - '0';

The data type may be any simple C or C++ data type, with
arbitrary indirection and qualification. You may also
use any type you have defined by means of typedef,
struct or class definitions. Template classes may also
be used. If you specify a type of your own definition,
you must provide a definition in the ©C prologueª at the
beginning of your ©syntax fileª.

A token may have the type "void" if its value has no
interest for the parser. Since your parser will not
stack a value for a void token, your parser may run
somewhat faster when tokens are declared as void.
##

Declare pcb

"Declare pcb" is a ©configuration switchª that defaults
to on. If this switch is set when you invoke the ©Build
Parserª command, AnaGram will automatically declare a
©parser control blockª for you, at the beginning of
your parser file. If you have used data types that you
define yourself, the typedef statements need to precede
the parser control block declaration. In this case, you
should turn "declare pcb" off and declare it yourself.

For more information, see the AnaGram User's Guide.
##

Default Input Type

The default input type is a ©configuration parameterª
which determines the ©data typeª for the ©semantic
valueªs of ©terminal tokensª if they are not explicitly
declared. Normally, you would explicitly declare
terminal tokens only when you have set the ©input
valuesª ©configuration switchª. If you do not set the
default input type, it will default to "int".

The default data type for the values of ©nonterminal
tokensª is given by the ©default token typeª
configuration parameter.
##

Default Reduction

"Default reductions" is a ©configuration switchª which
defaults to on.

A "default reduction" is a ©parser actionª which may be
used in your parser in any state which has precisely
one ©completed ruleª.

If a given ©parser stateª has, among its ©characteristic
rulesª, exactly one completed rule, it is usually faster
to reduce it on any input than to check specifically for
correct input before reducing it. The only time this
default reduction causes trouble is in the event of a
©syntax errorª. In this situation you may get an
erroneous reduction. Normally when you are parsing a
file, this is inconsequential because you are not going
to continue semantic action in the presence of error.
But, if you are using your parser to handle real-time
interactive input, you have to be able to continue
semantic processing after notifying your user that he
has entered erroneous input. In this case you would want
default reductions to have been turned off so that
©productionªs are reduced only when there is correct
input.
##

Default reduction value

If a ©grammar ruleª does not have a ©reduction procedureª
the ©semantic valueª of the first token in the rule will
be taken as the semantic value of the token on the left
hand side. If these tokens do not have the same ©data typeª
a ©warningª will be given.
##

Default Token Type

"Default token type" is a ©configuration parameterª
which determines the ©data typeª for the ©semantic
valueª of a ©nonterminal tokenª if no other type is
explicitly specified. It defaults to void. Therefore, if
any ©reduction procedureª returns a value, you must
either explicitly set the type of the ©reduction tokenª
or you must set default token type to an appropriate
value.

The default token type cannot have a ©wrapperª class
defined.

The default data type for the value of a ©terminal
tokenª is given by the ©default input typeª
configuration parameter.
##

Definition, Definition Statement

AnaGram syntax files may contain definition statements
which assign new names to ©character setsª, ©virtual
productionsª, ©keyword stringsª, ©immediate actionsª,
or ©tokensª. Definitions have the form
	name = <character set>
	name = <virtual production>
	name = <keyword string>
	name = <immediate action>
	name = <token name>

For example,
	letter = 'a-z' + 'A-Z'
	statement list = statement?...
	include = "include"

The symbols thus defined may be used anywhere the
expression on the right hand side might be used. Such
definitions, in and of themselves, do not define tokens.
Tokens are defined only by their usage in productions.

##

DELETE_WRAPPERS

If your parser uses ©wrapperªs and exits with an error condition, there
may be objects remaining on the ©parser value stackª. The DELETE_WRAPPERS macro
can be used to delete any remaining objects on the stack.
If you have enabled
©auto resynchª, DELETE_WRAPPERS will be invoked automatically.
##

Diagnose Errors

"Diagnose errors" is a ©configuration switchª which
defaults to on. When this switch is on, AnaGram includes a
function, ag_diagnose(), in your parser which provides simple
syntax error diagnoses. When your parser encounters a
syntax error, this function will be called immediately prior
to the invocation of the ©SYNTAX_ERRORª macro. A pointer to the message will be
stored in the ©error_messageª field of the ©parser control blockª.

If you wish to implement your own ©error diagnosisª, you
should turn this switch off, and include a call to your
own diagnostic procedure in your SYNTAX_ERROR macro.

ag_diagnose() provides three possible error messages,
governed by three macros: ©MISSING_FORMATª, ©UNEXPECTED_FORMATª, and
©UNNAMED_TOKENª. You may override the definitions of
these macros with your own definitions if you wish
to provide diagnostics in another language.

If you have set the ©error frameª
switch, ag_diagnose() will also set the ©error_frame_tokenª field.
The "error_frame_token" is the non-terminal token which
the parser was trying to complete when the error was
encountered.

When the "diagnose errors" switch is set, AnaGram also
includes a ©token namesª table in the parser which
contains the ASCII names of the tokens in the grammar,
including entries for character constants and keywords.

Use the ©token names onlyª switch to limit the table
to explicitly named tokens only.
##

MISSING_FORMAT

MISSING_FORMAT is a macro that is used by the error
diagnostic function created by the ©diagnose errorsª
switch. If you do not define it in your parser,
AnaGram will define it thus:
	#define MISSING_FORMAT "Missing %s"

 This format is used when the diagnostic function can
identify a unique terminal or nonterminal token that
would satisfy the syntactic rules and is named
in the ©token namesª table.
##

UNEXPECTED_FORMAT

UNEXPECTED_FORMAT is a macro that is used by the error
diagnostic function created by the ©diagnose errorsª
switch. If you do not define it in your parser,
AnaGram will define it thus:
	#define UNEXPECTED_FORMAT "Unexpected %s"

 This format is used when the diagnostic function cannot
identify a named, unique terminal or nonterminal token that
would satisfy the syntactic rules and finds an
incorrect token, the name of which can be found
in the ©token namesª table.
##

UNNAMED_TOKEN

UNNAMED_TOKEN is a macro that is used by the error
diagnostic function created by the ©diagnose errorsª
switch. If you do not define it in your parser,
AnaGram will define it thus:
	#define UNNAMED_TOKEN "input"

 This macro is used as argument for the ©UNEXPECTED_FORMATª
macro when the actual, erroneous input cannot be identified.
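
As a rough illustration of how the three format macros combine, the
sketch below builds a message with snprintf. The helper
format_diagnostic is hypothetical and is not AnaGram's ag_diagnose();
it only assumes that each format takes a single token name as its %s
argument, as the default definitions suggest.

```c
#include <stdio.h>

/* Defaults as given in the help text. Define your own versions first
   to override them, for instance to produce another language. */
#ifndef MISSING_FORMAT
#define MISSING_FORMAT "Missing %s"
#endif
#ifndef UNEXPECTED_FORMAT
#define UNEXPECTED_FORMAT "Unexpected %s"
#endif
#ifndef UNNAMED_TOKEN
#define UNNAMED_TOKEN "input"
#endif

/* Hypothetical helper: `expected` is the name of the unique token that
   would have been acceptable, or NULL if there is none; `found` is the
   name of the offending token, or NULL if it cannot be identified. */
static void format_diagnostic(char *buf, size_t size,
                              const char *expected, const char *found)
{
    if (expected != NULL)
        snprintf(buf, size, MISSING_FORMAT, expected);
    else
        snprintf(buf, size, UNEXPECTED_FORMAT,
                 found != NULL ? found : UNNAMED_TOKEN);
}
```

With the default definitions, an unidentifiable input yields
"Unexpected input".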
##

Difference

In set theory, the difference of two sets, A and B, is
defined to be the set of all elements of A that are not
elements of B. In an AnaGram ©syntax fileª, you
represent the difference of two ©character setsª by
using the '-' operator. Thus the difference of A and B
is A - B. The difference operator is ©left
associativeª.
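
To see why left associativity matters, here is a small C sketch that
models character sets as membership tables. The charset type and both
helper functions are illustrative assumptions, not part of AnaGram.

```c
#include <stdbool.h>

#define CHARSET_SIZE 256

/* A character set as a membership table (illustrative representation). */
typedef struct { bool member[CHARSET_SIZE]; } charset;

/* Build the set of characters lo..hi inclusive, like 'a-z'. */
static charset charset_range(unsigned char lo, unsigned char hi)
{
    charset s = {{false}};
    for (int c = lo; c <= hi; c++)
        s.member[c] = true;
    return s;
}

/* A - B: every element of a that is not an element of b. */
static charset charset_difference(charset a, charset b)
{
    charset r;
    for (int c = 0; c < CHARSET_SIZE; c++)
        r.member[c] = a.member[c] && !b.member[c];
    return r;
}
```

Take A = 'a-z', B = 'a-m' and C = 'a-e'. Read left to right,
A - B - C means (A - B) - C, which contains only 'n' through 'z'.
Grouping the other way, A - (B - C) would also keep 'a' through 'e'.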
##

Disregard

The purpose of the "disregard" statement is to skip over
uninteresting ©white spaceª and comments in your input
file. It allows you to specify a token that should be
passed over in the input to your parser. The statement
takes the form:
	disregard ws
where "ws" is a token name or character set. Disregard
statements, like other ©attribute statementªs, may be
placed in any ©configuration sectionª.

You may have more than one disregard statement in your
©grammarª. If you do, AnaGram will create a shell
production. For example, suppose you write:
	[ disregard alpha
	   disregard beta ]
AnaGram will proceed as though you had written:
	gamma -> alpha | beta
	[ disregard gamma ]

 It frequently happens that you wish your ©parserª to
disregard blanks or comments, except that ©white spaceª
within names, numbers, strings, and other elementary
constructs is subject to special rules and thus should
not be disregarded blindly. In this case, you can use
the "©lexemeª" statement to declare these constructs off
limits for the disregard statement. Within these
constructs, the disregard statement will be inoperative
and the admissibility of white space is determined
solely by the productions which define these constructs.

Outside those productions which define lexemes, you
should not generally use a token which is supposed to be
disregarded. If you do, your grammar will have
©conflictªs, since the token could satisfy both the
explicit usage, as well as the implicit rules set up by
the disregard statement. Such conflicts, however, are
resolved automatically in favor of your explicit use of
the token. The conflicts will appear in the ©Resolved
Conflictsª window.

If you have "open ended" lexemes in your grammar such
as variable names or numeric constants, your grammar
will detect a conflict if one of these lexemes may
follow another such lexeme immediately. To deal with
these conflicts, you should turn on the "©Distinguish
Lexemesª" configuration switch. It will cause white
space to be required as a separator between the
lexemes.

In order to implement the "disregard" statement AnaGram
will redefine some tokens in your grammar. For example,
'+' may be redefined to consist of a simple plus sign
followed by optional white space:
	'+' -> '+'%, white space?...
The ©percent signª is used to indicate the original,
simple plus without the optional white space attached.
You will probably notice the percent sign appearing in
some windows and traces.
##

distinguish keywords

"distinguish keywords" is an ©attribute statementª
which you may include in a ©configuration sectionª. It
is used to tell AnaGram how to distinguish ©keywordªs
from similar sequences of characters in your input
stream. For example, you may want your parser to
recognize "int" as a keyword when it appears in the
following context:
	int x;
but not when it appears in the middle of such words as
"integral" and "intolerant". The operand of
"distinguish keywords" is a list of character set
©expressionªs separated by commas and enclosed in braces
({ }).

Once AnaGram has read your entire syntax file, it
evaluates all of these character sets and tests each
keyword string against the character sets in the order
in which they were encountered in the program. If all
the characters which constitute a particular keyword
are members of the specified set, the keyword logic is
set up so that it will recognize the keyword only if
the immediately following character is not in the set.

In the example above,
	[distinguish keywords {'a-z'} ]
will do the trick.

The "©stickyª" statement also affects the recognition
of keywords.
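
The rule described above can be sketched in C as follows. This is only
a model of the test, not the code AnaGram generates; the function names
are hypothetical, and the set is fixed to 'a-z' as in the example.

```c
#include <stdbool.h>
#include <string.h>

/* Membership test for the set 'a-z' from the example above. */
static bool in_lowercase_set(unsigned char c)
{
    return c >= 'a' && c <= 'z';
}

/* Hypothetical helper: true if `keyword` (assumed to consist entirely
   of characters in the set) appears at the start of `input` and the
   character immediately following it is not in the set. */
static bool keyword_matches(const char *input, const char *keyword)
{
    size_t n = strlen(keyword);
    if (strncmp(input, keyword, n) != 0)
        return false;
    /* At end of input, input[n] is '\0', which is not in the set. */
    return !in_lowercase_set((unsigned char)input[n]);
}
```

Under this model, keyword_matches("int x;", "int") succeeds, while
"integral" and "intolerant" are rejected because the character after
"int" is a lower case letter.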
##

Distinguish Lexemes

The "distinguish lexemes" ©configuration switchª is
used in conjunction with the "©disregardª" statement
and the "©lexemeª" statement to resolve the
©shift-reduce conflictªs which often crop up when
suppressing white space.

The difficulty with suppressing white space is that you
wish it to be optional in cases like "x+y", where it is
not necessary in order to parse correctly, but you want
to require it in situations such as "mytype x", where
it is necessary to separate otherwise indistinguishable
constructs. If the white space were optional, it would
be necessary to allow for "mytypex", but it would be
impossible to determine if this were to be interpreted as
"mytype x", "mytyp ex", or any of the many other
possibilities.

The distinguish lexemes switch causes AnaGram to make
the white space optional where doing so causes no
ambiguity and makes it mandatory where to make it
optional would lead to ambiguity. In the example given
above, "mytypex" would be treated as a single name, and
another name would have to follow separating white
space.

The default value for distinguish lexemes is OFF. It is
anticipated that this will be changed to ON in future
releases of AnaGram.
##

Duplicate Production

This ©warningª message appears when a ©productionª
appears twice in your ©grammarª. You will have a
number of ©reduce-reduce conflictªs as a consequence.
Eliminate the duplicate, and the conflicts it caused
will go away.
##

Edit Command

"Edit command" is a ©configuration parameterª which
accepts a string value. It is no longer used and is
retained only for file compatibility with the DOS
version of AnaGram.
##

Embedded C

You may encapsulate pieces of C or C++ code in your ©syntax
fileª more or less arbitrarily. Such pieces of code will
simply be copied to the ©parser fileª in the order in
which they are encountered. Each such piece of code must
be enclosed with braces ({}). The left brace must be on a
new line, and nothing except comments may follow the
right brace. AnaGram does not inspect the interior of
such a piece of C code except to identify character
constants, strings, comments and blocks surrounded with
braces so that it does not identify the end of the
embedded C prematurely. Note that AnaGram will use the
status of the ©nest commentsª ©configuration switchª in
effect at the beginning of the embedded C.

AnaGram, of course, can be confused by unterminated
strings, unbalanced brackets, and unterminated comments.
The most likely outcome, in such a situation, is that
AnaGram will encounter an end of file looking for the
end of the embedded C. Should this happen, AnaGram will
identify the beginning of the piece of embedded C which
caused the problem.

If your syntax file begins with a block of embedded C,
called the "©C prologueª", it will be copied to the very
beginning of the parser file, preceding all of AnaGram's
output. You may use such an initial block of embedded C
to guarantee that program title comments, copyright
notices and important definitions are at the very
beginning of your parser file.

The code you include as embedded C, of course, has to
coexist with the code AnaGram generates. In order to
keep the potential for name conflicts to a minimum, all
variables and functions which AnaGram defines begin with
the letters "ag_". You should avoid variable names which
begin with these letters.

If AnaGram finds no embedded C in a syntax file, and you
ask it to build a parser, it will automatically generate
a main program that calls your parser. If you don't want
it to do this, you may turn off the ©main programª
©configuration switchª.
##

Empty Keyword String

This ©warningª appears when you have a keyword string
that contains no characters whatsoever. ©Keyword
stringsª must contain at least one character. If you
wish a null match, use a ©null productionª instead.
##

Enable Mouse

"Enable mouse" is a ©configuration switchª that defaults
to on. It is not used in the Windows version of AnaGram
and has been retained only for file compatibility with
the DOS version.
##

Enum Constant Name

The "enum constant name" ©configuration parameterª
allows you to select the name AnaGram will use for the
set of enumeration constants it defines in the ©parser
headerª file for your ©parserª. The value of "enum
constant name" should be a string containing the '%'
character. AnaGram will substitute each token name in
turn into this template as it creates the list of
enumeration constants. If it finds a '$' character it
will substitute the name of your parser. The default
value of "enum constant name" is "$_%_token".
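
The substitution can be modeled with a few lines of C. The helper
below is a hypothetical illustration of the template rule as described
here, not AnaGram's own code, and the parser name "calc" and token name
"integer" are made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy `tmpl` to `out`, replacing each '$' with
   the parser name and each '%' with the token name. Output that would
   not fit in `size` bytes is truncated. */
static void expand_enum_name(char *out, size_t size, const char *tmpl,
                             const char *parser, const char *token)
{
    size_t used = 0;
    if (size == 0)
        return;
    out[0] = '\0';
    for (; *tmpl != '\0'; tmpl++) {
        char single[2] = { *tmpl, '\0' };
        const char *piece = single;
        if (*tmpl == '$')
            piece = parser;
        else if (*tmpl == '%')
            piece = token;
        size_t len = strlen(piece);
        if (used + len >= size)
            break;                      /* truncate rather than overflow */
        memcpy(out + used, piece, len + 1);
        used += len;
    }
}
```

With the default template "$_%_token", a parser named calc and a token
named integer would give calc_integer_token.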
##

Enumeration Constants

In your ©parser headerª file, AnaGram includes a typedef
enum statement which provides enumeration constants
corresponding to all the named tokens in your
grammar. The names of the enumeration constants
themselves are defined by the ©enum constant nameª
©configuration parameterª. These constants are useful
when dealing with ©semantically determined productionsª.
##

Enum

Within a ©configuration sectionª, you may use an "enum"
statement to define numeric values for any number of
tokens just as you define enumeration constants in C.
The syntax is effectively the same as the enum statement
in C:

  [
    enum {
      first = 60,
      second,
      third,
      fourth = 'a',
      fifth,
    }
  ]

is exactly equivalent to
  first = 60
  second = 61
  third = 62
  fourth = 'a'
  fifth = 'b'
##

eof

"eof" is a quasi reserved word in AnaGram, used to
specify an end of file token. You may use another token
as an end of file delimiter by setting the ©Eof Tokenª
©configuration parameterª. eof is not required unless
you use ©automatic resynchronizationª in your ©parserª.

If you have not defined eof or specified an Eof Token
parameter, ©File Traceª may show a syntax error when it
encounters the end of a test file.

There are various ASCII values that are commonly used
to represent an end of file. The end of a string in
memory is commonly 0, DOS uses ^Z, Unix uses ^D, and
Unix style stream I/O uses -1. It is often convenient
then to define

	eof = -1 + 0 + ^D + ^Z
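
Since '+' forms the union of character sets, the definition above
matches any one of the four values. In C terms it behaves like the
following predicate; the function name is hypothetical and serves only
to spell out the values.

```c
#include <stdbool.h>

/* Mirrors eof = -1 + 0 + ^D + ^Z: the union of four values. */
static bool is_eof_value(int c)
{
    return c == -1      /* stream I/O end of file */
        || c == 0       /* end of a string in memory */
        || c == 4       /* ^D, control-D */
        || c == 26;     /* ^Z, control-Z */
}
```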
##

Eof Token

"Eof token" is a ©configuration parameterª which accepts
a token name as a value. There is no default value.
AnaGram does not need a specification for the eof token
unless you are using its ©automatic resynchronizationª
facility.

If you use the ©automatic resynchronizationª capability
of AnaGram, you must specify explicitly an end of file
token. You can do this either by defining a ©terminal
tokenª in your ©grammarª called eof or by using the "eof
token" parameter to identify some other terminal token
to be used as the end of file marker. You would do this
only if you must use the name "©eofª" for some other
purpose.

Note that "eof" is case sensitive. Neither Eof nor
EOF will qualify as end of file tokens unless you
explicitly specify them using the eof token parameter.
##

Eof Token Not Defined

This ©warningª appears if you have requested either
©error token resynchronizationª or ©automatic
resynchronizationª and you have not defined an ©eof
tokenª. The resynchronization procedure will not work
correctly at end of file.
##

Error Action

The error action is one of the four ©parser actionªs of a
traditional ©parsing engineª. The error action is
performed when the parser has encountered an input
token which is not admissible in the current state.
The further behavior of a traditional parser is
undefined.
##

Error Defining

"Error defining TXXX: <token representation>" is a
©warningª message which appears if errors are encountered
while attempting to evaluate the ©character setª for
the specified ©tokenª. This warning is always generated
in addition to more detailed warnings that are made
when the actual errors are encountered.
##

Error frame

"Error frame" is a ©configuration switchª which defaults
to off. You use this switch to specify the ©error
diagnosisª capabilities of your parser. If this switch
is set and the ©diagnose errorsª switch is set, i.e.,
on, your parser will include a function which will
determine the "context" of any ©syntax errorª, that is,
the token the parser was trying to complete.

To determine the context of an error, your parser will
scan backwards through the ©parser state stackª,
examining ©characteristic rulesª until it finds a state
which can accept a unique ©nonterminalª reduction token
that you have not marked as ©hiddenª. It will then set
PCB.©error_frame_ssxª to the ©parser stack indexª for
that level.
##

ERROR_CONTEXT

ERROR_CONTEXT is a macro AnaGram defines for you. If
your parser encounters a ©syntax errorª, you have
enabled the ©error frameª ©configuration switchª, and
you have defined a ©context typeª, ERROR_CONTEXT will
enable you to access the ©contextª as of when the parser
encountered the beginning of the ©error_frame_tokenª.
##

Error Diagnosis

"Error diagnosis" and ©error recoveryª are the two
aspects of ©error handlingª. If in the ©embedded Cª
portion of your syntax file you define a macro called
©SYNTAX_ERRORª, it will be invoked by the parser when a
©syntax errorª is encountered. If you have set the
©diagnose errorsª ©configuration switchª, the
©error_messageª field of the ©parser control blockª will
contain a pointer to a string containing a diagnostic
message. The diagnostic is of the form "Missing <token
name>" or "Unexpected <token name>".

If you do not define SYNTAX_ERROR it will be
automatically defined so that a message will be written
to stderr.

If the ©lines and columnsª switch has been set you will
have the current line number and column number available
for your diagnostic message.

If you have set the ©error frameª switch as well as the
diagnose errors switch, the variable
PCB.©error_frame_tokenª will identify the ©nonterminal
tokenª the parser was trying to recognize when the
error was encountered.

Of course, if your parser is controlling direct keyboard
input, a diagnosis might be unnecessary. In this case
you might define SYNTAX_ERROR so that it simply beeps at
the user and let it go at that.
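
A user-defined SYNTAX_ERROR might look something like the sketch
below. The parser control block is mocked so that the fragment stands
alone. The field names line and column are assumptions about what the
lines and columns switch provides, and format_syntax_error is a
hypothetical helper.

```c
#include <stdio.h>

/* Stand-in for the generated parser control block. */
typedef struct {
    const char *error_message;  /* set when diagnose errors is on */
    int line;                   /* assumed field name */
    int column;                 /* assumed field name */
} mock_pcb;

static mock_pcb PCB;

/* Hypothetical helper: build the text of the diagnostic. */
static int format_syntax_error(char *buf, size_t size)
{
    return snprintf(buf, size, "%s, line %d, column %d",
                    PCB.error_message, PCB.line, PCB.column);
}

/* Report the error on stderr. A keyboard-driven parser could
   substitute a simple beep here instead. */
#define SYNTAX_ERROR \
    do { \
        char msg[128]; \
        format_syntax_error(msg, sizeof msg); \
        fprintf(stderr, "%s\n", msg); \
    } while (0)
```

In a real syntax file the macro definition would go in the embedded C,
and PCB would be the parser control block AnaGram declares.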
##

Error Handling

Rarely is a parser built to read an arbitrary input
file. The normal situation is that the parser is built
to read files that conform to the rules specified in a
grammar, rules that describe a class of input files
rather than all possible input files. If the input file
does not conform to the grammar, the parser will detect
a ©syntax errorª.

There are two aspects to error handling in your parser:
©error diagnosisª and ©error recoveryª. Error diagnosis
consists in informing your user that something
unexpected has happened. Error recovery consists in
either aborting the parse, or getting it started again
in some reasonable manner. AnaGram provides several
options for both error diagnosis and error recovery.

When a syntax error is encountered, first your error
diagnosis option is executed and then your error
recovery option is executed.
##

error_message

error_message is a field in a ©parser control blockª to
which your ©error handlingª procedures may refer. If you
have set the ©diagnose errorsª ©configuration switchª,
on encountering a ©syntax errorª your ©parserª will
create a string containing an appropriate diagnostic
message and store a pointer to it into
PCB.error_message.
##

Error Trace

"Error Trace" is both a ©configuration switchª and the
name of an option in the ©Action Menuª. If the switch
is on, AnaGram adds code to your parser to capture
state information to a file in case of a ©syntax errorª. The Error
Trace option can then read this information and prepare a pre-built
©Grammar Traceª showing you the state of the parser at the time of
the error.

The name of the file is determined by the macro
©AG_TRACE_FILE_NAMEª. AnaGram will provide a default
definition for the macro consisting of the name of
your ©syntax fileª plus the extension ".etr". You
may override this definition by defining AG_TRACE_FILE_NAME
in your ©embedded Cª.

If error trace is enabled, AnaGram will also enable the
Error Trace option on the ©Action Menuª. If you select
Error Trace AnaGram will initialize a ©Grammar Traceª
window from the error trace file you select. The parser
stack of the trace will be as it was when the error
occurred. The last line of the parser stack pane will
show the ©lookahead tokenª that caused the syntax error. You may
then use the Grammar Trace to explore the nature of
the syntax error your parser encountered.

AnaGram will
warn you if the error trace file is older than
the syntax file, since under those conditions, the
error trace file might be invalid.
##

AG_TRACE_FILE_NAME

AG_TRACE_FILE_NAME is a C macro used to determine the
name of the file your parser will write when it
encounters a ©syntax errorª if you have enabled
the ©error traceª ©configuration switchª.

You may define AG_TRACE_FILE_NAME in your ©embedded Cª.
AnaGram provides a default definition given by the
name of your ©syntax fileª with the extension ".etr".
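
The override described above might look like the following sketch.
The #ifndef guard is only an assumption about how a default could
give way to your own definition, "calc.etr" assumes a syntax file
named calc.syn, and trace_file_name is a hypothetical accessor
added for illustration.

```c
/* In embedded C: choose the error trace file explicitly. */
#define AG_TRACE_FILE_NAME "logs/calc.etr"

/* Assumed shape of a default definition: it applies only when
   no definition has been supplied already. */
#ifndef AG_TRACE_FILE_NAME
#define AG_TRACE_FILE_NAME "calc.etr"
#endif

/* Hypothetical accessor so the effective name can be inspected. */
const char *trace_file_name(void) {
    return AG_TRACE_FILE_NAME;
}
```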
##

Error Recovery

Error recovery is the process of continuing after a
©syntax errorª. AnaGram offers several options. These
are controlled by ©configuration parameterªs and by
your grammar.

If you do not specify any error recovery, your parser
will simply return to the calling program when it
encounters a syntax error. ©PCBª.©exit_flagª will be set
to two, to indicate termination on syntax error.

If you wish your parser to simply ignore the erroneous
token and continue, set PCB.exit_flag to zero in your
©SYNTAX_ERRORª macro. You might use this option if your
parser is dealing directly with keyboard input.

You may wish to use YACC type error handling. To do
this, simply incorporate a token called "error" in your
grammar, or specify some other token as an ©error
tokenª. On syntax error, your parser will back up to
the most recent state where "error" was acceptable
input, treat the bad input as an instance of error, and
then skip all input until it finds an acceptable input
token. At that point it will proceed as though nothing
had happened.

AnaGram also provides an ©automatic resynchronizationª
option, which uses a complex heuristic to compare input
tokens against all stacked states in order to find the
best state from which to continue.
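
The simplest option above, ignoring the erroneous token, can be
sketched as follows. The control block and the parse_digits
function are toy stand-ins written for this illustration, not
generated code; only the idea of resetting PCB.exit_flag to zero
inside SYNTAX_ERROR comes from the description above.

```c
enum { AG_RUNNING_CODE = 0, AG_SUCCESS_CODE = 1, AG_SYNTAX_ERROR_CODE = 2 };

/* Hypothetical stand-in for the parser control block. */
typedef struct { int exit_flag; int input_code; } pcb_type;
pcb_type demo_pcb;
#define PCB demo_pcb

int errors_ignored;

/* Count the error, then clear the flag so the parse resumes. */
#define SYNTAX_ERROR (errors_ignored++, PCB.exit_flag = AG_RUNNING_CODE)

/* Toy stand-in for a parser that accepts decimal digits only.
   Returns the number of digits accepted. */
int parse_digits(const char *text) {
    int accepted = 0;
    PCB.exit_flag = AG_RUNNING_CODE;
    for (; *text != '\0'; text++) {
        PCB.input_code = (unsigned char)*text;
        if (PCB.input_code < '0' || PCB.input_code > '9') {
            PCB.exit_flag = AG_SYNTAX_ERROR_CODE;
            SYNTAX_ERROR;
            if (PCB.exit_flag != AG_RUNNING_CODE)
                return accepted;    /* no recovery: give up */
            continue;               /* bad token dropped */
        }
        accepted++;
    }
    PCB.exit_flag = AG_SUCCESS_CODE;
    return accepted;
}
```

With this definition, parse_digits("12x3?4") accepts four digits
and ignores two bad characters. If SYNTAX_ERROR left exit_flag
alone, the same call would stop at the first bad character.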
##

Error Token Resynchronization

One of your options for ©error recoveryª after a ©syntax
errorª is a technique similar to that provided in YACC.
You include a terminal token called "error" in your
grammar. (Or, use the ©error tokenª configuration
parameter to specify some other token to serve this
purpose.) When the parser encounters an error in the
input, after invoking the ©SYNTAX_ERRORª macro, it backs
up the ©parser state stackª to the most recent state in
which "error" was an acceptable input. It then shifts to
the new state as though it had seen an actual "error"
token. At this point, it skips over any character in the
input which is not an acceptable input character for
this state. Once it does find an acceptable input
character, it continues processing as though nothing had
happened.
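
The three steps above (back up, shift "error", skip input) can be
sketched in C as follows. The tables describe a hypothetical
four-state toy parser and the function name resynchronize is
invented for this illustration; the code AnaGram generates is
organized differently.

```c
#include <stddef.h>

#define N_STATES 4

/* Hypothetical table: error_goto[s] is the state entered by
   shifting "error" from state s, or -1 if "error" is not
   acceptable input in state s. */
const int error_goto[N_STATES] = { 3, -1, -1, -1 };

/* Characters acceptable as input in each state (toy data). */
const char *acceptable[N_STATES] = {
    "0123456789", "+-", "0123456789", ";"
};

int is_acceptable(int state, int ch) {
    const char *p;
    for (p = acceptable[state]; *p != '\0'; p++)
        if (*p == ch) return 1;
    return 0;
}

/* stack[0..*depth-1] holds the state stack and must have room
   for one more entry. Returns a pointer to the first acceptable
   input character (or to the terminating NUL if none remains),
   or NULL if no stacked state accepts "error". */
const char *resynchronize(int *stack, int *depth, const char *input) {
    int top = *depth - 1;
    while (top >= 0 && error_goto[stack[top]] < 0)
        top--;                              /* back up the stack */
    if (top < 0) return NULL;
    stack[top + 1] = error_goto[stack[top]];    /* shift "error" */
    *depth = top + 2;
    while (*input != '\0' &&
           !is_acceptable(stack[top + 1], (unsigned char)*input))
        input++;                            /* skip bad input */
    return input;
}
```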
##

error_frame_ssx

error_frame_ssx is a field in a ©parser control blockª
to which your ©error handlingª routines may refer. When
your ©SYNTAX_ERRORª macro is called, if you have set
both the ©diagnose errorsª and ©error frameª
configuration switches, error_frame_ssx will contain the
value of the ©parser stack indexª at the beginning of
the ©error_frame_tokenª. For example, if in a syntax
file, you fail to close a comment, AnaGram will
encounter an illegal end of file in the comment. In this
situation, error_frame_token is the token for a comment,
and error_frame_ssx gives the parser stack depth at the
beginning of the comment.
##

error_frame_token

error_frame_token is a field in a ©parser control blockª
to which your ©error handlingª routines may refer. If
you have set both the ©diagnose errorsª and ©error
frameª ©configuration switchªes, when your
©SYNTAX_ERRORª macro is called, it will contain the
©token numberª of the ©nonterminal tokenª the parser
was trying to recognize when the error was encountered.
##

error, Error Token

"Error token" is a ©configuration parameterª that takes
a token name for a value. It has no default value. If
you do not specify it, and your grammar has a terminal
token called "error", it will be used as the error
token. If you have an error token defined your parser
will presume that you wish to use the ©error token
resynchronizationª method of ©error recoveryª.
##

Escape Backslashes

"©Escape backslashesª" is a ©configuration switchª that
defaults to off. When it is turned on, the ©line numbersª switch
writes pathnames with doubled backslashes. The switch
is no longer necessary, since AnaGram now uses forward slashes
rather than backslashes in the pathnames in #line directives.
##

Event Driven

It is often convenient to configure your parser to be
"event driven". In this situation, instead of calling
your parser once to process the entire input, you call
an ©initializerª to initialize the parser, and then you
call the parser once for each input token. Each time you
call it, the parser processes the single input token
until it can do no more.

You can interrogate the ©exit_flagª field of the
©parser control blockª to determine whether the parse is
complete or whether the parser encountered an error.

Event driven parsers are especially convenient for
dealing with terminal input or communications protocols.
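
The calling pattern can be sketched as follows. The functions
init_number and number are a hand-written toy standing in for the
initializer and parser AnaGram would generate, and the pcb_type
structure is a hypothetical stand-in for the parser control block;
the toy accepts a run of digits ended by a newline.

```c
enum { AG_RUNNING_CODE = 0, AG_SUCCESS_CODE = 1, AG_SYNTAX_ERROR_CODE = 2 };

/* Hypothetical stand-in for the parser control block. */
typedef struct { int exit_flag; int input_code; long value; } pcb_type;

/* Initializer: prepare the control block for a new parse. */
void init_number(pcb_type *pcb) {
    pcb->exit_flag = AG_RUNNING_CODE;
    pcb->value = 0;
}

/* Toy parser: process the single token in input_code, then
   return to the caller. */
void number(pcb_type *pcb) {
    int c = pcb->input_code;
    if (c >= '0' && c <= '9')
        pcb->value = pcb->value * 10 + (c - '0');
    else if (c == '\n')
        pcb->exit_flag = AG_SUCCESS_CODE;
    else
        pcb->exit_flag = AG_SYNTAX_ERROR_CODE;
}

/* Caller: hand over tokens one at a time, as they arrive. */
int feed(pcb_type *pcb, const char *events) {
    init_number(pcb);
    while (*events != '\0' && pcb->exit_flag == AG_RUNNING_CODE) {
        pcb->input_code = (unsigned char)*events++;
        number(pcb);
    }
    return pcb->exit_flag;
}
```

In a real program the loop in feed would be replaced by whatever
delivers events, such as a keyboard handler or a protocol receive
routine. A result of AG_RUNNING_CODE means the parser is simply
waiting for more input.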
##

Event Driven Parser Cannot Use Pointer Input

This ©warningª message appears if you specify pointer
input for your ©parserª and also specify that it should
be event driven. If you are going to use ©pointer
inputª, you should not specify your ©parserª as event
driven.  Conversely, if you really want an ©event
drivenª parser, you cannot specify pointer input.
##

Excessive Recursion

This ©warningª message appears if an internal stack in
AnaGram overflows because of the complexity of an
expression in your ©grammarª. Simplify your grammar by
using ©definitionª statements to name subexpressions.
##

exit_flag

exit_flag is a field in the ©parser control blockª.
When your parser returns, PCB.exit_flag contains an exit
code describing the outcome of the parse.  Mnemonic
values for the exit codes are defined in the parser
header file AnaGram generates. These mnemonics, their
values and their meanings are:
	AG_RUNNING_CODE    				= 0:	Parse is not yet complete
	AG_SUCCESS_CODE        				= 1:  	Parse terminated successfully
	AG_SYNTAX_ERROR_CODE   		= 2:	Syntax error encountered
	AG_REDUCTION_ERROR_CODE 	= 3:	Bad reduction token encountered
	AG_STACK_ERROR_CODE   			= 4: 	Parser stack overflowed
	AG_SEMANTIC_ERROR_CODE 		= 5: 	Semantic error, user defined

 An AnaGram parser checks exit_flag on return
from every ©reduction procedureª. If the flag is non-zero, the
parser returns with the flag unchanged. To halt a parse
from a reduction procedure, then, you need only set the
exit_flag to AG_SEMANTIC_ERROR_CODE, or any other unused value
greater than zero that suits your needs.
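
As a minimal sketch, the fragment below halts a parse from a
reduction procedure and then interprets the exit code. The enum
repeats the values listed above so that the fragment stands alone;
in a real parser the mnemonics come from the generated header file.
The control block, divide and exit_text are hypothetical names used
only for this illustration.

```c
enum {
    AG_RUNNING_CODE = 0,
    AG_SUCCESS_CODE = 1,
    AG_SYNTAX_ERROR_CODE = 2,
    AG_REDUCTION_ERROR_CODE = 3,
    AG_STACK_ERROR_CODE = 4,
    AG_SEMANTIC_ERROR_CODE = 5
};

/* Hypothetical stand-in for the parser control block. */
typedef struct { int exit_flag; } pcb_type;
pcb_type demo_pcb;
#define PCB demo_pcb

/* Illustrative reduction procedure: refuse division by zero by
   setting exit_flag, which halts the parse on return. */
long divide(long a, long b) {
    if (b == 0) {
        PCB.exit_flag = AG_SEMANTIC_ERROR_CODE;
        return 0;
    }
    return a / b;
}

/* Translate an exit code into the meaning listed above. */
const char *exit_text(int flag) {
    switch (flag) {
    case AG_RUNNING_CODE:         return "Parse is not yet complete";
    case AG_SUCCESS_CODE:         return "Parse terminated successfully";
    case AG_SYNTAX_ERROR_CODE:    return "Syntax error encountered";
    case AG_REDUCTION_ERROR_CODE: return "Bad reduction token encountered";
    case AG_STACK_ERROR_CODE:     return "Parser stack overflowed";
    case AG_SEMANTIC_ERROR_CODE:  return "Semantic error, user defined";
    default:                      return "User defined exit code";
    }
}
```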
##

Expansion, Expansion Rule

In analyzing a ©grammarª, we are often interested in the
full range of input that can be expected at a certain
point. The expansion of a ©tokenª or state shows us
all the expected input. An expansion yields a set of
©marked ruleªs. The ©marked tokenª in each rule
shows us what input to expect.

The set of expansion rules of a (©nonterminalª) token
shows all the expected input that can occur whenever the
token appears in the grammar. The set consists of all
the ©grammar ruleªs produced by the token, plus all the
rules produced by the first token of any rule in the
set. A ©marked tokenª for an expansion rule of a token
is the first element in the rule.

The expansion of a state consists of its ©characteristic
ruleªs plus the expansion rules of the marked token in each
characteristic rule.
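
The definition of the expansion rules of a token is a closure, and
can be sketched in C as follows. The toy expression grammar, the
rule numbering and the function name expansion_rules are all
invented for this illustration; each rule is reduced to the token
that produces it and the first token on its right side, which is
all the definition needs.

```c
enum { EXPR, TERM, FACTOR, STMT, NUMBER, LPAREN };

typedef struct { int lhs; int first; } grammar_rule;

/* Toy grammar (illustrative only). */
const grammar_rule rules[] = {
    { EXPR,   TERM   },     /* 0: expr   -> term             */
    { EXPR,   EXPR   },     /* 1: expr   -> expr, '+', term   */
    { TERM,   FACTOR },     /* 2: term   -> factor            */
    { TERM,   TERM   },     /* 3: term   -> term, '*', factor */
    { FACTOR, NUMBER },     /* 4: factor -> number            */
    { FACTOR, LPAREN },     /* 5: factor -> '(', expr, ')'    */
    { STMT,   EXPR   }      /* 6: stmt   -> expr, ';'         */
};
#define N_RULES ((int)(sizeof rules / sizeof rules[0]))

/* Set in_set[i] to 1 for every expansion rule of token and
   return the number of rules in the set. */
int expansion_rules(int token, int in_set[]) {
    int i, j, count = 0, changed = 1;
    /* Start with the rules produced by the token itself. */
    for (i = 0; i < N_RULES; i++)
        in_set[i] = (rules[i].lhs == token);
    /* Add rules produced by the first token of any rule in the
       set, until nothing more can be added. */
    while (changed) {
        changed = 0;
        for (i = 0; i < N_RULES; i++) {
            if (!in_set[i]) continue;
            for (j = 0; j < N_RULES; j++) {
                if (!in_set[j] && rules[j].lhs == rules[i].first) {
                    in_set[j] = 1;
                    changed = 1;
                }
            }
        }
    }
    for (i = 0; i < N_RULES; i++)
        count += in_set[i];
    return count;
}
```

For this grammar the expansion of term consists of rules 2 through
5, while the expansion of stmt pulls in all seven rules.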
##

Expansion Chain

You may select an Expansion Chain window from the
©Auxiliary Windowsª popup menu of most windows that contain
©expansion ruleªs.

The Expansion Chain window is extremely useful for
indicating why a particular ©grammar ruleª is an
©expansion ruleª in a particular state. To see a chain
of productions that produces a desired expansion rule,
select the expansion rule with the cursor bar, press
the right mouse button for the Auxiliary Windows menu, and select
Expansion Chain.

The Expansion Chain window will then present a sequence
of expansion rules, using the same format as the
Expansion Rules window, but subject to the constraint
that each rule is produced by the ©marked tokenª in the previous line.

The first rule in the window is a ©characteristic ruleª
for the given state.  The last rule in the window is
the rule selected by the cursor bar in the window from
which you chose the Expansion Chain. It should be noted
that this expansion is not unique. There may be other
derivations.
##

Expansion Rules

You may select an Expansion Rules window from the
©Auxiliary Windowsª popup menu of most windows which display
©marked rulesª. The Expansion Rules window shows the
complete set of ©expansion ruleªs for the ©marked
tokenª in the highlighted rule.

In other windows, including all trace windows, the
Expansion Rules window shows the expansion of the token
on the highlighted line.
##

F1

Use the F1 key to bring up a context sensitive help window. Because of
various peculiarities of the Windows API, there are a few contexts
where the F1 key does not work; however, generally the ©help cursorª
works where F1 does not and vice versa.

©Helpª windows have hypertext links to related help windows.
In a help window, the right mouse button pops up a menu of
all the links for the window.
##

extend pcb

The "extend pcb" statement is an ©attribute statementª that allows you to
add declarations of your own to the ©parser control blockª. With this
feature, data needed by ©reduction procedureªs can be stored in the pcb
rather than in global or static storage. This capability greatly
facilitates the construction of ©thread safe parsersª.

The extend pcb statement may be used in any configuration section.
The format is as follows:
	extend pcb { <C or C++ declaration>... }

It may, of course, extend over multiple lines and may contain any
C or C++ declarations. AnaGram will append it to the end of the parser
control block declaration in the generated parser ©header fileª.  There may
be any number of extend pcb statements. The extensions are appended to
the pcb in the order in which they occur in the syntax file.

The extend pcb statement is compatible with both C and C++ parsers. Note
that even if you are deriving your own class from the parser control
block, you might want to use the extend pcb to provide virtual function
definitions or other declarations appropriate to a base class.
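
To illustrate why this helps with thread safety, the sketch below
shows a hypothetical picture of a control block after a statement
such as
	extend pcb { int line_count; const char *file_name; }
The structure name, the single "generated" field and the way the
procedure receives the control block are assumptions made for this
illustration; the real declaration is written by AnaGram.

```c
/* Hypothetical control block. The first member stands for the
   fields AnaGram itself declares; the others are the appended
   extension. */
typedef struct {
    int exit_flag;              /* generated field (stand-in) */
    int line_count;             /* appended by extend pcb     */
    const char *file_name;      /* appended by extend pcb     */
} calc_pcb_type;

/* Illustrative reduction procedure: all of its state lives in
   the control block it is given, none in global storage. */
void count_line(calc_pcb_type *pcb) {
    pcb->line_count++;
}
```

Because each control block carries its own line_count, two parses
running at the same time cannot disturb one another's counts.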
##

Far Tables

"Far tables" is a ©configuration switchª which defaults
to off. If it is set, when AnaGram builds a ©parserª it
will declare the larger tables it builds as FAR. This
can be a convenience when using some memory models with
8086 architecture.
##

Fatal Syntax Errors

This ©warningª message occurs when AnaGram cannot
complete the ©Analyze Grammarª command on your ©syntax
fileª because of errors in your syntax file.
##

File Trace

You can use the File Trace facility to verify your grammar,
even before you have implemented ©reduction proceduresª or
any other code. Thus you can defer writing procedural code
until you have the grammar working to your specifications.

To run File Trace, select
File Trace from the ©Action Menuª or click on the File Trace button.

Select a test file. When the ©File Trace Windowª appears,
double click at any point in the ©test file paneª, or
click the ©Parse Fileª button to parse the entire file.
AnaGram will parse up to the point you have selected
according to the rules in your ©grammarª. If the test file does not
conform to the rules of the grammar, the parse will halt with a
©syntax errorª. You can then inspect the ©Parser Stack paneª and the
©Rule Stack paneª to get an idea of the nature of the problem.


AnaGram uses different colors to
distinguish the portion of the test file that has
been parsed from the portion that has not been parsed,
so the location of the error should be readily apparent.

Since the syntax error often occurs somewhat downstream
from the actual error, you may need to back the parse up
and approach the error slowly. In the Test File pane,
double click at any point prior to the error to back
the parse up to that point. You can then click on the
©Single Stepª button to perform a single parser action.

You may also use the cursor keys to control the parse.
As long as no error is encountered, the parse is locked
to the blinking cursor. If you cursor past the syntax
error, however, the parse can no longer track the cursor
so the cursor location will differ from the parse location. The
cursor and parse locations will also differ after you single click
at any point other than the current parse location.

When the cursor and the parse location are thus out of synch, the
Single Step button is replaced with a ©Synch Parseª button. You
can click on Synch Parse to get the parse back in synch with the
cursor.

The File Trace option will be greyed out on the ©Action Menuª
if your grammar has ©empty recursionª, since
such a grammar may cause infinite loops in the parser.

Because a File Trace is based on character codes, it will also be greyed out
on the ©Action Menuª if your parser uses ©token inputª rather than
character input.

All parser actions performed by a File Trace update the ©trace
coverageª counts, enabling you to verify the extent to which
your test files exercise your parser.

Normally, AnaGram reads test files in "text" mode,
discarding carriage return characters. If your parser
needs to recognize carriage return characters
explicitly, you should turn the "©test file binaryª"
switch on.
##

File Trace Window

The ©File Traceª window normally consists of three panes:
	The ©Parser Stack paneª
	The ©Test File paneª
	The ©Rule Stack paneª

 If your grammar uses ©semantically determined productionsª,
the ©Reduction Choices paneª will appear when necessary
to allow you to select a ©reduction tokenª. The choice that
you make will be remembered and reused if you should back up
the parse and parse past this point again. The remembered choice
is not made automatically when you use ©Single Stepª. Thus,
if you wish to change your choice, position the cursor at the
location where the choice must be made and Single Step past
the choice.

If you ©reloadª the test file, the choices you have made will
be discarded.

The active pane has
a distinctively colored title panel and cursor bar. You can
use the tab key to tab among the panes. The function of
other keyboard keys depends on which pane is active.

Along the bottom of the File Trace Window is a toolbar with
two status boxes:
	©Parse Locationª
	©Parse Statusª
and five buttons:
	©Single Stepª
	©Parse Fileª
	©Resetª
	©Reloadª
	©Helpª

 If the blinking cursor loses synch with the current
parse location, the Single Step button is replaced with
the ©Synch Parseª button.
##

Grammar Trace Window

The ©Grammar Traceª window normally consists of three panes:
	The ©Parser Stack paneª
	The ©Allowable Input paneª
	The ©Rule Stack paneª

 If your grammar uses ©semantically determined productionsª,
the ©Reduction Choices paneª will appear when necessary
to allow you to select a ©reduction tokenª.

The active pane has
a distinctively colored column header and cursor bar. You
can use the tab key to tab among the panes. The function of other
keyboard keys depends on which pane is active.

Along the bottom of the Grammar Trace Window is a toolbar with
a ©Parse Statusª box, a ©text entryª field
and four buttons:
	©Proceedª
	©Single Stepª
	©Resetª
	©Helpª

 In the ©Parser Stack paneª you can see a
representation of the ©parser state stackª and ©parser stateª as they
might appear in the course of execution of your ©parserª. You can
examine the ©allowable inputª tokens and see the changes to the
state and the state stack caused by any input token you
choose. The ©Rule Stack paneª shows the relationship between the
contents of the parser stack and your ©grammarª. If your grammar
uses ©semantically determined productionsª, you can select the
appropriate ©reduction tokenª from the ©Reduction Choices paneª.

You can enter text characters directly in the ©text entryª
field. This means you can run a Grammar Trace like a ©File Traceª
where the test file is replaced by the characters you type in the
text entry field.  This is a very convenient way to check out your
grammar.
##

Test File, Test File Pane

In the ©File Traceª, the file under test is displayed in the
upper right pane. To parse to a specific point, double
click at that point.

As long as the parse location and the cursor are synchronized,
when you use the cursor keys to
move the cursor, the parse will track the cursor.

If the parse encounters a ©syntax errorª, it will not be able
to go beyond the location of the error. In this situation,
moving the cursor right or down will cause the cursor position to
differ from the parse location. The parse and cursor positions can also
differ if you single click anywhere in the Test File pane.

If the
parse location and the cursor are thus not synchronized, the
©Single Stepª button will be replaced with a ©Synch Parseª
button. Click on the Synch Parse button to get the cursor
and the parse back in synch. Of course, the parse will still
not be able to proceed past a syntax error.

In the default color scheme, parsed text is shown on a lighter
background than is unparsed text.

If your grammar uses ©semantically determined productionªs,
the parse will halt when one is encountered and the ©reduction
choices paneª will be displayed so you may select the appropriate
©reduction tokenª.

At any time you can click on the ©Reset buttonª to reset the parse to
the beginning of the test file. If you modify the test file, you
can click on the ©Reload buttonª to load the modified file and
reset the parse.

Normally, AnaGram reads test files in "text" mode, discarding carriage
return characters. If your parser needs to recognize carriage return
characters explicitly, you should turn the ©test file binaryª
©configuration switchª on.

Sample test files are provided with the FFCALC and FC ©examplesª.
##

Parse Location

The current location of the ©File Traceª parser in the
©test file paneª. The format is <line number>:<column number>.
##

Parse Status

The current state of the ©File Traceª or ©Grammar Traceª parser.

 Ready: The parser is ready for input.
 Running: The parser is processing input.
 Parse Complete: The parser has reached the end of the input. Click
on ©resetª or ©reloadª to restart the parse.
 Syntax error: A syntax error has been encountered. The parser cannot
go any further.
 Unexpected end of file: The parser has reached the end of the actual
input but the grammar still expects more.
 Select reduction token: The parser encountered a ©semantically determined
productionª. Select a ©reduction tokenª from the ©Reduction Choices paneª.
 Selection error: The reduction token selected from the Reduction Choices
pane was not allowable input in the present state. Select another
reduction token.
##

Parse File

Use the Parse File button in the ©File Traceª to parse all the way
to the end of file. The parse will not stop until it encounters a
©syntax errorª, a ©semantically determined productionª, or the end of file.
##

Reset

Use the Reset button in the ©File Traceª or ©Grammar Traceª to reset
the parse to its initial state. This is most convenient when using
a ©Conflict Traceª, ©Error Traceª, or other ©Auxiliary Traceª
since these traces seldom begin at state 0.
##

Reload

The Reload button in the ©File Trace Windowª rereads the test file.
This is convenient if you modify the test file while you are testing
the ©grammarª.
##

Lookahead Token

In an ©LALR-1 parserª the "lookahead token" is the next token to be
processed. For each ©parser stateª there is a list of tokens that
may be seen in this state. For each token there is a corresponding
©parser actionª. The parser scans the list looking for the lookahead
token and then performs the corresponding parser action. If the
lookahead token cannot be found and there is no ©default reductionª,
the parser signals a ©syntax errorª.

In File Trace, and in some circumstances in Grammar Trace, the
lookahead token can be seen on the last line of the
©Parser Stack paneª.
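
The table scan described above can be sketched as follows. The
data layout and all of the names are hypothetical, chosen for
clarity; they do not reflect the tables AnaGram actually builds.

```c
enum { ACT_SHIFT, ACT_REDUCE, ACT_ERROR };

typedef struct { int token; int action; } table_entry;

typedef struct {
    const table_entry *entries;   /* tokens that may be seen here */
    int n_entries;
    int has_default_reduction;
} parser_state;

/* Scan the list for the lookahead token. If it is missing, fall
   back on the default reduction, or else signal a syntax error. */
int select_action(const parser_state *state, int lookahead) {
    int i;
    for (i = 0; i < state->n_entries; i++)
        if (state->entries[i].token == lookahead)
            return state->entries[i].action;
    return state->has_default_reduction ? ACT_REDUCE : ACT_ERROR;
}

/* Toy state: '+' is shifted, ';' triggers a reduction. */
const table_entry demo_entries[] = {
    { '+', ACT_SHIFT }, { ';', ACT_REDUCE }
};
const parser_state strict_state  = { demo_entries, 2, 0 };
const parser_state lenient_state = { demo_entries, 2, 1 };
```

An unexpected token such as 'x' is a syntax error in strict_state,
but in lenient_state it is absorbed by the default reduction, so
the error, if any, is detected in a later state.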
##

GET_CONTEXT

If you have defined a "©context typeª" ©configuration
parameterª, and wish to maximize the performance of your
parser, you should write a GET_CONTEXT macro which
stores the context of the input token directly in
©CONTEXTª, the current stack location. Otherwise, you
can write your ©GET_INPUTª macro so that it stores
context into ©PCBª.©input_contextª. The default
definition for GET_CONTEXT will then copy
PCB.input_context to the ©context stackª at the
appropriate time.
##

GET_INPUT

GET_INPUT is a macro which you should define to control
©parser inputª if your
parser is not ©event drivenª and you are not using
©pointer inputª. If you don't define it, AnaGram will
define it by default to read a single character from
stdin:

	#define GET_INPUT (PCB.input_code = getchar())

 ©PCBª.©input_codeª is an integer field in the ©parser control blockª
which is used to hold the current character code. You
may also want GET_INPUT to set the values of ©input_contextª or
©input_valueª. It may call an input function, or it may execute
in-line code when it is invoked.
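
As a minimal sketch of a replacement definition, the fragment below
takes characters from a string in memory instead of stdin. The
control block is a hypothetical stand-in, input_text is a variable
invented for this illustration, and the use of zero as the code
delivered at the end of the text is an assumption that would have
to match the way your grammar defines its end of file token.

```c
/* Hypothetical stand-in for the parser control block. */
typedef struct { int input_code; } pcb_type;
pcb_type demo_pcb;
#define PCB demo_pcb

/* Text to be parsed; set this before starting the parser. */
const char *input_text = "";

/* Deliver the next character code, or 0 once the text is used up. */
#define GET_INPUT \
    (PCB.input_code = *input_text ? (unsigned char)*input_text++ : 0)
```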
##

iso latin 1

The "iso latin 1" ©configuration switchª controls case
conversion on input characters when the ©case sensitiveª
switch is set to off. When "iso latin 1" is set, the
default ©CONVERT_CASEª macro is defined to convert
correctly all characters in the latin 1 character set.
When the switch is off, only characters in the ASCII
range (0-127) are converted.
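
The difference between the two settings can be sketched as follows.
These functions are illustrations only, not the text of the default
CONVERT_CASE macro, and the choice of converting to upper case
rather than to lower case is an assumption.

```c
/* Convert lower case to upper case, ASCII range only. */
int to_upper_ascii(int c) {
    return (c >= 'a' && c <= 'z') ? c - 0x20 : c;
}

/* As above, plus the Latin 1 lower case letters 0xE0-0xFE.
   0xF7 is the division sign, not a letter. 0xDF (sharp s) and
   0xFF (y with diaeresis) have no upper case form in Latin 1
   and are left unchanged. */
int to_upper_latin1(int c) {
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7)
        return c - 0x20;
    return to_upper_ascii(c);
}
```

For example, code 0xE9 (e with acute accent) is left alone by
to_upper_ascii but becomes 0xC9 under to_upper_latin1.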
##

Dragon Book

The "dragon book" is the classic reference on formal parsing:
	Compilers: Principles, Techniques, and Tools
	Aho, Sethi, and Ullman
	Addison-Wesley, 1986.

 It is called the "dragon book" because of its
colorful cover illustration showing a knight in
armour ("data flow analysis") armed with sword
("©LALR parser generatorª") and shield ("syntax
directed translation") at his PC attacking a
bright red dragon ("complexity of compiler design").
##

LALR-1 Parser

An LALR-1 parser is a ©parserª created from a
©grammarª by an ©LALR parser generatorª.
##

LALR Parser Generator

LALR(k) (LookAhead Left-to-right Rightmost derivation)
parser generators are
programs that create parsers algorithmically from
formal grammars. The (k) refers to the number of
lookahead symbols used to make parsing decisions.
Normally, k = 1.

LALR parsers are a subset of the class of
so-called LR parsers. LALR parsers are generally more compact
and less costly to create. These advantages are
obtained at a slight sacrifice in generality. Although it
is possible to contrive an LR grammar which has
©conflictªs when analyzed with the LALR algorithm,
such situations rarely occur in practice, and can
be easily resolved by rewriting a few rules.

In the ©dragon bookª, section 4.7, the authors list the following
attractive properties of LR parsing:
 LR parsers can be constructed to recognize virtually
all programming-language constructs for which context-free
grammars can be written.
 The LR parsing method is the most general nonbacktracking
shift-reduce parsing method known, yet it can be implemented as
efficiently as other shift-reduce methods.
 The class of grammars that can be parsed using LR methods is
a superset of the class of grammars that can be parsed with
predictive parsers.
 An LR parser can detect a syntactic error as soon as it is
possible to do so on a left-to-right scan of the input.
##

Getting Started

AnaGram is an ©LALR parser generatorª. Its input is
a ©syntax fileª, which you prepare with an ordinary
programming editor. Its output is a ©parser fileª, which
you can compile with a C or C++ compiler on any platform
and link into your program. To compile on Unix platforms, set
the ©no crª ©configuration switchª.

AnaGram has extensive context-sensitive hypertext
©helpª. In any AnaGram window, press ©F1ª or select an item with the
©Help Cursorª. Further documentation in HTML format, including
documentation of examples, is found in the html subdirectory. AnaGram
also has a comprehensive hard-copy manual, the AnaGram User's Guide.

If you are new to AnaGram, you might begin by reviewing the Help
Topics ©How AnaGram Worksª and ©Program Developmentª, and looking at
An Annotated Example and Summary of AnaGram Notation in the HTML
documentation.

If you are not already familiar with formal parsing techniques, you
may want to read Introduction to Syntax Directed Parsing in the HTML
documentation. Note also the Fahrenheit to Celsius conversion
examples in the examples/fc directory, which comprise a graded
sequence of syntax files illustrating most of the basic
principles of ©syntax directed parsingª in easy steps. Documentation
is in html/fc.html.

AnaGram has many features, many of which are not
commonly found in parser generators:
 the ©configuration sectionª
 ©thread safe parsersª
 C++ support
 the ©disregardª and ©lexemeª statements
 ©event drivenª parsers
 ©character setsª
 ©virtual productionsª
 ©File Traceª, ©Grammar Traceª
 ©automatic resynchronizationª
 ©error token resynchronizationª

To familiarize yourself with the many options available for configuring
your parsers, select ©Configuration Parametersª from the ©Browse Menuª.
Use ©F1ª or the ©Help Cursorª to pop up explanations of the various
parameters.


If you don't find the information you need, please visit the
AnaGram web page at http://www.parsifalsoft.com for further
information and support.

##

How AnaGram Works

AnaGram contains an ©LALR Parser Generatorª which creates a
table driven ©LALR-1 parserª from a ©grammarª written in a variant
of Backus-Naur Form. AnaGram works in two steps. In the
first step, or analysis phase, it reads a ©syntax fileª and
compiles a number of tables describing the grammar. In the
second step, or build phase, it writes two output files:
a ©parser fileª written in C or C++ and a ©header fileª.

Syntax files normally have the extension .syn. The rules for
writing syntax files are given in the AnaGram User's Guide
and in the Summary of AnaGram Notation in the HTML documentation.

The header file contains definitions and declarations, including
the definition of a ©parser control blockª.

The parser file consists of:
 The ©C prologueª, if any.
 Definitions and declarations provided by AnaGram.
 ©Reduction procedureªs.
 a customized ©parsing engineª.
 a ©parse functionª to be called when input is to be parsed.

 The name of the parser file is controlled by the ©parser
file nameª ©configuration parameterª. By default, the parser
file has the same name as the syntax file, with the
extension .c. The name of the parse function is controlled
by the ©parser nameª parameter, which defaults to the name
of the syntax file.
##

Examples

The EXAMPLES directory of the AnaGram distribution disk
contains a number of examples to help you get started.
Documentation for the examples, in HTML format, is located
in the html directory (start at index.html or examples.html).

The traditional Hello, World, in examples/hw, is a good
example for getting familiar with the mechanical
procedures of building both C and C++ parsers from
©syntax fileªs.

The Fahrenheit/Celsius conversion examples in the
examples/fc directory on your AnaGram diskette comprise
a graded sequence of syntax files which illustrate
most of the basic principles of ©syntax directed
parsingª in easy steps. In addition, these examples
demonstrate many features of AnaGram which are not
found in other parser generators:
 the ©configuration sectionª
 ©character setsª
 ©virtual productionsª
 ©error token resynchronizationª
 ©File Traceª
 the ©disregardª and ©lexemeª statements
 ©event drivenª parsers

The Four Function Calculator (examples/ffcalc) is used
traditionally to demonstrate parser generators. If you
are already familiar with ©syntax directed parsingª this
example will give you a good overview of the basics of
AnaGram. An annotated version of this example may be
found in AnaGram's HTML documentation.
The FFCALC example illustrates the use of ©precedence
rulesª to resolve ©conflictsª.

Other examples are available to demonstrate additional
features of AnaGram.

RCALC (examples/rcalc) is a simple four function
calculator which accepts roman numeral input. It
illustrates the following AnaGram features:
 ©pointer inputª
 ©SYNTAX_ERRORª macro
 ©context stackª

DSL (examples/dsl) is a complete DOS script language,
which provides capabilities well in excess of DOS batch
files. DSL is a complete working program, used in the
past to create AnaGram's install program. Some of the
specific features of AnaGram which it illustrates are:
 ©distinguish lexemesª
 ©distinguish keywordsª
 ©far tablesª

MPP is a fully functional macro preprocessor for C or
C++. Included with MPP are two C grammars, either of
which may be incorporated into MPP. MPP uses several
parsers that work together:
 TS.SYN is the primary token scanner parser that
identifies tokens, and handles preprocessor
commands.
 MAS.SYN is used to do macro argument substitution.
 CT.SYN is used to identify tokens that result from
string concatenation during macro argument
substitution.
 EX.SYN is used to evaluate constant expressions in
#if preprocessor statements.

Among the more powerful features of AnaGram that MPP
illustrates are:
 ©semantically determined productionsª
 ©event drivenª parsers
##

Goal, Goal Token, Start Token

The ©grammar tokenª is the token which represents the
"top level" in your grammar. Some people refer to it as
the "goal" or "goal token" and others as the "start
token". Whichever it is called, it is the single token
which describes the complete input to your parser.

The most common way to specify a grammar token is as
follows:
	grammar -> statements?..., eof
This production tells AnaGram that the input to your
parser consists of a (possibly empty) sequence of
statements followed by an end of file token.

There are a number of ways of specifying which token in
your ©syntax fileª represents the top level of your
grammar. You may simply name it "grammar", or you may
tag it with a '$' character when you define it, or you
may set the ©grammar tokenª ©configuration parameterª.

If you should inadvertently tag several tokens with the
'$' character and/or set the grammar token parameter,
it is the last such specification in the file which
wins. Some people develop their grammars bottom up,
gradually adding new levels of complexity. In the
course of development, they may specify a number of
tokens as grammar tokens and forget to remove the old
specifications.

Notice that if you define the token "grammar" anywhere
in your syntax and specify the grammar token otherwise,
"grammar" will not be the grammar token. This is to
keep "grammar" from being a reserved word. If you need
to use it in your syntax for something other than the
whole grammar, you are free to do so.
##

Grammar

Traditionally, a "grammar" is a set of ©productionªs
which taken together specify precisely a set of
acceptable input streams, in terms of an abstract set
of ©terminal tokensª. The set of acceptable input
streams is often called the "language" defined by the
grammar.

In AnaGram, the term "grammar" also includes
©configuration sectionsª as well as the ©definitionsª
of ©character setsª and ©virtual productionsª which
augment the collection of productions. The term is
often used in contrast to the term "©syntax fileª"
which is used to signify the complete AnaGram source
file including reduction procedures and embedded C or
the term "©parserª" which refers to AnaGram's output
file.

A grammar is often called a "syntax", and the rules of
the grammar are often called syntactic rules.
##

Grammar Analysis

The major function of AnaGram is the analysis of
context-free grammars written in a particular variant
of Backus-Naur Form.

The analysis of a grammar proceeds in four stages. In
the first, the input grammar is analyzed and a number
of tables are built which describe all of the
©productionªs and components of the ©grammarª.

In the second stage, AnaGram analyzes all of the
character sets defined in the grammar, and where
necessary, defines auxiliary tokens and productions.

In the third stage, AnaGram identifies all of the
states of the parser and builds the go-to table for the
parser.

In the fourth stage, AnaGram identifies ©reduction
tokensª for each completed ©grammar ruleª in each state
and checks for ©conflictªs.

Use the ©Analyze Grammarª command to cause AnaGram to
analyze your grammar.
##

Grammar Is Ambiguous

This ©warningª message appears if your ©grammarª
contains ©conflictªs. AnaGram will resolve ©shift-reduce
conflictsª by selecting the shift option. It will
resolve ©reduce-reduce conflictsª by selecting from the
conflicting ©grammar ruleªs the one which appears first
in the ©syntax fileª.
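
The default resolution described above can be summarized in
a few lines of C. This is only an illustrative sketch, not
AnaGram's own code. The type and function names are
hypothetical, and rule_position simply stands for the order
in which the rules appear in the syntax file.

```c
#include <assert.h>

enum action_kind { ACTION_SHIFT, ACTION_REDUCE };

struct action {
    enum action_kind kind;
    int rule_position;   /* order of the rule in the file; unused for shifts */
};

/* Pick between two conflicting actions using the defaults above:
   a shift beats a reduce, and between two reduces the rule that
   appears first in the file wins. */
static struct action resolve_conflict(struct action a, struct action b)
{
    if (a.kind == ACTION_SHIFT) return a;
    if (b.kind == ACTION_SHIFT) return b;
    assert(a.rule_position != b.rule_position);
    return (a.rule_position < b.rule_position) ? a : b;
}
```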
##

Grammar Rule

A "grammar rule" is the right hand side of a production.
It is a sequence of ©rule elementsª. Each rule element
identifies some token, which can be either a ©terminal
tokenª or ©nonterminal tokenª.

A grammar rule is "matched" by a
corresponding sequence of tokens in the input stream to
the parser. The rule elements in the grammar rule may be
©token nameªs, ©set expressionsª, ©character constantsª,
©immediate actionªs, ©keyword stringsª, or ©virtual
productionsª.

A grammar rule may be followed by an
optional ©reduction procedureª. The ©semantic valuesª of
the tokens that comprise the rule may be passed to the
reduction procedure by using ©parameter assignmentsª.

A grammar rule always makes up the right hand side of a
production. The left hand side of the production
identifies one or more ©nonterminal tokensª, or
©reduction tokensª, to which the rule reduces when
matched. If there is more than one reduction token,
the production is called a ©semantically determined productionª and
there should be a ©reduction procedureª to select
the correct reduction token at run time.
##

Grammar Token

The "grammar token" ©configuration parameterª may be
used to specify the ©goalª, or "start" token for the
syntax analyzer portion of AnaGram. Alternatively, you
could simply call the token "grammar", or you could
append a '$' character to it when you define it.

Each grammar must have a grammar token specified before
it can be analyzed or before a parser can be built. The
grammar token is the single token to which the grammar
finally condenses. When this token is identified by the
parser, the parse is complete.
##

Grammar Trace

AnaGram's Grammar Trace facility lets you examine the workings of your
©parserª in detail. You can use the Grammar Trace as soon as you have
analyzed your ©grammarª, even before you have written any ©reduction
procedureªs or other code. Thus you can defer writing procedural code
until you have the grammar working to your specifications.

Select the ©Grammar Trace Windowª
from the ©Action Menuª or click on the Grammar Trace
button.

In the ©Parser Stack paneª you can see a representation of the
©parser state stackª and ©parser stateª as they might appear in the
course of execution of your ©parserª. The ©Rule Stack paneª shows the
relationship between the contents of the parser stack and your
©grammarª. If your grammar uses ©semantically determined
productionsª, you can select the appropriate ©reduction tokenª from
the ©Reduction Choices paneª.

At any stage, the ©Parser Stackª represents a parse
in progress. It shows the sequence of ©tokenªs that have
been input so far and the states in which they were
seen. When a production is complete and the grammar rule
is reduced, the tokens that make up the rule are removed
from the stack and replaced by the token on the left
side of the production. Initially, the Parser Stack contains
only a ©lookahead lineª.
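
The replacement step in a reduction can be pictured with a
toy stack in C. This is a sketch under simplifying
assumptions: it tracks token numbers only, whereas the real
Parser Stack also records states, and all of the names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_MAX 32

struct toy_stack {
    int tokens[TOY_STACK_MAX];   /* token numbers, oldest first */
    size_t depth;
};

/* Shift: push one input token onto the stack. */
static void toy_shift(struct toy_stack *s, int token)
{
    assert(s != NULL && s->depth < TOY_STACK_MAX);
    s->tokens[s->depth++] = token;
}

/* Reduce: drop the rule_length tokens that matched the rule and
   push the token from the left side of the production. */
static void toy_reduce(struct toy_stack *s, size_t rule_length, int left_side)
{
    assert(s != NULL && rule_length <= s->depth);
    s->depth -= rule_length;
    toy_shift(s, left_side);
}
```

For a rule such as "expression, plus, number" that reduces
to expression, three shifts followed by one reduce leave a
single expression token on the stack.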

To explore your grammar, choose ©tokenªs one by one from
the ©Allowable Inputª
pane. This pane shows the tokens allowable at the current state of the
grammar, and the actions that result when the tokens are chosen.

You can also enter text characters directly in the ©text entryª
field. This means you can run a Grammar Trace like a ©File Traceª
where the test file is replaced by the characters you type in the
text entry field.  This is a very convenient way to check out your
grammar. Text entry is, of course, not appropriate for grammars that
expect ©token inputª.

In a ©File Traceª you can advance the parse no matter which pane is
active. In a Grammar Trace there is a question as to whether input is
intended to come from the Allowable Input pane or the text entry
field.  Therefore the parse can only be advanced when one of these
two is active to indicate that it is the source of input.

Specialized prebuilt Grammar Traces such as the ©Conflict Traceª and
the ©Auxiliary Traceª can be selected from ©Auxiliary Windowsª popup
menus where appropriate.

All Grammar Trace activity updates the ©trace coverageª counts.
##

Text Entry

It is sometimes more convenient to enter text in the
text entry box on the ©Grammar Traceª toolbar than to
select individual tokens from the ©Allowable Input paneª.

By entering text you can proceed quickly to a troublesome
state without having to choose each individual token
en route.

After entering text, press Enter or click on the Proceed
button to parse the text. Click on the single step button
to work slowly through the text step by step.
##

header file name

The "header file name" parameter names the ©parser
headerª file that AnaGram will generate when it builds
your parser. This header file can be used with your
parser or with other modules in your program. The
header file contains a number of typedef statements and
a number of macro definitions which are needed in your
parser and may be useful in other modules.

If the value of this parameter contains a '#' character,
AnaGram will substitute the name of your syntax file for
the '#'. The default value of "header file name" is
"#.h".
##

Help, Using Help

There are 3 main ways to access AnaGram Online Help:
 Press F1 for context-sensitive help from most windows and menu items.
 Similarly, use the ©Help Cursorª from most windows and menu items.
 From the Help menu, you can bring up ©Help Topicsª and choose a topic.

You can also get fly-over help for the toolbar buttons on the ©Control
Panelª. File and Grammar Traces have a Help button.

AnaGram's Help windows, unlike most others, remain on-screen until you
dismiss them. This means you can refer to several topics at once. They
have hypertext links to other Help topics. Also, right-clicking
the mouse on a Help window or pressing F1 will pop up an Auxiliary
Windows menu of all linked topics in the window. "Using Help" is always
available from this popup menu.

Note that, for the ©Warningsª, ©Configuration Parameterªs and ©Help
Topicsª windows, F1 will give you help for the item
on the highlighted line, whereas the Help Cursor allows you
to select any line by clicking on it.

AnaGram also has documentation in HTML format, indexed in the index.html
file. This documentation covers Getting Started, examples, and some
further topics mainly condensed from the User's Guide. Hard copy
documentation is in the AnaGram User's Guide, which has the most
detail.
##

Hidden

In a ©configuration sectionª of your grammar you may use
an ©attribute statementª to declare one or more tokens
to be "hidden". Tokens that are "hidden" do not appear
in the ©token namesª table, and thus do not appear in syntax error
diagnoses. When your parser attempts to determine the
©error frameª of a ©syntax errorª, it will disregard the
tokens that have been declared hidden. The hidden
declaration consists simply of the keyword hidden
followed by a list of tokens, separated by commas and
enclosed in braces ({ }):
	[ hidden { widget, wombat, foo, bar } ]

 You would use the "hidden" attribute primarily for
tokens whose name would not mean anything to your users.
##

Immediate Action

Immediate actions are snippets of C code which are to
be executed in the middle of a ©grammar ruleª. Immediate
actions are denoted by a '!' character followed by
either a C expression, terminated by a semicolon; or a
block of C code enclosed in braces. For example, in a
simple desk calculator example one might write the
following:
	transaction
	    -> !printf("#");, expression:x  =printf("%d\n",x);

 Notice that the only apparent difference between an
immediate action and a ©reduction procedureª is that the
immediate action is preceded by '!' instead of '='.
Notice that the immediate action must be followed by a
comma to separate it from the following ©rule elementª.

Immediate actions may also be used in ©definitionªs:
	prompt = !printf("#");

The above example, using this definition would then be:
	transaction
	  -> prompt, expression:x  =printf("%d\n",x);

 You could accomplish the same result by writing a ©null
productionª and a reduction procedure:
  prompt
   ->   =printf("#");

This is exactly how AnaGram implements immediate
actions.
##

Implementation Errors

"Implementation errors" are errors your parser detects
which are not the immediate result of bad input.  When
it encounters an implementation error, your parser will
call a macro which you can define to deal with the
problem in a manner suitable to your needs. If you don't
provide these macros, AnaGram will make default
definitions. There are two macros corresponding to two
implementation errors:
	©PARSER_STACK_OVERFLOWª
	©REDUCTION_TOKEN_ERRORª
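
As a rough illustration of supplying your own definition,
the sketch below defines PARSER_STACK_OVERFLOW and invokes
it from a toy bounded stack. It assumes the macro takes no
arguments and expands to a statement. The stack, the
counter and the function toy_push are hypothetical
stand-ins, not part of any generated parser.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical application state that records the problem. */
static int overflow_count = 0;

/* User-supplied definition: count the error and report it. */
#define PARSER_STACK_OVERFLOW \
    { overflow_count++; fprintf(stderr, "parser stack overflow\n"); }

/* Toy bounded stack standing in for the parser stack. */
#define TOY_STACK_SIZE 4
static int toy_stack[TOY_STACK_SIZE];
static int toy_depth = 0;

/* Push a state. When the stack is full, invoke the macro instead
   of writing past the end. Returns 1 on success, 0 on overflow. */
static int toy_push(int state)
{
    assert(toy_depth >= 0);
    if (toy_depth >= TOY_STACK_SIZE) {
        PARSER_STACK_OVERFLOW
        return 0;
    }
    toy_stack[toy_depth++] = state;
    return 1;
}
```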
##

Inappropriate Value

This ©warningª message appears when the value assigned to
a ©configuration parameterª is not appropriate to that
parameter. Check the definition of the parameter, by
opening the ©Configuration Parameters Windowª,
selecting the parameter and pressing F1.
##

Initializer

For every ©parserª it generates, AnaGram generates an
"initializer" function to initialize the parser. AnaGram
names the initializer by prefixing the ©parser nameª
with "init_". If your parser is ©event drivenª, you must
call the initializer before you call the parser.

If your parser is not event driven, AnaGram will
normally include a call to the initializer in the
parser. If you wish to be able to call your parser more
than once without its being re-initialized, you may turn
off the ©auto initª ©configuration switchª. When you do
this, you assume responsibility for calling the
initializer. If your parser is event driven, you must
always call the initializer function.

If the ©reentrant parserª switch is set, the initializer takes
a pointer to the ©parser control blockª as its sole argument. Otherwise
it takes no arguments. The initializer returns no value. All
communication is by means of the ©parser control blockª.
##

Input Character

The actual unit of ©parser inputª is usually a
single character. Note that you are not limited to
eight-bit characters. Your parser will use the input
character to index a translation table, ©ag_tcvª, to
determine the ©token numberª for that character. The
©token numberª identifies the actual syntactic token.
The character code itself will be the ©semantic valueª
of the token. Note that AnaGram groups together all
input characters that are syntactically
indistinguishable into a single input token.
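
The table lookup can be sketched in C as follows. The table
toy_tcv and the token numbers are hypothetical stand-ins.
The real ag_tcv table is generated by AnaGram from your
grammar. The sketch assumes a grammar that distinguishes
only digits, ASCII letters and everything else.

```c
#include <assert.h>

/* Hypothetical token numbers for three groups of characters. */
enum { TOK_OTHER = 0, TOK_DIGIT = 1, TOK_LETTER = 2 };

static unsigned char toy_tcv[256];

/* Fill the table so that syntactically indistinguishable
   characters share one token number. */
static void build_toy_tcv(void)
{
    int c;
    for (c = 0; c < 256; c++) {
        if (c >= '0' && c <= '9')
            toy_tcv[c] = TOK_DIGIT;
        else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
            toy_tcv[c] = TOK_LETTER;
        else
            toy_tcv[c] = TOK_OTHER;
    }
}

struct toy_token { int number; int value; };

/* The token number comes from the table; the character code
   itself becomes the semantic value. */
static struct toy_token classify(int input_character)
{
    struct toy_token t;
    assert(input_character >= 0 && input_character < 256);
    t.number = toy_tcv[input_character];
    t.value = input_character;
    return t;
}
```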
##

input_code

input_code is a field in the ©parser control blockª
which contains the current ©input characterª, or, if your
©GET_INPUTª macro supplies ©token numberªs directly, the
token number.

If you write your own ©GET_INPUTª macro, you must make
sure that you store the input character, or token
number, you get into ©PCBª.input_code.
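
A minimal sketch of such a macro, reading from a string in
memory, might look like this. The structure toy_pcb is a
stand-in for the generated parser control block, the
variable input_cursor is hypothetical, and returning 0 at
the end of the string is an assumption about how the
grammar defines end of input.

```c
#include <assert.h>

/* Minimal stand-in for the parser control block. The real
   structure is generated by AnaGram and has more fields. */
struct toy_pcb { int input_code; };
static struct toy_pcb toy_pcb_instance;
#define PCB toy_pcb_instance

/* Hypothetical input source: a NUL-terminated string. */
static const char *input_cursor = "";

/* Store the next character, or 0 at end of input, into
   PCB.input_code. */
#define GET_INPUT \
    (PCB.input_code = (*input_cursor != '\0') \
        ? (unsigned char)*input_cursor++ : 0)
```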
##

INPUT_CODE(t)

If you set both the ©pointer inputª and the ©input
valuesª ©configuration parameterªs, you must provide an
INPUT_CODE macro for your parser. In this situation,
your parser will use the pointer to load the
©input_valueª field of the ©parser control blockª and
use the INPUT_CODE macro to extract the appropriate
value for the ©input_codeª field. For example, if the
input_value is a structure and the appropriate member
field is called "id" you would write:

	#define INPUT_CODE(t) (t).id
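
To see the macro in context, the following sketch mimics
the two steps described above. The structures toy_token and
toy_pcb and the function load_token are hypothetical
stand-ins for what a generated parser does internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token record produced by a separate scanner. */
struct toy_token { int id; double value; };

/* Stand-in for the generated parser control block. */
struct toy_pcb { struct toy_token input_value; int input_code; };

#define INPUT_CODE(t) (t).id

/* Load input_value through the pointer, advance the pointer,
   then derive input_code from the stored value. */
static void load_token(struct toy_pcb *pcb, const struct toy_token **pointer)
{
    assert(pcb != NULL && pointer != NULL && *pointer != NULL);
    pcb->input_value = *(*pointer)++;
    pcb->input_code = INPUT_CODE(pcb->input_value);
}
```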
##

input_context

"input_context" is a field which AnaGram adds to the
definition of the ©parser control blockª structure when
you define a ©context typeª ©configuration parameterª.
If you choose, you can write your GET_INPUT macro so
that it stores the context value in ©PCBª.input_context.
The default definition for ©GET_CONTEXTª will then stack
the context value at the appropriate time. You can think
of PCB.input_context as a sort of temporary "parking
place" for the context value.
##

Input Scan Aborted

This ©warningª message appears if AnaGram is unable to
finish scanning your ©syntax fileª because of previous
errors.
##

input values

"Input values" is a ©configuration switchª which
defaults to off. If your ©parser inputª includes
explicit ©token valueªs which are not simply the ASCII
values of corresponding ASCII input characters, you must
set the "input values" switch to inform AnaGram. Unless
your parser is ©event drivenª or uses ©pointer inputª,
you must also provide your own ©GET_INPUTª macro.

If your parser uses pointer input, you must provide an
©INPUT_CODE(t)ª macro.

The semantic value of an input token is to be stored in the
©input_valueª field of the parser control block.
##

input_value

input_value is a field in the ©parser control blockª
which is used to store the semantic value of the input
token.

If you write your own ©GET_INPUTª macro, and you have
set the ©input valuesª ©configuration switchª, you
should make sure that you store the value of the ©input
characterª or token into ©PCBª.input_value.
##

Internal Error

"AnaGram internal error: ..." is a ©warningª message which
appears if one of AnaGram's internal consistency tests
fails. This message should never appear if AnaGram is
working properly. Usually AnaGram will abort on
encountering an internal error, although under
a small set of circumstances it may continue. Should
this happen, it would be wise to close AnaGram and
restart it.

If you do get an internal error, please note the complete
message identifying the problem and file a bug report,
following the directions posted on the AnaGram web page
at http://www.parsifalsoft.com.
A copy of the relevant
syntax file and a summary of the circumstances surrounding
the problem would be greatly appreciated.
##

Intersection

In set theory, the intersection of two sets, A and B, is
defined to be the set of all elements of A which are
also elements of B. In an AnaGram ©syntax fileª, the
intersection of two ©character setsª is represented with
the '&' operator. The intersection operator has lower
©precedenceª than the ©complementª operator, but higher
precedence than the ©unionª and ©differenceª operators.
The intersection operator is ©left associativeª.
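
The set operations can be modeled in C with one membership
bit per character code. This is an illustrative sketch with
hypothetical names, not a description of how AnaGram stores
character sets.

```c
#include <assert.h>
#include <string.h>

typedef struct { unsigned char bits[32]; } charset;  /* 256 membership bits */

/* The set of all character codes from first to last inclusive. */
static charset set_range(int first, int last)
{
    charset s;
    int c;
    assert(0 <= first && first <= last && last <= 255);
    memset(&s, 0, sizeof s);
    for (c = first; c <= last; c++)
        s.bits[c / 8] |= (unsigned char)(1u << (c % 8));
    return s;
}

/* Elements of a which are also elements of b. */
static charset set_intersection(charset a, charset b)
{
    int i;
    for (i = 0; i < 32; i++) a.bits[i] &= b.bits[i];
    return a;
}

/* Elements of a or b or both. */
static charset set_union(charset a, charset b)
{
    int i;
    for (i = 0; i < 32; i++) a.bits[i] |= b.bits[i];
    return a;
}

static int set_contains(charset s, int c)
{
    assert(c >= 0 && c <= 255);
    return (s.bits[c / 8] >> (c % 8)) & 1;
}
```

Because intersection binds more tightly than union, a set
expression such as 'a-z' & 'a-f' + '0-9' groups as
('a-z' & 'a-f') + '0-9', which corresponds here to
set_union(set_intersection(x, y), z).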
##

Keyboard Support

AnaGram can be controlled entirely from the keyboard. In the Control
Panel, you can tab to any button and press Enter to select it. In
addition to the conventional Windows keyboard functions, the following
keys have been implemented:
	Escape closes any AnaGram window except the Control Panel.
	F8 toggles between an active AnaGram window and the Control Panel.
	F10 accesses the Control Panel menu from any AnaGram window.
	Shift F10 pops up the Auxiliary Windows menu.
##

Keyword, Keyword String

Keywords are a very important feature of AnaGram. They
provide an easy way to pick up special character
sequences in your input, thereby eliminating the need
for a lot of tedious ©productionªs.

If AnaGram finds, on the right hand side of one of your
©grammarª productions, a string enclosed in double
quotes, such as "IF", it automatically creates from the
string a "keyword" which is incorporated into your
parser. You may have any number of keywords. A keyword
is treated as a single terminal token. Recognition of
keywords is governed by the ©case sensitiveª switch.

Your parser will look for a keyword in its input stream
wherever you have defined this particular keyword to be
legitimate input. It will do whatever lookahead is
necessary in order to pick up the entire keyword. If
several keywords match the input, such as IF and IFF,
it will select the longest match, IFF in this example.
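
The longest-match rule can be sketched in C as follows. The
function longest_keyword is hypothetical. Its keyword list
stands for the keywords that are legitimate in the current
state, and the comparison is case sensitive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the longest keyword that is a prefix of
   input, or -1 if no keyword matches. */
static int longest_keyword(const char *input,
                           const char *const keywords[], size_t count)
{
    int best = -1;
    size_t best_len = 0;
    size_t i;

    assert(input != NULL && keywords != NULL);
    for (i = 0; i < count; i++) {
        size_t len = strlen(keywords[i]);
        if (len > best_len && strncmp(input, keywords[i], len) == 0) {
            best = (int)i;
            best_len = len;
        }
    }
    return best;
}
```

Given the keywords IF and IFF, the input "IFF x" selects
IFF, while the input "IF x" selects IF.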

Important points to notice about keywords:
 1) Keywords take precedence over ordinary
characters in the input stream - thus if the character
I and the keyword IF are both legitimate input at some
point, IF will be selected, if present, in preference
to I.
 2) Keywords are not reserved words. Your parser
will only look for a keyword when it is in a state
where that keyword is legitimate input.
 3) Keywords do not participate in character sets
and should not appear in definitions of character sets.
In particular, they are not considered as belonging to
the complement of a character set. Thus
a keyword would not be considered legitimate input
for the production
		next char -> ~( '/' + '*' )

 4) Keywords may appear in virtual productions.

 5) Keywords may be named by means of a definition.

AnaGram will list all the keywords in your grammar in
the ©Keywordsª window. In addition, in numerous
windows where the cursor bar selects a state, the
©Auxiliary Windowsª popup menu will list a Keywords option.
This window will provide a list of the keywords
acceptable in the selected ©parser stateª.

On occasion, a kind of conflict, called a ©keyword
anomalyª may occur. If so, such conflicts will be listed
in the ©Keyword Anomaliesª window. The "©stickyª"
©attribute statementª is useful in dealing with keyword
anomalies.
##

Keyword Anomalies Found

This ©warningª message indicates that AnaGram has found
at least one ©keyword anomalyª in your ©grammarª. Open
the ©Keyword Anomaliesª window to see a list of those
that have been found.
##

Keyword Anomaly

In ©syntax directed parsingª, it is assumed that input
©tokenªs can be uniquely identified. In the case of
©keywordªs, however, there is the possibility that the
individual characters making up the keyword, as well as
the keyword taken as a whole, could constitute
legitimate input under some circumstances. Thus
©keywordsª, though a powerful and useful tool, are not
completely consistent with the assumptions that underlie
©syntax directed parsingª. This can occasionally give
rise to a type of conflict, diagnosed by AnaGram,
called a "keyword anomaly". AnaGram is quite
conservative in its diagnoses, so that many keyword
anomalies it reports are actually innocuous and can be
safely ignored.

Basically, a keyword anomaly is a situation where a
keyword is recognized, causes a reduction, and the
parser arrives in a state where the keyword is not
legal input. If the keyword, seen simply as a sequence
of characters, might have been legal input in the
original state, AnaGram notes the existence of a
keyword anomaly.

If you have a keyword that causes a keyword anomaly and
it is actually a reserved word in your grammar, the
anomaly is by definition innocuous. You should use the
©reserve keywordsª statement to inform AnaGram that the
keyword is reserved and the anomaly need not be
diagnosed.

To help identify and correct any problems associated
with keyword anomalies, AnaGram provides the ©Keyword
Anomaliesª window to identify the anomalies, and the
©Keyword Anomaly Traceª to help you understand a
particular anomaly.
##

Keyword Anomaly Trace

A Keyword Anomaly Trace is a ready made ©grammar traceª
window which you may select from the ©Auxiliary Windowsª
menu of the ©Keyword Anomaliesª window. The anomaly
trace provides a path to a state which illustrates the
©keyword anomalyª. In this state, the keyword is a
reducing token, but after the reduction, it is not
allowable input.
##

Keyword Anomalies

The Keyword Anomalies window is available only if your
grammar has ©keywordª anomalies.

Each entry in the Keyword Anomalies window consists of
two lines. The first line identifies the ©parser stateª
at which the ©keyword anomalyª occurs and the offending
keyword. The second line identifies the ©grammar ruleª
which the keyword may erroneously reduce.

The ©Auxiliary Windowsª menu provides three auxiliary
windows keyed directly to the anomaly to help you
determine the nature of the problem: The ©Keyword
Anomaly Traceª window, the ©Reduction Traceª window, and
the ©Rule Derivationª window. Three other windows provide
supporting information: the ©Reduction Statesª window,
the ©Rule Contextª window and the ©State Definitionª
window.
##

Keywords

The Keywords entry in the ©Browse Menuª pops up a
window which lists all of the keywords defined in your
©grammarª. The ©token numberª is also specified.

A Keywords window is also an option in the ©Auxiliary
Windowsª popup menu for any window which distinguishes
various states of your parser. The Keywords window will
show all of the ©keywordªs which will be recognized in
the state selected by the cursor bar in the parent
window.

The ©Auxiliary Windowsª menu for a Keywords window
provides a ©Token Usageª option which will allow you to
see all the uses of a particular keyword in your grammar.
##

left

"left" controls a ©precedence declarationª, indicating
that all of the listed ©rule elementsª are to be
considered ©left associativeª.
##

Left Associative

A binary operator is said to be left associative if
an expression with repeated instances of the operator
is to be evaluated from the left. Thus, for example,
  x = a/b/c

is normally taken to mean x = (a/b)/c. The division
operator is said to be left associative.

In ©grammarªs with ©conflictªs, you may use ©precedence
declarationªs to specify that an operator should be left
associative.
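
C itself treats division as left associative, so the
difference between the two groupings is easy to check. The
function names below are only for illustration.

```c
#include <assert.h>

/* The expression exactly as written in the text. */
static int as_written(int a, int b, int c)
{
    assert(b != 0 && c != 0);
    return a / b / c;
}

/* Explicit left grouping: (a/b)/c. */
static int left_grouping(int a, int b, int c)
{
    assert(b != 0 && c != 0);
    return (a / b) / c;
}

/* Explicit right grouping: a/(b/c). */
static int right_grouping(int a, int b, int c)
{
    assert(c != 0 && b / c != 0);
    return a / (b / c);
}
```

With a = 100, b = 10 and c = 5, the expression as written
and the left grouping both give 2, while the right grouping
gives 50.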
##

Lexeme

The "lexeme" ©attribute statementª is used to fine-tune
the "©disregardª" statement. The lexeme statement takes
the form:
    lexeme { T1, T2, ..., Tn }

where T1, ..., Tn is a list of ©nonterminalª tokens
separated by commas. Lexeme statements may be placed in
any ©configuration sectionª, and there may be any number
of them.

When you specify that a ©tokenª is to be disregarded,
AnaGram rewrites your ©grammarª so that the token will be
passed over whenever it occurs at the beginning of a
file or following a lexical unit, or "lexeme". If you
have no lexeme statement, then the lexemes in your
grammar are just the terminal tokens.

The lexeme statement allows you to specify that certain
nonterminal tokens are also to be treated as lexemes.
This means that the disregard token will be skipped
following the lexeme, but not between the characters
that constitute the lexeme.

Lexemes correspond to the tokens that a lexical scanner,
if you were using one, would commonly identify and pass
to a parser as single tokens. You don't usually wish to
disregard ©white spaceª within these tokens. For
example, in a grammar for a conventional programming
language where blank characters are to be disregarded,
you might include:
  [
    lexeme {string, character constant, name, number}
  ]

since blank characters must not be overlooked within
strings and constants, and should not be permitted
within names or numbers.

If your grammar allows for situations where successive
lexemes could run together if they were not separated
by space, a name followed by a number, for example, you
may use the "©distinguish lexemesª" ©configuration
switchª to force a separation between the tokens.

White space may be used explicitly within definitions of
lexeme tokens in your grammar if desired, without
causing conflicts. Thus, if you wish to allow embedded
space in variable names, you might write:
  [
    disregard space
    lexeme {variable name}
  ]
  space = ' ' + '\t'
  letter = 'a-z' + 'A-Z'
  digit = '0-9'

  variable name
   -> letter
   -> variable name, letter + digit
   -> variable name, space..., letter + digit
##

line

line is a field in your ©parser control blockª used for
keeping track of the line number of the current
character in your input. Line and column numbers are
tracked only if the ©lines and columnsª ©configuration
switchª has been set.
##

line length

Line length is an ©obsolete configuration parameterª.
##

Line Numbers

"Line numbers" is a ©configuration switchª which
defaults to off. If it is on, the ©Build Parserª
command will put "#line" directives into the generated
C code file so that your compiler diagnostics will
refer to lines in the ©syntax fileª rather than in the
generated C code file. For more information on the
"#line" directive, see Kernighan and Ritchie, second
edition, section A12.6.
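
The effect of a "#line" directive can be seen in a small
C fragment. The file name calc.syn and the line number 120
are hypothetical. They only show how the compiler's notion
of the current file and line changes.

```c
#include <assert.h>
#include <string.h>

/* From the next line on, the compiler believes it is reading
   line 120 of calc.syn. */
#line 120 "calc.syn"
static int reported_line(void) { return __LINE__; }

/* Nonzero if the compiler currently reports the given file name. */
static int reported_from(const char *name)
{
    assert(name != NULL);
    return strcmp(__FILE__, name) == 0;
}
```

Any diagnostic the compiler issues for code after the
directive is reported against calc.syn rather than against
the C file that actually contains the code.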

If the "line numbers" switch is off, AnaGram will put
comments into your parser file to help you find
reduction procedures and embedded C in your syntax
file.

Prior to AnaGram 2.01, if your C or C++ compiler required that the
backslashes in the pathname in the #line directive be doubled, you
would have used AnaGram's ©escape backslashesª switch to make this
happen. Although you may still use ©escape backslashesª, it should no
longer be necessary because AnaGram now puts forward slashes into #line
pathnames instead of backslashes.

If you wish, you may specify the pathname in the #line
directives explicitly by using the ©Line Numbers Pathª
configuration parameter.

You may also wish to change the "©parser file nameª"
parameter to provide a full path name for your parser
file.
##

Line Numbers Path

"Line Numbers Path" is a ©configuration parameterª
which takes a string value. It defaults to NULL.

When you have set the ©Line Numbersª ©configuration
switchª and Line Numbers Path is not NULL, AnaGram
uses it in the #line directive in place of the full
path name of your ©syntax fileª.

Note that Line Numbers Path should be the complete
pathname for your syntax file.

Line Numbers Path is useful when using AnaGram in cross
platform development. When parsers are to be compiled
and tested on a platform different from that used to run
AnaGram, you may use Line Numbers Path to provide a
pathname on the platform used for compiling and
testing.
##

Lines and Columns

"Lines and columns" is a ©configuration switchª which
defaults to on. When set, i.e., on, it causes the
©Build Parserª command to incorporate code into your
parser which will automatically track the line number
and column number of the input token.

You would normally set the "lines and columns" switch
when you are planning to build a parser which will read
an input file and which will need to diagnose ©syntax
errorsª with some precision.

Your parser will store the line and column numbers in
the ©lineª and ©columnª fields respectively in the
©parser control blockª.

If the input to your parser includes tab characters, you
should either set the ©tab spacingª ©configuration
parameterª appropriately or provide a ©TAB_SPACINGª
macro for your parser.

Your parser will count line and column numbers beginning
with one.
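
The bookkeeping involved can be sketched in C as follows.
This is not the code AnaGram generates. The names are
hypothetical, and the treatment of a tab (advance to the
column following the next tab stop) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

struct position { int line; int column; };

/* Advance pos past character c. tab_spacing is the distance
   between tab stops. */
static void advance(struct position *pos, int c, int tab_spacing)
{
    assert(pos != NULL && tab_spacing > 0);
    if (c == '\n') {
        pos->line++;
        pos->column = 1;
    } else if (c == '\t') {
        pos->column += tab_spacing - (pos->column - 1) % tab_spacing;
    } else {
        pos->column++;
    }
}

/* Position of the character that would follow text, counting
   lines and columns from one. */
static struct position position_after(const char *text, int tab_spacing)
{
    struct position pos = { 1, 1 };
    assert(text != NULL);
    for (; *text != '\0'; text++)
        advance(&pos, *text, tab_spacing);
    return pos;
}
```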
##

Main Program

The "main program" ©configuration switchª determines
what AnaGram does if you invoke the ©Build Parserª
command, but have no ©embedded Cª in your ©syntax
fileª. If the switch is on and you have not specified
©pointer inputª or an ©event drivenª parser, AnaGram
creates a main program which does nothing but call your
©parserª. The "main program" switch defaults to on.

This feature, along with the default definitions for
©GET_INPUTª and ©error handlingª, makes it possible
to write a grammar with no ©embedded Cª or ©reduction
procedureªs whatsoever and still get an executable
program which will read input from stdin and parse it
according to your grammar.
##

Marked Rule

A "marked rule" is a ©grammar ruleª together with a
marked token that indicates how much of the rule has already
been matched. The ©marked tokenª and any tokens following it
indicate the input that should be expected if the
remainder of the rule is to be matched.

When marked rules are displayed in AnaGram windows, the
marked token is represented by a difference in the font. The token may
be in bold face, underlined, italicized, shown with a different point
size, or in a different font altogether. Since AnaGram allows you to
change fonts to suit your own preferences, you should be careful that
the font you choose for the marked tokens allows them to be readily
distinguished from the other tokens in your grammar rules. An
underlined font is often suitable.
##

Max conflicts

The "max conflicts" ©configuration parameterª limits the
number of ©conflictªs AnaGram will record.  Sometimes, a
simple error editing your syntax file can cause hundreds
of conflicts, which you don't need to see in gory
detail. The default value of max conflicts is 50. If you
have a grammar that is in serious trouble and you want
to see more conflicts, you may change max conflicts to
suit your needs.
##

Missing

The ©warningª message Missing <element 1> in <element 2>
indicates that AnaGram expects to see an instance of
syntactic element 1 at the specified location, internal
to an instance of syntactic element 2. AnaGram cannot
reliably continue parsing its input after an error of
this type. Therefore, it limits further analysis of
your grammar to scanning for syntax errors.
##

Missing Production

"Missing production, TXXX: <token name>" is a ©warningª
message which indicates that the specified ©tokenª
appears to be defined recursively, but there is no
initial ©productionª to get the recursion started. If
you get this warning, check your ©grammarª closely.
##

Missing Reduction Procedure

"Missing reduction procedure, RXXX" is a ©warningª
message which appears either when the ©grammar ruleª indicated
specifies a ©parameter assignmentª but does not have a
©reduction procedureª to use it, or when the rule has no reduction
procedure but the value of the token on the left hand side is used
as an argument for some other reduction procedure and the ©default reduction valueª
does not have the same type as the token on the left hand side.
In this latter case, a reduction procedure may be needed to effect
correct type conversion.

This warning is
provided in case the lack of a reduction procedure is an
oversight.
##

Multiple Definitions

"Multiple definitions for TXXX: <token name>" is a
©warningª message which indicates that the specified
©tokenª has been defined both as a ©character setª and
as a ©nonterminal tokenª. It cannot be both.
##

Near Functions

"Near Functions" is a ©configuration switchª that
defaults to off. It controls the use of the "near"
keyword for static functions in your parser. If your
parser is to run on an 80x86 processor you might wish
to turn it on. Your parser will then be slightly
smaller and will run a little faster.

If you are going to run your parser on some other
processor or use a C or C++ compiler that does not
support the "near" keyword you should make sure "near
functions" is set to off.
##

Negative Character Code in Pointer Mode

This ©warningª message appears if your ©grammarª defines
negative character codes and uses ©pointer inputª. If
your grammar uses the default definition for ©pointer
typeª it will be reading unsigned characters so that
the parser will never see the negative codes that have
been defined. You may correct the problem by providing
your own definition of pointer type.
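
The underlying issue is the ordinary C distinction between
signed and unsigned characters. As a small illustration (the
helper names below are hypothetical and are not part of AnaGram),
the same byte reads as 255 through an unsigned char pointer and,
on the usual two's complement machines, as -1 through a signed
char pointer:

```c
/* Illustrative helpers only; these names are not part of AnaGram. */

/* Read one input code the way the default pointer type
   ("unsigned char *") would: the result is never negative. */
int read_code_unsigned(const void *p)
{
    return *(const unsigned char *)p;
}

/* Read one input code through a signed char pointer, so that
   bytes above 127 can show up as negative character codes. */
int read_code_signed(const void *p)
{
    return *(const signed char *)p;
}
```

A grammar that defines negative character codes would therefore
need a pointer type that behaves like the second helper.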
##

Nest Comments

"Nest comments" is a ©configuration switchª which
controls the treatment of ©commentsª while scanning
your ©syntax fileª. It defaults to off, in accordance
with the ANSI standard for C, which disallows
©nested commentsª. Note that AnaGram scans
comments in any ©embedded Cª code as well as in the
grammar specification. You may turn this switch on and
off as many times as necessary in a single file.
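
The difference between the two settings can be sketched in a few
lines of C. This is not AnaGram's scanner; skip_comment and its
"nest" argument are hypothetical names chosen for the
illustration, and the function assumes it is handed a pointer to
the start of a comment:

```c
#include <stddef.h>

/* Illustrative only. s must point at the two characters that open
   a C comment. Returns a pointer just past the end of the comment,
   or NULL if the comment is never closed. When nest is nonzero,
   comment openers inside the comment increase the nesting depth;
   when nest is zero they are ignored, as in ANSI C. */
const char *skip_comment(const char *s, int nest)
{
    int depth = 0;

    while (*s != '\0') {
        if (s[0] == '/' && s[1] == '*') {
            if (depth == 0 || nest)
                depth++;
            s += 2;
        } else if (s[0] == '*' && s[1] == '/') {
            s += 2;
            if (--depth == 0)
                return s;
        } else {
            s++;
        }
    }
    return NULL;
}
```

With nesting off, the first comment terminator ends the comment;
with nesting on, the scan continues until every opener has been
matched.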
##

Nested Comment

As delivered, AnaGram treats C style ©commentsª
according to the ANSI standard: They do not nest. For
those who prefer nested comments, however, the ©nest
commentsª ©configuration switchª allows them to nest.
##

Nesting too deep

This ©warningª message indicates that ©set
expressionªs or ©virtual productionsª are
nested so deeply they have exhausted the available
stack space and AnaGram cannot continue its analysis.

Use a ©definitionª statement to name an intermediate
level.
##

no cr

"no cr" is a ©configuration switchª which
defaults to off. When this switch is set, it will
cause the ©parser fileª and ©header fileª to be
written without carriage returns. This is convenient
if you wish to use the generated parser files in a
Unix environment.
##

No Grammar Token Specified

This ©warningª message appears if your ©grammarª does not
specify a ©grammar tokenª. Edit your ©syntax fileª to
specify one.
##

No Productions in Syntax File

This ©warningª message appears if AnaGram did not find
any ©productionsª at all in your ©syntax fileª. Check
to see that you have the right file.
##

No Such Parameter

This ©warningª message appears when AnaGram does not
recognize the name of a ©configuration parameterª you
have tried to set in your ©syntax fileª. Check the
spelling of the parameter you wish to set in the
©Configuration Parameters Windowª.
##

No Terminal Tokens in Expansion

"No terminal tokens in expansion of TXXX" is a ©warningª
message indicating that there are no terminal tokens
to be found in an expansion of the specified token.
Although there are a few circumstances where this could
be legitimate, it is more likely that there is a missing
rule in the grammar.
##

Not a Character Set

"Not a character set, TXXX: <token name>" is a ©warningª
message which indicates that the specified ©tokenª has
been used both on the left side of a ©productionª and in
a ©character setª expression defining some other token.
AnaGram will use an empty set in place of the
specified token in evaluating the ©character setª. You
will get another warning, ©Error definingª token, when
AnaGram finishes its evaluation of the character set.
##

Nothing Reduces

"Nothing reduces TXXX -> RYYY" is a ©warningª message
which indicates that the ©grammarª does not specify any
input to follow an instance of the indicated ©grammar
ruleª. In all probability, the grammar does not have
any explicit end of file, or ©eof tokenª. If the grammar
does not have any conflicts with ©tokenª T000, then an
explicit end of file indicator is not necessary.
Otherwise you should modify your grammar to require an
explicit end of file.
##

Null Character in String

This ©warningª message appears when AnaGram finds an
explicit null character in a quoted string. If you must
allow for a null in a ©keyword stringª
you will have to rewrite your
©grammar ruleª. For instance, instead of

  widget
    -> "abc\0def"

write

  widget
    -> "abc", 0, "def"
##

nonassoc

"nonassoc" controls a ©precedence declarationª,
indicating that all of the listed ©rule elementsª are
to be considered non-associative.
##

Nonterminal Token, Nonterminal

A nonterminal token is one which is constructed from a
series of other tokens as specified by one or more
©productionªs. Nonterminal tokens are to be
distinguished from ©terminal tokenªs, which are the
basic input units appearing in your input stream.
Terminal tokens most often represent single characters
or a character belonging to a ©character setª such as
'a-z'.
##

Null Production

A "null production" is one that has no tokens on the
right hand side whatsoever. Null ©productionªs are
identified, in effect, by the first input token that
follows them. Null productions are extremely convenient
syntactic elements when you wish to make some input
optional. For example, suppose that you wish to allow an
optional semicolon at some point in your grammar. You
could write the following pair of productions:
  optional semicolon -> | ';'
Note that a null production can never follow a '|'.

This could also be written on multiple lines thus:
  optional semicolon
    ->
    -> ';'

You can always rewrite your grammar to eliminate null
productions if you wish, but you usually pay a price in
conciseness and clarity. Sometimes, however, it is
necessary to do such a rewrite in order to avoid
©conflictªs, to which null productions are especially
prone. For example suppose you have the following
production:
  foo -> wombat, optional semicolon, widget

You can rewrite this as two productions:
  foo
    -> wombat, widget
    -> wombat, ';', widget

This rewrite specifies exactly the same input language,
but is less prone to conflicts. On the other hand, it
may require significantly more table space in your
parser.

If you have a null production with no ©reduction
procedureª specified, your parser will automatically
assign the value zero to the ©reduction tokenª.

Null productions can also be generated by ©virtual
productionsª.

A token that has a null production is a "©zero lengthª"
token.
##

Old Style

"Old Style" is a ©configuration switchª which defaults
to off. It controls the function definitions in the code
AnaGram generates. When "old style" is off, it generates
ANSI style calling sequences with prototypes as
necessary. When "old style" is on, it generates old
style function definitions.
##

Output Files

When you use the ©Build Parserª command to request
output from AnaGram, it creates two files: a ©parser
fileª and a ©parser headerª file.
##

Page Length

"Page length" is an ©obsolete configuration parameterª.
##

Obsolete Configuration Parameter, Obsolete Configuration Switch

A number of ©configuration parameterªs and ©configuration switchªes
which were used in the DOS version of AnaGram are no longer
used, but are still recognized for the sake of upward
compatibility. These parameters include:
 ©bottom marginª
 ©line lengthª
 ©page lengthª
 ©top marginª
 ©quick referenceª
 ©video modeª

##

Parameter

"Parameter <name> has type void" is a ©warningª message
which appears when a ©parameter assignmentª is attached
to a ©tokenª that has been defined to have the void
©data typeª.
##

Parameter Assignment

In any ©grammar ruleª, the ©semantic valueª of any
©rule elementª may be passed to a ©reduction procedureª
by means of a parameter assignment. Simply follow the
rule element with a colon and a C variable name. The C
variable name can then be used in the reduction
procedure to reference the semantic value of the token
it is attached to. AnaGram will automatically provide
necessary declarations.

Here are some examples of rule elements with parameter
assignments:

  '0-9':d
  integer:n
  expression:x
  declaration : declaration_descriptor

##

Parameter Not Defined

AnaGram does not have a ©configuration parameterª
with the specified name. Please check the spelling.
##

Parameter Takes Integer Value

The specified ©configuration parameterª takes
an integer value only.
##


Parameter Takes String Value

The specified ©configuration parameterª takes
a string value only.
##

Parse Function

To run your parser, you call the parse function.
The name of the parse function is given by
the ©parser nameª ©configuration parameterª and defaults to the
name of your parser file.

If your parser uses ©pointer inputª, you should set the ©pointerª
field of the ©parser control blockª before calling the parse
function.

If your parser is ©event drivenª, you should first call the
©initializerª, and then you should call the parse function
for each input token.

If the ©reentrant parserª switch is set, the parse function takes
a pointer to the ©parser control blockª as its sole argument. Otherwise
it takes no arguments. The parse function returns no value. All
communication is by means of the ©parser control blockª.

To retrieve the value of the ©grammar tokenª, once the parse is complete,
use the ©parser value functionª.
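
The calling sequence for a non-reentrant parser that uses pointer
input can be sketched with a hand-written stand-in. Nothing below
is AnaGram output: the parser name "toy", the contents of the
control block, and the TOY_ exit codes are all assumptions made
for the example. A real parser takes its definitions from the
generated header file.

```c
/* Hand-written stand-in for a generated parser named "toy".
   It only mimics the shape of the calls described above. */
enum { TOY_RUNNING, TOY_SUCCESS, TOY_SYNTAX_ERROR };  /* invented codes */

typedef struct {
    const unsigned char *pointer;  /* next input character */
    int exit_flag;                 /* one of the codes above */
    long value;                    /* value of the goal token */
} toy_pcb_type;

toy_pcb_type toy_pcb;
#define PCB toy_pcb

/* "Parse function": accepts decimal digits ended by a null byte.
   Takes no arguments and returns nothing; results go in the PCB. */
void toy(void)
{
    PCB.value = 0;
    PCB.exit_flag = TOY_RUNNING;
    while (PCB.exit_flag == TOY_RUNNING) {
        int c = *PCB.pointer++;
        if (c >= '0' && c <= '9')
            PCB.value = PCB.value * 10 + (c - '0');
        else if (c == '\0')
            PCB.exit_flag = TOY_SUCCESS;
        else
            PCB.exit_flag = TOY_SYNTAX_ERROR;
    }
}

/* "Parser value function": the parser name with _value appended. */
long toy_value(void)
{
    return PCB.value;
}
```

A caller sets PCB.pointer to the start of the input, calls toy(),
checks PCB.exit_flag, and only then calls toy_value().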
##

Parser

A parser is a program or, more commonly, a procedure within
a program, which scans a sequence of ©input charactersª
or input tokens and accumulates them in an input
buffer or stack as determined by a set of ©productionªs
which constitute a ©grammarª.

When the parser discovers
a sequence of tokens as defined by a ©grammar ruleª, or
right hand side of a production, it "reduces" the
sequence to a single ©reduction tokenª as defined by the
left hand side of the grammar rule. This ©nonterminal
tokenª now replaces the tokens which matched the grammar
rule and the search for matches continues.

If an input
token is encountered which will not yield a match for
any rule, it is considered a ©syntax errorª and some
kind of ©error recoveryª may be required to continue. If
a match, or ©reduce actionª, yields the ©grammar tokenª,
sometimes called the ©goal tokenª or ©start tokenª, the
parser deems its work complete and returns to whatever
procedure may have called it.

The ©Grammar Traceª and ©File Traceª functions in
AnaGram provide a convenient means for understanding the
detailed operation of a syntax directed parser.

©Tokensª may have ©semantic valuesª. If the ©input
valuesª ©configuration switchª is on, your parser will
expect semantic values to be provided by the input
process along with the token identification code. If the
input values switch is off, your parser will take the
ASCII value of the input character, that is, the actual
input code, as the value of the character.

When the
parser reduces a production, it can call a ©reduction
procedureª or ©semantic actionª to analyze the values of
the constituent tokens. This reduction procedure can
then return a value which characterizes the reduced
token.
##

Parser Control Block

A "Parser Control Block" is a structure which contains
all of the data necessary to describe the instantaneous
state of a parser. The typedef statement which defines
the structure is included in the ©parser headerª file
for your parser. AnaGram creates the name of the data
type for the structure by appending "_pcb_type" to the
©parser nameª.

You may add your own declarations to the parser control
block by using the ©extend pcbª statement.

If the ©declare pcbª ©configuration switchª is on, its
normal state, AnaGram will declare a parser control
block for you at the beginning of your parser file.
AnaGram will determine the name of the parser control
block by appending "_pcb" to the ©parser nameª. AnaGram
will also define the macro PCB as a short hand notation
for use within the parser. All references to the parser
control block within the code that AnaGram generates
are made using the PCB macro.

If you wish to declare your own parser control block,
you must include the ©parser headerª file for your
parser before your declaration. Then you declare a
control block and define PCB to refer to the control
block you have declared.

Suppose your grammar is called widget. You would then
write the following statements in your ©embedded Cª:
  #include "widget.h"
  widget_pcb_type widget_control_pcb;
  #define PCB widget_control_pcb

Alternatively, you could write the following:
  #include "widget.h"
  widget_pcb_type *widget_control_pcb_pointer;
  #define PCB (*widget_control_pcb_pointer)

and then allocate storage for the structure when
necessary.

Some fields of interest in the parser control block are
as follows:
	©input_codeª
	©input_valueª
	©input_contextª
	©pointerª
	©token_numberª
	©reduction_tokenª
	©ssxª
	©snª
	©ssª[©parser stack sizeª]
	©vsª[parser stack size]
	©csª[parser stack size]
	©lineª
	©columnª
	*©error_messageª
	©error_frame_ssxª
	©error_frame_tokenª
##

PCB

"PCB" is a macro AnaGram defines for use in the code it
generates to refer to the ©parser control blockª for
your ©parserª. Normally, AnaGram automatically declares
storage for a parser control block and defines PCB for
you. If you turn off the ©declare PCBª switch, you may
define PCB yourself.
##

PCB_TYPE

If you are writing your parser in C++, you may prefer to derive
a class from the ©parser control blockª rather than use the
©extend pcbª statement. In this case you may define the
PCB_TYPE macro in your syntax file to specify your derived
class.

For instance, suppose you have defined

class MyPcb : public parser_pcb_type {...};

You would then add the following line:

#define PCB_TYPE MyPcb

If you do not define PCB_TYPE, AnaGram will define it as the
type of your parser control block.
##

Parser File

The "parser file" is the C (or C++) file output by AnaGram when
you execute the ©Build Parserª command. It contains all
of the ©embedded Cª from your ©syntax fileª, all of the
©reduction procedureªs defined in your ©grammarª,
syntax tables which represent, in a condensed form, all
of the intricacies of your grammar, and a customized
©parsing engineª. The name of the parser file is given
by the ©parser file nameª ©configuration parameterª. The
name of the ©parserª itself is given by the ©parser
nameª configuration parameter.

If you wish the parser file to be written without carriage
returns, suitable for a Unix environment, set the ©no crª
configuration switch.
##

Parser File Name

"Parser file name" is a ©configuration parameterª which
takes a string value. The default value is "#.c".
AnaGram uses this parameter to generate the name of the
output C file, or ©parser fileª, created by the ©Build
Parserª command.  The '#' character is used in this
string as a wild card to indicate the name of the
current ©syntax fileª. If the first character of the
parser file name string is a '.' character, AnaGram
will substitute the name of the current working
directory for the dot. Thus ".\\#.c" will create the
file name as a complete path. This can sometimes be
important when using the ©line numbersª switch to
enable a debugger to find code in your parser file.

Note that the parser file name is not the same as the
©parser nameª.
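
The '#' substitution can be sketched as follows. This is an
illustration only, not AnaGram's code: expand_name is a
hypothetical helper, it assumes the syntax file name is passed
without its extension, and it does not handle the leading '.'
working directory substitution described above.

```c
#include <stddef.h>
#include <string.h>

/* Copy pattern to out, replacing each '#' with base.
   Returns 0 on success, -1 if out is too small. */
int expand_name(const char *pattern, const char *base,
                char *out, size_t out_size)
{
    size_t used = 0;
    size_t base_len = strlen(base);

    if (out_size == 0)
        return -1;
    for (; *pattern != '\0'; pattern++) {
        if (*pattern == '#') {
            /* leave room for the terminating null */
            if (used + base_len >= out_size)
                return -1;
            memcpy(out + used, base, base_len);
            used += base_len;
        } else {
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *pattern;
        }
    }
    out[used] = '\0';
    return 0;
}
```

Under these assumptions, the default pattern "#.c" applied to a
syntax file named widget gives widget.c.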
##

Parser Generator

A parser generator, such as AnaGram, is a program that
converts a ©grammarª, a rule-based description of the
input to a program, into a conventional, procedural
module called a ©parserª. The parsers AnaGram generates
are simple C modules which can be compiled on almost
any platform. AnaGram parsers are also compatible with
C++.
##

Header File, Parser Header

When you use the command ©Build Parserª to generate
source code for a parser, AnaGram creates two files, a
header file and a C source file. Unless different
paths are specified in the ©parser file nameª and
©header file nameª parameters, both files will be
written to the directory that contains the ©syntax fileª.

The header file contains a number of typedef statements,
including the definition of the ©parser control blockª,
and a number of macro
definitions which may be useful in your parser
or in other modules of your program.

If you do not alter
the ©header file nameª parameter, the
name of the header file will be the same as the name of
your ©syntax fileª and it will have the extension ".h".

If you wish the header file to be written without carriage
returns, suitable for a Unix environment, set the ©no crª
configuration switch.
##

Parser Input

AnaGram ©parserªs may be configured to accept input in any of
three different ways:

 By default, a ©parse functionª gets its input by invoking the
©GET_INPUTª macro each time it is ready for another input token. The
default implementation of GET_INPUT reads ©input characterªs from stdin.  For
most practical problems, you will want to override this definition of
GET_INPUT, storing the current input character in PCB.input_code.

 Alternatively, you may configure a parser to read input from an
array in memory. Set the ©pointer inputª switch and load the
©pointerª field of the parser control block before calling the
parse function. The parser will then run, incrementing the
pointer, until it finishes or encounters an error.

 The third alternative is to set the ©event drivenª switch. The
parser will be configured as a callback routine. Begin by calling
the ©initializerª. Then, for each input character, store the
character in the ©input_codeª field of the parser control block and
call the parse function. Each time
you call the parse function it will continue until it needs more
input.  You can check its status by inspecting the ©exit_flagª in the
parser control block.

The input to your parser may be either text characters or ©tokensª
accumulated by a pre-processor, or ©lexical scannerª. The latter
case is referred to as ©token inputª. If you use a lexical scanner,
you may find it convenient to configure your parser as event driven.

Although lexical scanners are often not necessary
when you use AnaGram, if you do need one you can write it in AnaGram.
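
The event driven calling pattern can be sketched with a
hand-written stand-in. None of this is generated by AnaGram: the
parser name "toy", the "depth" field, and the TOY_ exit codes are
assumptions made for the example. The stand-in accepts balanced
parentheses ended by a null byte.

```c
/* Hand-written stand-in for an event driven parser named "toy". */
enum { TOY_RUNNING, TOY_SUCCESS, TOY_SYNTAX_ERROR };  /* invented codes */

typedef struct {
    int input_code;  /* set by the caller before each call */
    int exit_flag;   /* inspected by the caller after each call */
    int depth;       /* private state of this toy parser */
} toy_pcb_type;

toy_pcb_type toy_pcb;
#define PCB toy_pcb

/* "Initializer": the parser name preceded by init_. */
void init_toy(void)
{
    PCB.depth = 0;
    PCB.exit_flag = TOY_RUNNING;
}

/* "Parse function": consumes PCB.input_code, then returns. */
void toy(void)
{
    if (PCB.exit_flag != TOY_RUNNING)
        return;
    switch (PCB.input_code) {
    case '(':
        PCB.depth++;
        break;
    case ')':
        if (PCB.depth == 0)
            PCB.exit_flag = TOY_SYNTAX_ERROR;
        else
            PCB.depth--;
        break;
    case '\0':
        PCB.exit_flag = PCB.depth == 0 ? TOY_SUCCESS : TOY_SYNTAX_ERROR;
        break;
    default:
        PCB.exit_flag = TOY_SYNTAX_ERROR;
        break;
    }
}

/* Driver: one call to the parse function per input character. */
int toy_parse_string(const char *s)
{
    init_toy();
    for (;;) {
        PCB.input_code = (unsigned char)*s;
        toy();
        if (PCB.exit_flag != TOY_RUNNING)
            break;
        s++;
    }
    return PCB.exit_flag;
}
```

The driver loop is the part that carries over to a real parser:
call the initializer once, then store an input code and call the
parse function repeatedly, checking the exit flag after each call.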
##

Parser Name

"Parser Name" is a ©configuration parameterª which
defaults to "#", where "#" represents the name of your
©syntax fileª. AnaGram uses this parameter to name your
©parse functionª. The ©initializerª for your parser will have the
same name preceded by "init_". Note that "©parser file
nameª" is not the same configuration parameter as "parser
name".
##

Parser Stack

Your ©parserª uses a "parser stack" to keep track of the
©grammar rulesª it is trying to match and its progress
in matching them. Normally, there are two separate
stacks defined by AnaGram: ©PCBª.©ssª, the ©parser state
stackª which maintains ©parser stateª numbers, and
PCB.©vsª, the ©parser value stackª which maintains the
©semantic valueªs of tokens that have been identified so
far. If you wish to maintain a stack tracking other
variables you may set the ©context typeª ©configuration
parameterª, and AnaGram will define a third stack,
PCB.©csª. All are indexed by the same stack index,
PCB.©ssxª.

To see how tokens accumulate on the parser stack, run
the ©Grammar Traceª or the ©File Traceª.

Normally, when the return value of a ©reduction procedureª
is stored on the parser value stack, it is stored by
simply coercing the stack pointer to the correct type.
If the return value is a C++ object, this can cause
serious problems. These problems can be avoided by
using the ©wrapperª statement.
##

Parser stack alignment

Parser stack alignment is a ©configuration parameterª whose
value is a C or C++ data type. It defaults to "long". If
any tokens have type "double", it will be automatically set
to double. Thus, you will normally not need to change this
parameter if your parser is to run on a PC or compatible
processor. It provides alignment control for processors
which restrict addresses for multibyte data access. The
default setting provides for correct operation on 64 bit
processors.

To control byte alignment of the parser stack,
©PCBª.©vsª, AnaGram normally adds a field of the
specified data type to the "union" statement which
defines the data type for the ©parser stackª. This
parameter can be used to deal with byte alignment
problems when a ©parserª is to be run on a processor
with byte alignment restrictions. For instance, if your
©grammarª has ©tokenªs of type "long double" and your
processor requires long double variables to be
properly aligned, you can include the following
statement in a ©configuration sectionª in your grammar
or in your ©configuration fileª:

  parser stack alignment = long double

If the data type specified is "void", no alignment declaration
will be made.
##

Parser Stack Index, Stack Index

The parser stack index, ©PCBª.©ssxª, tracks the depth
of the ©parser state stackª, the ©parser value stackª,
and the ©context stackª if you defined one. The parser
stack index is incremented by ©shift actionsª and
reduced by ©reduce actionsª.
##

Parser Stack Overflow

Your ©parserª uses a ©parser stackª to keep track of the
©grammar rulesª it is trying to match and its progress
in matching them. If your grammar has any ©recursive
ruleªs that are not strictly left recursive, then no
matter how big you make the parser stack, it will be
possible to create a syntactically correct input which
will cause the stack to overflow. As a practical matter,
however, it is usually possible to set the ©parser stack
sizeª to a value large enough so that an overflow is a
freak occurrence. Nevertheless, it is necessary to check
for overflow, and in the case overflow should occur,
your parser has to do something.  What it does is invoke
the ©PARSER_STACK_OVERFLOWª macro. If you don't define
it, AnaGram will define it for you, although not
necessarily to your taste.
##

Recursive rule, Recursion

A ©grammar ruleª is said to be "recursive" if the ©tokenª on the left side
of the rule also appears on the right side of the rule, or
in an ©expansion ruleª of any token on the right side of the rule.

If the token on the left side is the
first token on the right side, the rule is said to be "left recursive".
If it is the last token on the right side, the rule is said to be
"right recursive". Otherwise, the rule is "center recursive".

For example:
  statement list
    -> statement
    -> statement list, statement  // left recursive

  fraction part
    -> digit
    -> digit, fraction part       // right recursive

  expression
    -> factor
    -> expression, '+' + '-', factor

  factor
    -> primary
    -> factor, '*' + '/', primary

  primary
    -> number
    -> name
    -> '(', expression, ')'       // center recursive

Note that if all the tokens in the rule other than the recursive token itself
are ©zero lengthª tokens, it is possible for the
rule to be matched arbitrarily many times without any input whatsoever. In
other words, such a rule creates an infinite loop in the parser. AnaGram can
detect this condition and issues an ©empty recursionª diagnostic if it occurs.
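
The left, right and center distinction can be expressed as a small
C function. This is a sketch only: classify_rule and the token
numbering are hypothetical, and the function looks only at direct
recursion, ignoring recursion that arises through the expansion
rules of tokens on the right side.

```c
#include <stddef.h>

typedef enum {
    NOT_RECURSIVE,
    LEFT_RECURSIVE,
    RIGHT_RECURSIVE,
    CENTER_RECURSIVE
} recursion_kind;

/* lhs is the token on the left side of the rule; rhs holds the
   length tokens on the right side. Tokens are plain integers. */
recursion_kind classify_rule(int lhs, const int *rhs, size_t length)
{
    size_t i;

    if (length == 0)
        return NOT_RECURSIVE;          /* null production */
    if (rhs[0] == lhs)
        return LEFT_RECURSIVE;         /* first token on the right */
    if (rhs[length - 1] == lhs)
        return RIGHT_RECURSIVE;        /* last token on the right */
    for (i = 1; i + 1 < length; i++)
        if (rhs[i] == lhs)
            return CENTER_RECURSIVE;   /* somewhere in between */
    return NOT_RECURSIVE;
}
```

A rule in which the left side token is both first and last on the
right side is reported as left recursive, following the order of
the definitions above.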

##

PARSER_STACK_OVERFLOW

PARSER_STACK_OVERFLOW is a user definable macro. If you
do not define it, AnaGram will define it so that it
will print a message on stderr and abort the ©parserª in
case of a ©parser stack overflowª.
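
The "define it yourself or accept the default" arrangement can be
sketched with ordinary preprocessor logic. The names below
(TOY_STACK_OVERFLOW, toy_pcb, toy_push) are hypothetical and the
default shown is only a guess at a reasonable behavior, not the
definition AnaGram supplies:

```c
#include <stdio.h>

#define TOY_STACK_SIZE 4   /* deliberately tiny for the example */

typedef struct {
    int ss[TOY_STACK_SIZE];  /* state stack */
    int ssx;                 /* stack index */
    int overflowed;          /* set by the default handler below */
} toy_pcb;

/* If the user has not supplied a handler, fall back to one that
   reports the problem on stderr and flags the control block. */
#ifndef TOY_STACK_OVERFLOW
#define TOY_STACK_OVERFLOW(pcb)                              \
    do {                                                     \
        fprintf(stderr, "toy parser stack overflow\n");      \
        (pcb)->overflowed = 1;                               \
    } while (0)
#endif

/* Push a state, invoking the overflow handler when the stack is
   full. Returns 0 on success, -1 on overflow. */
int toy_push(toy_pcb *pcb, int state)
{
    if (pcb->ssx >= TOY_STACK_SIZE) {
        TOY_STACK_OVERFLOW(pcb);
        return -1;
    }
    pcb->ss[pcb->ssx++] = state;
    return 0;
}
```

Defining TOY_STACK_OVERFLOW before this code is compiled replaces
the default, which is the same mechanism by which a definition of
PARSER_STACK_OVERFLOW in your embedded C takes precedence.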
##

Parser Stack Size

"Parser stack size" is a ©configuration parameterª with
a default value of 128. It is used to define the sizes
of your ©parser stacksª in your ©parser control blockª.
When analyzing your grammar, AnaGram will determine the
minimum amount of stack space required for the deepest
left ©recursionª. To this depth it will add one half the
value of the parser stack size parameter. It will then
set the actual stack size to the larger of this value
and the parser stack size parameter.
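
The sizing rule just described amounts to a one-line calculation.
The function below is a sketch of that rule as stated here, with a
hypothetical name; it is not taken from AnaGram itself, and it
assumes that "one half" means integer division:

```c
/* min_depth:  stack space AnaGram found necessary for the grammar
   size_param: value of the parser stack size parameter
   Returns the larger of size_param and min_depth + size_param/2. */
int actual_stack_size(int min_depth, int size_param)
{
    int padded = min_depth + size_param / 2;

    return padded > size_param ? padded : size_param;
}
```

With the default parameter of 128, a grammar needing a depth of 10
keeps a stack of 128 entries, while one needing a depth of 100
gets 100 + 64 = 164 entries.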
##

Parser State, State Number

The essential part of your ©parserª is a group of tables
which describe in detail what to do for each "state" of
the parser.

The states of a parser are determined by sets of
"©characteristic rulesª". The ©State Definition Tableª
shows the characteristic rules for each state of your
parser.

AnaGram numbers the states of a parser as it identifies
them, beginning with zero. In all windows, state numbers
are displayed as three digit numbers prefixed with the
letter 'S'.
##

Parser State Stack, State Stack

The parser state stack is a stack maintained by your
©parserª which is an integral part of the parsing
process. At any point in the parse of your input
stream, the parser state stack provides a summary of
what has been found so far. The parser state stack is
stored in ©PCBª.©ssª and is indexed by PCB.©ssxª, the
©parser stack indexª.
##

Parser Value Stack, Value Stack

In parallel with the ©parser state stackª, your parser
maintains a "value stack", ©PCBª.©vsª, each entry of
which corresponds to the ©semantic valueª of the token
identified at that state. Since the semantic values of
different tokens might well have different ©data typeªs,
AnaGram gives you the opportunity, in your ©syntax
fileª, to define the data type for any token. AnaGram
then builds a typedef statement creating a data type
which is a union of all the types you have defined.
AnaGram creates the name for this ©data typeª by
appending "_vs_type" to the ©parser nameª. AnaGram uses
this data type to define the value stack.
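
As a rough picture of what such a union looks like, suppose a
grammar named widget has tokens of type int, double and char *.
The member names, the stack size, and the helper function below
are invented for this sketch; the declarations AnaGram actually
generates are found in your parser header file.

```c
/* Hypothetical value stack type for a grammar named "widget". */
typedef union {
    int    int_member;
    double double_member;
    char  *pointer_member;
} widget_vs_type;

#define TOY_STACK_SIZE 128

/* Cut-down control block: the state stack and the value stack
   run in parallel and share one stack index. */
typedef struct {
    int            ss[TOY_STACK_SIZE];  /* state numbers */
    widget_vs_type vs[TOY_STACK_SIZE];  /* semantic values */
    int            ssx;                 /* shared stack index */
} toy_pcb_type;

/* Record a state together with the double value of its token.
   Overflow checking is omitted to keep the sketch short. */
void toy_push_double(toy_pcb_type *pcb, int state, double value)
{
    pcb->ss[pcb->ssx] = state;
    pcb->vs[pcb->ssx].double_member = value;
    pcb->ssx++;
}
```

Because every entry is the same union, each slot of the value
stack is large enough for any of the declared token types.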
##

Parser Action

In a traditional LR parser, there are only four actions: the ©shift
actionª, the ©reduce actionª, the ©accept actionª and the ©error
actionª. AnaGram, in doing its ©grammar analysisª, identifies a
number of special cases, and creates a number of extra actions which
make for faster processing, but which can be represented as
combinations of these primitive actions.

When a shift action is performed, the current state
number is pushed onto the ©parser state stackª and the
new state number is determined by the current state
number and the current input token. Different tokens
cause different new states.

When a reduce action is performed, the length of the
rule being reduced is subtracted from the ©parser stack
indexª and the new state number is read from the top of
the parser state stack. The ©reduction tokenª for the
rule being reduced is then used as an input token.
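
The stack manipulation in these two actions can be sketched as
follows. This is a simplified model with hypothetical names, not
the engine AnaGram generates: table lookup is left to the caller,
overflow checking is omitted, and a rule of length zero is assumed
to leave the current state unchanged.

```c
#define TOY_STACK_SIZE 128

typedef struct {
    int ss[TOY_STACK_SIZE];  /* state stack */
    int ssx;                 /* stack index */
    int sn;                  /* current state number */
} toy_pcb;

/* Shift: push the current state, then move to the new state that
   the tables give for the current state and input token. */
void toy_shift(toy_pcb *pcb, int new_state)
{
    pcb->ss[pcb->ssx++] = pcb->sn;
    pcb->sn = new_state;
}

/* Reduce: drop one stack entry per rule element and resume in the
   state found at the new top of the stack. The reduction token is
   then presented to that state as if it were an input token. */
void toy_reduce(toy_pcb *pcb, int rule_length)
{
    if (rule_length > 0) {
        pcb->ssx -= rule_length;
        pcb->sn = pcb->ss[pcb->ssx];
    }
}
```

After two shifts from state 0, a reduce of a two element rule
returns the model to state 0 with an empty stack.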
##

Parsing Engine

A parser consists of three basic components: A set of
syntax tables, a set of ©reduction procedureªs and a
parsing engine. The parsing engine is the body of code
that interprets the parsing table, invokes input
functions, and calls the reduction procedures. The
©Build Parserª command configures a parsing engine
according to the implicit requirements of the syntax
specification and according to the explicit values of
the ©configuration parameterªs.

The parsing engine itself is a simple automaton,
characterized by a set of states and a set of inputs.
The inputs are the tokens of your grammar. Each state
is represented by a list of tokens which are admissible
in that state and for each token a ©parser actionª to perform
and a parameter which further defines the action.

Each state in the grammar, with the exception of state
zero, has a ©characteristic tokenª which must have been
recognized in order to jump to that state. Therefore,
the ©parser state stackª, which is essentially a list
of state numbers, can also be thought of as a list of
token numbers. This is the list of tokens that have
been seen so far in the parse of your input stream.
##

Partition

If you use ©character setsª in your grammar, AnaGram
will compute a "partition" of the ©character universeª.
This partition is a collection of non-overlapping
character sets such that every one of the sets you have
defined can be written as a ©unionª of partition sets.

Each partition set is assigned a unique ©tokenª. If one
of your character sets requires more than one partition
set to represent it, AnaGram will create appropriate
©productionªs and add them to your grammar so your parser
can make the necessary distinctions.

To see how AnaGram has partitioned the character
universe, you may inspect the ©Partition Setsª window
found in the ©Browse Menuª.
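
One simple way to compute such a partition, shown here only as an
illustration, is to give each character a "signature" recording
which of the defined sets contain it; characters with the same
signature fall in the same partition set. The function name, the
256 character universe, the 32 set limit, and the numbering order
are assumptions of this sketch and need not match what AnaGram
does.

```c
#include <stddef.h>

#define UNIVERSE 256   /* assumed size of the character universe */
#define MAX_SETS 32    /* limit imposed by this sketch only */

/* member[s][c] is nonzero if character c belongs to set s.
   On return, part_of[c] is the partition set number for c, or -1
   if c belongs to none of the sets. Returns the number of
   partition sets, or -1 if nsets exceeds MAX_SETS. */
int toy_partition(unsigned char member[][UNIVERSE], size_t nsets,
                  int part_of[UNIVERSE])
{
    unsigned long seen[UNIVERSE];  /* signature of each partition set */
    int nparts = 0;
    int c;

    if (nsets > MAX_SETS)
        return -1;
    for (c = 0; c < UNIVERSE; c++) {
        unsigned long signature = 0;
        size_t s;
        int p;

        for (s = 0; s < nsets; s++)
            if (member[s][c])
                signature |= 1UL << s;
        if (signature == 0) {
            part_of[c] = -1;
            continue;
        }
        for (p = 0; p < nparts && seen[p] != signature; p++)
            ;
        if (p == nparts)
            seen[nparts++] = signature;
        part_of[c] = p;
    }
    return nparts;
}
```

For the two sets 'a-z' and '0-9' + 'a-f', this yields three
partition sets: the digits, the letters a through f, and the
letters g through z. The first character set is then the union of
the last two partition sets, and the second is the union of the
first two.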
##

Partition Set Number

Each ©partitionª set is identified by a unique
reference number called the partition set number.
Partition set numbers are displayed in the form Pxxx.
Partition sets are numbered starting with zero, so the
first set is P000.

To see the elements of a given partition set, call up
the ©Partition Setsª window from the ©Browse Menuª,
then, after selecting a partition set, call up the ©Set
Elementsª window from the ©Auxiliary Windowsª popup menu.
##

Partition Sets

The Partition Sets option in the ©Browse Menuª pops up
a window which shows the complete ©partitionª of the
©character universeª for your parser.

The Partition Sets option in the ©Auxiliary Windowsª popup menu
for the ©Character Setsª window lets you see the
partition sets which cover the specified character set.

Each entry in a Partition Sets window identifies a
token number and a ©partition set numberª. The ©Auxiliary
Windowsª menu provides a ©Set Elementsª entry which
enables you to see precisely which characters belong to
the partition set. It also has a Token Usage entry to show you
what rules the set is used in.
##

PCONTEXT

PCONTEXT is an alternate form of the ©CONTEXTª macro
which takes an explicit argument to specify the
©parser control blockª. PCONTEXT is defined in the ©parser
headerª file.
##

PERROR_CONTEXT

PERROR_CONTEXT is an alternative form of the
©ERROR_CONTEXTª macro. It differs only in that it takes
an argument so you can specify the appropriate
©parser control blockª explicitly. PERROR_CONTEXT is defined in
the ©parser headerª file.
##

pointer

"pointer" is a field which will be included in the
©parser control blockª for your parser if you have set
the ©pointer inputª ©configuration switchª. Your main
program should set PCB.pointer before it calls your
parser. Thereafter, your parser will increment it
appropriately. When you are executing a ©reduction
procedureª or a ©SYNTAX_ERRORª macro PCB.pointer will
always point to the next input character to be read.
##

Pointer input

"Pointer input" is a ©configuration switchª which you
may set to control ©parser inputª. It defaults to off. When you set
pointer input, you tell AnaGram that the input to your parser is in
memory and can be scanned simply by incrementing a pointer. Before
calling your parser you should make sure that ©PCBª.©pointerª is
properly initialized to point to the first character or token in your
input.

Use the ©configuration parameterª "©pointer typeª" to
specify the type of the pointer. The default value of
"pointer type" is "unsigned char *".

There is no particular reason why pointer type should
be limited to variants on char. It could define a
pointer to int or a structure just as well.

If you use pointer input with structures or C++
classes, you should set the ©input valuesª switch and
define an ©INPUT_CODEª(t) macro.

If you are using a 16 bit compiler and your input array
is so large that you need "huge"
pointers, make sure that "pointer type" is properly
defined.
##

Pointer Type

"Pointer Type" is a ©configuration parameterª which
defaults to "unsigned char *". When you have specified
©pointer inputª, AnaGram uses the value of pointer type
to declare a pointer field in your ©parser control
blockª.
##

Precedence, Operator Precedence

In expressions of the form a+b*c, the convention is to
perform the multiplication before the addition.
Multiplication is said to take precedence over
addition. In general the rank order in which operations
are to be performed if there are no parentheses forcing
an order of computation is called the precedence of the
operators.

If you have an ambiguous ©grammarª, that is, a grammar
with a number of ©conflictªs, you may use ©precedence
declarationªs to resolve the conflicts and to set
operator precedence.
##

Precedence Declaration

Precedence declarations are ©attribute statementsª which
may be used to resolve ©conflictªs in your grammar by
assigning precedence and associativity to operators.
Precedence declarations must be made inside
©configuration sectionsª. Each declaration consists of
the keyword ©leftª, ©rightª, or ©nonassocª followed by a
list of ©rule elementsª. The rule elements in the list
must be separated by commas and the entire list must be
enclosed in braces ({ }).

Each of the rule elements is assigned the same
precedence level, which is higher than that assigned in
all previous precedence declarations and lower than that
in all subsequent declarations. The rule elements are
defined to be left, right, or nonassociative,
depending on whether the keyword was "left", "right", or
"nonassoc".

All conflicts which are resolved by precedence
declarations are listed in the ©Resolved Conflictsª
window.
##

Precedence Rules

AnaGram can resolve certain types of ©conflictªs in your
grammar by applying precedence rules. There are three
classes of rules available:  explicit ©precedence
declarationsª, the "©stickyª" statement, and the
implicit rule associated with the use of a "©disregardª"
token outside a ©lexemeª.

Whenever AnaGram uses a precedence rule of any kind to
resolve a conflict, it produces a ©warningª message and
lists the conflict in the ©Resolved Conflictsª window.
##

Previous States

The Previous States window can be accessed via the
©Auxiliary Windowsª popup menu from any window that identifies
©parser stateªs. It shows the ©characteristic ruleªs
for all of the states which jump to the presently
selected state.
##

Print File Name

"Print file name" is a configuration parameter which
is not used in the Windows version of AnaGram. It is
retained only for compatibility with pre-existing
©configuration fileªs.
##

Problem States

The Problem States window is essentially a trimmed
version of the ©Reduction Statesª window. It is
available in the ©Auxiliary Windowsª popup menu for the
©Conflictsª and ©Resolved Conflictsª windows.

The Problem States window has the same format as the
Reduction States window, and differs only in that it
shows only those reduction states for which the
©conflict tokenª is acceptable input.
##

Production

Productions are the mechanism you use to describe how
complex input structures are built up out of simpler
ones. Each production has a left hand side and a right
hand side. The right hand side, or ©grammar ruleª, is a
sequence of ©rule elementsª, which may represent either
©terminal tokensª or ©nonterminal tokensª. The left
hand side is a list of ©reduction tokensª. In most
cases there would be only a single reduction token.
Productions with more than one ©tokenª on the left hand
side are called ©semantically determined productionsª.

The "->" symbol is used to separate the left hand side
from the right hand side. If you have several
productions with the same left hand side, you can avoid
rewriting the left hand side either by using '|' or by
using another "->".

A ©null productionª, or empty right hand side, cannot
follow a '|'.

Productions may be written thus:
  name
   -> letter
   -> name, digit

This could also be written
  name -> letter | name, digit

In order to accommodate semantic analysis of the data,
you may attach to any grammar rule a ©reduction
procedureª which will be executed when the rule is
identified. Each token may have a ©semantic valueª. By
using ©parameter assignmentªs, you may provide the
reduction procedure with access to the semantic values of
tokens that comprise the grammar rule. When it finishes, the
reduction procedure may return a value which will be
saved on the ©parser value stackª as the semantic value of the
©reduction tokenª.
##

Productions

The ©Productionªs window is available via the ©Auxiliary
Windowsª popup menu in any window which identifies tokens.
If the token identified by the highlighted line is
©nonterminalª, the Productions window will show the
rules produced by that ©tokenª.
##

PRULE_CONTEXT

PRULE_CONTEXT is an alternative form of the
©RULE_CONTEXTª macro. It differs only in that it takes
an argument so you can specify the appropriate ©parser control blockª
explicitly. PRULE_CONTEXT is defined in
the ©parser headerª file.
##

Quick Reference

"Quick reference" is an ©obsolete configuration switchª.
##

Range Bounds Out of Order

This is a ©warningª message that appears when you have a
©character rangeª of the form 'z-a'. AnaGram interprets
this range as being equal to 'a-z', but provides a
warning in case the unusual order was the result of a
clerical error.
##

Recursive Definition of Char Set

This ©warningª appears when AnaGram discovers a
recursively defined ©character setª. Character sets
cannot be defined recursively.
##

Redefinition

"Redefinition of <name>" is a ©warningª message which
appears when AnaGram discovers a redefinition of a
©symbolª. The new ©definitionª is ignored.
##

Redefinition of Grammar Token

This ©warningª appears when AnaGram encounters a new
definition of the ©grammar tokenª. AnaGram discards the
old definition. The last definition in the syntax file
wins. If you get this warning, check your ©syntax fileª
to make sure you have the grammar token you want.
##

Redefinition of token

"Redefinition of token, TXXX: <name>" is a ©warningª
message which occurs when AnaGram encounters a
©definitionª statement and the specified ©grammar tokenª
has already been seen on the left side of a
©productionª. AnaGram will ignore the definition
statement.
##

Reduce Action, Reduction

The reduce action, or reduction, is one of the four
actions of a traditional ©parsing engineª. The reduce
action is performed when the parser has succeeded in
matching all the elements of a ©grammar ruleª, and the
next input token is not erroneous. Reducing the grammar
rule amounts to subtracting the length of the rule from
the ©parser stack indexª, identifying the ©reduction
tokenª, stacking its ©semantic valueª and then doing a
©shift actionª with the reduction token as though it had
been input directly.
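
The steps can be modeled with a toy stack in C. This is only a sketch
of the idea, not AnaGram's generated code: the structure, the function
names, and the convention that the caller supplies the target state
are assumptions.

```c
#define STACK_DEPTH 32

typedef struct {
    int sn;                    /* current state number */
    int ssx;                   /* parser stack index */
    int ss[STACK_DEPTH];       /* state stack */
    int vs[STACK_DEPTH];       /* value stack */
} toy_parser;

/* Shift: stack the value and the current state, bump the index,
   and move to the next state. */
void toy_shift(toy_parser *p, int value, int next_state)
{
    p->vs[p->ssx] = value;
    p->ss[p->ssx] = p->sn;
    p->ssx++;
    p->sn = next_state;
}

/* Reduce: drop rule_length entries, return to the state that was
   current before the rule began, then shift the reduction token's
   value as though the token had been read from the input. */
void toy_reduce(toy_parser *p, int rule_length, int value, int goto_state)
{
    if (rule_length > 0) {
        p->ssx -= rule_length;
        p->sn = p->ss[p->ssx];
    }
    toy_shift(p, value, goto_state);
}
```

In a real parser the target state comes from the parse tables; here it
is passed in to keep the model small.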
##

Reduce-Reduce Conflict

A grammar has a "reduce-reduce" ©conflictª at some
state if a single token turns out to be a ©reducing
tokenª for more than one ©completed ruleª.
##

Reducing Token

In a ©parser stateª with more than one ©completed ruleª,
your parser must be able to determine which one was
actually found. Therefore, during analysis of your
grammar, AnaGram examines each completed rule in order
to determine all the states the ©parserª will branch to
once the rule is reduced. These states are called the
"reduction states" for the rule. In any window that
displays ©marked ruleªs, these states may be found in
the ©Reduction Statesª window listed in the ©Auxiliary
Windowsª popup menu.

The acceptable input tokens for those states are the
"reducing tokens" for the completed rules in the state
under investigation. If there is a single token which is
a reducing token for more than one rule, then the
grammar is said to have a ©reduce-reduce conflictª at
that state. If in a particular state there is both a
©shift actionª and a ©reduce actionª for the same token
the grammar is said to have a ©shift-reduce conflictª in
that state.

Note that a "reducing token" is not the same as a
"©reduction tokenª".
##

Reduction Choices

"Reduction choices" is a ©configuration switchª which
defaults to off. If it is set, AnaGram will include in
your ©parser fileª a function which will identify the
acceptable choices for ©reduction tokenª in the current
state. This function, of course, is useful only if you
are using ©semantically determined productionsª. The
prototype of this function is:
	int $_reduction_choices(int *);
where '$' represents the name of your parser. You must
provide an integer array whose length is at least as
long as the maximum number of reduction choices you
might have. The function will fill the array with
the token numbers of those which are acceptable in the
current state and will return a count of the number of
acceptable choices it found.
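
A minimal sketch of the calling side follows. The function
demo_reduction_choices is a stand-in for the generated function, and
the array bound and token numbers are invented for the example.

```c
#include <stdio.h>

#define MAX_CHOICES 8   /* assumed upper bound for this sketch */

/* Stand-in for the generated function. A real parser supplies
   $_reduction_choices; this stub and its token numbers are invented. */
int demo_reduction_choices(int *choices)
{
    choices[0] = 12;
    choices[1] = 17;
    return 2;               /* number of entries filled in */
}

/* Caller side: provide the array, then use only the first n entries. */
int print_choices(void)
{
    int choices[MAX_CHOICES];
    int n = demo_reduction_choices(choices);
    int i;

    for (i = 0; i < n; i++) {
        printf("acceptable reduction token: %d\n", choices[i]);
    }
    return n;
}
```

Only the first entries, up to the returned count, are meaningful; the
rest of the array is left untouched.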
##

reduction_token

"reduction_token" is a field in your ©parser control
blockª. If your grammar uses ©semantically determined
productionsª, your ©reduction procedureªs need a
mechanism to specify which token the rule is to reduce
to. ©PCBª.reduction_token names the variable which
contains the ©token numberª of the ©reduction tokenª.
Prior to calling your reduction procedure, your parser
will set this field to the token number of the default
©reduction tokenª, i.e., the leftmost syntactically correct token in the
reduction token list for the production being reduced.
If the reduction procedure establishes that a different
reduction token is appropriate, it should store the
appropriate token number in PCB.reduction_token.
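
The following sketch shows the idea with a mock control block. The
structure, the token numbers, and the semantic test are all
hypothetical; in a real parser PCB refers to the generated parser
control block.

```c
/* Mock of the one control block field used here; a real parser
   control block has many more fields. Token numbers are invented. */
typedef struct {
    int reduction_token;
} demo_pcb_type;

static demo_pcb_type demo_pcb;
#define PCB demo_pcb

enum { TYPE_NAME_TOKEN = 21, VARIABLE_NAME_TOKEN = 22 };

/* Hypothetical semantic test: is the identifier a known type name? */
static int is_type_name(const char *name)
{
    return name[0] == 'T';
}

/* Body of a reduction procedure: keep the default unless the
   semantic check says the other reduction token applies. */
void classify_identifier(const char *name)
{
    if (is_type_name(name)) {
        PCB.reduction_token = TYPE_NAME_TOKEN;
    }
}
```

The procedure writes to the field only when it wants to override the
default that the parser stored before the call.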
##

Reduction Procedures

The Reduction Procedures window lists the C function
prototypes for the ©reduction procedureªs in your grammar.

When this window is active, the ©syntax fileª window, if
visible, is synchronized with it so you can see the body of
the reduction procedure as well as its usage.
##

REDUCTION_TOKEN_ERROR

REDUCTION_TOKEN_ERROR is a user definable macro which your ©parserª
invokes when it encounters an inadmissible reduction
token. This error should occur only if your parser uses
©semantically determined productionsª and your
©reduction procedureª provides an incorrect ©token
numberª. If you do not define it, AnaGram will define
it so that it will print an error message on stderr and
abort the parse.
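
If you prefer your own handling, a definition along the following
lines is one possibility. The sketch assumes the macro takes no
arguments; the counter and the checking function are invented to
stand in for the place where a parser would invoke the macro.

```c
#include <stdio.h>

static int reduction_errors = 0;   /* counter invented for this sketch */

/* A user definition that records the problem instead of relying on
   the default. Assumes the macro takes no arguments. */
#define REDUCTION_TOKEN_ERROR \
    do { \
        reduction_errors++; \
        fprintf(stderr, "inadmissible reduction token\n"); \
    } while (0)

/* Stand-in for the point in the parser where the macro is invoked. */
int check_reduction_token(int token, int lowest, int highest)
{
    if (token < lowest || token > highest) {
        REDUCTION_TOKEN_ERROR;
        return 0;
    }
    return 1;
}
```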

##

Reduction Procedure, Semantic Action

A "reduction procedure", or "semantic action", is a
function you write which your ©parserª executes when it
has identified the grammar rule to which the reduction
procedure is attached in your grammar.

When your parser has identified a particular ©grammar
ruleª, that is to say, a particular sequence of ©tokensª
that you have specified in your grammar, it "reduces"
the production to the token at the head of the
production, or ©reduction tokenª.

If you choose, you can
specify a "reduction procedure" which your parser will
call so that your program can do semantic analysis on
the production just identified. Your reduction procedure
will be called using, as arguments, the ©semantic
valuesª of tokens on the right side of the production.

Your reduction procedure may, if you choose, return a
value which will become the semantic value of the
reduction token. Since many of the tokens in
©productionªs are there for only syntactic purposes, you
may specify, when you write your grammar, the tokens
whose values are needed as arguments for your reduction
procedure.

To attach a reduction procedure to a grammar rule, just
write it immediately following the rule. There
are two formats for reduction procedures,
depending on the size and complexity of the procedure.

The first form consists of an equal sign followed by a C
expression and a semicolon. When the rule is matched the
expression will be evaluated and its value will be
stacked on the ©parser value stackª as
the value of the reduction token. For example:
    =-a;
    =myProcedure(x, q);

The second form consists of an equal sign followed by a
block of C code enclosed in curly braces. If you wish to
return a value for the reduction token you have to use a
return statement. For example:
    ={
      if (x > y) return x;
      return x+2*y;
     }

In both forms of the reduction procedure, ©parameter
assignmentªs may be attached to ©rule elementªs in
order to make their semantic values available to the reduction
procedure. When the reduction procedure is executed,
local variables will be defined with the names specified
in the parameter assignments. The values of these
variables will have been set to the values of the
corresponding tokens.
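
Since reduction procedures end up as C functions, a block-form
procedure whose rule binds two int values to the names x and y
behaves roughly like the function sketched here. The function name is
hypothetical, and the parameter types are an assumption.

```c
/* Rough C equivalent of a block-form reduction procedure whose rule
   binds two int values to the names x and y. The function name is
   hypothetical; generated names will differ. */
int demo_reduction_procedure(int x, int y)
{
    if (x > y) return x;
    return x + 2 * y;
}
```

The value returned is what would be stacked as the semantic value of
the reduction token.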

If the return value of your reduction procedure is a
C++ object, you may wish to specify that AnaGram
enclose it in a ©wrapperª so that constructor calls
and destructor calls are made. Otherwise the object is
pushed onto and popped from the parser value stack simply by
coercing the stack pointer to the appropriate type.

The reduction procedures in your grammar are summarized
in the ©Reduction Proceduresª window.
##

Reduction States

The Reduction States window can be accessed via the
©Auxiliary Windowsª popup menu from any window which displays
©parser stateª numbers and ©marked ruleªs. If the highlighted
©grammar ruleª has no marked token, the Reduction States window will
show the states the parse could reach by reducing the rule and
processing the resultant ©reduction tokenª.
##

Reduction Token

A ©tokenª which appears on the left hand side of a
©productionª is called a reduction token. It is so
called because when the ©grammar ruleª on the right side
of the production is matched in the input stream, your
©parserª will "reduce" the sequence of tokens which
matches the rule by replacing the sequence of tokens
with the reduction token.

If more than one
reduction token is specified,
the production is called a ©semantically determined productionª
and your ©reduction procedureª
should choose the appropriate reduction token. If it does not, your parser
will use the first token in the list that is syntactically
correct as the default.

The ©CHANGE_REDUCTIONª macro can be used to specify the reduction
token.

Note that a "reduction token" is not the same as a
"©reducing tokenª".
##

Reduction Trace

The Reduction Trace window is available from the
©Conflictsª window and the ©Resolved Conflictsª window.
It can be used in conjunction with the ©Conflict Traceª
to study ©conflictªs. The Reduction Trace represents the
result of taking the reduce option in the conflict state
of the Conflict Trace.
##

Reentrant Parser

"Reentrant parser" is a ©configuration switchª which defaults to off.
If it is on when AnaGram builds a parser, AnaGram will generate code that
passes the pointer to the ©parser control blockª via calling sequences,
rather than using static references to the pcb.

You can use the reentrant parser switch to help make ©thread safe
parsersª.

The reentrant parser switch is compatible with both C and C++.

The reentrant parser switch cannot be used in conjunction with
the ©old styleª switch.

When you have enabled the reentrant parser switch, the ©parse functionª,
the ©initializerª function, and the ©parser value functionª
will be defined to take a pointer to the parser control block as
their sole argument.
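
The calling convention can be sketched with a mock control block. The
structure and the three function names are invented; only the fact
that each function receives a pointer to its own control block is
taken from the description above.

```c
/* Mock control block and parser entry points; names are invented
   and only the calling convention is of interest. */
typedef struct {
    int exit_flag;
    int value;
} demo_pcb_type;

/* Stand-in initializer: reset one control block. */
void init_demo(demo_pcb_type *pcb)
{
    pcb->exit_flag = 0;
    pcb->value = 0;
}

/* Stand-in parse function: touches only the block it was given. */
void demo(demo_pcb_type *pcb)
{
    pcb->value += 1;
    pcb->exit_flag = 1;
}

/* Stand-in value function. */
int demo_value(demo_pcb_type *pcb)
{
    return pcb->value;
}

/* Two control blocks can be live at once without interfering. */
int run_two_parsers(void)
{
    demo_pcb_type first, second;

    init_demo(&first);
    init_demo(&second);
    demo(&first);                    /* only "first" is run */
    return demo_value(&first) - demo_value(&second);
}
```

Because no static control block is involved, each thread or nested
parse can own a separate block.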
##

Reload Button

The ©File Traceª window includes a reload button to allow
you to reread your ©test fileª after you have modified
it without having to start a new file trace. After the
file has been reread, the file trace is reset.
##

rename macro

AnaGram uses a number of macros in its generated code.
It is possible, therefore, to run into naming
collisions with other components of your program. The
rename macro ©attribute statementª allows you to change
the name AnaGram uses for a particular macro to avoid
these problems.

For example, in the Microsoft
Foundation Classes, V4.2, there is a class called
"CONTEXT". If you use the ©context stackª option in
AnaGram, your ©parserª will have a macro called
©CONTEXTª. To avoid the name collision, add the
following attribute statement to any configuration
section in your grammar:
	rename macro CONTEXT AG_CONTEXT
Then, simply use "AG_CONTEXT" where you would otherwise
have used "CONTEXT".
##

reserve keywords

"reserve keywords" is an ©attribute statementª which
can be used to specify a list of ©keywordªs that are
reserved and cannot be used except as explicitly
specified in the grammar. In particular this switch
enables AnaGram to avoid issuing meaningless ©keyword
anomalyª warnings.

AnaGram does not automatically presume that keywords
are also reserved words, since in many grammars there
is no need to specify reserved words.

Reserve keywords statements must be made inside
©configuration sectionsª. Each statement consists of
the keyword "reserve keywords" followed by a list of
keyword ©tokensª.  The tokens must be separated by
commas and the list must be enclosed in braces ({ }).
Each keyword listed will then be treated as a reserved
word.
##

Reset Button

The Reset button, found on ©File Traceª and ©Grammar
Traceª windows, restores the initial configuration of
the trace. This is especially convenient for ©Conflict
Traceª or other ©Auxiliary Traceªs.
##

Resolved Conflicts

AnaGram creates the Resolved Conflicts window only when
the grammar it is analyzing has ©conflictªs and when
those conflicts have been resolved by ©precedence
declarationªs, by "©stickyª" statements, or in
connection with the explicit use of a token specified in
a ©disregardª statement. The Resolved Conflicts window
shows the conflicts that have been resolved, using the
same format as that of the ©Conflictsª Window. The rule
chosen is marked with an asterisk in the leftmost column
of the window.
##

Resynchronization

"Resynchronization" is the process of getting your
parser back in step with its input after encountering a
©syntax errorª. As such, it is one method of ©error
recoveryª. Of course, you would resynchronize only if it
is necessary to continue after the error. There are
several options available when using AnaGram. You could
use the ©auto resynchª option, which causes AnaGram to
incorporate an automatic resynchronizing procedure into
your parser, or you could use the ©error token
resynchronizationª option, which is similar to the
technique used by YACC programmers.
##

right

"right" controls a ©precedence declarationª, indicating
that all of the listed ©rule elementsª are to be
considered ©right associativeª.
##

Right Associative

A binary operator is said to be right associative if
an expression with repeated instances of the operator
is to be evaluated from the right. Thus, for example,
when '=' is used as an assignment operator
	x = a = b
is normally taken to mean a = b followed by x = a. The
assignment operator is said to be right associative.
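
C treats assignment this way, so the grouping can be checked with a
small sketch. The function name is made up for the illustration.

```c
/* x = a = b groups as x = (a = b): b is stored in a first,
   and the result of that assignment is then stored in x. */
int chained_assignment(int b)
{
    int a = 0;
    int x = 0;

    x = a = b;
    return x + a;     /* both now hold b */
}
```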

In ©grammarªs with ©conflictªs, you may use ©precedence
declarationªs to specify that an operator should be
right associative.
##

Rule Context

The Rule Context window can be accessed via the
©Auxiliary Windowsª menu in any window that displays
©grammar ruleªs. AnaGram displays all occurrences in the
©grammarª of all the ©reduction tokenªs for the rule.
##

RULE_CONTEXT

RULE_CONTEXT is a macro you may use if you have defined
a ©context stackª. In any reduction procedure,
RULE_CONTEXT will be a pointer to the context value
stacked before the first token of the rule being
reduced. Since the context stack contains an entry for
each token in the rule, you may inspect the context
value for each token in the rule by subscripting
RULE_CONTEXT. RULE_CONTEXT[k] is the context of the
(k+1)th token in the rule, so RULE_CONTEXT[0] is the
context of the first token.
##

Rule Coverage

"Rule Coverage" is the name of both a ©configuration
switchª and a window. The configuration switch
defaults to off. If you set it, AnaGram will include
code in your ©parserª to count the number of times your
parser identifies each ©grammar ruleª in your grammar.
To maintain the counts, AnaGram declares, at the
beginning of your parser, an integer array, whose name
is created by appending "_nrc" to your ©parser nameª.
The array contains one counter for each rule you have
defined in your grammar. There are no entries for the
auxiliary rules that AnaGram creates to deal with set
overlaps or ©disregardª statements. In order to identify
positively all the rules that the parser reduces,
AnaGram has to turn off certain optimization features in
your parser. Therefore a parser that has rule coverage
enabled will run slightly slower than one with the
switch off.

In addition, AnaGram creates a pair of functions to
write the counters to a file and to initialize the
counters from a file. The names of these functions are
given by appending "_write_counts" and "_read_counts" to
the name of your parser. The name of the file is given by the
©coverage file nameª parameter, which defaults
to the name of your ©syntax fileª but with the extension ".nrc".
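
The bookkeeping can be modeled in plain C as follows. This is not
AnaGram's generated code: the array name demo_nrc, the explicit path
argument, the return codes, and the text file format are all
assumptions made for illustration.

```c
#include <stdio.h>

#define N_RULES 4                 /* invented rule count */

static int demo_nrc[N_RULES];     /* one counter per grammar rule */

/* Called wherever the model "reduces" a rule. */
void count_rule(int rule_number)
{
    demo_nrc[rule_number]++;
}

/* Write the counters as text, one per line. Returns 0 on success. */
int demo_write_counts(const char *path)
{
    FILE *f = fopen(path, "w");
    int i;

    if (f == NULL) return -1;
    for (i = 0; i < N_RULES; i++) fprintf(f, "%d\n", demo_nrc[i]);
    return fclose(f) == 0 ? 0 : -1;
}

/* Initialize the counters from a file written by demo_write_counts. */
int demo_read_counts(const char *path)
{
    FILE *f = fopen(path, "r");
    int i, ok = 1;

    if (f == NULL) return -1;
    for (i = 0; i < N_RULES; i++) {
        if (fscanf(f, "%d", &demo_nrc[i]) != 1) { ok = 0; break; }
    }
    fclose(f);
    return ok ? 0 : -1;
}
```

Reading the counts before a run and writing them afterwards lets the
totals accumulate over several test files.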

If rule coverage is enabled, AnaGram will also enable the
Rule Coverage option on the ©Browse Menuª. If you select
Rule Coverage, AnaGram will initialize a ©Rule Coverageª
window from the rule count file you select.

AnaGram will
warn you if the rule count file is older than
the syntax file, since under those conditions, the
coverage file might be invalid.
##

Rule Derivation, Token Derivation

You can use the Rule Derivation and Token Derivation
windows to understand the nature of ©conflictªs in your
grammar. To create these windows, open the ©Conflictsª
window. Move the cursor bar to a ©completed ruleª, that
is, one which has no marked token. Press the right mouse button to pop
up the ©Auxiliary Windowsª menu. You may then select the Rule
Derivation or the Token Derivation.

The Rule Derivation window and the Token Derivation
window, together, show how a ©conflictª, or ambiguity,
has arisen in your grammar. Both windows contain a
sequence of rules, and both begin with the same rule,
the rule which is the root cause of the conflict.

Each subsequent line in the rule derivation is an
©expansionª of the marked token in
the previous rule. The last rule in the derivation
window is the rule you selected in the Conflicts
window. Thus the rule derivation window shows you how
the rule involved in the conflict derives from the
root.

Each subsequent line in the token derivation window
shows an expansion of the marked token in the previous rule. The first
token of the last rule in the derivation window is the token that
causes the conflict. This is the usage that is inconsistent with other
usages of this token in the conflict state.

The Rule Derivation and Token Derivation windows each
have five auxiliary windows. The ©Rule Contextª window
is keyed to the highlighted rule. The other four
windows, the ©Expansion Rulesª window, the
©Productionsª window, the ©Set Elementsª window and the
©Token Usageª window are keyed to the marked token.
Remember that there is no marked token on the last
line of the Rule Derivation window.
##

Rule Element

A ©grammar ruleª is a list of "rule elements", separated
by commas. Rule elements may be ©token nameªs,
©character setsª, ©keywordªs, ©immediate actionªs, or
©virtual productionsª. When AnaGram encounters a rule
element for which no token presently exists, it creates
one.

Any rule element may be followed by a ©parameter assignmentª
in order to make the ©semantic valueª of
the rule element available to a ©reduction procedureª.
##

Rule Number

AnaGram assigns a unique rule number to each ©grammar
ruleª that you specify in your grammar. Rules are
numbered sequentially as they are encountered in the
©syntax fileª. AnaGram constructs rule 0 itself. Rule
zero has a single ©rule elementª, the ©grammar tokenª,
unless you have a ©disregardª statement in your
grammar. In this case, there will be two elements.

In AnaGram displays, rule numbers are displayed with a
prefixed 'R' and a three digit decimal number.
##

Rule Stack, Rule Stack Pane

The Rule Stack pane appears across the bottom of a ©Grammar
Traceª or ©File Traceª window. It provides an alternate view of the
parser stack for the trace, showing, for each state, rules instead of
the tokens that you see in the ©Parser Stack paneª. Because it is
synched with the syntax file window, the Rule Stack makes it easy to
see the relationship between the trace and your grammar.

For each level of the parser stack, the Rule Stack shows the ©parser
stateª number and all the active rules. The active rules at any
state consist of all the ©expansion ruleªs for the state that are
consistent with the input at all subsequent states.

Except for the last level
of the stack, each rule has a ©marked tokenª, which in the default
configuration is displayed in bold, italic type. The significance of
the marked token is that all tokens in the rule to the left of the
marked token have already been matched in the input, and the input
in subsequent levels is consistent so far with the marked
token. As more input is processed, rules
that are inconsistent with the new input are deleted from the display.

The last level of the stack shows the current state of the parser and
the rules against which the ©lookahead tokenª will be matched. At
this level, there may be rules with no marked tokens. These are
rules which have been matched exactly in the input. If there is
more than one such rule, at the next parser step the parser will use
the lookahead token to determine which rule to reduce.

In the last level of the stack, marked tokens represent the input the
parser expects to see.

The Rule Stack pane is synched with the ©syntax fileª window if it is
visible so that the rule highlighted in the Rule Stack can be seen
in context in the syntax file.
For rules that AnaGram
generated automatically (to implement ©virtual productionsª
or the ©disregardª statement), the cursor bar will move to the
top of the syntax file window.

The Rule Stack pane is also synched with the other panes in the trace.
As you move the cursor bar in the Rule Stack, the cursor bar in the
Parser Stack pane will track the stack level in the Rule Stack. In
a File Trace, text will be highlighted in the ©Test Fileª pane
corresponding to the selected token in the Parser Stack pane. In a
Grammar Trace, the marked token in the highlighted rule will be
highlighted in the ©Allowable Input paneª.

Clicking the right mouse button pops up an ©Auxiliary Windowsª menu to
give you more information about the highlighted rule.
##

Rule Table

The Rule Table lists, in numerical order, all the
©grammar ruleªs defined in your ©grammarª. Each rule is
preceded by the ©nonterminalª tokens which produce it.
If you are not using ©semantically determined
productionªs, then there will be precisely one token
line per rule. The Rule Table is synched to your ©syntax
fileª to show the rule in context.
##

Semantic Value, Token Value

A ©tokenª generally has a "semantic value", or "token
value", as well as the ©token numberª which identifies
it syntactically.  Each instance of the token in the
input stream can have a different value. For example,
you might have a token called "variable name". In one
instance the variable name might be "widget" and in
another, "wombat". Then "widget" and "wombat" would be
the semantic values in the two instances. Another token
might have numeric semantic values.

You can specify the C or C++ ©data typeª of the token value.
The data type of "variable name" could be "char *"
where the value is a pointer to a string holding the name. There
are separate default types for the values of ©terminalª
and ©nonterminalª tokens. In the usual case of ordinary
character input, the value of a terminal token is just
the ASCII character code.

The value of a nonterminal token is determined by the ©reduction procedureªs
attached to the rules the token produces. If there is no reduction
procedure, the value of the token is the value of the first token
in the rule.

It should be noted that the stack operations have been
implemented in such a way that a C++ object that belongs
to a class for which the assignment operator has been
overridden will encounter serious problems. This shortcoming
will be addressed in a future version of AnaGram. Note that
there is no problem with using a pointer to any C++ object.
##

Semantically Determined Production

A "semantically determined production" is one which has
more than one ©reduction tokenª specified on the left
side of the ©productionª. You would write such a
production when the reduction tokens are syntactically
indistinguishable. The ©reduction procedureª may then
specify which of the listed reduction tokens the grammar
rule is to reduce to based on semantic considerations.
If there is no reduction procedure, or the reduction
procedure does not specify a reduction token, the parser
will use the first syntactically correct one in the list.

To simplify changing the reduction token, AnaGram
provides a predefined macro, ©CHANGE_REDUCTIONª.

The ©semantic valueªs of all the reduction tokens for a
given semantically determined production must have the
same ©data typeª.

©File Traceª and ©Grammar Traceª have a ©Reduction Choices paneª which
appears when a semantically determined production is invoked and
you need to choose a reduction token.
##

Set Elements

The Set Elements window is available via the ©Auxiliary
Windowsª popup menu from windows which specify character sets,
partition sets or tokens. It displays the actual
characters which make up the set, or which map to the
specified token. For each character, the numeric code as
well as its display symbol is given.
##

Set Expression, Expression

A set expression is an algebraic expression used to
define a ©character setª in terms of individual
characters, ranges of characters, or other sets of
characters as constructed using ©complementsª, ©unionsª,
©intersectionsª, and ©differencesª.
##

Shift Action

The shift action is one of the four actions of a
traditional ©parsing engineª. The shift action is
performed when the input token matches one of the
acceptable input tokens for the current ©parser stateª.
The ©semantic valueª of the token and the current
©state numberª are stacked, the ©parser stack indexª is
incremented and the state number is set to a value
determined by the previous state and the input token.
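
A toy model of the shift step is sketched below. The transition
table, the structure, and the function name are invented; a generated
parser takes the target state from its own tables.

```c
#define STACK_DEPTH 16
#define N_STATES 3
#define N_TOKENS 2

/* Invented transition table: next state for (state, token);
   -1 marks a token that is not acceptable in that state. */
static const int next_state[N_STATES][N_TOKENS] = {
    {  1, -1 },
    {  1,  2 },
    { -1, -1 }
};

typedef struct {
    int sn;                 /* current state number */
    int ssx;                /* parser stack index */
    int ss[STACK_DEPTH];    /* state stack */
    int vs[STACK_DEPTH];    /* value stack */
} toy_pcb;

/* Returns 1 if the token was shifted, 0 if it is not acceptable in
   the current state or the toy stack is full. */
int toy_shift(toy_pcb *pcb, int token, int value)
{
    int target = next_state[pcb->sn][token];

    if (target < 0 || pcb->ssx >= STACK_DEPTH) return 0;
    pcb->vs[pcb->ssx] = value;      /* stack the semantic value */
    pcb->ss[pcb->ssx] = pcb->sn;    /* stack the current state */
    pcb->ssx++;                     /* increment the stack index */
    pcb->sn = target;               /* enter the new state */
    return 1;
}
```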
##

Shift-Reduce Conflict

A "shift-reduce" ©conflictª occurs if in some ©parser
stateª there exists a ©terminal tokenª that should be
shifted, because it is legitimate input for one of the
©grammar ruleªs of the state, but should also be used to
reduce some other rule because it is a ©reducing tokenª
for that rule.
##

sn

sn is a field in a ©parser control blockª to which your
©error handlingª routines and your ©reduction
procedureªs may refer. Its value is the current ©state
numberª of your ©parserª. sn is modified every time
your parser "shifts" (performs a ©shift actionª on) a
token or reduces (performs a ©reduce actionª on) a
©productionª.
##

ss

ss is a field in a ©parser control blockª to which your
©error handlingª and ©reduction procedureªs may refer.
It is the ©state stackª for your ©parserª. Before every
©shift actionª, the current ©state numberª, ©snª, is
stored in PCB.ss[PCB.ssx], where ©ssxª is the ©parser
stack indexª. PCB.ssx is then incremented.
##

ssx

ssx is a field in a ©parser control blockª to which
your ©error handlingª routines and ©reduction
procedureªs may refer. It is the ©parser stack indexª
for your ©parserª. On every ©shift actionª it is
incremented. On every ©reduce actionª the length of
the ©grammar ruleª being reduced is subtracted from
PCB.ssx.
##

State Definition

The State Definition window can be accessed via the
©Auxiliary Windowsª popup menu from any window that specifies
states. It displays the ©characteristic rulesª that
define the state. The rules are displayed with a marked token, which is
the next token needed in the input if the particular ©grammar ruleª is
to be matched. If the rule is a completed rule, no token will be
marked.

Each line contains the state number, blank if it is the
same as the state number of the previous line, the ©rule
numberª, and finally the ©marked ruleª.

The ©State Definition Tableª, found in the ©Browse
Menuª, displays the characteristic rules for all states
in the ©grammarª.
##

State Definition Table

The State Definition Table lists, for each ©parser
stateª, all of the ©characteristic rulesª which define
that state. The rules are displayed with a ©marked tokenª, which is the
next token needed in the input if the particular ©grammar ruleª is to
be matched. If the rule is a completed rule, no token will be
marked.

Each line contains the state number, blank if it is the
same as the state number of the previous line, the ©rule
numberª, and finally the ©marked ruleª.

In the ©Auxiliary Windowsª menu for many states there is
a ©State Definitionª entry which provides the
characteristic rules for the ©parser stateª identified by
the cursor bar.
##

State Expansion

The State Expansion window may be accessed using the
©Auxiliary Windowsª menu from any window that identifies
a particular ©parser stateª. It shows the complete set
of ©expansion ruleªs for the state, consisting of the
union of the set of ©characteristic ruleªs and, for each
characteristic rule, the set of expansion rules for the
marked token. Thus the State
Expansion window shows all possible legal input to your
parser in the given state.
##

Sticky

"Sticky" statements are ©attribute statementªs and may
be used just like a ©precedence declarationª to resolve
©conflictªs. If a ©shift-reduce conflictª occurs in a
state where the ©characteristic tokenª is "sticky", the
shift action will always be chosen.

Sticky statements must be made inside ©configuration
sectionsª. Each statement consists of the keyword
"sticky" followed by a list of ©tokensª. The tokens must
be separated by commas and the list must be enclosed in
braces ({ }). Each token will then be treated as sticky.

All conflicts which are resolved by sticky statements
are listed in the ©Resolved Conflictsª window.
##

subgrammar

Declaring a nonterminal token to be a "subgrammar"
changes the way AnaGram searches for reducing tokens.

Normally, if there is a completed rule in a particular
state, AnaGram investigates all states to which the
parser could jump on reducing the rule. It then
considers all terminal tokens that are acceptable input
in these states to be reducing tokens for the given
rule. If this set of tokens overlaps the set of tokens
for which there are shift actions, or the set of tokens
which reduce a different rule, there is a ©conflictª.

Now consider a particular nonterminal token T and all
the rules it produces, whether directly or indirectly.
What the preceding remarks mean is that in determining
the reducing tokens for any of these rules, AnaGram
considers not only the definition, but also the usage
of T.

There are circumstances when it is inappropriate to
consider the usage of T. The most common example occurs
when building a lexical scanner for a language such as
C. In this case, you can write a complete grammar for a
C token with no difficulty. But if you try to extend it
to a sequence of tokens, you get scores of conflicts.
This situation arises because you specify that any C
token can follow another, when in actual practice, an
identifier, for example, cannot follow another
identifier without some intervening space or
punctuation. While it is theoretically possible to write
a grammar for a sequence of tokens that has no
conflicts, it is not usually pretty.

The subgrammar declaration resolves this problem by
telling AnaGram that when it is looking for reducing
tokens for any rule produced directly or indirectly by a
subgrammar token, it should disregard the usage of the
token and only consider usage internal to the definition
of the subgrammar token, as though the subgrammar token
were the start token of the grammar.

The subgrammar declaration is made in a ©configuration
sectionª and consists of the keyword "subgrammar"
followed by a list of token names separated by
commas and enclosed in braces ({ }). For example:
	subgrammar {name, number}
##

Suspicious Production

This ©warningª message appears when AnaGram finds a
©productionª of the form x -> x. There is probably a
typo somewhere in your ©syntax fileª. This production
causes a ©conflictª in your grammar. AnaGram leaves
this production in your ©grammarª, but if you build a
parser, it will never succeed in recognizing this
production.
##

Switch Takes on/off Values Only

The specified parameter is a ©configuration switchª. The
only values it may be assigned are ON and OFF.

##

Symbol

In writing your ©grammarª you use symbols, or names, to
represent most of your ©tokensª. You may also use
symbols to represent ©character setªs, ©virtual
productionªs, ©immediate actionªs, or ©keywordªs.

A symbol, or name, must begin with a letter or an
underscore. It may then contain any number of these
characters as well as digits and embedded white space
(including comments). For identification purposes all
adjacent white space characters within a symbol name
are considered to be a single blank.

Upper case and lower case letters are considered to be
different.

Examples:
	token name
	token/*embedded comment*/name

 All symbols used in your grammar are listed in
the ©Symbol Tableª window found in the ©Browse Menuª.
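
The white space rule above can be sketched in C. This is only an
illustration, not AnaGram's own code: normalize_symbol is a hypothetical
helper, and it assumes that a C-style comment embedded in a symbol name
counts as white space, as the examples suggest.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of AnaGram: copy src into dst,
   replacing every run of white space and C-style comments with a
   single blank. dst must have room for strlen(src) + 1 bytes. */
static void normalize_symbol(const char *src, char *dst)
{
    size_t out = 0;
    int pending_blank = 0;

    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        if (isspace((unsigned char)*src)) {
            /* White space: remember it, emit at most one blank later. */
            pending_blank = 1;
            src++;
        } else if (src[0] == '/' && src[1] == '*') {
            /* Comment: skip to the closing delimiter (or end of text). */
            const char *end = strstr(src + 2, "*/");
            pending_blank = 1;
            src = end ? end + 2 : src + strlen(src);
        } else {
            if (pending_blank && out > 0)
                dst[out++] = ' ';
            pending_blank = 0;
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
}
```

With this helper, both example spellings above reduce to the same
string, "token name".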
##

Symbol Table

The Symbol Table lists all the symbols, or names, you
used in your grammar. ©Symbolªs may be used, of course,
to identify ©tokensª, ©definitionsª, ©virtual
productionsª, ©immediate actionªs, or ©keywordªs.

Each line in this table identifies a single symbol. The
first field is the token number, if any. This is
followed by the name. If the name identifies an
©expressionª or virtual production, it is followed by an
equal sign and the expression or virtual production.
##

Syntax Analysis Aborted

This ©warningª message appears if, because of previous
errors, AnaGram is unable to complete the ©Analyze
Grammarª command on your ©syntax fileª.
##

Syntax Directed Parsing

Syntax directed parsing, or formal parsing, is an
approach to building ©parsersª based on formal language
theory. Given a suitable description of a language,
called a ©grammarª, there are algorithms which can be
used to create parsers for the language automatically.
In this context, the set of all possible inputs to a
program may be considered to constitute a language, and
the rules for formulating the input to the program
constitute the grammar for the language.

The parsers built from a grammar have the advantage
that they can recognize any input that conforms to the
rules, and can reject as erroneous any input that fails
to conform.

Since the program logic necessary to parse input is
often extremely intricate, programs which use formal
parsing are usually much more reliable than those built
by hand. They are also much easier to maintain, since
it is much easier to modify a grammar specification
than it is to modify complex program logic.
##

Syntax Error

When you specify a ©grammarª, you specify a set of
input character or token sequences which your ©parserª
will "recognize". Usually it is possible for there to
be other sequences of input tokens which deviate from
the rules set down by your grammar. Should your parser
find such a sequence in its input which is not
explicitly allowed for in your grammar, it is said to
have found a "syntax error". The general treatment of
syntax errors is called ©error handlingª, of which there
are two distinct aspects: ©error diagnosisª and ©error
recoveryª. AnaGram allows you to make provision for
error handling to fit your needs, but should you not do
so, it will provide simple default error handling.
##

Statements

AnaGram source files, or ©syntax fileªs, consist of
the following types of statements:
	 ©productionªs
	 ©configuration sectionªs
	 ©embedded Cª
	 ©definitionªs
	 ©token declarationªs

 Statements may be in any order. Each statement must
begin on a new line. If a statement cannot be
construed as complete, it may continue onto another
line.

Statements may contain spaces, tabs or comments, but
may not contain blank lines.
##

Syntax File

Input files to AnaGram are called syntax files. The
default extension for syntax files is .syn. A
syntax file contains a "©grammarª" and supporting C or
C++ code.  The file consists of several distinct types
of statements. These are ©token declarationsª,
©productionªs, ©definitionsª, ©embedded Cª, and
©configuration sectionsª. There may be as many of each
as you need, in whatever order you find convenient.

Each such statement begins on a new line.
##

SYNTAX_ERROR

SYNTAX_ERROR is a macro which your parser will invoke
when it encounters a syntax error in its input stream.
If you have set the ©diagnose errorsª ©configuration
switchª, the ©PCBª.©error_messageª field will
contain a pointer to a diagnostic message when
SYNTAX_ERROR is invoked. If you have also set the
©error frameª switch, ©PCBª.©error_frame_ssxª and
©PCBª.©error_frame_tokenª will also be set
appropriately.
##

Tab Spacing

"tab spacing" is a ©configuration parameterª which
controls the expansion of tabs when AnaGram displays
your source file or test files in the ©File Traceª window.

The value of "tab spacing" is also used to set the
default value of the ©TAB_SPACINGª macro in your parser.

The default value of "tab spacing" is 8. If you prefer
a different value, you should probably include an
appropriate statement in your ©configuration fileª. For
example:

	tab spacing = 2
##

TAB_SPACING

If you have enabled the ©lines and columnsª switch, your
parser needs to know tab spacing in order to increment
the column count when it encounters a tab character. It
is set up to use the value given by the TAB_SPACING
macro. If you do not define TAB_SPACING in your parser,
AnaGram will provide a default definition, setting it to
the value of the ©tab spacingª ©configuration
parameterª.
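
As a rough sketch of how a tab character might advance the column
count, consider the following C fragment. It is an illustration only:
column_after_tab is a hypothetical helper, columns are assumed to be
numbered from 1, and tab stops are assumed to fall every TAB_SPACING
columns. The code in your generated parser may differ.

```c
#include <assert.h>

#ifndef TAB_SPACING
#define TAB_SPACING 8   /* default value of the "tab spacing" parameter */
#endif

/* Hypothetical helper: given the 1-based column at which a tab
   character was found, return the column of the next character. */
static int column_after_tab(int column)
{
    assert(column >= 1);
    return column + TAB_SPACING - (column - 1) % TAB_SPACING;
}
```

Under these assumptions, with TAB_SPACING set to 8, a tab found
anywhere in columns 1 through 8 moves the column count to 9.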
##

Terminal, Terminal Token

A "terminal token" is a token which does not appear on
the left side of a ©productionª. It represents,
therefore, a basic unit of input to your ©parserª.  If
the input to your parser consists of ASCII characters,
you may define terminal tokens explicitly as ASCII
characters or as sets of ASCII characters. If you have a
lexical scanner, or preprocessor, which produces numeric
codes, you may define the terminal tokens directly in
terms of these numeric codes.
##

Test File Binary

"Test file binary" is a ©configuration switchª which
defaults to off. When it is off, and you select the
©File Traceª option, AnaGram will read your test files
in "text" mode, discarding carriage return characters.
When "test file binary" is on, AnaGram will read test
files in "binary" mode, preserving carriage return
characters.

If your parser needs to recognize carriage return
characters explicitly, you should turn "test file
binary" on.
##

Test File Mask

"Test file mask" is a string-valued ©configuration
parameterª which AnaGram uses to set up the file dialog
for the ©File Traceª command. It defaults to "*.*". If
there is a conventional file name format for the input
to the ©parserª you are developing, you will probably
want to set "test file mask" in a ©configuration
sectionª in your ©syntax fileª so it is easier to pick
out your test files.
##

Test range

"Test range" is a ©configuration switchª which defaults
to on. When it is set, i.e., on, AnaGram will configure
your parser so that it checks input characters to
verify that they are within the range given by the
©character universeª before it indexes the ©token
conversionª table. If range testing is not necessary
for your parser, you may turn test range off and get a
slight improvement in the performance of your parser.
##

Thread Safe Parsers

AnaGram 2.01 incorporates several changes designed to make it
easier to write thread safe parsers.

First, the ©parserªs generated by AnaGram 2.01 no longer use static or global
variables to store temporary data. All nonconstant data have been
moved to the ©parser control blockª.

Second, two new features which make it substantially
easier to build thread safe parsers have been added. The ©reentrant parserª switch
makes the entire parser reentrant, by passing the pointer to the parser control
block as an argument on all function calls. The ©extend pcbª statement allows
you to add your own variable declarations to the ©parser control
blockª so you can avoid references to global or static variables in
your ©reduction procedureªs.

Third, new support has been added for C++ classes, including
the ©wrapperª statement and the ©PCB_TYPEª macro.
##

token_number

token_number is a field in a ©parser control blockª to
which your ©error handlingª procedures and ©reduction
procedureªs may refer. It contains the actual ©token
numberª of the current input token. Unless you are supplying
token numbers directly, it is the result of using the
actual input character to index the ©token conversionª
array, ag_tcv.
##

Token

Tokens are the units with which your parser works.
There are two kinds of tokens: ©terminal tokensª and
©nonterminal tokensª. These latter are identified by the
parser as sequences of tokens. The grouping of tokens
into more complex tokens is governed by the ©grammar
rulesª, or ©productionªs in your grammar. In your
grammar, tokens are denoted by ©token nameªs, ©virtual
productionsª, explicit ©character representationsª,
©keywordªs, ©immediate actionªs, or ©expressionªs which
yield ©character setsª.
##

Token Conversion

By using ©character setª ©expressionªs, you may in your
©syntax fileª define a number of input characters as
being syntactically equivalent. When your ©parserª gets
an input character, it uses the character code to index
a table called ©ag_tcvª. The value it extracts from this
table is the ©token numberª for the input character. The
actual character code of the input character becomes the
©token valueª.
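
The table lookup can be pictured with a small C sketch. Everything
here is made up for illustration: demo_tcv stands in for the table
AnaGram generates, and the three token numbers are arbitrary. The
assertion plays the role of a range check against a 256 character
universe.

```c
#include <assert.h>
#include <ctype.h>

/* Made-up token numbers for this sketch only. */
enum { DEMO_OTHER = 0, DEMO_LETTER = 1, DEMO_DIGIT = 2 };

/* Stand-in for the generated token conversion table. */
static unsigned char demo_tcv[256];

/* Fill the table so that all letters are syntactically equivalent,
   and likewise all digits. */
static void demo_build_tcv(void)
{
    int c;
    for (c = 0; c < 256; c++) {
        if (isalpha(c))
            demo_tcv[c] = DEMO_LETTER;
        else if (isdigit(c))
            demo_tcv[c] = DEMO_DIGIT;
        else
            demo_tcv[c] = DEMO_OTHER;
    }
}

/* Convert a character code to a token number by indexing the table. */
static int demo_token_number(int input_code)
{
    assert(input_code >= 0 && input_code < 256);
    return demo_tcv[input_code];
}
```

In this sketch 'a' and 'Z' yield the same token number, while the
character code itself remains available as the token value.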
##

Token Declaration

A token declaration is simply a ©productionª with no
right hand side. Token declarations can be used to
define the ©data typeªs of tokens. To define the data type
of a token, simply put the data type in parentheses
preceding the name of the token. You can use a list of
tokens joined by commas, if you wish.  Thus:
	(char *) variable name, function name
could be used to specify that the ©semantic valueªs of
the tokens "variable name" and "function name" are both
character pointers.

Of course, token types may be specified as part of any
production the token generates, but sometimes, in the
interest of clarity, it is advisable to group all
declarations together.
##

Token Name

All ©nonterminal tokensª that you define in your
©grammarª by means of explicit ©productionªs must have
names by which they may be referenced. Token names are
©symbolsª which represent the token syntactically in
your grammar specification.
##

Token Names

"Token names" is a ©configuration switchª that defaults
to off. If it is set, it causes AnaGram to include in
the ©parser fileª a static array of character strings, indexed by
token number, which provides ASCII representations of token
names. The name of this array is given by "<parser name>_token_names",
where <parser name> is the name of the parser function as
given by the value of the ©parser nameª parameter.

AnaGram also defines a macro, ©TOKEN_NAMESª, which evaluates
to the name of the array.

The array contains strings for all grammar tokens which have
been explicitly named in the syntax file as well as tokens
which represent ©keywordªs or single character constants.

The array is useful in creating ©syntax errorª diagnostics.

Prior to version 2.01 of AnaGram, the TOKEN_NAMES array contained
strings only for explicitly named tokens. If this restriction
is required, set the ©token names onlyª switch.

Token names are also included if the ©diagnose errorsª
switch is set.
##

TOKEN_NAMES

"TOKEN_NAMES" is the name of a macro that AnaGram defines to
provide access to a static array of character strings indexed by
token number, which provides ASCII representations of token
names. The array is generated if any of the ©token namesª,
©token names onlyª or ©diagnose errorsª switches are ON.

If ©token names onlyª is set, the array contains non-empty
strings only for those tokens which are explicitly named
in the syntax file. Otherwise, the array also contains
strings for tokens which represent keywords or single
character constants.
##


token names only

"Token names only" is a ©configuration switchª that defaults to
off. If it is set, it will cause AnaGram to include in the
parser file a static array containing the names of the tokens
in your grammar. This array will include only those tokens
to which you have assigned names explicitly and will not
include character constants or keywords. "Token names only"
takes precedence over ©token namesª.
##

Token Not Used

"Token not used, TXXX: <token name>" is a ©warningª
message which appears if AnaGram finds an unused ©tokenª
in your ©grammarª. Often an unused token is the result
of an oversight of some kind and indicates a problem in
the grammar.
##

Token Number

AnaGram assigns a unique number, called the "token
number", to each token in the grammar, no matter whether
it is a ©terminal tokenª or a ©nonterminal tokenª. Your
parser does all of its analysis of your input stream
using token numbers as its primary material.

You may need to know the values of token numbers that
AnaGram has assigned, either so a lexical scanner can
output correct token numbers, or so a ©reduction
procedureª can correctly resolve a ©semantically
determined productionª.

To help you, AnaGram defines enumeration constants for
each of the named tokens in your grammar. The definition
of these constants is in the ©parser headerª file.
##

Token Representation

Not all of the ©tokensª in your grammar have a ©token
nameª. Some of the tokens may represent ©character setsª
which you spelled out explicitly, ©virtual productionsª,
©immediate actionªs, or ©keywordªs. In its analysis
tables, AnaGram tries to provide a meaningful
representation for tokens whenever it can. Its first
choice is to use the name, if it has one. Otherwise it
will use the set definition or the definition of the
virtual production if one exists. If AnaGram cannot
otherwise represent your token, it will resort to using
the token number which it normally represents using the
letter T followed by a three digit, zero-padded token
number.
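
The fallback form can be reproduced with an ordinary printf format.
The following C fragment is a sketch only; format_token_number is a
hypothetical helper, not a function AnaGram provides.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write the letter T followed by the token
   number, zero-padded to three digits, into buf. */
static void format_token_number(int token_number, char *buf, size_t size)
{
    assert(buf != NULL && size >= 5);
    snprintf(buf, size, "T%03d", token_number);
}
```

For token number 7 this produces T007.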
##

Token Table

The Token Table lists all the tokens of your grammar.
The first field is the token number. It is followed by a
flag field which is "zl" if the token is a ©nonterminal
tokenª and is ©zero lengthª. If the token is nonterminal
and not zero length, the flag field contains "nt". If
the token is a ©terminal tokenª, the field is blank.

The next field is blank unless the token has been
declared ©stickyª or has had a ©precedenceª level
assigned. If the token is sticky, this field will
contain 's'. If a precedence level has been assigned,
this field will contain the letter 'l', 'r', or 'n' to
indicate associativity followed by the precedence
level. Finally there is the ©data typeª of the ©semantic
valueª of this token and the ©token representationª.
##

Token Usage

The Token Usage table may be accessed via the ©Auxiliary
Windowsª menu from any window that identifies tokens. It
shows all the rules in the grammar that use the token.
##

Top Margin

"Top margin" is an ©obsolete configuration parameterª.
##

Trace Coverage

Trace Coverage is a table which is built whenever you
run ©Grammar Traceª, one of its pre-built versions, or a ©File
Traceª.  You can access it from the ©Browse Menuª. It shows the number
of times each rule in your grammar has been reduced. Unless you have
set the ©Rule Coverageª ©configuration switchª, some ©null productionªs
and some rules that consist of only one element will not be counted
because of speed optimizations in the parser tables.

The Trace Coverage tables are reset to zero when you load a new syntax
file or start AnaGram.
##

Compound Action

Traditionally, ©LALR-1 parserªs use only four simple
©parser actionªs: shift, reduce, accept and error.
AnaGram parsers use a number of compound actions
in order to reduce the size of parse tables and
speed up processing. A single compound action
may replace several simple shift or reduce actions.

The ©Traditional Engineª ©configuration switchª may
be used to force AnaGram to use only the simple
actions.
##

Traditional Engine

"Traditional engine" is a ©configuration switchª that
defaults to off. Traditional ©LALR-1 parserªs use a
©parsing engineª which has only four actions:
 ©shift actionª
 ©reduce actionª
 ©accept actionª
 ©error actionª


AnaGram, in the interest of
faster execution and more compact parse tables,
uses a parsing engine with a number of
short-cut, or ©compound actionªs. The "traditional engine" switch tells
AnaGram not to use the short-cut actions.

You would turn this switch on if you wished to use the ©Grammar Traceª
or ©File Traceª to see how the standard four parser actions work for
a particular combination of grammar and input. Note that to see the
effects of single parser actions, you must use the ©Single Stepª
button. Remember that in the Grammar Trace, when you single step and
the token you have selected causes a reduce action, it will appear
on the ©lookahead lineª of the ©parser stack paneª and will be preselected
in the ©allowable input paneª until it is finally shifted in to
the parser stack.

Normally, you should leave the "traditional engine" switch off. Then
AnaGram will, whenever possible, compress several parsing actions into
one compound action in order to speed execution of the parser.

Unfortunately, use of the term "traditional" has sometimes created the
impression that there is a conservative aspect to the operation of
traditional engine parsers. This is not the case. They have the same
effect, but are slower and have much larger tables.
##

Type Redefinition

"Type Redefinition of TXXX: <token name>" is a ©warningª
message which appears when AnaGram finds a conflicting
©data typeª definition for a ©tokenª in your ©grammarª.
The new definition will override the previous one. If
you intend to use different type definitions, you should
use extreme caution and check the generated code to
verify that your ©reduction procedureªs are getting the
values you intended.
##

Undefined Symbol

"Undefined symbol: <name>" is a ©warningª message which
appears when AnaGram encounters an undefined ©symbolª
while evaluating a ©character setª expression. The
following warning in the ©Warningsª window identifies
the particular ©tokenª AnaGram was trying to evaluate.
##

Undefined Token

"Undefined token TXXX: <name>" is a ©warningª message
which appears when the indicated ©tokenª has been used
in the ©grammarª, but there is no definition of it as a
©terminal tokenª nor does any ©productionª define it as
a ©nonterminal tokenª.
##

Unexpected

"Unexpected <element 1> in <element 2>" is a ©warningª
message which you may get when AnaGram analyzes your
grammar. It appears when AnaGram unexpectedly encounters an instance of
syntactic element 1 at the specified location in an instance of
syntactic element 2.  AnaGram cannot reliably continue parsing its
input.  Therefore, it limits further analysis to scanning for syntax
errors. If this error is not the result of a prior error, you should
correct your ©syntax fileª.  Remember that this error could result from
something missing just as well as from something extraneous.

If element 1 is ©eofª, it often means that you have
an unbalanced brace or comment delimiter in the code
following the indicated location.
##

Union

The union of two sets is the set of all elements that
are to be found in one or another of the two sets. In an
AnaGram syntax file the union of two ©character setsª A
and B is represented using the plus sign, as in A + B.
The union operator has the same precedence as the
©differenceª operator: lower than that of ©intersectionª
and ©complementª. The union operator is ©left
associativeª.

Watch out! In an AnaGram syntax file 65 + 97 represents
the character set which consists of the lower case 'a'
and upper case 'A'. It does not represent 162, the sum
of 65 and 97.
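
To make the distinction concrete, here is a small C sketch that
models a character set as a 256-bit map. The CharSet type and the
charset_ functions are hypothetical and exist only for this
illustration; they are not part of AnaGram.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of a character set: one bit per character code. */
typedef struct { unsigned char bits[32]; } CharSet;

/* The set containing exactly one character code. */
static CharSet charset_single(int code)
{
    CharSet s;
    assert(code >= 0 && code < 256);
    memset(&s, 0, sizeof s);
    s.bits[code / 8] |= (unsigned char)(1u << (code % 8));
    return s;
}

/* The union: every code found in either set. */
static CharSet charset_union(CharSet a, CharSet b)
{
    int i;
    for (i = 0; i < 32; i++)
        a.bits[i] |= b.bits[i];
    return a;
}

/* Nonzero if code is a member of the set. */
static int charset_contains(const CharSet *s, int code)
{
    assert(code >= 0 && code < 256);
    return (s->bits[code / 8] >> (code % 8)) & 1;
}
```

In this model the union of the sets for 65 and 97 contains 'A' and
'a', and does not contain the character code 162.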
##

Video mode

"Video mode" is an ©obsolete configuration parameterª.
##

Virtual Production

Virtual productions are a special short hand
representation of ©grammar rulesª which can be used to
indicate a choice of inputs. They are an important
convenience, especially useful when you are first
building a grammar.

Here are some examples of virtual productions:
	name?						// optional name
	name?...					// 0 or more instances of name
	{name | number}			// exactly one name or number
	{name | number}...			// one or more instances of name or number
	[name | number]			// optional choice of name or number
	[name | number]...			// zero or more instances of name or number

 AnaGram rewrites virtual productions, so that when you
look at the syntax tables in AnaGram, there will be
actual ©productionªs replacing the virtual productions.

A virtual production appears as one of the rule
elements in a grammar rule, i.e. as one of the members
of the list on the right side of a production.

The simplest virtual production is the "optional"
token. If x is an arbitrary token, x? can be used to
indicate an optional x.

Related virtual productions are x... and x?...  where
the three dots indicate repetition. x... represents an
arbitrary number of occurrences of x, but at least one.
x?... represents zero or more occurrences of x.

The remaining virtual productions use curly or square
brackets to enclose a sequence of rules. The brackets
may be followed variously by nothing, a string of three
dots, or a slash and three dots (/...), to indicate the
choices to be made from the rules. Note that rules may
be used, not merely tokens.

If r1 through rn are a set of ©grammar rulesª, then
	{r1 | r2 | ... | rn}
is a virtual production that allows a choice of exactly
one of the rules. Similarly,
	{r1 | r2 | ... | rn}...
is a virtual production that allows a choice of one or
more of the rules. And, finally,
	{r1 | r2 | ... | rn}/...
is a virtual production that allows a choice of one or
more of the rules subject to the side condition that
rules must alternate, that is, that no rule can follow
itself immediately without the interposition of some
other rule. This is a case that is not particularly
easy to write by hand, but is quite useful in a number
of contexts.

If the above virtual productions are written with []
instead of {}, they all become optional. [] is an
optional choice, []... is zero or more choices, and
[]/... is zero or more alternating choices.

Null productions are not permitted in virtual
productions in those cases where they would cause an
intrinsic ambiguity.

You may use a ©definitionª statement to assign a name to
a virtual production.
##

Void token

"Void token, <token name>, used as parameter" is a
©warningª message which appears if AnaGram encounters a
©data typeª definition declaring a ©tokenª to have type
void when the token has previously been used in a
©parameter assignmentª for a ©reduction procedureª. Your
C or C++ compiler will complain when it tries to compile
the call to the reduction procedure.
##

vs

vs is a field in a ©parser control blockª to which your
©error handlingª procedures and ©reduction procedureªs
may refer. It is the ©parser value stackª for your
parser. The ©semantic valuesª of the ©tokensª identified
by the parser are stored in the value stack.  The value
stack, like the other ©parser stacksª, is indexed by
©PCBª.©ssxª. When you are executing a reduction
procedure, PCB.vs[PCB.ssx] contains the semantic value
of the first token in the grammar rule you are reducing,
PCB.vs[PCB.ssx+1] contains the second, and so forth. The
return value from your reduction procedure will be
stored in turn in PCB.vs[PCB.ssx].

vs is defined to be of type $_vt, where "$" represents
the name of your parser. AnaGram defines $_vt to
be a union of fields of sizes corresponding to all the
different data types declared in your syntax for the
semantic values of your tokens. In order to avoid
restrictions on the use of C++ classes, the fields are
defined as character arrays. On some processors which
have byte alignment restrictions for multibyte data,
you might encounter a bus error. To correct this
problem, set the ©parser stack alignmentª parameter to
an appropriate data type.
##

Warning

If, while analyzing your syntax file, AnaGram finds
something suspicious, it is likely to issue a warning.
The Warnings window will pop up automatically when the
analysis has been completed. If the warning is for a
©syntax errorª in your input file, you will have to fix
it, because AnaGram cannot successfully interpret it.
Otherwise, AnaGram will be able to create a ©parserª for
you, if you wish, no matter how serious the warnings may
be.

You can bring up the Help topic associated with a highlighted warning
by pressing F1 or by clicking with a ©Help Cursorª.

If you have syntax errors, AnaGram will synchronize the
cursor in the ©syntax fileª window with the cursor in the
Warnings window so that whenever the Warnings window is
active, the cursor bar in the syntax file window will
identify the location of the error.

##

What's New

Changes in AnaGram 2.40

Most of the changes in AnaGram 2.40 are under the hood - cleanup of
source files, reorganization of the source tree, revision of build and
test procedures, and so forth, in preparation for the open source
release. All of this will, with luck, be invisible to the end user.

Open Source

AnaGram is now ©open sourceª. AnaGram itself
uses the 4-clause BSD ©licenseª; the ©parsing engineª, and thus the output
files, are licensed with the less restrictive zlib ©licenseª. Source
distributions are available from http://www.parsifalsoft.com.

The manual has been re-typeset using LaTeX instead of WordPerfect.
The typographic consistency and formatting have been considerably
improved; unfortunately, the pagination is now completely different,
so page numbers are not portable to the new version.

All the logic dealing with registration, trial copies, serial numbers,
and so forth has been removed.

Unix Support

The Unix build of the ©command line versionª of AnaGram (agcl) is now
supported and available to the public. There is at present no GUI for
the Unix version. The long-term goal is to migrate the AnaGram GUI
away from the closed (and orphaned) IBM Visual Age class library to
something else, probably GTK, so as to support both Windows and Unix.

Improved Functionality

 Examples. The examples have been adjusted to the current dialect of
C++ and are now compilable again. The legacy "classlib" code some
still depend on is being phased out.

Increased Convenience

 File names. File names in the AnaGram distribution and source
tree are no longer limited to 8+3 characters, and quite a few now have
less cryptic names. Additionally, all HTML files are now named ".html",
not ".htm".

 Installed files. The AnaGram.cgb and AnaGram.hlp files found in
older releases of AnaGram no longer exist; their contents are compiled
into the AnaGram executables instead.

Bug Fixes

 Engine compiler error. The ©error_messageª field of the PCB has
been changed to const char * so current C++ compilers will accept the
code generated when ©diagnose errorsª is turned off.

 Multiple output header files. Including more than one AnaGram
output header file at once used to cause some compilers to issue a
warning, because an #ifndef directive was checking the wrong
symbol. This has been corrected.

 Wrappers and error tokens. AnaGram 2.01 generated uncompilable
code if you tried to use the ©wrapperª feature and error token
resynchronization at the same time. This has been corrected.

 More than 256 keywords. Build 8 of AnaGram 2.01 fixed certain
problems with large keyword tables, but in the process introduced
another, which is now fixed.

For changes in the previous versions of AnaGram, see ©What's New in AnaGram 
2.01ª and ©What's New in AnaGram 2.0ª.

##

What's New in AnaGram 2.01

Changes in AnaGram 2.01

Improved Functionality

  Improved support for building ©thread safe parsersª. All
nonconstant parser data previously declared as static variables has been
moved to the ©parser control blockª. When the ©reentrant parserª switch
is set, all references to the parser control block are passed to functions
via calling sequences. The ©extend pcbª statement provides a mechanism to
add user-defined variables to the parser control block.

  Improved support for C++ parsers. The ©wrapperª statement
provides C++ wrapper classes for objects to be stored on the ©parser value stackª.
The ©PCB_TYPEª macro allows you to derive a C++ class from the parser control
block and to access its members from your ©reduction proceduresª.

  Support for the ©ISO Latin 1ª character set. When
the ©case sensitiveª switch is off, case conversion is performed for all ISO Latin 1
characters, not just those in the ASCII range.

  Improved support for error diagnostics. It is now possible for users
to provide their own text for the error messages created by the ©diagnose errorsª
switch. In addition, the ©token namesª table option now includes ASCII representations
of individual characters and keywords instead of only named tokens. The ©token names
onlyª switch can be used for compatibility with previous versions of AnaGram.

  More precise determination of error context. The tables used by the ©error frameª
option to provide the context of a syntax error have been reworked and now provide
a substantially more precise localization of the error.

Improved error diagnostics in AnaGram

 ©Missing reduction procedureª diagnostic.
In addition to warning that there is a ©parameter assignmentª
without a ©reduction procedureª, this
diagnostic is now provided if the ©default reduction valueª
does not have the same ©data typeª as the ©reduction tokenª.

 ©Command line versionª. Diagnostics have been reformatted so
they can be recognized by the Microsoft Visual C++ IDE.

 Refined ©keyword anomalyª diagnostics. There should
now be fewer false alarms.

Increased Convenience

 ©File Traceª. If your grammar uses ©semantically determined productionsª,
the File Trace feature will now remember the choices you have
made for ©reduction tokenªs, so that you do not have to make
the same choices over and over again as you work with an example.

 File Paths. The file paths in the #line directives created by the ©line numbersª
switch now use forward slashes instead of backslashes.

Changed Defaults

 ©Parser stack alignmentª. Now defaults to long instead of int.
 ©Parser stack sizeª. Now defaults to 128 instead of 32.

Bug Fixes

 Interaction between context tracking and error token. In previous
versions of AnaGram, if the first token in a rule was the ©error tokenª,
the value of ©CONTEXTª was the value that corresponded to the location
of the error. CONTEXT now correctly shows the context at which the
aborted rule began. For instance, in the following example, if a
syntax error is encountered while parsing the expression, the error
rule will skip over remaining characters to the terminating semicolon.
When invoked from handleError(), the CONTEXT macro will return the
context as it was at the beginning of the expression.
        expression statement
          -> expression, ';'
          -> error, ~(eof + ';')?..., ';'        =handleError();

 ©Distinguish lexemesª. Several minor bugs in the implementation of distinguish lexemes have been
corrected.

 Set partition logic. Corrected problems in the interaction between the set ©partitionª logic
and the implementation of the ©disregardª statement.

 Table size. Fixed a data sizing problem which occurred when one particular parse table
had precisely 256 entries.

 Keyword recognition. Fixed a problem that could cause difficulties with ©keywordª
recognition when the ©case sensitiveª switch was turned off.

 Default conflict resolution. With unresolved ©shift-reduce conflictªs, the shift case was
not always being selected. This problem has been corrected.

 Lockup. It was possible to write an erroneous grammar that would cause
AnaGram to lock up. This problem has been corrected.

 Potential bus error. The error diagnostic function created by the ©diagnose errorsª
switch could, under some circumstances, access an uninitialized value
on the ©parser value stackª. This problem has been corrected.

 Internal errors. Fixed a number of minor bugs which could cause ©internal errorªs
while running ©File Traceª.

For changes in the previous version of AnaGram, see ©What's New in AnaGram 2.0ª.
##

What's New in AnaGram 2.0

AnaGram's user interface has been completely revamped to make it more
convenient and easier to use. However, the same tried and true AnaGram
algorithms are still in place to build your parsers. The rules for
syntax files are also unchanged.

The ©File Traceª and ©Grammar Traceª facilities have each had their
windows combined into a single unit, and a ©Rule Stackª synched with
these windows and with your syntax file window has been added. The
Rule Stack is particularly convenient for relating the progress of the
parse to the ©grammar rulesª in your ©syntax fileª.

A ©text entryª field has also been added to the Grammar Trace. This
means you can provide character input to your parser in much the same
way you can with a ©test fileª in File Trace, but with instant control
over the input.

Some further controls have been added to both File and Grammar Traces.
In particular there is a Reset button to reset the trace to its initial
state. This is particularly useful for ©Conflict Traceªs.

AnaGram now has a small ©Control Panelª (default position is at the
upper right of the screen) from which you can conveniently control
operation.  A menu bar provides access to the various commands and
tables. There are toolbar buttons for Analyze Grammar, Build Parser,
File Trace, and so on. The panel also has a data entry field for
entering search keys.

You can set both colors and fonts in AnaGram windows to suit your own
preferences. We suggest you check Help for ©Colorsª or ©Fontsª before
making changes to make sure that all information will still be properly
displayed.

AnaGram's ©Helpª has been updated to provide hypertext-type links. But
you can still keep multiple Help windows on view at once. A popup menu
shows all the links in a window. New topics have been added. Also,
further documentation topics are provided in HTML format in the html
subdirectory.

A ©Help Cursorª on the Control Panel toolbar can be used to get help for
most AnaGram windows, buttons and menu items. F1 can also be used.

On the ©Action Menuª you will find a list of your most recently used
syntax files. Just click on the file of your choice to have AnaGram
analyze it (or build it if ©Autobuildª is on).
##

White Space

In many grammars it is desirable to pass over blanks,
tabs, and similar characters, as well as comments,
collectively termed "white space", as though they were
not there. The "©disregardª" statement in AnaGram may
be optionally used to accomplish this. The "©lexemeª"
statement may be used to exercise fine control over the
scope of the disregard statement.
##

Wrapper

The wrapper ©attribute statementª provides correct handling of C++
objects returned by ©reduction procedureªs.

If you specify a wrapper for a C++ object, then, when a reduction
procedure returns an instance of the object, a copy of the object will
be constructed on the ©parser value stackª and the destructor will be
called when the object is removed from the stack.

Without a wrapper, objects are stored on the value stack simply
by coercing the stack pointer to the appropriate type.
There is no constructor call when the object is stored nor
a destructor call when it is removed from the stack.
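
The difference between the two storage methods can be illustrated
outside of AnaGram. The following is only a sketch of the general C++
technique (placement new followed by an explicit destructor call); it
is not the wrapper template AnaGram generates. The names Tracked, Slot,
storeInSlot and removeFromSlot are hypothetical and chosen only for
this illustration.

```cpp
#include <new>

// Hypothetical class standing in for a user type that needs its
// constructor and destructor to run (for instance, a reference-counted one).
struct Tracked {
  static int live;  // number of objects currently constructed
  int value;
  explicit Tracked(int v) : value(v) { ++live; }
  Tracked(const Tracked &other) : value(other.value) { ++live; }
  ~Tracked() { --live; }
};
int Tracked::live = 0;

// Hypothetical stand-in for one slot of a value stack: raw, suitably
// aligned storage that does not yet contain an object.
struct Slot {
  alignas(Tracked) unsigned char bytes[sizeof(Tracked)];
};

// Copy-construct an object into the slot using placement new.
Tracked *storeInSlot(Slot &slot, const Tracked &source) {
  return new (slot.bytes) Tracked(source);
}

// Call the destructor explicitly when the object leaves the slot.
// The storage itself is not freed; it belongs to the slot.
void removeFromSlot(Tracked *stored) {
  stored->~Tracked();
}
```

In this sketch, every call to storeInSlot is balanced by a call to
removeFromSlot, so the count of live objects returns to its previous
value. Simply casting the slot address to Tracked* and assigning
through it would run neither the constructor nor the destructor.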

Classes which use reference counts or otherwise overload the
assignment operator should always have wrappers in order to
function correctly.

Wrapper statements, like other ©attribute statementsª, must appear in
configuration sections. The syntax is simply
  wrapper { <comma delimited list of data types> }

For example:
  [
    wrapper {CString, CFont}
  ]

You cannot specify a wrapper for the ©default token typeª.

If your parser exits with an error condition, there may be
objects remaining on the stack. The ©DELETE_WRAPPERSª macro
may be used to delete these objects. If you have enabled
©auto resynchª, DELETE_WRAPPERS will be invoked automatically.

The ©AG_PLACEMENT_DELETE_REQUIREDª macro is used to control
definition of a "placement delete" operator in the wrapper
class AnaGram defines.
##

Zero Length

A zero length ©tokenª is a ©reduction tokenª which can
be matched by a void, i.e. by nothing at all. It
represents an optional item, or a sequence of optional
items, in the input. Since the matching process can
involve several levels of reductions, it is most precise
to use the following recursive definition: A zero length
token is one which either has at least one ©null
productionª or has at least one grammar rule defining it
such that all the tokens in the rule are zero length
tokens.
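
The recursive definition can be evaluated by repeating a simple pass
over the grammar until nothing changes. The following C++ sketch shows
one way to do this. It is an illustration only and is not taken from
AnaGram's own implementation; the Grammar representation and the
function name zeroLengthTokens are assumptions made for the example.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: each token name maps to its rules, and
// each rule is the list of token names on its right-hand side. An empty
// rule stands for a null production. Terminals have no entry in the map.
using Rule = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Rule>>;

// Returns the set of zero length tokens, applying the recursive
// definition repeatedly until no further token qualifies.
std::set<std::string> zeroLengthTokens(const Grammar &grammar) {
  std::set<std::string> zero;
  bool changed = true;
  while (changed) {
    changed = false;
    for (const auto &entry : grammar) {
      if (zero.count(entry.first)) continue;  // already known
      for (const Rule &rule : entry.second) {
        bool allZero = true;  // trivially true for a null production
        for (const std::string &tok : rule) {
          if (!zero.count(tok)) { allZero = false; break; }
        }
        if (allZero) {
          zero.insert(entry.first);
          changed = true;
          break;
        }
      }
    }
  }
  return zero;
}
```

A token defined only by rules that contain at least one token which is
not zero length never enters the set, so the loop always terminates.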

Care should be taken when using ©zero lengthª tokens in
©recursive ruleªs. If all the tokens in the rule other than
the recursive token itself are zero length tokens,
the rule will cause an infinite loop in the generated
parser.

The ©Token Tableª identifies zero length tokens because
the use of such tokens sometimes inadvertently causes
©conflictªs.
##

Control Panel

The AnaGram Control Panel appears at the upper right of your monitor
when you start AnaGram. It has a menu bar, command buttons, a button
which enables a ©help cursorª, and a ©status indicatorª. At the lower
left you will see a data entry field for entering ©searchª
keys, with neighboring search forward and search backward buttons.

Notice that the ©Options Menuª has a "Stay On Top" entry which
allows you to specify whether the Control Panel stays on top of
other AnaGram windows.
##

Status Indicator

The status indicator at the right of the AnaGram
Control Panel shows the status of the ©current grammarª:
  Ready
  Loaded
  Error
  Parsed
  Analyzed
  Built

"Ready" appears only when no grammar has been selected.

"Loaded" and "Parsed" are normally transitory.

"Error" means at least one syntax error has been detected
in your grammar and AnaGram cannot continue. Check the
Warnings window to determine the nature of the problem.

"Analyzed" means that a ©grammar analysisª has been
completed, but no ©output filesª have been written.

"Built" means that an analysis has been completed and
output files have been written.
##

Help Cursor

The Help Cursor is accessed via the button with the question mark on
AnaGram's ©Control Panelª. It is convenient for getting help on
©Warningªs, browse tables, menu items and so on.

If you click on the button you enable the Help Cursor, which you can
then drag with the mouse. A further mouse click will provide help
for the item underneath the cursor.

Note further that AnaGram also has F1 help which you may find
simpler and faster than the Help Cursor.
##

Search

AnaGram has a simple search facility to let you search for text strings
in AnaGram windows. A data entry field on the ©Control Panelª is
provided for you to enter text. Left-clicking on the neighboring
buttons lets you search either forward or backward for a line in the
active window which contains at least one instance of the text.

Note that a forward search begins at the line following the highlighted
line, and a backward search begins at the line preceding the
highlighted line.
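
The starting points for the two search directions can be sketched in
C++ as follows. This is an illustration of the rule just described,
not AnaGram's own code. The function names are hypothetical, and the
sketch assumes a case-sensitive match with no wraparound at the ends
of the window, neither of which is stated above.

```cpp
#include <string>
#include <vector>

// Returns the index of the first line after `highlighted` that contains
// `key`, or -1 if there is none. Assumption: no wraparound.
int searchForward(const std::vector<std::string> &lines,
                  int highlighted, const std::string &key) {
  for (int i = highlighted + 1; i < static_cast<int>(lines.size()); ++i) {
    if (lines[i].find(key) != std::string::npos) return i;
  }
  return -1;
}

// Returns the index of the nearest line before `highlighted` that
// contains `key`, or -1 if there is none. Assumption: no wraparound.
int searchBackward(const std::vector<std::string> &lines,
                   int highlighted, const std::string &key) {
  for (int i = highlighted - 1; i >= 0; --i) {
    if (lines[i].find(key) != std::string::npos) return i;
  }
  return -1;
}
```

Because both loops skip the highlighted line itself, repeating a search
moves on to the next matching line instead of finding the same line
again.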
##

Search Key

To find a text string in an AnaGram window, enter the
string in the Search Key field in the ©Control Panelª
and press Enter.

To find another instance of the string click on the
©Find Nextª button or press F3.

To find a previous instance of the string click on
the ©Find Previousª button or press F4.

In windows that have a cursor bar, a forward search
begins on the line following the cursor and a backward
search begins on the line preceding the cursor.
##

Find Next

The Find Next button, on the ©Control Panelª immediately
to the right of the ©Search Keyª field, locates
the next instance of the search key in the most recently
active AnaGram window. F3 is the keyboard equivalent.
##

Find Previous

The Find Previous button, on the ©Control Panelª immediately
to the right of the ©Find Nextª button, searches
backwards for the search key in the most recently
active AnaGram window. F4 is the keyboard equivalent.
##

Fonts, Set Fonts

The Set Fonts dialog allows you to use the fonts of your choice in
AnaGram windows. You should make sure that the font for ©marked tokenªs is
very distinctive so that marked tokens will show up clearly even if
they are only 1 or 2 characters long. Sometimes it is helpful to use an
underlined font for marked tokens.

A Default button at the bottom of the dialog lets you revert to
AnaGram's original fonts if you wish.
##

Colors, Set Colors

The Set Colors dialog allows you to change the colors of
AnaGram windows. Notice that in the ©File Traceª the ©test file paneª
requires three different sets of text and background colors. You
should make sure that the backgrounds, at least, can be easily
distinguished from each other so the trace information can be
properly displayed. You also want to take care that an active pane in
a File Trace or Grammar Trace can be distinguished from inactive
panes.

The Default button at the bottom of the dialog lets you revert to
AnaGram's original colors if you wish.

Color changes pertain only to the client areas of AnaGram windows. The
remaining parts of your windows will have the customary colors you have
chosen for your system.
##

Marked Token

Some tables and trace panes display each rule with one token marked to
show how far parsing has progressed in the rule. The marked token is
the next input expected in the input stream. It is shown in a different
font to distinguish it from other tokens in the rule. If no token is
marked, the rule is a ©completed ruleª, i.e. it has been completely
matched and will be reduced by the next input.

You can set the font for marked tokens by choosing Fonts from the
©Options Menuª. You should make sure that the font is very distinctive so
that marked tokens will show up clearly even if they are only 1 or 2
characters long.  Sometimes it is helpful to use an underlined font for
marked tokens.
##

Synch Parse

The Synch Parse button replaces the ©Single Stepª button on the
toolbar of the ©File Trace windowª when, for some reason, the
location of the blinking cursor in the ©test file paneª differs from
the current parse position. This can occur when you single click in
the test file pane or when the parse cannot track the cursor because
of a ©syntax errorª or a ©semantically determined productionª.

Click the synch parse button to resynch the parse with the cursor.
##


Single Step

The Single Step button is one of the control buttons for the ©File
Traceª and ©Grammar Traceª. It advances the parse one ©parser
actionª at a time. In the File Trace, it is replaced with the "©Synch
Parseª" button whenever the blinking cursor loses synch with
the current parse location.

In the Grammar Trace, the Single Step button takes its input from the
Allowable Input pane, the Reduction Choices pane, or the ©text entryª
field, depending on which is active.
##

Proceed

The Proceed button is one of the control buttons for the
©Grammar Traceª. If the ©Reduction Choices paneª or the ©Allowable
Input paneª is active, Proceed parses the highlighted token
until it is shifted in to the ©parser stackª. If the ©text entryª
field is active, Proceed parses all text in the field. If a
©syntax errorª is encountered, the parse stops and all ©reduce
actionªs are undone.

Note that selecting a token in Allowable Input can cause a syntax
error under certain circumstances. This can happen only if the
following conditions are all true:
 the indicated operation is a ©reductionª,
 the reduction token for the rule being reduced has been used in several
different contexts in the grammar
 and the specified token may
follow it in some contexts and not in others.
##

Reduction Choices Pane

The ©File Traceª and ©Grammar Traceª display a Reduction Choices
pane when they need to reduce a ©semantically determined productionª.

The rule to be reduced is highlighted in the ©rule stack paneª.
If the ©syntax fileª window is visible, it shows the rule in
context in your grammar.

The Reduction Choices pane lists all possible ©reduction tokenªs for
the specified rule. The first reduction token that is admissible in
the current context is highlighted and it appears
as the ©lookahead tokenª in the ©parser stack paneª. The text that
comprises the entire rule is highlighted in the ©test file paneª.

Select the desired reduction token before continuing with the parse.

If you select a token and it does not appear as the lookahead token,
it is not syntactically correct in the current context. If you try
to proceed with the parse, you will get a ©selection errorª.
##

Selection Error

The ©Parse Statusª field indicates a "selection error" if you
choose a ©reduction tokenª from the ©Reduction Choices paneª of
a ©File Traceª or ©Grammar Traceª and the selected token is not
syntactically correct in the current context.
##

Parser Stack Pane

The Parser Stack pane, the upper left pane of the ©File Traceª and
©Grammar Traceª windows, displays the ©parser stackª for the current
trace.

Each line corresponds to one level in the parser state stack. It shows
the stack index, the ©parser stateª for that level, and the ©tokenª which
was seen at that state. The last line of the stack, the ©lookahead
lineª, corresponds to the current state of the parser. Since no input
has yet been processed for this state, the token, if any, which
appears at this level is a ©lookahead tokenª.

If you move the cursor in the Parser Stack pane of a File Trace,
the text that makes up the selected token will be
highlighted in the ©Test File paneª. You can back the parse up to
any desired stack level by double clicking at the beginning of the
token text in the Test File pane.

Similarly, if you move the cursor bar in the Parser Stack pane of a
Grammar Trace, the ©Allowable Input paneª will change to display the
allowable tokens in the selected state. The previously
selected token will be highlighted. Then, double click on any token in
the Allowable Input pane to back the parse up and choose a token
a second time.

The ©Rule Stack paneª of the File or Grammar Trace is also synched
to the Parser Stack pane. If the ©syntax fileª window is visible, it
will be synched to show the rule currently selected in the rule
stack pane. Note that rules that have been automatically generated
by the expansion of ©virtual productionsª cannot be synched, so the
top line of the syntax file will be highlighted instead.

In the Grammar Trace, the last line of the Parser Stack may or may not
display a ©lookahead tokenª, depending on the last ©parser actionª
performed. If input was taken from Allowable Input and the last
action was a simple ©reduce actionª, the last input token selected
will be displayed as the lookahead input. But if the last action
performed shifted the token in, the lookahead field will be empty.

If you right-click on a highlighted line in the Parser Stack pane, you will
get a pop-up menu to give you more information. In particular you can
get an ©Auxiliary Traceª starting at the current point in your File or
Grammar Trace, so you can explore various possibilities without losing
your position in the old trace.
##

Exit

Select this entry from the ©Action Menuª to terminate AnaGram.
##

Allowable Input, Allowable Input Pane

The upper right pane of the ©Grammar Traceª window lists the
allowable input tokens for the current state of the ©grammarª.

The tokens in the Allowable Input pane are listed in two groups:
first, the ©terminal tokensª allowable in this state, and
second, the ©nonterminal tokensª. Between these two groups of tokens
is inserted a line which is either an option for a ©default reductionª,
or declares that there is no default action.

Double click, press Enter, or click the ©Proceedª button to
parse the highlighted token. When all parse actions triggered
by the highlighted token have been completed, all panes of the trace
will be redrawn to show the new state of the parser.

Note that selecting a token in Allowable Input can cause a syntax
error under certain circumstances. This can happen only if the
following conditions are all true:
 the indicated operation is a ©reductionª,
 the reduction token for the rule being reduced has been used in several
different contexts in the grammar
 and the specified token may
follow it in some contexts and not in others.

If you wish to see the results of a single parser action, click
on the ©single stepª button. The parser will perform a single
parser action. If the
token you selected was not shifted in, it will now be displayed
as the ©lookahead tokenª on the last line, the ©lookahead lineª in
the ©Parser Stack paneª, and will be preselected in the Allowable
Input pane.

Because AnaGram, by default, uses a number of compound
parser actions, this situation does not arise very often unless you
have set the ©traditional engineª switch or reset the ©default
reductionsª switch. Usually you will want to select the same token to
proceed, but it is not necessary.

The Allowable Input pane also displays
the ©parser actionª associated with a specific token. If it is
not a ©compound actionª, the action and its result are also shown.

The ©parser actionª field for a token may be interpreted as follows: If
this token would cause a shift to a new state, the action field is ">>"
followed by the new state number. If the token would cause a
©reductionª, the action field is "<<" followed by a ©rule numberª to
show the rule reduced.  If the parser action is a compound action, the
action field is blank.  If the token would cause the grammar to be
accepted, the action field is "Accept".
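
The interpretation of the action field can be summarized in a small
C++ sketch. It is an illustration only: the ActionKind enumeration and
the actionField function are hypothetical names, and the exact spacing
AnaGram uses between the marker and the number is an assumption.

```cpp
#include <string>

// Hypothetical classification of the action a token would trigger.
enum class ActionKind { Shift, Reduce, Compound, Accept };

// Builds the action field text as described above. `number` is the new
// state number for a shift or the rule number for a reduction; it is
// ignored for the other kinds.
std::string actionField(ActionKind kind, int number) {
  switch (kind) {
    case ActionKind::Shift:    return ">>" + std::to_string(number);
    case ActionKind::Reduce:   return "<<" + std::to_string(number);
    case ActionKind::Compound: return "";        // field is left blank
    case ActionKind::Accept:   return "Accept";
  }
  return "";  // not reached for valid kinds
}
```

For example, in this sketch a token that shifts to state 12 yields
">>12", and a token that reduces rule 7 yields "<<7".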


The ©text entryª field at the bottom of the Grammar Trace can be
used as a convenient alternative to the Allowable Input pane. It
accepts characters rather than tokens. Most non-printing characters
such as newline are only available from Allowable Input.
##

Copy

The Copy command on the ©Windows Menuª copies the currently active
table or Help topic to the clipboard.
##

Statistical Summary

While your grammar is being analyzed, a Statistical Summary window
pops up to show you the progress of the analysis. Unless you have
turned off ©Show Statisticsª on the ©Options Menuª, this window will remain
on-screen for your reference. Among other things, it shows you the
number of rules and states in your grammar, and the number of conflicts
and warnings, if any.

Note that if your grammar is small and you have Show Statistics turned
off, the appearance of this window on your monitor may be exceedingly
brief - you may just see a flash.

If the window is turned off or you have closed it, you can get it from
the ©Browse Menuª.
##

Stay On Top

The Stay On Top entry in the ©Options Menuª allows you to specify whether
the ©Control Panelª stays on top of other AnaGram windows.
##

Show Syntax

If this entry in the ©Options Menuª is checked, AnaGram will display the
©syntax fileª when it has analyzed your ©grammarª. If this entry is not checked
or you have closed the syntax file window, you can select the window
from the ©Browse Menuª.
##

Show Statistics

If this entry in the ©Options Menuª is checked, AnaGram will leave the
©Statistical Summaryª on the screen after it has analyzed your ©grammarª. If
this entry is not checked or you have closed the Statistical Summary
window, you can select the window from the ©Browse Menuª.
##

About AnaGram

Select this entry from the ©Help Menuª to find out the version and
serial numbers of your copy of AnaGram, and how to contact Parsifal
Software.
##

Help Topics

Select Help Topics from the ©Help Menuª to get a complete list of AnaGram
Help Topics titles. You can bring up the window for a highlighted topic
by double-clicking with the left mouse button, pressing F1, or using
the ©Help Cursorª.
##

Cascade Windows

Select this entry from the ©Windows Menuª to cascade your open windows
starting at top left of the screen.
##

Close Windows

Select this entry from the ©Windows Menuª to close all open windows
except the ©Control Panelª. You may also close the active window
by pressing the Escape key.
##

Hide Windows

Select this entry from the ©Windows Menuª to hide all open windows
except the ©Control Panelª. Restore them to the screen with ©Restore
Windowsª.
##

Restore Windows

Use this command on the ©Windows Menuª to restore to the screen
any windows you have previously hidden with ©Hide Windowsª.
##

Token Input, Preprocessor, Lexical Scanner

AnaGram makes it unnecessary, in most cases, to have a separate
preprocessor to provide the ©tokensª which are fed to your parser.

However in some cases you may want to use a preprocessor, or lexical
scanner, to provide input to your parser. The preprocessor may
or may not be written in AnaGram. If it sends the parser token
numbers, as opposed to character codes, this is referred to as token
input, as opposed to character input. Please refer to the AnaGram
User's Guide for information on identifying the tokens to the parser
and providing their semantic values, if any.

Since a ©File Traceª is based on character codes, it will be greyed out
on the ©Action Menuª if you have token input. For a ©Grammar Traceª,
entering characters in the ©text entryª field is not appropriate and
will simply cause a syntax error.
##

Lookahead Line

The last line of the ©Parser Stack paneª, the "lookahead" line,
will sometimes show a ©lookahead
tokenª, and sometimes not. In a ©File Traceª, you will always see a
lookahead token because it is available from the ©test fileª.

In a ©Grammar Traceª you will usually see a lookahead token only when
you have used the ©Single Stepª button or if there is available
input in the ©text entryª field. In the latter case the token
corresponding to the first character of the input will appear on the
lookahead line.

If you click Single Step after selecting a token from ©Allowable
Inputª and it causes only a simple ©reduce actionª (as opposed to a
shift or a compound action), then, upon completion of the reduction,
the token you selected will appear on the lookahead line and also
will be preselected in Allowable Input.

Usually you would select
this token for the next parse step.  However, if there are other
possible inputs in this state, the parse theoretically could have
arrived at this state by a different sequence of input tokens. Thus,
if you are more interested in the behavior of the parser at this
state than in the response of the parser to a particular sequence of
inputs, it is perfectly valid to select a different input token, and
AnaGram will let you do it.

Note that if you have enabled the ©traditional engineª switch or
disabled the ©default reductionsª switch, the
probability of finding a token which does a simple reduction is
noticeably higher than otherwise.
##

Action Menu

The Action menu begins with the ©Analyze Grammarª and ©Build Parserª
commands. If a grammar has already been analyzed, but not yet built,
there will also be an extra Build command bearing the name of your
syntax file.

There are also ©Reanalyzeª and ©Rebuildª commands which are
initially greyed out. They become available if you change the
current syntax file.

The next section has ©File Traceª and ©Grammar Traceª
commands. If you have enabled the ©Error Traceª
©configuration switchª, this section also shows an
Error Trace command.

The menu ends with an ©Exitª command
and a list of recently used syntax files, if any. Just
click on a syntax file name to have AnaGram analyze it, or
build it if the ©Autobuildª option is on.
##

Browse Menu

Initially, the Browse Menu shows only a single entry:
©Configuration Parametersª which lets you see the
current state of configuration parameters before any
may have been set by your syntax file. Once you have
analyzed a grammar, this menu fills up with many tables
containing information about your grammar. You can also
bring up a window showing your ©syntax fileª from this menu.
If your grammar has generated ©syntax errorªs or warnings, or
contains conflicts, there will be ©Warningªs or ©Conflictªs
entries.
##

Options Menu

From this menu you can select a ©Fontsª or ©Colorsª dialog so you can
set AnaGram's fonts and colors to suit your own tastes. You can set
©Autobuildª if you want AnaGram to automatically build your ©grammarª
when you select a ©syntax fileª from the ©Action Menuª.  You can also
choose whether or not to automatically show the ©Statistical Summaryª
window or your syntax file window when you open a grammar, or make
the ©Control Panelª stay on top of other AnaGram windows.
##

Windows Menu

The Windows menu lets you cascade, close, or hide all AnaGram
windows except the ©Control Panelª, or restore them if they
have been hidden. It also has a list of open windows (even
if hidden) so you can select the one you want. The Copy command will
copy most windows to the clipboard.
##

Help Menu

The Help Menu has the following entries:

©Getting Startedª provides a brief description of AnaGram and
introductory suggestions.

©Help Topicsª brings up a list of all help topics.

©Using Helpª tells you how to use AnaGram's help facilities.

©What's Newª has information on new features of this version of AnaGram.

©About AnaGramª tells you what version of AnaGram you are using, and also
provides contact information for Parsifal Software.
##

Autobuild

When Autobuild (©Options Menuª) is checked, selecting a file
from the list of most recently used files on the ©Action Menuª
invokes the ©Build Parserª command. Otherwise, the ©Analyze
Grammarª command is invoked.
##

Reanalyze, Rebuild

Reanalyze and Rebuild commands on the ©Action Menuª are
initially greyed out.

Reanalyze becomes available if
you have a syntax file currently analyzed or built
in AnaGram and change it while AnaGram is still running.

Rebuild becomes available if
you have a syntax file currently built
and change it while AnaGram is still running.
##

Percent Sign

The percent sign ( % ) is used to mark certain tokens in your grammar
which AnaGram must redefine in order to implement the ©disregardª
statement. If you have used this statement in your grammar, you will
probably notice the percent sign appearing in some windows and traces.

The percent sign indicates the original token, without the optional
white space attached. Early versions of AnaGram used the degree sign
instead, but this character is not generally available in Windows.
##

Program Development

The first step in writing a program is to write a ©grammarª in
AnaGram notation which describes the input the program expects.

The file containing the grammar, called the ©syntax fileª, should
have the extension ".syn". You could also make up a few sample input
files at this time, but it is not necessary to write ©reduction
procedureªs at this stage.

Run AnaGram and use the ©Analyze Grammarª command to create parse
tables. If there are ©syntax errorsª in the grammar at this point,
you will have to correct them before proceeding, but you do not
necessarily have to eliminate ©conflictsª, if there are any, at this
time. There are, however, many aids available to help you with
conflicts. These aids are described in the AnaGram User's Guide, and
somewhat more briefly in the Online Help topics.

Once syntax errors are corrected, you can try out your grammar on the
sample input files using the ©File Traceª facility.
With File Trace, you can see interactively just how your grammar
operates on your test files. You can also use ©Grammar Traceª to
answer "what if" questions concerning input to the grammar. The
Grammar Trace does not use a test file, but rather allows you to make
input choices interactively.

At any time, you can write ©reduction procedureªs to process your
input data as its components are identified in the input stream. Each
procedure is associated with a ©grammar ruleª. The reduction
procedures will be incorporated into your parser when you create it
with the ©Build Parserª command.

By default, unless you specify an input procedure, ©parser inputª
will be read from stdin, using the default ©GET_INPUTª macro.
You will probably wish to redefine GET_INPUT, or configure your
parser to use ©pointer inputª or ©event drivenª input.
##

License, Copyright, Copying, Open Source, Warranty, No Warranty

AnaGram, A System for Syntax Directed Programming

Copyright 1993-2002 Parsifal Software

Copyright 2006, 2007 David A. Holland

All Rights Reserved.

AnaGram itself is released to the public under the traditional 4-clause BSD
license:

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are
   met:

   1. Redistributions of source code must retain the above copyright notice,
   this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.

   3. All advertising materials mentioning features or use of this software
   must display the following acknowledgement:
      This product includes software developed by Parsifal Software,
      Jerome T. Holland, and their contributors.

   4. Neither the name of Parsifal Software nor the name of Jerome T.
   Holland nor the names of their contributors may be used to endorse or
   promote products derived from this software without specific prior written
   permission.

   THIS SOFTWARE IS PROVIDED BY PARSIFAL SOFTWARE,
   JEROME T. HOLLAND, AND CONTRIBUTORS ``AS IS'' AND ANY
   EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
   AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
   IN NO EVENT SHALL PARSIFAL SOFTWARE, JEROME T. 
   HOLLAND, OR THE CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
   INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
   PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
   USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
   HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
   WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
   OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.

The AnaGram ©parsing engineª, that is, the code that is emitted by
AnaGram and incorporated into programs developed using AnaGram, uses
this less restrictive zlib-style license:

   This software is provided 'as-is', without any express or implied warranty.
   In no event will the authors be held liable for any damages arising from
   the use of this software.

   Permission is granted to anyone to use this software for any purpose,
   including commercial applications, and to alter it and redistribute it
   freely, subject to the following restrictions:

   1. The origin of this software must not be misrepresented; you must not
   claim that you wrote the original software. If you use this software in a
   product, an acknowledgment in the product documentation would be
   appreciated but is not required.

   2. Altered source versions must be plainly marked as such, and must not
   be misrepresented as being the original software.

   3. This notice may not be removed or altered from any source distribution.

##