author David A. Holland
date Mon, 13 Jun 2022 00:40:23 -0400

\chapter{Exploring Your Grammar I: Traces}

\section{Introduction}

AnaGram provides two important facilities to help you become familiar
with the workings of your parser: the File Trace and the Grammar
Trace.  The File Trace takes its input from a test file and, under
your control, parses the input in accordance with your grammar.  The
Grammar Trace is completely interactive.  In each state it presents
you with a list of acceptable tokens and lets you select the one you
wish.  The File Trace lets you see how your grammar will deal with a
specific input file.  The Grammar Trace lets you answer ``what if?''
questions.

AnaGram also provides several ready-made traces to help you deal with
specific problems.  The
\index{Error Trace}\index{Trace}\index{Window}\agwindow{Error Trace}
can show you why your parser has diagnosed a syntax error.  The
\index{Conflict Trace}\index{Trace}\index{Window}\agwindow{Conflict
Trace} and
\index{Reduction Trace}\index{Trace}\index{Window}\agwindow{Reduction
Trace} can help you identify the sources of ambiguities in your
grammar.  The
\index{Keyword Anomaly Trace}\index{Trace}\index{Window}\agwindow{Keyword
Anomaly Trace} can help you understand the genesis of a keyword
anomaly in your grammar.  In addition, the
\index{Trace}\index{Auxiliary Trace}\index{Window}\agmenu{Auxiliary Trace}
selection in the
\agmenu{Auxiliary Windows} popup menu enables you to get a trace to
any state you wish or to transmute a File Trace into a Grammar Trace.

The trace functions serve several purposes.  They are an ideal way to
learn how syntax-directed parsing works, since they make each step
clearly visible.  They are also useful in determining whether your
grammar works as you wish it to.  You can try your grammar out on real
data before you actually build a parser.  If your grammar has
problems, such as conflicts, you can use the trace functions to track
down the problems, to understand them, and to verify your corrections.

You may have any number of File and Grammar Traces active
simultaneously.

Remember that AnaGram usually uses short-cut parsing actions whenever
possible.  If you wish to see just the standard four parsing actions,
you may want to set the \agparam{traditional engine} switch.  This
will, however, substantially increase the size of your parser and
reduce its performance.
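
As a minimal sketch, assuming the usual bracketed configuration section
of a syntax file, the switch could be turned on as follows.  The rest
of the syntax file is omitted, and the comment is explanatory only.

\begin{verbatim}
[
  /* illustrative fragment: restrict the parser to the
     four standard parsing actions */
  traditional engine
]
\end{verbatim}

Since the switch makes the parser larger and slower, it is probably
best removed again once you have finished exploring the grammar.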

\section{Trace Windows}

The \agwindow{File} and \agwindow{Grammar Trace} windows normally each
contain three panes: the \agwindow{Parser Stack} pane, an input pane,
and the \agwindow{Rule Stack} pane.  If your grammar uses semantically
determined productions, the Reduction Choices pane will appear when
necessary to allow you to select a reduction token.

In the File Trace, the input, or \index{Test File pane}\agwindow{Test
File}, pane displays the test file.  You need only double click on
a point in the pane to test your grammar up to that point in the
file.  Differently colored backgrounds are used to distinguish the
part of the file that has been parsed from the part that has not yet
been parsed.

In the Grammar Trace, the input, or \index{Allowable Input
pane}\agwindow{Allowable Input}, pane displays a list of tokens which
are allowable input in the current state of the parser.  You may
select tokens from the \agwindow{Allowable Input} pane, one by one, or
use the text entry field to type input to be parsed.

The active pane has a distinctively colored title panel and cursor
bar, except for the \agwindow{Test File} pane in the \agwindow{File
Trace}, which has no cursor bar.  You can use the Tab key to move among
the panes. The function of other keyboard keys depends on which pane
is active.

Along the bottom of the trace windows is a toolbar which provides
status information as well as control buttons.

\subsection{Parser Stack Pane}

The \index{Parser Stack pane}\agwindow{Parser Stack} pane, the upper
left pane of the \agwindow{File Trace} and \agwindow{Grammar Trace}
windows, displays the parser stack for the current trace.

Each line corresponds to one level in the parser state stack, showing
the stack index, the parser state for that level, and the token which
was seen at that state.  The last line of the stack, the lookahead
line, corresponds to the current state of the parser.  Since no input
has yet been processed for this state, the token, if any, which
appears at this level is a lookahead token.

In the File Trace, the token on the lookahead line corresponds to the
character at the parse location in the Test File pane.  In the Grammar
Trace, the lookahead line is empty whenever all previous input to the
parser has been completely parsed; otherwise it will display the most
recently selected input token.

If you move the cursor in the Parser Stack pane of a File Trace, the
text that makes up the selected token will be highlighted in the Test
File pane.  You can back the parse up to any desired stack level by
double clicking at the beginning of the token text in the Test File
pane.

Similarly, if you move the cursor bar in the Parser Stack pane of a
Grammar Trace, the \index{Allowable Input pane}\agwindow{Allowable
Input} pane will change to display the allowable tokens in the
selected state.  The previously selected token will be highlighted.
Then, double click on any token in the Allowable Input pane to back
the parse up and choose a token a second time.

The \index{Rule Stack pane}\agwindow{Rule Stack} pane (see below) of
the \agwindow{File} or \agwindow{Grammar Trace} is also synched to the
\agwindow{Parser Stack} pane.  If the syntax file window is visible,
it will be synched to show the rule currently selected in the
\agwindow{Rule Stack} pane.  Note that rules that have been
automatically generated by the expansion of virtual productions cannot
be synched, so the top line of the syntax file will be highlighted
instead.

If you right-click on a highlighted line in the Parser Stack pane, you
will get a popup menu to give you more information.  In particular,
you can get an \agwindow{Auxiliary Trace} starting at the current
point in your \agwindow{File} or \agwindow{Grammar Trace}, so you can
explore various possibilities without losing your position in the old
trace.

\subsection{Rule Stack Pane}

The \index{Rule Stack pane}Rule Stack pane appears across the bottom
of a \agwindow{Grammar Trace} or \agwindow{File Trace} window.  It
provides an alternate view of the parser stack for the trace, showing,
for each state, rules instead of the tokens that you see in the
\agwindow{Parser Stack} pane.  Because it is synched with the syntax
file window, the \agwindow{Rule Stack} makes it easy to see the
relationship between the trace and your grammar.

For each level of the parser stack, the \agwindow{Rule Stack} shows
the parser state number and all the active rules. The active rules at
any state consist of all the expansion rules for the state that are
consistent with the input at all subsequent states.

Except for the last level of the stack, each rule has a
\index{Token}\agterm{marked token}, which in the default
configuration is displayed in bold, italic type.  The significance of
the marked token is that all tokens in the rule to the left of the
marked token have already been matched in the input, and the input in
subsequent levels is consistent so far with the marked token. As more
input is processed, rules that are inconsistent with the new input are
deleted from the display.

The last level of the stack shows the current state of the parser and
the rules against which the lookahead token will be matched.  At this
level, there may be rules with no marked tokens.  These are rules
which have been matched exactly in the input.  If there is more than
one such rule, at the next parser step the parser will use the
lookahead token to determine which rule to reduce.

In the last level of the stack, marked tokens represent the input the
parser expects to see next.

The \agwindow{Rule Stack} pane is synched with the syntax file window
if it is visible so that the rule highlighted in the \agwindow{Rule
Stack} can be seen in context in the syntax file.  For rules that
AnaGram generated automatically to implement virtual productions or
the disregard statement, the cursor bar will move to the top of the
syntax file window.

The \agwindow{Rule Stack} pane is also synched with the other panes in
the trace.  As you move the cursor bar in the \agwindow{Rule Stack},
the cursor bar in the \agwindow{Parser Stack} pane will track the
stack level in the \agwindow{Rule Stack}.  In a \agwindow{File Trace},
text will be highlighted in the \agwindow{Test File} pane
corresponding to the selected token in the \agwindow{Parser Stack}
pane.  In a \agwindow{Grammar Trace}, the marked token in the
highlighted rule will be highlighted in the \agwindow{Allowable Input}
pane.

Clicking the right mouse button in the \agwindow{Rule Stack} pane pops
up an \agmenu{Auxiliary Windows} menu to give you more information
about the highlighted rule.  The \agmenu{Auxiliary Windows} menu
offers four choices keyed to the marked token: \agmenu{Expansion
Rules}, \agmenu{Productions}, \agmenu{Set Elements} and \agmenu{Token
Usage}.  The \agmenu{Keywords}, \agmenu{State Definition} and
\agmenu{State Expansion} options are keyed to the state number.  The
\agmenu{Expansion Chain} and \agmenu{Rule Context} options are keyed
to the highlighted rule.  Note that by the very nature of the rule
stack, a completed rule may occur only at the last level.

\subsection{Reduction Choices Pane}
\index{Reduction Choices pane}

The \agwindow{File Trace} and \agwindow{Grammar Trace} display a
\agwindow{Reduction Choices} pane when they need to reduce a
semantically determined production.  The rule to be reduced is
highlighted in the \agwindow{Rule Stack} pane.  The syntax file
window, if visible, will show this rule highlighted if it is one that
appears directly in your grammar.

The \agwindow{Reduction Choices} pane lists all possible reduction
tokens for the specified rule.  The first reduction token that is
admissible in the current context is highlighted and it appears as the
lookahead token in the \agwindow{Parser Stack} pane.  In the
\agwindow{File Trace}, text that comprises the entire rule is
highlighted in the \agwindow{Test File} pane.

Select the desired reduction token before continuing with the parse.

If you select a token and it does not appear as the lookahead token,
it is not syntactically correct in the current context.  If you try to
proceed with the parse, you will get a selection error.

\subsection{Parse Status}
\index{Parse Status}

Both the \agwindow{File} and \agwindow{Grammar Trace} have a
\agwindow{Parse Status} field on the toolbar at the bottom of the
window to indicate the current state of the parser.  The possible
values are as follows:
\begin{itemize}
\item \agmenu{Ready}: The parser is ready for input.
\item \agmenu{Running}: The parser is processing input.
\item \agmenu{Parse Complete}: According to the grammar, no further
input is expected.  Click on \agmenu{Reset} or \agmenu{Reload} to
restart the parse.
\item \agmenu{Syntax error}: A syntax error has been encountered.  The
parser cannot go any further.
\item \agmenu{Unexpected end of file}: The parser has reached the end
of the actual input but the grammar still expects more.
\item \agmenu{Select reduction token}: The parser encountered a
semantically determined production.  Select a reduction token from the
\agwindow{Reduction Choices} pane.
\item \agmenu{Selection error}: The reduction token selected from the
\agwindow{Reduction Choices} pane was not allowable input in the
present state.  Select another reduction token.
\end{itemize}


\section{File Trace}
\index{File Trace}\index{Trace}\index{Window}

\subsection{Starting File Trace}

To do a \agwindow{File Trace} you must first analyze the grammar you
wish to use.  Then select \agmenu{File Trace} from the \agmenu{Action}
menu.  Select a file for parsing and the trace will begin.  If you
have not analyzed a grammar, or if your grammar does not accept ASCII
characters, the \agmenu{File Trace} option will be grayed out.  You may
provide a mask for the test file name by setting the
\index{Test file mask}\index{Configuration parameters}\agparam{test
file mask} configuration parameter in your syntax file (see Appendix
A, Configuration Parameters).  AnaGram normally reads test files in
text mode, that is, carriage return characters are stripped. If you do
not want carriage return characters removed, you should set the
\index{Test file binary}\index{Configuration switches}\agparam{test
file binary} configuration switch.
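
For illustration, a configuration section that sets both parameters
might look like the following sketch.  The mask value \verb|*.txt| is
only an example, and writing it as a quoted string is an assumption;
see Appendix A for the exact form.

\begin{verbatim}
[
  /* illustrative values only */
  test file mask = "*.txt"
  test file binary
]
\end{verbatim}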

\subsection{Test File Pane}

The file under test is displayed in the input pane in the upper right
of the \agwindow{File Trace} window.  To parse to a specific point in
the file, double click at that point.  If you double click at a point
that precedes the current parse location, the parse will back up to
that point.  You may also use the cursor keys to control the parse.
As long as the parse location and the cursor are synchronized the
parse will track the cursor when you move the cursor using the cursor
keys.

If the parse encounters a syntax error, it will not be able to go
beyond the location of the error.  In this situation, moving the
cursor right or down will cause the cursor position to differ from the
parse location.  The parse and cursor positions can also differ if you
single click anywhere in the Test File pane.

If the parse location and the cursor are thus not synchronized, the
\agmenu{Single Step} button will be replaced with a \index{Synch
Parse}\agmenu{Synch Parse} button.  Click on the \agmenu{Synch Parse}
button to get the cursor and the parse back in synch.  Of course, the
parse will still not be able to proceed past a syntax error.

In the \agwindow{Test File} pane, text that has been parsed is shown
with a different background color from text that has not yet been
parsed.  (The default background color for parsed text is lighter.)
Initially no text has been parsed, and the caret is positioned at the
beginning of the file.  The \index{Parse Location}\agwindow{Parse
Location} box at the lower left of the \agwindow{File Trace} window
will show \textit{1:1} for line 1, column 1.  The \agwindow{Parse
Status} box next to it will say \agmenu{Ready}.

If your grammar uses semantically determined productions, the parse
will halt when one is encountered and the \agwindow{Reduction Choices}
pane will be displayed so you may select the appropriate reduction
token.

At any time you can click on the \index{Reset}\agmenu{Reset} button to
reset the parse to the beginning of the test file.  If you modify the
test file, you can click on the \index{Reload}\agmenu{Reload} button to
load the modified file and reset the parse.

Normally, AnaGram reads test files in ``text'' mode, discarding
carriage return characters. If your parser needs to recognize carriage
return characters explicitly, you should turn on the
\index{Test file binary}\agparam{test file binary} configuration switch.

\subsection{File Trace Toolbar}

The File Trace window has a toolbar at the very bottom of the window
which provides parse status information as described above and
contains buttons to help control the parse.  The buttons are as
follows:

\begin{itemize}
\item \index{Single Step}\agmenu{Single Step}: Advances the parse one
parser action at a time.
\item \index{Synch Parse}\agmenu{Synch Parse}: Replaces the \agmenu{Single
Step} button when the blinking cursor and the parse location do not
coincide.  Clicking on the \agmenu{Synch Parse} button will cause the
parse to back up to the blinking cursor, if the blinking cursor
precedes the parse location, or parse to the blinking cursor
otherwise.  If the parser encounters a syntax error before reaching
the cursor, the parse will stop at the error and the parse will still
not be in synch with the cursor.
\item \index{Parse File}\agmenu{Parse File}: Parses all the way to the
end of file. The parse will not stop until it encounters a syntax
error, a semantically determined production, or the end of file.
\item \index{Reset}\agmenu{Reset}: Resets the parse to its initial
state.
\item \index{Reload}\agmenu{Reload}: Reloads the test file from
disk.  This is useful if you have edited the test file since you last
loaded it into the \agwindow{File Trace}.
\end{itemize}


\section{Grammar Trace}
\index{Grammar Trace}\index{Trace}\index{Window}

A \agwindow{Grammar Trace} can be selected from either the
\agmenu{Action} menu or the \agwindow{Control Panel} toolbar.  With
it, you can examine the workings of your parser in detail.  Using
various options, you can set up representations of the
\index{Parser state stack}\index{State stack}\index{Stack}parser state
stack and parser state as they might appear in the course of execution
of your parser.  You can then examine the possible inputs and the
changes to the state and the state stack caused by any input you
choose.

Several of AnaGram's debugging facilities employ a ready-made
\agwindow{Grammar Trace} to direct you to the source of trouble.
These are the \agwindow{Conflict Trace}, the \agwindow{Reduction
Trace}, the \agwindow{Error Trace}, the \agwindow{Keyword Anomaly
Trace}, and the \agwindow{Auxiliary Trace}.

AnaGram now provides a text entry field where you can enter input
text for a \agwindow{Grammar Trace}, in addition to choosing tokens
from \agwindow{Allowable Input}.  This means
you can run a \agwindow{Grammar Trace} like a \agwindow{File Trace}
where the test file is replaced by text you can type in.  This is a
very convenient way to check out your grammar.

\subsection{Allowable Input Pane}
\index{Allowable Input pane}

The upper right pane of the \agwindow{Grammar Trace} window lists the
allowable input tokens for the current state of the grammar.  The
current state is the state selected by the cursor bar in the
\agwindow{Parser Stack} pane.

The tokens in the \agwindow{Allowable Input} pane are listed in two
groups: first, the terminal tokens allowable in this state, and
second, the nonterminal tokens.  Between these two groups of tokens is
inserted a line which is either an option for a default reduction, or
declares that there is no default action.

Double click, press Enter, or click the \agmenu{Proceed} button to
parse the highlighted token.  When all parse actions triggered by the
highlighted token have been completed, all panes of the trace will be
redrawn to show the new state of the parser.

If you wish to see the results of a single parser action, click on the
\agmenu{Single Step} button.  The parser will perform a single parser
action.  If the token you selected was not shifted in, it will now be
displayed as the lookahead token on the last line, the lookahead line,
in the \agwindow{Parser Stack} pane, and will be preselected in the
\agwindow{Allowable Input} pane.  Because AnaGram, by default, uses a
number of compound parser actions, this situation does not arise very
often unless you have set the
\index{Traditional engine}\agparam{traditional engine} switch or reset
the \index{Default reductions}\agparam{default reductions} switch.
Usually you will want to select the same token to proceed, but it is
not necessary.
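
If you do want to see a selected token sit on the lookahead line while
reductions are carried out one at a time, the
\agparam{default reductions} switch could be reset as in the following
sketch.  It assumes that a switch is reset by prefixing its name with
a tilde in the configuration section.

\begin{verbatim}
[
  /* illustrative fragment: turn default reductions off */
  ~default reductions
]
\end{verbatim}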

% XXX the << and >> render really poorly. They should be italic, too,
% but I can't get them to set that way.
The \agwindow{Allowable Input} pane also displays the parser action
associated with a specific token, and its result, provided it is not a
compound action.  The parser action field for a token may be
interpreted as follows: If this token would cause a shift to a new
state, the action field is $>>$ followed by the new state number.  If
the token would cause a reduction, the action field is $<<$ followed
by a rule number to show the rule reduced.  If the parser action is a
compound action, the action field is blank.  If the token would cause
the grammar to be accepted, the action field is \textit{Accept}.

If a parser action requires the reduction of a semantically determined
production, the \agwindow{Reduction Choices} pane will open.  Select a
reduction token to continue.

Note that selecting a token in \agwindow{Allowable Input} can cause a
syntax error under certain circumstances.  This can happen only if the
following conditions are all true:

\begin{itemize}
\item the indicated operation is a reduction,
\item the reduction token for the rule being reduced has been used in
several different contexts in the grammar,
\item and the specified token may follow it in some contexts and not
in others.
\end{itemize}
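
As an illustration of how these conditions can combine, consider the
following small grammar.  It is a hypothetical example, not taken from
any AnaGram sample, and the token names are arbitrary.

\begin{verbatim}
grammar
 -> '(', item, ')'
 -> '[', item, ']'

item
 -> 'a'
\end{verbatim}

After the input \verb|(a| the only possible operation is the reduction
of \verb|item -> 'a'|.  Because \verb|item| is used in both the
parenthesized and the bracketed context, a parser whose tables merge
the two contexts may treat both \verb|)| and \verb|]| as triggering
that reduction.  If \verb|]| is selected, the reduction takes place,
but the state reached after \verb|(| and \verb|item| accepts only
\verb|)|, so a syntax error results.  Whether AnaGram actually lists
\verb|]| in \agwindow{Allowable Input} for this grammar depends on how
its tables are built; the sketch is meant only to show the mechanism.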

\subsection{Text Entry Field}
\index{Text Entry field}

It is sometimes more convenient to enter text in the text entry field
on the \agwindow{Grammar Trace} toolbar than to select individual
tokens from the \agwindow{Allowable Input} pane.  By entering text you
can proceed quickly to the state you want without having to choose
each individual token en route.

After entering text, press Enter or click on the \agmenu{Proceed}
button to parse the text.  The text will be parsed in its entirety
unless a syntax error or semantically determined production is
encountered.

Click on the \agmenu{Single Step} button to work slowly through the
text step by step.

\subsection{Grammar Trace Toolbar}

The Grammar Trace window has a toolbar at the very bottom of the
window which provides parse status information as described above and
contains buttons to help control the parse.  The buttons are as
follows:

\begin{itemize}
\item \index{Proceed}\agmenu{Proceed}: Parses the token selected in
\agwindow{Reduction Choices} if that pane is active, or the token
selected in \agwindow{Allowable Input} if that pane is active, or the
content of the text entry field if that is active.  The parse
continues until the input has been entirely shifted in, a semantically
determined production has been encountered, or a syntax error has
been encountered.
\item \index{Single Step}\agmenu{Single Step}: Advances the parse one
parser action at a time.  Input is taken as for the \agmenu{Proceed}
button.
\item \index{Reset}\agmenu{Reset}: Resets the parse to its initial
state. This is especially useful if the \agwindow{Grammar Trace} is
one of the pre-built traces discussed below.  In such a trace the
initial state can be quite complex.
\end{itemize}


\section{Other Traces}

\subsection{Error Trace}
\index{Error Trace}\index{Trace}\index{Window}

An \agwindow{Error Trace} is a preset \agwindow{Grammar Trace}
designed to help you deal with errors your parser encounters during
operation.  If the
\index{Error trace}\index{Configuration switches}\agparam{error trace}
configuration switch is set when AnaGram builds a parser, whenever the
parser detects:
\begin{itemize}
\item a \index{Syntax error}\index{Errors}syntax error,
\item a parser stack overflow, or
\item a reduction token error
\end{itemize}
it will write the current state stack to a file with the name of the
parser file and the extension
\index{etr}\index{File extension}\agfile{.etr}.  If such a file
already exists, it will be overwritten.
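
A minimal sketch of enabling this switch in the configuration section
of a syntax file follows; the rest of the grammar is omitted.

\begin{verbatim}
[
  /* illustrative fragment: have the generated parser
     write an .etr file when it detects an error */
  error trace
]
\end{verbatim}

With this setting, and assuming a parser file called \verb|calc.c| (a
hypothetical name), the description above suggests that the state
stack would be written to \verb|calc.etr|.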

When you analyze the grammar, AnaGram will enable the \agmenu{Error
Trace} option in the \agmenu{Browse} menu.  When you select the
\agmenu{Error Trace} option, you can select an error trace file. 
AnaGram will use it to create a grammar trace which reproduces the
state of the parser at the time of the error.  You may then work with
the trace as with any other grammar trace.  The \agmenu{Reset} button
will restore the trace to its initial condition.

Note that when the syntax file is modified, an error trace file may
become invalid.  If you try to load an error trace file that is older
than the syntax file, AnaGram will pop up a message box to warn you.

\subsection{Conflict and Reduction Traces}
\index{Conflict Trace}\index{Trace}\index{Window}
\index{Reduction Trace}\index{Trace}\index{Window}

The \agwindow{Conflict Trace} and its companion, the
\agwindow{Reduction Trace}, are ready-made \agwindow{Grammar Traces}
which can be invoked from a \agwindow{Conflicts} window or a
\agwindow{Resolved Conflicts} window using the \agmenu{Auxiliary
Windows} menu.  The \agwindow{Conflict Trace} shows one way, out of
perhaps many, to reach the conflict state highlighted in the
\agwindow{Conflicts} window.  The \agwindow{Reduction Trace}
progresses one step farther than the \agwindow{Conflict Trace},
showing the result of selecting a reduce action in the conflict state.
You can manipulate both traces just as you would a regular grammar
trace.  The \agmenu{Reset} button will reset the trace to its original
condition.

\agwindow{Conflict} and \agwindow{Reduction Trace} windows may be
obtained only from the \agwindow{Conflicts} window or the
\agwindow{Resolved Conflicts} window using the \agmenu{Auxiliary
Windows} popup menu.  For more information see the section on
conflicts in Chapter 7.

\subsection{Keyword Anomaly Trace}
\index{Keyword Anomaly Trace}\index{Trace}\index{Window}

The \agwindow{Keyword Anomaly Trace} is another ready-made grammar
trace which is invoked from a \agwindow{Keyword Anomalies} window
using the \agmenu{Auxiliary Windows} popup menu.  It shows one way,
out of perhaps many, to reach the anomaly highlighted in the
\agwindow{Keyword Anomalies} window.  You can manipulate the
\agwindow{Keyword Anomaly Trace} in the same way that you would a
regular grammar trace.  The \agmenu{Reset} button will restore the
trace to its original configuration.

The \agwindow{Keyword Anomaly Trace} may be obtained only from the
\agmenu{Auxiliary Windows} popup menu in the \agwindow{Keyword
Anomalies} window. For more information see the section on keyword
anomalies in Chapter 7.

\subsection{Auxiliary Trace}
\index{Auxiliary Trace}\index{Trace}

For many AnaGram windows, an \agwindow{Auxiliary Trace} is one of the
options on the right mouse button popup menu.  In most cases this is
simply a pre-built \agwindow{Grammar Trace} that leads to the state
specified by the highlighted line in the window.  In the case of the
\agwindow{File} and \agwindow{Grammar Traces}, selecting an
\agwindow{Auxiliary Trace} from the \agwindow{Parser Stack} pane popup
menu gives you a \agwindow{Grammar Trace} with the same
\agwindow{Parser Stack} pane as the original trace.

When using a \agwindow{File} or \agwindow{Grammar Trace}, an
\agwindow{Auxiliary Trace} can be useful for comparing the results of
input streams that diverge at a particular point.


\section{Trace Coverage}
\index{Trace Coverage}\index{Window}\index{Coverage}

When you run either \agwindow{File Trace} or \agwindow{Grammar Trace},
AnaGram counts the number of times each rule is reduced.  You can
display these counts by selecting the \agmenu{Trace Coverage} window
in the \agmenu{Browse} menu.  The window displays all of the rules in
your grammar in the order in which they were encountered in your
syntax file.  The window is synched to your syntax file window so that
you can easily see each rule in context.  The first column of the
table gives the number of times each rule has been reduced by the
\agwindow{File Trace} process since you last analyzed your grammar.
Note that if, when tracing a file, you back up and try a portion of
input over and over, the counts for the rules involved will be
inflated relative to the other rules in the grammar.  Thus, the
primary use of the \agwindow{Trace Coverage} table is to determine
whether there are rules that are not tested at all.  Because AnaGram
normally uses a number of short-cut actions in its parsing tables,
some rules will not be counted even though they are obviously found in
the grammar.  These rules all have length zero or one and have no
reduction procedures.  If you set the
\index{Rule coverage}\index{Configuration switches}\agparam{rule coverage}
configuration switch, AnaGram will turn off some of the optimization
so that you will get a more accurate count.
% XXX shouldn't ``found in the grammar'' above be ``found in the input''?