diff doc/misc/html/examples/fc.html @ 0:13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
author | David A. Holland |
---|---|
date | Sat, 22 Dec 2007 17:52:45 -0500 |
parents | |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/doc/misc/html/examples/fc.html Sat Dec 22 17:52:45 2007 -0500 @@ -0,0 +1,752 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> +<HTML> +<HEAD> +<TITLE> Fahrenheit-Celsius Converter</TITLE> +</HEAD> + + + + +<BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif" + TEXT="#000000" LINK="#0033CC" + VLINK="#CC0033" ALINK="#CC0099"> + +<P> +<IMG ALIGN="right" SRC="../images/agrsl6c.gif" ALT="AnaGram" + WIDTH=124 HEIGHT=30 > +<BR CLEAR="all"> +Back to <A HREF="../index.html">Index</A> +<P> +<IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------" + WIDTH=1010 HEIGHT=2 > +<P> + +<H1> Fahrenheit-Celsius Converter</H1> +<IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------" + WIDTH=1010 HEIGHT=2 > +<P> +<H2>Introduction</H2> +<P> + Conversion of temperatures from Fahrenheit to Celsius is a + traditional starting point for learning how to use a + programming language. This directory contains a graded sequence + of Fahrenheit to Celsius conversion programs, starting with + a very simple case and working up to one of some complexity. + This sequence of programs illustrates an important aspect of + syntax directed programming: In contrast to conventional + programming methods it is quite easy to begin with a simple + case and then extend it to more complex situations. +<P> + All of these programs accept input from <tt>stdin</tt> and write + output to <tt>stdout</tt>. These programs are somewhat exceptional, + since, except for FC5, they do not have any embedded C and + therefore do not require explicit definition of a main + program. +<P> + <tt>fc1</tt> is the first and simplest of the Fahrenheit to Celsius + conversion programs. It expects the user to type a positive + integer value, assumed to be a Fahrenheit temperature, which + it converts to Celsius. It then exits. +<P> + <tt>fc2</tt>, the next example, is a somewhat more interesting + Fahrenheit-Celsius converter. 
This time the input stream may + contain any number of temperatures, either Fahrenheit or + Celsius, each terminated by a newline character. The + program will continue until it encounters an end of file. If it + encounters a syntax error in the input, it will skip to the + next newline character and continue. <tt>fc2</tt> has been set up to + illustrate the usage of the File Trace feature of AnaGram. +<P> + <tt>fc3</tt> adds two new features to <tt>fc2</tt>: It uses + floating point + arithmetic, so that it can deal with non-integral values, + and it allows optional white space in the input, except + within numbers. In addition, it changes the output format, + so that results are printed in degrees Kelvin, as well as in + Fahrenheit and Celsius. + <font size=-1>(Yes, we know that in Newspeak the official + usage is "Kelvins", not "degrees Kelvin". Shush.)</font> +<P> + <tt>fc4</tt> illustrates a shift-reduce conflict which arose when + modifying the <tt>fc3</tt> grammar to allow input in degrees Kelvin. + You should probably skip this example until you encounter a + shift-reduce conflict in one of your own grammars. <tt>fc4a</tt> and + <tt>fc4b</tt> are two different resolutions of the conflict. +<P> + <tt>fc5</tt> illustrates the use of an event driven parser. The + actual grammar is the same as <tt>fc4b</tt>. The only difference is + in the method of providing input to the parser. +<P> + +<H2>FC1</H2> + <tt>fc1</tt> is the first and simplest of the Fahrenheit to Celsius + conversion programs. It expects the user to type an integer + value, assumed to be a Fahrenheit temperature, which it + converts to Celsius. It then exits. 
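For comparison with the description of fc1 above, here is a minimal hand-written C sketch of the same behavior: accumulate decimal digits into an int, then convert the result from Fahrenheit to Celsius. It is not the code AnaGram generates, and it is not the text of the fc1 reduction procedures. The function names (append_digit, parse_unsigned, fahrenheit_to_celsius), the -1 error return, and the choice of truncating integer arithmetic are all assumptions made for illustration.

```c
#include <assert.h>

/* Fold one decimal digit into a running value: the usual
   "ten times the value so far, plus the new digit" step.
   Hypothetical helper, not part of AnaGram. */
static int append_digit(int value, char digit) {
    assert(digit >= '0' && digit <= '9');
    return 10 * value + (digit - '0');
}

/* Read an unsigned decimal integer that ends at a newline or at the
   end of the string. Returns -1 (an assumed convention) if no digit
   is present or if a non-digit appears before the end. */
static int parse_unsigned(const char *text) {
    int value = 0;
    int digits = 0;
    while (*text >= '0' && *text <= '9') {
        value = append_digit(value, *text);
        ++text;
        ++digits;
    }
    if (digits == 0 || (*text != '\n' && *text != '\0')) {
        return -1;
    }
    return value;
}

/* Integer Fahrenheit to Celsius. C integer division truncates
   toward zero, so results are not rounded. */
static int fahrenheit_to_celsius(int f) {
    return (f - 32) * 5 / 9;
}
```

A caller would read one line from stdin, pass it to parse_unsigned, and print the value together with fahrenheit_to_celsius of that value.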
+<P> + The following features of AnaGram are introduced in <tt>fc1</tt>: +<UL> +<LI> recursive definition of tokens </LI> +<LI> definition of a set as a range of characters</LI> +<LI> token type declaration </LI> +<LI> default token type </LI> +<LI> passing token values to reduction procedures </LI> +<LI> long and short form reduction procedures </LI> +</UL> + +<P> + <tt>fc1</tt> defines two nonterminal tokens, "grammar", which + describes the entire input stream, and "integer", which + describes a simple unsigned integer value. "grammar", defined + by one production, describes the input as consisting of an + "integer" followed by a newline character. There is a + following reduction procedure to print out both the input + and converted values. +<P> + "integer" is recursively defined by two productions. The + first production says that an "integer" may be represented + by a single decimal digit. '0-9' represents the set of ascii + characters on the range '0' through '9'. The token can be + matched by any character from this set. The second + production contains the recursion. It says that the combination of + any "integer" followed by another decimal digit is also an + integer. Note that the left side is the same for these two + productions and it need not be repeated. +<P> + Note the type cast preceding "integer". This type cast + defines the data type of the semantic value of "integer" to + be int. When the parser stores a token value for "integer" + on its value stack or retrieves a value from the stack, the + type of the data transmitted will be int. +<P> + Since there is no type cast for the token "grammar", the + data type for the semantic value of "grammar" is given by + the "default token type" configuration parameter, which + defaults to void. +<P> + The semantic values of the tokens in a grammar rule may be + passed to the associated reduction procedure. 
The name of + the variable in the reduction procedure is simply appended + to the token name or expression in the rule with a colon as + a separator. +<P> + All three reduction procedures in <tt>fc1</tt> operate on + the semantic + values of tokens on the parser stack as parameters. In + the first reduction procedure, the variable <tt>f</tt> represents the + value of the integer typed by the user. It is taken as a + Fahrenheit temperature and converted to Celsius by the + reduction procedure. This reduction procedure uses the long + form, consisting of an equal sign followed by a block of C + code. +<P> + The reduction procedures for the two productions for "integer" + convert the integer from ascii form as typed by the + user to binary form. The first reduction procedure + calculates the value of a single digit integer. The second + reduction procedure calculates the value for an integer with more + than one digit. Notice that these reduction procedures both + use the short form: a C expression terminated by a semicolon. + The value of the expression is saved as the semantic + value of the reduced token. +<P> + +<H3>Testing FC1</H3> + Run AnaGram and build a parser, <tt>fc1.c</tt>. Compile it + and link it + with your C compiler. Run <tt>fc1</tt> from the command line. + Type an integer and press + Enter. <tt>fc1</tt> will print out Fahrenheit and Celsius + temperatures. +<P> + + +<H2>FC2</H2> + The next example of a syntax file, <tt>fc2.syn</tt>, is a somewhat + more interesting Fahrenheit-Celsius converter. This time the + input stream may contain any number of temperatures. The + program will continue until it encounters an end of file. If + it encounters a syntax error in the input, it will skip to + the next newline character and continue. <tt>fc2</tt> has + been set up + to illustrate the File Trace feature of AnaGram. 
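The "skip to the next newline and continue" behavior can be pictured with a hand-written C loop. The sketch below is only an analogue of the observable behavior. It does not show how an AnaGram parser implements error recovery. The record layout (a signed integer, then 'f' or 'c' in either case, then a newline), the names tally and scan_temperatures, the output format, and the use of strtol are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Counts of lines that were converted and lines that were skipped. */
struct tally {
    int converted;
    int skipped;
};

/* Scan every line of input. A well-formed line is converted and
   printed; anything else is skipped up to and including the next
   newline, after which scanning resumes. Hypothetical helper. */
static struct tally scan_temperatures(const char *input) {
    struct tally t = {0, 0};
    const char *p = input;
    assert(input != NULL);
    while (*p != '\0') {
        char *end = (char *)p;
        long n = 0;
        int ok = 0;
        /* Require a sign or digit first, so that strtol cannot
           silently skip leading white space. */
        if (*p == '+' || *p == '-' || isdigit((unsigned char)*p)) {
            n = strtol(p, &end, 10);
            ok = end != p
                 && (*end == 'f' || *end == 'F' || *end == 'c' || *end == 'C')
                 && end[1] == '\n';
        }
        if (ok) {
            if (*end == 'f' || *end == 'F') {
                printf("%ld F = %ld C\n", n, (n - 32) * 5 / 9);
            } else {
                printf("%ld C = %ld F\n", n, n * 9 / 5 + 32);
            }
            ++t.converted;
            p = end + 2;               /* step past the letter and newline */
        } else {
            while (*p != '\0' && *p != '\n') {
                ++p;                   /* discard the rest of the bad line */
            }
            if (*p == '\n') {
                ++p;
            }
            ++t.skipped;
        }
    }
    return t;
}
```

In this sketch a bad line costs only that line: the loop always resumes at the character after the newline, which is the effect the text describes for fc2.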
+<P> + The following features of AnaGram are introduced in <tt>fc2</tt>: +<UL> +<LI> configuration section </LI> +<LI> character set defined by set union </LI> +<LI> end of file token </LI> +<LI> virtual productions </LI> +<LI> use of '?' to define a virtual production </LI> +<LI> error token resynchronization </LI> +<LI> File Trace </LI> +</UL> +<P> + <tt>fc2</tt> allows the temperature values to be signed + integers which + may be either Fahrenheit or Celsius, as determined by a + following 'f' or 'c'. Each temperature value to be + converted must be followed by a newline. Spaces in the input + are not allowed. +<P> + To facilitate testing, a configuration section has been + added at the beginning of the file to set two configuration + parameters, discussed below. The lines specifying the + parameters have been labeled C1 and C2 at the right margin + to make them easy to refer to in this documentation. + Similarly, productions have been labeled P1, P2, etc. +<P> + After the configuration section, an end of file token, eof, + is defined. Remember that when using stream I/O, the end of + file is signalled by a -1. +<P> + The first production, P1, describes the entire input file as + an optional sequence of temperatures followed by an end of + file. +<P> + The expression <CODE>[temperature, '\n']...</CODE> is a "virtual + production". The square brackets indicate the rule inside the + brackets is optional. The ellipsis (<CODE>...</CODE>) + indicates that the + rule may be repeated an arbitrary number of times. +<P> + Productions P2 and P3 define "temperature" in the normal + case. Production P4 controls error recovery, described + below. In AnaGram, <CODE>'f' + 'F'</CODE> represents the set + of characters + containing both upper and lower case 'f'. (The plus sign is + the union operator of set theory.) Either an upper or lower + case 'f' in the input will match this set. <CODE>'c' + + 'C'</CODE> is + similarly interpreted. 
Thus a temperature consists of a + number followed by an 'f' or 'c' which can be either upper + or lower case. The reduction procedures, of course, are + different for 'f' and 'c'. +<P> + Productions P5 and P6 define "number" as consisting of an + integer with an optional sign. <CODE>'+'?</CODE> is a virtual production. + The question mark indicates that the preceding element, in + this case the plus sign, is optional. These productions + allow you to write numbers in the form 17, +17, or -17. +<P> + Productions P7 and P8 define "integer" exactly as in <tt>fc1</tt>. +<P> + +<H3> Recovering and Continuing after a Syntax Error </H3> + The production P4 controls error recovery. "error" is a + special token in an AnaGram grammar, which can be matched by + any sequence of input which contains a syntax error. If your + grammar has an error token, when your parser encounters a + syntax error it looks to see if there is an error token to + match with the syntax error. If "error" is not admissible in + the current state, it discards the previous token on the + input stack and looks again. It continues until it gets back + to a state where "error" is acceptable input or the stack is + empty. If the stack is empty, it terminates the parse. + Otherwise, it then looks to see if the next input token is + admissible. If so, the parse continues. If not, the token is + discarded and the parser reads input until it finds an + acceptable token or the end of file. In this example, the + parser will read characters until it finds a newline character. + At this point the parse will continue as though nothing + had happened. This process is called "error token + resynchronization". It is one of several ways to continue after a + parser detects a syntax error in its input stream. +<P> + + +<H3> Configuration Parameters </H3> + Two configuration parameters have been set in the + configuration section of <tt>fc2</tt> to facilitate testing + using the File + Trace. 
The first, test file mask, limits the choice of test + files to be used with the File Trace option to files with + the extension <tt>.fc2</tt>. The second, traditional engine, turns + off certain optimizations AnaGram normally builds into its + parsers. When the traditional engine switch is set, the + parsers AnaGram builds use only the four traditional parser + actions: shift, reduce, error, and accept. Otherwise, + AnaGram parsers use a number of compound actions in order to + reduce the size of the parsing tables and increase the speed + of the parser. In this case, the traditional engine switch + has been turned on in order to make the behavior of the + parser as seen with the File Trace correspond to textbook + behavior. +<P> + +<H3> Testing FC2 </H3> + Test <tt>fc2</tt> just as you did <tt>fc1</tt>: Run AnaGram + and build a + parser, <tt>fc2.c</tt>. Compile it and link it with your C + compiler. Run + <tt>fc2</tt> from the command line. Type an integer, with or + without a sign, the letter 'f' + or 'c', and press Enter. <tt>fc2</tt> will print out Fahrenheit and + Celsius temperatures. Repeat until you are satisfied the + program works. Try making a few deliberate typos to test the + error token resynchronization. Note that the parser + automatically provides error diagnostics. These diagnostics + are created by the default <tt>SYNTAX_ERROR</tt> macro. +<P> + Type ^Z and Enter (Windows) or ^D (Unix) to generate an end + of file and end the program. (Of course, you + could also use ^C or ^Brk.) +<P> + Alternatively, you may use the test file, <tt>test.fc2</tt>, + which has + been provided for use with the File Trace (see below). + Simply run <tt>fc2</tt> with input redirection to take input from + <tt>test.fc2</tt>. At the command prompt, type: +<PRE> + fc2 < test.fc2 +</PRE> +<P> + +<H2> File Trace </H2> + The File Trace feature of AnaGram allows you to test a + grammar without actually building a parser. 
This enables you to + completely decouple the debugging of the grammar from the + debugging of the reduction procedures. You can try out test + files before you have written anything more than the + grammar. This allows for very early testing in your projects. +<P> + + File Trace allows you to see in fine detail just exactly how + your parser will analyze an input file. A File Trace consists of a + window with various panes so you can see what is going on, + and an interpretive parser which works in the background. +<P> + You can select File Trace from the Action + menu once you have analyzed your grammar. For the <tt>fc2</tt> + example, you will be offered a choice of test files with + extension <tt>.fc2</tt>. A good choice would be <tt>test1.fc2</tt>. + The File Trace window will show a Parser Stack + pane to your left, a Test File pane, which shows you the + input file you are parsing, to your right, and a Rule Stack + pane across the bottom of the window. +<P> + The way the File Trace parser works is this: Initially none of + the test file has been parsed. If you double-click with the + left mouse button at a point in the Test File pane, the parser + parses to that point. The unparsed part of the file will be + colored differently from the parsed part (in the default + color scheme, parsed characters + have a lighter background). To back up the parse to a + previous location, double-click at that spot, or single-click + and press Enter or the Synch Parse button at the bottom of + the File Trace window. To check a file for syntax errors, all + you need to do is to click the Parse File button. If there is + a syntax error, the parse will not advance beyond the error + point. Normally, however, you will probably want to proceed + more deliberately, moving the cursor one character at a + time. +<P> + If you wish to see even finer detail, you may make the + parser work in single step mode by clicking on Single Step + or pressing Enter. 
Each time you click on the Single Step + button or press Enter, the parser will perform one parser + action. Note that in its normal configuration, AnaGram + produces parsers that use a number of parser actions more + complex than the traditional shift and reduce actions. +<P> + The Parser Stack pane shows the levels of the parser stack, the + state numbers on the stack and the tokens that have been + recognized so far. + As you advance through the test file, you can see by + looking at the Parser Stack pane how the parser stack changes + as characters are shifted in and reductions occur. The Rule + Stack pane is an alternate view of the parser stack showing + the grammar rules in play at any moment. Notice how the + syntax file window is synched with the Rule Stack. +<P> + If the parse position is not located at the blinking cursor, + the Single Step button will be changed to read "Synch Parse". + Clicking on the button will move the parse position to the + cursor. +<P> + If you now click on tokens at various levels in the Token + Stack, the Test File characters corresponding to these tokens + will be highlighted. You can restart the parse at any level + by double-clicking just preceding the highlight. The Rule + Stack is also synched with the Token Stack and Test File + panes. +<P> + You may interrupt the File Trace at any time to inspect any + other window without interfering with the File Trace. + Whenever you come back to it, you can proceed as though + nothing has happened. +<P> + If you have a long file, a complex grammar, or a (very) slow + computer, it can sometimes take a while for the parse to catch + up with the cursor. If you have a long test file and press + Parse File to move the cursor to the end of the file, the + parser has a lot of computation to do to catch up. +<P> + For further details about the File Trace, please refer to the + AnaGram User's Guide and the on-line documentation. 
+<P> +<BR> + + +<H2> FC3 </H2> + <tt>fc3</tt> adds two new features to <tt>fc2</tt>: It uses + floating point + arithmetic, so that it can deal with non-integral values, + and it allows optional white space, including comments, in + the input. In addition, it changes the output format, so + that results are printed in degrees Kelvin, as well as in + Fahrenheit and Celsius. +<P> + The following features of AnaGram are introduced in <tt>fc3</tt>: +<UL> +<LI> Setting the default token type </LI> +<LI> The "disregard" statement </LI> +<LI> The "lexeme" statement </LI> +<LI> Right recursion </LI> +<LI> Default value of a reduction token </LI> +</UL> +<P> + +<H3> Floating Point Arithmetic </H3> + To deal with floating point arithmetic, a number of new + productions have been added and two productions have been + changed. Productions P5 and P6 have been changed to define a + "number" in terms of an "unsigned number" instead of an + "integer". Productions P6a, P6b, and P6c define "unsigned + number" in terms of its integer part and its fraction part. + Productions P9 and P10 define the fraction part of the + number. Note that "fraction" is described using right + recursion rather than left recursion. This makes the reduction + procedures neater. Since reduction does not occur until + a rule is complete, note that with right recursion each new + digit causes the stack depth to increase by one. You can + observe this with the File Trace. +<P> + In order to replace integer arithmetic with floating point + arithmetic, statement C3 was added to the configuration + section of the grammar. It declares the default token type, + the type assigned to nonterminal tokens absent a specific + declaration, to be "double". Since we are not interested in + values for "grammar" and "temperature", they have been + explicitly cast to "void". +<P> + Note that production P6a does not have a reduction procedure. 
+ If a rule has no reduction procedure the value of the + reduction token defaults to the value of the first element + in the rule. In this case, the value of "unsigned number", + in the absence of a reduction procedure, is taken to be the + value of "integer". +<P> + +<H3> Skipping White Space </H3> + Two statements, C4 and C5, are used to skip over uninteresting + white space in the input to <tt>fc3</tt>. The "disregard" statement + instructs AnaGram to rewrite your grammar in a standard + way so that your parser will skip over any instance of the + "white space" token that occurs between lexemes, or lexical + units, in the input to your parser. Of course, you can use + any token name you wish in a "disregard" statement. You can + even have multiple "disregard" statements and all of the + tokens you specify will be disregarded. +<P> + The "lexeme" statement is used to declare that certain + nonterminal tokens are each to be considered as indivisible + lexical units, or lexemes, from the point of view of lexical + analysis, so that the "disregard" statement is inoperative + inside the nonterminal tokens listed. In this case, statement + C5 simply guarantees that white space will not be + allowed inside a number. All terminal tokens are automatically + lexemes. <!-- when not part of a larger lexeme. --> +<P> +<!-- + It would be nice to rewrite this so the example expansion is + vaguely close to syntactically legal. +--> + To make your parser skip white space, AnaGram renames and + redefines the lexemes in your grammar and defines a number + of new tokens. For example, if '+' is a lexeme in your + grammar and your grammar is to disregard "white space", + AnaGram will rename the plus sign token as '+'%. It then + introduces a new production in your grammar as follows: +<PRE> + '+' + -> '+'%, white space?... +</PRE> + + This means that a plus sign followed by some white space + will now be treated the same, syntactically, as a plus sign + alone. 
The percent sign (a degree sign in AnaGram 1.x) is + used to indicate the original, or "pure", + definition of the token. +<P> + Productions P11 and P12 together define what is meant by + white space in this grammar. P11 defines white space to + include blanks and tab characters. P12 includes C style + comments (not nested). +<P> + The white space defined in P11 and P12 does not include + newline characters. There is a good reason for this. The + grammar uses newline characters as delimiters, marking the + end of a "temperature". In order to allow blank lines, + production P1 was modified to make the temperature optional. +<P> + To allow "//" style comments, a new token, "end of line", + defined by production P13, was added. In production P1, '\n' + was then replaced by "end of line". +<P> + +<H3> Testing FC3 </H3> + The "test file mask" was changed to use files with the + extension ".fc3" in the File Trace. TEST.FC3 can be used as + input for the File Trace. +<P> + Build and compile <tt>fc3</tt> in the same way as previous versions. + When you run it, try using numbers with decimal fractions. + Try typing blanks in various places to see how the parser + deals with them. Try redirecting input from <tt>test.fc3</tt>. +<P> + + +<H2> FC4 </H2> + <tt>fc4</tt> illustrates a shift-reduce conflict in a grammar. You + should probably skip this example until you encounter a + shift-reduce conflict in one of your own grammars. In the + meantime, skip ahead to <tt>fc5</tt>. +<P> + The following features of AnaGram are introduced in <tt>fc4</tt>: +<UL> +<LI> Conflicts window </LI> +<LI> Auxiliary Windows menu </LI> +<LI> State Definition window </LI> +<LI> Expansion Chain window </LI> +</UL> +<P> + In <tt>fc3</tt>, the output was changed to provide results in degrees + Kelvin, as well as in Fahrenheit and Celsius. In <tt>fc4</tt>, a + production, P3a, is added to accept input in degrees Kelvin. 
+ Since there is no such thing as a negative temperature on + the Kelvin scale, it seemed appropriate to require that the + temperature be an unsigned number. This, however, caused a + conflict, i.e. an ambiguity, in <tt>fc4</tt>. In fact, it caused four + conflicts to be diagnosed, all of which have a common + source. +<P> + +<H3> Finding a Conflict </H3> + If you run AnaGram and analyze <tt>fc4</tt> you will find that + AnaGram finds conflicts in the grammar. To determine the nature + of the conflicts, you should first open the Conflicts + window. The Conflicts window is available via the Browse Menu + or a Control Panel button. +<P> + The first thing you see is that there are conflicts in + states S005 and S025. In each state, one conflict occurs + because a decimal point can either be shifted, in accordance + with rule R021, or it can reduce rule R014, an empty rule, + or null production. The other conflict occurs because a + decimal digit can either be shifted, in accordance with rule + R022, or it can reduce rule R014, the same rule that gives + trouble with the decimal point. +<P> + The first step in understanding the conflict is to see rules + R014, R021, and R022 in context. The Conflicts window is + synchronized with the syntax file window. Arrange these + windows so you can see them both at once. + Then, in the Conflicts window, move the cursor bar up and + down. In the syntax file window, the cursor will move + between productions for number, unsigned number and integer. +<P> + Although it is possible to recognize rules R021 and R022 in + the grammar, there is no explicit null production in the + grammar. To find out for sure what rule R014 is, pop up the + Rule Table (listed in the Browse menu) and look for R014. It + turns out that R014 is the null production that corresponds + to an optional plus sign, written '+'? in the grammar. 
+<P> + To get a better idea of what is going on, it is worthwhile to + find out what the parser is expecting to see in states S005 + or S025. To find out, click the right mouse button in the + Conflicts window on any line describing a state S005 + conflict to pop up the Auxiliary Windows menu. Select State + Definition to find out what state S005 is all about. It + seems that state S005 occurs when the parser has skipped + over the initial white space and is about to begin dealing + with the actual input. But, looking at this, it is still not + clear why a decimal digit or a decimal point is ambiguous. + The Expansion Chain windows for rules R021, R022 and R014 + will show how these rules are derived from the + characteristic rule for state S005. +<P> + + Return to the Conflicts window. Click the right mouse button + on rule R021, the rule that expects the decimal point, + to pop up the Auxiliary Windows menu. Select + Expansion Chain. The Expansion Chain window shows how rule + R021 derives from the characteristic rule for the state. + Each line in this window is a grammar rule produced by the + marked token in the rule on + the previous line. +<P> + Now return to the Conflicts window, and get the Expansion + Chain window for rule R014. Rearrange the windows on the + screen so you can compare the Expansion Chain windows for + rules R014 and R021. Note that rule R014 derives from the + production +<PRE> + temperature + -> number, 'c' + 'C' +</PRE> + and rule R021 derives from the production +<PRE> + temperature + -> unsigned number, 'k' + 'K' +</PRE> +<P> + It is now possible to see the nature of the conflict. On the + one hand, if the input is supposed to be a Kelvin + temperature, the parser can go right ahead accumulating a + number. On the other hand, if the input is supposed to be a + Celsius temperature, the parser has to first acknowledge the + optional plus sign. 
+<P> + It is the nature of LALR parsers that they can keep track of + many threads of possible parses simultaneously as long as + they don't have to do any reductions. When they come to the + end of a rule, however, they are forced to decide whether + the rule has been successfully matched. In this case, the + parser is at the end of a null production which arises from + the virtual production <CODE> '+'? </CODE> in P6 and is forced to decide + whether this null production has been matched. The conflict + diagnostics discussed above say that if the next token is a + digit or a decimal point, the parser cannot decide between + several possibilities. That is, it cannot decide whether or + not it has seen an elided plus sign. In effect the parser is + being required, because of the null production, to make a + premature decision as to whether a Kelvin or Celsius + temperature is present in the input. +<P> + The conflicts can be eliminated by rewriting the grammar so + the parser will not come to the end of a rule and be forced + to choose among the several threads of the parse until it + encounters the determining letter 'f', 'c', or 'k'. +<P> + Two ways to remove the conflict are illustrated. The first is + found in FC4A. In this grammar, production P6 has been + replaced with two productions, P6x and P6y. This is a + standard method of rewriting a grammar to eliminate null + productions. If you analyze <tt>fc4a</tt>, you will find that it no + longer has a conflict, so this solves the problem. +<P> + However, if you look at the <tt>fc4a</tt> grammar closely, you will + notice that +23.7K is not acceptable input, although common + sense suggests that you ought to be able to use a plus sign + on Kelvin temperatures. <tt>fc4b</tt> shows another way to fix the + grammar which deals with this quibble. In this grammar, + production P3a has been modified to allow an optional plus + sign before a Kelvin temperature. 
If you analyze <tt>fc4b</tt>, you + will find that this change also solves the problem. +<P> + In both these instances, the technique used to resolve the + conflicts was to rewrite the grammar so that there are no + differences between constructs upstream of the point where + they diverge. Another way to put it is this: In the original + grammar Kelvin temperatures were distinguished from + Fahrenheit and Celsius not only by the 'K' suffix, but also by the + optional plus sign. The essence of both fixes to the problem + is to remove this distinction. +<P> + Other AnaGram windows available from the Auxiliary Windows + popup menu for the Conflicts window such as Rule Derivation, Token + Derivation, Conflict Trace and Problem States are helpful in + tracking down conflicts. See the help messages and your + AnaGram User's Guide for further details. +<P> + +<H3>Testing FC4A and FC4B </H3> + Build and compile <tt>fc4a</tt> and <tt>fc4b</tt> in the + same way you built + previous versions. When you run the parsers, try using + Kelvin temperatures with and without leading plus signs. +<P> + The "test file mask" was changed to use files with the + extension ".fc4" in the File Trace. <tt>test.fc4</tt> can be + used as + input either for the parsers themselves or for the File + Trace. +<P> + + +<H2> FC5 </H2> + <tt>fc5</tt> illustrates the use of an event driven parser. The + grammar is essentially the same as for <tt>fc4b</tt>. The primary + difference is in the method of providing input to the + parser. The following features of AnaGram are introduced in + <tt>fc5</tt>: +<UL> +<LI> event driven parser </LI> +<LI> embedded C </LI> +<LI> main program </LI> +<LI> initializing and calling the parser </LI> +</UL> + + Statement C6 in the configuration section causes AnaGram to + build an event driven parser. An event driven parser is + first explicitly initialized and then called once for each + input unit. 
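The "initialize once, then call once per input unit" pattern can be sketched in plain C. The example below is a generic push-style recognizer for lines that hold an unsigned integer. It is not AnaGram output and does not reproduce the interface of the generated parser. The names push_parser, push_parser_init, push_parser_feed and feed_text, and the way results are returned, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* State carried between calls. The caller owns it; the recognizer
   never reads input by itself. Hypothetical type. */
struct push_parser {
    long value;   /* number accumulated on the current line     */
    int  digits;  /* how many digits the current line has so far */
    int  bad;     /* nonzero once a non-digit has been seen      */
};

/* Must be called once before the first character is fed. */
static void push_parser_init(struct push_parser *p) {
    assert(p != NULL);
    p->value = 0;
    p->digits = 0;
    p->bad = 0;
}

/* Feed exactly one character. Returns 1 and stores the number in
   *result when this character completes a valid line, else 0. */
static int push_parser_feed(struct push_parser *p, int c, long *result) {
    assert(p != NULL && result != NULL);
    if (c == '\n') {
        int ok = !p->bad && p->digits > 0;
        if (ok) {
            *result = p->value;
        }
        push_parser_init(p);          /* start fresh for the next line */
        return ok;
    }
    if (c >= '0' && c <= '9') {
        p->value = 10 * p->value + (c - '0');
        ++p->digits;
    } else {
        p->bad = 1;                   /* remember the error until newline */
    }
    return 0;
}

/* Driver: the caller, not the recognizer, walks the input. Returns
   the sum of all valid numbers found in text. */
static long feed_text(const char *text) {
    struct push_parser p;
    long total = 0;
    long r = 0;
    push_parser_init(&p);
    for (; *text != '\0'; ++text) {
        if (push_parser_feed(&p, (unsigned char)*text, &r)) {
            total += r;
        }
    }
    return total;
}
```

The point of the design is that control stays with the caller: the loop in feed_text could just as well take its characters from getchar(), a socket, or a keyboard event handler, and the recognizer would not change.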
+<P> + A small main program has been included in a block of + embedded C at the end of the file. This program calls the + initializer and then reads characters from <CODE>stdin</CODE> and passes + them on to the parser. Previous <tt>fc</tt> programs did not + include a + main program, but relied on AnaGram to create one. However, + AnaGram does not automatically create a main program for + parsers which are event driven, use pointer input, or have + any embedded C. Therefore a main program is necessary in + this syntax file. +<P> + Note that the default names for the initializer and parser + are <CODE>init_fc5</CODE> and <CODE>fc5</CODE> respectively. + Only event driven parsers + require that the user explicitly call the initializer + function. +<P> + In addition, a global constant, "zero", was defined in the + embedded C to provide the value of absolute zero, and the + reduction procedures were modified to refer to "zero" + instead of the explicit value. +<P> + This illustrates an important point about the C parser file + that AnaGram builds: All blocks of embedded C precede the + reduction procedures, so that the reduction procedures can + access all variables and definitions included in embedded + C, no matter where they are located in the file. +<P> + +<H3>Testing FC5 </H3> + Since <tt>fc5</tt> uses the same grammar as <tt>fc4</tt>, + you can use the same + test files for <tt>fc5</tt> as for <tt>fc4</tt>. +</P> + +<BR> + +<IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------" + WIDTH=1010 HEIGHT=2 > +<P> +<IMG ALIGN="right" SRC="../images/pslrb6d.gif" ALT="Parsifal Software" + WIDTH=181 HEIGHT=25> +<BR CLEAR="right"> +<P> +Back to <A HREF="../index.html">Index</A> +<P> +<ADDRESS><FONT SIZE="-1"> + AnaGram parser generator - examples<BR> + Fahrenheit-Celsius Converter<BR> + Copyright © 1993-1999, Parsifal Software. <BR> + All Rights Reserved.<BR> +</FONT></ADDRESS> + +</BODY> +</HTML> +