diff help2html/notes.txt @ 0:13d2b8934445

Import AnaGram (near-)release tree into Mercurial.
author David A. Holland
date Sat, 22 Dec 2007 17:52:45 -0500
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/help2html/notes.txt	Sat Dec 22 17:52:45 2007 -0500
@@ -0,0 +1,211 @@
+
+Notes on HTML version of help.src
+---------------------------------
+
+----- Overview of structure assumptions and treatment of help.src ------
+
+(a) The help.src file
+
+help.src consists of a sequence of topics, each with a title line and
+a topic body. They are only partly sorted by title line. There may or
+may not be "blank line"s between the topics. "Blank line"s, however,
+are necessary to separate some sections in the topic bodies. It was
+found necessary to disallow extra spaces or tabs in a "blank line",
+i.e. a blank line is just a newline (lf) char.  It turns out Jerry had
+already found it necessary to remove such extra spaces.
+
+There must be at least one "blank line" after a title line, preceding
+the topic body. A topic ends with an "end topic" line starting with
+"##" beginning in column 1. The end topic line may have some spaces
+before the terminating newline.
+
+The final eof line can have leading blanks and be preceded by "blank
+line"s as above.
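A minimal Python sketch of the topic-splitting rules in (a), offered only as an illustration. The real work is done by mhh5, which is an AnaGram parser; the function names here (is_blank_line, is_end_topic, split_topics) are hypothetical and do not come from help.src or mhh5.

```python
def is_blank_line(line):
    """A "blank line" is exactly one newline: no spaces or tabs allowed."""
    return line == "\n"


def is_end_topic(line):
    """An end-topic line starts with "##" in column 1.

    Trailing spaces before the newline are tolerated because only the
    first two characters are checked.
    """
    return line.startswith("##")


def split_topics(lines):
    """Group lines into topics, each a list of lines without its "##" line.

    Blank lines between topics are skipped. Anything left over after the
    last end-topic line is dropped (an assumption made for this sketch).
    """
    topics = []
    current = []
    for line in lines:
        if is_end_topic(line):
            if current:
                topics.append(current)
            current = []
        elif current or not is_blank_line(line):
            current.append(line)
    return topics
```

Each returned topic starts with its title line, followed by the blank line and the topic body.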
+
+
+(b) Title:
+
+Title line is one line only, begins in col. 1 with a "lead title char":
+
+  lead title char = char - blank - tab - ',' - bullet - '\n',  char=~eof
+
+Here a "bullet" is a control-G (bell) character, aka 0x7.
+
+The individual topics in the title are comma-separated. They may have
+these chars:
+
+   title char = char - ',' - bullet - '\n',  where  char=~eof
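The two character classes above can be sketched in Python as follows. This is an illustration under stated assumptions, not the mhh5 grammar: the names BULLET, is_lead_title_char and split_title are hypothetical, and stripping the spaces around each comma-separated name is an assumption the notes do not confirm.

```python
BULLET = "\x07"  # control-G (bell), the "bullet" defined in the notes


def is_lead_title_char(ch):
    """True if ch may start a title line: not blank, tab, comma, bullet or newline."""
    return ch not in (" ", "\t", ",", BULLET, "\n")


def split_title(line):
    """Return the comma-separated topic names of a title line.

    Returns None if the line cannot be a title line, i.e. it is empty,
    starts with a disallowed character, or contains a bullet.
    """
    line = line.rstrip("\n")
    if not line or not is_lead_title_char(line[0]):
        return None
    if BULLET in line:
        return None
    # Assumption: spaces next to the commas are not part of the names.
    return [name.strip() for name in line.split(",")]
```

For example, split_title("Data Type, Data Types\n") yields the two names "Data Type" and "Data Types".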
+
+
+(c) Topic body:
+
+Inspection of help.src suggested that there were 6 different kinds of
+paragraphs in a topic body, which were separated by blank lines. The
+first paragraph was always of type "text" - this turned out to be
+needed for the parsing. Remaining paragraphs could be text or look
+like a table, code, or several varieties of list.
+
+(c1) Text paragraphs
+The first line of a text paragraph had to begin with a 
+
+   lead topic char = char - blank - tab - bullet - '\n',  where char=~eof
+
+or with a single blank followed by a lead topic char. The leading
+blank seemed to occur only after a preceding table-type section, after
+a blank line.  It seems that Jerry uses this construct for some
+purpose, but it didn't seem to be necessary for mhh5 parsing.
+
+Some text paragraphs had code interspersed, not always separated with
+a newline, and some had a following "table" or list without a blank
+line as separator.
+
+(c2) Table paragraphs
+A paragraph with a tab in col 1 of the first line was assumed to be
+some sort of table. Sometimes there were several leading tabs.
+Sometimes there were further sequences of tabs in the rest of the
+line. Mostly there was just one set of tabs - these were easy to deal
+with. There are only a couple of cases of several sets of tabs, which
+correspond to 2 and 3-column tables. As of July 12, 2001, mhh5 uses the
+tabs to construct multi-column tables and the contents of the table
+cells are treated as code.
+
+A table can occur following a text paragraph, tab list or one-space
+list without an intervening blank line.
+
+Treating the whole table as code enclosed in <pre> </pre> tags,
+removing the leading tabs and any leading spaces, worked very well
+except for the 2 or 3-column tables.
+ 
+(c3) Code paragraphs
+If a paragraph began with a sequence of 2, 3, 4, 5, 6, 8, or 10 spaces
+(all these sequences occur in help.src) it is considered a code
+paragraph and enclosed in <pre> </pre> tags. In this case the leading
+spaces are preserved.
+
+(c4) List paragraphs
+If a paragraph begins with a "bullet" (control-G) character it is
+deemed to be some sort of list. Sometimes the list items are separated
+by blank lines and sometimes not. The bullet was variously followed by
+one space, two spaces, or a tab. The two-spaces version occurs only
+once in help.src (in part of What's New) and possibly has no
+significance for Jerry's parsing; maybe it could be eliminated.
+
+All 3 list types were turned into HTML unordered lists, with the
+leading bullet and spaces or tab swallowed. Since list items
+sometimes were separated with blank lines and sometimes not, it was
+necessary to keep an AgStack with the current list type on it -
+actually a multi-valued flag would have been sufficient but the stack
+might also be useful for tables.
+
+A one-space list can be followed by a table or code without an
+intervening blank line and a tab list can be similarly followed by a
+table.
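The rules in (c1) through (c4) classify a paragraph by how its first line begins. A rough Python sketch follows; classify_paragraph and its return labels are hypothetical names. It deliberately ignores the complications noted above (the first paragraph always being text, and tables, code or lists following another paragraph without a blank line), and it treats any run of two or more leading spaces as code rather than checking the specific counts seen in help.src.

```python
BULLET = "\x07"  # control-G (bell)


def classify_paragraph(first_line):
    """Classify a paragraph from the start of its first line."""
    if first_line.startswith(BULLET):
        rest = first_line[1:]
        if rest.startswith("\t"):
            return "tab list"
        if rest.startswith("  "):
            return "two-space list"
        if rest.startswith(" "):
            return "one-space list"
        return "unknown"
    if first_line.startswith("\t"):
        return "table"
    if first_line.startswith("  "):
        return "code"  # simplification: any run of 2+ spaces
    # Text: a lead topic char, optionally preceded by a single blank.
    rest = first_line[1:] if first_line.startswith(" ") else first_line
    if rest and rest[0] not in (" ", "\t", BULLET, "\n"):
        return "text"
    return "unknown"
```

The bullet test comes first because a list line is identified by the bullet itself, and only then by what follows it.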
+
+============================================================================
+
+
+July 1, 2001
+
+The list2 variation, with 2 leading spaces after the control G instead
+of 1 as in list1, appears to occur *only* in What's New (not What's
+New in AnaGram 2.0). There doesn't seem to be any reason for it in
+terms of how AnaGram online help treats it.
+
+Internal Error topic has a line of code with a period after it. This
+looks wrong and also looks wrong in the online help. Period should be
+removed from help.src.
+
+Keyword topic - the list here has extra line spacing for the last list
+items, plus an embedded code example, so spacing doesn't look too
+good.  Possibly the line spacing could be altered in help.src. Also,
+use of IF and IFF to demonstrate keyword lookahead is not a good
+choice for HTML as the I looks like a vertical divider in some
+fonts. Change to EX and EXT or GO and GOO?
+
+Also, the use of double quotes around "keyword" doesn't look proper in
+HTML and is probably not good even in help.src. Remove quotes in
+help.src?
+
+
+July 3, 2001
+
+The exit_flag topic has a table which does not seem to be generated in
+any regular manner and accommodates basically to the default help font
+used by AnaGram, which is not monospaced.  Combinations of spaces and
+tabs have been used to get alignment:
+
+AG_RUNNING_CODE (15 char) is preceded by a tab, then 4 spaces and 4 tabs,
+                          then = 0:tab and the explanation.
+AG_SUCCESS_CODE (15 char) - same form as above but 8 spaces and 4 tabs
+AG_SYNTAX_ERROR_CODE (20 char) -  3 spaces, 2 tabs
+AG_REDUCTION_ERROR_CODE (23 char) -  1 space, 1 tab
+AG_STACK_ERROR_CODE (19 char) -  3 spaces, 3 tabs
+AG_SEMANTIC_ERROR_CODE (22 char) -  1 space, 2 tabs
+
+It is desirable to keep the width to a minimum here because it governs 
+the width chosen by the browser for displaying *all* the text.
+It would be possible to create a table (ugh) or to swallow the spaces
+and tabs in favor of a single space.
+
+The What's New topic has an example under Bug Fixes which uses a lot
+of spaces before the reduction procedure. These make the HTML page too
+wide - and in the online help, they do not seem to appear at all,
+probably being replaced with a single space (?). This is a code
+paragraph, leading with 8 and 10 spaces, not a table paragraph; hard
+to see what to do other than change help.src.
+
+
+July 4, 2001
+
+mhh5 now inserts links properly. Turns out that some links in help.src
+include the "s" at the end inside the link if they are plural, even if
+the help topic is singular. So it is necessary to strip the "s" and
+test again for a matching topic title.
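The plural fallback described above amounts to a two-step lookup. A small Python sketch, assuming topic titles are held in a set and matched exactly (the notes do not say whether mhh5 compares case-sensitively); resolve_link is a hypothetical name.

```python
def resolve_link(link_text, topic_titles):
    """Return the topic title a link refers to, or None if there is none.

    First try the link text as written; if that fails and the text ends
    in "s", strip the "s" and test again.
    """
    if link_text in topic_titles:
        return link_text
    if link_text.endswith("s") and link_text[:-1] in topic_titles:
        return link_text[:-1]
    return None
```

Trying the unmodified text first matters: a topic whose title really does end in "s" is matched as written and never has its "s" stripped.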
+
+Topics still in need of some adjustment are:
+  exit_flag
+  Internal Error
+  Keyword
+  Virtual Production
+  What's New
+  Character Constant
+  Data Type
+  PCB_TYPE
+Some of these are discussed above. Mostly the problem is use of tabs 
+and spaces in help.src in a way that does not lend itself to general 
+rules to apply to the html.
+
+Still need to be able to replace <, >, and & in help.src with the
+appropriate entities before creating the html version.
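For illustration, this is what the entity replacement looks like in Python using the standard library's html.escape. It is a sketch only; mhh5 does this inside its own parser. The wrapper name escape_entities is hypothetical.

```python
import html


def escape_entities(text):
    """Replace &, < and > with &amp;, &lt; and &gt;.

    quote=False leaves single and double quotes untouched. html.escape
    replaces "&" before the other two, so the ampersands it introduces
    are not escaped a second time.
    """
    return html.escape(text, quote=False)
```

Presumably the replacement has to be applied to the help.src text before any HTML tags are added around it, since otherwise the generated tags would be escaped as well.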
+
+
+July 12, 2001
+
+mhh5 now inserts entities and can handle multiple-column tables (as
+detected by tab sequences in the table lines).
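Detecting columns from tab sequences can be sketched in Python as below. The assumption here is that each run of one or more tabs inside a table line separates two cells, and that leading tabs and spaces are dropped as described under (c2); split_table_row is a hypothetical name, and mhh5 may treat edge cases differently.

```python
import re


def split_table_row(line):
    """Split one table line into cells at each run of tabs.

    Leading tabs and spaces are removed first, so they do not produce
    an empty first cell.
    """
    body = line.rstrip("\n").lstrip("\t ")
    return re.split(r"\t+", body)
```

A line with a single set of leading tabs yields one cell, which corresponds to the simple one-column case; two or three cells correspond to the 2 and 3-column tables.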
+
+
+Aug 3, 2001
+
+mhh5.syn should have some comments inserted at the beginning to
+mention this notes.txt file and to say what the program does. It
+should also probably print a comment at the beginning of the html file
+along the lines of:
+
+  " All the information in this file may also be accessed from within
+AnaGram using Help Topics on the Help menu. This HTML file is provided
+because it is sometimes more convenient to read the topics with your
+browser, and because the browser can be used to search the whole file.
+This is a long file with many links; some older browsers slow to a
+crawl. Netscape works fine."
+
+I am holding off on doing this because it means another round of
+compilation and testing and Jerry has not been available to even look
+at the output or pass on the above wording.
+
+Moreover, the output file has not been tested with up-to-date versions
+of Microsoft Internet Explorer. The old version on Secondo really does
+slow to a crawl, so the file is not really usable there. If the newer
+Internet Explorer versions can't deal with it either, it is not clear
+whether modifying the HTML would improve matters.
+