diff help2html/notes.txt @ 0:13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
author | David A. Holland |
---|---|
date | Sat, 22 Dec 2007 17:52:45 -0500 |
parents | |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/help2html/notes.txt Sat Dec 22 17:52:45 2007 -0500
@@ -0,0 +1,211 @@
+
+Notes on HTML version of help.src
+---------------------------------
+
+----- Overview of structure assumptions and treatment of help.src ------
+
+(a) The help.src file
+
+help.src consists of a sequence of topics, each with a title line and
+a topic body. They are only partly sorted by title line. There may or
+may not be "blank line"s between the topics. "Blank line"s, however,
+are necessary to separate some sections in the topic bodies. It was
+found necessary to disallow extra spaces or tabs in a "blank line",
+i.e. a blank line is just a newline (lf) char. It turns out Jerry had
+already found it necessary to remove such extra spaces.
+
+There must be at least one "blank line" after a title line, preceding
+the topic body. A topic ends with an "end topic" line starting with
+"##" beginning in column 1. The end topic line may have some spaces
+before the terminating newline.
+
+The final eof line can have leading blanks and be preceded by "blank
+line"s as above.
+
+
+(b) Title:
+
+Title line is one line only, begins in col. 1 with a "lead title char":
+
+    lead title char = char - blank - tab - ',' - bullet - '\n', char=~eof
+
+Here a "bullet" is a control-G (bell) character, aka 0x7.
+
+The individual topics in the title are comma-separated. They may have
+these chars:
+
+    title char = char - ',' - bullet - '\n', where char=~eof
+
+
+(c) Topic body:
+
+Inspection of help.src suggested that there were 6 different kinds of
+paragraphs in a topic body, which were separated by blank lines. The
+first paragraph was always of type "text" - this turned out to be
+needed for the parsing. Remaining paragraphs could be text or look
+like a table, code, or several varieties of list.
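The topic and title rules in (a) and (b) can be sketched roughly as follows. This is only an illustration in Python, not the mhh5 grammar: the names `is_lead_title_char` and `split_topics` are hypothetical, trimming spaces around each comma-separated title is an assumption, and the final eof line is not modelled.

```python
BULLET = "\x07"  # control-G (bell), as defined in the notes


def is_lead_title_char(ch):
    """True if ch may start a title line (not blank, tab, ',', bullet, newline)."""
    return ch not in (" ", "\t", ",", BULLET, "\n")


def split_topics(text):
    """Split help.src-style text into (titles, body) pairs.

    A topic starts at a title line in column 1 and ends at a line
    starting with "##". Hypothetical helper; stripping spaces around
    each title is an assumption, not something the notes state.
    """
    topics = []
    titles = None  # None means we are between topics
    body = []
    for line in text.split("\n"):
        if titles is None:
            # Skip blank lines until a title line is found.
            if line and is_lead_title_char(line[0]):
                titles = [t.strip() for t in line.split(",")]
                body = []
            continue
        if line.startswith("##"):
            # End topic line; trailing spaces after "##" are allowed.
            topics.append((titles, "\n".join(body).strip("\n")))
            titles = None
            continue
        body.append(line)
    return topics
```

For example, a title line `Keyword, Keywords` yields the two titles `Keyword` and `Keywords` for the same topic body.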
+
+(c1) Text paragraphs
+The first line of a text paragraph had to begin with a
+
+    lead topic char = char - blank - tab - bullet - '\n', where char=~eof
+
+or with a single blank followed by a lead topic char. The leading
+blank seemed to occur only after a preceding table-type section, after
+a blank line. It seems that Jerry uses this construct for some
+purpose, but it didn't seem to be necessary for mhh5 parsing.
+
+Some text paragraphs had code interspersed, not always separated with
+a newline, and some had a following "table" or list without a blank
+line as separator.
+
+(c2) Table paragraphs
+A paragraph with a tab in col 1 of the first line was assumed to be
+some sort of table. Sometimes there were several leading tabs.
+Sometimes there were further sequences of tabs in the rest of the
+line. Mostly there was just one set of tabs - these were easy to deal
+with. There are only a couple of cases of several sets of tabs, which
+correspond to 2 and 3-column tables. As of July 12/01 mhh5 uses the
+tabs to construct multi-column tables and the contents of the table
+cells are treated as code.
+
+A table can occur following a text paragraph, tab list or one-space
+list without an intervening blank line.
+
+Treating the whole table as code enclosed in <pre> </pre> tags,
+removing the leading tabs and any leading spaces, worked very well
+except for the 2 or 3-column tables.
+
+(c3) Code paragraphs
+If a paragraph began with a sequence of 2, 3, 4, 5, 6, 8, or 10 spaces
+(all these sequences occur in help.src) it is considered a code
+paragraph and enclosed in <pre> </pre> tags. In this case the leading
+spaces are preserved.
+
+(c4) List paragraphs
+If a paragraph begins with a "bullet" (control-G) character it is
+deemed to be some sort of list. Sometimes the list items are separated
+by blank lines and sometimes not. The bullet was variously followed by
+one space, two spaces, or a tab.
+The two-spaces version occurs only
+once in help.src (in part of What's New) and possibly has no
+significance for Jerry's parsing; maybe it could be eliminated.
+
+All 3 list types were turned into HTML unordered lists, with the
+leading bullet and spaces or tab swallowed. Since list topics
+sometimes were separated with blank lines and sometimes not, it was
+necessary to keep an AgStack with the current list type on it -
+actually a multi-valued flag would have been sufficient but the stack
+might also be useful for tables.
+
+A one-space list can be followed by a table or code without an
+intervening blank line and a tab list can be similarly followed by a
+table.
+
+============================================================================
+
+
+July 1, 2001
+
+The list2 variation, with 2 leading spaces after the control G instead
+of 1 as in list1, appears to occur *only* in What's New (not What's
+New in AnaGram 2.0). There doesn't seem to be any reason for it in
+terms of how AnaGram online help treats it.
+
+Internal Error topic has a line of code with a period after it. This
+looks wrong and also looks wrong in the online help. Period should be
+removed from help.src.
+
+Keyword topic - the list here has extra line spacing for the last list
+items, plus an embedded code example, so spacing doesn't look too
+good. Possibly the line spacing could be altered in help.src. Also,
+use of IF and IFF to demonstrate keyword lookahead is not a good
+choice for HTML as the I looks like a vertical divider in some
+fonts. Change to EX and EXT or GO and GOO?
+
+Also, the use of double quotes around "keyword" doesn't look proper in
+HTML and is probably not good even in help.src. Remove quotes in
+help.src?
+
+
+July 3, 2001
+
+The exit_flag topic has a table which does not seem to be generated in
+any regular manner and accommodates basically to the default help font
+used by AnaGram, which is not monospaced.
+Combinations of spaces and
+tabs have been used to get alignment:
+
+AG_RUNNING_CODE (15 char) is preceded by a tab, then 4 spaces and 4 tabs,
+    then = 0:tab and the explanation.
+AG_SUCCESS_CODE (15 char) - same form as above but 8 spaces and 4 tabs
+AG_SYNTAX_ERROR_CODE (20 char) - 3 spaces, 2 tabs
+AG_REDUCTION_ERROR_CODE (23 char) - 1 space, 1 tab
+AG_STACK_ERROR_CODE (19 char) - 3 spaces, 3 tabs
+AG_SEMANTIC_ERROR_CODE (22 char) - 1 space, 2 tabs
+
+It is desirable to keep the width to a minimum here because it governs
+the width chosen by the browser for displaying *all* the text.
+Possible to create a table (ugh) or swallow the spaces and tabs in
+favor of a single space.
+
+The What's New topic has an example under Bug Fixes which uses a lot
+of spaces before the reduction procedure. These make the HTML page too
+wide - and in the online help, they do not seem to appear at all,
+probably being replaced with a single space.(?) This is a code
+paragraph, leading with 8 and 10 spaces, not a table paragraph; hard
+to see what to do other than change help.src.
+
+
+July 4, 2001
+
+mhh5 now inserts links properly. Turns out that some links in help.src
+include the "s" at the end inside the link if they are plural, even if
+the help topic is singular. So it is necessary to strip the "s" and
+test again for a matching topic title.
+
+Topics still in need of some adjustment are:
+    exit_flag
+    Internal Error
+    Keyword
+    Virtual Production
+    What's New
+    Character Constant
+    Data Type
+    PCB_TYPE
+Some of these are discussed above. Mostly the problem is use of tabs
+and spaces in help.src in a way that does not lend itself to general
+rules to apply to the html.
+
+Still need to be able to replace <, >, and & in help.src with the
+appropriate entities before creating the html version.
+
+
+July 12, 2001
+
+mhh5 now inserts entities and can handle multiple-column tables (as
+detected by tab sequences in the table lines).
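The link lookup from the July 4 entry (strip a trailing "s" and test again) and the entity replacement from the July 4 and July 12 entries might look something like this. It is a minimal Python sketch, not the mhh5 code: `resolve_link` and `escape_entities` are hypothetical names, and exact, case-sensitive title matching is an assumption.

```python
import html


def resolve_link(link_text, topic_titles):
    """Return the topic title a link refers to, or None if there is no match.

    First tries the link text as written; if that fails and the text
    ends in "s", strips the "s" and tests again. Hypothetical helper;
    exact, case-sensitive matching is an assumption.
    """
    if link_text in topic_titles:
        return link_text
    if link_text.endswith("s") and link_text[:-1] in topic_titles:
        return link_text[:-1]
    return None


def escape_entities(text):
    """Replace &, < and > with HTML entities, leaving quotes untouched."""
    return html.escape(text, quote=False)
```

Trying the unmodified text first means a topic whose title really does end in "s" still resolves to itself.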
+
+
+Aug 3, 2001
+
+mhh5.syn should have some comments inserted at the beginning to
+mention this notes.txt file and to say what the program does. It
+should also probably print a comment at the beginning of the html file
+along the lines of:
+
+    " All the information in this file may also be accessed from within
+AnaGram using Help Topics on the Help menu. This HTML file is provided
+because it is sometimes more convenient to read the topics with your
+browser, and because the browser can be used to search the whole file.
+This is a long file with many links; some older browsers slow to a
+crawl. Netscape works fine."
+
+I am holding off on doing this because it means another round of
+compilation and testing and Jerry has not been available to even look
+at the output or pass on the above wording.
+
+Moreover, the output file has not been tested with up-to-date versions
+of Microsoft Internet Explorer. The old version on Secondo really does
+slow to a crawl. The file is not really usable; not clear if it could
+be improved by modifying the HTML if the newer Internet Explorer
+versions can't deal with it either.
+