view help2html/notes.txt @ 0:13d2b8934445
Import AnaGram (near-)release tree into Mercurial.
author | David A. Holland |
---|---|
date | Sat, 22 Dec 2007 17:52:45 -0500 |
parents | |
children |
Notes on HTML version of help.src
---------------------------------

----- Overview of structure assumptions and treatment of help.src ------

(a) The help.src file

help.src consists of a sequence of topics, each with a title line and a
topic body. They are only partly sorted by title line. There may or may
not be "blank line"s between the topics. "Blank line"s, however, are
necessary to separate some sections in the topic bodies.

It was found necessary to disallow extra spaces or tabs in a "blank
line", i.e. a blank line is just a newline (lf) char. It turns out Jerry
had already found it necessary to remove such extra spaces.

There must be at least one "blank line" after a title line, preceding
the topic body.

A topic ends with an "end topic" line starting with "##" beginning in
column 1. The end topic line may have some spaces before the terminating
newline.

The final eof line can have leading blanks and be preceded by "blank
line"s as above.

(b) Title:

Title line is one line only, begins in col. 1 with a "lead title char":

  lead title char = char - blank - tab - ',' - bullet - '\n', char=~eof

Here a "bullet" is a control-G (bell) character, aka 0x7.

The individual topics in the title are comma-separated. They may have
these chars:

  title char = char - ',' - bullet - '\n', where char=~eof

(c) Topic body:

Inspection of help.src suggested that there were 6 different kinds of
paragraphs in a topic body, which were separated by blank lines.

The first paragraph was always of type "text" - this turned out to be
needed for the parsing. Remaining paragraphs could be text or look like
a table, code, or several varieties of list.

(c1) Text paragraphs

The first line of a text paragraph had to begin with a

  lead topic char = char - blank - tab - bullet - '\n', where char=~eof

or with a single blank followed by a lead topic char. The leading blank
seemed to occur only after a preceding table-type section, after a blank
line.
It seems that Jerry uses this construct for some purpose, but it didn't
seem to be necessary for mhh5 parsing.

Some text paragraphs had code interspersed, not always separated with a
newline, and some had a following "table" or list without a blank line
as separator.

(c2) Table paragraphs

A paragraph with a tab in col 1 of the first line was assumed to be some
sort of table. Sometimes there were several leading tabs. Sometimes
there were further sequences of tabs in the rest of the line. Mostly
there was just one set of tabs - these were easy to deal with. There are
only a couple of cases of several sets of tabs, which correspond to 2
and 3-column tables.

As of July 12/01 mhh5 uses the tabs to construct multi-column tables and
the contents of the table cells are treated as code.

A table can occur following a text paragraph, tab list or one-space list
without an intervening blank line.

Treating the whole table as code enclosed in <pre> </pre> tags, removing
the leading tabs and any leading spaces, worked very well except for the
2 or 3-column tables.

(c3) Code paragraphs

If a paragraph began with a sequence of 2, 3, 4, 5, 6, 8, or 10 spaces
(all these sequences occur in help.src) it is considered a code
paragraph and enclosed in <pre> </pre> tags. In this case the leading
spaces are preserved.

(c4) List paragraphs

If a paragraph begins with a "bullet" (control-G) character it is deemed
to be some sort of list. Sometimes the list items are separated by blank
lines and sometimes not.

The bullet was variously followed by one space, two spaces, or a tab.
The two-spaces version occurs only once in help.src (in part of What's
New) and possibly has no significance for Jerry's parsing; maybe it
could be eliminated.

All 3 list types were turned into HTML unordered lists, with the leading
bullet and spaces or tab swallowed.
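
The classification rules in (c1) through (c4) depend only on the first
characters of a paragraph's first line. The following is a minimal
sketch of that decision in Python, not the mhh5 grammar itself. The
function name and the returned labels are chosen here for illustration,
and treating any run of two or more leading spaces as code is a
simplifying assumption.

```python
BULLET = "\x07"  # control-G (bell), the "bullet" of section (b)


def classify_paragraph(first_line):
    """Classify a topic-body paragraph from its first line alone.

    Returns "list1", "list2", "tab list", "table", "code" or "text".
    """
    if first_line.startswith(BULLET):
        rest = first_line[1:]
        if rest.startswith("\t"):
            return "tab list"
        if rest.startswith("  "):
            return "list2"
        # Assumption: anything else after the bullet is a one-space list.
        return "list1"
    if first_line.startswith("\t"):
        return "table"
    indent = len(first_line) - len(first_line.lstrip(" "))
    if indent >= 2:
        # Simplification: the notes list 2, 3, 4, 5, 6, 8 or 10 spaces.
        return "code"
    # Column-1 text, or a single blank followed by text.
    return "text"
```

A paragraph that follows another without a blank line (for example a
table directly after a one-space list) would still need the surrounding
state to be tracked; this sketch only looks at one line.
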
Since list topics sometimes were separated with blank lines and
sometimes not, it was necessary to keep an AgStack with the current list
type on it - actually a multi-valued flag would have been sufficient but
the stack might also be useful for tables.

A one-space list can be followed by a table or code without an
intervening blank line and a tab list can be similarly followed by a
table.

============================================================================

July 1, 2001

The list2 variation, with 2 leading spaces after the control G instead
of 1 as in list1, appears to occur *only* in What's New (not What's New
in AnaGram 2.0). There doesn't seem to be any reason for it in terms of
how AnaGram online help treats it.

Internal Error topic has a line of code with a period after it. This
looks wrong and also looks wrong in the online help. Period should be
removed from help.src.

Keyword topic - the list here has extra line spacing for the last list
items, plus an embedded code example, so spacing doesn't look too good.
Possibly the line spacing could be altered in help.src. Also, use of IF
and IFF to demonstrate keyword lookahead is not a good choice for HTML
as the I looks like a vertical divider in some fonts. Change to EX and
EXT or GO and GOO? Also, the use of double quotes around "keyword"
doesn't look proper in HTML and is probably not good even in help.src.
Remove quotes in help.src?

July 3, 2001

The exit_flag topic has a table which does not seem to be generated in
any regular manner and accommodates basically to the default help font
used by AnaGram, which is not monospaced. Combinations of spaces and
tabs have been used to get alignment:

  AG_RUNNING_CODE (15 char) is preceded by a tab, then 4 spaces and
    4 tabs, then = 0:tab and the explanation.
  AG_SUCCESS_CODE (15 char) - same form as above but 8 spaces and 4 tabs
  AG_SYNTAX_ERROR_CODE (20 char) - 3 spaces, 2 tabs
  AG_REDUCTION_ERROR_CODE (23 char) - 1 space, 1 tab
  AG_STACK_ERROR_CODE (19 char) - 3 spaces, 3 tabs
  AG_SEMANTIC_ERROR_CODE (22 char) - 1 space, 2 tabs

It is desirable to keep the width to a minimum here because it governs
the width chosen by the browser for displaying *all* the text. Possible
to create a table (ugh) or swallow the spaces and tabs in favor of a
single space.

The What's New topic has an example under Bug Fixes which uses a lot of
spaces before the reduction procedure. These make the HTML page too
wide - and in the online help, they do not seem to appear at all,
probably being replaced with a single space.(?) This is a code
paragraph, leading with 8 and 10 spaces, not a table paragraph; hard to
see what to do other than change help.src.

July 4, 2001

mhh5 now inserts links properly. Turns out that some links in help.src
include the "s" at the end inside the link if they are plural, even if
the help topic is singular. So it is necessary to strip the "s" and test
again for a matching topic title.

Topics still in need of some adjustment are:

  exit_flag
  Internal Error
  Keyword
  Virtual Production
  What's New
  Character Constant
  Data Type
  PCB_TYPE

Some of these are discussed above. Mostly the problem is use of tabs and
spaces in help.src in a way that does not lend itself to general rules
to apply to the html.

Still need to be able to replace <, >, & in help.src with the
appropriate entities before creating the html version.

July 12, 2001

mhh5 now inserts entities and can handle multiple-column tables (as
detected by tab sequences in the table lines).

Aug 3, 2001

mhh5.syn should have some comments inserted at the beginning to mention
this notes.txt file and to say what the program does.
It should also probably print a comment at the beginning of the html
file along the lines of:

  " All the information in this file may also be accessed from within
  AnaGram using Help Topics on the Help menu. This HTML file is provided
  because it is sometimes more convenient to read the topics with your
  browser, and because the browser can be used to search the whole file.
  This is a long file with many links; some older browsers slow to a
  crawl. Netscape works fine."

I am holding off on doing this because it means another round of
compilation and testing and Jerry has not been available to even look at
the output or pass on the above wording.

Moreover, the output file has not been tested with up-to-date versions
of Microsoft Internet Explorer. The old version on Secondo really does
slow to a crawl. The file is not really usable; not clear if it could be
improved by modifying the HTML if the newer Internet Explorer versions
can't deal with it either.
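
For reference, the three conversion steps from the July 4 and July 12
entries (plural link fallback, entity replacement, and splitting table
lines on tab sequences) can be sketched in Python as below. This is an
illustration under assumptions, not the mhh5 code: the function names
are hypothetical, topic titles are assumed to be matched exactly and
case-sensitively, and cells are assumed to be separated by runs of tabs
only.

```python
import html
import re


def resolve_link(link_text, topic_titles):
    """Return the matching topic title, or None if there is no match.

    If the link text itself is not a title and it ends in "s", strip
    the "s" and test again, as described in the July 4 entry.
    """
    if link_text in topic_titles:
        return link_text
    if link_text.endswith("s") and link_text[:-1] in topic_titles:
        return link_text[:-1]
    return None


def escape_entities(text):
    """Replace &, < and > with HTML entities; quotes are left alone."""
    return html.escape(text, quote=False)


def split_table_row(line):
    """Drop leading tabs and spaces, then split cells on runs of tabs."""
    stripped = line.lstrip("\t ").rstrip("\n")
    return re.split(r"\t+", stripped)
```

Usage, with made-up input:

```python
titles = {"Keyword", "Data Type"}
resolve_link("Keywords", titles)        # "Keyword"
escape_entities("a < b && c > d")       # "a &lt; b &amp;&amp; c &gt; d"
split_table_row("\t\tfoo\t\tbar\tbaz")  # ["foo", "bar", "baz"]
```

A table such as the one in exit_flag, which mixes spaces and tabs for
alignment, would leave trailing spaces in the cells and is not handled
by this sketch.
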