diff doc/misc/html/oldclasslib/strdict.html @ 0:13d2b8934445

Import AnaGram (near-)release tree into Mercurial.
author David A. Holland
date Sat, 22 Dec 2007 17:52:45 -0500
parents
children
line wrap: on
line diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/misc/html/oldclasslib/strdict.html	Sat Dec 22 17:52:45 2007 -0500
@@ -0,0 +1,177 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
+<HTML>
+<HEAD>
+<TITLE>String Dictionary Class Definition</TITLE>
+</HEAD>
+
+
+<BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
+ TEXT="#000000" LINK="#0033CC"
+ VLINK="#CC0033" ALINK="#CC0099">
+
+<P>
+<IMG ALIGN="right" SRC="../images/agrsl6c.gif" ALT="AnaGram"
+         WIDTH=124 HEIGHT=30 >
+<BR CLEAR="all">
+Back to :
+<A HREF="../index.html">Index</A> |
+<A HREF="index.html">Class libraries for examples</A>
+<P>
+
+<IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------"
+        WIDTH=1010 HEIGHT=2  >
+<P>
+
+<H1>String Dictionary Class Definition</H1>
+<IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------"
+        WIDTH=1010 HEIGHT=2  >
+<P>
+<BR>
+<H2>Introduction</H2>
+<P>
+
+A string dictionary is a device for keeping track of character strings.
+It associates a unique, non-zero, unsigned integer with each unique
+string inserted in the dictionary. These indices are assigned in order
+of insertion, so that the first string inserted has index 1, the second
+index 2, and so on. These indices may be used to index auxiliary tables
+containing information about the corresponding string. In particular,
+these indices may be conveniently used to index a symbol table. A
+hashing scheme is used to facilitate quick identification of a given
+string.
+<P>
+<BR>
+
+<H2>Constructor </H2>
+
+The constructor for a string dictionary takes two unsigned integer
+arguments: the maximum number of entries and the average number of
+bytes per entry. The latter will default to ten if you don't specify
+it. The product of size and length must be less than 65536. For
+example:
+<PRE>
+      string_dictionary td(5000);    // 5000 entries, default size
+      string_dictionary xx(500,20);  // 500 entries, average size = 20
+</PRE>
+<P>
+<BR>
+
+<H2>Entering Strings in a Dictionary </H2>
+
+To enter a string in a dictionary, use the overloaded operator " &lt;&lt; ",
+with the dictionary on the left, the string on the right. The result is
+a nonzero, unsigned integer. For example:
+<PRE>
+      unsigned hwx = td &lt;&lt; "Hello, world!";
+</PRE>
+<P>
+<BR>
+
+<H2>Dictionary Lookups </H2>
+
+To look up a string in a dictionary, use the overloaded operator
+"<CODE>[]</CODE>",
+using the string to index the dictionary. The result is an unsigned
+integer. It will be zero if the string is not found in the dictionary.
+For example:
+<PRE>
+       unsigned k = td["Hello, world!"];
+</PRE>
+<BR>
+
+<H2>Recovering a String from a Dictionary </H2>
+
+The operator "<CODE>[]</CODE>" is also overloaded to allow you to get the pointer to
+a particular string in the dictionary. Simply index the dictionary with
+an unsigned integer. The result will be a pointer to the desired
+string. If the index is zero, or larger than the number of strings in
+the dictionary, the pointer will be null. For example:
+<PRE>
+       unsigned k = td &lt;&lt; "Hello";
+       char *ptr = td[k];             //returns pointer to "Hello"
+</PRE>
+<BR>
+
+<H2>Number of Entries in the Dictionary </H2>
+
+To find the number of entries in a dictionary, use the function "size".
+It returns an unsigned integer. Since the zero entry is unused, the
+integer returned by size is the index of the last entry in the
+dictionary. For example:
+<PRE>
+       unsigned n = size(td);
+</PRE>
+<P>
+<BR>
+
+<H2>Resetting a String Dictionary </H2>
+
+The overloaded function reset() may be used to restore a string
+dictionary to its initial condition, deleting all entries:
+<PRE>
+        reset(td);
+</PRE>
+<P>
+<BR>
+
+<H2>Technical Notes </H2>
+
+The dictionary uses a hashing algorithm freely adapted from Fletcher's
+checksum as described by John Kodis in Doctor Dobb's Journal, May,
+1992.
+<P>
+The length of the hash table is the smallest power of two that is
+greater than the specified size of the dictionary. Therefore, if you
+are using a computer with 8086 architecture, you are limited to a
+dictionary with fewer than 32768 entries.
+<P>
+The hash algorithm produces two unsigned integers, k1 and k2, which are
+each less than 255. The hash code is then calculated as
+ <CODE> 255*k1 + k2</CODE>,
+and masked to the length of the hash table. In case of collision, a
+hash increment is calculated by using k1 and k2 in the opposite order:
+<CODE> 255*k2 + k1</CODE>.
+This is multiplied by two and one is added to guarantee an
+increment which is relatively prime to the length of the hash table.
+This process guarantees the entire table can be searched.
+<P>
+The number of calls and the number of collisions are recorded so you
+can check the efficiency of the hash algorithm.
+<P>
+Because the dictionary allocates a single character array for all the
+dictionary entries, the product of the max number of entries and the
+average size must be less than 65536 when using 8086 architecture. If
+this is inconvenient there are several alternatives. You can allocate
+storage individually for each string and simply maintain a list of
+pointers, or you can declare a "huge" array, and use longs to index it.
+Remember that in this case the hash table would have to be declared as
+an array of longs.
+
+
+<P>
+<BR>
+
+
+<IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------"
+      WIDTH=1010 HEIGHT=2 >
+<P>
+<IMG ALIGN="right" SRC="../images/pslrb6d.gif" ALT="Parsifal Software"
+                WIDTH=181 HEIGHT=25>
+<BR CLEAR="right">
+
+<P>
+Back to :
+<A HREF="../index.html">Index</A> |
+<A HREF="index.html">Class libraries for examples</A>
+
+<P>
+<ADDRESS><FONT SIZE="-1">
+                  AnaGram parser generator - examples<BR>
+                  Class libraries - String dictionary class definition<BR>
+                  Copyright &copy; 1993-1999, Parsifal Software. <BR>
+                  All Rights Reserved.<BR>
+</FONT></ADDRESS>
+
+</BODY>
+</HTML>
+