comparison doc/misc/html/oldclasslib/strdict.html @ 0:13d2b8934445

Import AnaGram (near-)release tree into Mercurial.
author David A. Holland
date Sat, 22 Dec 2007 17:52:45 -0500
parents
children
comparison
equal deleted inserted replaced
-1:000000000000 0:13d2b8934445
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
2 <HTML>
3 <HEAD>
4 <TITLE>String Dictionary Class Definition</TITLE>
5 </HEAD>
6
7
8 <BODY BGCOLOR="#ffffff" BACKGROUND="tilbl6h.gif"
9 TEXT="#000000" LINK="#0033CC"
10 VLINK="#CC0033" ALINK="#CC0099">
11
12 <P>
13 <IMG ALIGN="right" SRC="../images/agrsl6c.gif" ALT="AnaGram"
14 WIDTH=124 HEIGHT=30 >
15 <BR CLEAR="all">
16 Back to :
17 <A HREF="../index.html">Index</A> |
18 <A HREF="index.html">Class libraries for examples</A>
19 <P>
20
21 <IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------"
22 WIDTH=1010 HEIGHT=2 >
23 <P>
24
25 <H1>String Dictionary Class Definition</H1>
26 <IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------"
27 WIDTH=1010 HEIGHT=2 >
28 <P>
29 <BR>
30 <H2>Introduction</H2>
31 <P>
32
33 A string dictionary is a device for keeping track of character strings.
34 It associates a unique, non-zero, unsigned integer with each unique
35 string inserted in the dictionary. These indices are assigned in order
36 of insertion, so that the first string inserted has index 1, the second
37 index 2, and so on. These indices may be used to index auxiliary tables
38 containing information about the corresponding string. In particular,
39 these indices may be conveniently used to index a symbol table. A
40 hashing scheme is used to facilitate quick identification of a given
41 string.
42 <P>
43 <BR>
44
45 <H2>Constructor </H2>
46
47 The constructor for a string dictionary takes two unsigned integer
48 arguments: the maximum number of entries and the average number of
49 bytes per entry. The latter will default to ten if you don't specify
50 it. The product of size and length must be less than 65536. For
51 example:
52 <PRE>
53 string_dictionary td(5000); // 5000 entries, default size
54 string_dictionary xx(500,20); // 500 entries, average size = 20
55 </PRE>
56 <P>
57 <BR>
58
59 <H2>Entering Strings in a Dictionary </H2>
60
61 To enter a string in a dictionary, use the overloaded operator " &lt;&lt; ",
62 with the dictionary on the left, the string on the right. The result is
63 a nonzero, unsigned integer. For example:
64 <PRE>
65 unsigned hwx = td &lt;&lt; "Hello, world!";
66 </PRE>
67 <P>
68 <BR>
69
70 <H2>Dictionary Lookups </H2>
71
72 To look up a string in a dictionary, use the overloaded operator
73 "<CODE>[]</CODE>",
74 using the string to index the dictionary. The result is an unsigned
75 integer. It will be zero if the string is not found in the dictionary.
76 For example:
77 <PRE>
78 unsigned k = td["Hello, world!"];
79 </PRE>
80 <BR>
81
82 <H2>Recovering a String from a Dictionary </H2>
83
84 The operator "<CODE>[]</CODE>" is also overloaded to allow you to get the pointer to
85 a particular string in the dictionary. Simply index the dictionary with
86 an unsigned integer. The result will be a pointer to the desired
87 string. If the index is zero, or larger than the number of strings in
88 the dictionary, the pointer will be null. For example:
89 <PRE>
90 unsigned k = td &lt;&lt; "Hello";
91 char *ptr = td[k]; //returns pointer to "Hello"
92 </PRE>
93 <BR>
94
95 <H2>Number of Entries in the Dictionary </H2>
96
97 To find the number of entries in a dictionary, use the function "size".
98 It returns an unsigned integer. Since the zero entry is unused, the
99 integer returned by size is the index of the last entry in the
100 dictionary. For example:
101 <PRE>
102 unsigned n = size(td);
103 </PRE>
104 <P>
105 <BR>
106
107 <H2>Resetting a String Dictionary </H2>
108
109 The overloaded function reset() may be used to restore a string
110 dictionary to its initial condition, deleting all entries:
111 <PRE>
112 reset(td);
113 </PRE>
114 <P>
115 <BR>
116
117 <H2>Technical Notes </H2>
118
119 The dictionary uses a hashing algorithm freely adapted from Fletcher's
120 checksum as described by John Kodis in Doctor Dobb's Journal, May,
121 1992.
122 <P>
123 The length of the hash table is the smallest power of two that is
124 greater than the specified size of the dictionary. Therefore, if you
125 are using a computer with 8086 architecture, you are limited to a
126 dictionary with fewer than 32768 entries.
127 <P>
128 The hash algorithm produces two unsigned integers, k1 and k2, which are
129 each less than 255. The hash code is then calculated as
130 <CODE> 255*k1 + k2</CODE>,
131 and masked to the length of the hash table. In case of collision, a
132 hash increment is calculated by using k1 and k2 in the opposite order:
133 <CODE> 255*k2 + k1</CODE>.
134 This is multiplied by two and one is added to guarantee an
135 increment which is relatively prime to the length of the hash table.
136 This process guarantees the entire table can be searched.
137 <P>
138 The number of calls and the number of collisions are recorded so you
139 can check the efficiency of the hash algorithm.
140 <P>
141 Because the dictionary allocates a single character array for all the
142 dictionary entries, the product of the max number of entries and the
143 average size must be less than 65536 when using 8086 architecture. If
144 this is inconvenient there are several alternatives. You can allocate
145 storage individually for each string and simply maintain a list of
146 pointers, or you can declare a "huge" array, and use longs to index it.
147 Remember that in this case the hash table would have to be declared as
148 an array of longs.
149
150
151 <P>
152 <BR>
153
154
155 <IMG ALIGN="bottom" SRC="../images/rbline6j.gif" ALT="----------------------"
156 WIDTH=1010 HEIGHT=2 >
157 <P>
158 <IMG ALIGN="right" SRC="../images/pslrb6d.gif" ALT="Parsifal Software"
159 WIDTH=181 HEIGHT=25>
160 <BR CLEAR="right">
161
162 <P>
163 Back to :
164 <A HREF="../index.html">Index</A> |
165 <A HREF="index.html">Class libraries for examples</A>
166
167 <P>
168 <ADDRESS><FONT SIZE="-1">
169 AnaGram parser generator - examples<BR>
170 Class libraries - String dictionary class definition<BR>
171 Copyright &copy; 1993-1999, Parsifal Software. <BR>
172 All Rights Reserved.<BR>
173 </FONT></ADDRESS>
174
175 </BODY>
176 </HTML>
177