aboutsummaryrefslogtreecommitdiff
path: root/libstdc++-v3/docs/html/ext/pb_assoc/hash_based_containers.html
diff options
context:
space:
mode:
Diffstat (limited to 'libstdc++-v3/docs/html/ext/pb_assoc/hash_based_containers.html')
-rw-r--r--libstdc++-v3/docs/html/ext/pb_assoc/hash_based_containers.html1056
1 files changed, 1056 insertions, 0 deletions
diff --git a/libstdc++-v3/docs/html/ext/pb_assoc/hash_based_containers.html b/libstdc++-v3/docs/html/ext/pb_assoc/hash_based_containers.html
new file mode 100644
index 00000000000..9fea0e6d168
--- /dev/null
+++ b/libstdc++-v3/docs/html/ext/pb_assoc/hash_based_containers.html
@@ -0,0 +1,1056 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
+<html>
+ <head>
+ <title>Hash-Based Containers</title>
+ <meta name="GENERATOR" content="Microsoft Visual Studio .NET 7.1">
+ <meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5">
+ </head>
+
+<body bgcolor = "white">
+
+<h1>Hash-Based Containers</h1>
+
+<p>
+ This section describes hash-based containers. It is organized
+as follows.
+</p>
+
+<ol>
+ <li>
+ <a href = "#overview">Overview</a> is an overview.
+ </li>
+ <li>
+ <a href = "#hash_policies">Hash Policies</a> discusses
+ hash policies.
+ </li>
+ <li>
+ <a href = "#resize_policies">Resize Policies</a> discusses
+ resize policies.
+ </li>
+ <li>
+ <a href = "#policy_interaction">Policy Interaction</a> discusses
+ interaction between policies.
+ </li>
+</ol>
+
+
+
+<h2><a name = "overview">Overview</a></h2>
+
+
+<p>
+ Figure
+<a href = "#hash_cd">Hash-based containers</a>
+ shows the container-hierarchy; the hash-based containers are circled.
+<a href = "cc_hash_assoc_cntnr.html"><tt>cc_hash_assoc_cntnr</tt></a>
+is a collision-chaining hash-based container;
+<a href = "gp_hash_assoc_cntnr.html"><tt>gp_hash_assoc_cntnr</tt></a>
+is a (general) probing hash-based container.
+</p>
+
+<h6 align = "center">
+<a name = "hash_cd">
+<img src = "hash_cd.jpg" width = "70%" alt = "no image">
+</h6>
+<h6 align = "center">
+</a>
+Hash-based containers.
+</h6>
+
+<p>
+ The collision-chaining hash-based container has the following declaration.
+</p>
+<pre>
+<b>template</b>&lt;
+ <b>typename</b> Key,
+ <b>typename</b> Data,
+ <b>class</b> Hash_Fn = std::hash&lt;Key&gt;,
+ <b>class</b> Eq_Fn = std::equal_to&lt;Key&gt;,
+ <b>class</b> Comb_Hash_Fn =
+ <a href = "direct_mask_range_hashing.html">direct_mask_range_hashing</a>&lt;&gt;
+ <b>class</b> Resize_Policy = <i>default explained below.</i>
+ <b>bool</b> Store_Hash = <b>false</b>,
+ <b>class</b> Allocator =
+ std::allocator&lt;<b>char</b>&gt; &gt;
+<b>class</b> <a href = "cc_hash_assoc_cntnr.html">cc_hash_assoc_cntnr</a>;
+</pre>
+
+<p>
+ The parameters have the following meaning:
+</p>
+<ol>
+ <li> <tt>Key</tt> is the key type.
+ </li>
+ <li> <tt>Data</tt> is the data-policy, and is explained in
+<a href = "ms_gen.html#ds_policy">Mapping-Semantics Genericity::Data Types as a Policy</a>.
+ </li>
+ <li> <tt>Hash_Fn</tt> is a key hashing functor.</li>
+ <li> <tt>Eq_Fn</tt> is a key equivalence functor.</li>
+ <li> <tt>Comb_Hash_Fn</tt> is a <i>range-hashing_functor</i>; it
+describes how to translate hash values into positions within the table.
+This is described in
+<a name = "#hash_policies">Hash Policies</a>.</li>
+ </li>
+ <li> <tt>Resize_Policy</tt> describes how a container object should
+change its internal size. This is described in
+<a name = #resize_policies">Resize Policies</a>.</li>
+ <li> <tt>Store_Hash</tt> indicates whether the hash value should
+be stored with each entry. This is described in
+<a name = "#policy_interaction">Policy Interaction</a>.</li>
+ <li> <tt>Allocator</tt> is (surprisingly) an allocator type.
+ </li>
+</ol>
+
+<p>
+ The probing hash-based container has the following declaration.
+</p>
+<pre>
+<b>template</b>&lt;
+ <b>typename</b> Key,
+ <b>typename</b> Data,
+ <b>class</b> Hash_Fn =
+ std::hash&lt;
+ Key&gt;,
+ <b>class</b> Eq_Fn =
+ std::equal_to&lt;
+ Key&gt;,
+ <b>class</b> Comb_Probe_Fn =
+ <a href = "direct_mask_range_hashing.html">direct_mask_range_hashing</a>&lt;&gt;
+ <b>class</b> Probe_Fn = <i>default explained below.</i>
+ <b>class</b> Resize_Policy = <i>default explained below.</i>
+ <b>bool</b> Store_Hash = <b>false</b>,
+ <b>class</b> Allocator =
+ std::allocator&lt;<b>char</b>&gt; &gt;
+<b>class</b> <a href = "gp_hash_assoc_cntnr.html">gp_hash_assoc_cntnr</a>;
+</pre>
+
+<p>
+ The parameters are identical to those of the collision-chaining container, except
+for the following.
+</p>
+<ol>
+ <li> <tt>Comb_Probe_Fn</tt> describes how to transform a probe sequence into
+a sequence of positions within the table.
+ </li>
+ <li> <tt>Probe_Fn</tt> describes a probe sequence policy.</li>
+</ol>
+
+
+<p>
+ Some of the default template values depend on the values of other parameters,
+and are explained in
+<a name = "#policy_interaction">Policy Interaction</a>.
+</p>
+
+<h2><a name = "hash_policies">Hash Policies</h2></a>
+<p>
+ This subsection describes hash policies. It is organized as follows:
+</p>
+<ol>
+ <li> <a href = "#general_terms">General Terms</a> describes
+ some general terms.
+ </li>
+ <li> <a href = "#range_hashing_fns">Range-Hashing Functions</a>
+ describes range-hasing functions.</li>
+ <li> <a href = "#hash_policies_ranged_hash_policies">Ranged-Hash Functions</a>
+ describes ranged-hash functions. </li>
+ <li> <a href = "#pb_assoc_imp">Implementation in <tt>pb_assoc</tt></a>
+ describes the implementation of these concepts in <tt>pb_assoc</tt>.
+ </li>
+</ol>
+
+
+<h3><a name="general_terms">General Terms</a></h3>
+
+<p>
+ There
+are actually three functions involved in transforming a key into a hash-table's
+position (see Figure
+<a href = "#hash_ranged_hash_range_hashing_fns">
+Hash runctions, ranged-hash functions, and range-hashing functions
+</a>):
+</p>
+<ol>
+ <li>
+ A <i>ranged-hash</i> function, which maps keys into an interval of the
+ non-negative integrals. This is the function actually required by the
+ hash-table algorithm.
+ </li>
+ <li>
+ A hash function, which maps keys into non-negative integral types. This is
+ typically specified by the writer of the key class.
+ </li>
+ <li>
+ A <i>range-hashing</i> function, which maps non-negative integral types into an
+ interval of non-negative integral types.
+ </li>
+</ol>
+
+<h6 align = "center">
+<a name = "hash_ranged_hash_range_hashing_fns">
+<img src = "hash_ranged_hash_range_hashing_fns.jpg" width = "70%" alt = "no image">
+</h6>
+<h6 align = "center">
+</a>
+Hash runctions, ranged-hash functions, and range-hashing functions.
+</h6>
+
+<p>
+ Let <i>U</i> be a domain (<i>e.g.</i>, the integers, or the strings of 3
+ characters). A hash-table algorithm needs to map elements of <i>U</i> "uniformly"
+ into the range <i>[0,..., m - 1]</i> (where <i>m</i> is a non-negative integral
+ value, and is, in general, time varying). <i>I.e.</i>, the algorithm needs a <i>ranged-hash</i>
+ function
+</p>
+<p>
+ <i>f : U &times; Z<sub>+</sub> &rarr; Z<sub>+</sub> </i>,
+</p>
+<p>
+ such that for any <i>u</i> in <i>U</i>
+,
+</p>
+<p>
+ <i>0 &le; f(u, m) &le; m - 1 </i>,
+</p>
+<p>
+ and which has "good uniformity" properties [<a href="references.html#knuth98sorting">knuth98sorting</a>].
+ One common solution is to use the composition of the hash function
+</p>
+<p>
+ <i>h : U &rarr; Z<sub>+</sub> </i>,
+</p>
+<p>
+ which maps elements of <i>U</i> into the non-negative integrals, and
+</p>
+<p>
+ <i>g : Z<sub>+</sub> &times; Z<sub>+</sub> &rarr; Z<sub>+</sub>, </i>
+</p>
+<p>
+ which maps a non-negative hash value, and a non-negative range upper-bound into
+ a non-negative integral in the range between 0 (inclusive) and the range upper
+ bound (exclusive), <i>i.e.</i>, for any <i>r</i> in <i>Z<sub>+</sub></i>,
+</p>
+<p>
+ <i>0 &le; g(r, m) &le; m - 1 </i>.
+</p>
+<p>
+ The resulting ranged-hash function, is
+</p>
+<p>
+ <i><a name="eqn:ranged_hash_composed_of_hash_and_range_hashing">f(u , m) = g(h(u), m) </a>
+ </i>(1) .
+</p>
+
+<p>
+ From the above, it is obvious that given <i>g</i> and <i>h</i>, <i>f</i> can
+ always be composed (however the converse is not true). The STL's hash-based
+ containers allow specifying a hash function, and use a hard-wired range-hashing function; the ranged-hash function is implicitly composed.
+</p>
+
+
+<p>
+ The above describes the case where a key is to be mapped into a <i>single
+position</i> within a hash table, <i>e.g.</i>, in a collision-chaining table.
+In other cases, a key is to be mapped into a <i>sequence of poisitions</i>
+within a table, <i>e.g.</i>, in a probing table.
+</p>
+<p>
+ Similar terms apply in this case: the table requires a <i>ranged probe</i>
+function, mapping a key into a sequence of positions withing the table. This is
+typically acheived by composing a <i>hash function</i> mapping the key
+into a non-negative integral type, a <i>probe</i> function transforming the
+hash value into a sequence of hash values, and a <i>range-hashing</i> function
+transforming the sequence of hash values into a sequence of positions.
+</p>
+
+
+<h3><a name="range_hashing_fns">Range-Hashing Functions</a></h3>
+
+<p>
+ Some common choices for range-hashing functions are the division,
+ multiplication, and middle-square methods [<a href="references.html#knuth98sorting">knuth98sorting</a>],
+ defined as
+</p>
+<p>
+ <i><a name="eqn:division_method">g(r, m) = r mod m </a></i>(2) ,
+</p>
+<p>
+ <i>g(r, m) = &lceil; u/v ( a r mod v ) &rceil; </i>,
+</p>
+<p>
+ and
+</p>
+<p>
+ <i>g(r, m) = &lceil; u/v ( r<sup>2</sup> mod v ) &rceil; </i>,
+</p>
+<p>
+respectively, for some positive integrals <i>u</i> and <i>v</i> (typically
+powers of 2), and some <i>a</i>. Each of these range-hashing functions works
+best for some different setting.
+</p>
+<p>
+ The division method <a href="#division_method">(2)</a> is a very common
+ choice. However, even this single method can be implemented in two very
+ different ways. It is possible to implement <a href="#division_method">(2)</a>
+ using the low level <i>%</i> (modulo) operation (for any <i>m</i>), or the low
+ level <i>&amp;</i> (bit-mask) operation (for the case where <i>m</i> is a power of
+ 2), <i>i.e.</i>,
+</p>
+<p>
+ <i><a name="eqn:division_method_prime_mod">g(r, m) = r % m </a></i>(3) ,
+</p>
+<p>
+ and
+</p>
+<p>
+ <a name="eqn:division_method_bit_mask">
+ <i>g(r, m) = r &amp; m - 1, ( m = 2<sup>k</sup>
+ </i>
+ for some<i> k) </i></a>(4) ,
+</p>
+<p>
+ respectively.
+</p>
+<p>
+ The <i>%</i> (modulo) implementation <a href="#division_method_prime_mod">(3)</a>
+ has the advantage that for <i>m</i> a prime far from a power of 2, <i>g(r, m)</i>
+ is affected by all the bits of <i>r</i> (minimizing the chance of collision).
+ It has the disadvantage of using the costly modulo operation. This method is
+ hard-wired into SGI's implementation [<a href="references.html#sgi_stl">sgi_stl</a>].
+</p>
+
+<p>
+ The <i>&amp;</i> (bit-mask) implementation <a href="#division_method_bit_mask">(4)</a>
+ has the advantage of relying on the fast bitwise and operation. It has the
+ disadvantage that for <i>g(r, m)</i> is affected only by the low order bits of <i>r</i>.
+ This method is hard-wired into Dinkumware's implementation [<a href="references.html#dinkumware_stl">dinkumware_stl</a>].
+</p>
+
+
+
+
+<h3><a name="hash_policies_ranged_hash_policies">Ranged-Hash Functions</a></h3>
+
+<p>
+ In some less frequent cases it is beneficial to allow the client to
+directly specify a ranged-hash hash function. It is true, that the writer of
+the ranged-hash function cannot rely on the values of <i>m</i> having specific
+numerical properties suitable for hashing (in the sense used in [<a href="references.html#knuth98sorting">knuth98sorting</a>]),
+since the values of <i>m</i> are determined by a resize policy with possibly
+orthogonal considerations.
+</p>
+
+<p>
+ There are two cases where a ranged-hash function can be superior. The firs is when using perfect hashing
+[<a href="references.html#knuth98sorting">knuth98sorting</a>]; the second
+is when the values of <i>m</i> can be used to estimate the
+"general" number of distinct values required. This is described in the following.
+</p>
+
+<p>
+ Let
+</p>
+
+<p>
+ <i>s = [ s<sub>0</sub>,..., s<sub>t - 1</sub>] </i>
+</p>
+
+<p>
+ be a string of <i>t</i> characters, each of which is from domain <i>S</i>.
+Consider the following ranged-hash function:
+</p>
+
+<p>
+ <a name="eqn:total_string_dna_hash">
+ <i>
+ f<sub>1</sub>(s, m) =
+ &sum; <sub>i =
+ 0</sub><sup>t - 1</sup> s<sub>i</sub> a<sup>i</sup> </i>mod<i> m </i>
+ </a> (5) ,
+</p>
+
+<p>
+ where <i>a</i> is some non-negative integral value. This is the standard
+string-hashing function used in SGI's implementation (with <i>a = 5</i>) [<a href="references.html#sgi_stl">sgi_stl</a>].
+Its advantage is that it takes into account all of the characters of the
+string.
+</p>
+
+<p>
+ Now assume that <i>s</i> is the string representation of a of a long DNA
+sequence (and so <i>S = {'A', 'C', 'G', 'T'}</i>). In this case, scanning the
+entire string might be prohibitively expensive. A possible alternative might be
+to use only the first <i>k</i> characters of the string, where
+</p>
+
+<p>
+ k <sup>|S|</sup> &ge; m ,
+</p>
+<p>
+ <i>i.e.</i>, using the hash function
+</p>
+<p>
+ <a name="eqn:only_k_string_dna_hash"><i>f<sub>2</sub>(s, m) = &sum; <sub>i = 0</sub><sup>k
+ - 1</sup> s<sub>i</sub> a<sup>i</sup> </i>mod <i>m </i></a>, (6)
+</p>
+<p>
+ requiring scanning over only
+</p>
+<p>
+ <i>k = </i>log<i><sub>4</sub>( m ) </i>
+</p>
+<p>
+ characters.
+</p>
+<p>
+ Other more elaborate hash-functions might scan <i>k</i> characters starting at
+ a random position (determined at each resize), or scanning <i>k</i> random
+ positions (determined at each resize), <i>i.e.</i>, using
+</p>
+<p>
+ <i>f<sub>3</sub>(s, m) = &sum; <sub>i = r<sub>0</sub></sub><sup>r<sub>0</sub> + k - 1</sup>
+ s<sub>i</sub> a<sup>i</sup> </i>mod <i>m </i>,
+</p>
+<p>
+ or
+</p>
+<p>
+ <i>f<sub>4</sub>(s, m) = &sum; <sub>i = 0</sub><sup>k - 1</sup> s<sub>r<sub>i</sub></sub>
+ a<sup>r<sub>i</sub></sup> </i>mod <i>m </i>,
+</p>
+<p>
+<p>
+ respectively, for <i>r<sub>0</sub>,..., r<sub>k-1</sub></i> each in the
+ (inclusive) range <i>[0,...,t-1]</i>.
+</p>
+
+
+<h3><a name="pb_assoc_imp">Implementation in <tt>pb_assoc</tt></a></h3>
+
+<p>
+<a href = "cc_hash_assoc_cntnr.html"><tt>cc_hash_assoc_cntnr</tt></a> is
+parameterized by <tt>Hash_Fn</tt> and <tt>Comb_Hash_Fn</tt>, a hash functor
+and a combining hash functor, respectively.
+</p>
+
+<p>
+ For any hash functor except <a href = "null_hash_fn.html"><tt>null_hash_fn</tt></a>,
+one of the
+<a href = "concepts.html#concepts_null_policies">Concepts::Null Policy Classes</a>,
+then <tt>Comb_Hash_Fn</tt> is considered a range-hashing functor.
+The container will synthesize a ranged-hash functor from both. For example, Figure
+<a href = "#hash_range_hashing_seq_diagram">
+Insert hash sequence diagram
+</a>
+shows an <tt>insert</tt> sequence diagram. The user inserts an element (point A),
+the container transforms the key into a non-negative integral using the hash
+functor (points B and C), and transforms the result into a position
+using the combining functor (points D and E).
+</p>
+
+<h6 align = "center">
+<a name = "hash_range_hashing_seq_diagram">
+<img src = "hash_range_hashing_seq_diagram.jpg" width = "70%" alt = "no image">
+</a>
+</h6>
+<h6 align = "center">
+Insert hash sequence diagram.
+</h6>
+
+
+<p>
+ <tt>pb_assoc</tt> contains the following range-hashing policies:
+</p>
+
+<ol>
+ <li>
+<a href = "direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a>
+and
+<a href = "direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
+are range-hashing functions based on a bit-mask and a modulo operation, respectively.
+ </li>
+</ol>
+
+
+<p>
+ If <tt>Comb_Hash_Fn</tt> is instantiated by
+<a href = "null_hash_fn.html"><tt>null_hash_fn</tt></a>,
+and a combining-hash functor, the container treats
+the combining hash functor as a ranged-hash function. For example, Figure
+<a href = "#hash_range_hashing_seq_diagram2">
+Insert hash sequence diagram with a null combination policy
+</a>
+shows an <tt>insert</tt> sequence diagram. The user inserts an element (point A),
+the container transforms the key into a position
+using the combining functor (points B and C).
+</p>
+
+
+<h6 align = "center">
+<a name = "hash_range_hashing_seq_diagram2">
+<img src = "hash_range_hashing_seq_diagram2.jpg" width = "70%" alt = "no image">
+</a>
+</h6>
+<h6 align = "center">
+Insert hash sequence diagram with a null combination policy.
+</h6>
+
+<p>
+ Similarly,
+<a href = "gp_hash_assoc_cntnr.html"></a><tt>gp_hash_assoc_cntnr</tt>
+is parameterized by <tt>Hash_Fn</tt>, <tt>Probe_Fn</tt>, and
+<tt>Comb_Probe_Fn</tt>. As before, if <tt>Probe_Fn</tt>
+and <tt>Comb_Probe_Fn</tt> are, respectively,
+<a href = "null_hash_fn.html"><tt>null_hash_fn</tt></a> and
+<a href = "null_probe_fn.html"><tt>null_probe_fn</tt></a>,
+then <tt>Comb_Probe_Fn</tt> is a ranged-probe functor. Otherwise, <tt>Hash_Fn</tt>
+is a hash functor, <tt>Probe_Fn</tt> is a functor for offsets from a hash value,
+and <tt>Comb_Probe_Fn</tt> transforms a probe sequence into a sequence of positions
+within the table.
+</p>
+
+<p>
+ <tt>pb_assoc</tt> contains the following probe policies:
+</p>
+
+<ol>
+ <li>
+<a href = "linear_probe_fn.html"><tt>linear_probe_fn</tt></a> is a linear probe
+function.
+ </li>
+ <li>
+<a href = "quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a> is
+a quadratic probe function.
+ </li>
+</ol>
+
+
+
+
+
+
+
+
+
+<h2><a name = "resize_policies">Resize Policies</a></h2>
+
+<p>
+ This subsection describes resize policies. It is organized as follows:
+</p>
+
+<ol>
+ <li> <a href = "#general">General Terms</a> describes general
+ terms.
+ </li>
+ <li> <a href = "#size_policies">Size Policies</a> describes size
+ policies.
+ </li>
+ <li> <a href = "#trigger_policies">Trigger Policies</a> describes trigger
+ policies.
+ </li> <li> <a href = "#imp_pb_assoc">Implementation in <tt>pb_assoc</tt></a>
+ describes the implementation of these concepts in <tt>pb_assoc</tt>.
+ </li>
+</ol>
+
+
+<h3><a name = "general">General Terms</a></h3>
+
+<p>
+ Hash-tables, as opposed to trees, do not naturally grow or shrink. It
+is necessary to specify policies to determine how and when a hash table should change
+its size.
+</p>
+
+<p>
+ In general, resize policies can be decomposed into (probably orthogonal)
+policies:
+</p>
+<ol>
+ <li> A <i>size policy</i> indicating <i>how</i> a hash table should
+grow (<i>e.g.,</i> it should multiply by powers of 2).
+ </li>
+ <li> A <i>trigger policy</i> indicating <i>when</i> a hash table should
+grow (<i>e.g.,</i> a load factor is exceeded).
+ </li>
+</ol>
+
+
+
+<h3><a name = "size_policies">Size Policies</a></h3>
+
+<p>
+ Size policies determine how a hash table
+changes size. These policies are simple, and there are relatively
+few sensible options. An exponential-size policy (with the initial
+size and growth factors both powers of 2) works well with a
+mask-based range-hashing function (see the <a href = "hash_policies.html#hash_policies_range_hashing_policies">Range-Hashing Policies</a> subsection),
+and is the
+hard-wired policy used by Dinkumware
+[<a href = "references.html#dinkumware_stl">dinkumware_stl</a>]. A
+prime-list based policy works well with a modulo-prime range
+hashing function (see the <a href = "hash_policies.html#hash_policies_range_hashing_policies">Range-Hashing Policies</a> subsection),
+and is the
+hard-wired policy used by SGI's implementation
+[<a href = "references.html#sgi_stl">sgi_stl</a>].
+</p>
+
+
+
+
+<h3><a name = "trigger_policies">Trigger Policies</a></h3>
+
+<p>
+ Trigger policies determine when a hash table changes size.
+Following is a description of two polcies: <i>load-check</i>
+policies, and a collision-check policies.
+</p>
+
+<p>
+ Load-check policies are straightforward. The user
+specifies two factors, <i>&alpha;<sub>min</sub></i> and <i>&alpha;<sub>max</sub></i>, and
+the hash table maintains the invariant that
+</p>
+<p>
+ <i>
+ <a name = "eqn:load_factor_min_max">
+ &alpha;<sub>min</sub>
+ &le;
+ (number of stored elements) / (hash-table size)
+ &le;
+ &alpha;<sub>max</sub>
+ </a>
+ </i>
+ (1)
+ .
+</p>
+
+<p>
+ Collision-check policies work in the opposite direction of
+load-check policies. They focus on keeping the number of
+collisions moderate and hoping
+that the size of the table will not grow very large,
+instead of keeping a moderate load-factor and
+hoping that the number of collisions will be small.
+A
+maximal collision-check policy resizes when the shortest
+probe-sequence grows too large.
+</p>
+
+
+<p>
+ Consider Figure
+<a href = "#balls_and_bins">Balls and bins</a>.
+ Let the size of the hash table be denoted by <i>m</i>, the
+length of a probe sequence be denoted by <i>k</i>, and some load
+factor be denoted by &alpha;. We would like to calculate the
+minimal length of <i>k</i>, such that if there were <i>&alpha; m</i> elements
+in the hash table, a probe sequence of length <i>k</i> would be found
+with probability at most <i>1/m</i>.
+</p>
+
+<h6 align = "center">
+<a name = "balls_and_bins">
+<img src = "balls_and_bins.jpg" width = "70%" alt = "no image">
+</a>
+</h6>
+<h6 align = "center">
+Balls and bins.
+</h6>
+
+
+<p>
+ Denote the probability that a probe sequence of length <i>k</i>
+appears in bin <i>i</i> by <i>p<sub>i</sub></i>, the length of the probe sequence
+of bin <i>i</i> by <i>l<sub>i</sub></i>, and assume uniform distribution.
+Then
+</p>
+ <p>
+ <a name = "eqn:prob_of_p1">
+ <i>p<sub>1</sub>
+ = </i>(3)
+ </a>
+ </p>
+ <p>
+ <i>
+ <b>P</b>(l<sub>1</sub> &ge; k)
+ =
+ </i>
+ </p>
+ <p>
+ <i><b>P</b>(l<sub>1</sub> &ge; &alpha; ( 1 + k / &alpha; - 1 )
+ &le; </i>(a)
+ </p>
+ <p>
+ <i>
+ e
+ ^
+ (
+ -
+ (
+ &alpha; ( k / &alpha; - 1 )<sup>2</sup>
+ )
+ /2
+ )
+ </i>
+ ,
+</p>
+<p>
+ where (a) follows from the Chernoff bound
+[<a href = "references.html#motwani95random">motwani95random</a>].
+To
+calculate the probability that <i>some</i> bin contains a probe
+sequence greater than <i>k</i>, we note that the <i>l<sub>i</sub></i> are
+negatively-dependent
+[<a href = "references.html#dubhashi98neg">dubhashi98neg</a>].
+Let <i><b>I</b>(.)</i>
+denote the indicator function. Then
+ <p>
+ <a name = "eqn:at_least_k_i_n_some_bin">
+ <i><b>P</b>( exists<sub>i</sub> l<sub>i</sub> &ge; k )
+ = </i>(3)
+ </a>
+ </p>
+ <p>
+ <i>
+ <b>P</b>
+ (
+ &sum; <sub>i = 1</sub><sup>m</sup>
+ <b>I</b>(l<sub>i</sub> &ge; k) &ge; 1
+ )
+ =
+ </i>
+ </p>
+ <p>
+ <i>
+ <b>P</b>
+ (
+ &sum; <sub>i = 1</sub><sup>m</sup>
+ <b>I</b>
+ (
+ l<sub>i</sub> &ge; k
+ )
+ &ge;
+ m p<sub>1</sub> ( 1 + 1 / (m p<sub>1</sub>) - 1 )
+ )
+ &le; </i>(a)
+ </p>
+ <p>
+ <i>
+ e
+ ^
+ (
+ (
+ -
+ m p<sub>1</sub>
+ (
+ 1 / (m p<sub>1</sub>) - 1
+ )
+ <sup>2</sup>
+ )
+ /
+ 2
+ )
+ ,
+ </i>
+ </p>
+<p>
+where (a) follows from the fact that the Chernoff bound can be
+applied to negatively-dependent variables
+[<a href = "references.html#dubhashi98neg">dubhashi98neg</a>].
+Inserting <a href = "#prob_of_p1">(2)</a> into
+<a href = "#at_least_k_i_n_some_bin">(3)</a>, and equating with <i>1/m</i>,
+we obtain
+</p>
+<p>
+ <i>
+ k
+ ~
+ &radic;
+ (
+ 2 &alpha; </i>ln<i> 2 m </i>ln<i>(m) )
+ )
+ </i>
+ .
+</p>
+
+
+
+
+
+
+
+
+
+<h3><a name = "imp_pb_assoc">Implementation in <tt>pb_assoc</tt></a></h3>
+
+<p>
+ The resize policies in the previous subsection are conceptually straightforward. The design
+of hash-based containers' size-related interface is complicated by some factors.
+</p>
+<ol>
+ <li> Most containers, <i>i.e.</i> lists, trees, and vectors, have a single "size" concept. There is no
+distinction between the number of entries the container holds and the number of entries it is using. This,
+of course, is not the case for hash-based containers. Moreover, even describing the
+"actual" size of a hash-based container (as opposed to its logical size) is difficult - a probing-based container
+holds entries to elements, even those it does not use, while a chaining-based container holds pointers to entries.
+ </li>
+ <li>
+ The policies mentioned above operate in terms of invariants. <i>E.g.</i> a load-check trigger policy
+maintains an invariant concerning the load factor of a container object. This is sometimes too rigid:
+ <ol>
+ <li>In some cases it is desirable to allow controlled override of an entire policy, <i>e.g.</i> by externally resizing a container object (or giving it an initial size, which is a special case of externally resizing the container).
+ </li>
+ <li>
+ In other cases it is desirable to allow changing the specifics of a policy in runtime, <i>e.g.</i>, changing the load factors of a load-check policy.
+ </li>
+ </ol>
+ </li>
+ <li>
+ Resize policies interact strongly with hash policies. Performance-wise, for example, it is undesirable to use an exponential size policy of powers of two with a modulo range-hashing function, and it is undesirable to use a prime size policy with a mask range-hashing function. In other cases, the effects are more dramatic. For example, using a quadratic probe function with an exponential size policy will probably cause cases where the container object has available entries which are never reached by the probe function. (<a href = "hash_policies.html">Hash Policies</a>
+discusses the previous concepts.)
+ </li>
+</ol>
+
+<p>
+ Clearly, the more of these points an interface addresses, the greater its flexibility but the lower its encapsulation and uniformity between associative containers.
+</p>
+
+
+
+<p>
+ This library attempts to address these types of problems by delegating all size-related functionality to
+policy classes. Hash-based containers
+are parameterized by a resize-policy class (among others), and derive publicly from
+the resize-policy class
+[<a href = "references.html#alexandrescu01modern">alexandrescu01modern</a>]
+ <i>E.g.</i>, a collision-chaining
+hash table is defined as follows:
+</p>
+<pre>
+cc_ht_map&lt;
+ <b>class</b> Key,
+ <b>class</b> Data,
+ ...
+ <b>class</b> Resize_Policy
+ ...&gt; :
+ <b>public</b> Resize_Policy
+</pre>
+
+<p>
+ The containers themselves lack any functionality or public interface for manipulating sizes. A container
+object merely forwards events to its resize policy object and queries it for needed actions.
+</p>
+
+<p>
+ Figure
+<a href = "#insert_resize_sequence_diagram1">
+Insert resize sequence diagram
+</a>
+shows a (possible) sequence diagram of an insert operation.
+The user inserts an element; the hash table
+notifies its resize policy that a search has started (point A);
+in this case, a single collision is encountered - the table
+notifies its resize policy of this (point B); the container
+finally notifies its resize policy that the search has ended (point C);
+it then queries its resize policy whether a resize is needed, and if so,
+what is the new size (points D to G); following the resize, it notifies
+the policy that a resize has completed (point H); finally, the element
+is inserted, and the policy notified (point I).
+</p>
+
+<h6 align = "center">
+<a name = "insert_resize_sequence_diagram1">
+<img src = "insert_resize_sequence_diagram1.jpg" width = "70%" alt = "no image">
+</a>
+</h6>
+<h6 align = "center">
+Insert resize sequence diagram.
+</h6>
+
+<p>
+ This addresses, to some extent, the problems mentioned above:
+</p>
+<ol>
+ <li>
+ Different instantiations of range-hashing policies can be met with different instantiations of
+ resize policies.
+ </li>
+ <li>
+ Questions on size-related interface are avoided, since the containers have no size-related methods. Thus
+ a container has no method for querying its actual size. It merely continuously forwards enough information to
+ its resize policy to answer such queries; the designer of the resize policy can decide whether, or how, to design the appropriate method. Also, a container has no methods for setting its size. It merely queries its
+resize policy for an initial size, queries it on a new size (if the resize policy indicates a resize is needed), and
+supports a <tt><b>protected virtual</b></tt> function for external resize.
+ </li>
+</ol>
+
+<p>
+ The library contains a single class for instantiating a resize policy,
+<tt>pb_assoc</tt> contains
+a standard resize policy,
+<a href = "ht_standard_resize_policy.html"><tt>ht_standard_resize_policy</tt></a> (the name is explained shortly).
+In terms of interface, it is parameterized by a boolean constant indicating whether its public interface supports
+queries of actual size and external resize operations (the inclusion and exclusion of these methods in the interface have obvious tradeoffs in terms of encapsulation and flexibility).
+([<a href = "references.html#alexandrescu01modern">alexandrescu01modern</a>] shows many techniques for
+changing between alternative interfaces at compile time.)
+</p>
+
+<p>
+As noted before,
+ size and trigger policies are usually orthogonal.
+<a href = "ht_standard_resize_policy.html"><tt>ht_standard_resize_policy</tt></a>
+is parameterized by size and trigger policies. For example,
+a collision-chaining hash table
+is typically be defined as follows:
+</p>
+<pre>
+cc_ht_map&lt;
+ key,
+ data,
+ ...
+ ht_standard_resize_policy&lt;
+ some_trigger_policy,
+ some_size_policy,
+ ...&gt; &gt;
+</pre>
+
+<p>
+ The sole function of
+<a href = "ht_standard_resize_policy.html"><tt>ht_standard_resize_policy</tt></a>
+ is to
+act as a standard delegator
+[<a href = "references.html#gamma95designpatterns">gamma95designpatterns</a>] for these
+policies.
+
+<p>
+ Figures
+<a href = "#insert_resize_sequence_diagram2">Standard resize policy trigger sequence diagram</a>
+ and
+<a href = "#insert_resize_sequence_diagram3">Standard resize policy size sequence diagram</a>
+ show sequence diagrams illustrating the interaction between
+ the standard resize policy and its trigger and size policies, respectively.
+</p>
+
+<h6 align = "center">
+<a name = "insert_resize_sequence_diagram2">
+<img src = "insert_resize_sequence_diagram2.jpg" width = "70%" alt = "no image">
+</a>
+</h6>
+<h6 align = "center">
+Standard resize policy trigger sequence diagram.
+</h6>
+
+<h6 align = "center">
+<a name = "insert_resize_sequence_diagram3">
+<img src = "insert_resize_sequence_diagram3.jpg" width = "70%" alt = "no image">
+</a>
+</h6>
+<h6 align = "center">
+Standard resize policy size sequence diagram.
+</h6>
+
+<p>
+ The library (currently) supports the following instantiations of size
+and trigger policies:
+</p>
+
+<ol>
+ <li>
+ <a href = "ht_load_check_trigger.html"><tt>ht_load_check_trigger</tt></a> implements
+ a load check trigger policy.
+ </li>
+ <li>
+ <a href = "ht_max_collision_check_grow_resize_trigger.html"><tt>ht_max_collision_check_grow_resize_trigger</tt></a>
+ implements a collision check trigger policy.
+ </li>
+ <li>
+<a href = "ht_exponential_size_policy.html"><tt>ht_exponential_size_policy</tt></a> implemens
+an exponential-size policy (which should be used with mask range hashing).
+ </li>
+ <li>
+<a href = "ht_prime_size_policy.html"><tt>ht_prime_size_policy</tt></a> implementing
+a size policy based on a sequence of primes
+[<a href = "references.html#sgi_stl">sgi_stl</a>] (which should be used with mod range hashing
+ </li>
+</ol>
+
+<p>
+ The trigger policies also support interfaces for changing their specifics which depend on compile time constants.
+</p>
+
+
+<p>
+ Figure
+<a href = "#resize_policy_cd">Resize policy class diagram</a> gives an overall picture
+of the resize-related classes.
+<tt>Container</tt> (which stands for any of the hash-based containers) is parameterized
+by <tt>Resize_Policy</tt>, from which it subclasses publicly
+[<a href = "references.html#alexandrescu01modern">alexandrescu01modern</a>].
+This class is currently instantiated only by
+<a href = "ht_standard_resize_policy.html"><tt>ht_standard_resize_policy</tt></a>.
+<a href = "ht_standard_resize_policy.html"><tt>ht_standard_resize_policy</tt></a> itself
+is parameterized by <tt>Trigger_Policy</tt> and <tt>Size_Policy</tt>.
+Currently, <tt>Trigger_Policy</tt> is instantiated by
+<a href = "ht_load_check_trigger.html"><tt>ht_load_check_trigger</tt></a>,
+or
+<a href = "ht_max_collision_check_grow_resize_trigger.html"><tt>ht_max_collision_check_grow_resize_trigger</tt></a>; <tt>Size_Policy</tt> is instantiated by
+<a href = "ht_exponential_size_policy.html"><tt>ht_exponential_size_policy</tt></a>,
+or
+<a href = "ht_prime_size_policy.html"><tt>ht_prime_size_policy</tt></a>.
+</p>
+
+
+<h6 align = "center">
+<a name = "resize_policy_cd">
+<img src = "resize_policy_cd.jpg" width = "70%" alt = "no image">
+</a>
+</h6>
+<h6 align = "center">
+Resize policy class diagram.
+</h6>
+
+
+
+
+<h2><a name = "#policy_interaction">Policy Interaction</a></h2>
+
+<p>
+ Hash-tables are unfortunately susceptible to choice of policies. One
+of the more complicated aspects of this is that poor combinations of good policies
+can alter performance drastically. Following are some considerations.
+</p>
+
+
+
+
+
+<h3>Range-Hashing Policies and Resize Policies</h3>
+
+<p>
+</p>
+
+
+<h3>Equivalence Functors, Storing Hash Values, and Hash Functions</h3>
+
+
+<p>
+<a href = "cc_hash_assoc_cntnr.html"><tt>cc_hash_assoc_cntnr</tt></a>
+and
+<a href = "gp_hash_assoc_cntnr.html"><tt>gp_hash_assoc_cntnr</tt></a>
+are parameterized by an equivalenc functor and by a <tt>Store_Hash</tt>
+parameter. If the latter parameter is <tt><b>true</b></tt>, then
+the container stores with each entry a hash value, and uses
+this value in case of collisions to determine whether to apply a hash value.
+This can lower the cost of collision for some types, but increase the cost of collisions for other types.
+</p>
+
+<p>
+ If a ranged-hash function or ranged probe function is directly supplied, however,
+then it makes no sense to store the hash value with each entry. <tt>pb_assoc</tt>'s container will fail at compilation, by design, if this is attempted.
+</p>
+
+
+
+</body>
+
+</html>