aboutsummaryrefslogtreecommitdiff
path: root/libstdc++-v3/docs/html/ext/pb_ds/tree_based_containers.html
blob: 7a1b554b26bdda6573d7e384bddb74bc98c7b681 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta name="generator" content=
  "HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" />

  <title>Tree-Based Containers</title>
  <meta http-equiv="Content-Type" content=
  "text/html; charset=us-ascii" />
  </head>

<body>
  <div id="page">
    <h1>Tree Design</h1>

    <h2><a name="overview" id="overview">Overview</a></h2>

    <p>The tree-based container has the following declaration:</p>
    <pre>
<b>template</b>&lt;
    <b>typename</b> Key,
    <b>typename</b> Mapped,
    <b>typename</b> Cmp_Fn = std::less&lt;Key&gt;,
    <b>typename</b> Tag = <a href="rb_tree_tag.html">rb_tree_tag</a>,
    <b>template</b>&lt;
        <b>typename</b> Const_Node_Iterator,
        <b>typename</b> Node_Iterator,
        <b>typename</b> Cmp_Fn_,
        <b>typename</b> Allocator_&gt;
    <b>class</b> Node_Update = <a href=
"null_tree_node_update.html">null_tree_node_update</a>,
    <b>typename</b> Allocator = std::allocator&lt;<b>char</b>&gt; &gt;
<b>class</b> <a href=
"tree.html">tree</a>;
</pre>

    <p>The parameters have the following meaning:</p>

    <ol>
      <li><tt>Key</tt> is the key type.</li>

      <li><tt>Mapped</tt> is the mapped-policy.</li>

      <li><tt>Cmp_Fn</tt> is a key comparison functor</li>

      <li><tt>Tag</tt> specifies which underlying data structure
      to use.</li>

      <li><tt>Node_Update</tt> is a policy for updating node
      invariants. This is described in <a href="#invariants">Node
      Invariants</a>.</li>

      <li><tt>Allocator</tt> is an allocator
      type.</li>
    </ol>

    <p>The <tt>Tag</tt> parameter specifies which underlying
    data structure to use. Instantiating it by <a href=
    "rb_tree_tag.html"><tt>rb_tree_tag</tt></a>, <a href=
    "splay_tree_tag.html"><tt>splay_tree_tag</tt></a>, or
    <a href="ov_tree_tag.html"><tt>ov_tree_tag</tt></a>,
    specifies an underlying red-black tree, splay tree, or
    ordered-vector tree, respectively; any other tag is illegal.
    Note that containers based on the former two contain more types
    and methods than the latter (<i>e.g.</i>,
    <tt>reverse_iterator</tt> and <tt>rbegin</tt>), and different
    exception and invalidation guarantees.</p>

    <h2><a name="invariants" id="invariants">Node
    Invariants</a></h2>

    <p>Consider the two trees in Figures <a href=
    "#node_invariants">Some node invariants</a> A and B. The first
    is a tree of floats; the second is a tree of pairs, each
    signifying a geometric line interval. Each element in a tree is refered to as a node of the tree. Of course, each of
    these trees can support the usual queries: the first can easily
    search for <tt>0.4</tt>; the second can easily search for
    <tt>std::make_pair(10, 41)</tt>.</p>

    <p>Each of these trees can efficiently support other queries.
    The first can efficiently determine that the 2rd key in the
    tree is <tt>0.3</tt>; the second can efficiently determine
    whether any of its intervals overlaps
    <tt>std::make_pair(29,42)</tt> (useful in geometric
    applications or distributed file systems with leases, for
    example). (See <a href=
    "../../../../testsuite/ext/pb_ds/example/tree_order_statistics.cc"><tt>tree_order_statistics.cc</tt></a>
    and <a href=
    "../../../../testsuite/ext/pb_ds/example/tree_intervals.cc"><tt>tree_intervals.cc</tt></a>
    for examples.) It should be noted that an <tt>std::set</tt> can
    only solve these types of problems with linear complexity.</p>

    <p>In order to do so, each tree stores some <i>metadata</i> in
    each node, and maintains node invariants <a href=
    "references.html#clrs2001">clrs2001</a>]. The first stores in
    each node the size of the sub-tree rooted at the node; the
    second stores at each node the maximal endpoint of the
    intervals at the sub-tree rooted at the node.</p>

    <h6 class="c1"><a name="node_invariants" id=
    "node_invariants"><img src="node_invariants.png" alt=
    "no image" /></a></h6>

    <h6 class="c1">Some node invariants.</h6>

    <p>Supporting such trees is difficult for a number of
    reasons:</p>

    <ol>
      <li>There must be a way to specify what a node's metadata
      should be (if any).</li>

      <li>Various operations can invalidate node invariants.
      <i>E.g.</i>, Figure <a href=
      "#node_invariant_invalidations">Invalidation of node
      invariants</a> shows how a right rotation, performed on A,
      results in B, with nodes <i>x</i> and <i>y</i> having
      corrupted invariants (the grayed nodes in C); Figure <a href=
      "#node_invariant_invalidations">Invalidation of node
      invariants</a> shows how an insert, performed on D, results
      in E, with nodes <i>x</i> and <i>y</i> having corrupted
      invariants (the grayed nodes in F). It is not feasible to
      know outside the tree the effect of an operation on the nodes
      of the tree.</li>

      <li>The search paths of standard associative containers are
      defined by comparisons between keys, and not through
      metadata.</li>

      <li>It is not feasible to know in advance which methods trees
      can support. Besides the usual <tt>find</tt> method, the
      first tree can support a <tt>find_by_order</tt> method, while
      the second can support an <tt>overlaps</tt> method.</li>
    </ol>

    <h6 class="c1"><a name="node_invariant_invalidations" id=
    "node_invariant_invalidations"><img src=
    "node_invariant_invalidations.png" alt="no image" /></a></h6>

    <h6 class="c1">Invalidation of node invariants.</h6>

    <p>These problems are solved by a combination of two means:
    node iterators, and template-template node updater
    parameters.</p>

    <h3><a name="node_it" id="node_it">Node Iterators</a></h3>

    <p>Each tree-based container defines two additional iterator
    types, <a href=
    "tree_const_node_iterator.html"><tt>const_node_iterator</tt></a>
    and <a href=
    "tree_node_iterator.html"><tt>node_iterator</tt></a>.
    These iterators allow descending from a node to one of its
    children. Node iterator allow search paths different than those
    determined by the comparison functor. <a href=
    "tree.html">tree</a>
    supports the methods:</p>
    <pre>
    <a href="tree_const_node_iterator.html"><tt>const_node_iterator</tt></a>
    node_begin() <b>const</b>;

    <a href="tree_node_iterator.html"><tt>node_iterator</tt></a>
    node_begin();

    <a href="tree_const_node_iterator.html"><tt>const_node_iterator</tt></a>
    node_end() <b>const</b>;

    <a href="tree_node_iterator.html"><tt>node_iterator</tt></a>
    node_end(); 
</pre>

    <p>The first pairs return node iterators corresponding to the
    root node of the tree; the latter pair returns node iterators
    corresponding to a just-after-leaf node.</p>

    <h3><a name="node_up" id="node_up">Node Updater
    (Template-Template) Parameters</a></h3>

    <p>The tree-based containers are parametrized by a
    <tt>Node_Update</tt> template-template parameter. A tree-based
    container instantiates <tt>Node_Update</tt> to some
    <tt>node_update</tt> class, and publicly
    subclasses <tt>node_update</tt>. Figure
    <a href="#tree_node_update_cd">A tree and its update
    policy</a> shows this scheme, as well as some predefined
    policies (which are explained below).</p>

    <h6 class="c1"><a name="tree_node_update_cd" id=
    "tree_node_update_cd"><img src=
    "tree_node_update_policy_cd.png" alt="no image" /></a></h6>

    <h6 class="c1">A tree and its update policy.</h6>

    <p><tt>node_update</tt> (an instantiation of
    <tt>Node_Update</tt>) must define <tt>metadata_type</tt> as
    the type of metadata it requires. For order statistics,
    <i>e.g.</i>, <tt>metadata_type</tt> might be <tt>size_t</tt>.
    The tree defines within each node a <tt>metadata_type</tt>
    object.</p>

    <p><tt>node_update</tt> must also define the following method
    for restoring node invariants:</p>
    <pre>
    void 
    operator()(<a href=
"tree_node_iterator.html"><tt>node_iterator</tt></a> nd_it, <a href=
"tree_const_node_iterator.html"><tt>const_node_iterator</tt></a> end_nd_it)
</pre>

    <p>In this method, <tt>nd_it</tt> is a <a href=
    "tree_node_iterator.html"><tt>node_iterator</tt></a>
    corresponding to a node whose A) all descendants have valid
    invariants, and B) its own invariants might be violated;
    <tt>end_nd_it</tt> is a <a href=
    "tree_const_node_iterator.html"><tt>const_node_iterator</tt></a>
    corresponding to a just-after-leaf node. This method should
    correct the node invariants of the node pointed to by
    <tt>nd_it</tt>. For example, say node <i>x</i> in Figure
    <a href="#restoring_node_invariants">Restoring node
    invariants</a>-A has an invalid invariant, but its' children,
    <i>y</i> and <i>z</i> have valid invariants. After the
    invocation, all three nodes should have valid invariants, as in
    Figure <a href="#restoring_node_invariants">Restoring node
    invariants</a>-B.</p>

    <h6 class="c1"><a name="restoring_node_invariants" id=
    "restoring_node_invariants"><img src=
    "restoring_node_invariants.png" alt="no image" /></a></h6>

    <h6 class="c1">Invalidation of node invariants.</h6>

    <p>When a tree operation might invalidate some node invariant,
    it invokes this method in its <tt>node_update</tt> base to
    restore the invariant. For example, Figure <a href=
    "#update_seq_diagram">Insert update sequence diagram</a> shows
    an <tt>insert</tt> operation (point A); the tree performs some
    operations, and calls the update functor three times (points B,
    C, and D). (It is well known that any <tt>insert</tt>,
    <tt>erase</tt>, <tt>split</tt> or <tt>join</tt>, can restore
    all node invariants by a small number of node invariant updates
    [<a href="references.html#clrs2001">clrs2001</a>].)</p>

    <h6 class="c1"><a name="update_seq_diagram" id=
    "update_seq_diagram"><img src="update_seq_diagram.png" alt=
    "no image" /></a></h6>

    <h6 class="c1">Insert update sequence diagram.</h6>

    <p>To complete the description of the scheme, three questions
    need to be answered:</p>

    <ol>
      <li>How can a tree which supports order statistics define a
      method such as <tt>find_by_order</tt>?</li>

      <li>How can the node updater base access methods of the
      tree?</li>

      <li>How can the following cyclic dependency be resolved?
      <tt>node_update</tt> is a base class of the tree, yet it
      uses node iterators defined in the tree (its child).</li>
    </ol>

    <p>The first two questions are answered by the fact that
    <tt>node_update</tt> (an instantiation of
    <tt>Node_Update</tt>) is a <tt><b>public</b></tt> base class
    of the tree. Consequently:</p>

    <ol>
      <li>Any public methods of <tt>node_update</tt> are
      automatically methods of the tree [<a href=
      "references.html#alexandrescu01modern">alexandrescu01modern</a>].
      Thus an order-statistics node updater, <a href=
      "tree_order_statistics_node_update.html"><tt>tree_order_statistics_node_update</tt></a>
      defines the <tt>find_by_order</tt> method; any tree
      instantiated by this policy consequently supports this method
      as well.</li>

      <li>In C++, if a base class declares a method as
      <tt><b>virtual</b></tt>, it is <tt><b>virtual</b></tt> in its
      subclasses. If <tt>node_update</tt> needs to access one of
      the tree's methods, say the member function <tt>end</tt>, it simply
      declares that method as <tt><b>virtual</b></tt>
      abstract.</li>
    </ol>

    <p>The cyclic dependency is solved through template-template
    parameters. <tt>Node_Update</tt> is parametrized by the tree's node iterators, its comparison
    functor, and its allocator type. Thus, 
    instantiations of <tt>Node_Update</tt> have all information required.</p>

    <p class="c1"><tt>pb_ds</tt> assumes that constructing a metadata object and modifying it
    are exception free. Suppose that during some method, say
    <tt>insert</tt>, a metadata-related operation
    (<i>e.g.</i>, changing the value of a metadata) throws an
    exception. Ack! Rolling back the method is unusually complex.</p>

    <p>In <a href=
    "concepts.html#concepts_null_policies">Interface::Concepts::Null
    Policy Classes</a> a distinction was made between <i>redundant
    policies</i> and <i>null policies</i>. Node invariants show a
    case where null policies are required.</p>

    <p>Assume a regular tree is required, one which need not
    support order statistics or interval overlap queries.
    Seemingly, in this case a redundant policy - a policy which
    doesn't affect nodes' contents would suffice. This, would lead
    to the following drawbacks:</p>

    <ol>
      <li>Each node would carry a useless metadata object, wasting
      space.</li>

      <li>The tree cannot know if its <tt>Node_Update</tt> policy
      actually modifies a node's metadata (this is halting
      reducible). In Figure <a href=
      "#rationale_null_node_update">Useless update path</a> ,
      assume the shaded node is inserted. The tree would have to
      traverse the useless path shown to the root, applying
      redundant updates all the way.</li>
    </ol>

    <h6 class="c1"><a name="rationale_null_node_update" id=
    "rationale_null_node_update"><img src=
    "rationale_null_node_update.png" alt="no image" /></a></h6>

    <h6 class="c1">Useless update path.</h6>

    <p>A null policy class, <a href=
    "null_tree_node_update.html"><tt>null_tree_node_update</tt></a>
    solves both these problems. The tree detects that node
    invariants are irrelevant, and defines all accordingly.</p>

    <h2><a name="add_methods" id="add_methods">Additional
    Methods</a></h2>

    <p>Tree-based containers support split and join methods.
    It is possible to split a tree so that it passes
    all nodes with keys larger than a given key to a different
    tree. These methods have the following advantages over the
    alternative of externally inserting to the destination
    tree and erasing from the source tree:</p>

    <ol>
      <li>These methods are efficient - red-black trees are split
      and joined in poly-logarithmic complexity; ordered-vector
      trees are split and joined at linear complexity. The
      alternatives have super-linear complexity.</li>

      <li>Aside from orders of growth, these operations perform
      few allocations and de-allocations. For red-black trees, allocations are not performed,
      and the methods are exception-free. </li>
    </ol>
  </div>
</body>
</html>