<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Arbitrary Character Types</title><meta name="generator" content="DocBook XSL Stylesheets V1.73.2" /><meta name="keywords" content="&#10;      ISO C++&#10;    , &#10;      library&#10;    " /><link rel="start" href="../spine.html" title="The GNU C++ Library Documentation" /><link rel="up" href="bk01pt05ch13.html" title="Chapter 13. String Classes" /><link rel="prev" href="bk01pt05ch13s02.html" title="Case Sensitivity" /><link rel="next" href="bk01pt05ch13s04.html" title="Tokenizing" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Arbitrary Character Types</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="bk01pt05ch13s02.html">Prev</a> </td><th width="60%" align="center">Chapter 13. String Classes</th><td width="20%" align="right"> <a accesskey="n" href="bk01pt05ch13s04.html">Next</a></td></tr></table><hr /></div><div class="sect1" lang="en" xml:lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="strings.string.character_types"></a>Arbitrary Character Types</h2></div></div></div><p>
    </p><p>The <code class="code">std::basic_string</code> is tantalizingly general, in that
      it is parameterized on the type of the characters which it holds.
      In theory, you could whip up a Unicode character class and instantiate
      <code class="code">std::basic_string&lt;my_unicode_char&gt;</code>, or assuming
      that integers are wider than characters on your platform, maybe just
      declare variables of type <code class="code">std::basic_string&lt;int&gt;</code>.
   </p><p>That's the theory.  Remember, however, that <code class="code">basic_string</code> has additional
      type parameters, which take default arguments based on the character
      type (called <code class="code">CharT</code> here):
   </p><pre class="programlisting">
      template &lt;typename CharT,
                typename Traits = char_traits&lt;CharT&gt;,
                typename Alloc = allocator&lt;CharT&gt; &gt;
      class basic_string { .... };</pre><p>Now, <code class="code">allocator&lt;CharT&gt;</code> will probably Do The Right
      Thing by default, unless you need to implement your own allocator
      for your characters.
   </p><p>But <code class="code">char_traits</code> takes more work.  For the general case, the members
      of the char_traits template are <span class="emphasis"><em>declared</em></span> but not <span class="emphasis"><em>defined</em></span>.
      That means there is only
   </p><pre class="programlisting">
      template &lt;typename CharT&gt;
        struct char_traits
        {
            static void foo (type1 x, type2 y);
            ...
        };</pre><p>and functions such as <code class="code">char_traits&lt;CharT&gt;::foo()</code> are not
      actually defined anywhere for the general case.  The C++ standard
      permits this, because no single definition can fit every possible
      <code class="code">CharT</code>.
   </p><p>The C++ standard also requires that char_traits be specialized for
      <code class="code">char</code> and <code class="code">wchar_t</code>, and it
      is these template specializations that permit entities like
      <code class="code">basic_string&lt;char, char_traits&lt;char&gt; &gt;</code> to work.
   </p><p>If you want to use character types other than char and wchar_t,
      such as <code class="code">unsigned char</code> and <code class="code">int</code>, you will
      need suitable specializations for them.  Earlier versions of GCC
      shipped a mostly-correct generic implementation that let programmers
      be lazy, but it broke in many situations and so it
      was removed.  GCC 3.4 introduced a new implementation that mostly
      works and can be specialized even for <code class="code">int</code> and other
      built-in types.
   </p><p>If you want to use your own special character class, then you have
      <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00163.html" target="_top">a lot
      of work to do</a>, especially if you wish to use i18n features
      (facets require traits information but don't have a traits argument).
   </p><p>Another example of how to specialize char_traits was given <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00260.html" target="_top">on the
      mailing list</a> and was later added as the file
      <code class="code">include/ext/pod_char_traits.h</code>.  We agree
      that the way it's used with basic_string (scroll down to main())
      doesn't look nice, but that's because <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00236.html" target="_top">the
      nice-looking first attempt</a> turned out to <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00242.html" target="_top">not
      be conforming C++</a>, due to the rule that CharT must be a POD.
      (See how tricky this is?)
   </p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="bk01pt05ch13s02.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="bk01pt05ch13.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="bk01pt05ch13s04.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Case Sensitivity </td><td width="20%" align="center"><a accesskey="h" href="../spine.html">Home</a></td><td width="40%" align="right" valign="top"> Tokenizing</td></tr></table></div></body></html>