diff options
Diffstat (limited to 'libstdc++-v3/docs/html/27_io/howto.html')
-rw-r--r-- | libstdc++-v3/docs/html/27_io/howto.html | 345 |
1 files changed, 345 insertions, 0 deletions
diff --git a/libstdc++-v3/docs/html/27_io/howto.html b/libstdc++-v3/docs/html/27_io/howto.html new file mode 100644 index 00000000000..c7eb7c05b15 --- /dev/null +++ b/libstdc++-v3/docs/html/27_io/howto.html @@ -0,0 +1,345 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"> +<HTML> +<HEAD> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> + <META NAME="AUTHOR" CONTENT="pme@sources.redhat.com (Phil Edwards)"> + <META NAME="KEYWORDS" CONTENT="HOWTO, libstdc++, GCC, g++, libg++, STL"> + <META NAME="DESCRIPTION" CONTENT="HOWTO for the libstdc++ chapter 27."> + <META NAME="GENERATOR" CONTENT="vi and eight fingers"> + <TITLE>libstdc++-v3 HOWTO: Chapter 27</TITLE> +<LINK REL=StyleSheet HREF="../lib3styles.css"> +<!-- $Id: howto.html,v 1.5 2000/12/03 23:47:49 jsm28 Exp $ --> +</HEAD> +<BODY> + +<H1 CLASS="centered"><A NAME="top">Chapter 27: Input/Output</A></H1> + +<P>Chapter 27 deals with iostreams and all their subcomponents + and extensions. All <EM>kinds</EM> of fun stuff. +</P> + + +<!-- ####################################################### --> +<HR> +<H1>Contents</H1> +<UL> + <LI><A HREF="#1">Copying a file</A> + <LI><A HREF="#2">The buffering is screwing up my program!</A> + <LI><A HREF="#3">Binary I/O</A> + <LI><A HREF="#4">Iostreams class hierarchy diagram</A> + <LI><A HREF="#5">What is this <sstream>/stringstreams thing?</A> +</UL> + +<HR> + +<!-- ####################################################### --> + +<H2><A NAME="1">Copying a file</A></H2> + <P>So you want to copy a file quickly and easily, and most important, + completely portably. And since this is C++, you have an open + ifstream (call it IN) and an open ofstream (call it OUT): + <PRE> + #include <fstream> + + std::ifstream IN ("input_file"); + std::ofstream OUT ("output_file"); </PRE> + </P> + <P>Here's the easiest way to get it completely wrong: + <PRE> + OUT << IN;</PRE> + For those of you who don't already know why this doesn't work + (probably from having done it before), I invite you to quickly + create a simple text file called "input_file" containing + the sentence + <PRE> + The quick brown fox jumped over the lazy dog.</PRE> + surrounded by blank lines. Code it up and try it. The contents + of "output_file" may surprise you. + </P> + <P>Seriously, go do it. Get surprised, then come back. It's worth it. + </P> + <HR WIDTH="60%"> + <P>The thing to remember is that the <TT>basic_[io]stream</TT> classes + handle formatting, nothing else. In particular, they break up on + whitespace. The actual reading, writing, and storing of data is + handled by the <TT>basic_streambuf</TT> family. Fortunately, the + <TT>operator<<</TT> is overloaded to take an ostream and + a pointer-to-streambuf, in order to help with just this kind of + "dump the data verbatim" situation. + </P> + <P>Why a <EM>pointer</EM> to streambuf and not just a streambuf? Well, + the [io]streams hold pointers (or references, depending on the + implementation) to their buffers, not the actual + buffers. This allows polymorphic behavior on the part of the buffers + as well as the streams themselves. The pointer is easily retrieved + using the <TT>rdbuf()</TT> member function. Therefore, the easiest + way to copy the file is: + <PRE> + OUT << IN.rdbuf();</PRE> + </P> + <P>So what <EM>was</EM> happening with OUT<<IN? Undefined + behavior, since that particular << isn't defined by the Standard. + I have seen instances where it is implemented, but the character + extraction process removes all the whitespace, leaving you with no + blank lines and only "Thequickbrownfox...". With + libraries that do not define that operator, IN (or one of IN's + member pointers) sometimes gets converted to a void*, and the output + file then contains a perfect text representation of a hexidecimal + address (quite a big surprise). Others don't compile at all. + </P> + <P>Also note that none of this is specific to o<B>*f*</B>streams. + The operators shown above are all defined in the parent + basic_ostream class and are therefore available with all possible + descendents. + </P> + <P>Return <A HREF="#top">to top of page</A> or + <A HREF="../faq/index.html">to the FAQ</A>. + </P> + +<HR> +<H2><A NAME="2">The buffering is screwing up my program!</A></H2> +<!-- + This is not written very well. I need to redo this section. +--> + <P>First, are you sure that you understand buffering? Particularly + the fact that C++ may not, in fact, have anything to do with it? + </P> + <P>The rules for buffering can be a little odd, but they aren't any + different from those of C. (Maybe that's why they can be a bit + odd.) Many people think that writing a newline to an output + stream automatically flushes the output buffer. This is true only + when the output stream is, in fact, a terminal and not a file + or some other device -- and <EM>that</EM> may not even be true + since C++ says nothing about files nor terminals. All of that is + system-dependant. (The "newline-buffer-flushing only occuring + on terminals" thing is mostly true on Unix systems, though.) + </P> + <P>Some people also believe that sending <TT>endl</TT> down an + output stream only writes a newline. This is incorrect; after a + newline is written, the buffer is also flushed. Perhaps this + is the effect you want when writing to a screen -- get the text + out as soon as possible, etc -- but the buffering is largely + wasted when doing this to a file: + <PRE> + output << "a line of text" << endl; + output << some_data_variable << endl; + output << "another line of text" << endl; </PRE> + The proper thing to do in this case to just write the data out + and let the libraries and the system worry about the buffering. + If you need a newline, just write a newline: + <PRE> + output << "a line of text\n" + << some_data_variable << '\n' + << "another line of text\n"; </PRE> + I have also joined the output statements into a single statement. + You could make the code prettier by moving the single newline to + the start of the quoted text on the thing line, for example. + </P> + <P>If you do need to flush the buffer above, you can send an + <TT>endl</TT> if you also need a newline, or just flush the buffer + yourself: + <PRE> + output << ...... << flush; // can use std::flush manipulator + output.flush(); // or call a member fn </PRE> + </P> + <P>On the other hand, there are times when writing to a file should + be like writing to standard error; no buffering should be done + because the data needs to appear quickly (a prime example is a + log file for security-related information). The way to do this is + just to turn off the buffering <EM>before any I/O operations at + all</EM> have been done, i.e., as soon as possible after opening: + <PRE> + std::ofstream os ("/foo/bar/baz"); + std::ifstream is ("/qux/quux/quuux"); + int i; + + os.rdbuf()->pubsetbuf(0,0); + is.rdbuf()->pubsetbuf(0,0); + ... + os << "this data is written immediately\n"; + is >> i; // and this will probably cause a disk read </PRE> + </P> + <P>Since all aspects of buffering are handled by a streambuf-derived + member, it is necessary to get at that member with <TT>rdbuf()</TT>. + Then the public version of <TT>setbuf</TT> can be called. The + arguments are the same as those for the Standard C I/O Library + function (a buffer area followed by its size). + </P> + <P>A great deal of this is implementation-dependant. For example, + <TT>streambuf</TT> does not specify any actions for its own + <TT>setbuf()</TT>-ish functions; the classes derived from + <TT>streambuf</TT> each define behavior that "makes + sense" for that class: an argument of (0,0) turns off + buffering for <TT>filebuf</TT> but has undefined behavior for + its sibling <TT>stringbuf</TT>, and specifying anything other + than (0,0) has varying effects. Other user-defined class derived + from streambuf can do whatever they want. + </P> + <P>A last reminder: there are usually more buffers involved than + just those at the language/library level. Kernel buffers, disk + buffers, and the like will also have an effect. Inspecting and + changing those are system-dependant. + </P> + <P>Return <A HREF="#top">to top of page</A> or + <A HREF="../faq/index.html">to the FAQ</A>. + </P> + +<HR> +<H2><A NAME="3">Binary I/O</A></H2> + <P>The first and most important thing to remember about binary I/O is + that opening a file with <TT>ios::binary</TT> is not, repeat + <EM>not</EM>, the only thing you have to do. It is not a silver + bullet, and will not allow you to use the <TT><</>></TT> + operators of the normal fstreams to do binary I/O. + </P> + <P>Sorry. Them's the breaks. + </P> + <P>This isn't going to try and be a complete tutorial on reading and + writing binary files (because "binary" covers a lot of + ground), but we will try and clear up a couple of misconceptions + and common errors. + </P> + <P>First, <TT>ios::binary</TT> has exactly one defined effect, no more + and no less. Normal text mode has to be concerned with the newline + characters, and the runtime system will translate between (for + example) '\n' and the appropriate end-of-line sequence (LF on Unix, + CRLF on DOS, CR on Macintosh, etc). (There are other things that + normal mode does, but that's the most obvious.) Opening a file in + binary mode disables this conversion, so reading a CRLF sequence + under Windows won't accidentally get mapped to a '\n' character, etc. + Binary mode is not supposed to suddenly give you a bitstream, and + if it is doing so in your program then you've discovered a bug in + your vendor's compiler (or some other part of the C++ implementation, + possibly the runtime system). + </P> + <P>Second, using <TT><<</TT> to write and <TT>>></TT> to + read isn't going to work with the standard file stream classes, even + if you use <TT>skipws</TT> during reading. Why not? Because + ifstream and ofstream exist for the purpose of <EM>formatting</EM>, + not reading and writing. Their job is to interpret the data into + text characters, and that's exactly what you don't want to happen + during binary I/O. + </P> + <P>Third, using the <TT>get()</TT> and <TT>put()/write()</TT> member + functions still aren't guaranteed to help you. These are + "unformatted" I/O functions, but still character-based. + (This may or may not be what you want.) + </P> + <P>Notice how all the problems here are due to the inappropriate use + of <EM>formatting</EM> functions and classes to perform something + which <EM>requires</EM> that formatting not be done? There are a + seemingly infinite number of solutions, and a few are listed here: + <UL> + <LI>"Derive your own fstream-type classes and write your own + <</>> operators to do binary I/O on whatever data + types you're using." This is a Bad Thing, because while + the compiler would probably be just fine with it, other humans + are going to be confused. The overloaded bitshift operators + have a well-defined meaning (formatting), and this breaks it. + <LI>"Build the file structure in memory, then <TT>mmap()</TT> + the file and copy the structure." Well, this is easy to + make work, and easy to break, and is pretty equivalent to + using <TT>::read()</TT> and <TT>::write()</TT> directly, and + makes no use of the iostream library at all... + <LI>"Use streambufs, that's what they're there for." + While not trivial for the beginner, this is the best of all + solutions. The streambuf/filebuf layer is the layer that is + responsible for actual I/O. If you want to use the C++ + library for binary I/O, this is where you start. + </UL> + </P> + <P>How to go about using streambufs is a bit beyond the scope of this + document (at least for now), but while streambufs go a long way, + they still leave a couple of things up to you, the programmer. + As an example, byte ordering is completely between you and the + operating system, and you have to handle it yourself. + </P> + <P>Deriving a streambuf or filebuf + class from the standard ones, one that is specific to your data + types (or an abstraction thereof) is probably a good idea, and + lots of examples exist in journals and on Usenet. Using the + standard filebufs directly (either by declaring your own or by + using the pointer returned from an fstream's <TT>rdbuf()</TT>) + is certainly feasible as well. + </P> + <P>One area that causes problems is trying to do bit-by-bit operations + with filebufs. C++ is no different from C in this respect: I/O + must be done at the byte level. If you're trying to read or write + a few bits at a time, you're going about it the wrong way. You + must read/write an integral number of bytes and then process the + bytes. (For example, the streambuf functions take and return + variables of type <TT>int_type</TT>.) + </P> + <P>Another area of problems is opening text files in binary mode. + Generally, binary mode is intended for binary files, and opening + text files in binary mode means that you now have to deal with all of + those end-of-line and end-of-file problems that we mentioned before. + An instructive thread from comp.lang.c++.moderated delved off into + this topic starting more or less at + <A HREF="http://www.deja.com/getdoc.xp?AN=436187505">this</A> + article and continuing to the end of the thread. (You'll have to + sort through some flames every couple of paragraphs, but the points + made are good ones.) + </P> + +<HR> +<H2><A NAME="4">Iostreams class hierarchy diagram</A></H2> + <P>The <A HREF="iostreams_hierarchy.pdf">diagram</A> is in PDF. Rumor + has it that once Benjamin Kosnik has been dead for a few decades, + this work of his will be hung next to the Mona Lisa in the + <A HREF="http://www.louvre.fr/">Musee du Louvre</A>. + </P> + +<HR> +<H2><A NAME="5">What is this <sstream>/stringstreams thing?</A></H2> + <P>Stringstreams (defined in the header <TT><sstream></TT>) + are in this author's opinion one of the coolest things since + sliced time. An example of their use is in the Received Wisdom + section for Chapter 21 (Strings), + <A HREF="../21_strings/howto.html#1.1internal"> describing how to + format strings</A>. + </P> + <P>The quick definition is: they are siblings of ifstream and ofstream, + and they do for <TT>std::string</TT> what their siblings do for + files. All that work you put into writing <TT><<</TT> and + <TT>>></TT> functions for your classes now pays off + <EM>again!</EM> Need to format a string before passing the string + to a function? Send your stuff via <TT><<</TT> to an + ostringstream. You've read a string as input and need to parse it? + Initialize an istringstream with that string, and then pull pieces + out of it with <TT>>></TT>. Have a stringstream and need to + get a copy of the string inside? Just call the <TT>str()</TT> + member function. + </P> + <P>This only works if you've written your + <TT><<</TT>/<TT>>></TT> functions correctly, though, + and correctly means that they take istreams and ostreams as + parameters, not i<B>f</B>streams and o<B>f</B>streams. If they + take the latter, then your I/O operators will work fine with + file streams, but with nothing else -- including stringstreams. + </P> + <P>If you are a user of the strstream classes, you need to update + your code. You don't have to explicitly append <TT>ends</TT> to + terminate the C-style character array, you don't have to mess with + "freezing" functions, and you don't have to manage the + memory yourself. The strstreams have been officially deprecated, + which means that 1) future revisions of the C++ Standard won't + support them, and 2) if you use them, people will laugh at you. + </P> + + +<!-- ####################################################### --> + +<HR> +<P CLASS="fineprint"><EM> +Comments and suggestions are welcome, and may be sent to +<A HREF="mailto:pme@sources.redhat.com">Phil Edwards</A> or +<A HREF="mailto:gdr@gcc.gnu.org">Gabriel Dos Reis</A>. +<BR> $Id: howto.html,v 1.5 2000/12/03 23:47:49 jsm28 Exp $ +</EM></P> + + +</BODY> +</HTML> + + |