Standard C++ IOStreams and Locales
C++ IOSTREAMS
AND LOCALES
Advanced Programmer’s Guide and Reference
Angelika Langer
and
Klaus Kreft
Addison-Wesley
An imprint of Addison Wesley Longman, Inc.
Reading, Massachusetts • Harlow, England • Menlo Park, California
Berkeley, California • Don Mills, Ontario • Sydney
Bonn • Amsterdam • Tokyo • Mexico City
Many of the designations used by manufacturers and sellers to distinguish their products are
claimed as trademarks. Where those designations appear in this book and Addison Wesley Long-
man, Inc. was aware of a trademark claim, the designations have been printed in initial capital let-
ters or all capital letters.
The authors and the publisher have taken care in the preparation of this book, but make no
expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of the
information or programs contained herein.
The publisher offers discounts on this book when ordered in quantity for special sales. For more
information, please contact:
Langer, Angelika.
Standard C++ IOStreams and locales : advanced programmer’s guide and reference /
Angelika Langer, Klaus Kreft.
p. cm.
Includes bibliographical references and index.
ISBN 0-201-18395-1
1. C++ (Computer program language) I. Kreft, Klaus. II. Title.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or
transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording, or oth-
erwise, without the prior consent of the publisher. Printed in the United States of America. Pub-
lished simultaneously in Canada.
ISBN 0-201-18395-1
Text printed on recycled paper
1 2 3 4 5 6 7 8 9 10-CRW-04 03 02 01 00
First printing, January 2000
CONTENTS
Foreword
Preface
Guide for Readers
1. IOStreams Basics
1.1 Input and Output
1.2 Formatted Input/Output
1.2.1 The Predefined Global Streams
1.2.2 The Input and Output Operators
1.2.3 The Format Parameters of a Stream
1.2.4 Manipulators
1.2.5 The Locale of a Stream
1.2.6 Comparison Between Formatted Input and Output
1.2.7 Peculiarities of Formatted Input
1.3 The Stream State
1.3.1 The Stream State Flags
1.3.2 Checking the Stream State
1.3.3 Catching Stream Exceptions
1.3.4 Resetting the Stream State
1.4 File Input/Output
1.4.1 Creating, Opening, Closing and Destroying File Streams
1.4.2 The Open Modes
1.4.3 Bidirectional File Streams
5. Locales
5.1 Creating Locale Objects
5.1.1 Named Locales
5.1.2 Combined Locales
5.1.3 The Global Locale
5.2 Retrieving Facets from a Locale
5.2.1 has_facet
5.2.2 use_facet
REFERENCE GUIDE
Introduction
1. Locale
header file <locale>
global functions
codecvt<internT,externT,stateT>
codecvt_base
codecvt_byname<internT,externT,stateT>
collate<charT>
collate_byname<charT>
ctype<charT>
ctype<char>
ctype_base
ctype_byname<charT>
locale
messages<charT>
messages_base
messages_byname<charT>
money_base
money_get<charT,InputIterator>
moneypunct<charT,Inter>
moneypunct_byname<charT,Inter>
money_put<charT,OutputIterator>
num_get<charT,InputIterator>
numpunct<charT>
numpunct_byname<charT>
num_put<charT,OutputIterator>
time_base
time_get<charT,InputIterator>
time_get_byname<charT,InputIterator>
time_put<charT,OutputIterator>
time_put_byname<charT,OutputIterator>
time_base
tm
3. IOStreams
header file <iosfwd>
global type definitions
global objects
basic_filebuf<charT,traits>
basic_fstream<charT,traits>
basic_ifstream<charT,traits>
basic_ios<charT,traits>
basic_iostream<charT,traits>
basic_istream<charT,traits>
basic_istringstream<charT,traits,Allocator>
basic_ofstream<charT,traits>
basic_ostream<charT,traits>
basic_ostringstream<charT,traits,Allocator>
basic_streambuf<charT,traits>
basic_stringbuf<charT,traits,Allocator>
basic_stringstream<charT,traits,Allocator>
fpos<stateT>
ios_base
manipulators
APPENDICES
FOREWORD
I began working on C++ input/output libraries around 1986. I started by fixing a bug I
had noticed in the stream library that was part of the internal Bell Labs distribution of
C++. I decided to also add some of the functionality that was present in the C stdio library
but was not in the stream library. As I looked at the stream library, I realized that the archi-
tecture of an input/output library in C++ raised interesting issues about C++ design, and
I conceived what became the IOStream library. Originally I thought of this as a personal
library that could be used for experimentation and had no intention of replacing the stream
library. But at some point, Bjarne Stroustrup encouraged the product organization that was
responsible for C++ to replace the stream library with the IOStream library. They did so,
and what started out as an exercise that I expected to last a few months became an effort
that would span more than ten years, including the ANSI/ISO standardization effort.
A major goal in my original design was that it be extensible in interesting ways. In
particular, in the stream library the streambuf class was an implementation detail, but in
the IOStream library I intended it to be a usable class in its own right. I was hoping for the
promulgation of many streambufs with varied functionality. I wrote a few myself, but
almost no one else did. I answered many more questions of the type “How do I make my
numbers look like this?” than “How do I write a streambuf?” And textbook authors also
tended to ignore streambufs. Apparently they did not share my view that the architecture
of the input/output library was an interesting case study.
Another common question addressed to me was “Why weren't the members of the
stream classes virtual?” On further discussion it usually became clear that what the ques-
tioner needed to do was write a streambuf with some particular functionality. Or some-
times what was needed was to write an alternative top-level class that used streambufs
for transport. When I would explain this, they frequently lost interest.
All this led to a sense of frustration and disappointment that the library was not
being used to its best advantage. Several friends encouraged me to address this frustra-
tion by writing a book, the theory being that the extensibility features were hard to under-
stand but that a book explaining them clearly would encourage people to use them. As
the C++ standardization effort went on and the library became more complicated, the
need for such a book increased. I started on the project several times, but never organized
the time and energy required of an author.
During this period, when I heard about books on the IOStream library I approached
them with mixed feelings. On the one hand I was always glad to see the IOStream library
getting attention, on the other I worried that this book would beat me to the punch. But
my fears were groundless: None of them addressed the issues of architecture and C++
design that were my major concern.
None, that is, until the present book. Not only does it address the kinds of questions
that concern me, but also it does so with concrete examples that will enable readers to
quickly adapt the ideas to their own requirements. I am no longer contemplating writing
a book on the library so I have only positive feelings about this book.
—Jerry Schwarz
November 1999
PREFACE
Since 1998, the programming language C++ has been formally specified in the form of the
ISO/IEC International Standard 14882, a document that for historical reasons is often
referred to as the ANSI C++ Standard. Integral to this standard is a rich set of abstractions
known as the C++ Standard Library. In fact, half of the standard is devoted to the library.
This book covers two major domains of the standard library: IOStreams and locales.
During the process of standardization from 1989 to 1998, the new language features
and the standard library created a fair amount of interest in the C++ community. To
address this need for information, several books were published during and after the
standardization. Some cover standard C++ in general, typically including a brief intro-
duction to some of the library abstractions; one textbook is devoted exclusively to the
standard library. However, the only part of the library that has been discussed in depth so
far is the STL, a set of collections and algorithms that was developed at Hewlett-Packard
independent of the standardization effort and was later integrated into the C++ standard
library. While the STL is, without doubt, the most popular part of the standard library, it
represents less than a third of the library as a whole (counting the pages in the standards
document and considering the time that the committee spent on it), whereas IOStreams
and locales form another third of the library.
When we got involved with the standardization of C++ through our professional
occupations in 1993, hardly anything had been published about IOStreams, and C++
locales had not yet been invented. The only book on IOStreams was the C++ IOStreams
Handbook by Steve Teale, which describes the classic, prestandard IOStreams; and there
was a definite lack of information regarding the standardized IOStreams. The situation
has not radically changed since then. Even now, at the time of this writing in 1999, little
has been published about the standardized IOStreams, and even less about C++ locales.
The few books that exist about IOStreams are out of date; they all cover the classic, pre-
standard IOStreams. The C++ textbooks provide introductory information about
IOStreams but rarely anything about locales. For this reason, we felt the need for a book
exclusively devoted to these topics that begins where the tutorials leave off.
TARGET AUDIENCE
This book is a programmer’s guide to the standard IOStreams and locales, together with a
complete reference of all relevant classes, functions, templates, headers, etc. It is neither a
tutorial nor a textbook. It does not aim to teach the reader C++ or the basics of IOStreams.
We expect of the readers that they know, at least roughly, what happens when they type a
line of C++ code such as
Hence this book is not for absolute beginners, but rather for C++ programmers who have
been studying a C++ textbook, or have comparable practical experience, and who intend
to use IOStreams and locales in more than a casual way.
As locales are an abstraction that is new to C++, as opposed to IOStreams, which has
been around for more than a decade, we cover locales from the ground up. Some knowledge
of locales in C will aid understanding, but it is not required. We do not aim to cover
internationalization in a comprehensive way. Internationalization is too broad a topic,
and an adequate discussion of it would fill another whole book. However, IOStreams and
locales are closely related, and for this reason the book explains the concept of C++
locales, with emphasis on usage of locales in conjunction with IOStreams.
Regarding IOStreams, we acknowledge the fact that the classic IOStreams library
has been in existence since the early days of C++. We assume that readers are familiar
with the basic features as they are explained in every C++ textbook. Instead of repeating
the basics, we aim to go beyond that introductory level. For instance, we demonstrate
advanced features—such as user-defined shift operators and manipulators, extending
streams by use of iword/pword, and derivation of new stream and stream buffer
classes—as well as less ambitious topics like format control and error handling.
Overall, the goal of this book is to provide as much information about the general
principles as is needed to enable readers to accomplish their concrete programming tasks
using IOStreams and locales. The focus is on the underlying concepts and the more
advanced programming techniques that IOStreams and locales support, rather than on
the details of each and every interface. For this reason we refrain from presenting extensive
and lengthy case studies and code examples. IOStreams and locales are general-purpose
tools and can be used to solve a sheer abundance of problems. It would have been
impossible to find a representative and comprehensive set of case studies. Instead, we
concentrate on a few condensed examples that we use to explain programming tech-
niques and concepts, and we deliberately refrain from blowing them up to full applica-
tions in order to avoid unnecessary distractions. We trust that readers will be astute
enough to figure out concrete applications once the principles are clear.
We have received a considerable number of queries such as “Why is this and that so and
so?” seeking an explanation of why IOStreams and locales are designed the way they are.
Where we know of an underlying rationale, we explain it. Yet there are inconsistencies
and “interesting” design decisions that can be explained only by “historical reasons” or
“design by committee.” Where we feel that certain features introduce potential pitfalls,
we point them out, so that the reader can avoid them. Beyond that, we neither aim to
defend the standard nor intend to discuss alternative designs. We describe it as it is.
ACKNOWLEDGMENTS
Writing this book took us more than three years, and during this long period many people
helped us to endure and finish the task. As with any book, the authors are only part of the
story, and we would like to thank all those people who contributed in one way or another.
At Addison-Wesley, we would like to thank Mike Hendrickson, Deborah Lafferty,
and in particular Marina Lang; they believed in the value of this book and accompanied
us through the entire process from proposal to print. Our thanks also to Beth Burleigh
Fuller and John Fuller for their support during the production process of the book.
We would like to thank all those knowledgeable and patient people at the standards
committee and elsewhere who answered our countless and sometimes stupid questions:
Nathan Myers, who invented the C++ locales and proposed them to the standards com-
mittee, told us everything about locales and helped us understand his proposal as well as
any resulting discussions. Jerry Schwarz, who is the “father of IOStreams,” that is, the
author of the first version of IOStreams (or “streams” as they were called in C++ 1.0), gave
us invaluable insights into the intent of many of the IOStreams features, and we thank him
for his patience and support. Bill Plauger, author of the Microsoft version of the C++ stan-
dard library, helped us distinguish between bugs in the implementation and misunder-
standings on our side; he was also invaluable in helping us understand and interpret the
standard correctly. Philippe LeMouel, a former colleague at Rogue Wave Software, implemented
IOStreams there and explained his implementation to us. So did Joe Delaney, who
worked on Rogue Wave’s implementation of locales. Dietmar Kühl worked on the implementation
of IOStreams and locales for the GNU compiler and answered numerous questions.
John Spicer of Edison Design Group and Erwin Unruh of Siemens answered questions
regarding templates and other language features. Beman Dawes, who maintains the library
issue list for the standards committee, helped clarify countless open issues in the standard.
Thanks also to our reviewers, some of whom spent a considerable amount of effort
and time compiling thorough and helpful comments: (in alphabetical order) Chuck Allison,
Stephen Clamage, Mary Dageforde, Amelia Lewis, Stan Lippman, Dietmar Kühl, Werner
Mossner, and Patrick Thompson, as well as others who preferred to stay anonymous.
—Angelika Langer
Klaus Kreft
I would like to thank Bernd Eggink, author of a book about the classic IOStreams written
in German. Our email correspondence about IOStreams spawned the idea of a joint book
project on the standard IOStreams. The original idea had been to translate his book into
English and upgrade to the standard, but his sudden, serious illness thwarted our plans.
I would like to thank Thomas Keffer, the founder of Rogue Wave Software, for coming
up with the idea of writing a book of my own and for supporting and encouraging me
ever since. I had been working at his company when he suggested the book project. I was
a German alien working at a U.S. corporation when he proposed that I write a book about
internationalization in C++. I would like to thank Roland Hartinger, my former supervi-
sor and head of the C++ compiler construction group at Siemens Nixdorf, who threw me
into the library project and encouraged me to join the standards committee.
Last, but not least, I thank Klaus Kreft for joining me on this project. Without him
this book would have never been finished, and not much is worth doing without him.
—Angelika Langer
First of all, I would like to thank my parents for recognizing and fostering my interest and
talent in mathematics and natural sciences. Without their support I would not be who I am.
Next, I thank two individuals whom I met through my professional work who were
true sources of inspiration and insight: Gerhard Draxler, whom I miss tremendously since
he retired from his professional life, and Werner Mossner, with whom I had my first con-
tact with IOStreams. Together we implemented a logging facility by derivation from the
IOStreams classes.
Finally, I thank Angelika Langer, who is a constant source of ideas and an over-
whelmingly persistent worker. Without her effort this book would not have been finished,
yet the significance of her contribution is nothing compared to what she means to me in
my private life.
—Klaus Kreft
CONTACT INFORMATION
If you would like to provide feedback on the book, report errors, ask questions, or if you
just want to share your thoughts on IOStreams and locales with us, feel free to e-mail us at:
[email protected].
GUIDE FOR READERS
This book on internationalization and stream input and output components from the C++
standard library consists of the following parts:
Users’ Guide:
Part I: Stream Input and Output
Part II: Internationalization
Reference Guide
Appendices
Both the users’ guide and the reference guide are organized in a way that allows
lookup of information as needed. The users’ guide is organized around topics, architec-
tural concepts, and certain types of usage rather than discussing each class and interface
one by one. For instance, in the users’ guide you find sections such as Error Indication in
IOStreams and Creation of Locale Objects. The reference guide, on the other hand, is orga-
nized in terms of classes and interfaces, with an entry for every header, class, function,
and so forth.
The reference guide is designed to allow lookup of function signatures and class inter-
faces. It is divided into five major sections:
1. locale
2. character traits
3. IOStreams
4. stream iterators
CLASS DIAGRAMS
As notation for class diagrams we use the standard object modeling language UML. For
those who are not familiar with the UML notation, here is a brief overview of the elements
we use in this book.
[Figure: UML notation elements used in this book]
Class inheritance: example class ios_base
Parameterized class: basic_ios, with template parameters charT:class, traits:class
Bound element: ctype<wchar_t>
Aggregation: basic_ios and basic_streambuf (each parameterized on charT:class, traits:class)
Composition: ios_base and locale
CODE EXAMPLES
Please note that the source code examples in this book might not compile in all develop-
ment environments. At the time of this writing, scarcely any C++ compiler understands
the full range of language features defined by the C++ standard. Neither does any com-
piler come with a complete standard-compliant C++ library.
We tried to compile the code examples in this book using Microsoft’s MSVC 6.0 compiler.
As expected, some of our sample programs could not be compiled and tested successfully
in this development environment. Despite this discouraging situation, we made
our best effort to show you standard C++ programs as they are supposed to work under a
standard-compliant C++ compiler.
While this is less than ideal, and we really wish that we could have verified all of our
code examples, we still decided to include examples. The danger is that the sample code
might not compile in your environment, because your compiler, like ours, does not comply
with the standard yet. Another hazard is that minor mistakes might have sneaked into the
code and stayed undetected. The safe alternative would have been to omit certain tech-
niques entirely, because today they do not compile. However, we wanted to demonstrate
the full range of techniques that standard C++ will support. Hence, this book is written with
an eye to the standard C++ development environments of the (hopefully near) future.
Compilers and libraries will catch up, and techniques that do not work with your current
compiler will work once you have a truly standard-compliant C++ compiler available.
We feel that including these techniques contributes to the usefulness of this book.
All code examples are simplified. We did this to make the examples as concise and
focused as possible, rather than endlessly repeating the same code fragments. Simplifica-
tions include:
• Include statements. The necessary #include <...> statements for standard
library header files are omitted. An overview of the header files can be found in
this book’s reference guide. You can find out which header files you have to
include by looking up the item (object, class, function, type, etc.) in the reference
guide; each entry has a section that mentions the required header file.
• Standard namespace. The entire standard C++ library resides inside the reserved
global namespace std. For the sake of readability we generally omit the scope
operator for the standard library namespace. Hence, instead of writing
we simply write
#include <iostream>
using namespace ::std;
int main()
{
    cout << "Hello world" << endl;
    return 0;
}
• Error handling. IOStreams operations indicate failure by setting certain flags in the
so-called stream state. Optionally, they can also throw exceptions in case of failure.
In principle, it is advisable to check failure of operations by checking the stream
state or catching exceptions. To keep examples in this book focused on the usage
and functionality of the IOStreams operation under discussion, we omit the error
handling in most examples. The exceptions are section 8, Input/Output of User-
Defined Types, and section 9, Manipulators; they include extended and compre-
hensive source code examples, which among other aspects also demonstrate the
proper use of error-checking strategies in IOStreams.
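The two error-checking strategies named in the last item, checking the stream state and catching exceptions, can be sketched roughly as follows. This is only a minimal illustration: the function names parse_int_checked and parse_int_throwing are hypothetical and do not come from the book, and a string stream stands in for whatever stream is actually in use.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Strategy 1: inspect the stream state after the operation.
// Returns true and stores the value in `out` if the extraction succeeded.
bool parse_int_checked(const std::string& text, int& out)
{
    std::istringstream in(text);
    in >> out;
    return !in.fail();   // fail() is true if failbit or badbit is set
}

// Strategy 2: ask the stream to throw on failure and catch the exception.
// Returns true on success, false if ios_base::failure was thrown.
bool parse_int_throwing(const std::string& text, int& out)
{
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    try {
        in >> out;
        return true;
    } catch (const std::ios_base::failure&) {
        return false;
    }
}
```

Both functions report the same outcome; they differ only in whether the failure is detected by polling the state or delivered as an exception.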
TERMINOLOGY
The standard C++ library consists mostly of class and function templates. In this book we
use the following notations and abbreviations for class templates:
• Fully qualified class template names. An example is: template <class charT,
class Traits> class basic_fstream
• A short form of class template name. The short form corresponding to the example
above would be: basic_fstream <class charT, class Traits>
In addition to abbreviations for class templates, you will find certain contrived tech-
nical terms. An example is file stream. It stands for the abstract notion of the file stream class
template basic_fstream <class charT, class Traits> and its instantiations.
Also, we omit the class scope if it is unmistakable. An example would be badbit
standing for ios_base::badbit. Another example is facet, which stands for class
locale::facet, a class type nested into class locale.
The term facet is a hybrid one. It can also be used as a technical term designating all
of the following:
• a class template, e.g., template <class charT> class ctype
• classes and class templates derived from such a class template, e.g.,
ctype_byname<class charT>
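To make the term facet a little more concrete, here is a small sketch that retrieves the ctype<char> facet from a locale and uses it. The helper name upper_in_classic_locale is hypothetical; has_facet, use_facet, and locale::classic() are the standard library functions of those names.

```cpp
#include <locale>

// Converts a character to uppercase using the ctype<char> facet
// of the classic "C" locale.
char upper_in_classic_locale(char c)
{
    const std::locale& loc = std::locale::classic();
    if (!std::has_facet<std::ctype<char> >(loc))
        return c;   // facet not present: leave the character unchanged
    return std::use_facet<std::ctype<char> >(loc).toupper(c);
}
```

Here ctype<char> is a facet in the sense of a class derived from locale::facet, and the locale object is the container from which it is retrieved.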
IMPLEMENTATION-SPECIFIC FEATURES
In many places throughout the book the term implementation-specific is used. It is a techni-
cal term from the ISO/ANSI standardization, which denotes features that are not stan-
dardized, but may differ between implementations of the standard C++ library. Each
library implementor has to provide a list of these properties specific to his or her imple-
mentation of the standard C++ library.
An example of an implementation-specific feature is the type streamsize that is
used in IOStreams. It is an implementation-specific integral type. This means that it can be
an int in one implementation of the standard library and a long in another standard
library. If you aim for portability of your programs, you should never rely on implemen-
tation-specific features. If in this example you take advantage of the knowledge that a
streamsize object is an int in a particular implementation of the standard library, then
your code will break once you port it to a standard library where a streamsize object is
a long.
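A portable way to handle such a type is to keep values in streamsize itself and never narrow them to int or long. The following sketch shows one way to do that; the function names read_some and read_prefix_length are hypothetical and serve only as illustration.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Reads up to `max_chars` characters and reports how many were read.
// The count stays in std::streamsize throughout, so the code does not
// depend on which integral type an implementation chooses for it.
std::streamsize read_some(std::istream& in, char* buffer, std::streamsize max_chars)
{
    in.read(buffer, max_chars);
    return in.gcount();   // gcount() returns std::streamsize
}

// Example use with an in-memory stream: reads at most 8 characters.
std::streamsize read_prefix_length(const std::string& text)
{
    std::istringstream in(text);
    char buffer[8];
    return read_some(in, buffer, static_cast<std::streamsize>(sizeof buffer));
}
```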
PART I
INTRODUCTION
This part of the book explains IOStreams, the stream input and output component in the
standard C++ library. In its chapters, we will repeatedly refer to part II of this book, which
explains locales and facets, the means for internationalization in the standard C++ library.
The reason for these references is that IOStreams have been internationalized using stan-
dard C++ locales and facets. We suggest either reading part II first or being prepared to
look up the topics referenced in part II as needed.
stream buffer iterators. This chapter provides information that is not used
every day and is meant as preparation for chapter 3.
IOStreams Basics
• Record or block I/O. Alternatively, a certain structure might be imposed on the transferred
data, such as a record, block, or message structure. In that case, larger
chunks of data, i.e., records, blocks, or messages, are transported. These chunks
may also contain information additional to the actual data. For instance, if you
read from an ISAM¹ file, you receive a data record that consists of the actual data
plus a record identification. In such cases of structured I/O we talk of record or
block I/O. The main difference from stream I/O is that input and output are struc-
tured and additional information is transported along with the actual data.
The standard C++ IOStreams, as the name already implies, supports stream I/O.
This does not mean, however, that the actual external device may not have any structure,
only that the concept of IOStreams is that of stream I/O, and that the specifics of the
actual external device are hidden behind the IOStreams interfaces. For a user of
IOStreams, input and output to and from an external device are streams of characters.
Note that they are streams of characters, as opposed to streams of bits or bytes. This is
because IOStreams facilitates text I/O.
So far we have seen how data are transported by IOStreams. We have not yet consid-
ered what kind of data are transported. What is the content of the transferred data? How is
it represented?
In order to discuss the data representation in IOStreams, let us get back to the defin-
ition of input and output given above: Input and output are the transfer of data between a
program and any kind of external device. The representation of data in a program and on
an external device may differ. We distinguish between an internal and an external
representation.
The internal representation of data is of a form that is convenient for data processing
in a program. Common examples are the binary format of integral numbers, the IEEE rep-
resentation of floating-point numbers, or the ASCII or Unicode encoding of a string.
The external representation varies depending on the type of device and the intended
use of the data. Here are some examples:
1. ISAM stands for Index Sequential Access Method. ISAM files are record oriented, i.e., they are not streams.
[Figure: formatting converts the internal representation to the external one (0x0000009e → 158, 0xfffffefd → -259); parsing is the reverse direction.]
[Figure: external representation as multibyte characters versus wide characters.]
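The conversion between the internal binary value of an integer and its external character sequence can be sketched with a string stream. The function names format_int and parse_int are hypothetical, and the sketch assumes the default "C" locale, in which no digit grouping is applied.

```cpp
#include <sstream>
#include <string>

// Formatting: internal binary int -> external character sequence.
std::string format_int(int value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}

// Parsing: external character sequence -> internal binary int.
// Returns 0 if the text does not start with a number.
int parse_int(const std::string& text)
{
    std::istringstream in(text);
    int value = 0;
    in >> value;
    return value;
}
```

For instance, format_int(0x9e) yields the three characters 158, and parse_int("158") yields the same binary value again.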
Transport involves access to the external device for reading and writing data. It man-
ages the physical transfer of character sequences to the device after formatting, buffering,
and code conversion, as well as extracting data from the device and making the received
data available as a sequence of characters for subsequent code conversion, buffering, and
parsing.
[Figure: the IOStreams layers: program, formatting layer, transport layer, external device.]
• Field width for output. IOStreams is capable of inserting fill characters in order to
adjust fields in the output.
• Precision and notation of floating-point numbers. You might want to control how
many digits should be printed as fractional parts of a floating-point value.
• Hexadecimal, octal, or decimal representation of integers. IOStreams can produce
and recognize integral values to various bases.
• Adapting of number formatting to local conventions. A number like 1,000,000
in the United States is represented as 1.000.000 in other countries. IOStreams has
the ability to adjust its formatting and parsing of numerical values to local
conventions.
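The first three of these format controls can be sketched with standard manipulators. The helper names below (padded, fixed_point, to_hex, from_hex) are hypothetical; the manipulators setw, setfill, setprecision, fixed, and hex are the standard ones. Locale-dependent formatting is left out because the names of available locales differ between platforms.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Field width with a fill character: the value is padded on the left.
std::string padded(int value, int width, char fill)
{
    std::ostringstream out;
    out << std::setfill(fill) << std::setw(width) << value;
    return out.str();
}

// Fixed notation with a given number of fractional digits.
std::string fixed_point(double value, int digits)
{
    std::ostringstream out;
    out << std::fixed << std::setprecision(digits) << value;
    return out.str();
}

// Hexadecimal output of an integer.
std::string to_hex(int value)
{
    std::ostringstream out;
    out << std::hex << value;
    return out.str();
}

// Hexadecimal input of an integer; returns 0 if parsing fails.
int from_hex(const std::string& text)
{
    std::istringstream in(text);
    int value = 0;
    in >> std::hex >> value;
    return value;
}
```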
The transport layer’s main task is transporting character sequences to and from an
external device. It encapsulates all the necessary knowledge about the properties of a spe-
cific external device. This knowledge can include several aspects:
• Access to the external device. Typically, a connection must be established before
characters can be transported to and from an external device. For instance, the
transport layer knows how to open and close files.
• Buffering. Transport of data to and from an external device might be most efficient
when done in blocks of a certain size. The transport layer knows how to do block-
wise output to files through system calls. In such a case the actual output is
delayed, and character sequences received from the formatting layer are buffered
until the block size is reached. Conversely, input from the external device is
received in larger chunks and made available to the formatting layer, which
requests smaller character sequences.
• Code conversion. If the character representation on the external device differs
from the character representation used by the formatting layer, the transport layer
performs the necessary code conversion. For instance, the transport layer knows
how to convert wide-character codes to multibyte encodings.
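The division of labor between the two layers can be made visible by creating the stream buffer separately from the stream. In this sketch a stringbuf plays the role of the transport layer and an ostream on top of it plays the role of the formatting layer; the function names format_via_buffer and write_raw are hypothetical.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// The ostream does the formatting; the stringbuf does the transport.
std::string format_via_buffer(int value)
{
    std::stringbuf buffer;       // transport layer: stores characters in memory
    std::ostream out(&buffer);   // formatting layer on top of that buffer
    out << value;                // int -> characters, handed to the buffer
    return buffer.str();
}

// Characters can also be handed to the transport layer directly,
// bypassing formatting altogether.
std::string write_raw(const std::string& text)
{
    std::stringbuf buffer;
    buffer.sputn(text.data(), static_cast<std::streamsize>(text.size()));
    return buffer.str();
}
```

A file buffer could be substituted for the string buffer without changing the formatting code, which is the point of keeping the two layers apart.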
FILE STREAMS AND STRING STREAMS. Files and strings are two categories of external
devices. File I/O involves the transfer of data to and from an external device that con-
forms to the file abstraction. The device need not necessarily be a file in the usual sense of
the word. It could just as well be a communication channel like sockets or pipes, or
another device that exhibits a file-like behavior. In contrast, neither string I/O nor in-
memory I/O involves an external device. The source and destination of in-memory I/O
is a memory location in your program’s storage space and can be retrieved in the form of
a C++ string.
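Because file streams and string streams share the same output interface, code written against that interface works with either kind of device. The following sketch illustrates this; the function names and the greeting text are hypothetical.

```cpp
#include <fstream>
#include <ostream>
#include <sstream>
#include <string>

// Works with any output stream, whatever device is behind it.
void write_greeting(std::ostream& out, const std::string& name)
{
    out << "Hello, " << name << '\n';
}

// String stream: the destination is memory, retrieved as a C++ string.
std::string greeting_as_string(const std::string& name)
{
    std::ostringstream out;
    write_greeting(out, name);
    return out.str();
}

// File stream: the same function, now writing to a file.
// Returns false if the file could not be opened or written.
bool greeting_to_file(const std::string& path, const std::string& name)
{
    std::ofstream out(path.c_str());
    if (!out)
        return false;
    write_greeting(out, name);
    return out.good();
}
```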
NARROW- AND WIDE-CHARACTER STREAMS. These two types of streams differ in the type
of character sequence passed between the formatting and transport layers. In narrow-
character streams the character sequences produced by the formatting layer and con-
sumed by the transport layer are sequences of characters of type char, which is the
built-in C++ character type. For instance, a unit of type char can hold a character literal
like x, U, or \n; arrays of such narrow characters can hold string literals like Hello
world\n. In wide-character streams the character sequences passed between the format-
ting and transport layers are arrays of units of type wchar_t, which is a type defined in
C++ for storage of wide characters. For instance, a unit of type wchar_t can hold a char-
acter literal like L'A' or L' ® '; arrays of such wide characters can hold string literals like
L"Hello world\n" or L"LEF". Note that in C++, wide-character literals are prefixed
with an uppercase L to distinguish them from narrow-character literals.
The concepts of external devices on the one hand and the character type used for
streaming on the other hand are orthogonal, i.e., there are narrow and wide file streams as
well as narrow and wide string streams. The concept of external devices is implemented
by means of inheritance; the variation on the character type of a stream is achieved via
templates. The next section gives an overview of classes and templates in IOStreams.
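The orthogonality of device and character type can be seen with the two string stream flavors. The following is a minimal sketch; the helper names narrow_greeting and wide_greeting are hypothetical and exist only for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats text through a narrow-character string stream.
std::string narrow_greeting() {
    std::ostringstream out;            // instantiation for char
    out << "Hello world" << '\n';
    return out.str();
}

// Hypothetical helper: the same operation through a wide-character string stream.
std::wstring wide_greeting() {
    std::wostringstream out;           // instantiation for wchar_t
    out << L"Hello world" << L'\n';
    return out.str();
}
```

Both functions use the same formatting operations; only the character type of the stream and of the literals differs.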
CLASSES IN IOSTREAMS
Here is a brief overview of the stream classes in IOStreams. There are two stream base classes
that encapsulate information and functionality common to all stream classes. Class
ios_base encapsulates all information that is independent of the character type handled
by a stream. Class basic_ios is a class template taking the character type as a template
argument.² It contains character-type dependent information common to all stream classes.
Then there are the general input and output stream classes that implement the concepts
of input, output, and bidirectional I/O. They provide the entire functionality for parsing
of input and formatting for output. However, they do not contain any information that is
specific to the external device associated with the stream.
2. For the sake of brevity and simplicity, we omit the more “exotic” template parameters of the IOStreams classes
in this introduction. Class basic_ios, like all class templates in the standard C++ library that take the character
type as a template argument, has a second template parameter called traits, which is associated with the first
template parameter. The traits type contains information about the character type, such as the end-of-file
value or the way characters are compared or copied. A detailed discussion of the template parameters of the
IOStreams classes is deferred until section 2.3, Character Types and Character Traits.
For each direction two derived classes implement the concepts of file and string I/O
respectively. The file stream classes support input and output to and from files. They add
functions for opening and closing files. The string stream classes support in-memory I/O,
that is, reading and writing to a string held in memory. These classes add functions for
getting and setting the string to be used as a buffer.
Figure 1-4 shows the class hierarchy of all stream classes and their base classes.
Almost all classes are template classes, parameterized on the character type and a
related traits type.³ Instances of these class templates are provided for the character types
char and wchar_t; they represent narrow- and wide-character streams. Additionally, for
convenience and for compatibility with the classic IOStreams, there are type definitions for
those instantiations. Here are some examples:
narrow-character file streams:
typedef basic_ifstream<char> ifstream;
typedef basic_ofstream<char> ofstream;
typedef basic_fstream<char> fstream;
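That these names are nothing more than type definitions can be checked at compile time. The sketch below assumes a compiler with C++11 support (static_assert and the type_traits header); the function typedefs_match is a hypothetical helper.

```cpp
#include <fstream>
#include <sstream>
#include <type_traits>

// The familiar stream names denote instantiations of the class templates.
static_assert(std::is_same<std::ifstream, std::basic_ifstream<char> >::value,
              "ifstream is basic_ifstream<char>");
static_assert(std::is_same<std::wofstream, std::basic_ofstream<wchar_t> >::value,
              "wofstream is basic_ofstream<wchar_t>");
static_assert(std::is_same<std::stringstream, std::basic_stringstream<char> >::value,
              "stringstream is basic_stringstream<char>");

// Hypothetical helper: the same comparison as a run-time value.
bool typedefs_match() {
    return std::is_same<std::fstream, std::basic_fstream<char> >::value;
}
```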
There are equivalent type definitions for string streams and all other stream classes
in IOStreams.
3. The traits template parameter is a type that contains information about the character type charT. See
section 2.3, Character Types and Character Traits, for details on the template parameters of IOStreams classes. All
IOStreams classes define a default for the traits template parameter; hence the traits argument can usually be
omitted.
[Figure 1-4: Class hierarchy of the stream classes and their base classes. The general stream
classes include basic_istream and basic_iostream (template parameters charT and traits); the
concrete stream classes include basic_istringstream, basic_stringstream, and
basic_ostringstream (template parameters charT, traits, and Allocator).]
IOSTREAMS AS A FRAMEWORK
IOStreams is not only a set of ready-to-use classes, like fstream or stringstream. You
can also think of the Standard IOStreams as a framework that is intended to be customized
and extended. Just to give you an idea of the power of this framework, here is a
list of means for extending the Standard IOStreams. We will explore them in greater detail
later in this book (in section 3, Advanced IOStreams Usage).
• You can add input and output operations for user-defined types. IOStreams
already provides input and output operations for all built-in types and many types
in the Standard C++ Library. Still, you can add operations for types you defined
for your application. We discuss techniques for building such I/O operations in
section 3.1, Input and Output of User-Defined Types.
• You can add new concepts for transport, i.e., new categories of external devices,
such as communication channels in a network or display fields in a graphical user
interface. IOStreams supports I/O to files and strings. Other concepts may be
added. We discuss this in section 3.4, Adding Stream Buffer Functionality.
• The IOStreams classes are templatized on the character type. IOStreams already
facilitates narrow- and wide-character streams by providing instantiations for the
built-in character types char and wchar_t. However, you can instantiate the
standard IOStreams classes for user-defined character types. An example of a
user-defined type could be a type Jchar for Japanese characters that contains
additional information about each character. More information about user-defined
character types in IOStreams can be found in section 2.3.3, Character Types.
• Localization and code conversion are factored out into separate abstractions called
locales.⁴ Locale objects are attached to streams and used by IOStreams’ input and
output operations for adapting their behavior to cultural conventions. Locales can
be replaced, and as a result the I/O operations exhibit an adjusted behavior. Also,
locales can be extended by adding new categories of information about cultural
dependencies. An example of such an addition would be information about time
zones or rules for address formats. You can attach extended locales to streams, and
you can add I/O operations that use this additional culture-dependent information
to adapt their behavior. For instance, you can add operations for culture-dependent
formatting of an address data type. Techniques for extending locales
and IOStreams in this way are explained in part II.
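As a first impression of the first extension point in the list above, here is a minimal sketch of an inserter for a user-defined type. The type Point and the helper point_text are hypothetical; the techniques proper (including field width handling and error reporting) are the subject of section 3.1 and are not reproduced here.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user-defined type, for illustration only.
struct Point {
    int x;
    int y;
};

// Inserter for Point; written as a template so that it works for
// narrow- and wide-character streams alike.
template <class charT, class traits>
std::basic_ostream<charT, traits>&
operator<<(std::basic_ostream<charT, traits>& os, const Point& p) {
    return os << os.widen('(') << p.x << os.widen(',') << p.y << os.widen(')');
}

// Hypothetical helper that formats a Point into a string.
std::string point_text(const Point& p) {
    std::ostringstream out;
    out << p;
    return out.str();
}
```

Because the inserter returns the stream it was invoked on, a Point can appear anywhere in a chain of insertions, just like a built-in type.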
Table 1-1: Predefined Global Streams with Their Associated C Standard Files
Like the C standard files, these streams are all associated by default with the
terminal.
The predefined streams have certain special behaviors. Details can be found in sec-
tion 1.8.2, Synchronizing the Predefined Standard Streams. Here are the most important
features:
• cin is “tied” to cout. This means that whenever input is requested from cin,
cout is first flushed, i.e., cout writes output to the external device before any
input operation on cin.
• Both cerr and clog are associated with stderr. The difference between clog
and cerr lies in their synchronization with the associated external device. Output
to cerr is written to the external device immediately after formatting, i.e., cerr's
buffer is flushed after each output operation. Flushing of clog's buffer either has
to be explicitly invoked or happens automatically, depending on the internal
buffering mechanism.
• cerr and clog are used for different purposes: cerr is preferred for error output
to the terminal; clog is typically used for error output that is redirected to a file.
• The wide-character counterparts wclog and wcerr have the same buffering
characteristics.
• The predefined global streams are by default synchronized with their associated C
standard files. This means, for instance, that you can write output to the same C
standard file, say stdout, via C stdio functions like printf() and via IOStreams
output operations to cout. Although you mix output operations from the C and
the C++ library, the output is not garbled, but appears in the order in which the
respective instructions were executed.
5. Like all other abstractions from the IOStreams, the global streams reside in the namespace std.
For an insertion such as cout << "result: " << x, a possible output is (under the assumption
that x is a variable of an integral type and contains the value 10):
result: 10
Input is done through the other shift operator operator>>(), often referred to as the
extractor. It reads text from an input stream. Here is an example of how this is done:
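A minimal sketch of such an extraction follows. It is written against std::istream so that cin can be passed as well; the names read_pair and parses_as_pair are hypothetical, and the string stream in the second function merely stands in for terminal input.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: extracts an int and a double from a narrow input stream.
// Returns true if both values could be parsed.
bool read_pair(std::istream& in, int& x, double& y) {
    in >> x >> y;          // extractors skip leading whitespace by default
    return !in.fail();
}

// Hypothetical helper: applies read_pair to a piece of text.
bool parses_as_pair(const std::string& text) {
    std::istringstream in(text);
    int x = 0;
    double y = 0.0;
    return read_pair(in, x, y);
}
```

With cin, the call would read read_pair(cin, x, y).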
6. All examples in this section are restricted to the use of predefined global streams. This is because we defer the
discussion of creating other stream objects to subsequent sections: section 1.4, File Input/Output, and section
1.5, In-Memory Input/Output.
7. Note that code examples in this book are simplified: Both the necessary #include statements and the using
statement for the standard library namespace ::std are omitted. See the Guide for Readers for more
information.
The two variables x and y are filled with valid values of their respective type, if the
input extracted from the stream can be parsed according to the rules for those types. For
instance, if x is a variable of type int and the input is a sequence of digits, the variable x
is filled with the integral value that is equivalent to the extracted sequence of digits.
to
The latter is more compact, because the shift operators permit the printing of several
units in one expression. Source code like the line above is also more readable, because one
line of output inserted to the stream corresponds to one line of source code in the
program.
Among operators that can be overloaded in C++ the operators << and >> were cho-
sen because their precedence is low enough to allow arithmetic expressions as operands
without using parentheses. You can, for example, write
without a need for parentheses around the arithmetic expression a*b+c. Still, parenthe-
ses must be used to write expressions containing operators of lower precedence, such as
bit operations, for instance:
(cout.operator<<("result: ")).operator<<(x);
The concise notation is possible thanks to the use of operators as input/output operations
instead of functions. Another prerequisite for the concatenation of the shift operators
is that each operator return a reference to the respective stream. In the example above, the
subexpression cout << "result: " is equivalent to cout.operator<<("result: "),
i.e., it invokes the inserter for C strings. This inserter, operator<<(const char*),
returns a reference to the stream it was invoked on, in this case cout. Hence, the
subexpression cout << "result: " evaluates to a reference to cout. Consequently, the whole
expression cout << "result: " << x is equivalent to cout << x after evaluation of the
first subexpression. The remaining subexpression cout << x again results in a call to an
inserter, which also returns a reference to cout. Thanks to the convention that all shift
operators return a reference to the stream object they were invoked on, concatenation of
I/O operations is possible.
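The precedence rules and the chaining convention described above can be sketched as follows. The function names are hypothetical. One assumption deserves mention: in the standard library as finally standardized, the inserter for C strings is a non-member function, so the explicit call form used in the sketch is operator<<(stream, "result: ") followed by the member call for the integer.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Arithmetic operators bind more tightly than <<, so no parentheses are needed.
std::string show_arithmetic(int a, int b, int c) {
    std::ostringstream out;
    out << "result: " << a * b + c;
    return out.str();
}

// Bitwise operators bind less tightly than <<, so parentheses are required.
std::string show_bit_and(int a, int b) {
    std::ostringstream out;
    out << "result: " << (a & b);
    return out.str();
}

// The chained expression out << "result: " << x written as explicit calls:
// the first call returns a reference to the stream, on which the second
// inserter is then invoked.
std::string show_explicit_calls(int x) {
    std::ostringstream out;
    std::operator<<(out, "result: ").operator<<(x);
    return out.str();
}
```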
There are inserters and extractors for bool, char, int, long, float, double, C strings,
C++ strings, complex numbers, etc. You can also add shift operators for user-defined
types (see section 3, Advanced IOStreams Usage, for details). When you insert or extract a
value to or from a stream, the C++ function overload resolution chooses the appropriate
operator, based on the value’s type. This makes C++ IOStreams type-safe and superior to
C stdio, where you can produce unpredictable results by mismatching format specifier
and value type, as in
FORMAT CONTROL
Simple input and output of data as shown in the examples above are useful, yet insufficient
in many cases. For example, other ways of formatting output or parsing input may
be needed. IOStreams allows control over many features of its input and output operators,
for example:
• the width of an output field and the adjustment of the output within this field
• the precision and format of floating-point numbers, and whether or not the decimal
point should always be included
• whether you want to skip whitespace when reading from an input stream
• whether integral output values are displayed in decimal, octal, or hexadecimal
format
Contained in each stream are a number of format parameters that control such details of for-
matting and parsing. The following sections discuss format parameters; format flags,
which are predefined values to be stored in a format parameter variable; and manipula-
tors, which provide convenient access to the format parameters.
8. Implementation-specific means that a feature is not standardized but may differ between implementations of the
standard C++ library.
The remaining columns, Effect and Default, show the purpose and effect of the for-
mat parameter and list the default value, which is used if you do not explicitly set the flag.
Here are some examples showing how to use the access functions:
cout.width(10);
cout.fill('.');
You can set the boolalpha flag, which controls alphabetic representation of bool
values, by calling
cout.setf(ios_base::boolalpha);
You can retrieve the current setting of the boolalpha flag by saying
You can remove the boolalpha flag from the current format flag setting by saying
cout.unsetf(ios_base::boolalpha);
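The access functions shown above can be tried out on a string stream, which makes the effect easy to inspect. This is a minimal sketch with hypothetical helper names; querying a flag by masking the result of flags() is one common way to retrieve its current setting.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Field width and fill character affect the next insertion.
std::string padded_number() {
    std::ostringstream out;
    out.width(10);      // field width for the next insertion
    out.fill('.');      // padding character
    out << 42;
    return out.str();
}

// Retrieves the current setting of the boolalpha flag.
bool boolalpha_is_set(const std::ios_base& s) {
    return (s.flags() & std::ios_base::boolalpha) == std::ios_base::boolalpha;
}

// Sets and removes the flag and reports whether both calls took effect.
bool boolalpha_toggles() {
    std::ostringstream out;
    out.setf(std::ios_base::boolalpha);
    const bool was_set = boolalpha_is_set(out);
    out.unsetf(std::ios_base::boolalpha);
    return was_set && !boolalpha_is_set(out);
}

// Formats a bool with or without the alphabetic representation.
std::string bool_text(bool value, bool alpha) {
    std::ostringstream out;
    if (alpha)
        out.setf(std::ios_base::boolalpha);
    out << value;
    return out.str();
}
```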
BIT GROUPS
Some format flags are mutually exclusive; for example, output within an output field can
be adjusted to the left or to the right, or to an internally specified adjustment. Only one of
the corresponding three format flags (left, right, or internal) can be set.⁹ If you
want to set one of these bits, you need to clear the other two bits. To make this easier, there
are bit groups defined whose main function is to reset all bits in one group. A bit group is
the combination of all valid flags that mutually exclude each other. For instance, the bit
group for adjustment in an output field is adjustfield; it is defined as left | right |
internal. The operation bitfield &= ~bitgroup; clears all flags of that group in the bit field.
Hence the operation bitfield = (bitfield & ~bitgroup) | (flag & bitgroup); clears
all flags of the group and then sets one particular flag only. Those bit operations that are
necessary for setting mutually exclusive flags are encapsulated into the setf() function. You do not
have to perform them manually. Instead, there is an overloaded version of the setf()
function. We have seen the one-argument version of setf() before. In addition to the
format flag, the overloaded version takes the corresponding bit group as an argument. It
clears all format flags belonging to the bit group and sets the specified format flag,
basically performing the bit operations described above.
Here is an example: You can set the right adjustment flag and clear all other flags
in that bit group by calling
cout.setf(ios_base::right, ios_base::adjustfield);
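The bit operations behind the two-argument setf() can be written out and compared with the library call. The following is a sketch; set_in_group and matches_setf are hypothetical names.

```cpp
#include <ios>
#include <sstream>

// The effect of setf(flag, bitgroup), spelled out with explicit bit operations:
// clear every flag of the group, then set the requested one.
std::ios_base::fmtflags set_in_group(std::ios_base::fmtflags bitfield,
                                     std::ios_base::fmtflags flag,
                                     std::ios_base::fmtflags bitgroup) {
    return (bitfield & ~bitgroup) | (flag & bitgroup);
}

// Checks that the manual computation agrees with the two-argument setf().
bool matches_setf() {
    std::ostringstream out;
    out.setf(std::ios_base::left, std::ios_base::adjustfield);

    const std::ios_base::fmtflags expected =
        set_in_group(out.flags(), std::ios_base::right, std::ios_base::adjustfield);

    out.setf(std::ios_base::right, std::ios_base::adjustfield);

    // Exactly one flag of the group is set afterwards, namely right.
    return out.flags() == expected &&
           (out.flags() & std::ios_base::adjustfield) == std::ios_base::right;
}
```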
9. IOStreams does not prevent you from setting other, invalid combinations of these flags, however. Use of ille-
gal combinations results in undefined behavior of the program.
10. For details on how the format flags affect input and output operations, look up ios_base in the reference
section and in appendices A and B.
The Group column lists the name of the group for flags that are mutually exclusive.
The groups are defined in class ios_base too.
The second column, Format flag, lists the flag names. All values are defined in class
ios_base. The class scope is omitted, e.g., showpos stands for ios_base::showpos.
The third column, Effect, gives a brief description of the effect of setting the flag.
The last column, Default, lists the setting that is used if you do not explicitly set the
flag.
11. The adjustfield does not have a default value, but if none of the flags right, left, or internal is set, all
predefined inserters behave as though the adjustfield were set to right.
12. If none of the floatfield flags is set, the formatting depends on the value that is to be formatted: Scientific
notation is produced if the exponent is less than -4 or greater than or equal to the precision; otherwise, the result
is in fixed-point notation. Details regarding the formatting of floating-point values can be found in appendix B.
13. For the predefined standard stream cerr, the unitbuf format flag is set by default. For other streams, it is
not set.
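ios_base::fmtflags original_flags = cout.flags();   // remember the current flags
cout << 812 << '|';
cout.setf(ios_base::left, ios_base::adjustfield);   // pad after the output
cout.width(10);
cout << 813 << 815 << '\n';
cout.unsetf(ios_base::adjustfield);
streamsize original_precision = cout.precision(2);
cout.setf(ios_base::uppercase|ios_base::scientific);
cout << 831.0 << ' ' << 8e2;
cout.flags(original_flags);                         // restore flags and precision
cout.precision(original_precision);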
First we retrieve the current format flag setting via the stream’s flags () function,
in order to restore the original setting later on.
After output of 812 and '|' we modify the setting: We set the adjustment to left,
which means that padding characters will be inserted after the actual output. Also, we set
the field width from its default 0 to 10. (A field width of 0 means that no padding charac-
ters are inserted, and this is the default behavior of all insertions.) The effect of these set-
tings can be seen in the resulting output:
812|813       815
No padding characters are inserted after 812 and '|', because the field width was 0
at the time of this output. 813 is written after the modification of adjustment and field
width. Hence, padding characters are inserted after 813 so that the field width of 10 is
reached.
Then we clear the adjustment flags and change the precision for floating-point values
from its default 6 to 2. Also, we set some format flags that affect the formatting of
floating-point values (ios_base::uppercase and ios_base::scientific). The
resulting output is
8.31E+02 8.00E+02
Finally we restore the original flags by calling the flags() function again.
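The sequence of settings described above can be replayed against a string stream, which allows the two output lines to be compared exactly. This is a sketch under the assumption that the statements follow the order given in the description; format_demo is a hypothetical name.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Replays the format settings from the walkthrough on a string stream.
std::string format_demo() {
    std::ostringstream out;
    const std::ios_base::fmtflags original_flags = out.flags();

    out << 812 << '|';                                   // field width is still 0
    out.setf(std::ios_base::left, std::ios_base::adjustfield);
    out.width(10);                                       // applies to 813 only
    out << 813 << 815 << '\n';

    out.unsetf(std::ios_base::adjustfield);
    const std::streamsize original_precision = out.precision(2);
    out.setf(std::ios_base::uppercase | std::ios_base::scientific);
    out << 831.0 << ' ' << 8e2;

    out.flags(original_flags);                           // restore the settings
    out.precision(original_precision);
    return out.str();
}
```

The field width of 10 pads only 813; it is reset to 0 before 815 is inserted.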
// input
int i; char s[11];
cin.width(10);
cin >> i >> s;
If the input is
12345 abcdefghijklmnopqrstuvwxyz
the program extracts the 5 digits and places the value 12345 into the integer variable i; it
then skips the separating whitespace character; and it subsequently extracts another 10
characters, namely abcdefghij, and places them into the character array s followed by
the end-of-string character '\0'. Hence the result would be
i: 12345
and
s: "abcdefghij"
Let us see how and why it happens this way. Extracting an integer is independent of
the specified field width. The extractor for integers always reads as many digits as belong
to the integer. As extraction of integers does not use the field width setting, the field width
of 10 is still in effect after evaluation of the subexpression cin >> i. When a character
sequence is subsequently extracted, only 10 characters will be extracted in this case,
because the field width setting is still in effect. After the extraction of the character
sequence, however, the field width is reset to 0. This is because the extractor for C strings
uses the field width setting and resets it after use.
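The behavior of the field width on input can be observed with a string stream. The sketch below uses hypothetical names (Extracted, extract_with_width). Note the assumption about the character count: according to the rule given in the discussion of string input, the extractor for C strings stores at most width() - 1 characters plus the terminating '\0', so a standard-conforming library needs a field width of 11 to deliver the ten characters abcdefghij, and delivers nine characters for a field width of 10.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical record of what an extraction produced.
struct Extracted {
    int i;
    std::string s;
    std::streamsize width_after_int;
    std::streamsize width_after_string;
};

// Extracts an integer and a C string with field width w (w must not exceed 11,
// the size of the array).
Extracted extract_with_width(const std::string& input, std::streamsize w) {
    std::istringstream in(input);
    Extracted r = Extracted();
    char s[11] = "";

    in.width(w);
    in >> r.i;                          // does not use or reset the field width
    r.width_after_int = in.width();
    in >> s;                            // uses the field width and resets it
    r.width_after_string = in.width();
    r.s = s;
    return r;
}
```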
In contrast to the extractor for integers, the inserter for integers uses the field width
and resets it. To understand the difference, let us consider the following program fragment
in which characters are inserted to an output device:
// output
int i; char s[11];
cout.width(10);
cout << i << s;
If the variables have the values
i: 123
and
s: "abc"
the output is
       123abc
that is, seven white spaces followed by the three digits representing the value of i,
followed by the character sequence contained in s.
This is because the inserter for integers uses the specified field width and fills the
field with padding characters if the integral value has fewer than ten digits. As the inserter
uses the field width setting, it resets the field width to 0. Hence, after evaluation of the
subexpression cout << i, the field width is reset. (Note the difference: On input the field
width setting was still in effect after extraction of the integer.) Hence the subsequent
insertion of the string s will not fill the field with padding characters for a string of less than
ten characters.
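A minimal sketch of the output case, again on a string stream so that the padding can be inspected; the function names are hypothetical.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Inserts an integer and a C string after setting a field width of 10.
std::string insert_with_width() {
    std::ostringstream out;
    int i = 123;
    const char* s = "abc";
    out.width(10);
    out << i << s;      // only the integer is padded
    return out.str();
}

// Returns the field width that is in effect after an integer insertion.
std::streamsize width_after_insert() {
    std::ostringstream out;
    out.width(10);
    out << 123;
    return out.width();
}
```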
1.2.4 Manipulators
Format control requires calling a stream’s member functions. Each such call interrupts the
respective shift expression. But what if you want to change formats within a shift expression?
This is possible in IOStreams. Instead of writing
cout.setf(ios_base::left, ios_base::adjustfield);
cout << 813;
you can write
cout << left << 813;
In this example, an expression like left is called a manipulator. You can think of a
manipulator as an object you can insert into or extract from a stream, in order to manipulate
that stream. The expression
cout << left;
is equivalent to
cout.setf(ios_base::left, ios_base::adjustfield);
Some manipulators also take arguments, like setw(int). The setw manipulator
sets the field width. The expression
cout << setw(10);
is equivalent to
cout.width(10);
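The equivalence between manipulators and member function calls can be checked by formatting the same value both ways. This is a sketch with hypothetical function names; the trailing '|' only makes the padding visible.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Format control through member functions: each call interrupts the expression.
std::string with_member_calls() {
    std::ostringstream out;
    out.setf(std::ios_base::left, std::ios_base::adjustfield);
    out.width(10);
    out << 813;
    out << '|';
    return out.str();
}

// The same format control through manipulators inside one shift expression.
std::string with_manipulators() {
    std::ostringstream out;
    out << std::left << std::setw(10) << 813 << '|';
    return out.str();
}
```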
LIST OF MANIPULATORS
Table 1-4 gives an overview of all manipulators defined by IOStreams. Further details
about the standard manipulators can be found in this book’s reference guide. The infor-
mation, however, is spread over several entries in the reference guide, because manipula-
tors with and without arguments are implemented in different ways. (Section 3.2,
User-Defined Manipulators, explains these differences in greater detail, describes how
manipulators work, and shows how you can implement your own manipulators.)
Manipulators with arguments construct an object of a class derived from the imple-
mentation-specific type smanip. These manipulators are listed under a special entry:
manipulators.
Manipulators without arguments take and return a reference either to class ios_base,
class basic_istream<charT,traits>, or class basic_ostream<charT,traits>.
Because they are so closely coupled to one of the classes, they are described under the
entry of the corresponding class.
Table 1-4, which describes manipulators in IOStreams, has the following entries:
The first column, Manipulator, lists its name. As usual, we omit the scope operator
for the namespace ::std.
The second column, Affects, indicates whether the manipulator is intended to be
used with istreams (i), ostreams (o), or both (io).
The third column, Purpose, summarizes the effect of the manipulator.
The fourth column, Equivalent, lists the corresponding call to the stream’s member
function.
The last column, Ref, indicates where further information can be found in the refer-
ence guide:
O = basic_ostream<charT,traits>,
I = basic_istream<charT,traits>,
B = ios_base,
M = manipulators.
flushing is not necessary, because the standard output stream cout is tied¹⁵ to the standard
input stream cin, which means that input and output to the standard streams are
synchronized anyway. Since no flush is required, the intent is probably to insert the
end-of-line character. If you consider typing '\n' more trouble than typing endl, you can
easily add a simple manipulator nl that inserts the end-of-line character but refrains from
flushing the stream.¹⁶
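One possible shape of such a manipulator is sketched below. The name nl is taken from the text; it is not part of the standard library, and the definition shown here is only an assumption about how it could be written. The helper two_lines is hypothetical.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Sketch of a manipulator nl: inserts the end-of-line character like endl,
// but does not flush the stream.
template <class charT, class traits>
std::basic_ostream<charT, traits>& nl(std::basic_ostream<charT, traits>& os) {
    return os.put(os.widen('\n'));
}

// Hypothetical helper showing nl inside a shift expression.
std::string two_lines() {
    std::ostringstream out;
    out << "first" << nl << "second" << nl;
    return out.str();
}
```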
14. The field width controls input and output. However, on input it is relevant only for extraction of strings (see
section 1.2.7, Peculiarities of Formatted Input) and is otherwise ignored.
15. See section 1.8.1.3, Synchronization by Tying Streams, for further details.
16. See section 3.2, User-Defined Manipulators, to learn how you can define such a manipulator.
stream in “working order.” Note that some of the manipulators conform to this rule; oth-
ers do not. The endl manipulator, for instance, has no effect on out-of-order streams.
Manipulators that set format flags like setw, setprecision, etc., do have an effect:
They set the respective format parameter whether the stream is in working order or not.¹⁷
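The different behavior of the two kinds of manipulators on a stream that is not in working order can be observed as follows. This is a sketch with hypothetical function names; the stream is put into an error state by hand via setstate().

```cpp
#include <iomanip>
#include <ios>
#include <ostream>
#include <sstream>

// endl on a stream whose failbit is set: nothing is written.
bool endl_ignored_when_failed() {
    std::ostringstream out;
    out.setstate(std::ios_base::failbit);
    out << std::endl;
    return out.str().empty();
}

// setw on a stream whose failbit is set: the field width is set nevertheless.
std::streamsize width_after_setw_when_failed() {
    std::ostringstream out;
    out.setstate(std::ios_base::failbit);
    out << std::setw(10);
    return out.width();
}
```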
19,99 1.000.000
Naturally, the radix character is relevant to the formatting and parsing of floating-
point numbers, because floating-point numbers usually have a fractional part.
17. Section 3.2.2.4.1, Manipulator Base Template with Error Handling, explains the reason for this.
The thousands separator, on the other hand, is relevant only for integral values.¹⁸
You cannot produce output of a floating-point number that contains thousands separa-
tors, i.e., there is no such output as 1,000,000.50.
18. Numerous formatting options can be switched on and off by setting or clearing corresponding format flags.
However, the insertion of thousands separators to integral values cannot be suppressed once the locale specifies
rules for thousands grouping.
WHITESPACE SKIPPING
All extractors by default ignore whitespace characters that precede the item to be
extracted. Imagine that an input sequence contained " \t46sec", and we read an integral
value from that input sequence via an IOStreams extractor. The shift operator would
extract, but then discard, all preceding whitespace characters (blanks, tabs, newlines).
When it found the first relevant character, i.e., the digit character "4" in our example, it
would accumulate the characters until it encountered one that did not belong to the item.
In this case the separator would be the alphabetical character "s" after the digit sequence.
The separator would remain in the input sequence and become the first character
extracted in a subsequent read operation.
You can switch off the default behavior of skipping preceding whitespace characters
by means of the manipulator noskipws or the equivalent stream operation
unsetf(ios_base::skipws). This may be useful if you expect the input to have a certain
format and the whitespace characters to be part of the format specification; then you need to
extract the whitespace characters rather than silently ignoring them, so that you can check
for violations of the format requirements.
Here is an example. It extracts one line of input that is supposed to consist of a list of
floating-point numbers separated by commas; no whitespace is permitted.
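A minimal sketch of such a check follows. The function names read_number_list and count_numbers are hypothetical, and the sketch assumes the classic "C" locale, in which a comma is not consumed as part of a number.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one line of comma-separated floating-point numbers.
// Returns false if the line violates the format, e.g. if it contains whitespace.
bool read_number_list(std::istream& in, std::vector<double>& values) {
    const std::ios_base::fmtflags saved = in.flags();
    in >> std::noskipws;                  // whitespace is no longer skipped silently
    bool ok = true;
    for (;;) {
        double d;
        if (!(in >> d)) {                 // not a number, or whitespace in front of it
            ok = false;
            break;
        }
        values.push_back(d);
        char c;
        if (!in.get(c) || c == '\n')      // end of input or end of line
            break;
        if (c != ',') {                   // anything but a comma is a violation
            ok = false;
            break;
        }
    }
    in.flags(saved);                      // restore the caller's format flags
    return ok;
}

// Convenience wrapper: number of values in the line, or -1 on a violation.
int count_numbers(const std::string& line) {
    std::istringstream in(line);
    std::vector<double> values;
    return read_number_list(in, values) ? static_cast<int>(values.size()) : -1;
}
```

With whitespace skipping left on, the line "1.5, 2.25" would have been accepted; with noskipws the blank after the comma is detected as a format violation.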
IGNORING CHARACTERS
If you have to skip a sequence of characters other than whitespace, you can use the
ignore(streamsize n, int_type delim) function. Its functionality is to read and
discard characters until a certain number of characters are extracted or a separator is
found. If you want to use the ignore () function for skipping any number of characters
until a particular character is found, simply set the limit to the largest possible number of
characters in a file so that the maximum number of extracted characters will never be
reached. Only the occurrence of the separator will stop the extraction. Here is an example:
cin.ignore(numeric_limits<streamsize>::max(),'\n');
In this call the function ignore () reads and discards all characters until the end of the
line. The constant numeric_limits<streamsize>::max() is the largest possible
number of characters in a stream.
Note that the example relies on the fact that cin is a stream for narrow characters.
For a wide-character input stream the equivalent to the call above would be
wcin.ignore(numeric_limits<streamsize>::max(), wchar_t('\n'));
The main difference from the case of narrow-character streams is the end-of-line
character. Different character types can have different end-of-line characters. The equiva-
lent of '\n' for any given character type charT can be created via the character type’s
constructor, i.e. charT('\n').
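Both calls can be tried on string streams. The sketch below skips the remainder of the first line and then extracts the next word; the function names are hypothetical.

```cpp
#include <ios>
#include <limits>
#include <sstream>
#include <string>

// Narrow-character version: discard everything up to and including '\n'.
std::string word_after_first_line(const std::string& text) {
    std::istringstream in(text);
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    std::string word;
    in >> word;
    return word;
}

// Wide-character version: the delimiter is created as wchar_t('\n').
std::wstring wide_word_after_first_line(const std::wstring& text) {
    std::wistringstream in(text);
    in.ignore(std::numeric_limits<std::streamsize>::max(), wchar_t('\n'));
    std::wstring word;
    in >> word;
    return word;
}
```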
INPUT OF STRINGS
IOStreams supports extraction of character sequences into strings. String extractors are
slightly different from extractors for other types, because there is no particular format
specification. Instead, all characters except whitespace characters are considered part of a
string. Hence the only separator that can stop the extraction of a string is a whitespace
character. In contrast, the extraction of items of other types stops once a character is found
that does not belong to the format for that type. For instance, the extraction of an integral
number from the sequence "5ft" stops once the letter "f" is found, because an "f" is not
considered part of an integral number.
The field width setting, too, can stop the extraction of characters into a string. More
precisely, when you extract strings from an input stream, characters are read until (1) a
whitespace character is found, (2) an end-of-string character¹⁹ is found, (3) the end of the
input is reached, or (4) a certain number of characters are extracted if width() != 0.
This maximum number of extracted characters is the field width width() for C++
strings and width()-1 for C strings.
Note that the field width will be reset to 0 after the extraction of a string.
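The effect of the field width on the extraction of a C++ string, and its reset afterwards, can be sketched as follows; two_chunks is a hypothetical name.

```cpp
#include <ios>
#include <sstream>
#include <string>
#include <utility>

// Extracts two strings: the first with field width w, the second with whatever
// field width is in effect after the first extraction.
std::pair<std::string, std::string> two_chunks(const std::string& text,
                                               std::streamsize w) {
    std::istringstream in(text);
    std::string first, second;
    in.width(w);
    in >> first;        // stops after w characters (if w != 0) or at whitespace
    in >> second;       // the field width has been reset to 0 by now
    return std::make_pair(first, second);
}
```

For the input "abcdefgh ij" and a field width of 5, the first extraction stops after abcde; the second one is limited only by the blank and yields fgh.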
// extraction of C string
char buf[SZ];
cin >> buf;
is different from
// extraction of C++ string
string s;
cin >> s;
When characters are extracted into a basic_string object you need not worry
whether the number of extracted characters might exceed the string’s capacity.
basic_string objects dynamically allocate additional storage and adjust their size as
necessary. C-style strings, on the other hand, are character arrays, which have fixed size
and cannot dynamically extend their capacity. If more characters are available from the
input than the character array can hold, the extractor writes beyond the end of the array.
To prevent this, you must set the field width to the array size each time you extract a C
string:
char buf[SZ];
cin >> setw(SZ) >> buf;
19. See appendix G.12, C++ Strings, for an explanation of the end-of-string character.
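A runnable sketch of the bounded extraction into a character array; the buffer size of 8 and the function name read_bounded are assumptions chosen for illustration.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

const int SZ = 8;   // hypothetical buffer size

// Extracts a C string without ever writing beyond the end of the array.
std::string read_bounded(const std::string& text) {
    std::istringstream in(text);
    char buf[SZ] = "";
    in >> std::setw(SZ) >> buf;     // stores at most SZ-1 characters plus '\0'
    return std::string(buf);
}
```

Since the field width is reset after every string extraction, setw(SZ) has to be repeated for each C string that is read.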
20. More information about bitmask types can be found in appendix G.1, Bitmask Types.
1. Characterwise extraction. After reading the last available character, the stream is
still in good state; neither eofbit nor failbit is set. Any subsequent extraction not
only reads past the end of the input sequence, which results in setting the eofbit, but
also fails to extract the requested character. Hence, failbit is set in addition to eofbit.
2. Extraction of an item other than a single character. Here it is different. Let us again
imagine the input sequence contained "912749<eof>" and an integer was supposed to
be extracted. Although the end of the input sequence is reached by extracting the integer,
which results in setting the eofbit, the input operation does not fail. The desired integer
can be extracted. Hence in this situation failbit is not set; only the eofbit is set.
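The two cases can be reproduced with string streams, where the end of the string plays the role of the end of the input sequence. This is a sketch; the function names are hypothetical.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Case 1: characterwise extraction. Returns the stream state after the given
// number of single-character extractions.
std::ios_base::iostate state_after_chars(const std::string& text, int extractions) {
    std::istringstream in(text);
    char c;
    for (int k = 0; k < extractions; ++k)
        in.get(c);
    return in.rdstate();
}

// Case 2: extraction of an integer that ends exactly at the end of the input.
std::ios_base::iostate state_after_int(const std::string& text) {
    std::istringstream in(text);
    int n = 0;
    in >> n;
    return in.rdstate();
}
```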
coverable, whereas failbit indicates a situation that might allow you to retry the failed
operation. The flag eofbit simply indicates that the end of the input sequence has been
reached, which need not be considered an error at all.
However, before you react to any stream error, you need to detect it. How can you
do this? There are two possibilities:
1. You can actively access the stream’s state after each stream operation and check
for errors. This is the default.
2. You can declare that you want to have an exception raised once an error occurs in
any stream operation.
We will explore these possibilities in the next two sections.
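As a preview, the two possibilities can be contrasted in a few lines. This is a sketch with hypothetical function names; it assumes that failures are reported through exceptions of type ios_base::failure once they are enabled with exceptions().

```cpp
#include <ios>
#include <sstream>
#include <string>

// Possibility 1: check the stream state after the operation.
bool extraction_fails_quietly(const std::string& text) {
    std::istringstream in(text);
    int n = 0;
    in >> n;
    return in.fail();
}

// Possibility 2: ask the stream to raise an exception when failbit or badbit
// is set. Returns true if the extraction threw.
bool extraction_throws(const std::string& text) {
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    int n = 0;
    try {
        in >> n;
    } catch (const std::ios_base::failure&) {
        return true;
    }
    return false;
}
```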
The following examples show how you would use these functions.
if (!cout) // error!
The state of cout is examined with operator!(), which will return true if
failbit or badbit is set, i.e., if the stream state indicates an error has occurred.
21. There are three further access functions: rdstate(), clear(), and setstate(). rdstate() returns the
value of the stream state. clear() and setstate() allow you to modify the stream state. Their main purpose
is for modifications of the stream state, not for checking the state flags.
The magic here is the operator void*() in conjunction with standard conversion
sequences implicitly performed by the compiler. In particular, the compiler implicitly
converts an expression of a given type when used in the condition of an if statement: the
destination type of such a conversion is bool. In the example above the expression cout
<< x is used in a condition, and therefore a sequence of implicit conversions is applied: Initially,
the expression cout << x evaluates to a reference to a stream, because shift operators
generally return a reference to the stream on which they are invoked. The compiler
then performs the following conversions:
1. A cast from a reference to a stream to a pointer of type void* using the stream’s
cast operator operator void*().
2. A promotion from the pointer type void* to type bool.
The cast operator operator void*() returns a zero pointer value if failbit or
badbit is set, and a nonzero pointer value otherwise. Hence, the value of cout << x is false in case of an
error situation, and true if the stream is in a good state.
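As a small illustration of this idiom, the sketch below tests the result of an insertion directly in a condition. The function name try_write is hypothetical. Note that since C++11 the standard library provides an explicit operator bool() in place of operator void*(); the way the idiom is written is unchanged.

```cpp
#include <ostream>
#include <sstream>

// Returns true if the insertion left the stream without failbit or badbit.
bool try_write(std::ostream& out, int x) {
    if (out << x) {  // the returned stream reference is converted to bool
        return true;
    }
    return false;
}
```

With a fresh std::ostringstream the call succeeds; if failbit has been set beforehand (for instance via setstate()), the insertion does nothing and the condition evaluates to false.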
cout << x;
if (cout.good()) // okay!
The value of the comma expression cout << x, cout.good() is the return value
of good(). This is because a comma expression always evaluates to the value of the right-hand
side of the comma operator; i.e., the rightmost expression in a sequence of comma
operators determines the value of the entire expression.
A final note on subtle differences between good() and the other access functions:
The function good() takes all flags into account, the eofbit included, whereas fail(),
operator!(), and operator void*() ignore eofbit. In other words, a stream that is
not in a good state according to good() need not be in a state of failure according to
fail(); it may just have the eofbit set.
Recommendation: Considering the meaning of the state flags, we recommend
checking for error situations via fail(), operator!(), or operator void*(), and
checking for failure of input operations via !good().
try {
   cout.exceptions(ios_base::badbit | ios_base::failbit);
   cout << x;
   // do lots of other stream output
}
catch(...)
{  if (cout.bad())
   {  // unrecoverable error
      throw;
   }
   else if (cout.fail())
   {  // retry
      return;
   }
}
In calling the exceptions() function, you specify what flags in the stream’s state
will cause an exception to be thrown. In this example we want an exception thrown each
time either badbit or failbit gets set in the stream state. It is generally recommended
to set badbit and failbit for output streams, because this conforms to the checking for
error situations via fail(), operator!(), and operator void*(). Similarly, you
would set badbit, failbit, and eofbit for input streams, because it is equivalent to
checking for input errors via !good(). Note also that the call to exceptions() raises an
exception if the respective state flag is already set. For this reason, the activation of exceptions
is in itself a critical operation that has to be called inside the try block.
The catch clause catches any exception raised by one of the previously invoked
I/O operations. As the exception object itself does not contain any information about the
state flag that triggered the exception, you must also check the stream state. If the stream
state is bad, the error situation is unrecoverable. Otherwise, it was just a failure of a certain
operation, and a retry might make sense.
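The same pattern can be packaged into a function that reports what happened. This is a sketch under our own assumptions: the name classify_write and its string results are invented for illustration, and the handler catches std::ios_base::failure instead of using catch(...).

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Activates exceptions for badbit and failbit, performs one insertion,
// and classifies the outcome by inspecting the stream state afterwards.
std::string classify_write(std::ostream& out, int x) {
    try {
        // May throw right away if one of the flags is already set.
        out.exceptions(std::ios_base::badbit | std::ios_base::failbit);
        out << x;
        return "ok";
    } catch (const std::ios_base::failure&) {
        if (out.bad()) {
            return "bad";   // unrecoverable
        }
        return "fail";      // a retry might make sense
    }
}
```

Note that the exception mask stays active on the stream after the call; a caller who does not want that would have to save the old mask with exceptions() and restore it once the stream state has been cleared.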
An alternative to wrapping the critical operations into a try block is to suppress
exceptions, execute all critical operations, and activate them afterwards. Here is an example:
throw;
}
else if (cout.fail())
{
// retry
return;
void clear(iostate state = goodbit) sets the stream state to the value
of the argument state. The default argument is goodbit, which means that
a call to clear() with no argument sets the stream state to goodbit.
To reset the stream state to good after an error has occurred, either clear() or
clear(goodbit) can be used. Let’s add a call to clear() to the example above where
we suggest retrying I/O processing. The code changes to
throw;
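To show clear() in a retry scenario that can be run in isolation, here is a sketch for input rather than output. It assumes whitespace-separated tokens; the function name next_int is hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Skips tokens that are not integers and stores the first integer found.
// Returns false if the input is exhausted or the stream is unusable.
bool next_int(std::istream& in, int& value) {
    while (!(in >> value)) {
        if (in.bad() || in.eof()) {
            return false;        // nothing left to retry on
        }
        in.clear();              // reset the stream state to goodbit
        std::string junk;
        in >> junk;              // discard the offending token
    }
    return true;
}
```

Without the call to clear(), the extraction of junk would refuse to do anything, because failbit would still be set from the failed integer extraction.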
1.4 File Input/Output
22. See section 4.2.7, Character Encodings, for information on multibyte character files.
open. Open file streams are fully functioning file stream objects that are ready to per-
form input and output operations, because they are connected to an open file. A closed file
stream can be turned into an open file stream by invoking open(), which is demonstrated
in the example above. The converse can be achieved by calling close(). The
close() member function closes the file and disconnects it from the file stream.
The file stream classes additionally have a constructor that allows creation of an
open file stream by providing a file name. The constructor implicitly opens the file and
connects it to the stream (see example below).
One can check whether a file stream is connected to an open file by means of the
is_open() member function.
When a file stream object is destroyed, the connected file is automatically closed by
the file stream’s destructor.
Calls to open() and close() cannot be nested; a file stream must be closed before
it can be opened again. Attempts to open an already open file stream fail. Similarly, you
cannot close a file that is already closed. Every call to open() must be matched by a call to
close() (see example below):
{
   ofstream fil("src.cpp"); // fil is implicitly opened at construction time
   // ...
   if (fil.is_open())
      fil.close();
}
The functions open() and close(), as well as the file stream constructors, indicate
failure by setting failbit in the stream state in case they cannot open or close the external
file (see also section 1.3.2, Checking the Stream State).
tion was successful. However, a call to open() can fail even if the file stream is open, for
example, if you attempt to reopen an already open file stream. As expected, the call to
open() fails, and as a result fail() yields true; at the same time is_open() yields
true, because the stream is connected to an open file due to a previous successful call to
open(). The code below shows such a situation:
if (fil.is_open())
{ /* connected to an open file, namely "src.cpp" */ }
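The situation can be reproduced with a few lines. The sketch below is ours, not the original listing; ReopenResult and reopen_while_open are hypothetical names, and the function creates and removes a scratch file at the given path.

```cpp
#include <cstdio>
#include <fstream>

// What we observe when open() is called on a stream that is already open.
struct ReopenResult {
    bool opened;             // the constructor managed to open the file
    bool fail_after_reopen;  // fail() after the second open()
    bool still_open;         // is_open() after the second open()
};

ReopenResult reopen_while_open(const char* path) {
    std::ofstream file(path);       // implicitly opened at construction time
    ReopenResult result{file.is_open(), false, false};
    if (!result.opened) {
        return result;              // could not even create the file
    }
    file.open(path);                // rejected: the stream is already open
    result.fail_after_reopen = file.fail();
    result.still_open = file.is_open();
    file.close();
    std::remove(path);              // clean up the scratch file
    return result;
}
```

Provided the scratch file can be created, both fail_after_reopen and still_open should come back true, which is why fail() and is_open() answer different questions.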
Both functions store the open mode setting in one of the stream’s data members, and
subsequent input and output operations adjust their behavior according to the current
setting. The open mode setting is always determined when the file stream is connected to
a file. The setting cannot be changed afterwards. Also, the current setting cannot be
retrieved after it has been set.
Table 1-7 shows the flag names and effects. The open mode flags control how the file
is opened and in which way it is used later on. More specifically, the open mode flags
have an impact on the initial file position and the initial file length. The open mode deter-
mines whether system-specific conversions are performed or suppressed and whether the
file to be opened must exist or shall be created. Here are the details:
THE INITIAL FILE POSITION. Each file maintains a file position that indicates the position
in the file where the next byte will be read or written. When a file is opened, the initial file
position is by default at the beginning of the file. The open modes ate (meaning at end)
and app (meaning append) change this default to the end of the file. There is a subtle difference
between ate and app mode.
If the file is opened in append mode, all output to the file will be done at the current
end of the file, regardless of intervening repositioning. Even if you modify the file position24
to a position before the file’s end, you do not write there.
With at-end mode, only the initial file position is at the end of the file. You can reposition
to a position before the end of file and write to that position.
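The difference between the two modes shows up as soon as we reposition before writing. The following sketch uses a hypothetical helper, overwrite_at_start, which fills a scratch file with "abcdef", reopens it in the given mode, seeks to the beginning, and writes a single character. For the at-end case we combine ate with in | out, because out on its own already discards the file content.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Returns the file content after seeking to the start and writing 'X'
// through a stream opened with the given mode.
std::string overwrite_at_start(const char* path, std::ios_base::openmode mode) {
    {
        std::ofstream init(path);           // creates the file with "abcdef"
        init << "abcdef";
    }
    {
        std::fstream file(path, mode);
        file.seekp(0, std::ios_base::beg);  // try to move before the end
        file << 'X';
    }
    std::string content;
    {
        std::ifstream in(path);
        std::getline(in, content);
    }
    std::remove(path);                      // clean up the scratch file
    return content;
}
```

With out | app the character should end up at the end of the file ("abcdefX") despite the seek; with in | out | ate it should replace the first character ("Xbcdef").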
THE INITIAL FILE LENGTH. The open mode trunc (meaning truncate) sets the initial
file length to zero, which has the effect of discarding the file content. The trunc flag is
included in the default open mode of output file streams, so you can omit the trunc flag
and think of the open mode out as being equivalent to out | trunc. This only holds true
for output file streams. For bidirectional file streams, there is no default open mode and
trunc must always be explicitly specified, i.e., you must say in | out | trunc if the file
content is to be discarded.
23. See section G.1, Bitmask Types, in appendix G for further information on bitmask types.
24. The file position can be changed via the seekg() and seekp() member functions of a stream. See section 1.7, Stream
Positioning, for reference.
If an output file is to be extended rather than having its content replaced, we must
omit the trunc flag and include the at-end or append flag instead. These flags move the
initial file position to the file’s end; the missing trunc flag causes the initial file length not
to be set to zero but retained; and as a result of these open mode settings, the file is
extended rather than overwritten.
CREATING FILES. When the open mode contains in (meaning that the file is opened
for input), the attempt to open the file fails if the file does not exist. When the open mode is
the out flag (meaning that the file is opened for output), the file will be created if it does
not yet exist. In that case, the attempt to open the file fails only if the file cannot be created.
SYSTEM-SPECIFIC CONVERSIONS. The open mode flag binary has the effect of suppressing
automatic conversions performed by underlying system services. The representation
of text files varies among operating systems. For example, the end of a line in a
UNIX environment is represented by the linefeed character '\n', whereas on Microsoft
operating systems the end of the line consists of two characters, carriage return '\r' and
linefeed '\n'. An operating system’s I/O functions therefore perform automatic conver-
sions, such as converting between "\r\n" and '\n'.
The open mode flag binary has the effect of suppressing such automatic conver-
sions. Basically, the binary mode flag is passed on to the respective operating system’s
service function, which means that in principle all system-specific conversions will be
suppressed, not just the carriage return/linefeed handling. The effect of the binary open
mode is frequently misunderstood. It does not put the inserters and extractors of
IOStreams into a binary mode in the sense of suppressing the formatting they usually perform.
Binary input and output, in the sense of unformatted I/O, is done via certain member
functions of the stream classes: basic_istream<charT>::read() and
basic_ostream<charT>::write().
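To make the distinction concrete, the sketch below writes the same integer once through an inserter and once through write(). String streams are used so that no file is needed, and the binary flag is deliberately left out to stress that it plays no role in the choice between formatted and unformatted I/O. The function names are invented for this illustration.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Formatted output: the value is converted to a sequence of digits.
std::string formatted_bytes(std::int32_t value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Unformatted output: the object representation is copied as is.
std::string unformatted_bytes(std::int32_t value) {
    std::ostringstream out;
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
    return out.str();
}

// Unformatted input: reads the object representation back.
std::int32_t read_back(const std::string& bytes) {
    std::istringstream in(bytes);
    std::int32_t value = 0;
    in.read(reinterpret_cast<char*>(&value), sizeof value);
    return value;
}
```

For the value 1000000 the formatted variant yields the seven characters "1000000", whereas the unformatted variant yields sizeof(std::int32_t) bytes whose content depends on the platform's byte order.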
fstream BiStr("poem.txt");
In order to make sure that an output stream is opened in output mode and an input
stream is opened in input mode, the default open modes are always implicitly added to
the mode argument. This way, you cannot inadvertently open an input file for output, or
vice versa. The correct open mode would be added implicitly, and the wrong open mode
would have no effect. Moreover, the implicit addition of the default open mode for input
and output streams is convenient to use. For instance, instead of writing
25. Some of the open modes correspond to the file open modes used in the C stdio (for invocation of the
fopen() function). See section D.1, File Open Modes, in appendix D for details.
For bidirectional streams, the default open mode is not added implicitly, because a
bidirectional stream need not always be opened for both transport directions. You can
open a bidirectional stream for just reading or just writing. A bidirectional file stream is
opened in exactly the modes that are provided to its constructor or open() function,
without implicit additions. The result is that we always have to fully specify a bidirectional
file stream’s open mode, as in the examples below:
Or
The call seekg(0, ios_base::beg) has the effect of emptying the internal buffer
to the external file and resetting the file position indicator to the beginning of the file. Sub-
sequent read attempts start reading from the beginning of the file. In the example above,
we read what we had previously written to the file.
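A self-contained variant of this write-then-read sequence might look as follows. It is a sketch with a hypothetical function name, write_then_read, and it uses a scratch file that is removed again at the end.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Writes a word to a bidirectional file stream, repositions to the
// beginning, and reads the word back.
std::string write_then_read(const char* path) {
    // The open mode of a bidirectional file stream is spelled out in full.
    std::fstream file(path, std::ios_base::in | std::ios_base::out |
                            std::ios_base::trunc);
    file << "hello";
    file.seekg(0, std::ios_base::beg);  // empties the buffer and repositions
    std::string word;
    file >> word;
    file.close();
    std::remove(path);                  // clean up the scratch file
    return word;
}
```

If the scratch file can be created, the function should return "hello".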
Instead of repositioning, we can flush the internal buffer by invoking the flush()
member function instead of seekg(). In that case the file position remains unchanged
and in the example above the subsequent read attempt fails because the file position is at
the end of the file where there is nothing to read.
Note that after reading the entire content of the file, the stream status indicates that
the end of file has been reached. When any of the stream state bits is set, all stream opera-
tions refuse to do anything. Before we can successfully write to the file stream, we must
clear the stream state. As we did not reposition the file position indicator, the output
appears at the end of the file. Had we reset the position, we would have overwritten part
of the file content.
int i;
if (istringstream("4711") >> i)
{
   // i has the value: 4711
}

ostringstream oBuf;
oBuf.imbue(locale("German_Germany")); // locale names are platform-specific
oBuf << setprecision(2) << fixed;
The current content of a string stream can be replaced via an overloaded version of
the str() member function that takes a string. The string is copied to the internal buffer
and from then on serves as the source or sink of characters to subsequent insertions and
extractions.
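The following sketch puts both uses of string streams side by side: formatting into a string, and reusing one input string stream for a second string via str(). The function names are hypothetical, and no locale is imbued, so the classic "C" formatting conventions are assumed.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats a floating-point value with two digits after the decimal point.
std::string format_fixed(double value) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(2) << value;
    return out.str();
}

// Parses one int from each string, reusing the same stream object.
int sum_of_two_inputs(const std::string& first, const std::string& second) {
    std::istringstream in(first);
    int a = 0;
    int b = 0;
    in >> a;
    in.str(second);  // replace the content of the internal buffer
    in.clear();      // the first extraction may have set eofbit
    in >> b;
    return a + b;
}
```

The call to clear() matters: extracting "4711" reaches the end of the first string and sets eofbit, and replacing the content via str() does not reset the stream state.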
UNFORMATTED INPUT
Unformatted input differs from formatted input as follows:
Leading whitespace is not skipped. It is extracted and delivered as part of
the extracted character sequence.
The input operations read as many characters as are requested, or until a delimiter or
the end of file is found. Otherwise, they do not interpret the extracted characters in any way.
There are several types of unformatted input functions:
• read a single character, e.g., via int_type get()
Each of these functions comes in several flavors. They differ in the criteria that stop
the extraction of characters. Details can be found in the reference part of this book. The
extracted characters are stored into successive locations of an array that is passed to the
operation. The operation adds a terminating null character to the extracted character
sequence.
Like the formatted input operations, the unformatted input operations return the
stream for which they are invoked, with the exception of int_type get(), which
returns the extracted character. The unformatted input functions indicate failure in the
same way as the formatted input functions do, namely, by setting the stream state bits
and/or raising exceptions.
In order to find out how many characters were read, the function gcount() can be
called. It returns the number of characters extracted by the last unformatted input opera-
tion called for the stream. This number can be different from the number of characters that
are stored in the provided character array. For instance, a delimiter is read and counted,
but not stored.
Below is an example where input is read line by line into a fixed-size character
buffer. If a line is longer than the buffer size, the line must be read in several steps until it is
completely consumed.
if (cin.eof())
   // end of file found; last line extracted
   cout << "Partial final line";
else if (cin.fail())
{  // max number of chars was reached; line was longer than 99 chars
   cout << "Partial long line";
   // clear stream state so that we can continue reading
   cin.clear(cin.rdstate() & ~ios::failbit);
} else {
   // don't include newline in count
   // mind: delimiter was extracted and counted, but not stored
   count--;
   cout << "Line " << ++line_number;
}
cout << " (" << count << " chars): " << buffer << endl;
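Since only part of the listing is shown above, here is a complete sketch of the same technique that collects the pieces instead of printing them. The buffer size of 8, the Chunk type, and the function name read_chunks are our own choices for illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One piece of input as delivered by a single call to getline().
struct Chunk {
    std::string text;        // characters stored in the buffer
    std::streamsize count;   // characters in text (delimiter excluded)
    bool complete;           // true if the chunk was ended by a newline
};

// Reads the stream line by line into a fixed-size buffer of 8 characters.
std::vector<Chunk> read_chunks(std::istream& in) {
    const std::streamsize size = 8;
    char buffer[size];
    std::vector<Chunk> chunks;
    for (;;) {
        in.getline(buffer, size);
        std::streamsize count = in.gcount();
        if (in.eof()) {
            // end of input; a partial final line may have been extracted
            if (count > 0) {
                chunks.push_back(Chunk{std::string(buffer), count, false});
            }
            break;
        } else if (in.fail()) {
            // buffer was filled before a newline was found
            chunks.push_back(Chunk{std::string(buffer), count, false});
            in.clear(in.rdstate() & ~std::ios_base::failbit);
        } else {
            // delimiter was extracted and counted, but not stored
            chunks.push_back(Chunk{std::string(buffer), count - 1, true});
        }
    }
    return chunks;
}
```

For the input "abc\nlongerthan7\nxy" this should produce four chunks: the complete line "abc", the two pieces "longert" and "han7" of the long line, and the partial final line "xy".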
Note that there is an additional getline() function defined in the header file
<string>, which is substantially more convenient to use than the stream member function
getline() that was used in the example above. While the stream member function
reads a line into a character array of fixed size, the string getline() function reads a line
into a string object of dynamic size. We need not worry whether the line will fit into the
buffer or not, because the string will grow as needed. Let us compare use of the two
getline() functions. The stream member function would be called as
and would indicate failure if the line is longer than the buffer size. In contrast, the string
getline() function would be called as
string buffer;
getline(cin, buffer, '\n');
and would fail if the line is longer than the maximum string size, which is an implementa-
tion-dependent size that can be obtained via buffer.max_size().
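A typical use of the string getline() function is a loop that collects all lines of a stream. The sketch below assumes '\n' as the delimiter; read_lines is a hypothetical name.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads all lines of the stream; the strings grow as needed.
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {  // the delimiter is extracted, not stored
        lines.push_back(line);
    }
    return lines;
}
```

No buffer size has to be chosen and no state has to be cleared for long lines, which is the convenience referred to above.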
In addition to the unformatted input functions, there are related functions for the
following:
UNFORMATTED OUTPUT
Unformatted output differs from formatted output in that the characters are written as
they are, without any interpretations, additions, or modifications.
There are two categories of functions for unformatted output:
1. Write a single character; an example is put(char_type).
2. Write a specified number of characters; an example is write(const
char_type*, streamsize).
Like the formatted output operations, the unformatted output operations return the
stream for which they are invoked. The unformatted output functions indicate failure in
the same way as the formatted output functions do, namely, by setting the stream state
bits and/or raising exceptions.
defined in each stream class.26 For bidirectional file streams, tellp() and tellg() yield
the same joint stream position. For bidirectional string streams, the two stream positions can
differ: tellg() provides the read position and tellp() provides the write position.
A call to tellg()/tellp() can fail under the following circumstances:
• If the stream has the failbit set as a result of any previous operation.
• (For file streams:) If the file stream is not connected to a file, i.e., is_open() yields
false.
• If the file stream performs a code conversion and the encoding of the external file is
either state-dependent or the number of external characters needed to produce an
internal wide character is not constant.
If the call to tellg()/tellp() fails, then pos_type(-1) is returned to indicate
the failure.
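Both observations, the separate read and write positions of string streams and the failure indication, can be checked with a bidirectional string stream. The following is a sketch; Positions, positions_after_write, and tell_fails_on_failed_stream are names made up for this illustration.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Read and write position of a string stream, expressed as offsets.
struct Positions {
    std::streamoff read;
    std::streamoff write;
};

// Inserts text into an empty string stream and reports both positions.
Positions positions_after_write(const std::string& text) {
    std::stringstream s;
    s << text;
    return Positions{std::streamoff(s.tellg()), std::streamoff(s.tellp())};
}

// tellg() on a stream with failbit set yields the invalid position.
bool tell_fails_on_failed_stream() {
    std::stringstream s("abc");
    s.setstate(std::ios_base::failbit);
    return s.tellg() == std::stringstream::pos_type(-1);
}
```

After inserting "hello", the write position should be 5 while the read position is still 0.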
26. The pos_type is defined in the stream’s character traits and contains all information necessary to represent
the stream position and, in the case of file streams, also the current state of the character conversion that is per-
formed on input and output.
27. The symbolic stream positions correspond to the argument that is passed to the C stdio function fseek().
See section D.2, Stream Positions, in appendix D for details.
1.7 Stream Positioning
The offset is an object of type off_type, which is a nested type defined in each
stream class. Stream offsets can be created from integral values28 and represent the number
of characters between two stream positions. Stream offsets are compatible with stream
positions in the sense that offsets can be added to and subtracted from stream positions;
the distance between two stream positions is an offset; and stream positions are convertible
to stream offsets and vice versa.29
Here is an example where the joint stream position of a bidirectional file stream is
manipulated in various ways:
// reset the joint file position to the begin of the file stream
fstr.seekg(0, ios_base::beg);
The error handling is omitted from the example. The functions tellp()/tellg()
indicate failure by returning the invalid stream position pos_type(-1); and seekp()/
seekg() indicate failure by setting the stream state.
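For completeness, here is a sketch of positioning with the error handling included. It remembers a position, extracts a word, computes the distance as an offset, seeks back by that offset, and extracts again. The function name reread_first_word is hypothetical, and an empty string is returned if any of the positioning calls fails.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Extracts a word, seeks back by the distance covered, and extracts it again.
std::string reread_first_word(std::istream& in) {
    typedef std::istream::pos_type pos_type;
    typedef std::istream::off_type off_type;

    const pos_type start = in.tellg();
    if (start == pos_type(-1)) {
        return std::string();             // tellg() failed
    }
    std::string word;
    in >> word;
    const pos_type after = in.tellg();
    if (after == pos_type(-1)) {
        return std::string();
    }
    const off_type length = after - start;  // distance between two positions
    in.seekg(-length, std::ios_base::cur);  // go back relative to current position
    if (in.fail()) {
        return std::string();             // seekg() failed
    }
    std::string again;
    in >> again;
    return again;
}
```

Applied to a string stream holding "one two", the function should return "one", and a subsequent extraction from the same stream continues with "two".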
28. To be more precise, stream offsets can be created from values of the type streamsize, and the type
streamsize is a synonym for one of the signed basic integral types. The type streamsize is used to represent
the number of characters transferred in an I/O operation.
29. Note that an offset represents a number of characters and not a number of bytes. For wide-character streams,
the distance between two stream positions is calculated as the number of wide characters between the two posi-
tions. This is slightly confusing in the case of wide-character file streams, because they are typically connected to
multibyte files. The distance between two file stream positions is expressed as a number of wide characters,
whereas the distance between the corresponding file positions on the external multibyte file is usually expressed
in terms of bytes or tiny characters (for instance, when the C stdio function fseek() is called).
String stream buffers do not define synchronization at all; i.e., they do not override
the virtual sync() function. This is because string streams do not distinguish between the
internal buffer and the external device; they maintain only one memory location that plays
the role of both. Hence the synchronization functions of string streams have no effect.
User-defined stream buffer classes need not override the virtual sync() function if
synchronization has no meaning for the new stream buffer type. Otherwise, the semantics
of the newly defined synchronization should conform to the abstract idea of synchronization
defined above.
ofstream ostr("/tmp/fil");
int i=10;
ostr << unitbuf;
while (i--)
{ ostr << i; }
is the same as in
ofstream ostr("/tmp/fil");
int i=10;
while (i--)
{ ostr << i;
  ostr.flush();
}
Synchronization via the unitbuf format flag is intended for output to logbook files.
The idea is that output should be written to the file as soon as possible, so that no infor-
mation gets lost in case of a system crash.
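The effect can be made visible by reading the file through a second stream while the output stream is still open. This is a sketch with a hypothetical function name; it creates and removes a scratch file at the given path.

```cpp
#include <cstdio>
#include <fstream>
#include <ios>
#include <string>

// Writes one entry with unitbuf set and reads it back through a second
// stream before the output stream is closed.
std::string visible_after_unitbuf_write(const char* path) {
    std::ofstream out(path);
    out << std::unitbuf;     // flush after every output operation
    out << "log-entry";
    std::string seen;
    {
        std::ifstream in(path);
        in >> seen;          // the entry has already reached the file
    }
    out.close();
    std::remove(path);       // clean up the scratch file
    return seen;
}
```

Without unitbuf, a short entry like this would typically still sit in the output stream's internal buffer when the second stream reads the file.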
ofstream ostr("/tmp/fil");
ifstream istr("/tmp/fil");
string s;
The input stream istr is tied to the output stream ostr. The tie() function
returns a pointer to the previously tied output stream, which is memorized for restoring
the tie afterwards. The effect of tying istr to ostr is the same as in
ofstream ostr("/tmp/fil");
ifstream istr("/tmp/fil");
string s;
Ties are intended for terminal I/O. If you read from a stream that is connected to a
terminal, then it is useful for output to be flushed to the stream, i.e., written to the termi-
nal, before input is read from the terminal. In such a case you would tie the terminal input
stream to the terminal output stream.
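Because only the declarations of the tie example survive above, here is a complete sketch along the same lines. It ties an input file stream to an output file stream on the same scratch file; the function name read_through_tie and the file handling are our own assumptions.

```cpp
#include <cstdio>
#include <fstream>
#include <ostream>
#include <string>

// Writes through one stream and reads through a second stream that is
// tied to the first, so the output is flushed before the extraction.
std::string read_through_tie(const char* path) {
    std::ofstream out(path);
    std::ifstream in(path);
    out << "hello";                         // may still sit in out's buffer
    std::ostream* previous = in.tie(&out);  // remember the previous tie
    std::string word;
    in >> word;                             // out is flushed first
    in.tie(previous);                       // restore the previous tie
    in.close();
    out.close();
    std::remove(path);                      // clean up the scratch file
    return word;
}
```

The predefined stream cin is tied to cout in the same way by default, which is why a prompt written to cout appears before the program waits for input.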
Table 1-10: Predefined Global Streams with Their Associated C Standard Files
Like the C standard files, these predefined standard streams are all associated by
default with the terminal.
The predefined standard streams are of type istream, ostream, wistream, and
wostream. These are the general stream classes, and the consequence is that the type of
their stream buffer is not specified. The predefined standard streams are allowed to use an
implementation-specific, special purpose, concrete stream buffer type. Little can be said
about its behavior beyond what is defined for the stream buffer abstraction in general (see
section 2.2, The Stream Buffer Classes, for details). In particular, the semantics of synchronization
are mostly undefined for abstract stream buffer types, but fully depend on each
concrete type. The concrete stream buffer type of the predefined standard streams is not
1.8 Synchronization of Streams
output operations are executed in the order of invocation, independent of whether the
operations were using the predefined C++ streams or the standard C files.
This synchronization is time-consuming and therefore might not be desirable in all
situations. You can switch it off by calling
ios_base::sync_with_stdio(false);
After such a call, the predefined streams will operate independent of the C standard
files, with possible performance improvements in your C++ stream operations.
The standard recommends that you call sync_with_stdio() prior to any input
or output operation on the predefined streams if you want to switch off the synchronization
between C and C++ I/O. The effect of calling sync_with_stdio() later is not standardized,
but implementation-defined.
31. This does not only hold for the predefined standard streams, but for all situations where a narrow character
stream has a wide character counterpart, i.e. whenever a narrow character stream and a wide character stream
are associated to the same external file.
CHAPTER 2
The Architecture of IOStreams
In chapter 1, we demonstrated the principles of using stream objects for formatted input
and output, but we did not pay much attention to the internal organization of the
IOStreams component. More sophisticated usage of IOStreams (chapter 3), however,
requires a thorough understanding of the structure of the IOStreams framework. Some
background information is needed to understand that chapter, because it is organized
around typical examples of using and extending IOStreams rather than explaining the
internals of IOStreams. The necessary background information is provided in this chapter.
It explains aspects of the software architecture of IOStreams, including a general
description of all IOStreams classes; their responsibilities, relationship, and collabora-
tion; and a couple of related topics. The focus in this chapter is on the principles used
inside IOStreams rather than on detailed descriptions of all classes and functions. The
goal is to provide an understanding of the way the pieces work together. Detailed and
complete references can be found in the reference section of this book.
Not all of the subjects addressed in this chapter are needed to understand the rest of
the book. Feel free to skip what is of no interest to you now and return later once you feel
you need some background information. Here is an overview of the topics covered in this
chapter.
As explained in chapter 1, IOStreams has two layers, one for parsing and formatting
and another for buffering, code conversion, and transport of characters to and from the
external device. As a reminder, figure 2-1 repeats the previous illustration of the IOStreams
layers.
STREAM AND STREAM BUFFER CLASSES. Related to each layer is a separate hierarchy of
classes: Classes that belong to the formatting layer are often referred to as the stream
classes. Classes of the transport layer are often referred to as the stream buffer classes.1 The
stream classes and their internals are explained in section 2.1, The Stream Classes, along
with their relationship to stream buffers and locales. A description of the stream buffer
classes can be found in section 2.2, The Stream Buffer Classes. Locales are explained in
part II of this book.
CHARACTER TYPE AND TRAITS TYPE. Almost all stream and stream buffer classes are
class templates that take two template arguments, the character type and the traits type.
Section 2.3, Character Types and Character Traits, is devoted to these type parameters.
iword/pword AND STREAM CALLBACKS. IOStreams provides a hook for adding information
to a stream that can be used for arbitrary purposes. This additional stream storage
is also known as iword/pword, which are the names of the respective data members
holding the information. Section 2.5, Additional Stream Storage and Stream Callbacks,
explains this special feature of IOStreams. The stream callbacks are often needed for
proper maintenance of such additional stream storage. Stream callbacks are also relevant
for imbuing locales.
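As a first impression of the additional stream storage, the sketch below attaches an "indentation level" to a stream via xalloc() and iword(). The notion of an indentation level and the three function names are invented for this illustration; the details follow in section 2.5.

```cpp
#include <ios>
#include <sstream>

// Reserves one slot in the additional stream storage, once per program.
int indent_index() {
    static const int index = std::ios_base::xalloc();
    return index;
}

// Stores an indentation level in the given stream's iword slot.
void set_indent(std::ios_base& stream, long level) {
    stream.iword(indent_index()) = level;
}

// Retrieves the indentation level; slots that were never set hold 0.
long get_indent(std::ios_base& stream) {
    return stream.iword(indent_index());
}
```

Each stream object has its own storage, so setting the level on one stream leaves every other stream at the initial value of 0.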
[Figure 2-1: The IOStreams layers between the program and the external device]
1. We use “the stream classes” as a synonym for “the formatting layer” and “the stream buffer classes” as a syn-
onym for “the transport and buffering layer.”
2.1 The Stream Classes
• Type definitions for bitmask types representing the format state (fmtflags), the
stream error state (iostate), the file open modes (openmode), stream positions
(seekdir), and the definitions of flag values associated with those bitmask types.
2. See section 2.3, Character Types and Character Traits, for details on these template parameters.
[Figure: the stream class hierarchy, showing the general stream classes (e.g., basic_istream) and the concrete stream classes (e.g., basic_ifstream) with their template parameters charT, traits, and Allocator]
3. This was previously discussed in section 1.2.3, The Format Parameters of a Stream.
4. The additional stream storage is called iword/pword and its concept is explained in section 2.5, Additional
Stream Storage and Stream Callbacks. Section 3.3.1, Using Stream Storage for Private Use: iword, pword, and
xalloc, demonstrates how iword/pword can be used for extending IOStreams.
• The locale imbued on the stream, as well as functionality for retrieving and imbuing
the locale (getloc() and imbue()).5
• Callback functions related to certain events (e.g., the destruction of the stream),
plus functionality for their registration (register_callback()), as well as
related type definitions.6
• The exception mask, plus functionality for access to the exception mask (exceptions()).
5. See section 2.1.4, How Streams Maintain Their Locale, for details on locales in IOStreams, and chapter 5 of
this book for locales in general.
6. See section 2.5, Additional Stream Storage and Stream Callbacks, for details.
7. See section 2.1.2, How Streams Maintain Their Stream Buffer, for details on how stream buffers are maintained
by streams, and 2.2, The Stream Buffer Classes, for an explanation of the internal principles of stream buffers.
8. This was previously discussed in section 1.3, The Stream State.
9. The fill character is the only character-type-dependent formatting information. It was previously discussed
in conjunction with the format state in section 1.2.3, The Format Parameters of a Stream.
10. See section 2.1.3, Copying and Assignment, for details.
11. See section 1.8, Synchronization of Streams, for details.
12. See section 2.3.1, Character Representations, for more information on character representations, and section
6.1.1, Character Classification, for more information on the ctype facet, to which the task of character transformation
is eventually delegated.
adds only new device-specific operations such as open() and close() for files, for
instance.
2.1.2.3 The Concrete Stream Classes
The concrete stream classes encapsulate formatted input and output to external devices.
They contain concrete stream buffer objects, which perform the actual access to the
respective device. There are two categories of concrete stream classes: (1) file streams and
(2) string streams.
They are derived from the general input and output stream classes and add func-
tions for opening and closing files (open() and close() ).
Internally, they contain and use a file buffer object to control the transport of characters
to/from the associated file. File buffers are specialized stream buffers that encapsulate
the knowledge about reading and writing to the underlying file system.15
These classes, too, are derived from the general input and output stream classes. The
source and destination of parsing and formatting is a string held in memory. This string
also serves as an internal buffer. This means that the string stream buffer and the external
device are identical. The string stream classes add functions for getting and setting the
string to be used as a buffer (str()).
14. File streams were previously explained in section 1.4, File Input/Output.
15. See also section 2.2.4, File Stream Buffers.
16. String streams were previously explained in section 1.5, In-Memory Input/Output.
Internally, a string stream buffer is used; this is a specialized stream buffer that
encapsulates reading and writing to the string in memory.17
The various stream classes offer different functionality for maintaining the stream
buffer depending on their responsibilities:
rdbuf() SUBTLETIES
Note the subtle difference between a concrete stream class’s rdbuf() function and the inherited version of rdbuf(): The inherited rdbuf() function basic_ios<charT,traits>::rdbuf() returns the pointer to the stream buffer that is maintained by the base class. The concrete stream class’s rdbuf() function returns a pointer to its contained stream buffer object.
For a newly constructed concrete stream object, both functions return the same pointer. However, after the stream buffer pointer is replaced by a call to
basic_ios<charT,traits>::rdbuf(basic_streambuf<charT,traits>* sb)
the base class’s rdbuf() function returns the newly set pointer, whereas the concrete stream class’s rdbuf() function keeps on returning the pointer to its contained stream buffer object.
Here is an example that illustrates the difference between the two versions of rdbuf():
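ifstream ifstr;                         // file stream with its own, contained file buffer
basic_filebuf<char> buf;
buf.open("in.txt", ios_base::in);       // a file buffer's open() requires an open mode
ifstr.basic_ios<char>::rdbuf(&buf);     // replace the stream buffer pointer
basic_streambuf<char>* bp;
bp = ifstr.rdbuf();                     // redefined rdbuf() returns pointer to contained, but
                                        // unused file buffer
bp = ifstr.basic_ios<char>::rdbuf();    // base class version of the rdbuf() function returns
                                        // pointer to newly assigned and actually used file buffer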
After replacement of the stream buffer pointer, the buffer object contained in the stream object is unused, because all operations are invoked via the stream buffer pointer, which now refers to a different stream buffer object.
Note, however, that both the member functions open() and close() of the file stream classes and the member function str() of the string stream classes are invoked via their overridden version of rdbuf(). The effect is that calls to these functions work only on the embedded stream buffer object, not on the stream buffer that was later assigned and is actually used. For this reason, calls to open(), close(), and str() after replacement of the stream buffer behave in an extremely counterintuitive way.
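The counterintuitive behavior of str() can be reproduced with a string stream. The following is a minimal sketch, not taken from the book; the helper name str_after_redirect is hypothetical. It redirects an output string stream to a separate string buffer, writes some text, and then calls str(), which still inspects the contained, unused buffer.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: writes 'text' to a string stream whose stream buffer
// pointer has been redirected to 'other' and returns what str() reports.
std::string str_after_redirect(std::stringbuf& other, const std::string& text)
{
    std::ostringstream out;
    // The concrete class hides rdbuf(sb); call the base class version explicitly.
    static_cast<std::ios&>(out).rdbuf(&other);
    out << text;                                  // the characters go to 'other'
    assert(out.rdbuf() != &other);                // hiding version: contained buffer
    assert(static_cast<std::ios&>(out).rdbuf() == &other);  // base class version
    return out.str();                             // works on the contained, unused buffer
}
```

Called with an empty std::stringbuf and the text "hello", the function returns an empty string, while the text can be retrieved from the external buffer via its own str() function.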
18. This is achieved by declaring the copy constructor and copy assignment operator of the stream base class
basic_ios private.
• void clear(iostate state = goodbit), which allows setting of the stream state.
Note that clear() can raise exceptions when the stream state is assigned.
• basic_streambuf<class charT, class Traits>* rdbuf() and basic_streambuf<class charT, class Traits>* rdbuf(basic_streambuf<class charT, class Traits>* sb) allow retrieval and setting of the stream buffer pointer.
Mind the difference described in the preceding chapter between the base class version of rdbuf() and the hiding version in the concrete stream classes. The base class version returns a pointer to the stream buffer object referred to by the stream buffer pointer; the derived class version returns a pointer to the embedded stream buffer object. After the stream buffer pointer is reset, both are different.
• basic_ios<class charT, class Traits>& copyfmt(const basic_ios<class charT, class Traits>& rhs) allows setting of all other data members to the values held by rhs.
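The remark that clear() can raise exceptions is easy to check. The sketch below is an illustration under stated assumptions (the function name clear_throws is hypothetical): it sets an exception mask on a fresh stream and then assigns a stream state via clear().

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Hypothetical helper: returns true if assigning 'state' via clear() raises
// ios_base::failure on a stream whose exception mask is 'mask'.
bool clear_throws(std::ios_base::iostate mask, std::ios_base::iostate state)
{
    std::istringstream in;
    assert(in.good());          // a fresh stream is good, so exceptions() cannot throw
    in.exceptions(mask);
    try {
        in.clear(state);        // assigns the state, then checks it against the mask
    } catch (const std::ios_base::failure&) {
        return true;
    }
    return false;
}
```

With the mask failbit, assigning failbit throws, whereas assigning eofbit does not; with the mask goodbit, nothing is thrown at all.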
The following function template shows the use of these functions in an example. Similar to the assignment operator or functions like strcpy(), its first parameter is the destination and its second parameter is the source:
19. Interestingly, stream buffers, which contain pointers to their get and put areas, can be copied and assigned; the semantics of this process are not even defined by the standard. As neither the copy constructor nor the copy assignment for any of the stream buffer classes is specified by the standard, they will most likely not be implemented, which means that the compiler-generated default functionality for copying and assignment will apply; that is, all pointers are copied. Two stream buffer objects that are copies of each other would operate on the same character array without any coordination. The results are likely to be unpredictable. Avoid inadvertent copies or assignments of stream buffer objects.
template<class Stream>
void streamcpy(Stream& dest, const Stream& src)
{
    // clear exception mask
    dest.exceptions(ios_base::goodbit);
    // copy stream state
    dest.clear(src.rdstate());
First, we must disable all exceptions because we want to invoke clear() in order to copy the stream state. We suppress all exceptions for the time being, because the function clear() might otherwise raise exceptions depending on the current exception mask. As we have not yet copied the exception mask of the source stream at this point, the old, incorrect exception mask would be taken into account. Raising exceptions according to the old exception mask would not be correct, and in order to avoid this effect, we disable all exceptions by calling exceptions(ios_base::goodbit). The exceptions will implicitly be enabled again when the exception mask of the source stream is later copied to the destination stream when copyfmt() is invoked.
After these preliminaries, we do the actual work and copy the stream state via rdstate() and clear(), the stream buffer pointer via rdbuf(), and eventually the rest of the stream data members (including the exception mask) via copyfmt().
Keep in mind that the issues that led to the prohibition of copy and assignment operations in the stream classes still apply. The stream buffer object, whose pointer is held by both copies of the stream, is a shared object. The caller of streamcpy() must make sure that lifetime and ownership issues are handled correctly. We generally do not recommend copying streams, even using a function like streamcpy(), unless you are sure you really need the copy and can handle all the resulting pitfalls. Here are two examples of how the streamcpy() function could be used and what the resulting surprises would be.
In the code below, we assign all properties of one string stream to another string stream by means of the streamcpy() function. Afterwards, the two string streams share the second stream’s buffer object so that output via both string streams goes to the same buffer:
As expected, the second string buffer will contain the combined output of both output operations, namely:
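The original listing for this example is not reproduced here. The following sketch is a reconstruction under stated assumptions, not the authors’ code: it shows one possible complete streamcpy() and applies it to two string streams. Two details are choices of this sketch. It calls the base class version of rdbuf() explicitly, because the concrete stream classes hide it, and it assigns the stream buffer pointer before the stream state, because basic_ios::rdbuf(sb) itself calls clear() and would otherwise reset a state copied earlier. The helper combined_output() is hypothetical.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Sketch of a complete streamcpy(): destination first, source second.
template <class Stream>
void streamcpy(Stream& dest, const Stream& src)
{
    typedef std::basic_ios<typename Stream::char_type,
                           typename Stream::traits_type> Base;
    dest.exceptions(std::ios_base::goodbit);   // clear exception mask
    // Share the buffer that src actually uses; rdbuf(sb) also calls clear().
    static_cast<Base&>(dest).rdbuf(static_cast<const Base&>(src).rdbuf());
    dest.clear(src.rdstate());                 // copy stream state
    dest.copyfmt(src);                         // format data and exception mask
}

// Hypothetical demonstration: both streams write into the second stream's buffer.
std::string combined_output()
{
    std::ostringstream first, second;
    streamcpy(first, second);
    first << "Hello ";
    second << "World";
    assert(first.str().empty());   // str() inspects first's own, now unused buffer
    return second.str();
}
```

In this sketch, combined_output() yields the text Hello World, which is the combined output of both insertions.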
ofstream log("log.txt");
streamcpy(cerr, static_cast<ostream&>(log));
As desired, any output to cerr will actually be written to the file log.txt that the file stream log represents. Here is the snag:
The stream buffer used by cerr after the call to streamcpy() is that of the file stream object log. Now, the standard streams, including cerr, are global objects that are destroyed after exit from main(). As a consequence, the file stream object log is destroyed before cerr is destroyed. Unfortunately, the destruction of cerr involves a flush() of the stream buffer, which is the shared stream buffer that belonged to the already destroyed file stream log. The attempted invocation of flush() accesses a destroyed object; typically, an access violation will occur.
As a rule of thumb, reassignment of the stream buffer pointer should be done with care. As the examples above illustrate, lifetime dependencies and surprises due to the overridden version of rdbuf() are likely to create problems.
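One way to keep the lifetime dependency under control is to restore the original stream buffer pointer before the borrowed buffer goes away. The class below is a minimal sketch of that idea; RdbufGuard and capture_cerr are hypothetical names and not part of the standard library or of the book.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: redirects a stream to another stream buffer and
// restores the original stream buffer pointer on destruction.
class RdbufGuard {
public:
    RdbufGuard(std::ios& stream, std::streambuf* replacement)
        : stream_(stream), old_(stream.rdbuf(replacement))
    {
        assert(replacement != nullptr);
    }
    ~RdbufGuard() { stream_.rdbuf(old_); }
    RdbufGuard(const RdbufGuard&) = delete;
    RdbufGuard& operator=(const RdbufGuard&) = delete;
private:
    std::ios& stream_;
    std::streambuf* old_;
};

// Captures what is written to cerr while the guard is alive.
std::string capture_cerr(const std::string& message)
{
    std::ostringstream log;
    {
        RdbufGuard guard(std::cerr, log.rdbuf());
        std::cerr << message;
    }   // cerr gets its own buffer back while 'log' is still alive
    return log.str();
}
```

Because the guard is destroyed before log, cerr never refers to a stream buffer that no longer exists.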
22. See section 2.3.1, Character Representations, which motivates the necessity of code conversions. See also sec-
tion 4.2.7, Character Encodings, for character encodings and code conversions in general.
23. Copying locale objects is a cheap operation. You need not be concerned that the numerous copies of locale
objects create any significant overhead. Copies of locales internally refer to the same container of facets. For
details see chapter 7, “The Architecture of the Locale Framework.”
[Figure 2-4: Relationship between stream classes, stream buffer classes, and the locale class. The figure shows the formatting layer (stream base classes with ios_base, general stream classes, concrete stream classes), the transport layer (stream buffer classes such as basic_filebuf), and a locale with its facets: numeric facets (num_get, num_put, and numpunct for char and wchar_t), ctype facets, codecvt facets, and other facets.]
INITIALIZATION. When a stream object is constructed, a copy of the current global C++
locale is used to initialize the stream’s locale. The same locale object is used to initialize
the stream buffer’s locale.
RETRIEVAL. A copy of the stream’s locale object can be obtained by calling the getloc() member function of the stream base class ios_base. Retrieval of the stream buffer’s locale object is via the getloc() member function of stream buffer base class basic_streambuf<class charT, class Traits>.
REPLACEMENT. The locale objects can be replaced independently of each other, or they
can both be consistently replaced by the same new locale.
For independent replacement the following functions have to be invoked: The function ios_base::imbue(const locale& loc) replaces the stream’s locale object. The stream buffer base class’s pubimbue() function replaces the stream buffer’s locale object.
For consistent replacement of both locale objects, the stream base class basic_ios<> defines an additional imbue() function that implicitly invokes the two functions for separate replacement. The subsequent section explains the details.
IMBUING LOCALES
In choosing the name imbue() for the locale replacement function in IOStreams, the standards committee coined a new technical term: imbuing a stream with a locale, which means replacing a stream’s locale object(s).
The straightforward way of imbuing a stream with a new locale is to replace both locale objects consistently, so that the change affects formatting, parsing, and code conversion. This is easily achieved by calling a stream’s imbue() function. Here is an example. If str is a stream object and loc a locale object, then str.imbue(loc); is a call to basic_ios<charT,Traits>::imbue(), which replaces both locales consistently.
If, for whatever reason, a new locale is to be set only for formatting and parsing, this can be done by calling ios_base::imbue() explicitly:
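((ios_base&)str).imbue(loc);
Conversely, a new locale can be set only for the stream buffer, and thus only for code conversion, by calling the stream buffer’s pubimbue() function:
if (str.rdbuf() != 0)
    str.rdbuf()->pubimbue(loc);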
a delicate task. Consider, for example, an input stream that has numeric data waiting to be
extracted and parsed. If the stream’s locale is replaced, the remaining numeric data might
no longer conform to the parsing rules defined by the locale, which might lead to surpris-
ing results when the data are eventually extracted.
There is no indication of whether or not it is safe and sensible to switch locales. Neither the stream nor the stream buffer can indicate this. Hence, proper replacement of a locale is completely under the control of the user code, and the user has to come up with a concept that makes locale changes safe.
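The three ways of imbuing described above can be compared side by side. The sketch below is illustrative only; the facet DotGrouping and the helper functions are hypothetical and exist merely to obtain a locale that differs from the default one.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: groups digits in threes, separated by '.'.
struct DotGrouping : std::numpunct<char> {
protected:
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; }
};

std::locale make_demo_locale()
{
    return std::locale(std::locale::classic(), new DotGrouping);
}

// Replaces only the stream's own locale (formatting and parsing).
void imbue_stream_only(std::ios& str, const std::locale& loc)
{
    static_cast<std::ios_base&>(str).imbue(loc);
}

// Replaces only the stream buffer's locale (code conversion).
void imbue_buffer_only(std::ios& str, const std::locale& loc)
{
    if (str.rdbuf() != nullptr)
        str.rdbuf()->pubimbue(loc);
}

// Replaces both locale objects consistently and formats a value.
std::string format_grouped(long value)
{
    std::ostringstream out;
    out.imbue(make_demo_locale());                    // basic_ios::imbue()
    assert(out.getloc() == out.rdbuf()->getloc());    // both locales agree
    out << value;
    return out.str();
}
```

After imbue_stream_only(), only getloc() of the stream reports the new locale; after imbue_buffer_only(), only getloc() of the stream buffer does.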
The general scheme for an inserter is retrieval of the stream’s locale via a call to getloc(), retrieval of the locale’s num_put facet, and invocation of the facet’s put() function. Here is a simplified version of an inserter for integral values:
The extractor does basically the same. It retrieves the stream’s locale, retrieves the locale’s num_get facet, and invokes the facet’s get() function. Here is a simplified version of an extractor for integral values:
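The original listings are not reproduced here. As a stand-in, the following sketch follows the scheme just described using free functions; insert_long, extract_long, to_text, and from_text are hypothetical names, and real inserters and extractors are member functions that additionally use sentry objects and error handling.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <iterator>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>

// Simplified inserter: stream's locale -> num_put facet -> put().
std::ostream& insert_long(std::ostream& os, long value)
{
    assert(os.rdbuf() != nullptr);   // ostreambuf_iterator needs a stream buffer
    const std::num_put<char>& fac = std::use_facet<std::num_put<char> >(os.getloc());
    fac.put(std::ostreambuf_iterator<char>(os), os, os.fill(), value);
    return os;
}

// Simplified extractor: stream's locale -> num_get facet -> get().
std::istream& extract_long(std::istream& is, long& value)
{
    const std::num_get<char>& fac = std::use_facet<std::num_get<char> >(is.getloc());
    std::ios_base::iostate err = std::ios_base::goodbit;   // ignored in this sketch
    fac.get(std::istreambuf_iterator<char>(is), std::istreambuf_iterator<char>(),
            is, err, value);
    return is;
}

std::string to_text(long value)
{
    std::ostringstream out;
    insert_long(out, value);
    return out.str();
}

long from_text(const std::string& text)
{
    std::istringstream in(text);
    long value = 0;
    extract_long(in, value);
    return value;
}
```

Note that the facet itself does not skip leading whitespace; in a real extractor that is the job of the sentry object.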
We do not want to discuss all of the arguments provided to the calls of the put() and get() functions. The details can be looked up in appendix A, Parsing and Extraction of Numerical and bool Values, appendix B, Formatting of Numerical and bool Values, and section 6.2.1, Numeric and Boolean Values. Certain principles, however, are worth exploring here, because they aid understanding of the intended ways of using parsing and formatting facets in general.
Consider the functions’ signatures:
iter_type put(iter_type s, ios_base& f, char_type fill, long v) const;
and
iter_type get(iter_type in, iter_type end, ios_base& f,
              ios_base::iostate& err,
              long& val) const;
In the following, we examine access to source and destination of input and output,
access to formatting information, and error reporting, in particular:
ITERATORS. The operations of parsing and formatting facets take iterators as arguments. put() functions take an output iterator that designates the destination for output; get() functions take an input iterator range that designates the character sequence to be parsed. Both operations return an iterator that points to the position after the sequence written to or read from.
FORMATTING INFORMATION. Parsing and formatting facets retrieve formatting information from an ios_base object that is provided as an argument to their operations.
ERROR INDICATION. get() functions store error information in an ios_base::iostate object; put() functions report failure indirectly via the iterator they return.
24. Stream buffer iterators are described in greater detail in section 2.4.3, Stream Buffer Iterators.
25. We are going to use the mathematical notation [ ), which indicates a half-open interval, whenever we need to
specify an iterator range.
The begin iterator is the stream buffer iterator pointing to the current position of the input sequence; it is created from the stream itself. This is possible because stream buffer iterators can be constructed from a stream itself by providing *this as the first argument. The implicit conversion mechanism for function arguments in C++ takes care of constructing the input stream buffer iterator from *this by calling the iterator’s converting constructor, which takes a reference to a stream as an argument.
The end iterator is an input stream buffer iterator designating the end of the input sequence. It is created via the default constructor for input stream buffer iterators, which by convention creates an end-of-stream iterator, that is, an iterator designating the position one step past the end of the sequence.
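The iterator range described above can also be used directly in user code. The following sketch (read_rest and rest_after_first_word are hypothetical helpers) builds the begin iterator from a stream and the end iterator via the default constructor.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Reads everything that is left in the stream's input sequence.
std::string read_rest(std::istream& is)
{
    std::istreambuf_iterator<char> begin(is);   // current position of the input sequence
    std::istreambuf_iterator<char> end;         // default-constructed: end of stream
    return std::string(begin, end);
}

// Extracts one word with operator>> and returns the unread remainder.
std::string rest_after_first_word(const std::string& text)
{
    std::istringstream in(text);
    std::string word;
    in >> word;
    return read_rest(in);
}
```

The range [begin,end) starts wherever the previous extraction stopped, so the remainder includes the blank that terminated the first word.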
This means that you can instantiate formatting facets so that they can output their
results to any kind of container, as long as the container is accessible via an output iterator.
Parsing facets do exactly the same, using input iterators for receiving the character
sequence to be parsed. Only when used in IOStreams do these facets use stream buffer
iterators and in this way achieve direct access to the stream buffer.
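As an illustration of formatting into a container, the sketch below instantiates num_put for a back_insert_iterator over std::string. The typedefs and function names are hypothetical. Since a facet instantiated for a non-default iterator type is not contained in any standard locale, the sketch installs it into a copy of the formatting object’s locale before use.

```cpp
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

typedef std::back_insert_iterator<std::string> StringOut;
typedef std::num_put<char, StringOut> StringNumPut;

// Formats 'value' directly into a std::string; 'fmt' only supplies the
// format flags, width, precision, and locale.
std::string format_to_string(std::ios_base& fmt, long value)
{
    std::locale loc(fmt.getloc(), new StringNumPut);   // locale owns the facet
    std::string result;
    std::use_facet<StringNumPut>(loc).put(std::back_inserter(result), fmt, ' ', value);
    return result;
}

// Uses a string stream purely as a carrier of formatting information.
std::string hex_text(long value)
{
    std::ostringstream fmt;                            // never written to
    fmt.setf(std::ios_base::hex, std::ios_base::basefield);
    fmt.setf(std::ios_base::showbase);
    return format_to_string(fmt, value);
}
```

No stream buffer is involved in the output at all; the characters are appended to the string through the output iterator.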
this reference to a locale object containing related facets is accessible via the ios_base&
argument to its operations.
This means that the ios_base object provided to parsing or formatting operations
should contain a compatible, if not identical, locale object. In the code fragment from the
example above
INDICATING ERRORS
Two different techniques are applied for indicating failure of a facet’s parsing or format-
ting operation.
The parsing facets take a reference to an ios_base::iostate object as an argument to their get() function. They set ios_base::failbit in case of parse error and ios_base::goodbit in case of success.
The formatting facets do not make provisions for error reporting. Instead, the iterator returned by their put() function can be checked for failure.
Let us revisit the previous examples of an extractor and an inserter for integral values. This time the error checks are added to the calls of put() and get().
The extractor provides a reference to an iostate object to the get() function and checks it for failure after parsing:
The inserter checks for failure of the formatting by checking the iterator that is returned by the put() function. Output stream buffer iterators have a failed() member function that reports whether output was possible:
Note that the error reporting of an inserter slightly differs from that of an extractor: The extractor sets the stream state to whatever the parsing facet’s get() function decided to report in the iostate object. This can be eofbit (if the end of the input sequence was found), failbit (if the extracted character sequence violates the parsing rules, i.e., does not have the format of an integer), and/or badbit (in case reading from the external device failed). The inserter, on the other hand, always sets badbit, because problems during output can occur only if writing to the external device fails.
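The two error-checking techniques can be sketched as follows. This is an illustration rather than the book’s listing: checked_extract, checked_insert, and RejectingBuf are hypothetical names, and the functions are written as free functions instead of stream members.

```cpp
#include <ios>
#include <istream>
#include <iterator>
#include <locale>
#include <ostream>
#include <sstream>
#include <streambuf>

// Extractor with error check: the facet reports via an iostate object.
std::istream& checked_extract(std::istream& is, long& value)
{
    std::istream::sentry ok(is);                 // skips whitespace, checks state
    if (ok) {
        std::ios_base::iostate err = std::ios_base::goodbit;
        std::use_facet<std::num_get<char> >(is.getloc())
            .get(std::istreambuf_iterator<char>(is), std::istreambuf_iterator<char>(),
                 is, err, value);
        is.setstate(err);                        // report whatever the facet decided
    }
    return is;
}

// Inserter with error check: failure is visible only in the returned iterator.
std::ostream& checked_insert(std::ostream& os, long value)
{
    std::ostream::sentry ok(os);
    if (ok) {
        std::ostreambuf_iterator<char> end =
            std::use_facet<std::num_put<char> >(os.getloc())
                .put(std::ostreambuf_iterator<char>(os), os, os.fill(), value);
        if (end.failed())
            os.setstate(std::ios_base::badbit);
    }
    return os;
}

// Hypothetical stream buffer without a put area: every write attempt fails.
struct RejectingBuf : std::streambuf {};
```

Parsing the text abc sets failbit but not badbit, whereas inserting into a stream that uses a RejectingBuf sets badbit.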
Stream buffers are used by streams for actual transport of characters to and from a
device, whereas the streams themselves are responsible for parsing and formatting the
text input and output.
The following sections describe first the principles of the stream buffer abstraction
in general and then the concrete mechanisms for each of the derived stream buffer classes.
We concentrate on the main functionality of stream buffers, namely input, output, and
putback. Other aspects such as positioning and locale management are omitted, but can
be looked up in the reference part of this book if needed.
The protected nonvirtual interface of a stream buffer class provides operations that manipulate one or both of the internal sequences. Such operations
• retrieve the values of the pointers (the get area’s begin_, next_, and end_pointer via eback(), gptr(), egptr(), and the put area’s begin_, next_, and end_pointer via pbase(), pptr(), epptr()),
• alter the value of the pointers (by assigning new pointers via setg(), setp(), or by incrementing the next_pointer via gbump(), pbump()).
The public interface is built on top of the protected interface and is used by the
stream layer to implement its operations. The stream buffer’s public interface includes
operations for extraction and insertion of characters from/to the get/put area, stream
positioning, and other functionality:
• extract characters from the get area (sgetc(), sgetn(), sbumpc(), etc.)
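How the protected and the public interface relate can be seen in a tiny derived class. The following sketch is an assumption-laden illustration; ArrayBuf and its member functions are hypothetical. It sets up a get area over a caller-owned array with setg() and exposes the pointer arithmetic behind eback(), gptr(), egptr(), and gbump().

```cpp
#include <cassert>
#include <cstddef>
#include <streambuf>

// Hypothetical read-only stream buffer over a caller-owned character array.
class ArrayBuf : public std::streambuf {
public:
    ArrayBuf(char* data, std::size_t size)
    {
        setg(data, data, data + size);   // begin, next, and end pointer of the get area
    }
    // Characters already consumed: next pointer minus begin pointer.
    std::ptrdiff_t consumed() const { return gptr() - eback(); }
    // Characters still available: end pointer minus next pointer.
    std::ptrdiff_t remaining() const { return egptr() - gptr(); }
    // Skips n characters by incrementing the next pointer directly.
    void skip(int n)
    {
        assert(n >= 0 && n <= remaining());
        gbump(n);
    }
};
```

The public functions sgetc() and sbumpc() work on this get area without any further code, because they are implemented in the base class on top of the protected interface.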
26. The list of stream buffer operations is not meant to be complete. Only the most important and typical func-
tions are listed. For a complete description of the stream buffer base class’s interface, see the reference section.
Also, section 3.4, Adding Stream Buffer Functionality, provides more details on the protected interface.
• The stream buffer base class’s destructor is public and virtual, as is usual for a class that is designed to serve as a base class.
• The stream buffer base class has only one constructor, which is a protected default constructor. This is to ensure that only derived stream buffer objects may be constructed. The concrete stream buffer classes, of course, have public constructors.
Neither the copy constructor nor the copy assignment for any of the stream buffer
classes is specified by the standard. In particular, it is not required that they are inaccessi-
ble. They will most likely not be implemented at all, which means that the compiler-
generated default functionality for copying and assignment will apply. As a consequence,
stream buffers, which contain pointers to their get and put areas, can be copied and
assigned, meaning that the internally held pointers will be copied. Two stream buffer
objects that are copies of each other would operate on the same character array without
any coordination. The results are likely to be unpredictable. For this reason, avoid inad-
vertent copies or assignments of stream buffer objects.
Let’s return to the stream buffer’s core functionality and look at the principles of
handling character input and output in the stream buffer classes.
BASE CLASS. For the stream buffer base class, basic_streambuf, underflow() is in a nonoperational mode; its implementation returns traits::eof(), which indicates that the end of the stream is reached. Any useful behavior of underflow() fully depends on the characteristics of the external device, and underflow() is well defined for the derived stream buffer classes, which redefine this virtual function. The functionality of uflow() is that of underflow() plus advancing the read position.
STRING BUFFER. A string buffer cannot make additional characters available from an external device, because string streams are not connected to an external character sequence. A string stream buffer can make characters available for reading only when they have previously been stored in the internal buffer, for instance, as a result of a previous output operation. Such characters are made accessible by adjusting the get area pointers; more precisely, the get area’s end pointer must be moved forward to include additional positions. This pointer adjustment can be done in underflow() or uflow() as part of an input operation. Alternatively, it can be performed during overflow() as part of an output operation. The standard allows both implementations.
FILE BUFFER. A file buffer’s underflow() function makes additional characters available by reading new characters from the file. It then converts them to the internal character representation (if necessary), writes the result of the conversion into the get area, and returns the first newly read character.
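The refill principle of underflow() can be shown with a toy stream buffer. In the sketch below, which is not how a real file buffer is implemented, the external device is simulated by a std::string and the get area holds only four characters; ChunkedInBuf and read_all are hypothetical names, and no code conversion takes place.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <iterator>
#include <streambuf>
#include <string>

// Hypothetical input buffer: reads from a simulated device in chunks of at
// most 4 characters into a small internal get area.
class ChunkedInBuf : public std::streambuf {
public:
    explicit ChunkedInBuf(const std::string& device)
        : device_(device), pos_(0), refills_(0) {}
    int refills() const { return refills_; }
protected:
    int_type underflow() override
    {
        if (gptr() < egptr())                          // characters still available
            return traits_type::to_int_type(*gptr());
        const std::size_t n =
            std::min<std::size_t>(sizeof area_, device_.size() - pos_);
        if (n == 0)
            return traits_type::eof();                 // device exhausted
        device_.copy(area_, n, pos_);                  // "read" from the device
        pos_ += n;
        ++refills_;
        setg(area_, area_, area_ + n);                 // install the new get area
        return traits_type::to_int_type(*gptr());      // first newly read character
    }
private:
    std::string device_;
    std::size_t pos_;
    int refills_;
    char area_[4];
};

// Reads the whole input sequence through the public interface.
std::string read_all(std::streambuf& sb)
{
    std::istream in(&sb);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Only underflow() is redefined; the inherited uflow() calls it and then advances the read position, as described for the base class.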
27. Section 2.2.3, String Stream Buffers, explains in greater detail why this is.
Then the character passed to overflow() as an argument is added to the put area, and the get area’s end pointer might be adjusted to include this new character.
FILE BUFFER. A file buffer makes positions in its internal buffer available by writing to the external file. To be precise, it converts the characters contained in the put area to the external character representation (if necessary) and writes the result of the conversion to the file. After that it puts the character that was received as an argument to overflow() into the (fully or partly) emptied put area, unless it was equal to end-of-file.
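The same principle for output can be sketched with a toy stream buffer whose device is a std::string and whose put area holds four characters. ChunkedOutBuf is a hypothetical class for illustration; a real file buffer additionally performs code conversion and handles write errors.

```cpp
#include <streambuf>
#include <string>

// Hypothetical output buffer: collects characters in a 4-character put area
// and transfers them to a simulated device when the area is full.
class ChunkedOutBuf : public std::streambuf {
public:
    ChunkedOutBuf() { setp(area_, area_ + sizeof area_); }
    const std::string& device() const { return device_; }
protected:
    int_type overflow(int_type c) override
    {
        flush_area();                                   // empty the put area
        if (!traits_type::eq_int_type(c, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(c);     // store the pending character
            pbump(1);
        }
        return traits_type::not_eof(c);                 // anything but eof means success
    }
    int sync() override
    {
        flush_area();
        return 0;
    }
private:
    void flush_area()
    {
        device_.append(pbase(), pptr());                // "write" to the device
        setp(area_, area_ + sizeof area_);              // whole put area is free again
    }
    std::string device_;
    char area_[4];
};
```

After five calls to sputc() the device holds the first four characters; the fifth one sits in the put area until the next overflow or a call to pubsync().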
28. The adjustment of the get area’s end pointer might alternatively be deferred to the next input operations and would then be performed during underflow().
2.2 The Stream Buffer Classes 91
STRING BUFFER. For string stream buffers, only the functionality (1) of pbackfail(), storing a character in the input sequence, is implemented. The next_pointer is decreased, and if the character to be put back is not the previously extracted one, the new character is stored at that position.
Functionality (2), making available additional putback positions, does not make sense for a string stream buffer. Putback positions are available only if characters have previously been extracted from the string. When there are no previously extracted characters, pbackfail() cannot make any available either.
FILE BUFFER. For file stream buffers, functionality (1), storing a character in the input sequence, is implemented in the same way as for string stream buffers. The next_pointer is decreased, and if the character to be put back is not the previously extracted one, the new character is stored at that position. A file buffer might fail to actually store the character, because the associated file was opened only for input and does not allow write access.
Functionality (2), making available additional putback positions, is implementation-dependent. For a file stream buffer it is conceivable that additional putback positions are made available by reloading characters from the external file. The standard, however, does not specify any implementation details.
The subsequent two sections describe the behavior of the string buffer and the file buffer in terms of an example. We explain in detail how input and output sequence, the internal character buffer, and the get and put areas are related to each other for these two derived classes. The third section describes the principle of the putback area, which is basically the same for string buffers and file buffers.
In order to show the principles, we make assumptions about the implementation of
these classes. Standard compatible implementations, however, are allowed to differ and
may work in a slightly different way than demonstrated in the following. Still, the general
principles will be the same. The implementations of string buffers and file buffers over-
ride the virtual functions discussed above in order to achieve the results that we are going
to describe. In the following, we do not aim to explain exactly how each of the virtual
functions is redefined, but we intend to explain the overall net effect. Details of how to
redefine which of the virtual functions, and under which circumstances, are discussed in
section 3.4, Adding Stream Buffer Functionality.
29. Details of a typical implementation are described in section 2.2.4, File Stream Buffers.
[Figure: String stream buffer with its put area and get area, each delimited by a begin pointer, a next pointer, and an end pointer.]
In this example the capacity of the internal character buffer is 16 characters, which is
utterly unrealistic for real implementations. We do this on purpose, in order to keep the
example simple yet demonstrate the crucial case of what happens if the buffer is full or
empty.
The character sequence Hello World\n has been written to the output sequence, and the pointers of the put area are assigned in the following way:
begin_pointer to the beginning of the character array
next_pointer to the next empty position behind the text written to the output
sequence
end_pointer to the next position behind the character array
The character sequence Hello has already been read from the input sequence, and
the pointers of the get area are assigned in the following way:
begin_pointer to the beginning of the character array
next_pointer to the next position behind the text already read from the input
sequence
end_pointer to the same position as the put area’s next_pointer, because it is
not possible to read text that has not already been written
Let us discuss the effect of input and output operations on the string stream buffer
starting from the situation described above.
OUTPUT
“NORMAL” SITUATION. In this situation we write an additional character to the string stream buffer. The put area’s next_pointer refers to the next available position in the put area. Hence the additional character is put to the position the put area’s next_pointer refers to. Afterwards the next_pointer is incremented, so that it points to the next available position.
“OVERFLOW” SITUATION. If we keep on adding characters to the string stream buffer, the put area will eventually be full. When the internal buffer is full, the put area’s next_pointer points to the end of the buffer area, i.e., next_pointer == end_pointer. Figure 2-8 illustrates this situation:
[Figure 2-8: String stream buffer with a full put area; the put area’s next_pointer equals its end_pointer.]
This situation is special, because the internal buffer is full. If we want to write an additional character, the string stream buffer needs to make available a new position in the put area. This is achieved by calling overflow(). The function overflow() acquires a new character array that can hold more characters. Figure 2-9 shows the situation after the call to overflow():
[Figure 2-9: String stream buffer after the call to overflow(); the put area now refers to a larger character array.]
Afterwards the new character is put into the new position in the put area and the put area’s next_pointer is incremented as always.
INPUT
During all these output operations on the string stream buffer the input area basically did not change. After the reallocation of the internal buffer due to the overflow(), the get area’s pointers are reassigned to the same positions relative to each other.
[Figure 2-10: String stream buffer after reallocation of the internal buffer; the buffer holds Hello World\n, and the get area’s pointers keep their relative positions.]
“NORMAL” SITUATION. If we read a character from the string stream buffer, we receive the character that the get area’s next_pointer refers to. Considering the situation in figure 2-10, this is a whitespace character. Afterwards the get area’s next_pointer is incremented.
“UNDERFLOW” SITUATION. Let us assume that we keep on extracting characters from the string stream buffer and there is no intervening insertion; i.e., the put area does not change. We will ultimately reach the end of the get area, i.e., next_pointer == end_pointer, as shown in figure 2-11.
If we now try to read a new character from the string stream buffer, underflow() is called in order to make additional characters available for reading. underflow() adjusts the get area’s end_pointer so that it points to the same position as the put area’s next_pointer. In this way, all previously written characters are made available for subsequent read attempts. If all previously written characters have already been read and the get area’s end_pointer equals the put area’s next_pointer, underflow() fails. In the situation shown in figure 2-11, additional characters can be made available, and underflow() adjusts the get area’s end_pointer as shown in figure 2-12:
[Figure 2-12: String stream buffer after underflow() has adjusted the get area’s end_pointer.]
DISCLAIMER. The model explained above is just one of many ways to implement a string buffer. As an alternative, overflow() could allocate a new buffer area that holds exactly one additional position and adjusts not only the put area’s pointers but also the get area’s end_pointer. In this way each character written is immediately available for reading, without any pointer adjustment performed via underflow() as in the example above. In this alternative model, underflow() need not be redefined at all. Naturally, this solution is less efficient than the one described before, because the internal character buffer is always full and must be reallocated for each single character written to the string stream.
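Whichever model an implementation chooses, the net effect observable through the public interface is the same. The following sketch exercises a std::stringbuf accordingly; write and drain are hypothetical helper functions.

```cpp
#include <sstream>
#include <string>

// Writes 'text' to the put area via the public interface.
void write(std::stringbuf& sb, const std::string& text)
{
    sb.sputn(text.data(), static_cast<std::streamsize>(text.size()));
}

// Extracts all characters that can currently be made available for reading.
std::string drain(std::stringbuf& sb)
{
    typedef std::stringbuf::traits_type Traits;
    std::string result;
    for (;;) {
        const Traits::int_type c = sb.sbumpc();
        if (Traits::eq_int_type(c, Traits::eof()))
            break;                       // nothing written that has not been read
        result += Traits::to_char_type(c);
    }
    return result;
}
```

Everything written so far can be read back, a further read attempt fails until more characters are written, and output that exceeds the current capacity simply enlarges the internal buffer.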
PUTBACK
Figure 2-13 shows a typical situation in which a number of characters have already been read from the input sequence. In this situation, characters can be put back to the input sequence.
Only the get area is relevant to our discussion of the putback support; the pointers of the put area are not affected at all. The string Hello has been extracted, and the get area’s next_pointer points to the next available read position. If a character is now requested via sbumpc(), the next character (the blank between Hello and World\n) is extracted and afterwards the next_pointer points to the character W.
“NORMAL” SITUATION. Let us see what happens if we then call sungetc(), with the intention of putting back the just extracted character, which was the blank. In this case the get area’s next_pointer is simply decremented and points to the blank again. The next extraction would again return the blank character, which means that the previous extraction was reversed by the call to sungetc(). A further call to sungetc() would decrement the next_pointer even further and make available the character o for a subsequent read operation.
“PBACKFAIL” SITUATION. What if, in that situation, sputbackc('l') is called instead of sungetc()? The function sungetc() is supposed to make available the character o, whereas sputbackc('l') should overwrite the character o and put back the character l in its position. As the character that is put back is different from the character that was extracted from this position, the function pbackfail() is called, and pbackfail() performs the write access to the get area and overwrites the character o. The situation after sputbackc('l') looks like the one in figure 2-14.
Figure 2-14: String stream buffer after putting back the character l.
ANOTHER “PBACKFAIL” SITUATION. We can keep on putting back characters via sputbackc()
or sungetc() until we hit the beginning of the get area, as shown in figure 2-15.
The next attempt to put back a character triggers pbackfail(), which is supposed
to make further putback positions available. The get area’s next_pointer cannot be decremented
any further, and pbackfail() indicates failure. Only if characters are read from
the get area will putback positions become available again.
Note that the put area’s pointers are not affected by any of the putback operations.
However, overwriting characters in the get area by means of sputbackc() changes the
content of the internal buffer, much like an output operation. The modifications will be
visible when the content of the string buffer is retrieved via str(), for instance.
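Both "pbackfail" situations, and the dependence on the open mode mentioned in the footnote, can be observed with std::stringbuf. This is a sketch under the assumption that the library's string buffer follows the standard's description of pbackfail(); the helper names are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

typedef std::stringbuf::traits_type traits;

// Extracts "Hello" and the blank, un-gets the blank, and then puts back '1'
// at the position from which 'o' was read. `accepted` reports whether
// sputbackc() succeeded; the return value is the content seen via str().
std::string content_after_sputbackc(std::ios_base::openmode mode,
                                    bool& accepted) {
    std::stringbuf buf("Hello World\n", mode);
    for (int i = 0; i < 6; ++i)
        buf.sbumpc();
    buf.sungetc();                                   // back to the blank
    accepted = !traits::eq_int_type(buf.sputbackc('1'), traits::eof());
    return buf.str();
}

// Tries to put back a character while the get area's next_pointer still
// stands at the beginning of the get area.
bool putback_possible_at_begin() {
    std::stringbuf buf("Hello World\n", std::ios_base::in);
    return !traits::eq_int_type(buf.sungetc(), traits::eof());
}
```

With in | out the overwritten character shows up in str(); with in only, the write access is refused and the content is unchanged.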
30. Positions in the internal sequence are overwritten only if the stream buffer’s open mode allows it. A stream
buffer whose open mode does not include output mode will not allow any write access to the internal sequence.
31. Only in rare situations, when the file size is less than or equal to the buffer size, can the internal buffer hold
the whole file.
Figure 2-15: String stream buffer with the putback position available.
respectively, or whether there is a shared internal character array for both areas. The
assumed sample implementation we present in the following sections is one of a variety
of conceivable implementations of a file stream buffer. Your particular implementation
might have implemented a different scheme.
In our assumed implementation, the file stream buffer maintains only one internal
character array, which is of a fixed size and too small to hold the entire content of the
external character sequences. For this reason, the internal character array holds only a
subsequence of the input sequence in the get area and a subsequence of the output
sequence in the put area. Logically, both the put and get areas are present simultaneously;
in practice only one of them can be active at a time, because the file stream buffer has only
one internal character buffer: During output operations, the internal character array rep-
resents the put area, and the get area is inactive; during input operations, the internal
character array represents the get area, and the put area is inactive.
The respective inactive area does logically exist, but it may not be immediately
accessible. If, for instance, the get area is active, no output operation should be triggered,
because it would need access to the currently inactive put area. An output operation can
only follow an input operation if the file is repositioned in between, which puts the file
stream buffer into a neutral state, from which it can reactivate the put area and make its
content available in the internal buffer.
Let us first explore input and output separately before we discuss the scheme for
exchanging the get and put areas while switching from input to output and vice versa.
OUTPUT
Initially, neither the put nor the get area is available. An area is considered unavailable
when its next_pointer is zero. The begin_pointer and the end_pointer are undefined when
the next_pointer is zero; they can also be zero or have any other arbitrary value. The con-
tent of the internal character buffer is undefined, too, in this situation; it might be empty,
filled with garbage, or not even allocated. Figure 2-16 shows this neutral situation.
Any output request in that neutral situation triggers overflow(), which activates
the put area, places the first character into the internal character buffer, and adjusts the
put area’s pointers. Afterwards, the internal buffer area is filled with the remaining char-
acters that were passed to the output operation, and the next_pointer is advanced accord-
ingly. Figure 2-17 shows the situation after output of the string Hello World\n.
If we keep on writing output to the file stream buffer, the put area’s next_pointer
will eventually hit the end_pointer. Then overflow() is called again in order to
make available additional put positions. overflow() achieves this by transferring data
from the internal buffer via code conversion (if necessary) to the external file. It is
implementation-dependent whether all or only parts of the data in the internal buffer are
transferred to the external file. The standard requires only that overflow() make
“enough” positions available in the buffer; it does not specify how many positions. For
our sample implementation, we assume that the entire internal character buffer is written
to the external file. Afterwards, overflow() stores the first character in the internal
character buffer and adjusts the put area pointers as shown in figure 2-18.
Now there is plenty of room in the put area for further output, and the output
request that triggered overflow() can be completed.
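The output scheme of the sample implementation can be sketched with a small user-defined stream buffer. Everything in this sketch is an assumption made for illustration: the class name SmallOutBuf, the buffer size of eight characters, and the use of a std::string in place of the external file. Code conversion is omitted.

```cpp
#include <cassert>
#include <streambuf>
#include <string>

// Hypothetical output-only stream buffer with one small, fixed-size
// character array. A std::string stands in for the external file.
class SmallOutBuf : public std::streambuf {
public:
    // No setp() here: the put area stays unavailable (neutral state)
    // until the first call to overflow().
    explicit SmallOutBuf(std::string& external) : external_(external) {}

protected:
    int_type overflow(int_type c) override {
        transfer();                               // write the entire put area
        setp(buffer_, buffer_ + sizeof buffer_);  // (re)activate an empty put area
        if (!traits_type::eq_int_type(c, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(c);  // store the first character
            pbump(1);
        }
        return traits_type::not_eof(c);
    }

    int sync() override {
        transfer();
        if (pptr())
            setp(buffer_, buffer_ + sizeof buffer_);
        return 0;
    }

private:
    void transfer() {
        if (pptr())
            external_.append(pbase(), pptr());
    }

    char buffer_[8];
    std::string& external_;
};
```

Writing Hello World\n through sputn() first activates the put area, then fills it; the second overflow() transfers the first eight characters, and the remainder reaches the "file" only when the buffer is flushed via pubsync().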
The character sequence that is transferred from the internal character buffer to the
external file during overflow() is placed into successive locations on the external file
starting at the current external file position. Where the external file position indicator
stands depends on the circumstances.
Immediately after a file stream buffer is connected to an external file (via open()),
the external file position indicator is either at the beginning of the file, which is the default
situation, or at the end of the file, if the open mode included the at-end flag.
After preceding output operations (via sputc(), sputn()), the external file position
indicator stands where the last output operation left it.
After an explicit repositioning of the stream position (via seekoff(), seekpos()),
the external file position indicator is reset to a corresponding position in the external file.
If the open mode includes the append flag, the external file position indicator stands
at the end of the file and cannot be repositioned to any other position.
INPUT
Input, like output, starts with a neutral situation, in which neither get nor put areas are
active. Figure 2-19 shows this neutral situation.³²
An input request in this situation triggers underflow() in order to make available
get positions for reading. This is achieved by transferring data from the external file via
code conversion (if necessary) to the internal character buffer. It is implementation-dependent
whether underflow() fills the entire internal buffer or only a part of it with
characters transferred from the external file. In our sample implementation we assume
that underflow() fills the entire internal buffer if possible. The get area is activated, and
the get area’s pointers are adjusted. Figure 2-20 shows the situation after the invocation of
underflow().
This is the situation after requesting the first character from the file stream via
sgetc(). Had we extracted the character via sbumpc() instead of sgetc(), uflow()
would have been called instead of underflow(), with basically the same result. The
only difference would be that the get area’s next_pointer would be advanced by one position
and point to the next available read position.
32. Whether the initial neutral state exists in practice is implementation defined. An implementation can also
activate the get area right away and fill it with characters transferred from the external file before any actual
input request.
If we keep on requesting input from the file stream buffer, the get area’s next_pointer
will eventually hit the end of the internal buffer. underflow() or uflow() will
then be triggered again. These operations discard the current content of the internal character
buffer and transfer the next sequence of characters from the external file into the
internal buffer.
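The input scheme can be sketched in the same spirit. The class below is a hypothetical illustration, not the file stream buffer of any particular library: a std::string plays the role of the external file, the internal buffer holds eight characters, and a public counter records how often underflow() is entered.

```cpp
#include <cassert>
#include <cstddef>
#include <streambuf>
#include <string>

// Hypothetical input-only stream buffer that refills a small internal
// character array from a std::string standing in for the external file.
class ChunkInBuf : public std::streambuf {
public:
    explicit ChunkInBuf(const std::string& external) : external_(external) {}

    int underflow_calls = 0;   // how often underflow() has been entered

protected:
    int_type underflow() override {
        ++underflow_calls;
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        // Discard the consumed get area and transfer the next chunk.
        std::size_t n = external_.copy(buffer_, sizeof buffer_, filepos_);
        if (n == 0)
            return traits_type::eof();
        filepos_ += n;
        setg(buffer_, buffer_, buffer_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    std::string external_;
    std::size_t filepos_ = 0;  // current external file position
    char buffer_[8];
};
```

A first sgetc() activates the get area through underflow(); a repeated sgetc() finds the character already available. Reading the twelve characters of Hello World\n with sbumpc() then triggers one further refill through uflow() and a final call that reports end-of-file.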
The character sequence that is transferred from the external file to the internal character
buffer during underflow() or uflow() is taken from successive locations on the
external file starting at the current external file position. Where the external file position
indicator stands depends on the circumstances.
Immediately after a file stream buffer is connected to an external file (via open()),
the file position indicator is either at the beginning of the file, which is the default situation,
or at the end of the file, if the open mode included the at-end flag.
After preceding input operations (via sgetc(), sbumpc()), the external file position
indicator stands where the last input operation left it.
After an explicit repositioning of the stream position (via seekoff(), seekpos()),
the external file position indicator is reset to a corresponding position in the external file.
After output, the file stream must be flushed or repositioned before any input
is permitted.
After input, the file stream must be repositioned before any output is allowed,
unless the preceding input operations have reached end-of-file, in which case
output can immediately follow input.
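The two rules can be followed in a program roughly as sketched below. The function name overwrite_after_read and the scratch file passed to it are assumptions made for this example; the file is created and removed by the function itself.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Creates a small file, reads five characters, repositions, overwrites one
// character, repositions again, and reads the first line back.
// `path` names a hypothetical scratch file chosen by the caller.
std::string overwrite_after_read(const char* path) {
    {
        std::ofstream create(path);
        create << "Hello World\n";
    }   // closing the stream flushes the content to the file

    std::string line;
    {
        std::fstream file(path, std::ios_base::in | std::ios_base::out);
        char word[5];
        file.read(word, 5);                          // input: "Hello"
        std::fstream::pos_type here = file.tellg();
        file.seekp(here);                            // reposition before output
        file.put('_');                               // output: replaces the blank
        file.seekg(0, std::ios_base::beg);           // reposition before input
        std::getline(file, line);
    }
    std::remove(path);
    return line;
}
```

Each switch of direction is preceded by a seek, so the file stream buffer always passes through its neutral state before the other area is activated.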
In our example, where the file stream buffer has only one internal character array,
which represents either the put or the get area, the file stream buffer must exchange the
get and put areas with every switch between input and output operations. Again, the fol-
lowing explanations are based on our sample implementation; your particular implemen-
tation might work differently.
Figure 2-22: File stream buffer in neutral state after flush or repositioning.
and the get area’s pointers are adjusted accordingly. The character sequence transferred
from the external file starts at the current external file position. Depending on whether the
preceding operation was a flush or a repositioning, the external file position is either the
last write position or the position to which the file position indicator was repositioned.
Figure 2-23 shows the situation after a successful input operation.
tion reached the end of the file. Otherwise, before any output operation can follow, the file
stream must be repositioned.
Reaching the end of the file during input puts the file stream buffer into its neutral
state, because the entire file content has been consumed, and further input is not possible
without any intervening output or repositioning. For that reason, the content of the inter-
nal character buffer can be discarded and both areas deactivated. As expected, the file
position indicator of the external file stands at the end of the external file in this case.
Repositioning, too, involves the file stream buffer’s discarding the content of its
internal character buffer and putting itself into the neutral state, in which both areas are
inactive. The file position indicator of the external file is reset accordingly, which affects
only the external file but has no immediate effect on the get or put areas.
No matter whether the file stream is repositioned or whether the preceding input
operation has reached the end of the file, the file stream buffer is put into its neutral state,
as shown in figure 2-24.
An output operation in this situation works as described earlier for output in gen-
eral: First, overflow() is invoked, which activates the put area. Then the respective
character sequence that was passed to the output operation is stored in the internal buffer
area, and the put area’s pointers are adjusted. Figure 2-25 shows the situation after suc-
cessful output of Hello World\n.
DISCLAIMER. The explanations given above regarding the management of a file
stream buffer’s put and get areas are not to be taken literally. An implementation is free to
achieve the same effect in a different way. In particular, the neutral state can be expressed
in a different way, but it always exists logically. The neutral state serves as the initial state
of a file stream buffer, but it is also logically reached when input operations hit the end of
the file or when the stream position is reset. A file stream buffer may also put itself into the
neutral state for other reasons, such as error situations. How the neutral state is expressed
or how exactly an implementation of a file stream buffer uses its internal character
buffer(s) to represent the put and get areas is an implementation detail left open by the
standard.
Figure 2-24: File stream buffer in neutral state after repositioning or reaching end of file during
input.
PUTBACK
Putting back characters to the input sequence via sungetc() or sputbackc() can be
successful only following preceding input operations. Let us consider such a situation. As
a result of the preceding input operations, the get area is active, and the file stream buffer
might look like the one shown in figure 2-26.
Putting back the previously read character means decrementing the get area’s
next_pointer. Putting back a character different from the previously read one means
decrementing the get area’s next_pointer and storing the different character at that loca-
tion in the internal character buffer. pbackfail() is responsible for this write access to
the get area. The write access will be rejected if the file stream buffer is not connected to an
open file. Figure 2-27 shows the situation after three previously read characters have suc-
cessfully been put back.
If we keep on putting back characters, we will eventually hit the begin_pointer. Then
the next_pointer cannot be decreased any further, and pbackfail() is triggered in order
to make further putback positions available. What pbackfail() does in such a situation
is implementation-dependent. In our example, the attempt to put back any further charac-
ters will fail, because we consider it unusual that a large number of characters is put back
into the input sequence, and for that reason we do not support it. Alternatively, a file
character buffer reserved as putback positions in our sample implementation. The number
of putback positions a file stream buffer reserves, if any, is implementation defined. In
our sample implementation, underflow() or uflow() copies the last four characters of
the consumed get area to the first four locations of the internal character buffer before
filling the rest of the internal buffer with characters transferred from the external file.
Figure 2-29 shows the file stream buffer after invocation of underflow().
Now it is possible to put back four characters into the get area even if it has just been
refilled from the external sequence.
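Reserving putback positions during a refill can be sketched as follows. Again, this is a hypothetical illustration rather than the code of an actual file stream buffer: the class name PutbackInBuf, the buffer size of twelve characters, and the std::string that stands in for the external file are assumptions; only the number of four putback positions is taken from the sample implementation described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <streambuf>
#include <string>

// Hypothetical input stream buffer that reserves the first four locations
// of its internal character array as putback positions.
class PutbackInBuf : public std::streambuf {
public:
    explicit PutbackInBuf(const std::string& external) : external_(external) {}

protected:
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        if (filepos_ == external_.size())
            return traits_type::eof();

        // Copy up to kPutback already consumed characters to the front.
        std::size_t keep = 0;
        if (gptr() != nullptr) {
            keep = static_cast<std::size_t>(gptr() - eback());
            if (keep > kPutback)
                keep = kPutback;
            std::memmove(buffer_ + kPutback - keep, gptr() - keep, keep);
        }
        // Fill the rest of the buffer from the external sequence.
        std::size_t n =
            external_.copy(buffer_ + kPutback, kSize - kPutback, filepos_);
        filepos_ += n;
        setg(buffer_ + kPutback - keep,      // begin_pointer
             buffer_ + kPutback,             // next_pointer
             buffer_ + kPutback + n);        // end_pointer
        return traits_type::to_int_type(*gptr());
    }

private:
    static const std::size_t kPutback = 4;
    static const std::size_t kSize = 12;
    std::string external_;
    std::size_t filepos_ = 0;
    char buffer_[kSize];
};
```

After the first eight characters of Hello World\n have been consumed, the next sgetc() refills the buffer, and the four most recently read characters can still be put back one by one; a fifth attempt fails because the begin_pointer has been reached.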
Figure 2-29: File stream buffer after underflow(), showing the reserved putback positions.
In general, putting back characters is possible only if the get area is active, which
means that for bidirectional file streams putback cannot immediately follow an output
operation. The same rules as for input following output apply, that is, the file stream must
be flushed or repositioned before any characters can be put back into the input sequence
after an output operation.
If an output operation is performed after putting characters back into the input
sequence, the entire get area, including the putback positions, is discarded to make room
for the put area. As a result, any changes made to the putback positions are lost.
WHAT IS A CHARACTER?
Before we discuss the data representations used in IOStreams in further detail, we want to
take a look at character representations in general. Many issues regarding characters are
frequently confused. First, there is the abstraction of a character. This is what we intuitively
associate with a shape. It has numerous properties: visual representations (glyphs),
binary representations (codes), and many more. Then we deal with characters inside our
C++ programs. Here the character is an object. Like other objects in our program, character
objects are instances of a certain type and have an individual object state. The most com-
monly known character types in (C and) C++ are type char for narrow characters and
type wchar_t for wide characters. A character object's state is the content of the character,
i.e., its binary representation. It is the bit pattern stored inside a storage unit of type char
or wchar_t, for instance. The content of a character is also called a character code. A code
usually belongs to a character encoding, which is a set of character codes along with rules
for their interpretation. In this book we talk of character objects. Keep in mind that when-
ever we mention a “character” in the following text, we mean a “character object inside a
C++ program.”
char      ASCII
char      EBCDIC
wchar_t   Unicode
33. For further information on character encodings, please refer to section 4.2.7, Character Encodings.
Let us now return to the character representations in IOStreams and find out which
type and encoding they have.
int i = 0;  // some integral value
cout << "Number of elements is: " << i;
In this example a character sequence and an integral value are passed from the pro-
gram to the formatting layer.
In both cases the data representation depends on the programming environment’s
internal encoding. It is the compiler that decides whether an integral value, for instance,
has a 32-bit or 64-bit binary representation, or whether a string literal is represented as a
sequence of one-byte ASCII characters or two-byte Unicode characters. With regard to
character representations, the C++ programming language supports the data types char
for narrow characters and wchar_t for wide characters. However, both the size of these
types and the encoding of narrow and wide characters may vary between different pro-
gramming environments.
In sum, the native character representation is that of characters or character sequences
exchanged between the program and the formatting layer, while its character type and
encoding are determined by the programming environment.
SINGLE-BYTE FILES
CHARACTER TYPE. Single-byte text files contain characters of a one-byte character
encoding. A narrow file stream extracts data from the file in portions of one byte each.
One-byte characters can be stored in units of type char. Hence, the type of characters
exchanged between the transport layer and a single-byte file is char.
CHARACTER ENCODING. By default, the character encoding of characters stored in the
single-byte text file is supposed to be the same as used inside IOStreams; i.e., it is the pro-
gramming environment’s native encoding. Single-byte text files accessed via IOStreams
in a programming environment that internally encodes characters in ASCII, for instance,
are supposed to contain ASCII characters. This is the default for narrow file streams in
IOStreams.
However, file streams are designed to be flexible and adaptable. A narrow file stream
can also handle single-byte files that contain different character encodings. If the encoding
of characters contained in a text file differs from the native encoding, a conversion between
the internal and the external encoding is performed. File streams delegate such code
conversions to their locale object, or to be precise, to their stream buffer’s locale’s code
conversion facet. Hence, by imbuing a narrow file stream with a locale that has an appropriate
code conversion facet, the file stream can be made capable of handling EBCDIC files in an
ASCII environment, for example. Such narrow-character code conversion facets are not
provided by the standard library, though, and have to be supplied by other means.
34. Input and output to wide-character files, such as Unicode files, are not directly supported by IOStreams.
In sum, the external character representation of narrow file streams is that of units
transferred to and from a single-byte text file. Its character type is char, and the encoding
depends on the stream’s code conversion facet. By default the encoding is the program-
ming environment’s native encoding.
MULTIBYTE FILES
CHARACTER TYPE. Multibyte files contain characters in a multibyte encoding. Unlike
the characters of one-byte or wide-character encodings, multibyte characters do not all have
the same size. A single multibyte character can have a length of 1, 2, 3, or more bytes.
Obviously, none of the built-in character types, char or wchar_t, is large enough to hold any
character of a given multibyte encoding. For this reason, multibyte characters contained
in a multibyte file are chopped into units of one byte each. The wide-character file stream
extracts data from the multibyte file byte by byte, interprets the byte sequence, finds out
which and how many bytes form a multibyte character, identifies the character, and trans-
lates it to a wide-character encoding.
Due to the decomposition of the multibyte characters into one-byte units, the type of
characters exchanged between the transport layer and a multibyte file is char.
CHARACTER ENCODING. The encoding of characters exchanged between the transport
layer and a multibyte file can be any multibyte encoding. It depends wholly on the con-
tent of the multibyte file. As wide-character file streams internally represent characters as
units of type wchar_t encoded in the programming environment’s wide-character
encoding, a code conversion is always necessary. The code conversion is performed by
the stream buffer’s code conversion facet. There is no default conversion defined. It all
depends on the code conversion facet contained in the stream buffer’s locale object, which
initially is the current global locale.
In sum, the external character representation of wide-character file streams is that of
the units transferred to and from a multibyte file. Its character type is char, and the
encoding depends on the stream’s code conversion facet.
SUMMARY
Let’s summarize the character representations used in IOStreams:
The native character representation is that of characters or character sequences
exchanged between the program and the formatting layer. Its character type and encod-
ing are determined by the programming environment’s internal conventions.
The internal character representation is that of the units produced and consumed by
the formatting layer and is identical to the representation of the units buffered in the
transport layer. Its character type is determined by the stream’s character type charT,
and the encoding is the programming environment’s internal encoding for narrow and
wide characters.
Character sequences in the internal representation are produced and consumed by
the formatting layer, and in particular they are produced and consumed by the formatting
and parsing facets of the stream’s locale. Hence, a charT stream has a charT stream
buffer and uses the charT numeric and ctype facets of its locale.
Character sequences in the internal representation are also produced and consumed
by the stream’s code conversion facet. A charT file stream uses a code conversion facet of
type codecvt<charT, char, stateT>.³⁵
The external character representation of file streams is that of the units transferred to
and from a single-byte or multibyte text file.³⁶ Its character type is char; the encoding
varies and depends on the file’s encoding. The stream buffer’s code conversion facet con-
verts between the external and internal character representation.
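The code conversion facets mentioned in this summary can be inspected directly through a locale object. The sketch below only queries the facets of a given locale; the typedef and function names are assumptions made for this example, and std::mbstate_t is used as the state type.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>

// Facet type consulted by a narrow file stream buffer (charT is char).
typedef std::codecvt<char, char, std::mbstate_t> NarrowCodecvt;
// Facet type consulted by a wide-character file stream buffer.
typedef std::codecvt<wchar_t, char, std::mbstate_t> WideCodecvt;

// True if the narrow facet of `loc` performs no conversion at all.
bool narrow_conversion_is_noop(const std::locale& loc) {
    return std::use_facet<NarrowCodecvt>(loc).always_noconv();
}

// Number of external chars per internal character, as reported by the
// narrow facet of `loc`.
int narrow_external_char_size(const std::locale& loc) {
    return std::use_facet<NarrowCodecvt>(loc).encoding();
}

// True if `loc` contains a wide-character code conversion facet.
bool has_wide_codecvt(const std::locale& loc) {
    return std::has_facet<WideCodecvt>(loc);
}
```

For the classic locale the narrow facet is the non-converting one, which matches the default described above: the external encoding of a narrow file stream equals the native encoding.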
35. For an explanation of the state type stateT, see section 2.3.2.1.3, Conversion State.
36. Transport of units of type wchar_t is not supported by any of the concrete streams in the standard library.
its type differs from the character type: the end-of-file value’s type is defined as a type
nested in the character traits called int_type, whereas the character type is a different
type: char_type. For convenience reasons each stream class contains a nested typedef
int_type, defined as traits::int_type.
The end-of-file value and its type are used by numerous operations of IOStreams.
Typically, stream operations receive or return characters, which can either be valid charac-
ters of type char_type or the end-of-file value of type int_type. In order to handle
both cases, these functions receive or return values of type int_type, which is large
enough to hold valid characters and the end-of-file value.
This leads to situations in which one needs to translate between char_type and
int_type. For this purpose, the character traits provide the conversion functions
to_char_type() and to_int_type().
Here is an example that demonstrates their use. The stream buffer’s member func-
tion sgetc() is invoked. It returns the next available character, or the end-of-file value if
no characters are available. In the example below, the returned value is checked and con-
verted to its character equivalent in case it is different from the end-of-file value.
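A minimal sketch of such a check, written as a function template so that it works for any character type and traits type, might look as follows. The function name next_char is hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <streambuf>
#include <string>

// Looks at the next available character without consuming it.
// Returns false if sgetc() delivered the end-of-file value instead of a
// character; otherwise stores the character in `result`.
template <class charT, class traits>
bool next_char(std::basic_streambuf<charT, traits>& sb, charT& result) {
    typename traits::int_type value = sb.sgetc();
    if (traits::eq_int_type(value, traits::eof()))
        return false;                        // no character available
    result = traits::to_char_type(value);    // convert int_type to char_type
    return true;
}
```

The same template can be used with narrow and wide stream buffers, because all comparisons and conversions go through the traits type.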
Note that two values of type int_type cannot simply be compared by means of the
built-in equality operator, because int_type can be any arbitrary type. Instead they are
compared via the eq_int_type() member function provided by the character traits.³⁷
Similarly, two character values are compared by means of the traits member function
eq().
37. In the Classic IOStreams this would simply have been a comparison of two integral values. Also, the explicit
conversion from int_type to char_type was not needed in Classic IOStreams, because the compiler auto-
matically converted int to char.
38. See section 4.2.7.5, Code Conversion, for a detailed explanation of state-dependent code conversions.
fied position together with an offset. The functions for retrieving stream positions are
tellg() and tellp().
ABSOLUTE STREAM POSITIONS. An absolute stream position is a position that has been
obtained by a previous successful call to a tell function. When a seek function is provided
with an absolute stream position, it alters the stream buffer so that the provided position
becomes the current position. Positions that have not been obtained by previous success-
ful calls to a tell function on the same external device are not valid and lead to undefined
results.
The type of absolute stream position is defined in the stream’s character traits as
pos_type. It is a type that is able to hold all the information necessary to reposition a
stream buffer, which in the case of file streams includes a code conversion state. This is
because file streams are designed to perform state-dependent code conversions. (See sec-
tion 2.3.2.1.3, Conversion State.) Associated with each position is not only the actual infor-
mation about the position itself but also the information about the conversion state at that
position.
The position type of a file stream is always fpos<Traits::state_type>, where
fpos<stateT> is a predefined class template in IOStreams for defining position
types that carry a conversion state. This requirement is mandatory only for file streams
because they are the only streams that perform code conversions and hence need to maintain
a conversion state. Other types of stream can have a different position type.
SPECIFIED STREAM POSITIONS. A specified stream position is either the beginning of
the external sequence, the current position, or the end. Specified stream positions are of
type ios_base::seekdir, and the three options are represented by the constants
ios_base::beg, ios_base::cur, and ios_base::end.
A specified position is always accompanied by an offset. Offsets represent a signed
displacement, measured in characters, from a specified position within the external char-
acter sequence. When a seek function is provided with a specified position and an offset,
it repositions the stream buffer by a displacement calculated in terms of bytes as the product
of character size[39] and offset. The direction of the repositioning is determined by the
offset’s value. If the value is positive, the current position moves toward the stream’s end;
if it is negative, it moves toward the beginning.
Not all external sequences can be repositioned. If the external sequence is a multibyte
sequence, it is impossible to calculate the displacement in bytes from the offset,
because each character can be of different size. Another typical exception is external
sequences that are connected to display devices.
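As an illustration of seeking with a specified position and an offset, consider the following sketch on a string stream, where every character occupies exactly one unit. The helpers char_at and probe are hypothetical names of our own.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Reads one character at a signed offset from a specified position.
// Returns '\0' if repositioning or reading fails (our own convention).
inline char char_at(std::istream& in, std::istream::off_type off,
                    std::ios_base::seekdir dir)
{
    in.clear();              // make sure an earlier failure does not block the seek
    in.seekg(off, dir);
    char c = '\0';
    if (!in.get(c))
        return '\0';
    return c;
}

// Probes three positions of `text`, one for each specified position.
inline std::string probe(const std::string& text)
{
    std::istringstream in(text);
    std::string out;
    out += char_at(in, 2, std::ios_base::beg);    // two characters after the beginning
    out += char_at(in, 1, std::ios_base::cur);    // skip one character from the current position
    out += char_at(in, -1, std::ios_base::end);   // one character before the end
    return out;
}
```

For the input "abcdefgh" the three calls yield 'c', 'e', and 'h': the second seek starts from the position reached after reading 'c', and the negative offset moves backward from the end.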
39. The character size of an encoding can be obtained via the code conversion facet’s member function
encoding(). If encoding() > 0, then the returned value is the character size. Otherwise, the encoding does not have
a fixed character size, and positioning is not possible.
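The check described in this footnote might be sketched as follows. The function name has_fixed_char_size is our own; the sketch looks only at the codecvt<char, char, mbstate_t> facet, which is the one a narrow-character file stream would use.

```cpp
#include <cwchar>   // std::mbstate_t
#include <locale>

// Returns true if the code conversion facet for char in `loc` reports a
// fixed character size, so that an offset can be translated into bytes.
inline bool has_fixed_char_size(const std::locale& loc)
{
    typedef std::codecvt<char, char, std::mbstate_t> facet_type;
    return std::use_facet<facet_type>(loc).encoding() > 0;
}
```

For the classic locale the narrow-character facet performs no conversion at all and reports a character size of 1.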
istringstream buf;
// fill string stream
// read input until a position of interest and memorize the position
istringstream::pos_type p = buf.tellg();
if (p != istringstream::pos_type(-1))
{ // tellg() was successful
    // read further input
}
// return to the point of interest
buf.seekg(p);
// return to beginning of stream
buf.seekg(0, ios_base::beg);
For convenience, each stream class contains the nested typedefs pos_type and
off_type, defined as traits::pos_type and traits::off_type.
40. See section G.4, Template Specialization, in appendix G for an explanation of template specialization in
general.
The traits type parameter of string and IOStreams classes has a default value, so that
users of these classes need not specify the traits type argument. The default value for the
traits type naturally depends on the character type. It is a specialization of the predefined
character traits class template char_traits. In this way the predefined specializations
for char and wchar_t are used as defaults, and even for user-defined types there is a
natural default value.
Suppose you define a new character type myChar. Then you have to
provide an associated traits type. If you define it as a specialization of the char_traits
template, i.e., as char_traits<myChar>, the default applies, and a myChar file
stream is of type basic_fstream<myChar>. Alternatively, you could give the
traits type a name of its own, say myCharTraits. In that case the traits type argument
cannot be omitted; i.e., you have to say basic_fstream<myChar,
myCharTraits> instead of just basic_fstream<myChar>. For this reason it is recommended
that you define the traits type associated with a character type as a specialization
of the char_traits template. There is only one situation in which you would want to
define traits types that are not specializations of the char_traits template: when you
define more than one traits type for the same character type.
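A well-known case of a second traits type for an existing character type is case-insensitive comparison for char. The following is only a sketch under that assumption; ci_char_traits, ci_string, and same_ignoring_case are names we made up for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical second traits type for char: compares case-insensitively.
// Everything not redefined here is inherited from char_traits<char>.
struct ci_char_traits : std::char_traits<char>
{
    static char up(char c)
    { return static_cast<char>(std::toupper(static_cast<unsigned char>(c))); }

    static bool eq(char a, char b) { return up(a) == up(b); }
    static bool lt(char a, char b) { return up(a) < up(b); }

    static int compare(const char* s1, const char* s2, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }

    static const char* find(const char* s, std::size_t n, char a)
    {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], a)) return s + i;
        return 0;
    }
};

// The traits argument cannot be omitted here, because the default would be
// char_traits<char>.
typedef std::basic_string<char, ci_char_traits> ci_string;

inline bool same_ignoring_case(const char* a, const char* b)
{
    return ci_string(a) == ci_string(b);
}
```

With this traits type, ci_string("Hello") and ci_string("hELLO") compare equal, whereas the corresponding std::string objects do not.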
41. Note that some implementations require even more of a character type, such as
• a conversion to the character type char to allow conversion of a character of user-defined type to its
corresponding one-byte character code, or ‘\0’ if no such code exists
• a conversion from the character type char to convert a char value to its corresponding character code
of the user-defined character type
• an operator!=() to compare elements
• a guarantee that construction from the character ‘\n’ yields the end-of-line character
Implementations with such additional requirements are not strictly standard conforming.
int_type get()
{
    int_type c;
    if (!_Ok) c = traits::eof();
    else {... = rdbuf()->sbumpc(); ... }
    return (c);
}
You can see that the first get() function has to return a value of type traits::int_type,[42]
because the return value is either the end-of-file value, which is of type
traits::int_type, or a character of type charT. The second get() function demonstrates
the need for conversions between both types and for comparison of values of those
types.
IOStreams classes additionally use the traits members that relate to stream position-
ing and code conversion.
Note that the standard does not guarantee that IOStreams classes restrict themselves
to the traits member typedefs and functions related to the end-of-file value and stream
positioning and code conversion. They are also allowed to make use of member functions
like compare(), assign(), etc. Similarly, strings could theoretically use eof(),
not_eof(), pos_type, off_type, etc., although this is unlikely in practice. The principle
is that strings and IOStreams are permitted to rely on the full set of properties that are
required of character types and their traits.
42. See section G.7, The typename Keyword, in appendix G for further information.
char_type ct = *beg;
char c;
if (ct == use_facet<numpunct<charT> >(iob.getloc()).decimal_point())
    c = '.';
bool discard =
    (ct == use_facet<numpunct<charT> >(iob.getloc()).thousands_sep()
     &&
     use_facet<numpunct<charT> >(iob.getloc()).grouping().length() != 0);
As you can see, the comparison of a character of type charT is performed using an
operator==() for that character type. This explains why facets impose additional
requirements on the character type.
An interesting side effect is that IOStreams classes generally use the traits::eq()
function for comparison of characters, but for parsing and formatting numeric values
they use operator==(), because the parsing and formatting of numeric values are
delegated to the numeric facets of the stream’s locale. We’ve seen above that facets do not use
character traits. It follows that you should implement an operator==() for a user-defined
character type that has the same semantics as the eq() function of the character type’s
traits. One problem remains: You can have only one operator==() for a given
character type, but several character traits types associated with that character type. What
if the traits types have different eq() functions? IOStreams might yield “interesting”
results under these circumstances. However, this problem is unlikely to occur in practice.
Consider also that IOStreams classes do not only rely on certain properties of the
character type and the character traits type, but additionally require facets for that charac-
ter type. IOStreams needs the numeric facets, as we’ve mentioned above. It also needs
conversions between the character type and the built-in type char, which have to be
defined in the ctype facet in the form of member functions narrow() and widen().
Character classification functions from the ctype facet are also needed in order to identify
whitespace characters. A code conversion facet is needed for file streams. It turns out
that eventually you have to provide all standard facets for a new character type, because
facets are generally allowed to be interdependent.
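To give an impression of what IOStreams asks of the ctype facet, here is a small sketch for the predefined character type wchar_t. The helper names count_spaces and round_trip are our own.

```cpp
#include <locale>

// Counts the whitespace characters in a wide string, using only the ctype
// facet of the given locale for classification.
inline int count_spaces(const wchar_t* s, const std::locale& loc)
{
    const std::ctype<wchar_t>& ct = std::use_facet<std::ctype<wchar_t> >(loc);
    int n = 0;
    for (; *s != L'\0'; ++s)
        if (ct.is(std::ctype_base::space, *s))
            ++n;
    return n;
}

// Converts a char to wchar_t and back via widen() and narrow().
inline char round_trip(char c, const std::locale& loc)
{
    const std::ctype<wchar_t>& ct = std::use_facet<std::ctype<wchar_t> >(loc);
    wchar_t w = ct.widen(c);
    return ct.narrow(w, '?');   // '?' is returned if no char equivalent exists
}
```

A user-defined character type would need a ctype facet that offers the same operations before a stream of that type could skip whitespace or read numbers.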
To sum up, character types are used as template arguments to strings, IOStreams,
and facets. Such character types have to meet certain requirements (listed in section
2.3.2.1, Requirements for a Character Traits Type) and must be accompanied by both an
associated character traits type (described in section 2.3.2, Character Traits) and the stan-
dard facets for that character type (described in section 6.3.1.4, Mandatory Facet Types).
The member function begin() yields an iterator that refers to the first element of
the list, while end() yields an iterator one past the end of the list. The find algorithm
traverses this iterator range, checking whether any of the elements of the list matches the value
specified by the last parameter (see implementation of find below). If it reaches the iterator
specified by the second parameter without finding a match, it returns this iterator,
which refers past the end of the list. Otherwise, when find has found a matching element
in the list, it returns an iterator referring to this element.
return first;
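Since the generic implementation is central to the discussion that follows, here is a minimal sketch of how a linear search over an iterator range is commonly written. The names my_find and index_in_sample are our own; an actual library implementation may differ in detail.

```cpp
#include <list>

// Sketch of a generic linear search (named my_find to avoid clashing with
// std::find). It works on any iterator range [first, last).
template <class InputIterator, class T>
InputIterator my_find(InputIterator first, InputIterator last, const T& value)
{
    while (first != last && !(*first == value))
        ++first;
    return first;   // equals last if no element matched
}

// Usage helper: position of `value` in the sample list {3, 5, 7},
// or -1 if my_find returned the past-the-end iterator.
inline int index_in_sample(int value)
{
    std::list<int> l;
    l.push_back(3);
    l.push_back(5);
    l.push_back(7);

    std::list<int>::iterator it = my_find(l.begin(), l.end(), value);
    if (it == l.end())
        return -1;

    int idx = 0;
    for (std::list<int>::iterator p = l.begin(); p != it; ++p)
        ++idx;
    return idx;
}
```

Note that the function template uses nothing but !=, ++, and * on the iterators, which is why it can be applied to any iterator type that offers these operations.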
There are two aspects in find that are typical for standard library algorithms and
their relation to iterators: (1) the way iterator ranges are used and (2) the way algorithms
are implemented generically based on the iterators’ types.
We will explore both aspects in greater detail now. Let’s start with iterator ranges.
ITERATOR RANGES
An iterator range is specified by two iterators. The first indicates the beginning of the
range and the second one the end. All iterator positions in that range can be reached by
consecutively applying ++ (either postfix or prefix) to the first iterator until ++ yields the
end iterator. The end iterator is excluded from the iterator range (denoted as
[first, last)). It need not even refer to a valid container element; it only has to be
reachable. This past-the-end iterator can be used to indicate failure: When an algorithm
normally returns a valid iterator from the iterator range as the result of its task, it can
return the past-the-end iterator to indicate that it failed to accomplish the task. The find
algorithm shown above does so when it cannot find the specified value. Hence, iterator
ranges specify sequences of elements that an algorithm can step through, and the end iter-
ator can also be used by an algorithm as an error indication.
ITERATOR CATEGORIES
The standard library classifies the iterator types into five categories according to their
interfaces, as shown in figure 2-31:
[Figure 2-31: The five iterator categories: input and output, forward, bidirectional, random access]
Note that figure 2-31 does not show inheritance relationships. Iterator categories are
just abstractions, which represent a set of requirements to an iterator’s interface, listed
briefly below:
Input iterators allow algorithms to advance the iterator and give “read only”
access to the value.
Output iterators allow algorithms to advance the iterator and give “write
only” access to the value.
Forward iterators combine read and write access, but only in one direction
(i.e., forward).
ifstream str("my_text_file");
istream_iterator<string> beginIter(str);
istream_iterator<string> endIter;
This solution uses, on one hand, the IOStreams functionality: A stream is constructed
from a file name, and formatted input is read from the stream via the
istream_iterator. On the other hand, it uses STL components: The generic count
algorithm determines how often the word “the” can be found inside the iterator range
[beginIter, endIter).
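A complete variant of this word count might look as follows. To keep the sketch independent of a particular file, it reads from a string stream instead of the file stream used above; the function names count_word and count_word_in are our own.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Counts occurrences of `word` among the whitespace-separated tokens that
// can be extracted from `in`.
inline std::ptrdiff_t count_word(std::istream& in, const std::string& word)
{
    std::istream_iterator<std::string> beginIter(in);
    std::istream_iterator<std::string> endIter;      // end-of-stream iterator
    return std::count(beginIter, endIter, word);
}

// Same as above, but for text held in a string.
inline std::ptrdiff_t count_word_in(const std::string& text,
                                    const std::string& word)
{
    std::istringstream in(text);
    return count_word(in, word);
}
```

Because count_word takes a reference to the stream base class, it works in the same way for an ifstream constructed from a file name.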
The second example we want to look at deals with the problem that standard library
containers do not support any stream I/O directly. So the question is, What is the best
way to print a container? Here is a solution that prints all container elements separated by
a blank. The approach is based on the ostream_iterator and the copy algorithm:
list<int> myList;
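A self-contained sketch of this approach is shown below. The function names print_list and numbers_to_string are our own, and the output goes to a string stream so that the result can be inspected.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <sstream>
#include <string>

// Writes all elements of the list to `out`, each followed by a blank.
inline void print_list(std::ostream& out, const std::list<int>& l)
{
    std::copy(l.begin(), l.end(), std::ostream_iterator<int>(out, " "));
}

// Fills a list with 1..n and returns its printed representation.
inline std::string numbers_to_string(int n)
{
    std::list<int> myList;
    for (int i = 1; i <= n; ++i)
        myList.push_back(i);

    std::ostringstream out;
    print_list(out, myList);
    return out.str();
}
```

Note that the delimiter is written after every element, including the last one, so the output for three elements is "1 2 3 " with a trailing blank.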
Both examples show that stream iterators allow algorithms to see a stream as a con-
tainer of homogeneous elements. Standard library algorithms can apply their functional-
ity to the stream in the same way as they would do to any other standard library
container. We will explore how these examples work in the following section as we take a
detailed look at the stream iterators and how they relate to IOStreams.
public:
typedef charT char_type;
typedef traits traits_type;
typedef basic_ostream<charT,traits> ostream_type;
private:
const charT* delim;
basic_ostream<charT,traits>* ost;
};
ERROR INDICATION
The ostream_iterator has no specific feature to indicate that the insertion of an object
into the stream failed or caused an error. For error detection, only the error indication
mechanisms of the underlying stream are available. (For details of new stream features
such as IOStreams exceptions, see section 1.3, The Stream State.) The best idea is perhaps
to set badbit, eofbit, and failbit in the output stream’s exception mask before the
ostream_iterator is used. The stream will then throw an ios_base::failure
exception when an error occurs. Alternatively, the stream’s state can be
checked after the ostream_iterator has been used, to see if an error occurred.
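The exception-based strategy might be sketched like this. In order to provoke an error we use a stream buffer that rejects every character; failing_buf and the other names are hypothetical helpers of our own.

```cpp
#include <algorithm>
#include <ios>
#include <iterator>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <vector>

// A stream buffer that rejects every character; used only to provoke errors.
class failing_buf : public std::streambuf
{
protected:
    virtual int_type overflow(int_type) { return traits_type::eof(); }
};

// Returns true if copying through an ostream_iterator raised
// ios_base::failure on a stream that uses the given stream buffer.
inline bool copy_throws(std::streambuf& sb)
{
    std::ostream out(&sb);
    out.exceptions(std::ios_base::badbit | std::ios_base::eofbit |
                   std::ios_base::failbit);
    std::vector<int> v(3, 42);
    try {
        std::copy(v.begin(), v.end(), std::ostream_iterator<int>(out, " "));
        out.flush();
    } catch (const std::ios_base::failure&) {
        return true;      // the iterator itself reported nothing; the stream did
    }
    return false;
}

inline bool failing_case() { failing_buf sb;     return copy_throws(sb); }
inline bool working_case() { std::stringbuf sb;  return copy_throws(sb); }
```

Without the call to exceptions(), the copy algorithm would run to completion silently, and the failure could be detected only by checking the stream state afterwards.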
public:
    typedef charT char_type;
    typedef Traits traits_type;
    typedef basic_istream<charT,Traits> istream_type;

    istream_iterator() : istp(0) {}
    istream_iterator(istream_type& s) : istp(&s) { readElem(); }

    istream_iterator& operator++()
    {
        readElem();
        return *this;
    }
    istream_iterator operator++(int)
    {
        istream_iterator tmp = *this;
        readElem();
        return tmp;
    }
private:
    void readElem()
    {
        if (istp != 0)
            if (!(*istp >> value))
                istp = 0;
    }
    basic_istream<charT,Traits>* istp;
    T value;
};
ERROR INDICATION
What happens if readElem() tries to extract the next object of type T from the stream
and none is available? By IOStreams convention, the failbit is set in the stream state to
indicate that an extraction has failed. After each extraction readElem() checks the
stream state. If the stream state is not good() anymore, the private member istp is set to
0, which indicates that the iterator is detached from its stream. As a consequence, the
stream iterator cannot be used to extract any further objects. An istream_iterator
that is in this state is called an end-of-stream iterator. This name might be a bit misleading,
because the iterator becomes an end-of-stream iterator whenever the stream state is not
good(), i.e., when either failbit, badbit, eofbit, or a combination of them has been
set. An input iterator that turns into an end-of-stream iterator either signals an error or
indicates that the end of the input stream was reached. As with the output stream iterator,
we have to resort to the underlying stream’s error indication mechanisms to distinguish
between these two situations.
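One way of telling the two situations apart is sketched below. The struct sum_result and the function sum_ints are our own inventions; the sketch assumes that a set eofbit means the input was exhausted, which is a simplification, because a malformed token at the very end of the input can set eofbit as well.

```cpp
#include <iterator>
#include <numeric>
#include <sstream>
#include <string>

// The sum of all values read before the iterator turned into an
// end-of-stream iterator, and the reason why it stopped.
struct sum_result
{
    int  sum;
    bool reached_end;   // true: end of input; false: extraction error
};

inline sum_result sum_ints(const std::string& text)
{
    std::istringstream in(text);
    std::istream_iterator<int> first(in), last;
    sum_result r;
    r.sum = std::accumulate(first, last, 0);
    // The iterator cannot tell us why it stopped; the stream state can.
    r.reached_end = in.eof() && !in.bad();
    return r;
}
```

For the input "1 2 3" the sum is 6 and the end of the input was reached; for "1 2 x 3" the sum is 3, and the stream state shows failbit without eofbit, which tells us that the extraction of "x" failed.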
As an end iterator, we need an iterator that is reachable from the beginning of the
range; that is, successive increments of the first iterator must eventually yield the second
iterator.
A word on the while-condition: We silently assumed the existence of an (in)equal-
ity operator for istream_iterators. How is that defined? The standard requires the
following semantics:
• Two end-of-stream iterators of the same type are always equal.
• An end-of-stream iterator is not equal to a non-end-of-stream iterator.
• Two non-end-of-stream iterators are equal when they are constructed from the
same stream.
43. In the implementation shown above, the end-of-stream iterator state is expressed by the fact that the private
member istp is 0. While this is a valid and efficient implementation (which is also used in all standard library
implementations we know of), the standard allows implementation of the end-of-stream state indication in any
appropriate way.
From the third requirement it follows that two istream_iterators that are equal
do not necessarily refer to the same stream position. One might intuitively expect such a
property, because it is true for pointers and container iterators. Note that it is not true for
input stream iterators.
Back to our problem: How do we express an iterator range of
istream_iterators? For the begin iterator we can simply use the istream_iterator
constructed from the input stream. It represents the current stream position. If we succes-
sively increment this iterator, which means that we successively extract items from the
stream, we will eventually hit the end of the stream. For the end iterator we therefore need
an input stream iterator in end-of-stream state. How do we get one? The
istream_iterator’s default constructor creates it.
Note that the only input stream iterator ranges are from the current stream position
to the end of the stream. It is not possible to specify a range from one stream position to
another stream position, because any two non-end-of-stream iterators referring to the
same stream always compare equal.
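The three equality rules can be observed directly. The following sketch, with the made-up function name equality_rules_hold, checks them on a small string stream.

```cpp
#include <iterator>
#include <sstream>

// Checks the three equality rules for istream_iterators.
inline bool equality_rules_hold()
{
    std::istringstream in("10 20 30");
    std::istream_iterator<int> a(in);       // non-end-of-stream iterator
    std::istream_iterator<int> b(in);       // another one on the same stream
    std::istream_iterator<int> end1, end2;  // end-of-stream iterators

    bool ends_equal     = (end1 == end2);
    bool end_differs    = (a != end1);
    bool same_stream_eq = (a == b);         // equal, although they need not
                                            // refer to the same element
    return ends_equal && end_differs && same_stream_eq;
}
```

The comparison a == b yields true only because both iterators were constructed from the same stream; it says nothing about the stream position.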
By comparing an istream_iterator to an end-of-stream iterator, it is possible to
detect if a stream error has occurred. Yet istream_iterators have no feature to reset
an iterator that has gone into an end-of-stream state. If the error is not fatal, we can do the
following: clear the stream’s error state; construct a new istream_iterator, which
then represents the current stream position; and restart the algorithm with this iterator as
the begin iterator.
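The recovery steps just listed could be sketched as follows. The function name read_ints_skipping_garbage is our own, and discarding the offending token is only one conceivable way of handling a non-fatal error.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Collects all ints from `text`, skipping tokens that are not numbers.
inline std::vector<int> read_ints_skipping_garbage(const std::string& text)
{
    std::istringstream in(text);
    std::vector<int> result;
    const std::istream_iterator<int> end;
    for (;;) {
        std::istream_iterator<int> it(in);   // represents the current position
        for (; it != end; ++it)
            result.push_back(*it);
        if (in.bad() || in.eof())
            break;                           // fatal error or end of input
        in.clear();                          // non-fatal: clear the error state
        std::string garbage;
        in >> garbage;                       // discard the offending token
    }
    return result;
}
```

For the input "1 2 x 3 4" the first iterator delivers 1 and 2 and then turns into an end-of-stream iterator; after the error state is cleared and "x" is discarded, a newly constructed iterator delivers 3 and 4.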
2.4.2.3 Stream Iterators Are One-Pass Iterators
Stream iterators are one-pass iterators; that is, an element referred to by a stream iterator
can only be accessed once. For instance, it is not possible to reread elements through a
memorized iterator. The following would fail: memorizing the begin position of the
stream, then extracting elements from the stream, and later trying to reread the first element
through the memorized begin iterator.[44] The reason is that once extracted, the element
is consumed and cannot be reread.
The single-pass property can best be understood in terms of I/O from/to a terminal.
Once we’ve read from the terminal stream, the input is consumed. Once we've written to
the terminal stream, we cannot reposition and override the output. In contrast, container
iterators are multipass iterators. We can repeatedly access any element referred to by any
iterator (except the end iterator, of course). The one-pass or multipass property is
expressed in the iterator categories. Iterators in the input and output iterator category are
one-pass iterators. Iterators in the forward, bidirectional, or random-access iterator cate-
gories must be multipass iterators.
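Whether an iterator type promises multipass access can be read off its category tag. The following sketch uses a helper of our own, is_multipass, together with the standard iterator_traits template and the is_base_of type trait from later revisions of the standard library.

```cpp
#include <istream>
#include <iterator>
#include <list>
#include <ostream>
#include <string>
#include <type_traits>

// True if the iterator type belongs to the forward iterator category or to
// a more refined one, i.e., if it must be a multipass iterator.
template <class It>
bool is_multipass()
{
    typedef typename std::iterator_traits<It>::iterator_category category;
    return std::is_base_of<std::forward_iterator_tag, category>::value;
}
```

The test relies on the fact that the tags for the bidirectional and random-access categories are derived from forward_iterator_tag, whereas the tags for input and output iterators are not.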
Another consequence of the one-pass property is that you would usually not want
to have more than one stream iterator operating on the same stream, because they influence
each other. If one iterator is incremented, it moves the stream position of the underlying
stream so that the other iterator is affected, too.
44. “Rereading” through the memorized iterator need not even fail. It will instead extract the “next” element
from the stream, should there be one available at the current stream position.
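The mutual influence of two iterators on the same stream is easy to demonstrate. In the following sketch, whose function name interleave is our own, two iterators are incremented alternately and the values they see are recorded.

```cpp
#include <iterator>
#include <sstream>
#include <vector>

// Interleaves two iterators on the same stream and records the values
// that each of them refers to.
inline std::vector<int> interleave()
{
    std::istringstream in("1 2 3 4");
    std::vector<int> seen;
    std::istream_iterator<int> a(in);
    std::istream_iterator<int> b(in);
    seen.push_back(*a);   // 1
    seen.push_back(*b);   // 2, not 1: a has already consumed the first element
    ++a;                  // extracts 3, the element b would have reached next
    ++b;                  // extracts 4
    seen.push_back(*a);   // 3
    seen.push_back(*b);   // 4
    return seen;
}
```

Neither iterator sees the complete sequence: each element is delivered to whichever iterator happens to extract it.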
Is the single-pass property a restriction? Can all algorithms live with this restriction?
Let’s see what algorithms typically need of an iterator.
Algorithms that write output via an output iterator usually do not access the same
output position twice. That would mean that they override a position they had previously
filled. No standard library algorithm we know of does anything like that. Therefore, all
algorithms that write output via an iterator require an iterator type of the output iterator
category and happily live with its one-pass property. For this reason, an
ostream_iterator can be used in all standard library algorithms that require an out-
put iterator.
Algorithms that read input via an iterator usually take an input iterator range. Not
all such algorithms can live with the one-pass restriction of input iterators like the
istream_iterator. The find_end() algorithm, for instance, does a look-ahead and
for that matter needs the multipass property. In order to explain this, let’s take a closer
look at the find_end() algorithm. It finds a subsequence of equal values in a sequence
and returns an iterator to the beginning of the subsequence. Here is an example of how it
would be used:
string s1 = "abcdefghijk";
string s2 = "def";
string::iterator i = find_end(s1.begin(),s1.end(),s2.begin(),s2.end());
cout << *i << endl;
The result would be an iterator to the letter ‘d’ in s1. The algorithm maintains two
iterators: the first refers to the first input sequence, the second to the potential subsequence.
In the beginning the first iterator points to the ‘a’ in s1, the second to the ‘d’ in s2.
Then the algorithm looks for a match, that is, whether the ‘a’ is the beginning of the subsequence
"def". It performs this search by successively advancing both iterators and comparing
the characters referred to. When it can’t find a match here, it resets both iterators:
the first one to the ‘b’ in s1 and the second iterator back to the beginning of "def". Then
it starts looking for the match again. And so on and so forth. The crux is that the
find_end() algorithm needs to reread elements from the input sequences. This cannot
be done with iterators from the input iterator category, because they only support one-pass
access. And indeed, the interface description of the find_end() function asks for
an iterator from the forward iterator category:
template<class ForwardIterator1, class ForwardIterator2> inline
ForwardIterator1 find_end(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2, ForwardIterator2 last2);
Note that the find_end() algorithm does not need the entire functionality
required of a forward iterator. Forward iterators allow multipass access for reading and
writing. Write access isn’t needed in find_end(). Hence, all that this algorithm really
needs is a multipass input iterator. However, there is no such iterator category.
Let us hasten to add that there are of course algorithms for which one-pass input
iterators perfectly suffice. The often quoted find() algorithm is an example, and so is the
count() algorithm that we used in our examples. These algorithms read elements successively
until they find what they’re looking for or until they’ve counted all relevant elements.
No element needs to be accessed twice, no look-ahead is needed, and no repositioning is
required. Their interface description asks only for iterators from the input iterator
category:
public:
typedef charT char_type;
typedef traits traits_type;
typedef basic_streambuf<charT,traits> streambuf_type;
typedef basic_ostream<charT,traits> ostream_type;
ostreambuf_iterator(ostream_type& s) throw()
: sbuf(s.rdbuf()), failedFlag(false) {}
try
{
result = sbuf->sputc(t);
}
catch (...)
{
failedFlag = true;
throw;
}
if (traits_type::eq_int_type(result, traits_type::eof()))
failedFlag = true;
}
return *this;
private:
bool failedFlag;
streambuf_type* sbuf;
};
Again, the two increment operators and the operator*() do nothing but return
*this. The real work is done by the assignment operator, which inserts a character into the
stream buffer by invoking the stream buffer’s public member function sputc(). If the call
to sputc() fails, indicated by either an exception or the return value of traits_type::eof(),
the private data member failedFlag is set to true. failedFlag holds the error
state of the iterator. Once an error has occurred, the iterator’s error state is set to true, and
the iterator does not insert further characters into the stream buffer.
The iterator’s error state can be checked by the iterator’s public member function
failed(). This is the stream buffer iterator’s explicit error handling. Here the output
stream buffer iterator differs from the output stream iterator, which does not have any
explicit error indication. Errors that occur during the use of a stream iterator are reflected
in the stream state of the underlying stream object. Errors that occur during the use of a
stream buffer iterator cannot be indicated through the underlying stream buffer, because
stream buffers, unlike streams, have no explicit error indication mechanism. For this rea-
son, the stream buffer iterator must provide an error indication mechanism on its own.
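The use of failed() might be sketched as follows. Again we need a stream buffer that rejects every character in order to provoke an error; rejecting_buf and the function names are hypothetical helpers of our own.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <streambuf>
#include <string>

// A stream buffer that rejects every character; used only to provoke errors.
class rejecting_buf : public std::streambuf
{
protected:
    virtual int_type overflow(int_type) { return traits_type::eof(); }
};

// Copies `text` into the stream buffer and reports the iterator's own error
// flag. copy() returns the advanced output iterator, which we can query.
inline bool write_failed(std::streambuf& sb, const std::string& text)
{
    std::ostreambuf_iterator<char> out(&sb);
    out = std::copy(text.begin(), text.end(), out);
    return out.failed();
}

inline bool write_to_string_buffer_failed()
{ std::stringbuf sb;  return write_failed(sb, "abc"); }

inline bool write_to_rejecting_buffer_failed()
{ rejecting_buf sb;   return write_failed(sb, "abc"); }
```

Note that the error flag lives in the iterator object. It is therefore essential to query the iterator returned by the algorithm, not the copy that was passed in.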
Despite the fact that input stream buffer iterators and input stream iterators have a
lot in common, their implementations can differ significantly. The stream iterator typi-
cally has a private data member, where it stores the element that it has extracted from the
stream. This is necessary because the underlying stream does not allow this element to be
reextracted from the stream. The stream buffer iterator could be implemented in a similar
way, i.e., it could buffer the character extracted from the stream buffer. However, it need
not do so, because it can get the same character from the current stream buffer position
repeatedly by calling sgetc(). This way, the implementation of a stream buffer iterator
need not buffer the character but can use sgetc() in order to save the space for the
character.
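From the user's point of view, the input stream buffer iterator simply delivers the raw characters of the stream. A typical use is sketched below; the function names slurp and slurp_text are our own.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Reads everything that is left in the stream, whitespace included.
inline std::string slurp(std::istream& in)
{
    std::istreambuf_iterator<char> first(in), last;   // last: end-of-stream
    return std::string(first, last);
}

inline std::string slurp_text(const std::string& text)
{
    std::istringstream in(text);
    return slurp(in);
}
```

In contrast to an istream_iterator<char>, which performs formatted input and skips whitespace by default, the stream buffer iterator preserves blanks and newlines.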
The implementation of the postfix increment operator needs a nested proxy class,
which complicates the implementation a little bit. Here is such an implementation using a
proxy class. We will discuss the details of the implementation below.
public:
typedef charT char_type;
typedef traits traits_type;
typedef typename traits::int_type int_type;
typedef basic_streambuf<charT,traits> streambuf_type;
typedef basic_istream<charT,traits> istream_type;
public:
charT operator*() { return keep_; }
private:
proxy(charT c, streambuf_type * sb) : keep_(c), sbuf(sb) {}
charT keep_;
basic_streambuf<charT,traits>* sbuf;
45. See section G.7, The typename Keyword, in appendix G for further information.
istreambuf_iterator& operator++()
{
if (sbuf)
{
sbuf->sbumpc();
if ( traits::eq_int_type(sbuf->sgetc(),traits::eof()) )
sbuf = 0;
}
return *this;
}
else
return proxy(0,0);
private:
streambuf_type* sbuf;
};
As explained before, the iterator does not buffer the character at the current stream
buffer position, but rather gets it via sgetc() each time it is needed. The postfix incre-
ment operator must return the iterator position before the increment. The character
referred to before the iterator increment cannot be obtained from the stream buffer any-
more after the increment operation has been performed, because the increment operator
invoked sbumpc() on the underlying stream buffer, which moves the current buffer
position forward. In this case it is necessary to save the previously retrieved character
temporarily, so that it can be returned in some kind of “pseudo” iterator. This pseudo iterator
is represented by the proxy class, which provides an operator*() for returning the
stored character.
The proxy class has a second data member: the stream buffer pointer. The pointer is
needed when a new istream buffer iterator must be constructed from a proxy object. This
happens when another operation is applied to the return of a post-incremented istream
buffer iterator, as in (i++ == end), where a comparison operator follows the postfix incre-
ment operation.
The error checking of our implementation might at first sight look a bit overcautious,
with a negative impact on the performance. In the postfix increment operator,
sgetc() is called after sbumpc(), only to check if the iterator still refers to a valid character
or if it has already reached the end of the stream. The next increment operation or
the next call to operator*() would detect the end of the stream buffer anyway. Why is
this look-ahead necessary? When the character before the increment is the last available
character, this character will be stored for later use in the returned proxy object. As we
then perform the look-ahead, the istream iterator changes its state and turns into an end
iterator, which closely mimics the behavior of pointers to memory. Incrementing a pointer
to the last valid position is expected to yield the past-the-end position. That is exactly
what happens due to the look-ahead: the istream iterator for the last valid position turns
into an end istream iterator.
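The observable effect of the postfix increment can be sketched as follows. The function name last_increment_yields_end is our own; the sketch works with the standard istreambuf_iterator and does not depend on how a particular library implements the proxy.

```cpp
#include <iterator>
#include <sstream>

// Consumes the two characters of "ab" with *i++ and checks that the
// iterator equals the end iterator right after the last character.
inline bool last_increment_yields_end()
{
    std::istringstream in("ab");
    std::istreambuf_iterator<char> i(in), end;
    char first  = *i++;   // the returned object still delivers 'a'
    char second = *i++;   // it delivers 'b', the last available character
    return first == 'a' && second == 'b' && i == end;
}
```

The expression *i++ dereferences the object returned by the postfix increment, which still yields the character before the increment, while i itself has already moved on.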
iword() and pword() provide read and write access to the arrays: iword() to the
data fields of type long, pword() to those of type void*. Both functions take an index
into the respective array as an argument and return a reference to the associated array
element.
Before these functions can be used, a valid index must be acquired, which is
achieved by calling the static member function xalloc(). Each call to xalloc()
acquires a different, new index. The data fields associated with this index are guaranteed
to be initialized with 0. In principle, there is no limit to the number of indices that can be
acquired; class ios_base allocates additional storage if necessary. Note that an index
once acquired can never be released again; ios_base does not have an operation for
release of a valid index.
Note that xalloc() is a static member function of ios_base, which means that every
acquired index is valid for all stream objects regardless of their type, because all stream
types are derived from ios_base. Nevertheless, each ios_base object has its own iword
and pword arrays, so the data stored under a given index is specific to each stream object.
1. After a call to the object’s iword() or pword() with a different index argument. In
order to understand this, consider that the iword and pword arrays grow dynamically.
The functions iword() and pword() might reallocate the arrays if an index is provided
that is greater than the arrays’ current capacity. In such a situation, the operation might
allocate new memory for both arrays, copy the old values into the new arrays, initialize
the requested data field, and release the old arrays, which would render all references
obtained by previous calls invalid.[46]
2. After a call to the object’s copyfmt() member function. The function copyfmt()
takes a stream as an argument and copies its data members (except the stream buffer
pointer and stream state) to the data members of *this. The data members to be copied
include the iword and pword arrays. If the arrays of the source stream are longer than the
ones of *this, additional memory must be allocated. Hence a reallocation similar to the
one described above might be performed, by which all references obtained by previous
calls to iword() or pword() would become invalid.[47]
3. When the stream object is destroyed. The dynamic memory used for the iword and
pword arrays is freed on destruction of the stream object. Hence all previously obtained
references to data fields of the arrays become invalid.
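A cautious way of using the additional stream storage is sketched below: the index is acquired once, and the data field is accessed through a fresh call to iword() each time instead of through a memorized reference. The names indent_index, set_indent, get_indent, and per_stream_storage are our own.

```cpp
#include <ios>
#include <sstream>

// One index for the whole program, acquired on first use.
inline int indent_index()
{
    static const int idx = std::ios_base::xalloc();
    return idx;
}

// Accessors that call iword() afresh each time instead of keeping a
// reference, because references may be invalidated by later calls.
inline void set_indent(std::ios_base& s, long n) { s.iword(indent_index()) = n; }
inline long get_indent(std::ios_base& s)         { return s.iword(indent_index()); }

// Shows that the index is shared, but the data is kept per stream object.
inline bool per_stream_storage()
{
    std::ostringstream a, b;
    bool initially_zero = (get_indent(a) == 0 && get_indent(b) == 0);
    set_indent(a, 4);
    return initially_zero && get_indent(a) == 4 && get_indent(b) == 0;
}
```

The same index is used for both streams, yet setting the field in one stream leaves the field of the other stream at its initial value of 0.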
ERROR INDICATION
Calls to iword() and pword() can fail, for instance due to memory shortage. As usual in
IOStreams, failure is indicated by setting ios_base::badbit in the stream state and,
depending on the exception mask, throwing an exception. The reference returned in case
of failure refers to a data field containing 0.
46. The C++ standard states that a reference to the data field associated with a valid index may become invalid
after any call to the object’s iword() or pword() with a different index argument. The behavior described
above is a conceivable strategy, not necessarily the one your IOStreams implementation really chooses. To be
portable, you must be prepared to handle invalid references after any call.
47. Again, the C++ standard states that a reference to the data field associated with a valid index may become
invalid after any call to the object’s copyfmt() member function. The behavior described above is a conceivable
strategy, not necessarily the one your IOStreams library implements.
Note that iword() and pword() are member functions of class ios_base,
whereas the functions for checking the stream state and manipulating the exception mask
are contained in the derived class template basic_ios<>. This way you can call
iword() and pword() via a reference or pointer to ios_base, but in such a case you
cannot check for failure afterwards.
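The following sketch illustrates the distinction. It is a minimal example under stated assumptions, not library code: counter_index(), bump(), and bump_checked() are hypothetical names chosen for illustration. The first function sees the stream only as ios_base and therefore cannot test for failure; the second takes a basic_ios<> reference and can.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Hypothetical helper: reserves one iword/pword index for the whole program.
inline int counter_index()
{
    static const int idx = std::ios_base::xalloc();
    return idx;
}

// Works through ios_base only: iword() is available, but the stream state
// is not, so a failed call goes unnoticed here.
inline void bump(std::ios_base& str)
{
    ++str.iword(counter_index());   // fetch the reference anew on every use
}

// Works through basic_ios<>: the stream state is accessible, so a failed
// iword() call (badbit) can be detected.
template <class charT, class Traits>
bool bump_checked(std::basic_ios<charT, Traits>& str)
{
    long& slot = str.iword(counter_index());
    if (str.bad()) return false;    // on failure, slot refers to a field containing 0
    ++slot;
    return true;
}
```

Note that neither function keeps the reference returned by iword() beyond a single statement sequence, which sidesteps the invalidation rules described earlier.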
CALLBACKS
IOStreams allows registration of callback functions, which are invoked under certain con-
ditions, for instance when the stream object is destroyed. These callback functions
increase the usability of iword() and pword(). For instance, it is common to store a
pointer to dynamically allocated memory in a pword data field. Once the stream object is
destroyed, the pword array is destroyed too. In order to avoid garbage, one would want
to free the memory, which the pointer in the pword field refers to, before the pointer itself
is discarded. This kind of cleanup is typically performed by a callback function that is
invoked on destruction of a stream object. For details see section 2.5.2, Stream Callbacks.
The callback mechanism completes the iword/pword concept in such a way that
both together allow streams to be extended without derivation: The iword/pword arrays
provide facilities to store additional data in streams, and the callback functions allow the
functionality of streams to be extended. Section 3.3.1, Using Stream Storage for Private
Use: iword, pword, and xalloc, explores the respective techniques in depth.
Please see the example from section 3.3.1, Using Stream Storage for Private Use:
iword, pword, and xalloc, which uses iword() /pword() in conjunction with the call-
back mechanism.
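A callback function receives three arguments: the event that triggered the call, a reference
to the stream (as ios_base&), and an index to the iword/pword arrays. A callback function
must not throw any exceptions. Here is the complete typedef:
typedef void (*event_callback)(event ev, ios_base& str, int index);
Callback functions are registered via the following member function of ios_base:
void register_callback(event_callback fn, int index);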
ERASE_EVENT. This event occurs when the stream object is destroyed or when the
stream object’s copyfmt() member function is called. In the latter case the callback is
then invoked before any data members of the stream are assigned.
Callback functions for erase_events are needed, for instance, if pword entries
point to dynamically allocated memory. On destruction of the stream object it might be
necessary to free the memory pointed to before the pointers stored in the pword array are
destroyed.
Invocation of erase_event callback functions on calls to copyfmt() is useful
because copyfmt() overrides the original data members of the stream object. In particu-
lar, the iword and pword entries are overwritten. If pword entries point to dynamically
allocated memory, it might be desirable to free the memory pointed to before the pointer is
discarded. This would typically be done by a callback function that handles erase_events.
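A minimal sketch of such a cleanup callback is shown here. All names (note_index(), note_callback(), attach_note(), live_notes) are hypothetical and serve illustration only. The sketch assumes that a pword entry owns a heap-allocated std::string. Besides erase_event it also reacts to copyfmt_event, because copyfmt() copies the raw pointer and two streams would otherwise try to free the same string.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

static int live_notes = 0;          // number of strings currently owned by streams

inline int note_index()
{
    static const int idx = std::ios_base::xalloc();
    return idx;
}

// Callback: frees the owned string on erase_event and replaces a copied
// pointer by a private clone on copyfmt_event. No exception may escape.
void note_callback(std::ios_base::event ev, std::ios_base& str, int index)
{
    try {
        void*& slot = str.pword(index);
        if (ev == std::ios_base::erase_event) {
            if (slot != 0) {
                delete static_cast<std::string*>(slot);
                --live_notes;
            }
            slot = 0;
        }
        else if (ev == std::ios_base::copyfmt_event) {
            const std::string* src = static_cast<const std::string*>(slot);
            slot = 0;               // never keep the source stream's pointer
            if (src != 0) {
                slot = new std::string(*src);
                ++live_notes;
            }
        }
    }
    catch (...) { }                 // suppress everything
}

// Stores a copy of text in the stream and makes sure the cleanup is registered.
void attach_note(std::ios_base& str, const std::string& text)
{
    std::string* fresh = new std::string(text);   // may throw; nothing changed yet
    void*& slot = str.pword(note_index());
    if (slot == 0) {
        str.register_callback(note_callback, note_index());
    } else {
        delete static_cast<std::string*>(slot);
        --live_notes;
    }
    slot = fresh;
    ++live_notes;
}
```

Registering the callback only while the slot is empty is a simplification. In rare cases the callback ends up registered twice, which is harmless here because the second invocation finds an empty slot.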
ERROR INDICATION
Callback functions are not allowed to indicate errors by means of exceptions. Also, they
cannot indicate errors by setting error flags in the stream state, because the stream is
passed to the callback function as ios_base& and access to the stream state and the
exception mask is via the derived class basic_ios<class charT, class Traits>. The
consequence is that callbacks cannot indicate errors in the same way as other IOStreams
functions do. It also means that callback functions must be prepared to catch all conceiv-
able exceptions, because they have to suppress them.
CHAPTER 3
Advanced IOStreams Usage
1. The concatenation of shift operators is explained in the introductory section 1.2.2, The Input and Output
Operators.
3.1 Input and Output of User-Defined Types
array type would be implemented by iterating through the array and using the inserters
and extractors of the array element type for reading and writing each single element.
For a class type, input and output of the object would boil down to input and output
of the object’s data members and base classes.
We intend to explain the technique in terms of an example. Let us start with intro-
ducing the type in question—a date class.
class date {
public:
date(int d, int m, int y)
{ tm_date.tm_mday = d; tm_date.tm_mon = m-1; tm_date.tm_year = y-1900; }
date(const tm& t) : tm_date(t) {}
date ()
{ /* get current date */ }
// more constructors and useful member functions
private:
tm tm_date;
};
The date class has a private data member of type tm, which is the time structure
defined in the C library (in header file <ctime>). It is a type suitable for representing date
values and consists of a number of integral values, among them the day of a month, the
month of a year, and the year since 1900. In the tm structure, days are counted from 1 to
31, but months are denoted by values 0 to 11.
We want to allow insertion and extraction of date objects in exactly the same way
as input and output of built-in types like integers or strings, i.e., via shift operators, as in
the following code fragments:
date eclipse(11,8,1999);
cout << "solar eclipse on " << eclipse << '\n';
and
date aDate;
cout << '\n' << "Please, enter a date (day month year):" << '\n';
cin >> aDate;
cout << "date: " << aDate << '\n';
To facilitate this convenient kind of input and output, we need to implement shift
operators as inserters and extractors for date objects:
The extractor:
template<class charT, class Traits>
basic_istream<charT, Traits>&
operator>> (basic_istream<charT,Traits>& is, date& dat)
{
int tmp;
is >> dat.tm_date.tm_mday;
is.ignore();
is >> tmp; dat.tm_date.tm_mon = tmp-1;
is.ignore();
is >> tmp; dat.tm_date.tm_year = tmp-1900;
return is;
}
The inserter:
template<class charT, class Traits>
basic_ostream<charT, Traits>&
operator<< (basic_ostream<charT, Traits >& os, const date& dat)
{
os << dat.tm_date.tm_mday << '.';
os << dat.tm_date.tm_mon+1 << '.';
os << dat.tm_date.tm_year+1900;
return os;
}
A date object is broken down into its elements, in this case the day, month, and year
contained in the struct tm data member of the date object. Each such element is an inte-
ger value and is inserted or extracted by means of the standard shift operator for type int.
Note that it usually is a good idea to make insertion and extraction complementary
operations: An item should be written in a format that is understood by the input opera-
tion, so that you can read what you've written.
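The round trip can be checked mechanically. The sketch below uses a hypothetical stand-in type, simple_date, instead of the date class of the text, together with non-template operators for plain char streams; round_trips() is likewise an illustrative name.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for the date class of the text.
struct simple_date {
    int day, month, year;
};

// Writes day.month.year, the same layout the extractor below expects.
std::ostream& operator<<(std::ostream& os, const simple_date& d)
{
    return os << d.day << '.' << d.month << '.' << d.year;
}

std::istream& operator>>(std::istream& is, simple_date& d)
{
    is >> d.day;   is.ignore();     // skip the separator
    is >> d.month; is.ignore();
    is >> d.year;
    return is;
}

// Writes a date to a string stream, reads it back, and compares.
inline bool round_trips(const simple_date& in)
{
    std::stringstream buf;
    buf << in;
    simple_date out = { 0, 0, 0 };
    buf >> out;
    return !buf.fail()
        && out.day == in.day && out.month == in.month && out.year == in.year;
}
```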
The date class still needs a minor modification: Both operations access private data
members of class date and must therefore be declared friend functions to class date.
Here’s a completed version of class date:
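class date {
public:
  date(int d, int m, int y)
  { tm_date.tm_mday = d; tm_date.tm_mon = m-1; tm_date.tm_year = y-1900;
    tm_date.tm_sec = tm_date.tm_min = tm_date.tm_hour = 0;
    tm_date.tm_wday = tm_date.tm_yday = 0;
    tm_date.tm_isdst = 0;
  }
  date(const tm& t) : tm_date(t) {}
  date()
  { time_t ltime;
    time(&ltime);
    tm_date = *localtime(&ltime);
  }
  // more constructors and useful member functions
private:
  tm tm_date;
  // the shift operators access tm_date and are therefore friends
  template<class charT, class Traits>
  friend basic_istream<charT, Traits>&
  operator>> (basic_istream<charT, Traits>& is, date& dat);
  template<class charT, class Traits>
  friend basic_ostream<charT, Traits>&
  operator<< (basic_ostream<charT, Traits>& os, const date& dat);
};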
The default constructor uses the C library functions time() for getting the current
time from the system and localtime() for converting the time value into a tm
structure.
With the input and output operations implemented as outlined above, we can insert
and extract date objects via shift operators. The code fragments from above (repeated
below for illustration) would behave as follows:
date eclipse(11,8,1999);
cout << "solar eclipse on " << eclipse << '\n';
would print
solar eclipse on 11.8.1999
and
date aDate;
cout << '\n' << "Please enter a date (day month year):" << '\n';
cin >> aDate;
cout << "date: " << aDate << '\n';
would, for the input 2 6 1952, print as its last line
date: 2.6.1952
3.1.3 Refinements
The technique of creating new inserters and extractors by composing existing inserters
and extractors is simple and powerful, but it can be refined in several ways: More elabo-
rate format control can be desired, errors might occur and must be handled, the format
might depend on cultural conventions. Also, there are several other possibilities for read-
ing and writing data from/to a stream rather than via existing shift operators. We discuss
such refinements in the following sections.
The predefined inserters and extractors in IOStreams follow a number of conventions:
They report errors in a uniform way, interpret format control parameters consistently, factor
out culture-sensitive information into locales and facets, and so on. User-defined I/O oper-
ations should apply the same rules. Along with each refinement suggested in the following
sections we explain the related conventions and eventually combine all the information
in another example: an internationalized date inserter and extractor.
date today;
cout << "today: " << left << setw(10) << setfill('*')
<< today << endl;
would print
Probably the expected result after setting the field width prior to insertion of a date
object is that the entire date is adjusted, and not just the first part of it. You might want to
fix this problem and control the field width yourself. This leads us to the more general dis-
cussion of format control in inserters and extractors.
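The effect is easy to reproduce with plain integers. In the sketch below, naive() and adjusted() are hypothetical helper names: the first composes the output from three separate insertions, the second formats into a temporary string first and inserts the whole text in one step, which is one possible way of making the field width apply to the entire date.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Naive composition: the field width is consumed by the first inserted item.
inline std::string naive(int d, int m, int y)
{
    std::ostringstream os;
    os << std::left << std::setw(10) << std::setfill('*');
    os << d << '.' << m << '.' << y;
    return os.str();
}

// Width-aware variant: build the text first, then insert it as a whole so
// that padding is computed for the complete date.
inline std::string adjusted(int d, int m, int y)
{
    std::ostringstream tmp;
    tmp << d << '.' << m << '.' << y;

    std::ostringstream os;
    os << std::left << std::setw(10) << std::setfill('*');
    os << tmp.str();
    return os.str();
}
```

For the date 11.8.1999, naive() yields 11********.8.1999, whereas adjusted() yields 11.8.1999*.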
For the sake of consistency, format control facilities defined in IOStreams should
generally be interpreted and manipulated by user-defined I/O operations in the same
way as the predefined inserters and extractors are. Not all format flags are relevant to
input and output of all types of objects. Some format flags apply to insertion and extrac-
tion of certain data types only. They are often irrelevant to input and output of user-
defined types. For example, the oct, dec, and hex format flags have an impact solely on
input and output of integral values and can be ignored for the formatting and parsing of
dates, as in our example. Other format flags, such as unitbuf, skipws, or the field
width, are independent of the type of object inserted or extracted. They have an impact on
user-defined types, too.
As a rule of thumb, you should first determine all format flags that you want to be
relevant to the user-defined type that you intend to parse and format. Once you have
identified the relevant format facilities, understand how they are used in the predefined
inserters and extractors, and make sure that your user-defined operations interpret and
manipulate them in exactly the same way.
Here are some things to keep in mind:
We've already mentioned that the field width is the only format information that is
not permanent but is reset to 0 each time it is used. Stick to this rule if you decide to adjust
output fields according to the field width, and reset the field width to 0 at the end of your
inserter.
Some of the format flags do not have a default value. The adjustfield, for
instance, is not initialized, so you must cope with the possibility that none of the
adjustfield flags might be set. In this case, all the predefined inserters behave as though the
field were set to right, that is, they add padding characters before the actual output.
Stick to this rule if you decide to adjust output fields in your inserters.
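Both rules can be observed on a freshly constructed stream. The function names in this small sketch are hypothetical; it only exercises the predefined inserter for int.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// True if none of the adjustfield flags (left, right, internal) is set
// on a freshly constructed stream.
inline bool adjustfield_unset_by_default()
{
    std::ostringstream os;
    return !(os.flags() & std::ios_base::adjustfield);
}

// Inserts two integers after setting the field width once.
inline std::string pad_twice()
{
    std::ostringstream os;
    os.fill('*');
    os.width(6);
    os << 42;   // padded on the left, as if right were set
    os << 7;    // the width has been reset to 0: no padding
    return os.str();
}
```

pad_twice() returns ****427: the padding precedes the first number, and the second number is not padded at all.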
Note also that typical candidates of relevance for your user-defined inserter or
extractor such as unitbuf and skipws are automatically performed as so-called prefix
and suffix operations. If you decide that these format elements are relevant for your
inserters and extractors, don’t forget to use the sentry objects in your implementation.
3.1.3.2 Prefix and Suffix Operations
It’s a convention for stream input and output operations to carry out certain tasks prior
(prefix activities) and subsequent (suffix activities) to any actual input or output. The pre-
fix activities include flushing of a tied stream and skipping of whitespace. The suffix activ-
ities include flushing the stream if the unitbuf flag is set.
In the date inserter and extractor from the previous section, there is a certain perfor-
mance overhead due to the use of existing inserters and extractors. The flushing, for
instance, is performed for each single shift operation, although it would only be necessary
for input or output of the entire date. If we want to eliminate the overhead, we have to
care about flushing ourselves.
In IOStreams, the prefix and suffix activities are encapsulated into classes called
sentry, nested into the general stream classes basic_istream and basic_ostream.
The constructors of these classes perform the prefix activities, the destructors carry out the
suffix activities. The sentry classes have an operator bool(), which allows checking
for success of construction (i.e., success of the prefix operations).
The standard allows the provider of an IOStreams library to add operations, beyond
the ones listed above, to the constructors and destructors of the sentry classes. Conceiv-
able additions could be locking and unlocking of mutually exclusive locks for ensuring
thread-safety of your IOStreams library, or maintenance of internal caches, etc.
Instead of using sentries, you could theoretically manually add the prefix and suffix
activities into your shift operators. Such explicitly implemented prefix and suffix activities,
however, introduce potential portability problems, because the hidden vendor-specific
additions would be missing. If you implement an inserter or extractor, you should use the
sentry classes in order to make sure that all necessary prefix and suffix tasks are carried
out. When you compose existing shift operators, you implicitly do so anyway, because the
predefined inserters and extractors use sentries. If you build shift operators on top of low-
level I/O operations, you have to care about prefix and suffix activities yourself.
Here is how you should use the sentry classes in your inserters and extractors:
Create a local sentry object (which receives the stream as parameter) on the stack
prior to any other activity in your shift operator. The sentry constructor, which will be
the first operation executed in your shift operator, performs all necessary prefix tasks. The
local sentry object goes out of scope when the shift operator returns; its destructor will
always, even in the presence of exceptions, be called and cares about the suffix tasks.
Check the success of the prefix operations after construction of the sentry object by
means of its bool operator. The operator returns false in case of !good(), which means
either that an error occurred or that the end of the input was encountered. It’s another
convention that stream operations stop if the stream state is not good. Following this pol-
icy, one should return from the function if the check after construction of the sentry object
does not indicate success.
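The two recommendations can be sketched as follows for an inserter that is built on low-level stream buffer operations. The type tag is hypothetical and exists only for this illustration; the sketch ignores the field width for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical type used only for illustration; printed as <name>.
struct tag { std::string name; };

template <class charT, class Traits>
std::basic_ostream<charT, Traits>&
operator<<(std::basic_ostream<charT, Traits>& os, const tag& t)
{
    // The sentry is the first object created: its constructor performs the
    // prefix tasks, its destructor the suffix tasks.
    typename std::basic_ostream<charT, Traits>::sentry guard(os);
    if (!guard) return os;          // stream not good: do nothing

    // Low-level output directly to the stream buffer, without nested sentries.
    std::basic_streambuf<charT, Traits>* sb = os.rdbuf();
    bool ok = !Traits::eq_int_type(sb->sputc(os.widen('<')), Traits::eof());
    for (std::size_t i = 0; ok && i < t.name.size(); ++i)
        ok = !Traits::eq_int_type(sb->sputc(os.widen(t.name[i])), Traits::eof());
    ok = ok && !Traits::eq_int_type(sb->sputc(os.widen('>')), Traits::eof());

    if (!ok) os.setstate(std::ios_base::badbit);
    return os;
}
```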
The example in section 3.1.3, Refined Inserters and Extractors, demonstrates the cor-
rect use of sentries.
For this purpose every stream contains an exception mask, which consists of several
exception flags. Each flag in this mask corresponds to one of the stream state flags
failbit, badbit, and eofbit. For example, once the badbit flag is set in the exception
mask, an exception will be thrown each time the badbit flag is set in the stream state.
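A short sketch of this behavior, using only standard stream members; the function name throws_on_badbit() is hypothetical.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Returns true if setting badbit raised ios_base::failure while setting
// failbit did not, given an exception mask that contains badbit only.
inline bool throws_on_badbit()
{
    std::ostringstream os;
    os.exceptions(std::ios_base::badbit);
    try {
        os.setstate(std::ios_base::failbit);   // not in the mask: no exception
        os.setstate(std::ios_base::badbit);    // in the mask: throws
    }
    catch (const std::ios_base::failure&) {
        return os.fail() && os.bad();          // both flags are set in the state
    }
    return false;
}
```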
THE OUTPUT VARIABLE OF AN EXTRACTOR. The value of the output variable must be a
valid value; it need not be meaningful in the case of a failed extraction. Other than that,
there are no requirements imposed by IOStreams. However, common sense indicates that
the behavior should be similar at this point for all user-defined extractors.
Inserters and extractors for user-defined types should obey these policies for error
indication in IOStreams. Here is what you need to do when implementing input and out-
put operations:
DETECTING ERROR SITUATIONS. Problems can be reported from any component you
invoke to accomplish the task of formatted input and output. Examples are a bad_alloc
exception thrown by an invoked operation due to memory shortage, or failure of an
invoked stream operation indicated by a bit set in the stream state. Also, the internal logic
of your inserter or extractor can identify error situations, e.g., extraction of incomplete or
invalid dates in our example. Therefore:
• Catch all exceptions.
• Check all return codes, the stream state, or other error indications.
Raise an exception if the exception mask asks for it. If any of the invoked operations
raises an exception, catch the exception and rethrow it, if the exception mask allows it. It is
important that you rethrow the original exception rather than raising an IOStreams
exception of type ios_base::failure. The original exception is likely to convey infor-
mation that precisely describes the error situation, whereas ios_base::failure is a
rather nonspecific, general IOStreams error. If you do not retain the original exception,
useful information might get lost.
CARE ABOUT THE OUTPUT VARIABLE OF AN EXTRACTOR. Find a concept for the value you
return in the output variable of a failed extractor. Ideally, one would retain or restore the
original value of the output variable, so that the variable would return unaltered in case of
failure. Another strategy is to clear the content and return a nil value in case no valid
value can be extracted. This is doable only if there is a nil value for the respective type.
Examples of nil values are a zero-length string or a date like 00-00-00.
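The first strategy, leaving the output variable untouched on failure, can be sketched like this. The type dmy and its extractor are hypothetical and deliberately simpler than the date class of the text: the extractor parses into a temporary and commits the result only after all checks have passed.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical date-like type for illustration.
struct dmy {
    int day, month, year;
};

// Parses day.month.year into a temporary; d is modified only on success.
std::istream& operator>>(std::istream& is, dmy& d)
{
    dmy tmp = { 0, 0, 0 };
    char sep1 = 0, sep2 = 0;
    if (is >> tmp.day >> sep1 >> tmp.month >> sep2 >> tmp.year) {
        const bool sane = sep1 == '.' && sep2 == '.'
                       && tmp.day >= 1 && tmp.day <= 31
                       && tmp.month >= 1 && tmp.month <= 12;
        if (sane)
            d = tmp;                              // commit
        else
            is.setstate(std::ios_base::failbit);  // d keeps its old value
    }
    return is;
}
```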
A concrete example is shown in section 3.1.3, Refined Inserters and Extractors.
3.1.3.4 Internationalization
The textual representation of a date value varies from one cultural area to another. The
inserter and extractor from the previous section, however, ignore this fact and are inca-
pable of adjusting the formatting and parsing of dates to cultural conventions. For
instance, the order of day, month, and year are hard-coded, and the typical U.S. notation
of 12/31/99 instead of the German 31.12.99 can neither be parsed nor produced.
Also, textual representations of the month name are not supported; we cannot use dates
such as December 31, 1999 or 31. Dezember 1999.
We want to eliminate this restriction and intend to internationalize our date inserter
and extractor. Users of our shift operators will be allowed to indicate and switch cultural
environments, and as a result the formatting and parsing will be adjusted accordingly. In
IOStreams, internationalized I/O operations are implemented by factoring out culture-
sensitive parsing and formatting into exchangeable components: locales and facets. A
short review of locales and facets follows.
The purpose of a locale is to represent a cultural area and contain all the culture-
dependent information and services relevant to that area. They are used by components
that deal with culture-sensitive data for retrieving information about cultural differences.
Inserters and extractors in IOStreams are examples of such components; they need to
know how culture-sensitive items like numeric values and dates are formatted.
The services and information in a locale are organized in smaller units, the facets.
Often, there is not just a single facet for a certain problem, but rather a group of facet types
related to one problem domain. For instance, the standard facets num_put, num_get, and
numpunct together represent the knowledge about numeric formats. Each locale object
represents a particular cultural area and contains facets for that culture. A German locale,
for example, has the numeric facets for German numeric format, and a U.S. locale has the
numeric facets for U.S. formats.
Locales are provided to IOStreams on a per stream basis. That is, each stream has a
locale of its own. The locale attached to a stream can be replaced. Locale-sensitive I/O
operations are expected to use a stream’s current locale object in order to achieve adapt-
ability to cultural environments.
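As a small illustration of per-stream locales, the sketch below replaces one facet in a stream's locale and lets the predefined inserter for double pick it up. The facet comma_punct and the function format() are hypothetical; a real application would more likely imbue a named locale, whose availability depends on the platform.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: like the default numpunct, but with a decimal comma.
class comma_punct : public std::numpunct<char> {
protected:
    virtual char do_decimal_point() const { return ','; }
};

// Formats value, optionally with a locale that carries the facet above.
inline std::string format(double value, bool use_comma)
{
    std::ostringstream os;
    if (use_comma) {
        // New locale: a copy of the stream's locale plus the replacement facet.
        // The locale takes ownership of the facet object.
        os.imbue(std::locale(os.getloc(), new comma_punct));
    }
    os << value;    // the predefined inserter consults the stream's locale
    return os.str();
}
```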
If you implement inserters and extractors for culture-sensitive data types, you
should take the following steps:
3. Provide locales that contain the necessary facets, or provide means for creating
such locales. (Every locale contains at least all standard facets. If you need
nonstandard facets, you must support creation of these locales.)
4. Imbue streams with such locales.
5. Use the stream’s locale and its facets for implementing your internationalized
inserters and extractors. (Users of your culture-sensitive inserters and extractors
must first create locales that contain the necessary facets and then imbue streams
with such locales, by which the necessary facets are made available to your
inserters and extractors.)
Some of these aspects are demonstrated in the example presented in section 3.1.3,
Refined Inserters and Extractors. It shows use of facets for implementation of inserters
and extractors. We use existing standard facets in that example, and for this reason we do
not explain how new facet types are created and how they become part of a locale. The
creation of nonstandard locales is explained in chapter 8, “User-Defined Facets.”
The delegation of culture-sensitive parsing and formatting to locales and facets
demonstrates a more general design principle: factoring out parsing and formatting into
independent, replaceable components. Such a design makes for a high degree of flexibil-
ity that can be used for many purposes: We have seen that users of internationalized I/O
operations can replace a stream’s locale and in this way enable parsing and formatting
rules for an unlimited number of cultural areas. The inserters and extractors presented in
section 3.1.4, Generic Inserters and Extractors, are another demonstration of the principle.
In that example, formatting and parsing rules are decoupled from IOStreams-specific
tasks and become part of the data type in the form of member functions. In this way not
only can formatting and parsing be replaced, but the entire data type is exchangeable, and
the resulting inserters and extractors become entirely generic.
3. We do not discuss use of format control parameters in conjunction with the example presented here, for two
reasons: First, in the example we will be using the time facets, and facets have their own rules for use of the for-
mat control parameters. Second, use of format control parameters is straightforward and there is little value in
explaining it in detail.
The extractor:
template<class charT, class Traits>
basic_istream<charT, Traits>&
operator>> (basic_istream<charT, Traits >& is, date& dat)
{
if (!is.good()) return is;
ios_base::iostate err = ios_base::goodbit;
return is;
}
The inserter:
template<class charT, class Traits>
basic_ostream<charT, Traits>&
operator<< (basic_ostream<charT, Traits >& os, const date& dat)
{
if (!os.good()) return os;
Use of the time_get and time_put facets is similar to use of the numeric and mon-
etary facets. Section 2.1.5, Collaboration Among Streams, Stream Buffers, and Locales,
explains the principle of using parsing and formatting facets. The details specific to time
facets are described in section 6.2.3, Date and Time Values. We will not repeat the details
here. Let’s just get a rough idea of the time facets’ interfaces. A time-parsing facet of type
time_get<char_type, iter_type> has the following parsing function for dates:
iter_type get_date(iter_type in, iter_type end,
ios_base& fmt,
ios_base::iostate& err,
tm* time) const;
ITERATORS. Formatting functions like put() take an output iterator (parameter out)
that designates the destination for output. Parsing functions like get_date() take an
input iterator range (parameters in and end) that designates beginning and end of the
character sequence to be parsed. The operations return an iterator that points to the posi-
tion after the sequence written to or read from.
The iterators used in our example are input and output stream buffer iterators. The
begin iterators are created by converting the stream itself into stream buffer iterators. This
is possible because the istreambuf_iterator and the ostreambuf_iterator have
converting constructors that take a reference to a stream and convert it into a stream
buffer iterator to the current stream position. The end iterator is created by means of the
default constructor of class istreambuf_iterator.
FORMATTING INFORMATION. Unlike other formatting facets in the locale, the time_put
facet uses neither format information from the ios_base object that is provided as an
argument (parameter fmt) nor the provided fill character (parameter fill).4
VALUE. Naturally, parsing and formatting functions take a pointer or reference to the
value to be written or read. In this case it’s a pointer to a time structure (parameter time).
FORMAT SPECIFICATION. The time-formatting function takes a format specifier, plus an
optional format modifier (parameters fmtspec and fmtmodifier). These are characters
as defined for the C library function strftime() (defined in header <ctime>).
ERROR INDICATION. Parsing functions like get_date() store error information in an
ios_base::iostate object (parameter err). Formatting functions like put() do not
have an error parameter. If the output iterator returned by the put() function has a
failed() member function, you can use this to check for success or failure. If the iterator
type does not have means for error indication, you cannot check for error of a put()
operation.
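To make the interface more tangible, here is a minimal sketch that calls the put() function of the standard time_put facet directly. The function name format_date() is hypothetical, and the choice of the classic locale and of the specifiers Y, m, and d is an assumption made to keep the result independent of the environment.

```cpp
#include <cassert>
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Formats the date part of t as year-month-day by calling put() once per
// format specifier; returns an empty string if the output iterator failed.
inline std::string format_date(const std::tm& t)
{
    typedef std::ostreambuf_iterator<char> out_iter;

    std::ostringstream os;
    os.imbue(std::locale::classic());
    const std::time_put<char, out_iter>& fac =
        std::use_facet<std::time_put<char, out_iter> >(os.getloc());

    static const char specs[] = { 'Y', 'm', 'd' };   // as for strftime()
    out_iter sink(os);              // converting constructor: stream to iterator
    for (int i = 0; i < 3; ++i) {
        if (i > 0) { *sink = '-'; ++sink; }
        sink = fac.put(sink, os, os.fill(), &t, specs[i]);
    }
    return sink.failed() ? std::string() : os.str();
}
```

The returned iterator is kept after every call, both to continue writing at the right position and to be able to ask it for failure at the end.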
3.1.4.2 Prefix and Suffix Operations
We will now add the prefix and suffix operations by means of sentries. As a reminder,
here are the recommendations for use of sentry classes in inserters and extractors:
4. Nonstandard time_put facets, that is, facets that are not required by the standard as part of the standard
library, might be using format information from the ios_base object as well as the provided fill character.
However, the predefined standard facets time_put<char> and time_put<wchar_t> , which we use in our
sample implementation, do not use these arguments.
The extractor:
template<class charT, class Traits>
basic_istream<charT, Traits>&
operator>> (basic_istream<charT, Traits >& is, date& dat)
{
if (!is.good()) return is;
ios_base::iostate err = ios_base::goodbit;
typename basic_istream<charT,Traits>::sentry ipfx(is);
if (ipfx)
{
use_facet<time_get<charT, istreambuf_iterator<charT,Traits> > >(is.getloc())
.get_date(is, istreambuf_iterator<charT,Traits>(),is, err, &dat.tm_date);
}
return is;
}
The inserter:
template<class charT, class Traits>
basic_ostream<charT, Traits>&
operator<< (basic_ostream<charT, Traits >& os, const date& dat)
{
if (!os.good()) return os;
5. Class sentry is a type that depends on the template arguments of the extractor function and for this reason
requires use of the typename keyword. If you are not familiar with the use of typename, see Section G.7, The
typename Keyword, in appendix G, for further information.
setting but also the adjustment flags (right, left, internal) and the fill character. Naturally,
we intend to follow the recommendations that we previously explained:
As the field width is not permanent, but is reset to 0 each time it is used, we need to
reset the field width to 0 at the end of our inserter.
The adjustfield need not be initialized, so we stick to the rule that all inserters
behave as though the field were set to right.
Below you find the inserter from the previous example, with field width adjustment
added.
The inserter:
template<class charT, class Traits>
basic_ostream<charT, Traits>&
operator<< (basic_ostream<charT, Traits >& os, const date& dat)
{
if (!os.good()) return os;
}
os.width(0);
return os;
}
• Catch exceptions.
• Check the validity of extracted objects and determine other potential errors.
INDICATING ERROR SITUATIONS. Have your inserter and extractor report errors accord-
ing to the IOStreams principles, and set the stream state flags as follows:
• ios_base::badbit to indicate loss of integrity of the stream.
• ios_base::failbit if the formatting or parsing itself fails due to the internal
logic of your operation.
class date {
public:
date(int d, int m, int y)
{ tm_date.tm_mday = d; tm_date.tm_mon = m-1; tm_date.tm_year = y-1900;
tm_date.tm_sec = tm_date.tm_min = tm_date.tm_hour = 0;
tm_date.tm_wday = tm_date.tm_yday = 0;
tm_date.tm_isdst = 0;
ok = ( valid() && (mktime(&tm_date) !=time_t(-1)) ) ? true : false;
}
date(const tm& t) : tm_date(t)
{ ok = ( mktime(&tm_date) !=time_t(-1) ) ? true : false; }
  date()
  { time_t ltime;
    tm* tm_ptr;
    time(&ltime); // get system time
    tm_ptr = localtime(&ltime); // convert to tm struct
    if (tm_ptr != NULL)
    { tm_date = *tm_ptr;
      ok = true;
    }
    else // is date before 1-1-1970
      ok = false;
  }
  bool operator!() // check for the date's validity
  { ok = ( valid() && (mktime(&tm_date) != time_t(-1)) ) ? true : false;
    return !ok;
  }
private:
tm tm_date;
bool ok;
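  bool valid() // check for sensible date; rejects nonsense like 32.12.1999
  { if (tm_date.tm_mon < 0 || tm_date.tm_mon > 11) return false;
    if ( tm_date.tm_mday < 1 ) return false;
    switch (tm_date.tm_mon) {
      case 0: case 2: case 4: case 6: case 7: case 9: case 11: // months with 31 days
        if (tm_date.tm_mday > 31) return false;
        break;
      case 1: // February, with a simplified leap-year test
        if (tm_date.tm_mday > 29) return false;
        if (tm_date.tm_mday > 28 && tm_date.tm_year%4) return false;
        break;
      default:
        if (tm_date.tm_mday > 30) return false;
    }
    return true;
  }
  // the shift operators access tm_date and are therefore friends
  template<class charT, class Traits>
  friend basic_istream<charT, Traits>&
  operator>> (basic_istream<charT, Traits>& is, date& dat);
  template<class charT, class Traits>
  friend basic_ostream<charT, Traits>&
  operator<< (basic_ostream<charT, Traits>& os, const date& dat);
};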
The operator! () is used in the extractor below to check whether the extracted
date is valid. Here are the previous inserter and extractor, this time with error handling
and error indication added:
The extractor:
template<class charT, class Traits>
basic_istream<charT, Traits>&
operator>> (basic_istream<charT, Traits >& is, date& dat)
{
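  if (!is.good()) return is;
  ios_base::iostate err = ios_base::goodbit;
  try
  {
    typename basic_istream<charT,Traits>::sentry ipfx(is);
    if (ipfx)
    {
      use_facet<time_get<charT, istreambuf_iterator<charT,Traits> > >
        (is.getloc()).get_date
        (is, istreambuf_iterator<charT,Traits>(), is, err, &dat.tm_date);
      if (!dat) err |= ios_base::failbit; // reject invalid dates
    }
  }
  catch(bad_alloc& )
  {
    err |= ios_base::badbit;
    ios_base::iostate exception_mask = is.exceptions();
    // only failbit is in the mask: let setstate() raise ios_base::failure
    if ( (exception_mask & ios_base::failbit) && !(exception_mask & ios_base::badbit) )
    {
      is.setstate(err);
    }
    // badbit is in the mask: set the state silently, rethrow bad_alloc
    else if (exception_mask & ios_base::badbit)
    {
      try { is.setstate(err); }
      catch( ios_base::failure& ) { }
      throw;
    }
  }
  catch(...)
  {
    err |= ios_base::failbit;
    ios_base::iostate exception_mask = is.exceptions();
    // a badbit error takes precedence: let setstate() raise ios_base::failure
    if ( (exception_mask & ios_base::badbit) && (err & ios_base::badbit) )
    {
      is.setstate(err);
    }
    // failbit is in the mask: set the state silently, rethrow the original
    else if (exception_mask & ios_base::failbit)
    {
      try { is.setstate(err); }
      catch( ios_base::failure& ) { }
      throw;
    }
  }
  if ( err ) is.setstate(err);
  return is;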
}
The inserter:
template<class charT, class Traits>
basic_ostream<charT, Traits>&
operator<< (basic_ostream<charT, Traits >& os, const date& dat)
{
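  if (!os.good()) return os;
  ios_base::iostate err = ios_base::goodbit;
  try
  {
    typename basic_ostream<charT,Traits>::sentry opfx(os);
    if (opfx)
    {
      // format the date into a temporary buffer first
      basic_stringbuf<charT,Traits> sb;
      ostreambuf_iterator<charT,Traits> tmp(&sb);
      tmp = use_facet<time_put<charT, ostreambuf_iterator<charT,Traits> > >
        (os.getloc()).put(tmp, os, os.fill(), &dat.tm_date, 'x'); // 'x': the locale's date format
      if (tmp.failed())
        err = ios_base::failbit;
      // then copy the text to the stream, padded to the field width
      basic_string<charT,Traits> s = sb.str();
      streamsize charToPad = os.width() - static_cast<streamsize>(s.length());
      ostreambuf_iterator<charT,Traits> sink(os);
      if (charToPad <= 0)
      {
        sink = copy(s.begin(), s.end(), sink);
      }
      else
      {
        if (os.flags() & ios_base::left)
        {
          sink = copy(s.begin(), s.end(), sink);
          sink = fill_n(sink, charToPad, os.fill());
        }
        else
        {
          sink = fill_n(sink, charToPad, os.fill());
          sink = copy(s.begin(), s.end(), sink);
        }
      }
      if (sink.failed())
        err = ios_base::failbit;
    }
    os.width(0);
  }
  // error handling
  catch(bad_alloc& )
  {
    err |= ios_base::badbit;
    ios_base::iostate exception_mask = os.exceptions();
    // only failbit is in the mask: let setstate() raise ios_base::failure
    if ( (exception_mask & ios_base::failbit) && !(exception_mask & ios_base::badbit) )
    {
      os.setstate(err);
    }
    // badbit is in the mask: set the state silently, rethrow bad_alloc
    else if (exception_mask & ios_base::badbit)
    {
      try { os.setstate(err); }
      catch( ios_base::failure& ) { }
      throw;
    }
  }
  catch(...)
  {
    err |= ios_base::failbit;
    ios_base::iostate exception_mask = os.exceptions();
    // a badbit error takes precedence: let setstate() raise ios_base::failure
    if ( (exception_mask & ios_base::badbit) && (err & ios_base::badbit) )
    {
      os.setstate(err);
    }
    // failbit is in the mask: set the state silently, rethrow the original
    else if (exception_mask & ios_base::failbit)
    {
      try { os.setstate(err); }
      catch( ios_base::failure& ) { }
      throw;
    }
  }
  if ( err ) os.setstate(err);
  return os;
}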
value, and automatically raises an ios_base::failure exception if the exception mask
requires it.
• The exception mask is checked in order to find out whether an exception must be
thrown.
• The stream state must be set and, if required, an exception must be raised.
In our example, both the inserter and the extractor handle caught exceptions exactly
the same way. They both have two catch clauses: We qualify memory shortage indicated
by a bad_alloc exception as a loss of the stream’s integrity (badbit) and all other
exceptions as failure of the operation (failbit). The respective error flag is added to the
local stream state object.
By examining the exception mask and the local stream state object, we determine
whether an exception must be raised at all and if so, which exception it must be: either
ios_base::failure or the originally caught exception.
Due to the accumulation of errors, it can happen that both badbit and failbit
are set in the local stream state object. If both flags are also set in the exception mask, we
have to raise two exceptions, so to speak. As only one exception can be thrown, we decide
to throw the exception that belongs to the badbit, because we consider a badbit situation
the more severe error situation. The exception associated with the badbit is
bad_alloc, if such an exception was caught, or ios_base::failure otherwise.
Depending on the exception mask and the local stream state object, the following
actions are taken:
The setstate() function is called with the accumulated local stream state as an
argument. This call sets the stream state and raises an ios_base::failure exception.
The setstate() function is called in a try block with an empty corresponding
catch block. As a result, the stream state is set to the accumulated local stream state and
the automatically raised ios_base::failure exception is caught and discarded. Then
the originally caught exception is rethrown.
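The two actions can be illustrated with a small, self-contained sketch. It is not the listing from this section: the helper name run_guarded and its template parameter Op are assumptions made for illustration, and the cross-checks between badbit and failbit described above are left out to keep the example short.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <ios>
#include <new>
#include <sstream>
#include <stdexcept>

// Hypothetical helper: runs op() and maps any exception to the stream state
// of str, following the simplified policy described above.
template <class Op>
void run_guarded(std::ios& str, Op op)
{
    std::ios_base::iostate err = std::ios_base::goodbit;
    try {
        op();
    }
    catch (const std::bad_alloc&) {
        err |= std::ios_base::badbit;
        if (str.exceptions() & std::ios_base::badbit) {
            // Set the state, discard the automatic ios_base::failure,
            // then rethrow the originally caught exception.
            try { str.setstate(err); }
            catch (const std::ios_base::failure&) {}
            throw;
        }
    }
    catch (...) {
        err |= std::ios_base::failbit;
        if (str.exceptions() & std::ios_base::failbit) {
            try { str.setstate(err); }
            catch (const std::ios_base::failure&) {}
            throw;
        }
    }
    // No rethrow was required: set the state; setstate() itself raises
    // ios_base::failure if the exception mask asks for it.
    if (err != std::ios_base::goodbit) str.setstate(err);
}
```

With an empty exception mask the call only sets failbit. With failbit in the mask the original exception, not ios_base::failure, reaches the caller, and the stream state is set as well.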
date eclipse(11,8,1999);
cout << "eclipse (default right adjustment): " << setw(10)
<< setfill('(') << eclipse << endl;
cout << "eclipse (left adjustment) : " << left << setw(10)
<< setfill('*') << eclipse << endl;
yields the following output, assuming that the standard output stream cout uses a U.S.
locale:
Second, the date handling abides by cultural conventions. The code snippet below
shows that our refined inserter and extractor are internationalized:
cout << "A date like Dec 2, 1978" << " is needed: ";
date d;
cin >> d;
cout << "This is the specified date in US notation: " << d << endl;
cout.imbue(locale("German"));
cout << "This is the specified date in German notation: " << d << endl;
The program can result in the following dialog, assuming that the standard output
stream cout initially uses a U.S. locale:
3.1 Input and Output of User-Defined Types 169
Third, we show that our refined inserter and extractor indicate extraction of an
invalid date by means of the stream state and the stream exceptions. Consider the follow-
ing program:
cout << "A date like Dec 2, 1978" << " is needed: ";
date d;
cin.exceptions(ios_base::badbit | ios_base::failbit);
try { cin >> d; }
catch (ios_base::failure&)
{ cerr << "date extraction failed" << endl; throw; }
cout << "This is the specified date in US notation: " << d << endl;
We deliberately set the exceptions mask so that badbit and failbit situations
must raise an exception. In particular, extraction of an invalid date should be detected and
indicated by means of an exception. The program might lead to the following dialog and
output:
We can see that the control flow passes through the exception handler for
ios_base::failure exceptions as a result of extracting the invalid date
Dec 32, 1999.
The user-defined type has two member functions, called print_on() and
get_from(), that represent the knowledge about the type-specific parsing and format-
ting of an object of that class.
Note that we did not make the shift operators themselves generic. This would have
required adding a third template parameter for the data type that is inserted or extracted.
Such templates would be instantiated for any kind of data type that does not have a shift
operator of its own. The potential danger is that those shift operator templates might be
accidentally instantiated for types that should actually have I/O operations of their own.
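To make the division of labor concrete, here is a minimal sketch of the approach: a generic function does the stream bookkeeping, the type supplies print_on(), and a non-generic shift operator forwards to the generic function. The type celsius is hypothetical, and the exception handling shown in the full listings is omitted here for brevity.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <ios>
#include <ostream>
#include <sstream>

// Simplified generic inserter: sentry, delegation to print_on(), width reset.
// The exception handling of the complete version is left out in this sketch.
template <class charT, class Traits, class Argument>
std::basic_ostream<charT, Traits>&
g_inserter(std::basic_ostream<charT, Traits>& os, const Argument& arg)
{
    typename std::basic_ostream<charT, Traits>::sentry opfx(os);
    if (opfx) {
        std::ios_base::iostate err = arg.print_on(os);
        os.width(0);
        if (err != std::ios_base::goodbit) os.setstate(err);
    }
    return os;
}

// Hypothetical user-defined type that knows how to format itself.
struct celsius {
    int degrees;

    template <class charT, class Traits>
    std::ios_base::iostate print_on(std::basic_ostream<charT, Traits>& os) const
    {
        os << degrees << os.widen('C');
        return os.good() ? std::ios_base::goodbit : std::ios_base::failbit;
    }
};

// The shift operator is written per type; only its body is generic.
template <class charT, class Traits>
std::basic_ostream<charT, Traits>&
operator<<(std::basic_ostream<charT, Traits>& os, const celsius& c)
{
    return g_inserter(os, c);
}
```

Because operator<< is declared for celsius only, the generic function is never picked up accidentally for types that have, or should have, an inserter of their own.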
The extractor:
template <class charT, class Traits, class Argument>
basic_istream<charT, Traits>& g_extractor
(basic_istream<charT, Traits>& is, Argument& arg)
{
ios_base::iostate err = 0;
try
{ typename basic_istream<charT,Traits>::sentry ipfx(is);
if (ipfx)
{ err = arg.get_from(is); }
}
catch(bad_alloc& )
{ err |= ios_base::badbit;
ios_base::iostate exception_mask = is.exceptions();
if ( (exception_mask & ios_base::failbit)
&& !(exception_mask & ios_base::badbit) )
{ is.setstate(err); }
else if (exception_mask & ios_base::badbit)
{ try { is.setstate(err); }
catch( ios_base::failure& ) {}
throw;
}
}
catch(...)
{ err |= ios_base::failbit;
ios_base::iostate exception_mask = is.exceptions();
if ( (exception_mask & ios_base::badbit)
&& (err & ios_base::badbit) )
{ is.setstate(err); }
else if(exception_mask & ios_base::failbit)
{ try { is.setstate(err); }
catch( ios_base::failure& ) { }
throw;
}
}
if ( err ) is.setstate(err);
return is;
}
The inserter:
template <class charT, class Traits, class Argument>
basic_ostream<charT, Traits>& g_inserter
(basic_ostream<charT, Traits>& os, const Argument& arg)
{
ios_base::iostate err = 0;
try
{ typename basic_ostream<charT,Traits>::sentry opfx(os);
if (opfx)
{ err = arg.print_on(os);
os.width(0);
}
}
catch(bad_alloc& )
{ err |= ios_base::badbit;
ios_base::iostate exception_mask = os.exceptions();
if ( (exception_mask & ios_base::failbit)
&& !(exception_mask & ios_base::badbit) )
{ os.setstate(err); }
else if (exception_mask & ios_base::badbit)
{ try { os.setstate(err); }
catch( ios_base::failure& ) { }
throw;
}
}
catch(...)
{ err |= ios_base::failbit;
ios_base::iostate exception_mask = os.exceptions();
if ( (exception_mask & ios_base::badbit)
&& (err & ios_base::badbit))
{ os.setstate(err); }
else if (exception_mask & ios_base::failbit)
{ try { os.setstate(err); }
catch( ios_base::failure& ) { }
throw;
}
}
if ( err ) os.setstate(err);
return os;
}
The listing below shows the date class, which is basically unchanged but now has
two additional public member functions: print_on() and get_from().
class date {
public:
date(int d, int m, int y);
date(const tm& t);
date();
bool operator!();
else
{ if (os.flags() & ios_base::left)
{ sink = copy(s.begin(), s.end(), sink);
sink = fill_n(sink,charToPad,os.fill());
}
else
{ sink = fill_n(sink,charToPad,os.fill());
sink = copy(s.begin(), s.end(), sink);
}
}
if (sink.failed())
err = ios_base::failbit;
}
return err;
}
private:
tm tm_date;
bool ok;
bool valid();
};
by means of string streams and according to inserters and extractors. Similarly, it might be
necessary to convert a wide-character text into a multibyte representation for storage in
the narrow-byte text field of a database. IOStreams can be used for this. In general, it
might be worth implementing user-defined inserters and extractors, because they extend
the existing parsing and formatting mechanisms in C++ in a natural and beneficial way.
The inserted objects setw(10) and endl are the manipulators. The manipulator
setw(10) sets the stream’s field width to 10; the manipulator endl inserts the end-of-line
character and flushes the output. As you can see, manipulators can take arguments or
be parameterless.
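As a small illustration of both kinds of manipulator, the following sketch formats a value into a string stream. The function name format_right_aligned is an assumption made for this example.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Uses one manipulator with an argument (setw) and one without (endl).
std::string format_right_aligned(int value)
{
    std::ostringstream os;
    os << std::setw(10) << value << std::endl;
    return os.str();
}
```

The field width applies only to the next formatted item, so the value is padded on the left to ten characters and the end-of-line character follows unpadded.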
Extensibility is a major advantage of IOStreams. We’ve seen in the previous section
how you can implement inserters and extractors for user-defined types that behave like
the built-in input and output operations. Similarly, you can add user-defined manipula-
tors that fit seamlessly into the IOStreams framework. In this section, we will explain how
to do this.
Manipulators are inserted and extracted via shift operators. To be inserted and
extracted in this way, there must be shift operators defined for each type of manipulator.
We will denote the type of manipulator object as manipT in the following text. The extrac-
tor for a manipulator of type manipT looks like this:
With this extractor defined, you can extract a manipulator object Manip of type
manipT by saying
For these four manipulator types, IOStreams already contains the required inserters
and extractors. The extractor for parameterless manipulators to input streams, for
instance, takes the following form:
It uses a function pointer of type (3) from the list above. Similarly, an inserter for
parameterless manipulators to output streams uses a function pointer of type (4) and is
already defined in IOStreams as
The inserters and extractors for function pointers of type (1) and (2) are also prede-
fined in IOStreams. They allow insertion and extraction of parameterless manipulators
that can be applied to input and output streams.
The list of manipulator types is not limited to the examples above. If you have cre-
ated your own user-defined stream types, you might want to use additional signatures as
parameterless manipulators.
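A user-defined parameterless manipulator for output streams can be sketched as a function with the same shape as endl. The name tab is hypothetical; the predefined inserter for function pointers does the rest.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <ostream>
#include <sstream>

// Hypothetical parameterless manipulator: inserts a horizontal tab.
template <class charT, class Traits>
std::basic_ostream<charT, Traits>&
tab(std::basic_ostream<charT, Traits>& os)
{
    return os.put(os.widen('\t'));
}
```

An expression such as os << tab selects the stream's member operator<< that takes a pointer to a function of this signature, which in turn calls tab(os).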
return os;
with endl as the actual argument for pf. In other words, cout << endl; is equal to
cout.operator<<(endl);
Here is another manipulator, boolalpha, that can be applied to input and output
streams. The manipulator boolalpha is a pointer to a function of type (1):
return strm;
};
The class type width is the manipulator type, i.e., it is a concrete example of the
type we previously denoted as manipT. A width manipulator can be used like this:
cout << width(5) << 0.1 << endl;
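For orientation, a manipulator class of this kind could look like the following minimal sketch. It is a reconstruction under assumptions, not the original listing: it covers only the inserter, and it checks the stream state before manipulating, as discussed later in this section.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <ios>
#include <ostream>
#include <sstream>

// Sketch of a manipulator type with one parameter.
class width {
public:
    explicit width(std::streamsize w) : w_(w) {}

    // The shift operator performs the manipulation directly.
    template <class charT, class Traits>
    friend std::basic_ostream<charT, Traits>&
    operator<<(std::basic_ostream<charT, Traits>& os, const width& w)
    {
        if (os.good()) os.width(w.w_);
        return os;
    }

private:
    std::streamsize w_;
};
```

The unnamed object width(5) exists only for the duration of the output expression; all it carries is the argument for the later call to the stream's width() function.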
6. The constructor of the width class is declared as explicit. If you are not familiar with explicit constructors,
see section G.3, Explicit Constructors, for an explanation of the explicit keyword in general.
class mendl {
public:
explicit mendl(unsigned int i) : i_(i) {}
private:
unsigned int i_;
In the next section we show you a different implementation technique that pays off
if you implement several manipulators with the same number of parameters, but with
different types of arguments.
3.2.2.2 Generalized Technique: Using a Manipulator Base Template
Manipulators with a parameter vary with respect to the type of their parameter and their
respective functionality. We now build a framework that abstracts from those two proper-
ties and generally eases the implementation of manipulators with arbitrary functionality
and parameters of arbitrary type. Again, we will discuss only the case of one parameter.
The key idea is that the manipulation must not be hard-coded into the shift opera-
tors, but factored out into an associated function. A pointer to this function is provided to
the constructor of the manipulator type and stored as a data member of the manipulator
object for subsequent invocation in the shift operators.
The manipulator type, which was a class type in the straightforward solution, now
becomes a class template that takes the types of the manipulator arguments as template
parameters. Concrete manipulator types are derived from this base class template.
3.2 User-Defined Manipulators 181
private:
manipFct pf_;
const Argument arg_;
The core of the shift operators is invocation of the associated manipulator function
with the manipulator arguments. Using the base manipulator template one_arg_manip,
we could reimplement the width manipulator from the previous example as follows:
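The following is a minimal sketch of such a base template and a width manipulator derived from it. It follows the description above (function pointer plus stored argument), but the exact member layout, the inserter-only interface, and the name fct for the associated function are assumptions.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <ios>
#include <ostream>
#include <sstream>

// Base template: stores the associated function and the argument.
template <class Argument>
class one_arg_manip {
public:
    typedef void (*manipFct)(std::ios_base&, Argument);

    one_arg_manip(manipFct pf, const Argument& arg) : pf_(pf), arg_(arg) {}

    // The shift operator only invokes the associated function.
    template <class charT, class Traits>
    friend std::basic_ostream<charT, Traits>&
    operator<<(std::basic_ostream<charT, Traits>& os, const one_arg_manip& m)
    {
        if (os.good()) m.pf_(os, m.arg_);
        return os;
    }

private:
    manipFct pf_;
    const Argument arg_;
};

// Concrete manipulator: supplies the associated function to the base.
class width : public one_arg_manip<int> {
public:
    explicit width(int w) : one_arg_manip<int>(width::fct, w) {}

private:
    static void fct(std::ios_base& str, int w) { str.width(w); }
};
```

The inserter is found through the base class, so a new manipulator with an int argument needs only a constructor and an associated function.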
The problem is that member functions like put() and flush() are defined in
basic_ostream and are not accessible via an ios_base& argument. For them to be
accessible, we would need an associated manipulator function like this:
Note that fct() becomes a function template, because the stream it operates on is
templatized. Inevitably, the manipulator type mendl and the manipulator base class
one_arg_manip become templates, too. Here is the correct definition of the manipulator
class template, assuming that the function signature in one_arg_manip is changed as
necessary:
Now that mendl is a template, the manipulator expression is less convenient than it
used to be. Each time we manipulate a stream by inserting a mendl object, we need to
know the character and traits type of that stream. Instead of simply saying
We can make the calls more convenient by defining typedefs for each of the instantiations
of mendl. Then we have different manipulators for each type of stream. It’s an
improvement because we need not know the character and traits type of the stream.
Instead we have to figure out which manipulator type is the right one to be used with a
particular stream:
Another idea for making the manipulator calls more convenient relies on automatic
function template argument deduction: The compiler is capable of deducing function
template arguments from the actual arguments provided to a function call. We can use
this language feature to have the compiler figure out the character and traits type of a
stream and the respective manipulator: We wrap the construction of a manipulator object
into a function that we call mendl and rename the manipulator type to basic_mendl. In
other words, we add the following function template:
The manipulator expression would now be a call to the wrapper function instead of
the construction of an unnamed object of the manipulator type. We would use the mendl
function like this:
The downside is that we have to repeat the stream object redundantly in the manip-
ulator expression.
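The wrapper technique can be sketched as follows. This is an illustration under assumptions, not the original listing: it presumes that mendl inserts the given number of end-of-line characters and then flushes, and it puts the manipulation directly into the shift operator instead of using the base template.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <ostream>
#include <sstream>

// Manipulator type, parameterized on the stream's character and traits type.
template <class charT, class Traits>
class basic_mendl {
public:
    explicit basic_mendl(unsigned int n) : n_(n) {}
    unsigned int count() const { return n_; }
private:
    unsigned int n_;
};

template <class charT, class Traits>
std::basic_ostream<charT, Traits>&
operator<<(std::basic_ostream<charT, Traits>& os,
           const basic_mendl<charT, Traits>& m)
{
    for (unsigned int i = 0; i < m.count(); ++i)
        os.put(os.widen('\n'));
    os.flush();
    return os;
}

// Wrapper function: the stream argument is used only to let the compiler
// deduce charT and Traits.
template <class charT, class Traits>
basic_mendl<charT, Traits>
mendl(const std::basic_ostream<charT, Traits>&, unsigned int n)
{
    return basic_mendl<charT, Traits>(n);
}
```

A call then reads os << mendl(os, 2), which shows the redundancy mentioned above: the stream appears once as the target of the insertion and once as the deduction helper.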
manipulator type itself has to be a class template. The example of mendl demonstrated
related drawbacks and corresponding solutions.
An alternative for implementing manipulators for derived stream classes is the
straightforward solution, i.e., putting all the manipulator functionality directly into a shift
operator for the derived stream type.
whereas input and output operations have no effect when the stream is not in a good
state. The standard manipulators, for instance, reflect these differences. They do not
actively check the stream state but simply call one or more stream member functions and
exhibit exactly the same behavior. The standard manipulators setw and endl illustrate
the difference: The setw manipulator boils down to a call to the width() member function,
which changes the width setting regardless of the stream state. Consequently, setw
always has an effect. The endl manipulator behaves differently: It inserts the end-of-line
character and calls the flush() member function; both activities have no effect on
streams that are not in a good state.
We feel that user-defined manipulators are easier to understand and use if they
behave consistently. For this reason we decided to check the stream state in the shift oper-
ator before invoking the associated manipulator function in the example presented in the
previous section.
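The difference between setw and endl on a stream that is not in a good state can be observed with a short experiment. The struct state_probe and the function probe_failed_stream are names invented for this sketch.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <iomanip>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record of what was observed on a failed stream.
struct state_probe {
    std::streamsize width_after_setw;
    std::string     text_after_endl;
};

state_probe probe_failed_stream()
{
    std::ostringstream os;
    os.setstate(std::ios_base::failbit);   // stream is no longer good

    state_probe result;
    os << std::setw(7);                    // still changes the width setting
    result.width_after_setw = os.width();
    os << std::endl;                       // inserts nothing into a failed stream
    result.text_after_endl = os.str();
    return result;
}
```

A user-defined manipulator that checks os.good() first, as in the previous section, behaves like endl in this respect and leaves a failed stream untouched.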
3.2.2.4 Refinements
The manipulations performed by the previously presented manipulators are relatively
simple: usually they consist of the invocation of one or two stream member functions. But
manipulators can be much more sophisticated and powerful. In the following two sec-
tions we show you some useful examples of extensions: (1) manipulations with error han-
dling, and (2) manipulators that maintain state between subsequent manipulations.
We demonstrate how these refinements can be built into the manipulator base class.
The ideas behind these extensions are equally relevant for manipulators without a base
template.
3.2.2.4.1 Manipulator Base Template with Error Handling
During the manipulation of a stream, error situations can occur. Section 3.3, Extend-
ing Stream Functionality, gives a practical example; it contains a manipulator that has to
create a string by calling operator new(). Naturally, this memory allocation can fail,
and such a failure needs to be handled by the manipulator. Section 3.1.3, Refined Inserters
and Extractors, lists conventions for error reporting, which operations in IOStreams
should conform to. I/O operations must not propagate exceptions unless the exception
mask permits it, and the stream state must be set according to the errors that occurred.
These rules are equally relevant for user-defined manipulators.
Because of their simplicity, the standard manipulators have no need for error han-
dling. All they do is call one or two stream member functions, and each of these functions
already handles errors in a way that matches the IOStreams conventions. Consequently,
the standard manipulators can fully rely on the invoked functions for error handling.
User-defined manipulators, however, might perform any kind of operation and must
handle errors if necessary.
We show how to build the necessary error-handling logic into a manipulator base
template and a wrapper for the associated manipulator function.
The base idea is to catch all exceptions that are raised during a manipulation and to
accumulate information about the respective error situations in a data member of the
manipulator. After completion of the manipulation the accumulated error state is used for
adjusting the stream state and raising an exception according to the exception mask.
For the purpose of catching exceptions during the manipulation, we wrap the asso-
ciated manipulator function into a wrapper function that we call do_manip(). The
wrapper is responsible for
• invoking the associated manipulator function
• catching all exceptions that are propagated from this function
• adjusting the stream’s state and raising an exception in accordance with the exception
mask
In this way the associated manipulator function itself need not worry much about
error handling: If no cleanup is necessary, it can simply propagate all exceptions. Other-
wise it catches exceptions, does the cleanup, and rethrows the exception. In any case, it
need not worry about adjusting the stream state and raising an exception according to the
exception mask.
Here is the proposed solution in detail. Let us start with the manipulator type. As
before, a concrete manipulator type is supposed to be derived from a manipulator base
template. The manipulator object must now maintain a data member for accumulating
error information. This data member, along with access functions, is added to the manip-
ulator base template:
protected:
void setFail() { error_ |= ios_base::failbit; }
void setBad() { error_ |= ios_base::badbit; }
private:
manipFct pf_;
const Argument arg_;
ios_base::iostate error_;
The wrapper function itself is shown below. It first checks the manipulator’s error
variable in order to find out whether there had been a previously detected error situation,
encountered, for instance, during the construction of the manipulator object. If so, it han-
dles the error. Otherwise, the associated manipulator function is invoked, and all excep-
tions propagated from this function are caught and handled. The error indication policy is
exactly the same as in section 3.1.3.4, Error Indication, for user-defined inserters and
extractors:
{
if (oamw.error_ != ios_base::goodbit)
{ str.setstate(oamw.error_); }
else
{ ios_base::iostate err = oamw.error_;
if ( err ) str.setstate(err);
An example of a concrete manipulator type that uses the base template with excep-
tion handling is presented in section 3.3, Extending Stream Functionality.
3.2.2.4.2 Manipulators with State
All manipulators with parameters that we have considered so far have been state-
less objects that are created, used only once, and instantly discarded.
We talk of manipulators with state when the manipulation depends on information
that is maintained between subsequent manipulations. A manipulator object with state
is created, stays permanent, and is used in multiple ways, memorizing information
between uses. As an example, we extend the width manipulator discussed in section
3.2.2.2, Generalized Technique: Using a Manipulator Base Template, so that it restricts
acceptable width values to a certain interval. The bounds of this interval are provided
when the manipulator is created and all subsequent uses of the manipulator perform a
bounds check for the current width value. Here is how the bounded width manipulator
would be used:
You can see that the stateless width manipulator is created as an unnamed object of
type width each time it is used. In contrast, the width manipulator with state is created as
a permanent manipulator object width_2_6 and is used several times. The manipulator
expression width_2_6(i) is a function call. In our implementation, the manipulator
type bounded_width has a function call operator for this purpose. The function call
operator takes the manipulator argument, performs the bounds check, and calculates the
effective width value. Here is the manipulator type bounded_width:
The manipulator base template one_arg_manip is basically the one used in section
3.2.2.2, Generalized Technique: Using a Manipulator Base Template, with one minor mod-
ification: The manipulator argument was originally stored as a private constant data
member and now has to be nonconstant and protected because it is altered by the function
call operator.
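A manipulator with state can be sketched without the base template as follows. The clamping rule (values outside the interval are moved to the nearest bound) is an assumption about how the effective width is calculated, and the sketch supports insertion only.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <ostream>
#include <sstream>

// Sketch of a stateful manipulator: the bounds are fixed at construction,
// the effective width is recomputed by each function call.
class bounded_width {
public:
    bounded_width(int lower, int upper)
        : lower_(lower), upper_(upper), width_(lower) {}

    // Function call operator: bounds check, then remember the result.
    bounded_width& operator()(int requested)
    {
        width_ = requested < lower_ ? lower_
               : requested > upper_ ? upper_
               : requested;
        return *this;
    }

    template <class charT, class Traits>
    friend std::basic_ostream<charT, Traits>&
    operator<<(std::basic_ostream<charT, Traits>& os, const bounded_width& bw)
    {
        if (os.good()) os.width(bw.width_);
        return os;
    }

private:
    int lower_, upper_, width_;
};
```

Because every call modifies the same object, it is safer to use such a manipulator at most once per output statement: before C++17 the order in which the operands of a chained << expression are evaluated is not fully specified.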
3.3 Extending Stream Functionality 191
sions to the IOStreams framework, and how to map the presented sample solution to
your specific problem.
The sample implementations suggested in the following sections are based on
implementations that were discussed in previous chapters. You might consider rereading
the respective sections for better understanding of this chapter. Specifically, we suggest
reading the following:
section 3.1, Input and Output of User-Defined Types, for understanding the
implementation of the date inserter (which we will use in our example) and in
particular, subsection 3.1.4, Generic Inserters and Extractors.
section 3.2.2.4.1, Manipulator Base Template with Error Handling, for under-
standing the implementation of a manipulator in the example
If you are not familiar with iword/pword, xalloc, and stream callbacks, you might
consider reading section 2.5, Additional Stream Storage and Stream Callbacks, where the
underlying concepts are described.
xalloc(), which is a static member function in class ios_base that makes provision for
acquisition of additional memory for an iword/pword pair and returns a valid index as a
handle to it.
ADDING FUNCTIONALITY. New functionality that uses the user-defined stream attrib-
utes is implemented outside the stream, because it cannot be added to the existing stream
classes themselves. Such functionality typically, but not necessarily, has an obvious rela-
tionship to IOStreams and is often implemented as inserters, extractors, or manipulators.
If new functionality related to the stream’s destruction or invocation of its member
functions copyfmt() or imbue() is needed, the stream’s callback mechanism is used to
add such functionality. The details of the callback mechanism are explained in section
2.5.2, Stream Callbacks. Here is a summary: Callback functions must be registered per
stream and are later automatically invoked when the stream’s locale is replaced, when the
stream’s copyfmt() is called, and when the stream is destroyed. Each callback function
is registered together with an index to the iword/pword fields. This index is later pro-
vided to the function when it is invoked and enables the function to use and manipulate
the respective iword/pword field. The callback function is also provided with an
ios_base& that gives access to the stream’s error state and other base class features. It
also receives information about the type of event for which it is invoked.
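The summary above can be tried out with a callback that merely records the events it receives. The names event_log, log_event, and exercise_callbacks are invented for this sketch; register_callback(), xalloc(), and the event enumerators are the standard ios_base interface.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <ios>
#include <locale>
#include <sstream>
#include <vector>

// Hypothetical global log of the events seen by the callback.
std::vector<std::ios_base::event>& event_log()
{
    static std::vector<std::ios_base::event> log;
    return log;
}

// Callback with the signature required by ios_base::register_callback().
// The index argument would normally select the iword/pword field to work on.
void log_event(std::ios_base::event ev, std::ios_base&, int)
{
    event_log().push_back(ev);
}

// Registers the callback, replaces the stream's locale, destroys the stream.
void exercise_callbacks()
{
    event_log().clear();
    event_log().reserve(8);   // callbacks must not throw, so allocate up front
    {
        std::ostringstream os;
        os.register_callback(log_event, std::ios_base::xalloc());
        os.imbue(std::locale::classic());
    }                         // stream destroyed here
}
```

After exercise_callbacks() has run, the log holds an imbue_event followed by an erase_event, in that order.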
date birthday(8,2,1979);
cout << setfmt("%x") << birthday << endl;
setfmt would be a manipulator that allows modification of the format information. The
manipulator will accept a format specification for a date as an argument. Such format
specifications are already used in the standard library, namely, in conjunction with the
time-formatting facets that are contained in each locale object. The format specification is
a C-style string, such as "%x" in the example above. (For further details about these format
strings, see appendix C, strftime() Conversion Specifiers.)
A manipulator that receives arguments typically stores some information in the
stream object, so that any subsequent output operation can access and interpret it and
adjust its formatting accordingly. In our example, the manipulator setfmt must store the
date format specification in the stream object, and the inserter for date objects must read
the format specification and put date objects into the appropriate format before actual
output.
Figure 3-1: Both the manipulator and the date inserter use the same iword/pword field.
Note that use of heap memory adds an additional issue: We must make sure that the
allocated heap memory is deallocated correctly, so that no memory leaks are created.
When the stream object is destroyed, either because it goes out of scope or because it is
explicitly deleted, the stream’s destructor properly deletes all of the stream’s data mem-
bers. In particular, the stream’s destructor will release the pword entry, but not the string
object on the heap the pword entry refers to. Without any further measures, a memory
leak is the result. To eliminate the memory leak, we will use the callback mechanism. We
will register a function that deletes the format string on the heap when the stream object is
destroyed.
int getIdx()
{
static const int myIdx = ios_base::xalloc();
return myIdx;
}
{
void *p;
basic_string<charT> patt;
if ((p = os.pword(getIdx())) != 0)
patt = *(static_cast <basic_string<charT> *> (p));
else
patt = defaultPatt(os);
if (os.good() &&
!use_facet<time_put<charT, ostreambuf_iterator<charT,Traits> > >
(os.getloc()).put(os,os,os.fill(),&tm_date,patt.c_str()
,patt.c_str()+patt.length())
.failed()
)
return ios_base::goodbit;
else
return ios_base::failbit;
return helper;
This function is a template, because there are different types of date format strings.
The background is that date format strings are passed to the time_put facet of the
stream’s locale, as you could see in the implementation of the print_on() function. For
instance, wide-character streams have a time_put facet that requires wide-character
strings as a date format string. For that reason, the date format strings are of different
types depending on the character type of the stream.
Our idea for the default date format string is to provide "%x" as the default. If you
study the time_put facet in greater detail (see section 6.2.3.1, The time_put Facet), you
will find that it takes the format string and narrows it, using the ctype facet’s narrow()
function. Here we have to do the opposite: widen the narrow character string "%x" in
order to provide the result as the default date format string of the required character type.
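The widening step can be sketched with the ctype facet's range version of widen(). The function name default_date_pattern and its locale parameter are assumptions; the function used in this section takes the stream instead.

```cpp
#include <cassert>    // only needed if the sketch is checked with assert()
#include <locale>
#include <string>

// Returns "%x" converted to the requested character type.
template <class charT>
std::basic_string<charT> default_date_pattern(const std::locale& loc)
{
    const char narrow[] = "%x";
    const std::size_t len = sizeof narrow - 1;   // without the terminating '\0'

    std::basic_string<charT> result(len, charT());
    std::use_facet<std::ctype<charT> >(loc)
        .widen(narrow, narrow + len, &result[0]);
    return result;
}
```

For a narrow-character stream the conversion is trivial; for a wide-character stream it yields the wide string that the time_put facet expects as its pattern.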
In this technique, the really interesting part of the manipulator is the helper object
type setfmt_helper. It’s a typical manipulator class as described in section 3.2.2,
Manipulators with Parameters. It’s derived from the base class one_arg_manip_weh
that was suggested in section 3.2.2.4.1, Manipulator Base Template with Error Handling.
We do this because the manipulation (i.e., storing the format string in the stream) can fail,
and we aim to indicate that failure by setting bits in the stream state and/or raising excep-
tions. The manipulator base class one_arg_manip_weh provides a foundation for this
error handling.
The core of the class template setfmt_helper is the static function
setfmt_fct(), which performs the actual manipulation. It creates the copy of the format
string on the heap, deletes any previously created format string object, and stores the
address of the new format string object in the pword field. Here is the implementation:
public:
setfmt_helper(const charT* fmt)
: one_arg_manip_weh<const charT*>(setfmt_fct,fmt) {}
private:
static void setfmt_fct(ios_base& str, const charT* fmt)
{
void*& formatStringPtr = str.pword(getIdx());
basic_string<charT>* newFormatStringPtr = new basic_string<charT>(fmt);
basic_string<charT>* oldFormatStringPtr
= static_cast<basic_string<charT>*> (formatStringPtr);
formatStringPtr = newFormatStringPtr;
delete oldFormatStringPtr;
The call to pword() as well as operator new and operator delete can raise excep-
tions. Let us scrutinize the problematic cases:
If the memory allocation via operator new fails, a dangling pointer will be left in the
pword field. The pointer will point to the already deleted old format string object. This is
certainly a problem. If the second call to pword() fails, the new string object on the heap
will already be created, but its address cannot be stored in the pword field as intended. As
a result, the allocated string object can no longer be deleted, and a memory leak will
be left.
With the use of additional local variables and the way the statements are arranged in
the suggested implementation, the pword entry is kept consistent, and neither the dan-
gling pointer nor the memory leak can occur. The deletion of the old format string is post-
poned until the very end, after the new format string has successfully been allocated and
stored in the pword field.
Note that exceptions that are raised by any of the invoked operations are propa-
gated to the caller and handled there. The caller in this case is the inserter defined for the
manipulator base class one_arg_manip_weh that was suggested in section 3.2.2.4.1,
Manipulator Base Template with Error Handling. This inserter is designed so that it maps
any exceptions to the usual error indication for I/O operations, i.e., stream state bits
and/or propagation of the exception depending on the stream’s exception mask.
Another issue to be considered in this context is the validity of the reference
returned by pword(). Such a reference is not guaranteed to be valid forever. The stan-
dard specifies:
The reference returned (by a call to iword() /pword()) may become invalid after
another call to the object’s iword/pword member with a different index, after a call
to copyfmt, or when the object is destroyed.
STREAM DESTRUCTION
What happens when the stream object is destroyed, either because it goes out of scope or
because it is explicitly deleted? As expected, the stream’s destructor provides proper
resource management for all of the stream’s data members. In particular, the stream’s
destructor will release the pword entry itself, but not the string object on the heap the
pword entry refers to. Without any further measures, a memory leak is inevitable.
To address this issue, callback functions can be registered for iword/pword entries.⁷
These callback functions are triggered under certain conditions, one of which is the
destruction of the stream object. The details of callbacks in IOStreams can be found in sec-
tion 2.5.2, Stream Callbacks. We will be using the callback mechanism here in our example
to establish proper resource management for the format string in order to avoid the mem-
ory leak. Before we do that, let us see whether there are other resource management
problems.
INVOCATION OF copyfmt()
It turns out that destruction of the stream object is not the only situation in which a memory
leak is likely to occur. Another problematic situation is invocation of the copyfmt()
function. When the stream’s copyfmt() function is invoked, data members from one
stream are assigned to the stream it is invoked on. The details of copyfmt() are
described in section 2.1.3, Copying and Assignment of Streams. Among the data members
assigned by copyfmt() are the iword/pword entries.
In our example, the effect is that the stream’s pword field, which holds the pointer to
the format string, is overwritten by another stream’s pword entry, which points to
another format string. One problem here is that the old format string is neither released
nor accessible any longer, because the pointer stored in the pword field was overwritten.
The result is a memory leak. Another problem is that both streams involved in the call to
copyfmt() refer to the same date format string on the heap afterwards. Imagine that one
stream object receives a new date format string from the setfmt manipulator. The other
stream object will not be affected by this change and will still be referring to the previous
date format string, which both stream objects used to share. Unfortunately, the stream
that receives a new date format string will release the old one and will therefore delete the
7. The old, classic IOStreams library did not support callback functions, and for that reason there was no solution
to this problem at all.
3.3 Extending Stream Functionality 201
shared date format string. The result is that the other stream will be referring to the previ-
ous date format string that is now deleted, and we have a dangling pointer problem.
Again, the callback mechanism can be used to solve the problem. Registered call-
back functions are triggered not only when the stream is destroyed but also when the
stream’s copyfmt() function is invoked. The case of a copyfmt() event is a little more
complicated than the case of stream destruction, because the callback functions are trig-
gered twice:
First, the callback functions are called before any stream data members are assigned,
in order to allow proper release of any resources that will be replaced by the subsequent
assignments performed by copyfmt(). In our example, the respective callback function
must delete the old format string that the pword entry refers to.
The callbacks are called a second time, after all stream data members (except the
exception mask, but including iword/pword and the callback function pointers) have
been assigned. We will take advantage of this second invocation to have the callback func-
tion take care of duplicating the format string object on the heap, so that both streams
involved in copyfmt() will refer to copies of their own afterwards.
How do we distinguish between the two invocations? A callback is always provided
with an argument that describes the event that led to its invocation. On destruction, the
event is ios_base::erase_event; in case of copyfmt(), the event is
ios_base::copyfmt_event. The first invocation of a callback function during copyfmt() is called
with an erase_event argument, much as it would be called on destruction of the
stream. The reason for this is that the callback has to deal with the exact same issues,
namely, proper release of any resources. The second time the callback functions are called
with a copyfmt_event argument.
Before we delve into the implementation of the callback function, let us recall what it
is supposed to do. When it is invoked during destruction or at the beginning of
copyfmt(), it will delete the format string object that the pword entry refers to. When
the callback function is invoked at the end of copyfmt(), it will duplicate the date for-
mat string and set the pword entry so that it refers to the newly created copy of the date
format string.
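The behavior just described can be sketched as follows. This is a minimal sketch for narrow-character streams only, not the listing from this chapter: the names fmtIndex(), fmtCallback(), setFormat(), and getFormat() are hypothetical, and setFormat() merely stands in for the setfmt manipulator. On a failed duplication the sketch simply resets the pword entry; error reporting is left out.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical accessor: acquires one iword/pword index on first use.
int fmtIndex()
{
    static const int idx = std::ios_base::xalloc();
    return idx;
}

// Stream callback that owns the std::string referred to by pword(i).
void fmtCallback(std::ios_base::event ev, std::ios_base& str, int i)
{
    void*& slot = str.pword(i);
    if (ev == std::ios_base::erase_event) {
        // Destruction, or first phase of copyfmt(): release our string.
        delete static_cast<std::string*>(slot);
        slot = 0;
    } else if (ev == std::ios_base::copyfmt_event) {
        // Second phase of copyfmt(): slot now holds the other stream's
        // pointer, so replace it with a private copy.
        if (slot != 0) {
            try {
                slot = new std::string(*static_cast<std::string*>(slot));
            } catch (...) {
                slot = 0;  // callbacks must not throw; drop the format
            }
        }
    }
}

// Hypothetical setter standing in for the setfmt manipulator.
void setFormat(std::ios_base& str, const std::string& f)
{
    std::string* fresh = new std::string(f);
    void*& slot = str.pword(fmtIndex());
    if (slot == 0)
        str.register_callback(fmtCallback, fmtIndex());  // first use
    else
        delete static_cast<std::string*>(slot);          // replace old string
    slot = fresh;
}

// Returns the stream's format string, or 0 if none has been set.
const std::string* getFormat(std::ios_base& str)
{
    return static_cast<const std::string*>(str.pword(fmtIndex()));
}
```

After b.copyfmt(a), both streams hold equal but distinct strings, so a later setFormat() on one stream no longer invalidates the other stream's pointer.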
The reference to the stream that the callback function will operate on is passed to the
function as a reference of type ios_base&. The stream base class ios_base does not
give access to the stream state, because all members related to the stream state are defined
in class basic_ios, which is derived from class ios_base. As a result, there is no access
to the stream state via an ios_base reference, and therefore no way to indicate the failure
of a callback function by setting the stream state flags.
It turns out that there is no direct way in which a callback function can indicate that
it failed. For the time being, we will simply suppress all potential exceptions. In section
3.3.1.5, Error Indication of Stream Callback Functions, at the end of this chapter we will
return to this issue and suggest a viable way for error indication from a stream callback
function.
if (formatStringPtr == 0)
str.register_callback(callback, getIdx());
8. If a callback function is registered that violates this requirement, the behavior of any IOStreams operations
that are subsequently invoked is undefined.
tion mechanism must not be used for error indication, we must find another way of indi-
cating failure. Callback functions are required to have the following signature:
void (*event_callback) (ios_base::event, ios_base&, int index);
As you can see, callback functions do not have a return code. Consequently, we can-
not “return” the error. As the callback function receives a reference to the stream in the
form of an ios_base reference, one could consider storing the error in the stream’s state.
Section 3.3.1.4, Using Stream Callbacks for Memory Management, already explained that
the callback function cannot set failbit or badbit in the stream’s state either, because the
stream base class ios_base does not give access to the stream state. As the callback func-
tion cannot store the error anywhere in the stream, it must store it elsewhere.
A solution is to store the error indication in the not-yet-used iword field. Remember,
callback functions are registered per iword/pword index, and each index gives access to
an iword/pword pair. In our example, we have been using only the pword entry; we store
a pointer to the format string in the pword field. The corresponding iword field has not
been used so far. We will now use it for error indication of a callback function. The call-
back function will store the error information in the iword field, and afterwards the user
must check the error indication stored in the iword field. This check need not be done by
directly accessing the iword field. The idea is to map the information contained in the
iword field to the stream state, so that the error information can then be accessed in the
usual way (via good(), bad(), etc., or in the form of an exception raised if the exceptions
mask requires it).
In the following sections we will complete our example of formatted output of date
objects as outlined above. We will then show you a more general approach for error indi-
cation of callbacks that also works when more than one callback function is registered.
3.3.1.5.1 Extending the Example
If an error occurs in the callback function, a corresponding error indication is stored
in the iword field. As described above, this error information is mapped to the stream
state in a second step. For this reason, the error information is expressed in terms of an
object of type ios_base::iostate, because this is the type of the stream state data
member of the stream classes; it is a bitmask type that can hold the stream state flags
failbit, badbit, and eofbit.
Below is the extended version of the callback function. We catch the bad_alloc
exception and store a badbit in the iword field.
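template <class charT>
void setfmt_helper<charT>::callback(ios_base::event ev, ios_base& str, int i)
{
  if (ev == ios_base::erase_event)
  {
    try { delete static_cast<basic_string<charT> *> (str.pword(i)); }
    catch(...) { }
  }
  else if (ev == ios_base::copyfmt_event) // branch reconstructed from the description above
  {
    void*& p = str.pword(i);
    basic_string<charT>* old = static_cast<basic_string<charT> *> (p);
    try { if (old != 0) p = new basic_string<charT>(*old); }
    catch(bad_alloc&) { p = 0; str.iword(i) = ios_base::badbit; }
  }
}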
206 Advanced IOStreams Usage
Note that we did not do anything about the exception that might be raised in case
the callback function was invoked due to an erase_event. These situations are either
destruction of the stream or the beginning of a copyfmt() call. In the case of stream
destruction, it would be futile to store any information in the iword field; the error infor-
mation cannot be retrieved afterwards anyway, because the entire stream has disap-
peared. In the case of a copyfmt() call, the iword field will be overwritten by the
subsequent assignment of stream data members performed by copyfmt(). Again, it
would be futile to store any information in the iword field; it will be overwritten before it
can be checked.
While the callback function takes care of storing error information in the iword field,
it remains the user’s responsibility to fetch the error information from the iword field and
transfer it to the stream state. To make this more convenient, we provide the function
copyfmtErr() for this purpose. It must be called by the user after any invocation of the
copyfmt() function. Here is the implementation of the copyfmtErr() function:
Note that copyfmtErr() can trigger an exception. This is a result of the call to
setstate(): It throws an exception if the stream’s exception mask is configured to react
in this way to the newly set stream state.
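A function of this kind might look roughly like the following. It is only a sketch under stated assumptions: the signature is inferred from the way copyfmtErr() is called in this section, getIdx() is assumed to return the index of the iword/pword pair used by the extension, and resetting the iword field after reading it is a design choice of the sketch.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Assumed accessor for the iword/pword index used by the extension.
int getIdx()
{
    static const int idx = std::ios_base::xalloc();
    return idx;
}

// Moves an error that a callback recorded in iword(getIdx()) into the
// stream state and returns it; goodbit means nothing was recorded.
template <class charT, class Traits>
std::ios_base::iostate copyfmtErr(std::basic_ios<charT, Traits>& str)
{
    long& slot = str.iword(getIdx());
    const std::ios_base::iostate err =
        static_cast<std::ios_base::iostate>(slot);
    slot = 0;                 // report each error only once
    if (err != std::ios_base::goodbit)
        str.setstate(err);    // may throw, depending on str.exceptions()
    return err;
}
```

The function takes a basic_ios reference rather than an ios_base reference because setstate() is a member of basic_ios.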
The function copyfmtErr() must be called after any invocation of
copyfmt(). The sample code in the next section is an example of how it would be used.
date d(11,8,1999);
if (copyfmtErr(cout) != ios_base::goodbit)
{ /* do the appropriate error handling */ }
cout << d << endl; // now using the date format taken from cerr
Assuming a U.S. locale is used, the output of the program above would be
1. Using iword or using pword? A good point to start the stream extension is to deter-
mine which information should be added to the stream. If the new attribute is of a type
that can safely be promoted to long, an iword entry can be used to store the value. Other-
wise, the attribute can be created on the heap and a pointer referring to the heap object can
be stored in a pword entry. (The latter is the approach we used in our example: The
manipulator creates a copy of the date format string on the heap and stores a pointer to
the heap object in a pword field.)
Another related consideration is the number of iword/pword fields that should be
used to store the information. In most cases, one entry will suffice. If, for instance, two
integral values should be stored, it is more convenient to use two iword entries rather
than one pword entry that points to a heap object containing the two integral values.
2. Acquiring the index. The next step is to determine which part of the program will
acquire the index (or the indices) by invoking ios_base::xalloc(). It is important to
make sure that this is done before the index is used by any other part of the program. (In
our example, we wrapped the index into an access function that acquires a valid index
immediately before it is accessed for the first time.)
3. Initializing the new attribute(s). Next it must be determined which parts of the pro-
gram will have read, write, or even read-and-write access to the acquired iword/pword
field. (In our example, the manipulator stores the date format string in the pword field,
and the inserter retrieves it for formatted output of date objects.)
Depending on the context, it might be necessary for the iword/pword field to be ini-
tialized with a sensible value before it is read. This might require coordination between
read and write accesses; you must make sure that there is initial write access before any
read access to the new attribute(s).
Instead of making provisions for initial write access, a default value can be used.
This alternative approach relies on the guarantee that iword/pword fields are initialized
to 0. This is a viable solution whenever those parts of the program that read the informa-
tion can live with an iword/pword set to 0. The 0 value is typically interpreted as an indi-
cation that no explicit initialization has been processed so far, and a default for the
missing attribute value can be used instead of an explicitly provided attribute value. (We
used the latter approach in our example. When the inserter is invoked before any format
specification has been provided, a default format is used.)
4. Memory management of the new attribute(s). The use of callbacks is typical of situa-
tions where handles to dynamically acquired resources are stored in the iword/pword
fields and these resources logically belong to the stream. Destruction of the stream and
copying stream data members have a potential for creating a resource leak under such cir-
cumstances. The dynamic resource used most often is memory allocated on the heap.
If a pword field is used to store a pointer referring to an object on the heap, the
stream’s destructor releases only the pword entry itself, but not the object on the heap the
pword entry refers to, and a memory leak is the result. Usually, a similar problem arises
when the pword entry is copied among other stream data members during copyfmt().
To avoid any memory leaks, memory management for the heap object must be installed in
the form of a stream callback. The memory management can be implemented by deleting
or copying the object on the heap. Alternatively, more sophisticated schemes, such as ref-
erence counting, can be used. (In our example, we implemented a callback function that
deletes and duplicates the dynamically allocated format string object as needed in case of
stream destruction and invocation of copyfmt().)
More generally, the callback function must provide the functionality for situations
where the stream is destroyed or its member functions copyfmt() or imbue() are
called.9 The motivation is usually proper management of a dynamically acquired
resource referred to by the iword/pword entries.
5. Indicating callback errors. Errors that occur in the callback function cannot be indi-
cated directly, because the callback function has no access to the stream state and is
required not to throw any exceptions. As a workaround, the status information must be
stored elsewhere. (In our example, we used the iword field corresponding to the pword
field to store the callback’s error information. We also provided additional functionality to
help the user, who must copy the error information from outside the stream to the stream
state after any stream operation that triggers the callback function.)
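Steps 1 to 3 can be illustrated with a deliberately small extension that needs only an iword entry. This is a sketch with hypothetical names (unitIndex(), shortunits, longunits, meters), unrelated to the date example: the attribute fits into a long, so no heap object and therefore no callback is required.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Step 2: a single accessor acquires the index before its first use.
int unitIndex()
{
    static const int idx = std::ios_base::xalloc();
    return idx;
}

// Step 1: the attribute is a small integral value, so iword suffices.
enum unit_style { unit_default = 0, unit_short = 1, unit_long = 2 };

// Manipulators have write access to the iword field.
std::ios_base& shortunits(std::ios_base& s)
{
    s.iword(unitIndex()) = unit_short;
    return s;
}

std::ios_base& longunits(std::ios_base& s)
{
    s.iword(unitIndex()) = unit_long;
    return s;
}

struct meters { double value; };

// The inserter has read access. Step 3: an untouched iword field is 0,
// which is interpreted as "not configured" and mapped to the short form.
std::ostream& operator<<(std::ostream& os, const meters& m)
{
    const long style = os.iword(unitIndex());
    return os << m.value << (style == unit_long ? " meters" : " m");
}

std::string unitsDemo()
{
    std::ostringstream os;
    meters m = { 2.5 };
    os << m << '|' << longunits << m << '|' << shortunits << m;
    return os.str();
}
```

Because the default is encoded as 0, the inserter works on any stream, including one that no manipulator has ever touched.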
9. In our example, we had no need to react to a call of imbue() in our callback function. Yet it can be necessary
under certain circumstances. Just to give you an idea of when and why a callback function for an imbue_event
might be needed, here is a conceivable example: If information from the stream’s locale is used or may be
cached, it might be necessary to update this information when the locale is replaced.
The difference from the previous solution is that ostr is an output stream of a class
type derived from a standard stream class. As before, the format string is passed as an
argument to the manipulator setfmt and must be stored in the stream object for subse-
quent use by the date inserter. In the previous approach the iword/pword fields were
used for this purpose. In this solution, we wrap the format string into a class of its own,
and the new stream class will then inherit the format string handling from the wrapper
class. Before we turn to the actual stream functionality of the new stream class, let us
see how the format string handling can be provided. Here is the date format wrapper
class.
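template<class charT>
class datefmt
{
public:
datefmt(const charT* f) : myFmt(f) {}
virtual ~datefmt() {}
virtual void fmt(const charT* f) { myFmt = f; }
virtual const basic_string<charT>& fmt() const { return myFmt; }
private:
basic_string<charT> myFmt;
};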
10. We briefly mentioned the need for different types of format strings before, in section 3.3.1.2, Implementing
the Date Inserter, when we discussed the default date format string. There are different types of date format
strings, because the format string is passed to the time_put facet of the stream’s locale, and a wide-character
stream’s time_put facet requires a wide-character format string.
Here is a first sketch of the new stream class template, called odatestream. Note
that we use multiple inheritance here, because the new stream class inherits the date format
string from the class datefmt and the text output functionality from a stream base class.
// ...
};
adapter_type ostr(cout);
ostr << setfmt("%B %d, %Y is a %A.") << date(11,8,1999) << endl;
Let’s see how we can design such an adapter stream type. Its constructor is to receive
a stream object and create an adapted stream object that is a copy of the original. The first
problem is that streams cannot be copied. Both the copy constructor and the copy assign-
ment operator of the stream classes are inaccessible. In section 2.1.3, Copying and Assign-
ment of Streams, we explained a technique for copying the entire state of a stream from one
stream to another, piece by piece. The technique uses the following member functions
defined in the stream base class basic_ios for copying all relevant parts of a stream:
rdbuf(), which can be used to exchange the stream buffer between streams
clear() and rdstate(), which can be used to exchange the error state
between streams
copyfmt(), which can be used to exchange all other data members between
streams
We will use this technique to implement the constructor of the adapter stream type and
copy the state of the existing stream to the newly constructed stream.
The second question is, Which stream class shall serve as a base class for the new
adapter stream class?
As before, we want to leave this open and implement the adapter stream class as a
class template that takes its stream base class as a template argument. Here is a sketch of
the adapter stream type:
clear(ost.rdstate());
copyfmt(ost);
//
};
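The copying technique can be condensed into a small, self-contained adapter. This is a sketch only, not the adapter type of this chapter: the name stream_adapter and the helper adaptDemo() are hypothetical, the date format handling is omitted, and OStream is assumed to be a general stream class such as std::ostream, whose constructor takes a stream buffer pointer.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

template <class OStream>
class stream_adapter : public OStream
{
public:
    typedef std::basic_ios<typename OStream::char_type,
                           typename OStream::traits_type> ios_type;

    // "Copies" an existing stream piece by piece.
    explicit stream_adapter(ios_type& ost)
        : OStream(ost.rdbuf())       // share the stream buffer
    {
        this->clear(ost.rdstate());  // take over the error state
        this->copyfmt(ost);          // flags, locale, iword/pword, callbacks
    }
};

// Adapts a string stream and writes through the adapter.
std::string adaptDemo()
{
    std::ostringstream target;
    target << std::hex;                             // format flag to be copied
    stream_adapter<std::ostream> adapted(target);
    adapted << 255;                                 // goes to target's buffer
    return target.str();
}
```

The adapter shares the buffer with the original stream, so output written through either object ends up in the same place.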
Note that in this suggested implementation, only stream classes from the general
stream class layer can serve as template arguments. This is because the base classes’ con-
structors must be called when a derived stream object is created. Unfortunately, different
stream classes have constructors with different signatures.11 The moment we decide how
we want to invoke the base class constructor, we impose restrictions on the stream type
that can be used for instantiation of the adapter stream template. In this case, we decided
to allow the general stream classes as base classes.12
11. The concrete stream classes have, among other constructors, a no-argument-constructor, which is conve-
nient, because it need not be explicitly invoked. The general stream classes, in contrast, have only one construc-
tor that requires a stream buffer.
12. A minor disadvantage of this design is that string and file stream objects that are adapted lose their device-
specific functionality. Consider, for instance, an adapted file stream. Had we allowed the concrete stream classes
as base classes, an adapter derived from basic_fstream would have inherited all the file-stream-specific
operations such as open() and close(). With a general stream class as a base class, the adapter stream type
does not inherit the open() and close() member functions. The effect is that a file stream object, which is
adapted, cannot be closed via the adapted stream object, but only via its original object.
//
};
This combined stream class template has two advantages: Only one new abstraction
is defined, and any redundant code is eliminated. The downside is that the class might be
a little hard to understand. Only one of the constructors will compile. Which one it is
depends on the type of the template argument OStream. The first constructor will com-
pile only when OStream is a concrete output stream type, while the second constructor
can be used only when OStream is a general output stream type. The combined stream
class template odatestream is a highly flexible abstraction. However, it puts the burden
of correct usage onto the user.
13. A standard-conforming compiler instantiates only those member functions of a template that are actually
used for the instantiated type. There are still compilers available that do not conform to this rule, i.e., they
always instantiate all member functions. Such a nonconforming compiler cannot compile the two
odatestream constructors: If the odatestream template parameter is a general stream class, the first con-
structor cannot be compiled, because a general stream does not provide a constructor that can receive a charac-
ter string and a mode argument. If the template parameter is a concrete stream class, the second constructor
cannot be compiled, because the concrete stream class provides no constructor that can receive a pointer to a
stream buffer. Note that no error occurs with a standard-conforming compiler, because the constructor that can-
not be compiled will not be used and hence will not be instantiated.
14. We use function try blocks for catching exceptions in these constructors. If you are not familiar with function
try blocks, see section G.10, Function try Blocks, for further explanation.
clear(toBeAdapted.rdstate());
copyfmt(toBeAdapted);
} catch (...) { setbad(); }
Both constructors pass their arguments to their respective base classes. The second
constructor passes the stream buffer of the stream provided as parameter to its base class
constructor and copies the stream state via clear() and the other stream members via
copyfmt() from the stream provided as parameter to its newly constructed instance.
Note that we made the format string an optional parameter and use a default when
no explicit format string is provided. The default date format string is provided by the
function template defaultPatt() as in the previous chapter.
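template<class charT>
class datefmt
{
public:
datefmt(const charT* f) : myFmt(f) {}
virtual ~datefmt() {}
virtual void fmt(const charT* f) { myFmt = f; }
virtual const basic_string<charT>& fmt() const { return myFmt; }
private:
basic_string<charT> myFmt;
};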
The function that returns the format string does so by returning a const reference
to a private data member. Consequently, no error can occur and no exception can be
thrown.
The function that sets the format string does not explicitly throw an exception, but
an exception (most probably a bad_alloc) could be raised when the format string argu-
ment of type const charT* is implicitly converted into a string object of type
basic_string<charT>. Propagating this bad_alloc exception without setting badbit
and checking the stream’s exception mask violates the error-handling rules in IOStreams.
As this virtual function is inherited by the derived class odatestream, which is a
stream class and therefore supposed to obey the IOStreams rules, we must redefine this
virtual function in the derived class. Here is the redefined function that provides an
IOStreams-compliant error handling:
template<class OStream>
odatestream<OStream>& operator<< (odatestream<OStream>& ods, const date& dat);
Imagine we used this inserter and tried to compile the following lines of code:
The compiler would complain that it cannot find an appropriate operator<<()
for the insertion of the temporary date object. Why is that? Let’s study it step by step. The
compiler evaluates the sequence of shift expressions from left to right. First, it looks for
an operator<<() that allows text of type const char* to be inserted into the
odatestream object ods. It finds the following matching global operator, which is
already defined in the IOStreams library:
This operator returns a reference to a basic_ostream. Next, the compiler looks for
an operator<<() that allows insertion of the date object into the basic_ostream that
was returned by the first operator<<(). No such operator is available, because the date
inserter takes a reference to an odatestream, not a basic_ostream. The compiler even
considers implicit conversions for the arguments, but an implicit conversion from
basic_ostream& (a base class reference) to odatestream& (a derived class reference)
is not defined in C++, and as a result the compiler cannot find a matching function and
issues an error message.
Obviously, we have a problem with a date inserter that takes a reference to an
odatestream instead of a basic_ostream. The odatestream reference is necessary
for gaining access to the date-formatting facilities defined in class odatestream, but
such an inserter is of limited use in a chain of inserters. The problem occurs whenever an
object other than a date object is inserted before a date object. The “nondate” inserters can
be applied to an odatestream, but they always return basic_ostream references.
Once a basic_ostream reference has been returned in a chain, the compiler has lost the type
information necessary to find the date inserter.
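The loss of type information can be checked at compile time. The following sketch assumes a C++11 compiler (for decltype and the type traits), which is newer than the language level used elsewhere in this chapter; mystream is a hypothetical derived stream that stands in for odatestream.

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>
#include <type_traits>
#include <utility>

// Hypothetical derived stream standing in for odatestream.
class mystream : public std::ostream
{
public:
    explicit mystream(std::streambuf* sb) : std::ostream(sb) {}
};

// True if inserting a C string into a mystream yields a plain
// std::ostream&, that is, if the derived type is gone after the
// first inserter in a chain.
bool chainReturnsBaseReference()
{
    return std::is_same<decltype(std::declval<mystream&>() << "text"),
                        std::ostream&>::value;
}
```

Any inserter that follows in the chain is therefore looked up for a std::ostream argument, which is why an inserter that requires a derived stream reference is not found.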
This is a typical problem with inserters to derived stream classes and is not limited
to our particular example. How does one solve it? As the compiler cannot find an inserter
that takes a reference other than a basic_ostream reference, we must define the date
inserter to take a base class reference. How, then, do we get access to the stream’s member
functions for date formatting?
The solution takes advantage of a relatively new language feature: the dynamic cast.
The idea is to cast the basic_ostream reference to an odatestream reference so that
we gain access to the member functions for date formatting. This kind of cast is called a
downcast and is performed via the dynamic_cast operator. Section G.9, Dynamic Cast,
in appendix G explains the dynamic cast and related issues.
At last we are ready to implement the inserter. We use the same framework as
before, and again the inserter itself relies on a member function of the date class, called
print_on(), which performs the actual output. (The details of this implementation are
described in section 3.1.4, Generic Inserters and Extractors.) This member function has the
following signature:
template <class charT, class Traits>
ios_base::iostate date::print_on(basic_ostream<charT, Traits>& os) const;
if ((p = dynamic_cast<datefmt<charT>*>(&os)) == 0)
formatString = defaultPatt(os);
else
formatString = p->fmt();
15. Again, we omitted handling of the field width, in order to keep the example focused.
else
return ios_base::goodbit;
The peer cast works the following way: If the dynamic cast of the pointer to the
stream to a pointer to datefmt<charT> does not yield 0, then we know that the stream
object is of a type that supports date format string handling, and the member function
fmt() can be called. Otherwise, we know that the stream does not support date format
string handling, and we use the default string provided by defaultPatt() instead.
The rest of print_on()’s implementation does not need any further explanations,
because it is similar to the implementation discussed in section 3.1.4, Generic Inserters
and Extractors, where we used a fixed format string. Also, exceptions are propagated to
the calling function and handled there, as in all previously discussed examples.
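The format selection by peer cast can be isolated in a few lines. The sketch below is restricted to narrow characters; odatestream_sketch and selectFormat() are hypothetical names, and the literal "%x" merely stands in for the result of defaultPatt().

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

template <class charT>
class datefmt
{
public:
    explicit datefmt(const charT* f) : myFmt(f) {}
    virtual ~datefmt() {}
    virtual const std::basic_string<charT>& fmt() const { return myFmt; }
private:
    std::basic_string<charT> myFmt;
};

// Minimal date-formatting stream: a general output stream plus datefmt.
class odatestream_sketch : public std::ostream, public datefmt<char>
{
public:
    odatestream_sketch(std::streambuf* sb, const char* f)
        : std::ostream(sb), datefmt<char>(f) {}
};

// Peer cast: from the stream base class sideways to datefmt<char>.
std::string selectFormat(std::ostream& os)
{
    const datefmt<char>* p = dynamic_cast<const datefmt<char>*>(&os);
    if (p != 0)
        return p->fmt();          // stream supports date formats
    return std::string("%x");     // stand-in for defaultPatt()
}
```

The cast succeeds only if the object behind the std::ostream reference also has datefmt<char> as a public base, which is exactly the case for a date-formatting stream.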
3.3.2.2.2 Implementing the Manipulator
For implementation of the manipulator setfmt, we take the same approach as
before, and the entire implementation boils down to providing the manipulator helper
class setfmt_helper, and in particular its static member function setfmt_fct(),
which does the actual work. In its implementation we use the dynamic cast in the same
way, and for the same reasons, as in the function print_on() above: If the pointer to the
stream can be cast to a pointer to datefmt<charT>, then the manipulator sets the new
format string in the stream. Otherwise, the manipulator does nothing. Here is the
implementation:
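As a rough illustration of this behavior, the following sketch replaces the manipulator framework with a simple argument-carrying struct. It is not the setfmt_helper implementation: setfmt_arg, the free function setfmt(), and odatestream_sketch are hypothetical simplifications for narrow-character streams.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

template <class charT>
class datefmt
{
public:
    explicit datefmt(const charT* f) : myFmt(f) {}
    virtual ~datefmt() {}
    virtual void fmt(const charT* f) { myFmt = f; }
    virtual const std::basic_string<charT>& fmt() const { return myFmt; }
private:
    std::basic_string<charT> myFmt;
};

class odatestream_sketch : public std::ostream, public datefmt<char>
{
public:
    odatestream_sketch(std::streambuf* sb, const char* f)
        : std::ostream(sb), datefmt<char>(f) {}
};

// Carries the manipulator argument through the inserter chain.
struct setfmt_arg { const char* fmt; };

setfmt_arg setfmt(const char* f)
{
    setfmt_arg a = { f };
    return a;
}

// Sets the format only if the stream supports date formats;
// for any other stream the manipulator does nothing.
std::ostream& operator<<(std::ostream& os, const setfmt_arg& a)
{
    datefmt<char>* p = dynamic_cast<datefmt<char>*>(&os);
    if (p != 0)
        p->fmt(a.fmt);
    return os;
}
```

Applied to a plain std::ostringstream, the manipulator is silently ignored; applied to the date-formatting stream, it replaces the stored format string.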
cout << "Hello World, this is the year " << setfmt("%y") << date(1,1,2000)
<< endl;
The format string "%y" produces the year without century as a decimal number, yet
the output on the terminal is
Considering the implementation of the manipulator setfmt explained above, this
should not surprise us. The predefined stream cout is not a date-formatting stream, but
just a basic_ostream<char>, and in that case the manipulator does nothing. Conse-
quently, setting the format string to "%y" has no effect.
If we want to see an effect, we must use a date-formatting stream. We can use
odatestream’s adapting constructor and turn cout into a date-formatting stream, as in
the example below:
3.3.2.3 Summary
The stream classes of the standard IOStreams form a typical object-oriented class hierar-
chy, where derived classes extend the functionality of their base classes. For that reason, it
seems only natural to add user-defined stream extensions by derivation. The main idea is
to derive from one of the existing stream classes and extend it by adding new data and
operations to the derived stream class. Let us recap the main aspects that are typical for
stream extension via derivation.
1. Determining the right base class(es). For derivation of a new stream class, we must
choose an appropriate base class from the standard IOStreams library. We do this because
we want to reuse already existing stream functionality such as the parsing and formatting
of numbers and strings, format control, stream-specific error indication, management of
the locale, etc. As there is such a variety of stream classes defined in IOStreams, it is quite
typical for more than one existing stream class to be extended with the new functionality.
In such a situation, it is a good idea to combine the use of inheritance with templates
by making the base class the template argument of the derived class. In this way, the new
functionality can be added to all potential base classes, and we gain an extra level of flexi-
bility because the decision about the base class is deferred from the point of implementa-
tion to the point of template instantiation. For this reason, not only predefined stream
classes, which are known at the time of implementation of the new derived class, but also
user-defined stream classes, which are unknown to the implementer of the derived class,
can serve as base classes for the new abstraction.
2. Use of dynamic cast. Typically, the new functionality that was added to the derived
stream class is used by user-defined inserters, extractors, and manipulators. As these new
operations must also work with the old stream classes and collaborate with existing
operations, they usually take references to stream classes defined in
IOStreams. This makes a downcast necessary from the base class reference to the
derived class reference, because otherwise the new functionality would not be accessible.
It is advisable to perform the downcast as a dynamic cast rather than a static, reinterpret,
or old-style cast. Only the dynamic cast checks whether the downcast is safe; the other
casts perform the type conversion unconditionally and can lead to disaster.
Although the dynamic cast is safe, its use is often seen as a symptom of “poor pro-
gramming style.” In general, we agree, because a downcast can almost always be avoided
by proper class design and use of virtual functions. In this case here, the design cannot be
“fixed,” because we derive from base classes that come from a standard library. We cannot
add virtual functions to the existing stream classes in the IOStreams library, although we
need two versions of an operation—one for the base class and one for the derived class
that uses the new functionality. Use of the dynamic cast is the only viable and correct way
of implementing the dual functionality.
• The callback mechanism. It can be seen as a generic way to add new functionality to
the stream’s destructor, copyfmt() (which is comparable to a stream assign-
ment), and imbue() member functions.
There are conceivable situations in which the added functionality will not be avail-
able to all streams independent of their type. One example is a manipulator that semanti-
cally has an effect on output of objects, but not on input of objects. Yet with the iword/
pword technique, such a manipulator is applicable to input streams as well. Such a
semantically nonsensical application to an input stream might not do any harm, but it is
confusing in the case of bidirectional streams, because extraction of the manipulator from
an input stream has no effect on any input operations, but affects subsequent output oper-
ations. If functionality is restricted to a certain set of stream types, a dynamic cast can be
performed, much as in the inheritance-based solution, in order to check the stream type,
and the functionality can be suppressed if necessary.
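A minimal sketch of such a check, assuming a hypothetical manipulator terse that stores a flag in an iword slot. The names terse and terse_index are illustrations of ours and not part of the library. The manipulator takes effect only if the stream it is applied to can perform output.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: reserves one iword slot for the flag (allocated once).
inline int terse_index()
{
    static const int index = std::ios_base::xalloc();
    return index;
}

// Hypothetical manipulator: sets the flag only for streams that can do output.
// ios_base is polymorphic, so the dynamic cast can inspect the stream's type.
inline std::ios_base& terse(std::ios_base& str)
{
    if (dynamic_cast<std::ostream*>(&str) != 0)
        str.iword(terse_index()) = 1;
    return str;   // pure input streams are left untouched
}
```

Applied to an istringstream, the manipulator is silently ignored; applied to an ostringstream or a bidirectional stringstream, it sets the flag.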
CONCLUSION
The iword/pword technique is the IOStreams-specific way to add new stream functional-
ity. For users who are not familiar with the details of iword/pword and the stream call-
backs, this is an unusual and maybe a slightly awkward approach. The inheritance-based
technique, in contrast, is the typical object-oriented approach, as supported by C++. It has
a certain amount of complexity, too, because the use of dynamic cast is necessary to access
the newly added functionality, and the derived class will be templatized taking the base
class type as a template argument for greater convenience and flexibility.
16. See section 6.2.1, Numeric and Boolean Values, for further information about the facets for parsing and
formatting numerical values; and section 8, User-Defined Facets, for an explanation on how to provide
special-purpose facets.
3.4 Adding Stream Buffer Functionality
are nonvirtual functions, and they cannot be redefined by any derived class. Instead, the
public nonvirtual member functions call protected virtual member functions, which can
or must be redefined for a specific external device. The stream buffer base class
basic_streambuf is not an abstract class, but provides a default functionality for the
virtual functions. In general, the implementation of these functions in
basic_streambuf is a sensible null operation, which can be used by derived stream
buffer classes that do not want to override the respective virtual member function.
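The division of labor between public nonvirtual and protected virtual functions can be illustrated with a minimal sketch. The class counting_buf is hypothetical; it overrides only overflow() and inherits the default behavior of every other virtual function.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>

// Hypothetical stream buffer that discards all output but counts the characters.
class counting_buf : public std::streambuf
{
public:
    counting_buf() : count_(0) {}
    std::size_t count() const { return count_; }

protected:
    // Called by the public nonvirtual sputc() when no write position is
    // available; since no put area is ever installed, that is every time.
    virtual int_type overflow(int_type c = traits_type::eof())
    {
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            ++count_;
        return traits_type::not_eof(c);
    }

private:
    std::size_t count_;
};
```

Because underflow() is not overridden, any attempt to read from such a buffer uses the base class default and reports end-of-file.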
In the following, we discuss which of the virtual member functions must be rede-
fined under which circumstances when a new stream buffer class is implemented. We
start with the mandatory functionality of character transfer to and from the external
device in section 3.4.1.1 and move on to the optional stream buffer functionality in section
3.4.1.2. To illustrate the character transfer functionality of a derived stream buffer class,
we provide two examples: a buffered and an unbuffered external device. The optional
stream buffer functionality is explained only in principle; we do not present any sample
implementations.
3.4.1.1 Core Functionality of Stream Buffers: Character Transportation
The ability to transport characters between the stream and the external device is the core
functionality of a stream buffer. For a derived stream buffer class, it is often the only func-
tionality implemented. All other functionality is optional and simply inherited from the
base class, which exhibits the default behavior of a nonoperational mode.
In principle, a stream buffer can take two different approaches: buffered or unbuffered
character transport. An unbuffered stream buffer sounds like a contradiction in terms, but
the idea of unbuffered character transport is that characters are transferred directly
between the stream and the external device, without being buffered internally in the
stream buffer object. The use of unbuffered character transport may well be sensible and
can be motivated by comparing it with buffered character transport on the one hand and
direct use of the device-specific I/O functionality on the other. In a buffered stream buffer,
the internal character buffer represents the get and put areas at the same time, and the
stream buffer must coordinate access to both areas. This mechanism is described in greater
detail in section 2.2, The Stream Buffer Classes, and is a relatively complicated activity.
Alternatively, characters may be transported to an external device independent of
IOStreams by direct use of device-specific operations for transfer of characters. The down-
side of this approach is that all the convenient stream features, such as formatting and pars-
ing, cannot be used, because the external device is not encapsulated into a stream buffer.
Below we explain derivation of user-defined stream buffer classes by discussing two
sample implementations: (1) a stream buffer class for unbuffered character transport (in
subsection 3.4.1.1.1) and (2) the more typical case of a buffered stream buffer class (in sub-
section 3.4.1.1.2). Before we turn to these examples, let us first recapitulate the principle of
character transportation via stream buffers.
character without consuming it, must have the same effect as a call to sbumpc() followed
by sungetc(), which means first consuming the character and putting it back afterwards.
The stream buffer base class’s implementation of sungetc() works without any support
from a derived stream buffer, as long as it can reposition the get area’s next pointer one
step back. This, of course, is not possible if the next pointer is already positioned at the
beginning of the internal buffer, or if the stream buffer does not have an internal buffer. In
such situations, pbackfail() is invoked and receives traits_type::eof() as an argument.
sputbackc(char_type c) calls pbackfail() with c as an argument whenever
c does not equal the previously consumed character; otherwise it performs the same
operations as sungetc().
A more detailed description of the way in which stream buffers perform input from
the external device and support putback of characters can be found in section 2.2, The
Stream Buffer Classes.
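The relationship between peeking and putback can be checked against any conforming stream buffer. The helper below is a sketch of ours (the name peek_by_bump_and_unget is hypothetical); it peeks at the next character the long way round, by consuming it and putting it back.

```cpp
#include <cassert>
#include <sstream>
#include <streambuf>

typedef std::streambuf::traits_type traits;

// Consumes the next character and immediately puts it back. For a conforming
// stream buffer the result equals what sgetc() would have returned.
inline traits::int_type peek_by_bump_and_unget(std::streambuf& sb)
{
    const traits::int_type c = sb.sbumpc();
    if (!traits::eq_int_type(c, traits::eof()))
        sb.sungetc();
    return c;
}
```

With a std::stringbuf opened for input only, sungetc() fails at the very beginning of the sequence, and sputbackc() fails for a character that differs from the one previously consumed, because in both cases pbackfail() cannot provide a putback position.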
For the output direction: Only overflow() must be redefined. sync() need not be
redefined, because there is no internal buffer, hence nothing to synchronize.
For the input direction: underflow() must be redefined. Also, uflow() must be
redefined, because the default implementation moves the next pointer of the get area, and
this is not possible in the unbuffered case, because there is no internal buffer area and,
therefore, no next pointer.
As mentioned above, input functions and putback support are related. A call to
sbumpc() followed by sungetc() must have the same effect as invocation of sgetc().
For this reason, we must also implement pbackfail(), because it is called by
sungetc(). The default implementation of pbackfail() does not work, because it
produces a decrement in the next pointer, which is not possible in the unbuffered case.
BUFFERED STREAM BUFFER. For a buffered stream buffer, we must redefine the following
protected stream buffer functions:
For the output direction: overflow() and sync() must be redefined.
For the input direction: underflow() must be redefined. uflow() need not be
redefined, because the default implementation provided by basic_streambuf works
perfectly well.
pbackfail() must be redefined in order to support sputbackc(char_type c)
with a character c different from the previously extracted one, that is, when pbackfail()
must actually write to the internal buffer and a simple decrement of the next pointer does
not suffice.
In the following two sections, we show examples for each of the two types of stream
buffers discussed above, the unbuffered and the buffered case.
3.4.1.1.1 Stream Buffer for Unbuffered Character Transport
To keep the examples comprehensible and generic, we factored out the device-specific
functionality into two functions rather than mixing it into the actual stream
buffer implementation. In this way, you can reuse the example for an implementation
of an unbuffered stream buffer by simply replacing these two device-specific
functions.
The two functions transport a single character from or to the external device. In our
example the external device is a file, and we implement the functions in terms of the operating
system functions read() and write() on UNIX platforms or _read() and
_write() for the Microsoft operating systems. For the sake of simplicity, the example is
restricted to character I/O for narrow characters, that is, the character type char. Also, we
did not consider any character code conversion that might be necessary before or after
character transport. If needed, it can be included in the implementation of the two
functions. Here are the suggested implementations:
#include <unistd.h>   // declares write() on UNIX platforms

int char_to_device(char c)
{
    // Write one character to file descriptor 1 (standard output).
    if (write(1, &c, 1) != 1)
        return -1;
    else
        return 0;
}
protected:
int_type overflow(int_type c = traits_type::eof());
int_type uflow();
int_type underflow();
int_type pbackfail(int_type c);
private:
char_type charBuf;
bool takeFromBuf;
17. We declare the copy constructor and the copy assignment operator private without defining them in order to
make these operations inaccessible. The intent is to prohibit copy operations on stream buffers, because the
semantics of the copying stream buffer objects are generally undefined. See also section 2.2.2, The Stream Buffer
Abstraction.
CHARACTER OUTPUT
For transport of characters to the external device, overflow() must be overridden. Here
is its implementation:
18. Note that we convert the character received by means of the traits function to_char_type() before we
pass it to the device-specific output function. This is necessary because the argument that overflow() receives
can be either a valid character of type char_type or traits_type::eof(), which is of type int_type. For
conversion between these two types, the traits provide conversion functions to_char_type() and
to_int_type(). More about the traits member functions can be found in section 2.3.2, Character Traits.
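The role of the two conversion functions can be seen in a small sketch. The helpers is_eof and extract_char are hypothetical names of ours; they mirror the way overflow() has to treat its argument.

```cpp
#include <cassert>
#include <string>

typedef std::char_traits<char> traits;

// True if the value is the end-of-file indicator rather than a character.
inline bool is_eof(traits::int_type c)
{
    return traits::eq_int_type(c, traits::eof());
}

// Mirrors the argument handling of overflow(): a valid character is converted
// back to char_type and stored in out; for eof the function returns false.
inline bool extract_char(traits::int_type c, char& out)
{
    if (is_eof(c))
        return false;
    out = traits::to_char_type(c);
    return true;
}
```

Note that to_int_type() keeps every character distinct from eof(), including the character with the bit pattern 0xFF, which a plain conversion from char to int could confuse with the end-of-file value on platforms where char is signed.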
232 Advanced IOStreams Usage
pbackfail(). All three functions work with the private members charBuf and
takeFromBuf, whose meaning we have left open so far. Let us see what these two pri-
vate data members represent.
As we already know, underflow() does not consume the character that it trans-
fers; that is, it does not change the stream position. As a result, successive calls to
underflow() must all return the same value referred to by the current stream position.
Some external devices might allow the same character to be fetched from the device
repeatedly, while others will not. For a device-independent implementation, we assume
the worst and store the character in charBuf once we get it from the device, so that we do
not have to fetch it twice. takeFromBuf is a Boolean value that is true when the charac-
ter in charBuf has not been consumed and false otherwise. If takeFromBuf is true,
then the next call to underflow() or uflow() must return the character stored in
charBuf instead of fetching a new character from the external device.
With these explanations in mind, the implementations of underflow() and
uflow() are straightforward. Here is the implementation of underflow():
int_type underflow()
{
    // Return the stored character if it has not been consumed yet.
    if (takeFromBuf)
        return traits_type::to_int_type(charBuf);
    else
    {
        char_type c;
        if (char_from_device(&c) < 0)
            return traits_type::eof();
        else
        {
            takeFromBuf = true;
            charBuf = c;
            return traits_type::to_int_type(c);
        }
    }
}

int_type uflow()
{
    if (takeFromBuf)
    {
        takeFromBuf = false;
        return traits_type::to_int_type(charBuf);
    }
    else
    {
        char_type c;
        if (char_from_device(&c) < 0)
            return traits_type::eof();
        else
        {
            charBuf = c;
            return traits_type::to_int_type(c);
        }
    }
}
uflow() does the same as underflow(), except that it always sets takeFromBuf
to false in order to indicate that the character has been consumed.
Here is the implementation of pbackfail():
takeFromBuf = true;
return traits_type::to_int_type(charBuf);
else
return traits_type::eof();
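The pieces discussed in this subsection can be put together in one place. What follows is a minimal sketch under stated assumptions, not the sample implementation itself: the class name unbuffered_membuf is hypothetical, a pair of strings stands in for the external device, uflow() is expressed in terms of underflow(), and the extra flag haveChar (an addition of ours) guards against putting back a character before any has been read.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical unbuffered stream buffer with an in-memory "device".
class unbuffered_membuf : public std::streambuf
{
public:
    explicit unbuffered_membuf(const std::string& input)
        : input_(input), pos_(0), charBuf(0), haveChar(false), takeFromBuf(false) {}
    const std::string& written() const { return output_; }

protected:
    virtual int_type overflow(int_type c = traits_type::eof())
    {
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            output_.push_back(traits_type::to_char_type(c));  // "device" output
        return traits_type::not_eof(c);
    }
    virtual int_type underflow()              // fetch without consuming
    {
        if (!takeFromBuf)
        {
            if (!char_from_device(charBuf))
                return traits_type::eof();
            takeFromBuf = true;
        }
        return traits_type::to_int_type(charBuf);
    }
    virtual int_type uflow()                  // fetch and consume
    {
        const int_type c = underflow();       // makes sure charBuf is current
        takeFromBuf = false;                  // marks the character as consumed
        return c;
    }
    virtual int_type pbackfail(int_type c = traits_type::eof())
    {
        if (takeFromBuf)
            return traits_type::eof();        // charBuf is occupied already
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            charBuf = traits_type::to_char_type(c);
        else if (!haveChar)
            return traits_type::eof();        // nothing has been read so far
        haveChar = true;
        takeFromBuf = true;
        return traits_type::to_int_type(charBuf);
    }

private:
    // Stand-in for the device-specific input function.
    bool char_from_device(char& c)
    {
        if (pos_ >= input_.size())
            return false;
        c = input_[pos_++];
        haveChar = true;
        return true;
    }

    std::string input_, output_;
    std::size_t pos_;
    char charBuf;
    bool haveChar;
    bool takeFromBuf;
};
```

Since charBuf holds a single character, exactly one putback is possible between two reads; a second attempt fails until the held character has been consumed again.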
int main()
{
unbuffered_streambuf<char, char_traits<char> > mybuf;
iostream mystream(&mybuf);
protected:
streamsize xsputn(const char_type *s, streamsize n);
int_type overflow (int_type c = traits_type::eof());
int sync();
private:
static const int bufSize = 16;
char_type buffer[bufSize];
int buffer_out();
The essential attribute of a buffered stream buffer is a character array that serves as
an internal buffer. It is made a private member together with an integer constant that
defines the size of this array.
Let us see what the constructor and destructor must do. Here is the implementation
of the constructor:
template <class charT, class traits>
outbuf<charT, traits>::outbuf()
{
    // this-> is needed because setp() is a member of a dependent base class
    this->setp(buffer, buffer + bufSize);
}
The constructor installs the private character array as the internal buffer, so that the
functionality implemented in basic_streambuf uses this array as the put area. This is
achieved by calling basic_streambuf’s protected member function
setp(char_type*, char_type*). Its two parameters are the beginning and the end of the
character array that serves as the put area. [20]
19. We declare the copy constructor and the copy assignment operator private without defining them in order to
make these operations inaccessible. The intent is to prohibit copy operations on stream buffers because the
semantics of the copying stream buffer objects are generally undefined. See also section 2.2.2, The Stream Buffer
Abstraction.
20. If no buffer array is installed via setp(), the default setting initialized by basic_streambuf is used; it
indicates that no buffer is available. This is what we implicitly took advantage of in the unbuffered case.
The standard does not specify the functionality of basic_streambuf’s destructor.
For this reason, we override the base class functionality so that the destructor calls
sync(). In other words, we force the stream buffer to empty its internal buffer array to
the external device before it is destroyed.
Let us see how sync() is implemented:
In our example, sync() calls the private member function buffer_out(), which
is implemented as follows:
    pbump(-cnt);
    return retval;
}

    else
    {
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            return sputc(c);
        else
            return traits_type::not_eof(c);
    }
}
Like sync(), overflow() first calls the private member function buffer_out()
to write the content of the buffer array to the external device. Then it writes the value that
it has received as an argument to the internal buffer array by calling its own public member
sputc(), if the value does not equal traits_type::eof(); otherwise it returns
immediately.
Note that the error handling of both functions, sync() and overflow(), is again
slightly oversimplified. Both functions return traits_type::eof() in case of any
error. A more specific error handling, which we deliberately omitted to keep the example
concise, would return traits_type::eof() only when the end of the stream is
reached and indicate other errors by throwing appropriate exceptions.
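How the constructor, the destructor, sync(), and overflow() cooperate can be sketched in one self-contained class. This is an illustration under assumptions of ours rather than the sample implementation: the class name string_outbuf is hypothetical, and a std::string supplied by the caller stands in for the external device, so buffer_out() can never fail here.

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical buffered output stream buffer writing to an in-memory "device".
class string_outbuf : public std::streambuf
{
public:
    explicit string_outbuf(std::string& device) : device_(device)
    {
        setp(buffer, buffer + bufSize);           // install the put area
    }
    virtual ~string_outbuf() { buffer_out(); }    // flush before destruction

protected:
    virtual int_type overflow(int_type c = traits_type::eof())
    {
        if (buffer_out() < 0)
            return traits_type::eof();
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            return sputc(traits_type::to_char_type(c));   // put area has room again
        return traits_type::not_eof(c);
    }
    virtual int sync() { return buffer_out() < 0 ? -1 : 0; }

private:
    // Writes the filled part of the put area to the device and empties it.
    int buffer_out()
    {
        const int cnt = static_cast<int>(pptr() - pbase());
        device_.append(pbase(), static_cast<std::string::size_type>(cnt));
        pbump(-cnt);
        return cnt;
    }

    string_outbuf(const string_outbuf&);              // copying is prohibited
    string_outbuf& operator=(const string_outbuf&);

    static const int bufSize = 16;
    char buffer[bufSize];
    std::string& device_;
};
```

With a put area of 16 characters, nothing reaches the device until the seventeenth character is written, the stream is flushed, or the stream buffer is destroyed.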
The last virtual function that is redefined is xsputn(). While sync() and
overflow() must be redefined for a user-defined buffered output stream buffer, over-
riding xsputn() is optional. A derived stream buffer class can redefine it, when the han-
dling of a character sequence, as opposed to a single character, can be optimized. Here is
its implementation:
int main()
{
outbuf<char, char_traits<char> > mybuf;
ostream mystream(&mybuf);
protected:
int_type underflow();
int_type pbackfail(int_type c);
private:
static const streamsize bufSize = 16;
static const streamsize pbSize = 4;
char_type buffer[bufSize];
int buffer_in();
The private data members are basically the same as for the output stream buffer. A
character array is needed to serve as the internal buffer, along with an integer constant
that represents the size of the array. For the input buffer, we also need to specify the size of
the putback area, that is, the number of characters copied to the beginning of the internal
buffer by underflow() in order to allow successful calls to sputbackc() after refilling
the internal buffer. Details of this technique are described in section 2.2.4, File Stream
Buffers.
The constructor must install the private character array as the get area. Here is its
implementation:
setg() is the function that tells the base class which buffer will be used. Its argu-
ments are the beginning of the get area, the next read position in the get area, and the get
area’s end position. [21]
For the input direction we must redefine the virtual function underflow(). Here is
its implementation:
if (buffer_in() < 0)
    return traits_type::eof();
else
    return traits_type::to_int_type(*gptr());
21. If no buffer array is installed via setg(), the default setting initialized by basic_streambuf is used; it
indicates that no buffer is available. This is what we implicitly took advantage of in the unbuffered case.
First, underflow() checks if the next pointer has reached the end pointer. If not,
it returns the character to which the next pointer refers. Otherwise, it must refill the
internal character buffer with characters from the external device. In order to do this,
underflow() calls the private member function buffer_in(). Here is the implementation
of buffer_in():
if (retval <= 0)
{
setg(0,0,0);
return -1;
}
else
{
setg(buffer + pbSize - numPutbacks,
buffer + pbSize, buffer + pbSize + retval);
return retval;
The private member function buffer_in() copies the characters that
will remain in the internal character buffer from the end to the beginning of the array.
Afterwards, it fills the rest of the internal buffer with characters from the external device
by calling buffer_from_device(). Then it adjusts the get area pointers so that the
begin position refers to the beginning of the character array, the next position refers to the
first character read from the external device, and the end pointer points after the last
character read from the external device. If an error is returned from
buffer_from_device(), buffer_in() invalidates the get area by setting all get area
pointers to 0.
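The refill logic with a putback area can be tried out with a self-contained sketch. The class name string_inbuf is hypothetical, and a std::string replaces the external device, so an empty remainder plays the role of the device error; the constants correspond to bufSize and pbSize in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical buffered input stream buffer reading from an in-memory "device".
class string_inbuf : public std::streambuf
{
public:
    explicit string_inbuf(const std::string& device) : device_(device), pos_(0)
    {
        // Empty get area behind the putback zone: the first read refills it.
        char* const start = buffer + pbSize;
        setg(start, start, start);
    }

protected:
    virtual int_type underflow()
    {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        if (buffer_in() < 0)
            return traits_type::eof();
        return traits_type::to_int_type(*gptr());
    }

private:
    enum { bufSize = 16, pbSize = 4 };

    int buffer_in()
    {
        // Keep up to pbSize already consumed characters for later putback.
        std::ptrdiff_t numPutbacks = gptr() - eback();
        if (numPutbacks > pbSize)
            numPutbacks = pbSize;
        if (numPutbacks > 0)
            std::memmove(buffer + (pbSize - numPutbacks), gptr() - numPutbacks,
                         static_cast<std::size_t>(numPutbacks));

        // Fill the rest of the array from the "device".
        std::size_t n = device_.size() - pos_;
        if (n > static_cast<std::size_t>(bufSize - pbSize))
            n = bufSize - pbSize;
        if (n == 0)
        {
            setg(0, 0, 0);        // invalidate the get area
            return -1;
        }
        device_.copy(buffer + pbSize, n, pos_);
        pos_ += n;
        setg(buffer + pbSize - numPutbacks, buffer + pbSize, buffer + pbSize + n);
        return static_cast<int>(n);
    }

    string_inbuf(const string_inbuf&);              // copying is prohibited
    string_inbuf& operator=(const string_inbuf&);

    char buffer[bufSize];
    std::string device_;
    std::size_t pos_;
};
```

In this sketch the begin pointer of the get area is placed at the first retained putback character, so directly after a refill up to pbSize characters can be put back by a plain decrement of the next pointer, without any call to pbackfail().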
int main()
{
inbuf<char, char_traits<char> > mybuf;
istream mystream(&mybuf);
int testInt = 0;
cout << "Test Begin !!! - type in an integer: " << endl;
if (!mystream)
cout << "Error: " << mystream.rdstate() << endl;
else
cout << "Echo: " << testInt << endl;
mystream.putback('3');
if (!mystream)
cout << "Error: " << mystream.rdstate() << endl;
else
cout << "Echo: " << testInt << endl;
neutral state, from which I/O processing can be started in any direction. The details are
described in section 2.2.4, File Stream Buffers. As a consequence of this usage model, the
file buffer’s internal character array typically holds, at any given point in time, either the
get or the put area, or none of them. When the file buffer is engaged in input processing, it
is the get area that is held in the internal array; when the file buffer does output process-
ing, it is the put area; when the file buffer is in the neutral state, the internal character
array holds none of the areas. For a user-defined stream buffer, a similar model can be
used whenever it is possible to define an adequate set of conditions that put the stream
buffer into a neutral state.
These are two examples for the coordination of get and put areas, derived from the
models used for the concrete stream buffers of the standard library. Other solutions are
possible, as long as they fit into the IOStreams’ framework and are in line with proper
usage of the respective external device that the new stream buffer class represents.
3.4.1.2 Optional Functionality of Stream Buffers
Having discussed the core functionality of stream buffers, let us now turn to the features
that a stream buffer can, but need not, provide. We do not show any example implementations
for the optional functionality that we are going to discuss in this section. The reason
for this is simply that these features are either highly device-specific (like stream
positioning), so that any implementation would not have much explanatory value in general,
or that the features are so vaguely specified by the standard (like setbuf()) that
any implementation would be misleading.
because streams have a richer and more convenient interface. For this reason you will
often construct a new stream buffer object and use this stream buffer to create a new
stream. To make this task less tedious, a new stream type can be defined that automati-
cally creates a stream buffer of the newly defined type and uses it.
Additionally, new stream buffer classes sometimes have additional device-specific
operations. File stream buffers are a typical example; they have functions for opening and
closing the underlying file. A stream class that is defined in relation to a new stream buffer
class can act as a facade [22] for the device-specific functionality provided by the stream
buffer. For instance, the file stream classes are facades for their file buffers; they offer the
same functionality (open and close) in their interface and implement it by forwarding it to
the file buffer that they contain.
If you are considering deriving a new stream class along with a new stream buffer
class, you might want to take a look at section 3.3.2, Creating New Stream Classes by
Derivation.
22. A facade is a structural design pattern, described in Design Patterns—Elements of Reusable Object-Oriented Soft-
ware by Gamma, Helm, Johnson, and Vlissides (also known as the gang-of-four [GOF] book).
tures. All nonvirtual member functions defined in basic_streambuf are precisely spec-
ified and do not allow the library implementers much latitude. Hence, additional func-
tionality does not impair the portability of your stream buffer class, as long as it can be
implemented in terms of stream buffer functions that are not redefined in the concrete
stream buffer class. In all other situations, it is hard to come up with a comprehensive set
of rules for staying independent of any implementation-specific stream buffer features.
Often, you have to investigate in detail whether you are in danger of relying on
implementation-dependent features or not.
To understand the portability problem better, let us take a look at a simple example.
Say we want to extend the string buffer class so that it is able to determine how often a cer-
tain character occurs in the not-yet-consumed input sequence. We derive a new string
stream buffer class from basic_stringbuf and add a new public member function
called scontainc(). Here is its implementation:
if (gptr() == egptr())
underflow();
}
return cnt;
}
my_stringbuf() {}
private:
// prohibit copying and assignment
my_stringbuf(const my_stringbuf&);
my_stringbuf& operator=(const my_stringbuf&);
};
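For comparison, here is a minimal sketch of a counting function that calls no virtual stream buffer function at all. It rests on assumptions of ours: the class name counting_stringbuf is hypothetical, and the string buffer is opened for input only, so that the whole unread sequence is already contained in the get area set up by the constructor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical input-only string buffer that counts how often a character
// occurs in the part of the input sequence that has not been consumed yet.
class counting_stringbuf : public std::stringbuf
{
public:
    explicit counting_stringbuf(const std::string& s)
        : std::stringbuf(s, std::ios_base::in) {}

    std::size_t scontainc(char c) const
    {
        // Only the protected accessors gptr() and egptr() are used.
        return static_cast<std::size_t>(std::count(gptr(), egptr(), c));
    }
};
```

The restriction to input-only buffers is what keeps the sketch simple; for a string buffer that is also written to, the get area need not cover all characters written so far.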
Internationalization
This part of the book explains the internationalization and localization features in the
standard C++ library. The standard C++ library encompasses locales and facets. These
abstractions are used for internationalization of C++ programs.
Here is an overview of the chapters in this part:
This chapter should be read by readers who are not familiar with internation-
alization and seek a general overview of the problem domain.
Chapter 5, “Locales,” describes C++ locales and facets from a conceptual point
of view. It explains the major interfaces of class locale, gives a first overview of
facets, describes the relationship between locales and facets, and demonstrates
the usage of locales in internationalized applications.
This chapter should be read by everyone who wants to understand the con-
cept of C++ locales and facets. It is useful for software developers who want to
use C++ locales and facets in their programs.
Chapter 6, “Standard Facets,” deals with standard C++ facets in greater detail.
There are standard facets for numeric parsing/formatting, parsing /format-
ting of monetary values, parsing / formatting of time and date, character classi-
fication, string collation, character code conversion, and message retrieval
from catalogues. For each of the standard facets the chapter gives an overview
of its purpose, its main interfaces, and examples of its usage.
This chapter is of interest to readers who want to use the standard C++ facets
in their programs.
Chapter 7, “The Architecture of the Locale Framework,” describes the frame-
work that is provided by C++ locales and facets. It explains how facets are cre-
ated, identified, and replaced in a locale object.
This chapter should be read by everyone who strives for a deeper understand-
ing of the locale framework and intends to extend it. It is useful for software
developers who want to implement additional facets.
Chapter 8, “User-Defined Facets,” presents an example of a user-defined facet.
The example is meant as a guideline for implementing novel facets and their
usage in conjunction with standard C++ IOStreams.
This chapter is helpful for readers who want to implement a new facet rather
than using a standard one.
CHAPTER 4
Introduction to Internationalization
and Localization
For reasons of profitability, modern computer programs must be useful and attractive to
users all over the world. Naturally, computer users in different countries prefer to interact
with their computer in their native language. Ideally, a computer program will be adapt-
able to all conceivable local languages and cultural conventions. As the developer of a
product that aims for high international acceptance, you have to build adaptability to
local conventions into your program.
This chapter provides an introduction to internationalization for those readers who
want to internationalize their C++ programs but are not yet familiar with international-
ization. If you seek an initial overview of the problem domain, this is the right place to
begin.
First we will explain what internationalization and localization mean, how they relate,
and where they differ. Then we will show you examples of differences in cultural conven-
tions. These differences are a key problem that has to be dealt with when a program is
internationalized. Such cultural conventions include, among others, language, formatting
of numbers, currency symbols, formatting of time and date, ordering of words, and char-
acter encodings. You will find examples for each of the cultural conventions. Finally, we
devote a longer section to character encodings, because they are particularly interesting
for software developers. Differences in character encodings are a challenging problem;
they are relevant for file input and output and have an impact on the functionality of the
standard IOStreams.
1. I18N is a common abbreviation for internationalization. The 18 stands for the 18 characters in the word inter-
nationalization between the first character I and the last character N.
2. This distinction is not a contrived one; internationalization and localization are popular terms in the area of soft-
ware development for the world market.
4.2 Cultural Conventions 253
components that are available in the standard C++ library, so-called locales and facets.
However, before we delve into the details of C++ locales and facets, we want to give you
an idea of the relevant cultural differences. The subsequent examples will not be exhaus-
tive; there are many issues that need to be addressed when software is internationalized
that we will not mention here. Examples are orientation, sizing and positioning of screen
displays, vertical writing and printing, selection of font tables, handling of international
keyboards, and so on. The issues we will delineate below are those that can be addressed
by using components provided by the standard C++ library.
4.2.1 Language
Different ethnic groups use different languages, and of course language is the most appar-
ent difference between cultures. Even within a single country people might prefer differ-
ent languages. The Swiss, for example, use French, Italian, and German.
Languages also differ in the alphabets they use. Here are a couple of examples of
languages and their respective alphabets:
3. Because the radix character in U.S. English is a period, it is also often referred to as the decimal point. However,
in other cultures a comma is used as a radix character. Hence the term decimal point is misleading. Radix character
is the correct term for the character separating the integer part of a number from its fractional portion.
numbers with more than three digits, the so-called thousands separator, is a comma in En-
glish and a period in much of Europe.
Even the grouping of digits varies. In U.S. English, digits are grouped by threes, but
in Nepal, for instance, the first group has three digits, while all subsequent groups have
two digits.
USA: 10,000,000.00
France: 10.000.000,00
Nepal: 1,00,00,000.00
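Conventions like these can be expressed with the numpunct facet of the standard library. The following is a minimal sketch; the two facet classes and the helper format_with are hypothetical names of ours and are written by hand rather than taken from named locales, whose availability differs between platforms.

```cpp
#include <cassert>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: groups digits as 3, then 2, 2, ... counted from the right.
class nepal_style_punct : public std::numpunct<char>
{
protected:
    virtual char do_thousands_sep() const { return ','; }
    virtual std::string do_grouping() const { return "\3\2"; }
};

// Hypothetical facet: period as thousands separator, comma as radix character.
class european_style_punct : public std::numpunct<char>
{
protected:
    virtual char do_decimal_point() const { return ','; }
    virtual char do_thousands_sep() const { return '.'; }
    virtual std::string do_grouping() const { return "\3"; }
};

// Formats a value with two fractional digits using the given facet. The
// locale takes ownership of the facet and deletes it when it is no longer used.
inline std::string format_with(std::numpunct<char>* punct, double value)
{
    std::ostringstream out;
    out.imbue(std::locale(std::locale::classic(), punct));
    out << std::fixed << std::setprecision(2) << value;
    return out.str();
}
```

Each character of the grouping string gives the size of one digit group, starting with the rightmost group; the last entry is reused for all remaining groups.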
Here is an example that shows different cultural conventions for placing the cur-
rency symbol. It can appear before, after, or within the numeric value:
Germany: 49,99 DM
Japan: ¥ 100
United Kingdom before decimalization: £13 18s. 5d.
Also, the depiction of negative currency values varies among countries:
the current emperor. Many countries, especially in the Western world, use the Gregorian
calendar instead.
Here are examples of representations of the same date in different countries. They
differ in order of day, month, and year, the separators between those items, and the use or
omission of items such as the weekday in the long form of the date in Hungarian.
          SHORT FORM    LONG FORM
USA:      10/14/97      Tuesday, October 14, 1997
Germany:  14.10.97      Dienstag, 14. Oktober 1997
Italy:    14/10/97      martedì 14 ottobre 1997
Greece:   14/10/1997    Τρίτη, 14 Οκτωβρίου 1997
The same time can have different representations in different cultures. Here is an
example of the same time in a 12-hour clock and a 24-hour clock:
Germany: 17:55 Uhr
USA: 5:55 PM
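The short forms of the date shown above can be reproduced with the time_put facet. This is a sketch under assumptions of ours: the helpers format_date and october_14_1997 are hypothetical, and the patterns are spelled out by hand in strftime style instead of being taken from named locales, which may not be installed on a given platform.

```cpp
#include <cassert>
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Formats a date with the time_put facet of the classic locale, using an
// explicit pattern in place of a locale-specific default format.
inline std::string format_date(const std::tm& date, const std::string& pattern)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());
    const std::time_put<char>& tp =
        std::use_facet<std::time_put<char> >(out.getloc());
    tp.put(std::ostreambuf_iterator<char>(out), out, ' ', &date,
           pattern.data(), pattern.data() + pattern.size());
    return out.str();
}

// Hypothetical helper: the date used in the examples of this section.
inline std::tm october_14_1997()
{
    std::tm date = std::tm();
    date.tm_mday = 14;
    date.tm_mon = 9;      // months are counted from 0
    date.tm_year = 97;    // years are counted from 1900
    return date;
}
```

In an internationalized program the pattern would not be hard-coded; the conversion specifier for the locale's own date representation would be used with the facet of the desired locale.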
alien American
American Zulu
zebra alien
Zulu zebra
In an ASCII encoding, the numerical values of uppercase letters are smaller than the
values of lowercase letters. For this reason, all words with capital letters appear at the
beginning of a list sorted according to ASCII rules.
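The effect is easy to reproduce. The sketch below contrasts the default string comparison, which follows the character codes, with a comparison that ignores case. The comparator ignore_case_less and the two helper functions are hypothetical, and ignoring case is only a rough stand-in for real dictionary rules.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>
#include <vector>

// Compares two words character by character after conversion to lowercase,
// using the ctype facet of the classic locale.
struct ignore_case_less
{
    bool operator()(const std::string& a, const std::string& b) const
    {
        const std::ctype<char>& ct =
            std::use_facet<std::ctype<char> >(std::locale::classic());
        const std::string::size_type n = std::min(a.size(), b.size());
        for (std::string::size_type i = 0; i < n; ++i)
        {
            const char x = ct.tolower(a[i]);
            const char y = ct.tolower(b[i]);
            if (x != y)
                return x < y;
        }
        return a.size() < b.size();
    }
};

// Sorts by character code, which corresponds to ASCII rules on ASCII platforms.
inline std::vector<std::string> sorted_by_code(std::vector<std::string> words)
{
    std::sort(words.begin(), words.end());
    return words;
}

// Sorts without regard to case.
inline std::vector<std::string> sorted_ignoring_case(std::vector<std::string> words)
{
    std::sort(words.begin(), words.end(), ignore_case_less());
    return words;
}
```

For locale-specific ordering rules the standard library provides the collate facet; its behavior for a particular language depends on the named locales available on the platform.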
Here is an example of the difference for two-to-one character code pairs:
SPANISH DICTIONARY RULES    ASCII RULES
ano                         ano
cuchillo                    chaleco
chaleco                     cuchillo
dónde                       donde
lunes                       llava
llava                       lunes
maiz                        maiz
The word cuchillo is sorted before the word chaleco, because in Spanish ch is a
digraph; i.e., it is treated as a single character that is sorted after c and before d. Generally,
a digraph is a combination of characters that is written separately but forms a single lexical
unit. In the list above there is another example of a two-to-one character code pair, the
digraph ll, which is sorted after l and before m.
Here is an example of the difference for one character treated as two:
GERMAN DICTIONARY RULES    ASCII RULES
Musselin                   Musselin
Muße                       Muster
Muster                     Muße
The German character ß, called sharp s, is treated as if it were two characters, namely
ss. This makes a difference in ordering; it is sorted after ss and before st.
4.2.6 Messages
A program contains many pieces of text that can become visible to a user. Examples are
error messages that are displayed in certain situations, labels in a graphical user interface,
or standard text in headers and footers of listings and printouts. Internationalized pro-
grams never contain any such text strings or messages hard-coded in their source code.
The text is separated from the program itself, because these strings and messages depend
on the language and have to be translated prior to a product’s worldwide distribution.
The state-of-the-art technique for separating language-dependent text from a pro-
gram is to store all strings that need to be translated in so-called message catalogs. Such a
message catalog can be a file or a database that can be translated and exchanged indepen-
dently of the program. The program, rather than using hard-coded strings, accesses the
message catalog for a certain language and retrieves the respective localized messages.
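The idea can be illustrated with a deliberately simple sketch in which the catalogs live in memory. All names are hypothetical; a real program would load the catalogs from files or a database, for instance through the messages facet of the standard library, whose catalog format is platform-dependent.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory message catalog: maps a message id to its text.
typedef std::map<int, std::string> catalog;

// Hypothetical set of catalogs, keyed by a language tag such as "en" or "de".
class message_catalogs
{
public:
    void add(const std::string& language, int id, const std::string& text)
    {
        catalogs_[language][id] = text;
    }

    // Returns the localized message, or the supplied default if the language
    // or the message id is unknown.
    std::string get(const std::string& language, int id,
                    const std::string& dflt) const
    {
        std::map<std::string, catalog>::const_iterator cat = catalogs_.find(language);
        if (cat == catalogs_.end())
            return dflt;
        catalog::const_iterator msg = cat->second.find(id);
        return msg == cat->second.end() ? dflt : msg->second;
    }

private:
    std::map<std::string, catalog> catalogs_;
};
```

The program refers to a message only by its id; which text is displayed depends solely on the catalog that was selected.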
Character:                     ...  ?     @     A     B     C     D     E     F
Character code (hexadecimal):  ...  0x3F  0x40  0x41  0x42  0x43  0x44  0x45  0x46
4. Section 2.3, Character Types and Character Traits, explains how IOStreams handles character encodings.
5. ASCII stands for “American standard code for information interchange.” It is a 7-bit character codeset that is
the U.S. national variant of the internationally used ISO646 codeset.
6. EBCDIC stands for “extended binary coded decimal interchange code.” It is an 8-bit character codeset devel-
oped by IBM.
7. Unicode is a fixed-width, 16-bit worldwide character encoding that was developed and is maintained and
promoted by the Unicode Consortium, a nonprofit computer industry organization.
Usually, all character codes in a codeset are of equal size. Examples of character
codesets are
• the traditional 7-bit ASCII codeset, where every U.S. English character is encoded
in 7 bits,
• ISO 8859-1, a character codeset for Western European cultures using one byte per
character, and
In wide-character codesets, all characters are larger than one byte. This format is
needed when an alphabet contains more than 256 characters and thus cannot be represented
by a one-byte character codeset. The Japanese alphabet is an example; it consists of
thousands of characters. There are, for instance, two character codesets defined by the
Japanese Industry Standards Committee to represent Japanese characters: JIS X 0208-1990,⁸
which contains the most frequently used Japanese characters, and JIS X 0212-1990,⁹
which represents the less frequently used characters. Other examples of wide-character
codesets are international standards such as Unicode, ISO 10646.UCS-2, and
ISO 10646.UCS-4.¹⁰ Unicode and ISO 10646.UCS-2 are two-byte character codesets;
ISO 10646.UCS-4 is a four-byte codeset.
8. The official name for JIS X 0208-1990 is Code of the Japanese Graphic Character Set for Information Interchange.
9. The official name for JIS X 0212-1990 is Code of the Supplementary Japanese Graphic Character Set for Information
Interchange.
10. ISO 10646 is an encoding defined by ISO, the International Organization for Standardization. ISO 10646.UCS-2 is
code-for-code equivalent to Unicode.
Figure 4-1 shows a Japanese sentence composed of these four writing systems.
The sentence means: “Encoding methods such as JIS can support texts that mix
Japanese and English.”
The characters from the four writing systems can be represented as follows:
1. One huge, wide-character codeset that encompasses all characters of the Japanese
alphabet. This is the approach that is taken by international standards such as
Unicode or ISO10646.
2. A mix of several character codesets, some of which are one-byte codesets, others
being two-byte codesets. Each of the character codesets represents a subset of
characters of the Japanese alphabet. This is the approach standardized by the
Japanese industry’s standards organization JISC.
Let us consider the second case, because it is an example of where multibyte charac-
ter encodings are used. The Japanese industry uses national standards for character code-
sets. The characters from the four Japanese writing systems are represented in a number
of character codesets, the most common of which are listed below. Each of these character
codesets represents a certain subset of the Japanese alphabet:
There are many ways of translating a byte sequence into a sequence of characters
from the character codesets listed above. Consequently, there are several multibyte encodings
for Japanese, none of which is universally recognized. Instead, there are three common
multibyte encoding schemes: JIS (Japanese Industrial Standard), Shift-JIS, and EUC
(Extended UNIX Code).
11. JIS-ROMAN is a combination of ASCII characters and half-width katakana characters. Katakana and hiragana
characters are encoded in one byte, using the 128 character codes left over by the 7-bit ASCII codes.
The three multibyte encodings just described are typically used in separate areas:
• JIS is the primary encoding method used for electronic transmission such as email,
because it uses only 7 bits of each byte. This is a required feature, because some
network paths strip the eighth bit from characters. Escape sequences are used to
switch between one- and two-byte modes, as well as between different character
sets.
• Shift-JIS was invented by Microsoft for use on its platforms. Each byte is inspected
to see if it is a one-byte character or the first byte of a two-byte character. Shift-JIS
does not support as many characters as JIS and EUC do.
• EUC encoding is implemented as the internal code for most UNIX-based platforms
available on the Japanese market. It allows for characters containing more
than two bytes and is much more extensible than Shift-JIS. EUC is a general
method for handling multiple character sets and is not peculiar to Japanese
encoding.
in the way a character sequence is interpreted. The use of the shift sequence is demon-
strated in figure 4-2.
For encoding schemes containing shift sequences, like JIS, it is necessary to maintain
a shift state while parsing a character sequence. In the example above, we are in some ini-
tial shift state at the start of the sequence. Here it is ASCII. Therefore, characters are
assumed to be one-byte ASCII codes until the shift sequence <ESC>$B is seen. This
switches us to two-byte mode, as defined by JIS X 0208-1983. The shift sequence <ESC>(B
then switches us back to ASCII mode.
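To make the role of the shift state concrete, here is a minimal sketch that counts the characters in such a byte sequence. It is an illustration under simplifying assumptions: only the two shift sequences mentioned above are recognized, the initial state is taken to be ASCII, and the names ShiftState and countCharacters() are made up for this example. A truncated two-byte character at the end of the input is simply counted as one character.

```cpp
#include <cstddef>
#include <string>

// The two shift states used in this sketch.
enum ShiftState { Ascii, TwoByte };

// Counts characters (not bytes) by tracking the shift state while parsing.
std::size_t countCharacters(const std::string& bytes)
{
    ShiftState state = Ascii;          // assumed initial shift state
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < bytes.size()) {
        // A shift sequence is three bytes long: ESC followed by two bytes.
        if (bytes[i] == '\x1B' && i + 2 < bytes.size()) {
            if (bytes[i + 1] == '$' && bytes[i + 2] == 'B') {
                state = TwoByte;       // <ESC>$B: switch to two-byte mode
                i += 3;
                continue;
            }
            if (bytes[i + 1] == '(' && bytes[i + 2] == 'B') {
                state = Ascii;         // <ESC>(B: switch back to ASCII
                i += 3;
                continue;
            }
        }
        // One character occupies one byte in ASCII mode, two bytes otherwise.
        i += (state == Ascii) ? 1 : 2;
        ++count;
    }
    return count;
}
```

Note that the meaning of a byte cannot be determined without knowing the shift state that was established earlier in the sequence, which is why the sequence has to be parsed from the front.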
external code, because they are compact and save storage space. When used for internal
storage or processing in a program, they are not very efficient. For instance, they do not
allow random access to arbitrary positions within a character sequence. Instead, wide-character
codesets are used because they are more convenient for data processing. The following
example will illustrate how wide characters make text processing inside a program easier.
Consider a filename string containing a directory path with adjacent names separated
by a slash, like /CC/include/locale.h. To find the actual filename in a single-byte
character string, we can start at the back of the string and process the character
sequence byte by byte. When we find the first separator, we know where the filename
starts. If the string contains multibyte characters, we always have to scan from the front so
we don’t inspect bytes out of context. However, if we represent the multibyte string as a
sequence of wide characters instead, we can treat it like a single-byte character string and scan
from the back, processing it in portions of “wide-character size.”
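The backward scan over a wide-character string might look like the following sketch. The function name fileNameOf() and the default separator are assumptions for this illustration.

```cpp
#include <string>

// Returns the part of the path after the last separator. The scan starts
// at the back and inspects one wide character at a time, which is safe
// because every element of the string is a complete character.
std::wstring fileNameOf(const std::wstring& path, wchar_t separator = L'/')
{
    std::wstring::size_type pos = path.size();
    while (pos > 0 && path[pos - 1] != separator)
        --pos;
    return path.substr(pos);
}
```

With a multibyte string the same loop would not be safe, because a byte that happens to have the value of the separator could be the second byte of a two-byte character.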
Multibyte encoding provides an efficient way to move characters around outside
programs, and between programs and the outside world. Once inside a program, how-
ever, it is easier and more efficient to deal with wide-character codesets where all charac-
ters have the same size and format.
4.2.7.5 Code Conversions
Since wide-character codesets are usually used for internal representation of characters in
a program and multibyte encodings are used for external representation, converting
multibytes to wide characters is a common task during input/output operations. Input to
and output from files is a typical example. The external file might contain multibyte char-
acters. When you read such a file, you convert these multibyte characters into wide char-
acters that you store in an internal wide-character buffer for further processing. When you
write to a multibyte file, you have to convert the wide characters held internally into
multibytes for storage on an external file. Figure 4-3 demonstrates graphically how this
conversion during file input is done:
[Figure 4-3: external file in JIS encoding (J a p a n <ESC>$B ...) converted into an internal buffer in Unicode (J a p a n ...)]
Locales
In the preceding chapter we discussed various areas in which cultural conventions differ
from one region to another. For each of these cultural conventions, the standard C++
library provides services that enable you to internationalize your C++ programs. These
services are bundled into so-called facets. A facet encapsulates data that represents a set of
culture and language dependencies and/or offers a set of related internationalization services.
Facet objects are maintained by so-called locales. Basically, a locale is a container of
facets. In a C++ program, locales are objects of a class type called locale and a facet is an
object of a facet type. All facet types are derived from a class called locale::facet.¹
Facet types are either predefined in the standard library or user-defined. The predefined
facet types are called the standard facets. The standard facets provide services and
information about the basic set of cultural differences. Such differences concern language
and alphabet as well as the formatting of numeric, monetary, date, and time values.² User-defined
facets cover further areas of cultural differences, beyond the basic set provided by
the standard facets. User-defined facets are present in a locale only if they are explicitly
added to the locale. Standard facets, in contrast, are automatically contained in every
locale. The idea is that the standard facets provide a basic set of internationalization services
that must be available in every locale.³
1. Details about facet types can be found in section 7.2.1, Facet Identification.
2. Examples of such cultural differences are given in section 4.2, Cultural Conventions.
There are two grouping mechanisms for facets in a locale: facet families and locale
categories.
FACET FAMILY. A facet family is a hierarchy of facet types that are derived from each
other. The base class facet defines a facet interface common to all members of the facet family.
Some of these facet families are closely related because their base classes are instantiations
or specializations of a facet base class template. For instance, template <class
charT> class ctype is the base class template of the ctype facet families. The facet base
classes instantiated or specialized from it are ctype<char> and ctype<wchar_t>. Any
derived classes, such as ctype_byname<char> and ctype_byname<wchar_t>, are
members of their respective facet families.
LOCALE CATEGORY. A locale category denotes a group of standard facet families. In a
C++ program, a category is a bitmask value of type locale::category. For instance,
the locale category locale::ctype comprises the ctype and codecvt facet families.
Table 5-1 lists the standard facets and shows how they are grouped into categories.
3. The predefined standard facets are discussed in section 6, Standard Facets. User-defined, nonstandard facet
types are explained in section 8, User-Defined Facets.
After this first cursory overview of locales and facets, let us take a look at the
locale class. In the following two sections, we explain how locale objects can be created
and how facets can be retrieved from a locale object.
locale with one facet replaced or added is nameless. (Combined locales and replacement
of facets are discussed in section 5.1.2, Combined Locales.) Such nonstandard locales do
not have a meaningful name and the name() member function returns "*" in such cases.
In this book we use locale names as they are allowed on Microsoft platforms. Here
are a couple of examples:
locale native("");
locale usa("American_USA.1252");
locale holland("Dutch");
locale global;
native : German_Germany.1252
classic: C
global : C
holland: Dutch_Netherlands.1252
usa : English_United States.1252
From the name of the native locale you can tell that these examples were compiled
on a “German” computer. The default constructor creates a snapshot of the so-called
global locale. The global locale is discussed in section 5.1.3, The Global Locale.
The locale name can be used for comparing locales. In general, two locales are equal
if they are the same locale or one is a copy of the other. If both locales have a name and the
names are identical, they are also considered equal. As a result, unnamed locales are equal
only to themselves and copies of themselves; in addition, locales are equal to locales with
the same name.
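These comparison rules can be checked with a small sketch that uses only the portable "C" locale. The facet type CommaPunct and the three function names are assumptions introduced for this illustration; CommaPunct serves only to create unnamed locales.

```cpp
#include <locale>

// A hypothetical facet, used here only to create unnamed locales.
struct CommaPunct : std::numpunct<char> {
protected:
    char do_decimal_point() const { return ','; }
};

// Two named locales with identical names compare equal.
bool namedLocalesWithSameNameAreEqual()
{
    return std::locale("C") == std::locale::classic();
}

// An unnamed locale is equal to a copy of itself.
bool unnamedLocaleEqualsItsCopy()
{
    std::locale unnamed(std::locale::classic(), new CommaPunct);
    std::locale copy(unnamed);
    return unnamed == copy;
}

// Two unnamed locales that were built separately are not equal,
// even if they were built in exactly the same way.
bool separatelyBuiltUnnamedLocalesDiffer()
{
    std::locale a(std::locale::classic(), new CommaPunct);
    std::locale b(std::locale::classic(), new CommaPunct);
    return a != b;
}
```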
components that use locales, because they can rely on the fact that the locale will never
change and that references to facets obtained from that locale will stay valid as long as the
locale exists.
However, due to the immutability of locale objects, nonstandard locales cannot be
created by modifying existing locales, but only as a copy of an existing locale with one or
several facets replaced or added. For the creation of such a combined locale, the locale
class offers several constructors and member functions. The following two functions
allow replacement or addition of exactly one facet.
This constructor creates a locale that is a copy of an existing locale with exactly one
facet replaced or added. Instead of the original facet of type Facet, the provided facet
fac is contained in the locale object. If the original locale does not contain a facet of type
Facet, the provided facet fac is added to the constructed locale.
This member function creates a copy of the locale object on which it is invoked, and
the copy has the facet of type Facet replaced or added by the corresponding facet from
the existing locale other.
Here are some examples:
locale holland("Dutch");
dutch_german = locale("German").combine< moneypunct<char> >(holland);
The construction and the invocation of combine() yield the same combined locale;
it’s a German locale that has the moneypunct facet from a Dutch locale. As mentioned
before, the resulting combined locale has no name.
Note that the facet to be replaced or added is identified by its family name. As mentioned
before, a locale contains at most one member of a given facet family, and for this
reason the name of the base class type can be used to identify the respective representative
of a family in a locale. Section 7.2, Identification and Lookup of Facets in a Locale, explains
the details.
The predefined facets that are required to be contained in each locale are grouped
into so-called categories. Section 6.3.2, Locale Categories, describes the details of locale
categories. A locale category is of type locale::category, which is a bitmask type, and
values of that type can be combinations of categories. Instead of replacing only one facet,
all facets from one or several categories can be exchanged. The following two locale
constructors allow replacement of locale categories:
This constructor creates a locale that is a copy of the existing locale other, which
has all facets belonging to the specified locale categories replaced by facets taken from a
named locale with the name std_name.
This constructor creates a locale that is a copy of the existing locale other, which
has all facets belonging to the specified locale categories replaced by facets taken from
another locale.
The resulting combined locales have names if both source locales are named. What the
name of the resulting locale looks like is, however, not specified by the standard.
Here are some examples:
locale polish_german_1(locale("German"), "Polish", locale::monetary);
out << "german_base polish_monetary: " << polish_german_1.name() << endl;
locale polish_german_2(locale("German"), locale("Polish"), locale::time);
out << "german_base polish_time : " << polish_german_2.name() << endl;
At first glance these locale constructors, which copy an existing locale and replace
certain facets, look like expensive operations. But we will see in chapter 7, “The Architec-
ture of the Locale Framework,” that they are in fact lightweight operations, due to the
locale’s architecture.
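A portable variant of category replacement, which does not depend on platform-specific locale names, could look like this sketch. CommaPunct and replaceNumericCategory() are hypothetical names introduced for this illustration.

```cpp
#include <locale>

// A hypothetical numpunct<char> replacement with a comma as decimal point.
struct CommaPunct : std::numpunct<char> {
protected:
    char do_decimal_point() const { return ','; }
};

// Returns a copy of base in which all facets of the numeric category
// (numpunct, num_get, num_put) are taken from source. Several categories
// could be replaced at once by combining them with operator|, for example
// std::locale::numeric | std::locale::monetary.
std::locale replaceNumericCategory(const std::locale& base,
                                   const std::locale& source)
{
    return std::locale(base, source, std::locale::numeric);
}
```

Because the source locale in the usage we have in mind is unnamed (it was created by adding a facet), the resulting locale is unnamed as well.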
locale global;
out << "global: " << global.name() << endl;
locale holland("Dutch");
locale::global(holland);
out << "global: " << global.name() << endl;
global = locale(); // get a new snapshot
out << "global: " << global.name() << endl;
global: C
global: C
global: Dutch_Netherlands.1252
global: French_Switzerland.1252
The standard does not mandate which locale is the default for the global locale.
From the code above, you can see that on our computer the initial global locale is the clas-
sic “C” locale.
You can also see that making a locale global does not affect any previously created
snapshots of the global locale: The snapshot held in the variable global does not change
after making a Dutch locale global. A new snapshot must be created to see the effect of
having changed the global locale.
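The snapshot behavior can be reproduced without any platform-specific locale names. The following is a sketch under the assumption that a locale with a replaced numpunct facet (the hypothetical CommaPunct) stands in for the Dutch locale; the function name is made up as well. The original global locale is restored before the function returns.

```cpp
#include <locale>

// A hypothetical facet, used only to obtain a locale that differs
// from the initial global locale.
struct CommaPunct : std::numpunct<char> {
protected:
    char do_decimal_point() const { return ','; }
};

bool snapshotIsUnaffectedByGlobalChange()
{
    std::locale snapshot;   // snapshot of the current global locale
    std::locale custom(std::locale::classic(), new CommaPunct);

    // locale::global() installs the new global locale and returns the old one.
    std::locale previous = std::locale::global(custom);

    bool snapshotUnchanged = (snapshot == previous);        // old snapshot is untouched
    bool newSnapshotSeesChange = (std::locale() == custom); // new snapshot sees the change

    std::locale::global(previous);   // restore the original global locale
    return snapshotUnchanged && newSnapshotSeesChange;
}
```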
The C library also has the notion of a global locale. Setting the global C++ locale
might have an effect on the global C locale. The details are described in appendix E, “Rela-
tionship Between C and C++ Locales.”
Now that we know how locales can be created, let us see how we can access the
facets contained in a locale.
5.2.1 has_facet
has_facet<Facet>(loc) checks whether the locale loc contains a facet that is of the
requested type Facet or any type derived therefrom.
For all mandatory facet types, such as money_put<char> for instance, has_facet()
returns true, of course, and the call to has_facet() is really not necessary.
For other facet types, in contrast, has_facet() provides valuable information, and one
might want to check for a facet’s existence before use_facet() is called in order to
avoid the exception that use_facet() would throw if the facet could not be found. In
our example above, we checked whether a facet of type
money_put<char, string_inserter<char> > is contained in the current global locale.⁴ The
instantiation of money_put for a nondefault iterator type is not automatically contained in
every locale, and for this reason, has_facet() might return false for this facet type.
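The difference between mandatory and non-mandatory facet types can be illustrated with a user-defined facet. The facet type Greeting below is purely hypothetical; the only requirements it has to meet are that it is derived from locale::facet and has a static data member id of type locale::id.

```cpp
#include <cstddef>
#include <locale>

// Hypothetical user-defined facet type that starts a new facet family.
class Greeting : public std::locale::facet {
public:
    static std::locale::id id;
    explicit Greeting(std::size_t refs = 0) : std::locale::facet(refs) {}
};
std::locale::id Greeting::id;

// The classic locale contains only the standard facets.
bool classicHasGreeting()
{
    return std::has_facet<Greeting>(std::locale::classic());
}

// A locale to which the facet was explicitly added does contain it.
bool extendedHasGreeting()
{
    std::locale extended(std::locale::classic(), new Greeting);
    return std::has_facet<Greeting>(extended);
}

// For a mandatory facet type the check always succeeds.
bool classicHasMoneyPut()
{
    return std::has_facet< std::money_put<char> >(std::locale::classic());
}
```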
5.2.2 use_facet
use_facet<Facet>(loc) searches the locale loc for a facet that is of the requested
type Facet or any type derived therefrom. If a facet of the requested type can be found in
the locale, a reference to the found facet is returned. If eventually no matching facet can be
found, a bad_cast exception is thrown.
When use_facet() can find a matching facet, it returns a reference to it. References
to elements in a container always raise the question: How long do these references
stay valid? Often, element references become invalid after the first modifying access to the
container. As we’ve mentioned earlier, locales are immutable objects. There is no such
thing as a modifying access to a locale object. For this reason, the reference to the facet
returned from use_facet() remains valid as long as the containing locale exists. Actually,
the guarantee is even stronger: The reference returned remains valid at least as long
as any copy of the containing locale exists. Section 4.2, Memory Management of Facets in
a Locale, explains why this is and how it works.
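Both properties of use_facet(), the exception and the lifetime guarantee, are shown in the following sketch. The facet type Greeting and the two function names are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <locale>
#include <string>
#include <typeinfo>   // std::bad_cast

// Hypothetical user-defined facet type.
class Greeting : public std::locale::facet {
public:
    static std::locale::id id;
    explicit Greeting(std::size_t refs = 0) : std::locale::facet(refs) {}
    std::string text() const { return "hello"; }
};
std::locale::id Greeting::id;

// use_facet() throws bad_cast if the locale does not contain the facet.
bool useFacetThrowsWhenMissing()
{
    try {
        (void) std::use_facet<Greeting>(std::locale::classic());
    } catch (const std::bad_cast&) {
        return true;
    }
    return false;
}

// The facet stays valid as long as any copy of the locale exists.
std::string referenceOutlivesOriginalLocale()
{
    std::locale keeper;                 // will receive a copy of the locale
    const Greeting* facet = 0;
    {
        std::locale extended(std::locale::classic(), new Greeting);
        facet = &std::use_facet<Greeting>(extended);
        keeper = extended;              // the copy shares the facet
    }                                   // extended is destroyed here
    return facet->text();               // still valid, because keeper exists
}
```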
4. The facet type money_put<char, string_inserter<char> > is an instantiation of the money_put template
for an output iterator other than the default, which is ostreambuf_iterator<charT> and was omitted
from the list of template arguments in money_put<char>. Instead of writing the formatted monetary value to
an output stream, we've assumed that there is an iterator type string_inserter<char> that enables output
to a string. Details about the monetary facets are described in section 6.2.2, Monetary Values.
Yet keep in mind that the validity of the facet reference returned from
use_facet() is tied to the lifetime of its containing locale and any copies of that locale.
In particular, be cautious with passing a temporary locale object as an argument to
use_facet(). Depending on the context, the facet reference returned might be invalid,
because the containing locale has already been destroyed. In the following code snippet,
use of a temporary locale object will lead to a program crash:
The temporary German locale lives until the end of the enclosing expression and
goes out of scope at the end of the first statement in the code snippet above, which is
before the returned reference to the contained numpunct facet is used. The facet referred
to has already been destroyed along with its containing locale, and a program crash is the
result of the second statement.
Use of temporary locale objects does work if the lifetime of the temporary object is
taken into account correctly. Below, there is no problem, because the locale and the
returned facet reference are temporary objects within the same enclosing expression:
because the temporary locale is a copy of the current global locale, and the facet reference
returned by use_facet() stays valid as long as any copies of the containing locale exist.
To avoid any confusion, we recommend generally NOT passing temporary locales
to use_facet(). Typically, locales are objects with a relatively long lifetime. The code
below is much safer than any of the examples above:
locale german("German");
const numpunct<char>& fac = use_facet<numpunct<char> >(german);
cout << "true in German: " << fac.truename() << endl;
5. truename() and falsename() are member functions of the numpunct facet that yield the international-
ized counterparts to true and false.
CHAPTER 6
Standard Facets
Facets are objects of a facet type that is either predefined in the standard library or user-
defined. The predefined facet types, called the standard facets, are discussed in detail in
this chapter. User-defined, nonstandard facet types are explained in chapter 8, “User-
Defined Facets.”
The standard facets provide services and information about the basic set of cultural
differences. Such differences concern language and alphabet as well as the formatting of
numeric, monetary, date, and time values. Examples of such cultural differences were
given in section 4.2, Cultural Conventions. User-defined facets cover further areas of cul-
tural differences, beyond the basic set provided by the standard facets. User-defined
facets are present in a locale only if they were explicitly added to that locale. Standard
facets, in contrast, are automatically contained in every locale. The idea is that the stan-
dard facets provide a basic set of internationalization services that must be available in
every locale.
To be more precise, we should say that in every locale there is one representative of
each standard facet family. Remember, we mentioned before, at the beginning of chapter 5,
that facet types are organized in facet families, that is, hierarchies of facet classes derived
from each other. Some of these facet families are closely related, because their base classes
are instantiations or specializations of a facet base class template. For instance, template
<class charT> class ctype is the base class template of the ctype facet families.
The facet base classes instantiated or specialized from it are ctype<char> and
ctype<wchar_t>. Each locale object contains one representative from each ctype fam-
ily, that is, one facet of type ctype<char>, or any type derived therefrom, and one
facet of type ctype<wchar_t>, or any type derived therefrom. A facet of type
ctype<myCharacterType>, or any type derived therefrom, would not automatically
be contained in each locale. This is because although an instantiation or specialization of
the ctype facet template for a character type other than char or wchar_t does introduce
a new ctype facet family, the new family is not a mandatory one. We will take a closer look
at the facet templates and required instantiations or specializations of these templates in
section 6.3.1.4, Mandatory Facet Types, where we provide a complete list of the facet fam-
ilies that are represented in every locale.
In the following sections we review all of the standard facet families and explain
their functionality. We discuss alphabet- and language-related facets first; then we explore
the parsing and formatting facets; and eventually we take a second look at the standard
facet families, their base and derived classes, and the locale categories.
It contains the classification and conversion functionality for a character set that can
be represented by the character type charT. Two instantiations or specializations of this
template are provided by the standard library: a ctype facet for narrow characters of type
char and one for wide characters of type wchar_t.
6.1.1.1 Character Classification
Among other services, the ctype facet provides the functionality to classify the characters
of a character set. Criteria for this classification are provided as an enumerated bitmask
type, which is called mask. It is a nested type in ctype_base, the public base class of the
ctype facet template. The values of mask and their semantics are listed in table 6-1:
• to determine all criteria to which each character from a range of characters conforms,
by means of an overloaded version of the member function is():
const charT* is(const charT* beg, const charT* end,
mask* vec) const;
• to find the first element in a range of characters that conforms or does not conform
to certain criteria, by means of the member functions:
and
Let’s have a look at some examples that show how these member functions can be
used. If loc is a locale object that contains a ctype<char> facet and if c is a character of
type char,
returns a bool value that indicates if c is a lowercase character or not. If we use a variable
m of type ctype_base::mask, then a different overloaded version of is(), namely,
places the bitmask into m that contains all classification criteria to which c conforms. This
second version of is() allows for a character range, where the first parameter indicates
the beginning and the second parameter points one past the end. The third parameter
points to the vector of type ctype_base::mask[], which contains the bitmasks that
characterize each character from the range after the operation has finished.
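The classification functions discussed so far can be exercised as in the following sketch. The helper names isLower(), countUpper(), leadingSpaces(), and firstDigit() are assumptions made for this illustration; scan_is() and scan_not() are the ctype member functions that find the first character that does or does not conform to a mask.

```cpp
#include <cstddef>
#include <locale>
#include <string>
#include <vector>

// Single-character classification with is(mask, c).
bool isLower(char c, const std::locale& loc)
{
    return std::use_facet< std::ctype<char> >(loc).is(std::ctype_base::lower, c);
}

// Range classification with is(beg, end, vec): one bitmask per character.
std::size_t countUpper(const std::string& s, const std::locale& loc)
{
    if (s.empty())
        return 0;
    std::vector<std::ctype_base::mask> masks(s.size());
    std::use_facet< std::ctype<char> >(loc).is(s.data(), s.data() + s.size(), &masks[0]);
    std::size_t count = 0;
    for (std::size_t i = 0; i < masks.size(); ++i)
        if ((masks[i] & std::ctype_base::upper) != 0)
            ++count;
    return count;
}

// scan_not() finds the first character that does NOT conform to the mask.
std::size_t leadingSpaces(const std::string& s, const std::locale& loc)
{
    const char* beg = s.data();
    const char* end = beg + s.size();
    const char* pos =
        std::use_facet< std::ctype<char> >(loc).scan_not(std::ctype_base::space, beg, end);
    return static_cast<std::size_t>(pos - beg);
}

// scan_is() finds the first character that conforms to the mask;
// if there is none, the end of the range is returned.
std::size_t firstDigit(const std::string& s, const std::locale& loc)
{
    const char* beg = s.data();
    const char* end = beg + s.size();
    const char* pos =
        std::use_facet< std::ctype<char> >(loc).scan_is(std::ctype_base::digit, beg, end);
    return static_cast<std::size_t>(pos - beg);
}
```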
For the sake of convenience, additional global functions, in the namespace std, are
provided that allow for classification if certain criteria apply to a single character. These
are functions like isspace(), isprint(), isupper(), etc. Each function is implemented
by calling the ctype facet’s member function is() with the corresponding mask
argument; e.g., isupper() is implemented in the following way:
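A minimal sketch of such an implementation is given below. The function is named isUpperSketch() here, a made-up name, to avoid a clash with the standard function; an actual library implementation may differ in detail but has the same effect.

```cpp
#include <locale>

// Sketch of how a convenience function like std::isupper(c, loc) can be
// expressed in terms of the ctype facet's member function is().
template <class charT>
bool isUpperSketch(charT c, const std::locale& loc)
{
    return std::use_facet< std::ctype<charT> >(loc).is(std::ctype_base::upper, c);
}
```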
replaces each character in the range [cp1, cp2) for which a corresponding uppercase
character exists, with that character. The overloaded versions of tolower() have analogous
behavior.
For convenience, the character conversion from uppercase to lowercase and vice
versa is also provided by global functions in the namespace std. These functions are
Again, the convenience functions are implemented by using loc’s ctype member
functions tolower() and toupper().
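Here is a short sketch of the range version of toupper() in use. The helper name upperCased() is an assumption for this example.

```cpp
#include <locale>
#include <string>

// Converts all characters of s to uppercase in place, using the range
// version of ctype<char>::toupper(), and returns the result.
std::string upperCased(std::string s, const std::locale& loc)
{
    if (!s.empty()) {
        const std::ctype<char>& ct = std::use_facet< std::ctype<char> >(loc);
        ct.toupper(&s[0], &s[0] + s.size());
    }
    return s;
}
```

Characters that have no uppercase counterpart, such as digits, are left unchanged.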
const charT* narrow(const charT* beg, const charT* end, char dfault,
char* to) const;
const char* widen (const char* beg, const char* end,
charT* to) const;
The parameter dfault is used by narrow() as the return value whenever a wide
character to be converted does not have a narrow-character counterpart. Not all wide
characters must have a corresponding narrow-character equivalent. The only characters
for which the standard requires that widen() must provide a unique transformation are
those in the basic source character set.² For the basic source characters, the following
invariant holds:
widen(narrow(c, 0)) == c
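The range versions of widen() and narrow() might be used as in the following sketch, which restricts itself to the classic locale and the ctype<wchar_t> facet. The helper names widened() and narrowed() are assumptions for this illustration.

```cpp
#include <locale>
#include <string>

// Converts a narrow string to a wide string, character by character.
std::wstring widened(const std::string& s, const std::locale& loc)
{
    std::wstring result(s.size(), L'\0');
    if (!s.empty())
        std::use_facet< std::ctype<wchar_t> >(loc)
            .widen(s.data(), s.data() + s.size(), &result[0]);
    return result;
}

// Converts a wide string to a narrow string; wide characters without a
// narrow counterpart are replaced by dfault.
std::string narrowed(const std::wstring& ws, char dfault, const std::locale& loc)
{
    std::string result(ws.size(), '\0');
    if (!ws.empty())
        std::use_facet< std::ctype<wchar_t> >(loc)
            .narrow(ws.data(), ws.data() + ws.size(), dfault, &result[0]);
    return result;
}
```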
2. The basic source character set consists of 96 characters: the space character, the control characters representing
horizontal tab, vertical tab, form feed, and newline, plus the following characters:
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9
_ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
returns an integer value that indicates the order of two character sequences [beg1,
end1) and [beg2, end2):
For two strings for which compare() yields 0, the calculated hash value is the same.
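The relationship between compare() and hash() can be tried out with the following sketch. The helper names compareStrings() and hashOf() are assumptions for this example; the sign of the value returned by compare() is what carries the information about the order.

```cpp
#include <locale>
#include <string>

// Culture-dependent comparison of two strings via the collate facet.
// The result is negative, zero, or positive.
int compareStrings(const std::string& a, const std::string& b,
                   const std::locale& loc)
{
    const std::collate<char>& col = std::use_facet< std::collate<char> >(loc);
    return col.compare(a.data(), a.data() + a.size(),
                       b.data(), b.data() + b.size());
}

// Hash value of a string; strings that compare equal have equal hash values.
long hashOf(const std::string& s, const std::locale& loc)
{
    return std::use_facet< std::collate<char> >(loc).hash(s.data(), s.data() + s.size());
}
```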
With its member function
the collate facet provides a way to speed up the comparison of one character sequence
against many others. transform() returns a string that is the transformation of the character
sequence represented by the range [beg, end) to an internal representation. The lexicographic
comparison³ of two strings resulting from transform() yields the same result
as the comparison of the original character sequences with compare(). When one string
is compared against n others, using n+1 calls to transform() plus n lexicographic
comparisons can yield better performance than n calls to compare(). This is
particularly true when the strings contain characters that cannot be compared on a
character-by-character basis, such as the German character ß, which is treated like
two characters. The code below shows
an implementation of a one-to-many comparison that uses transform():
String tmpOne =
collateFacet.transform(one.c_str(),one.c_str()+one.length());
if (tmpOne == tmpMany)
return (begin) ;
return (end);
The function template compare1toM() compares the string one against the strings
in the range from begin to end. An iterator pointing to the first string in this range, which
equals the string one, is returned; otherwise end is returned. The expression tmpOne==
tmpMany invokes the operator==() for basic_string<charT>, which is semanti-
cally equivalent to a lexicographic comparison of tmpOne and tmpMany that yields 0.
3. The lexicographic comparison means that the actual numeric codes are compared, by using the
operator<() of the character type, which is a fast and efficient operation at least for the built-in character
types char and wchar_t. In particular, it is much faster than the culture-dependent compare.
The fact that the comparison operator for basic_string<charT> is used to do
the lexicographic comparison of two strings reveals that this operation on strings is not
internationalized. The same is true for all other basic_string<charT> operations. For
“culture sensitive” string comparison the functionality of the collate facet must be used
instead. For convenience, the locale provides culture-dependent string comparison by
means of its function call operator operator()(), which is declared as
Its implementation is based on the collate facet’s member function compare(). It
can be used as a convenient alternative and additionally allows usage of a locale as a
comparator argument to the standard containers and algorithms.
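Using a locale object as a comparator might look like the following sketch; the helper name sortedWith() is an assumption for this example.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Sorts the words according to the collation rules of loc. The locale
// object itself serves as the comparator: the algorithm invokes the
// locale's function call operator for each pair of strings.
std::vector<std::string> sortedWith(const std::locale& loc,
                                    std::vector<std::string> words)
{
    std::sort(words.begin(), words.end(), loc);
    return words;
}
```

The same works for associative containers; for instance, a set of strings ordered by a locale can be declared as std::set<std::string, std::locale> and constructed with a locale object as its comparator.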
• internT, which is the character type associated with the internal code set
• externT, which is the character type associated with the external code set
• stateT, which is the state type that is capable of holding the conversion state. It
must be maintained during a conversion from the external to the internal character
set and vice versa
The codecvt facet contains two types of member functions: those that provide information
about the code conversion, and those that perform the conversion. The first category
has five member functions:
• always_noconv(), which indicates if a code conversion is needed
in() is used to convert from the external to the internal character set; out() is used
for the opposite direction. The parameters and usage of both functions are similar.
The first parameter is the conversion state. In section 7.3.2, Immutability of Facets in
a Locale, we describe that and explain why facets can have only const member functions:
They are immutable objects. A side effect is that the conversion state cannot be a
data member of the code conversion facet, because the functions in() and out() change
the conversion state during code conversion. Instead, the conversion state must be held
outside the facet and must be passed into the facet with each call to in() or out().
The arguments [from, from_end) specify the character sequence that is going to
be converted. The resulting converted character sequence is written to [to,to_limit).
The return type codecvt_base::result is a nested enumerated type in
codecvt_base, the base class of codecvt. The values of codecvt_base::result
and their semantics are listed in table 6-2.
After the return, from_next and to_next point one character past the last charac-
ter processed by in() or out().
284 Standard Facets
Correct usage of the conversion functions is not completely obvious from studying
their signatures. The following example should help to reveal the subtleties. The example
shows a situation where multibyte characters are read from a file and then converted to
wide characters via in(). The multibyte characters are read byte by byte from the external
file into a buffer. The buffer is an array of bytes, i.e., an array of units of type char. The oper-
ating system function read() is used to fill the buffer. The characters resulting from the
conversion are wide characters and are accumulated in a wchar_t buffer. The conversion
therefore is from the external character type char to the internal character type wchar_t.
The type of an appropriate conversion state is assumed to be ConversionState. Here is
the source code that does the conversion:
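ConversionState cs;
int err = 0;
while (!err)
{
    const char *fromNext;
    wchar_t *toNext;
    int readResult;
    codecvt_base::result convResult;

    // Fill the free part of the byte buffer. fd (the file descriptor) is an
    // assumed name; end of file or a read error ends the loop.
    if ((readResult = read(fd, readStart, readSize)) <= 0)
    {
        err = readResult;
        break;
    }

    // Convert the buffered bytes. loc (the locale that holds the codecvt
    // facet) is an assumed name.
    convResult = use_facet<codecvt<wchar_t,char,ConversionState> >(loc)
        .in(cs, transBuf, readStart+readResult, fromNext, to, toLimit, toNext);

    to = toNext;
    if (to == toLimit)
        err = ResultBufferFull;
    else if (convResult == codecvt_base::error)
        err = ConversionError;
    else if (convResult == codecvt_base::ok)
    {
        readSize = transBufSize;
        readStart = transBuf;
    }
    else if (convResult == codecvt_base::partial)
    {
        int num = transBuf+transBufSize-fromNext;
        copy(fromNext, transBuf+transBufSize, transBuf);
        readSize = transBufSize - num;
        readStart = transBuf + num;
    }
}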
The example is straightforward in the way in() is invoked: The characters in the
byte buffer transBuf are used as input, and the resulting output characters are written
to the wide-character buffer resultBuf. After the invocation of in(), toNext is pointing
one behind the last output character and marks the position in the wide-character buffer
to which subsequent output can be written. For this reason, toNext can be assigned to to
for the next invocation of in(). If the wide-character buffer is full, that is, the new to
equals toLimit, then the conversion is stopped.
The conversion also stops when the call to in() has failed, i.e.,
codecvt_base::error has been returned. When codecvt_base::ok has been
returned, the whole byte buffer transBuf is specified to be available for storing new char-
acters from the next read(): we set readSize to transBufSize and readStart to
transBuf.
An interesting situation occurs when the return value from in() is
codecvt_base::partial. It indicates that the last characters of transBuf have not been
consumed by in(). The reason for this is that more characters from the input are needed
to produce an output character. fromNext indicates where the not-yet-consumed input
starts in the byte buffer transBuf. The leftover bytes are copied to the beginning of
transBuf. Only the rest of transBuf is specified as available for new characters from the
next read(): We set readSize to transBufSize-num and readStart to transBuf+num.
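The same carry-over technique can be sketched in a self-contained form. This is only an illustration under stated assumptions: the function name convert_in_chunks is hypothetical, the input comes from a std::string that is consumed in chunks instead of from read(), and std::mbstate_t is used as the conversion state type because that is the state type of the standard codecvt<wchar_t,char,mbstate_t> facet.

```cpp
#include <algorithm>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Converts bytes to wide characters, feeding in() at most chunkSize new bytes
// per call and carrying unconsumed bytes over to the next call.
inline std::wstring convert_in_chunks(const std::string& bytes,
                                      const std::locale& loc,
                                      std::size_t chunkSize)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> Converter;
    const Converter& cvt = std::use_facet<Converter>(loc);
    if (chunkSize == 0)
        throw std::invalid_argument("chunkSize must be positive");

    std::mbstate_t state = std::mbstate_t();  // the state lives outside the facet
    std::string pending;                      // leftover bytes plus the next chunk
    std::wstring result;
    std::size_t pos = 0;

    while (pos < bytes.size() || !pending.empty())
    {
        std::size_t n = std::min(chunkSize, bytes.size() - pos);
        if (n == 0)   // leftover bytes, but no more input to complete them
            throw std::runtime_error("incomplete multibyte sequence at end of input");
        pending.append(bytes, pos, n);
        pos += n;

        std::wstring out(pending.size(), L'\0');   // at most one wide char per byte
        const char* fromNext = 0;
        wchar_t* toNext = 0;
        std::codecvt_base::result r =
            cvt.in(state, pending.data(), pending.data() + pending.size(), fromNext,
                   &out[0], &out[0] + out.size(), toNext);
        if (r == std::codecvt_base::error || r == std::codecvt_base::noconv)
            throw std::runtime_error("conversion failed");

        result.append(&out[0], toNext);               // keep what was produced
        pending.erase(0, fromNext - pending.data());  // partial: keep the leftover
    }
    return result;
}
```

For plain ASCII input and the classic locale no byte is ever left over. The carry-over branch matters only for multibyte encodings, where a chunk boundary can fall inside a character.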
Services for retrieval of user-defined localized messages from message catalogs are
provided by the messages facet. It is defined by the following class template:
It opens a message catalog specified by a name and a locale. It returns a catalog iden-
tification of type catalog, which is typically a typedef for an integral type.
The C++ standard describes how message catalogs can be accessed via the messages
facet’s member functions. Both the syntax of message catalogs and the way message cata-
logs must be created, installed, and maintained are beyond the scope of the standard and
are implementation-specific. Consequently, the name for a message catalog provided to
the open() member function as well as the set_id and msg_id provided to the get()
member function may be different for different platforms.
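A hedged sketch of the typical open/get/close sequence follows. The helper name lookup_message and its parameter names are assumptions made for the illustration. Whether a given catalog name can be opened at all depends on the platform, so the sketch falls back to the default text whenever open() reports failure with a negative catalog value.

```cpp
#include <locale>
#include <string>

// Looks up message msgId in set setId of the named catalog and
// returns dflt when the catalog or the message is not available.
inline std::string lookup_message(const std::locale& loc,
                                  const std::string& catalogName,
                                  int setId, int msgId,
                                  const std::string& dflt)
{
    typedef std::messages<char> Messages;
    const Messages& msgs = std::use_facet<Messages>(loc);

    Messages::catalog cat = msgs.open(catalogName, loc);
    if (cat < 0)              // a negative value: the catalog could not be opened
        return dflt;

    std::string text = msgs.get(cat, setId, msgId, dflt);
    msgs.close(cat);
    return text;
}
```

A call such as lookup_message(std::locale::classic(), "no_such_catalog", 1, 1, "fallback") is expected to yield the default text, since no catalog of that name is installed.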
The numpunct facet contains the information about the format and punctuation of
numeric and Boolean expressions. Based on this information, the facet num_put
generates a formatted character sequence from a numeric or Boolean value, and the
facet num_get provides the reverse functionality: it parses a character sequence to
extract a numeric or Boolean value.
6.2.1.1 The numpunct Facet
The following information is provided by numpunct’s member functions:
decimal_point() and thousands_sep() return the characters that represent the
radix separator and the thousands separator respectively. truename() and falsename()
return the strings that represent the Boolean values true and false respectively.
grouping() returns a value of type string (= basic_string<char>). The
semantics of this value are quite tricky. Each character in the string is interpreted as an ele-
ment of an integer array, which describes the way in which digits of the integral part of a
numeric value are grouped. Each integer specifies the number of digits in a group, start-
ing with the rightmost group; the last integer in the string determines the size of all
remaining groups. If the last integer is <= 0 or CHAR_MAX, the described group is unlim-
ited. If the string is empty, there is no grouping.
Let us consider some examples for grouping rules and how they would be repre-
sented as a string returned from grouping(). In the United States, for example, digits
are grouped by threes, so the number 10 million is formatted as 10,000,000. A string s,
where s.length()==1 and s[0]==3, describes this grouping.⁴ In Nepal, on the other
hand, the first group has three digits and all subsequent groups have two digits, so that
the same number, 10 million, is formatted as 1,00,00,000. This grouping is described by a
two-element string, where the first element is 3 and the second 2. Please note that an
empty string and a string that contains a single 0 describe the same grouping.
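The two grouping rules can be tried out with a small user-defined facet. The following is a sketch, not a library-provided facility: the class GroupingPunct and the helper format_grouped are names invented for this illustration, and the comma is assumed as thousands separator for both rules.

```cpp
#include <locale>
#include <sstream>
#include <string>

// A numpunct facet whose grouping rule is supplied by the caller.
class GroupingPunct : public std::numpunct<char>
{
public:
    explicit GroupingPunct(const std::string& rule) : rule_(rule) {}
protected:
    virtual char do_thousands_sep() const { return ','; }
    virtual std::string do_grouping() const { return rule_; }
private:
    std::string rule_;   // each character is a group size, rightmost group first
};

// Formats value with the given grouping rule via an imbued string stream.
inline std::string format_grouped(long value, const std::string& rule)
{
    std::ostringstream os;
    os.imbue(std::locale(std::locale::classic(), new GroupingPunct(rule)));
    os << value;
    return os.str();
}
```

format_grouped(10000000, "\003") produces 10,000,000, format_grouped(10000000, "\003\002") produces 1,00,00,000, and an empty rule produces 10000000 without any separators.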
6.2.1.2 The num_put Facet
Based on the information contained in the numpunct facet, num_put provides the func-
tionality to generate a formatted character sequence that is the representation of a
numeric or a Boolean value. For this reason num_put contains an overloaded version of
its member function put() for the following types: bool, long, unsigned long,
double, long double, void*. At first sight it might look as if versions of put() for
4. Note that the required string is "\003", not "3", because the numeric value of "3" is that of the character
code of the digit 3, which in ASCII would be 51. Hence, the string "\003" specifies groups of 3 digits each, and
"3" probably indicates groups of 51 digits each.
short, int, or float values are missing. The intent was to keep the interface of the stan-
dard library concise, and a value of type short or int can be handled by the version for
long. Similarly, a value of type float can be handled by the put() version for double.
Besides the value that should be formatted, put() takes the following additional
parameters:
• an output iterator that specifies the location to which the formatted string should
be written
For example, put() for a long value has the following form:
For formatting of the numeric value, the put() function takes into account the
information contained in numpunct, which is obtained from the locale returned by
fg.getloc(), the format flags contained in the ios_base object fg, the fill character
fl, and the character classification information taken from the ctype facet contained in the
locale returned by fg.getloc().
The semantics of the format flags are the same as for the standard IOStreams’ output
streams, e.g., ios_base::dec, ios_base::hex, ios_base::oct specify if an integer
value should be represented to the base 10, 16, or 8 respectively. The fill character is used
for padding according to the ios_base::adjustfield specification. The exact details
of the formatting are described in appendix B, Formatting Numerical and bool Values.
An iterator pointing one beyond the last character written is returned.
The output iterator specifies the location to which the formatted string should be
written. The type of output iterator OutputIterator is a template parameter of the
num_put class template. As already mentioned above, its default is
ostreambuf_iterator<charT>, but any other output iterator type can be used as
well. An ostreambuf_iterator writes output to a stream buffer, and such iterators are
used by the stream classes in IOStreams when they attempt to output the result of format-
ting to a file or string stream. In a way, stream buffer iterators are an implementation
detail of the IOStreams. More about stream buffer iterators can be found in section 2.4.3,
Stream Buffer Iterators.
The put() function makes no provision for error reporting. Any failure during
output must be extracted from the returned iterator. Such an error can occur when not
enough character positions are available in the output sequence to hold the resulting char-
acters. Output stream buffer iterators have a public member function, failed(), that
can be used to check for any previously occurred errors. Other output iterators do not
have such a feature, and in that case output errors cannot be detected.
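A minimal sketch of a direct call to put() follows, including the failed() check on the returned iterator. The helper name put_long and its failed out-parameter are assumptions made for this illustration; a string stream serves both as the ios_base object that carries the format flags and as the owner of the stream buffer that receives the output.

```cpp
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Formats v by calling num_put::put() directly; the stream os only supplies
// the format flags, the locale, and the stream buffer to write to.
inline std::string put_long(long v, std::ios_base::fmtflags flags, bool& failed)
{
    std::ostringstream os;
    os.imbue(std::locale::classic());
    os.flags(flags);

    typedef std::ostreambuf_iterator<char> Iter;
    const std::num_put<char, Iter>& np =
        std::use_facet<std::num_put<char, Iter> >(os.getloc());

    Iter end = np.put(Iter(os), os, ' ', v);   // ' ' is the fill character
    failed = end.failed();                     // the only way to detect output errors
    return os.str();
}
```

With the flags ios_base::hex | ios_base::showbase the value 255 is formatted as 0xff; with ios_base::dec it is formatted as 255.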
6.2 Formatting and Parsing Facets 289
Besides a reference to the value to be extracted, the get() function needs the fol-
lowing arguments:
• two input iterators specifying the character sequence that should be parsed
• a reference to an ios_base object
• a reference to an ios_base::iostate object
The get() function uses the format flags from the ios_base object to control the
way in which the character sequence is parsed. The semantics of the flags are the same as
for standard IOStreams, e.g., ios_base::dec, ios_base::hex, ios_base::oct
specify if a character sequence should be interpreted as the representation of a numeric
value to the base 10, 16, or 8 respectively. The exact details of the parsing algorithm are
described in appendix A, Parsing and Extraction of Numerical and bool Values.
The get() function uses the numpunct and ctype facets from the locale attached to
the ios_base object. The numpunct facet provides information like the radix character
and the thousands separator. The ctype facet is used for character classification, e.g., for
distinguishing between digit and nondigit characters.
The reference to the ios_base::iostate object is used for error indication.
get() sets the appropriate bitmask elements according to success or failure of the opera-
tions. The bitmask elements and their semantics are the same as for the IOStreams (see
section 1.3, The Stream State, for details). An iterator pointing one beyond the last character
read is returned.
Below is an example of how the num_get facet’s get() function would be used.
The example is that of a function int_get(), which implements parsing of an int value
by means of the get() function for long values. The function int_get() takes a string
that is parsed in order to extract an int value and indicates failure if the extracted
numeric value does not fit into an int value. The three arguments for the function
int_get() are the string to be parsed, an ios_base object that carries the formatting
flags and a locale with all necessary facets, and an int variable in which the parsing
result can be stored.
template <class String>
ios_base::iostate int_get(const String& s, ios_base& ib, int& i)
{
    typedef typename String::value_type charT;
    typedef typename String::const_iterator iterT;
    long l;
    ios_base::iostate err;
    iterT in = s.begin();
    iterT end = s.end();
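A self-contained variant of the same idea is sketched below. It is not the listing from the text: the name int_get_sketch is hypothetical, the function is restricted to std::string, and it reads through an istringstream so that the num_get<char> facet for stream buffer iterators, which every locale contains, can be used.

```cpp
#include <climits>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Parses an int from s by means of num_get's get() for long values.
// failbit is set if parsing fails or if the value does not fit into an int.
inline std::ios_base::iostate int_get_sketch(const std::string& s,
                                             std::ios_base& ib, int& i)
{
    typedef std::istreambuf_iterator<char> iterT;
    std::istringstream source(s);          // supplies the stream buffer to read from
    long l = 0;
    std::ios_base::iostate err = std::ios_base::goodbit;

    std::use_facet<std::num_get<char, iterT> >(ib.getloc())
        .get(iterT(source), iterT(), ib, err, l);

    if (err & std::ios_base::failbit)
        return err;
    if (l < INT_MIN || l > INT_MAX)
        return err | std::ios_base::failbit;
    i = static_cast<int>(l);
    return err;                            // eofbit may be set when all input was consumed
}
```

Any stream can serve as the ios_base argument. Parsing "42" stores 42, while a digit string such as "99999999999" is reported with failbit because the value does not fit into an int.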
money_put, defined by
template <class charT,
          class OutputIterator = ostreambuf_iterator<charT> >
class money_put
Even in one cultural context, two different currency symbols can be used: either the
international symbol, which is a three-letter code defined in ISO 4217, or the domestic
symbol. For instance, the international currency symbol for U.S. dollars is USD, while the
domestic symbol is $. International currency symbols are useful in applications that han-
dle many different currencies at the same time. Consider the difficulty of distinguishing
between U.S. and Canadian dollars if only the domestic currency symbol was used,
which in both countries is the $ sign. An application dealing with both currencies will
probably use USD and CAD instead of $.
moneypunct’s nontype template parameter Inter of type bool determines
whether the international currency symbol (Inter=true) or the domestic one
(Inter=false) should be used. The moneypunct facet has a static const data
member intl that is set at compile time to the value of the template argument. The usage
of this data member, which is publicly accessible, is very similar to the usage of a nested
typedef for a type template parameter; it allows deduction of the template argument for
an instantiated template class.
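How the intl member can be used inside a template is sketched below. The function name describe_symbol and the text of the prefixes are assumptions made for this illustration.

```cpp
#include <locale>
#include <string>

// Punct is expected to be moneypunct<charT, true> or moneypunct<charT, false>.
// Punct::intl tells the template which of the two it was instantiated with.
template <class Punct>
std::string describe_symbol(const std::locale& loc)
{
    const Punct& punct = std::use_facet<Punct>(loc);
    return std::string(Punct::intl ? "international: " : "domestic: ")
           + punct.curr_symbol();
}
```

describe_symbol<std::moneypunct<char, true> >(loc) yields a string that starts with "international: ", followed by whatever currency symbol the locale defines.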
In contrast to moneypunct, the money_put and money_get facets are not templa-
tized with a specification of the currency symbol. As we will see in detail below, their
member functions get() and put() are parameterized with this specification for each
invocation. The consequence of this architecture is that a combination of four facet
classes is needed to provide the entire information and functionality for the handling of
monetary values: one money_put facet, one money_get facet, and two moneypunct facets,
one moneypunct facet for the international currency symbol and one for the domestic
currency symbol.
6.2.2.1 The moneypunct Facet
Let’s see in detail which information the moneypunct facet provides:
• decimal_point(), thousands_sep(), and grouping() behave like the
member functions of the same name from the numpunct facet.
• curr_symbol() returns the string that represents the currency symbol.⁵
• positive_sign() and negative_sign() return the strings that indicate a
positive or a negative monetary value.
• frac_digits() returns the number of digits to be displayed after the radix
separator.
5. One might wonder why the entire moneypunct class is parameterized with the nontype template argument
Inter. Wouldn't it suffice if curr_symbol() came in two flavors, one for the domestic and one for the
international currency symbol? The reason for providing two different moneypunct facets, one for each type of
currency symbol, is that the type of the currency symbol might have an impact on the rest of the format information.
For instance, the format for negative or positive amounts that is used with the international currency
symbol might differ from the format that is used with the domestic symbol.
element has a value of the enumerated type money_base::part. The possible
values are none, space, symbol, sign, value. Each of the values symbol, sign, and
value, and either space or none, appears exactly once. The value none, if present,
is not first; the value space, if present, is neither first nor last.
The money_put facet uses both pos_format() and neg_format() for formatting.
The money_get facet uses only neg_format() for parsing. The following rules apply for
the interpretation of the format array:
Where none or space appears, whitespace is permitted in the input or output
sequence. For input, space indicates that at least one space is required at that position.
Where symbol appears, the sequence of characters returned by curr_symbol() is
permitted, and can be required. For details, see the subsequent sections on money_put and
money_get.
Where sign appears, the first (if any) of the sequence of characters returned by
positive_sign() or negative_sign() is required, depending on the monetary
value being non-negative or negative. Any remaining characters of the sign sequence are
required after all other format components.
Where value appears, the absolute numeric monetary value is required.
Let’s look at some examples. In Hong Kong, a positive monetary value is written as
HK$0.95, while a negative value is written as (HK$0.95). In Germany, a positive value is
written as 0,95 DM and a negative one as -0,95 DM. Here are the typical values returned
from pos_format() and neg_format() for these examples:
Hong Kong:
    pos_format()    sign, symbol, value, none
    neg_format()    sign, symbol, value, none
Germany:
    pos_format()    sign, value, space, symbol
    neg_format()    sign, value, space, symbol
In both cases, the positive sign is defined as empty. Note that there are also other valid
possibilities to define the format specifications. An alternative for the German
pos_format() could be value, space, symbol, none.
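The Hong Kong conventions can be expressed as a user-defined moneypunct facet. The following is a sketch under stated assumptions: the class HongKongStylePunct and the helper format_hk are invented names, grouping by threes with a comma is assumed, and the facet is installed into a copy of the classic locale rather than taken from a named locale.

```cpp
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Domestic moneypunct facet following the Hong Kong conventions described above.
class HongKongStylePunct : public std::moneypunct<char, false>
{
protected:
    virtual char do_decimal_point() const { return '.'; }
    virtual char do_thousands_sep() const { return ','; }
    virtual std::string do_grouping() const { return "\003"; }
    virtual std::string do_curr_symbol() const { return "HK$"; }
    virtual std::string do_positive_sign() const { return ""; }
    virtual std::string do_negative_sign() const { return "()"; }
    virtual int do_frac_digits() const { return 2; }
    virtual pattern do_pos_format() const { return make_pattern(); }
    virtual pattern do_neg_format() const { return make_pattern(); }
private:
    static pattern make_pattern()          // sign, symbol, value, none
    {
        pattern p;
        p.field[0] = sign;  p.field[1] = symbol;
        p.field[2] = value; p.field[3] = none;
        return p;
    }
};

// Formats an amount given in the smallest currency unit (cents).
inline std::string format_hk(long double cents)
{
    std::locale loc(std::locale::classic(), new HongKongStylePunct);
    std::ostringstream os;
    os.imbue(loc);
    os.setf(std::ios_base::showbase);      // request the currency symbol
    typedef std::ostreambuf_iterator<char> Iter;
    std::use_facet<std::money_put<char, Iter> >(loc)
        .put(Iter(os), false, os, ' ', cents);
    return os.str();
}
```

format_hk(1295) gives HK$12.95 and format_hk(-1295) gives (HK$12.95): the first character of the negative sign is placed at the sign position, the remaining character after all other components. For amounts below one currency unit, libraries differ in whether a zero is written before the radix separator.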
The first argument is an output iterator that allows output of the resulting character
sequence; the iterator’s type is a template argument of the money_put class template. The
default type is ostreambuf_iterator<charT>, but any other output iterator type can
be used as well.
The argument quant specifies the monetary amount to be formatted. It is inter-
preted as a value of the smallest currency unit. For instance, 5.00 $ would be expressed in
cents and specified as either the numeric value 500 or the string "500". Any fractional
parts are ignored.⁶ A value of 5.00 or a string like "5.00" would yield a monetary
amount of 0.05 $. In the string, only the optional leading minus sign and the immediately
subsequent digit characters are used; any trailing characters, including digits appearing
after a nondigit character, are ignored. In particular, the string must not contain any thou-
sands separators. A string like "1,000,000" would yield an amount of 0.01 $. Digit
characters are distinguished from nondigit characters by use of the ctype facet that is con-
tained in the locale attached to the ios_base argument fg.
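The interpretation of the quant argument can be checked with a short sketch. The facet TwoDigitPunct and the helpers put_digits and put_units are hypothetical names; the facet only fixes two fractional digits and the radix character and otherwise relies on the base class behavior.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Minimal domestic moneypunct facet: two fractional digits, '.' as radix character.
class TwoDigitPunct : public std::moneypunct<char, false>
{
protected:
    virtual char do_decimal_point() const { return '.'; }
    virtual int do_frac_digits() const { return 2; }
};

typedef std::ostreambuf_iterator<char> MoneyOutIter;

// Formats an amount given as a digit string (smallest currency unit).
inline std::string put_digits(const std::string& digits)
{
    std::locale loc(std::locale::classic(), new TwoDigitPunct);
    std::ostringstream os;
    os.imbue(loc);
    std::use_facet<std::money_put<char, MoneyOutIter> >(loc)
        .put(MoneyOutIter(os), false, os, ' ', digits);
    return os.str();
}

// Formats an amount given as a numeric value (smallest currency unit).
inline std::string put_units(long double units)
{
    std::locale loc(std::locale::classic(), new TwoDigitPunct);
    std::ostringstream os;
    os.imbue(loc);
    std::use_facet<std::money_put<char, MoneyOutIter> >(loc)
        .put(MoneyOutIter(os), false, os, ' ', units);
    return os.str();
}
```

Both put_digits("500") and put_units(500) produce 5.00. The string "1,000,000" is cut off at the first comma, so only one cent is formatted; depending on the library, that amount appears as 0.01 or as .01.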
The bool argument intl indicates whether the resulting character sequence
should contain the international currency symbol (intl == true) or the domestic one
(intl == false). The relevant currency symbol is taken from the moneypunct facets
moneypunct<charT, true> or moneypunct<charT, false> contained in the locale
that comes with the ios_base object fg. Additional formatting information like the
character for the radix separator, number of digits after the radix separator, etc., is
also taken from this moneypunct facet.
The format flags are taken from the ios_base object fg and are interpreted in the
following way: A currency symbol is generated if ios_base::showbase is set. Fill char-
acters are placed where none or space appears in the formatting pattern, if
ios_base::internal is set. If ios_base::left is set, they are placed after the other
characters; if ios_base::right is set, before the other characters.⁷ The interpretation of
any other format information contained in the ios_base object is implementation-defined.
The iterator positioned one beyond the last character written is returned by put().
The put() function does not report any errors. Failure of output can only be detected by
checking the returned iterator’s failed() function.
If we attempt to use the money_put facet we notice that the formatting of monetary
amounts is not supported by IOStreams. Section 6.4.1, Indirect Use of a Facet Through a
Stream, demonstrates how the num_put facet can be used with a string stream. Anything
comparable is not possible for monetary amounts, unless we define an inserter for mone-
tary amounts ourselves. Section 3.1.4, Refined Inserters and Extractors, in part I explains
how such user-defined inserters can be implemented; the example chosen there uses the
6. It might seem that the type long double for the quant argument is inappropriate, because only the integral
part is used, but the intent is to allow a maximum range of values, and on most systems this can be achieved by
using long double, not the integral type long.
7. Note that it is possible, with some combination of format patterns and flag values, to produce output that can-
not be parsed using money_get<>::get.
time_put facet, but the same technique can be applied for use with the monetary facets. If
we do not intend to implement an inserter for monetary values, we can resort to direct use
of the money_put facet, independent of any streams. Direct use of facets is discussed in
section 6.4.2, Use of a Facet Through a Locale, and in section 6.4.3, Direct Use of the Facet
Independently of a Locale.
6.2.2.3 The money_get Facet
The money_get facet provides the functionality to parse a character sequence that repre-
sents a monetary amount and to extract the numeric value. For this reason it contains
overloaded versions of its member function get() for the types long double and
basic_string<charT> to store the extracted numeric value:
The first two arguments specify the character sequence, given as input iterator range
[s, end), that is to be parsed.
The bool parameter intl specifies which currency symbol should be expected in
the character sequence: the international (intl == true) or the domestic (intl ==
false) symbol. The respective currency symbol is taken from the moneypunct facets
moneypunct<charT, true> or moneypunct<charT, false> contained in the locale
that comes with the ios_base object fg.
The function get() always uses the neg_format() from the respective moneypunct
facet to parse the character sequence. The result is returned as an integral value stored in
units or as a sequence of digits in the string digits, possibly preceded by a minus sign.
For example, the character sequence $1,234.56 parsed with a U.S. locale would yield
the value 123456.0 for units, or the string "123456" for digits.
If moneypunct’s member function grouping() indicates that no thousands separa-
tors are permitted, any such characters are not read, and parsing is terminated at the point
where the first thousands separator appears. Otherwise, thousands separators are
optional; if present, they are checked for correct placement.
Where space or none appears in the format pattern returned by moneypunct’s
neg_format(), except at the end, optional whitespace is consumed after any required
space. Whether a character is whitespace or not is determined by using the ctype facet
contained in the locale fg.getloc().
The format flags from fg are interpreted in the following way: If
ios_base::showbase is not set, the currency symbol is optional and is consumed only if
other characters are needed to complete the format; otherwise, the currency symbol is
required. For example, if showbase is off, for a negative_sign() of "()" and a currency
symbol of "L", the L is consumed when the character sequence (100 L) is parsed. If the
negative_sign() is "-", then the L in -100 L is not consumed.
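The $1,234.56 example can be reproduced without relying on an installed U.S. locale. The sketch below makes several assumptions: the facet DollarPunct and the helper parse_amount are invented names, the conventions (comma grouping by threes, two fractional digits, "-" as negative sign) are supplied by hand, and showbase is set so that the currency symbol is required.

```cpp
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Domestic moneypunct facet with U.S.-like conventions.
class DollarPunct : public std::moneypunct<char, false>
{
protected:
    virtual char do_decimal_point() const { return '.'; }
    virtual char do_thousands_sep() const { return ','; }
    virtual std::string do_grouping() const { return "\003"; }
    virtual std::string do_curr_symbol() const { return "$"; }
    virtual std::string do_positive_sign() const { return ""; }
    virtual std::string do_negative_sign() const { return "-"; }
    virtual int do_frac_digits() const { return 2; }
    virtual pattern do_neg_format() const   // sign, symbol, value, none
    {
        pattern p;
        p.field[0] = sign;  p.field[1] = symbol;
        p.field[2] = value; p.field[3] = none;
        return p;
    }
};

// Parses text once into units and once into digits; returns false on failbit.
inline bool parse_amount(const std::string& text, long double& units, std::string& digits)
{
    std::locale loc(std::locale::classic(), new DollarPunct);
    typedef std::istreambuf_iterator<char> Iter;
    const std::money_get<char, Iter>& mg = std::use_facet<std::money_get<char, Iter> >(loc);

    std::istringstream first(text), second(text);
    first.imbue(loc);
    second.imbue(loc);
    first.setf(std::ios_base::showbase);    // the currency symbol is required
    second.setf(std::ios_base::showbase);

    std::ios_base::iostate err1 = std::ios_base::goodbit;
    std::ios_base::iostate err2 = std::ios_base::goodbit;
    mg.get(Iter(first), Iter(), false, first, err1, units);
    mg.get(Iter(second), Iter(), false, second, err2, digits);
    return !((err1 | err2) & std::ios_base::failbit);
}
```

Parsing "$1,234.56" stores 123456.0 in units and "123456" in digits, matching the values given above.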
Compared with the organization of localization information for numbers and mon-
etary values, the canonical equivalent of numpunct and moneypunct is missing for the
localization information for date and time values. The reason for this gap is that the facets
time_put and time_get are not based on any common formatting elements, but rather on
the C library date- and time-formatting function strftime(). Both facets, time_put and
time_get, use the structure tm that is defined in the C library header file <ctime>. The
detailed relationship of strftime() and the two time facets is explained in the follow-
ing two chapters; a description of the tm structure can be found in the reference part of this
book.
The first argument is an output iterator that allows output of the resulting character
sequence to a character container, and the iterator’s type is a template argument of the
time_put class template. The default type is ostreambuf_iterator<charT>, but any
other output iterator type can be used as well.
In the first version of put(), the character sequence [pat_begin, pat_end) is
scanned for any contained format patterns. A format pattern starts with the character '%'
followed by an optional modifier character and a format specifier character. If no modifier
character is present, it is assumed to be zero. The narrow() member function from
str.getloc()’s ctype facet is applied to each character before it is interpreted.
The date or time data contained in the struct tm t are formatted according to the
found format patterns as if t and the respective format pattern were arguments for the C
library’s function strftime() using fill for padding. Details about the format pat-
terns of strftime() and how they are interpreted can be found in appendix C,
strftime() Conversion Specifiers Used by the time_put Facet.
Any character from [pat_begin, pat_end) that is not part of a format pattern is
written to the output iterator s without any interpretation. Thus characters resulting from
the formatting of the date/time value and nonformat pattern characters forming the
sequence [pat_begin, pat_end) are interleaved in the output in the order in which the
format pattern and the nonformat pattern characters appeared in sequence [pat_begin,
pat_end).
For example, a call to the first version of put() with a struct tm containing the
25th of December 1993 and the character sequence "This is %A, day %d of month %B in
the year %Y.\n", i.e., the following code snippet:
ostringstream oss;
locale loc("American_USA.1252");
oss.imbue(loc);
const time_put<char>& tfac = use_facet<time_put<char> >(loc);
// 12:00:00 on December 25, 1993; the seventh value (tm_wday = 6) marks a Saturday
struct tm xmas = { 0, 0, 12, 25, 11, 93, 6 };
string fmt("This is %A, day %d of month %B in the year %Y.\n");
// put() expects the pattern as a pair of const char* pointers
time_put<char>::iter_type ret
    = tfac.put(oss,oss,' ',&xmas,fmt.c_str(),fmt.c_str()+fmt.length());
cout << oss.str() << endl;
The second version of put() interprets the characters format and modifier as a for-
mat pattern. The date or time value contained in t is formatted as if passed to the C library's
function strftime() (see appendix C, strftime() Conversion Specifiers, for details).
Note that neither the first nor the second version of put() interprets the format
parameters or the fill character. For the standard facets time_put<char> and
time_put<wchar_t>, formatting is controlled by the strftime() format patterns
only. The format parameters and the fill character are provided to the put() functions, so
that nonstandard facet types derived from time_put may use them in overriding versions
of the format functions.
An iterator positioned one beyond the last character written is returned by both ver-
sions of put(). The put() function does not report any errors. Failure of output can be
detected only by checking the returned iterator’s failed() function.
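Both versions of put() are exercised in the following sketch. The helper names format_time and format_year are assumptions, and the classic locale is used so that the result does not depend on which named locales are installed.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

typedef std::ostreambuf_iterator<char> TimeOutIter;

// First version of put(): formats t according to a whole pattern string.
inline std::string format_time(const std::tm& t, const std::string& pattern)
{
    std::ostringstream os;
    os.imbue(std::locale::classic());
    const std::time_put<char, TimeOutIter>& tp =
        std::use_facet<std::time_put<char, TimeOutIter> >(os.getloc());
    TimeOutIter end = tp.put(TimeOutIter(os), os, ' ', &t,
                             pattern.data(), pattern.data() + pattern.size());
    return end.failed() ? std::string() : os.str();
}

// Second version of put(): a single format character, here 'Y' for the year.
inline std::string format_year(const std::tm& t)
{
    std::ostringstream os;
    os.imbue(std::locale::classic());
    const std::time_put<char, TimeOutIter>& tp =
        std::use_facet<std::time_put<char, TimeOutIter> >(os.getloc());
    TimeOutIter end = tp.put(TimeOutIter(os), os, ' ', &t, 'Y');
    return end.failed() ? std::string() : os.str();
}
```

For a struct tm that holds the 25th of December 1993 (tm_mday 25, tm_mon 11, tm_year 93), the pattern "%Y-%m-%d" yields 1993-12-25 and format_year() yields 1993. Characters outside a format pattern, as in "day %d of %B", are copied to the output unchanged.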
If we attempt to use the time_put facet we notice that the formatting of date and
time values is not supported by IOStreams. In section 6.4.1, Indirect Use of a Facet
Through a Stream, we demonstrate how the num_put facet can be used with a string
stream. Anything comparable is not possible for date and time values, unless we define
an inserter for date and time values ourselves. Section 3.1.4, Refined Inserters and Extrac-
tors, in part I explains how such user-defined inserters can be implemented; the example
chosen there uses the time_put facet. If we do not intend to implement an inserter for date
and time values, we can resort to direct use of the time facet, independent of any streams.
Direct use of a facet is discussed in section 6.4.2, Use of a Facet Through a Locale, and
section 6.4.3, Direct Use of the Facet Independently of a Locale.
All five member functions behave similarly. They parse the character sequence spec-
ified by [begin, end) to determine a time or date value or an element of a date value. If
the sequence being parsed matches the correct format, the corresponding members of
the struct tm argument t are set to the values used to produce the respective part of the
sequence. Otherwise an error is reported via the ios_base::iostate object err⁸ or
8. The bitmask elements and their semantics are the same as for the IOStreams (see section 1.3, The Stream State,
for details).
unspecified values are assigned. As the err object does not necessarily reflect any failure,
user confirmation is required for reliable parsing of user-entered dates and times. Only
machine-generated formats, produced by time_put, can be parsed reliably. The
ios_base object fg is passed to each function to give access to the ctype facet of
fg.getloc(), which is needed during parsing. An iterator positioned one beyond the
last character read is returned by each of the functions.
The functions differ in the struct tm parts that are extracted from the character
sequence.
• get_time() searches for a character sequence that matches a time_put output
produced with the format character X.
• get_date() searches for a character sequence that matches a time_put output pro-
duced with the format character x.
Besides the functions that help to extract date and time values from a character
sequence, time_get provides a function:
dateorder date_order() const;
which indicates the preferred order of components for those date formats that are com-
posed of day, month, and year. date_order() returns values of an enumeration type
dateorder nested in time_get. The possible values are no_order, dmy, mdy, ymd, ydm,
describing either a certain order or no order. date_order() is intended as a convenience
only, for common formats, and may return no_order in valid locales.
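A sketch of direct use of the time_get facet follows. The helper names parse_time, parse_year, and has_preferred_order are assumptions, and the classic locale is used, whose time format corresponds to the strftime() pattern %H:%M:%S.

```cpp
#include <ctime>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

typedef std::istreambuf_iterator<char> TimeInIter;

// Parses a time such as "12:30:45" into the time members of t.
inline bool parse_time(const std::string& text, std::tm& t)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    const std::time_get<char, TimeInIter>& tg =
        std::use_facet<std::time_get<char, TimeInIter> >(in.getloc());
    std::ios_base::iostate err = std::ios_base::goodbit;
    tg.get_time(TimeInIter(in), TimeInIter(), in, err, &t);
    return !(err & std::ios_base::failbit);
}

// Parses a four-digit year into t.tm_year (years since 1900).
inline bool parse_year(const std::string& text, std::tm& t)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    const std::time_get<char, TimeInIter>& tg =
        std::use_facet<std::time_get<char, TimeInIter> >(in.getloc());
    std::ios_base::iostate err = std::ios_base::goodbit;
    tg.get_year(TimeInIter(in), TimeInIter(), in, err, &t);
    return !(err & std::ios_base::failbit);
}

// Reports whether the locale states a preferred day/month/year order at all.
inline bool has_preferred_order(const std::locale& loc)
{
    return std::use_facet<std::time_get<char, TimeInIter> >(loc).date_order()
           != std::time_base::no_order;
}
```

parse_time("12:30:45", t) sets tm_hour, tm_min, and tm_sec to 12, 30, and 45; parse_year("1993", t) sets tm_year to 93. The result of has_preferred_order() is left open here, because a locale is free to return no_order.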
If we attempt to use the time_get facet, we notice that the parsing of date and time
values is not supported by IOStreams. In section 6.4.1, Indirect Use of a Facet Through a
Stream, we demonstrate how the num_put facet can be used with a string stream. Any-
thing comparable is not possible for date and time values, unless we define an extractor
for date and time values ourselves. Section 3.1.4, Refined Inserters and Extractors, in part
I explains how such user-defined extractors can be implemented; the example chosen
there uses the time_put facet. If we do not intend to implement an extractor for date and
time values, we can resort to direct use of the time facet, independent of any streams.
Direct use of a facet is discussed in section 6.4.2, Use of a Facet Through a Locale, and sec-
tion 6.4.3, Direct Use of the Facet Independently of a Locale.
6.3 Grouping of Standard Facets in a Locale 299
9. The only exception to this rule of public nonvirtual functions calling protected virtual functions is the
specialization ctype<char>. Its is() member function does not call any virtual member function, for reasons of
efficiency, but takes a table-driven approach; this table can be replaced for the purpose of customizing the facet.
The byname facets have a constructor that takes the name of a cultural area as a
const char* argument. A byname facet redefines the protected virtual member func-
tions of its base class with behavior that is appropriate for the specified cultural area.
Byname facets are automatically created when a named locale is constructed. For
example:
locale loc("US");
creates a locale that represents the U.S. localization environment. This locale contains the
respective byname versions of those facets that have a byname version, such as the
numpunct facet. In the code below, the numpunct facet retrieved from the U.S. locale
object is a numpunct_byname facet:
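Locale names such as "US" are platform-specific, so the following sketch takes a different route and should be read as an illustration only: it constructs a numpunct_byname facet explicitly for the name "C", which every implementation accepts, installs it in a locale, and retrieves it through the numpunct interface. The helper names with_numpunct_byname and radix_of are assumptions.

```cpp
#include <locale>

// Builds a locale whose numpunct facet is a byname facet for the named cultural area.
inline std::locale with_numpunct_byname(const char* name)
{
    return std::locale(std::locale::classic(), new std::numpunct_byname<char>(name));
}

// Retrieves the facet through the base class interface and asks for the radix character.
inline char radix_of(const std::locale& loc)
{
    return std::use_facet<std::numpunct<char> >(loc).decimal_point();
}
```

radix_of(with_numpunct_byname("C")) returns '.'. With a name such as "de_DE", where the platform provides it, the same call would be expected to return ','.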
ited operations that are not redefined by the derived class. Let us see what the respective
default behavior is for each of the standard facets.
NUMPUNCT
The base class for numpunct provides classic “C” behavior. Classic “C” behavior is the
way C functions used to behave before internationalization was added to the C standard.
Obviously, classic “C” describes the behavior only for the character type char. The
behavior for the character type wchar_t is defined in analogy to the classic “C” behavior
for char. For instance, when numpunct<char>::decimal_point() returns '.', then
numpunct<wchar_t>::decimal_point() returns the wide character equivalent L'.'.
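The analogy between the two character types can be checked with a small template. The function name classic_decimal_point is an assumption made for this sketch.

```cpp
#include <locale>

// Returns the radix character that the classic locale's numpunct facet
// provides for the character type charT.
template <class charT>
charT classic_decimal_point()
{
    return std::use_facet<std::numpunct<charT> >(std::locale::classic()).decimal_point();
}
```

classic_decimal_point<char>() returns '.', and classic_decimal_point<wchar_t>() returns L'.'.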
COLLATE
In the case of string collation, two collate base class facets must be provided by a
standard-compliant library. (See section 6.3.1.4, Mandatory Facet Types, for a list of all the
mandatory standard facet types.) Both required base class facets provide classic “C”
behavior, which in the case of collation is lexicographic ordering, that is, characters and
character sequences are ordered according to the numeric values of the character codes.
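The classic behavior can be observed with a small sketch like the one below. The function name collate_compare is our own; it merely forwards to the compare() member function of the collate<char> facet found in the given locale.

```cpp
#include <cassert>
#include <locale>
#include <string>
using namespace std;

// Compares two strings with the collate<char> facet of the given locale.
// The result is negative, zero, or positive.
int collate_compare(const locale& loc, const string& a, const string& b)
{
    const collate<char>& col = use_facet<collate<char> >(loc);
    return col.compare(a.data(), a.data() + a.size(),
                       b.data(), b.data() + b.size());
}
```

With locale::classic() and an ASCII-based character set, "Zebra" sorts before "apple", because the code of 'Z' is smaller than the code of 'a'. A culturally aware collate facet would typically order these two words the other way round.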
CODECVT
In the case of code conversion, two codecvt base class facets must be provided by a
standard-compliant library. (See section 6.3.1.4, Mandatory Facet Types, for a list of all the
mandatory standard facet types.) The facet codecvt<char,char,mbstate_t> is a
degenerate one; it implements “no conversion,” so that in() and out() behave similarly
to a memcpy(). The behavior of codecvt<wchar_t,char,mbstate_t> is
implementation-defined.
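The “no conversion” property of the char-to-char facet can be checked with a sketch like the following. The typedef char_codecvt and both function names are our own; always_noconv() and out() are the public member functions of the facet.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
using namespace std;

typedef codecvt<char, char, mbstate_t> char_codecvt;

// Reports whether the char-to-char facet of loc never converts anything.
bool char_codecvt_is_noop(const locale& loc)
{
    return use_facet<char_codecvt>(loc).always_noconv();
}

// Calls out() on a short character sequence and returns the result code.
codecvt_base::result try_out(const locale& loc)
{
    const char src[] = "abc";
    char dst[sizeof src];
    const char* src_next = 0;
    char* dst_next = 0;
    mbstate_t state = mbstate_t();
    return use_facet<char_codecvt>(loc).out(state, src, src + 3, src_next,
                                            dst, dst + 3, dst_next);
}
```

For the classic locale, try_out() yields codecvt_base::noconv: the facet signals that the external and internal representations are identical, and the caller is expected to use the source characters directly.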
You may have noticed that some of the standard facet base classes have
implementation-dependent functionality. Usually, interfaces with implementation-defined
behavior must be avoided by users who strive for portability of their programs. Hence, one
might wonder whether it is a problem that the base class behavior of some facets is
implementation-defined. The answer is no, not really.
The behavior of a base class facet is of interest only when a specialized version of
one of the standard facet interfaces is to be implemented and the existing base class
behavior is to be reused. The byname facets are powerful and already provide support for
all common localization environments. Only in really special, “exotic” cases is the deriva-
tion of a new standard facet necessary at all. In such a case it is likely that the new func-
tionality must be implemented from scratch and cannot be built reusing the base class
behavior. Hence, the base class behavior is almost irrelevant, because it will most likely be
overridden anyway.
6.3.1.4 Mandatory Facet Types
As mentioned before, the standard C++ library defines a number of facet base class tem-
plates for the standard facet families. A certain set of instantiations or specializations of
these templates must be contained in every locale. The mandatory facet types are listed in
table 6-3, together with their categories, which are explained in section 6.3.2, Locale Cate-
gories, later in this chapter.
             money_put<char>     money_put<wchar_t>
numeric      numpunct<char>      numpunct<wchar_t>
             num_get<char>       num_get<wchar_t>
             num_put<char>       num_put<wchar_t>
time         time_get<char>      time_get<wchar_t>
             time_put<char>      time_put<wchar_t>
messages     messages<char>      messages<wchar_t>
10. The codecvt_byname facets are not required by the final C++ standard. At the time of this writing, the
question of whether this situation is a defect in the standard or whether the facets were omitted intentionally is
under discussion.
Types that are used as arguments for the formal parameters InputIterator or
OutputIterator must satisfy the requirements of an input iterator type or an output
iterator type respectively. They must be iterators to a character container; that is, the
iterator’s value type must be compatible with a character type.
Note that instantiations of these facet templates for character types other than char
and wchar_t and iterator types other than istreambuf_iterator and
ostreambuf_iterator are not automatically contained in every locale object and must
be explicitly installed.
11. See section G.1, Bitmask Types, in appendix G for further details on bitmask types.
[Figure: structure of a locale. A locale is divided into locale categories (e.g., locale::ctype, locale::collate). Each category contains facet base class templates (e.g., template <class charT> class ctype, template <class charT> class collate). Their instantiations are the facet base classes (= facet families), such as ctype<char>, ctype<wchar_t>, ctype<...>. The derived facet classes (= family members), such as ctype_byname<char> and ctype_byname<wchar_t>, are derived from them.]
ostringstream ost;
ost.imbue(locale("German"));
ost << "Hello World - " << 12345678L;
string s = ost.str();
Afterwards, the string s contains the initial string "Hello World - " plus the result
of formatting the number 12345678L, that is "Hello World - 12.345.678". The
numpunct facet, the format flags, and the fill character used for formatting are those of the
German locale, which we attached to the stream via a call to the stream’s member
function imbue().
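Because the name "German" is not a portable locale name, the effect can be reproduced on any platform with a hand-written numpunct facet. The following is a minimal sketch under that assumption; the class german_style_punct and the function format_hello are hypothetical names of our own and only imitate the number punctuation of a German locale.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>
using namespace std;

// Hypothetical stand-in for the numpunct facet of a German locale.
class german_style_punct : public numpunct<char>
{
protected:
    virtual char do_decimal_point() const { return ','; }
    virtual char do_thousands_sep() const { return '.'; }
    virtual string do_grouping() const { return "\3"; }   // groups of three digits
};

// Formats the value through a stream that carries the custom facet.
string format_hello(long value)
{
    ostringstream ost;
    ost.imbue(locale(locale::classic(), new german_style_punct));
    ost << "Hello World - " << value;
    return ost.str();
}
```

A call such as format_hello(12345678L) yields "Hello World - 12.345.678". The inserter for long never mentions the facet; it picks it up from the locale that was imbued into the stream.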
Use of formatting and parsing facets through a stream is the most convenient way of
using these facets. Predefined in the standard library are inserters and extractors for
numeric values, which use the numeric facets, as shown above. Internationalized parsing
and formatting of date and time values are not available through the stream classes. This
is because there are no standard types for representing date and time values. However,
such inserters and extractors can be added. Section 3.1, Input and Output of User-Defined
Types, in part I gives an example; it explains how the functionality of the date formatting
facet can be made available through a stream.
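To give an impression of what such an inserter might look like, here is a simplified sketch. It is not the example from section 3.1; the type simple_date and the fixed pattern "%Y-%m-%d" are assumptions made for illustration. The inserter uses the time_put facet of the stream's locale and sets badbit if the output iterator reports failure; the field width of the stream is ignored.

```cpp
#include <cassert>
#include <ctime>
#include <iterator>
#include <locale>
#include <ostream>
#include <sstream>
using namespace std;

struct simple_date { int year, month, day; };   // hypothetical date type

ostream& operator<<(ostream& os, const simple_date& d)
{
    ostream::sentry ok(os);
    if (ok)
    {
        tm t = tm();                      // zero-initialize all fields
        t.tm_year = d.year - 1900;
        t.tm_mon  = d.month - 1;
        t.tm_mday = d.day;
        const char pattern[] = "%Y-%m-%d";
        const time_put<char>& fac = use_facet<time_put<char> >(os.getloc());
        if (fac.put(ostreambuf_iterator<char>(os), os, os.fill(), &t,
                    pattern, pattern + sizeof(pattern) - 1).failed())
            os.setstate(ios_base::badbit);
    }
    return os;
}
```

With this inserter, a simple_date can be written to any narrow-character output stream in the same way as a built-in numeric value.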
Other values whose parsing or formatting depends on cultural conventions can be
handled in the exact same way. Section 8.2, Defining a New Facet Family, explores the
6.4 Advanced Usage of the Standard Facets 307
example of address formatting. A facet type for address-formatting rules is defined, corre-
sponding facets are installed in a locale, that locale is attached to a stream, an inserter for
address values is defined that uses the address-formatting facet, and eventually addresses
can be formatted as easily as the numeric values in the example above.
If for any reason we do not want to use streams for implicit use of a facet, we can call
the facets directly. For the sake of comparability, we stick to the example used above: for-
matting of numeric values, that is, use of the num_put facet. Direct use of the num_put
facet might not look terribly useful, but the issues discussed below apply equally well to all
other facets. Take, for instance, monetary, date, and time values. There are no inserters and
extractors for these entities defined in the standard library, and the monetary and time
facets are not used by IOStreams. These facets must be used directly. A facet that we want to
use directly can be contained in a locale or it can be used by itself, independent of a locale. In
the following section, we discuss both cases, using the num_put facet as an example.
12. You might wonder why the entire num_put class template is parameterized by the iterator type if only the
member function put() needs the iterator type. An alternative idea could have been to make the member
function put() a template function and eliminate the iterator type argument from the num_put class template.
Unfortunately, this is not feasible, because the put() function calls a protected virtual do_put() function that
has exactly the same signature. This protected virtual member function would need to turn into a function template,
too, and virtual member functions are not allowed to be templates in C++.
locale, we must explicitly install it in the locale object that we want to use. Below is an
example that shows how the necessary facet is installed and subsequently used in a Ger-
man locale:
locale loc(locale("German"),
           new num_put<char, back_insert_iterator<string> >);
basic_ios<char> str(0);
str.imbue(loc);
string s("Hello World - ");
use_facet<num_put<char, back_insert_iterator<string> > >(loc)
    .put(back_inserter(s), str, ' ', (unsigned long)12345678L);
First, we install the num_put facet in a German locale by creating a new locale loc that is
a copy of the German locale plus the num_put facet for strings.
For invocation of the facet’s put() function, we need an ios_base object (see footnote 13). We use
a basic_ios object here, because the class ios_base has a protected constructor, and
ios_base objects can be created only by friends or derived objects. We could create
a complete stream instead, but we do not need stream-specific properties such as the
stream buffer or the formatting functions of a stream. As we don’t need a stream buffer in
this context, we create the basic_ios object with a constructor argument of 0, which
means that no stream buffer is provided. Then we attach our extended German locale to
the basic_ios object.
For invocation of the put() function, we retrieve the num_put facet from the locale
via use_facet. We create an insert iterator for the string s, to which the formatted
character representation of 12345678L is appended. The insert iterator for the string is created
by using the creator function back_inserter() from the standard library.
Then we pass all required arguments to the put() function: the iterator (which
allows appending characters to the string), the basic_ios object (which carries the format
flags and the locale that has the German numpunct and ctype facets), a fill character,
and the numeric value that is to be formatted. As a result of the call to put(), the string
s contains the same value as in the previous example: "Hello World - 12.345.678".
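The same steps can be tried on any platform with the following sketch, which again replaces the platform-specific German locale by a hand-written numpunct facet. The names dot_grouping_punct, string_num_put, and append_number are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <ios>
#include <iterator>
#include <locale>
#include <string>
using namespace std;

typedef num_put<char, back_insert_iterator<string> > string_num_put;

// Hypothetical stand-in for the German numpunct facet.
class dot_grouping_punct : public numpunct<char>
{
protected:
    virtual char do_thousands_sep() const { return '.'; }
    virtual string do_grouping() const { return "\3"; }
};

// Appends the formatted value to a copy of s by calling num_put directly.
string append_number(string s, unsigned long value)
{
    // A locale with the custom punctuation plus the num_put facet for strings.
    locale loc(locale(locale::classic(), new dot_grouping_punct),
               new string_num_put);
    basic_ios<char> str(0);      // supplies format flags and locale, no buffer
    str.imbue(loc);
    use_facet<string_num_put>(loc).put(back_inserter(s), str, ' ', value);
    return s;
}
```

Calling append_number("Hello World - ", 12345678UL) produces "Hello World - 12.345.678", the same text that the stream-based formatting yields.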
The return value of the put() function is not checked, because the returned iterator
is of type back_insert_iterator<string>, which is an iterator type that does not
signal failure of output by any means. Failure of output to a dynamically growing string is
extremely unlikely anyway. The only failure that can happen would occur when the
string cannot be expanded anymore because the entire memory is exhausted. In that case,
the allocation inside the string’s insert() function would probably raise a bad_alloc
exception anyway.
If another type of output iterator was used that refers to a fixed-sized character container,
such as a normal string iterator or a pointer to a character array, a container overflow
would escape unnoticed. In such cases, it is advisable to wrap the iterator into a special-purpose
output iterator that has a failed() function like the
ostreambuf_iterator, so that exhaustion of the character container can be detected.
13. The ios_base object is used by the put() function to retrieve the format flags that control the formatting of
numeric values. Also, the put() function uses the numpunct and ctype facet from the locale attached to the
ios_base object for retrieving information about the radix character, thousands separator, etc., and for
recognition of whitespace characters, digits, etc.
As you can see from the example above, direct use of a facet is substantially more
complicated than use of a facet through a stream operation, as shown before. In
the example above, it looks rather stupid to stuff the facet into a locale first and then
retrieve it again so that it can be used. It makes sense because the num_put facet needs
other facets anyway. Stuffing all of the facets involved in the processing into one locale
object makes it easy to pass around all the necessary information in the form of the
locale object. Still, we can do it differently. A facet need not necessarily be contained in a
locale. Let us see how facets can be used independent of locales.
~StandAloneFacet() {}
};
the base class constructor is called with the value 1 as an argument. This is to indicate that
the facet is used “stand alone,” i.e., the memory is correctly managed by the base class.
For details, see section 7.3, Memory Management of Facets in a Locale. Note also that this
wrapper template requires the encapsulated facet type not to take any further constructor
arguments. In particular, this wrapper cannot be used for byname facets.
For the sake of comparability with the previous uses of the num_put facet, let us see
how the facet wrapper would be used for writing a formatted numeric value to a string:
The effect is exactly the same as in the example above, where we installed the
num_put facet in a locale first and retrieved it for use later.
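The stand-alone use described above might look like the following sketch. The body of the wrapper template is reconstructed from the description (a constructor that passes 1 to the base class and a public destructor) and should be read as an assumption. The helper class grouping_punct and the function append_number_standalone are hypothetical; the custom punctuation again stands in for a German locale.

```cpp
#include <cassert>
#include <ios>
#include <iterator>
#include <locale>
#include <string>
using namespace std;

// Wrapper that makes a facet usable outside a locale (refs == 1).
template <class Facet>
class StandAloneFacet : public Facet
{
public:
    StandAloneFacet() : Facet(1) {}
    ~StandAloneFacet() {}
};

typedef num_put<char, back_insert_iterator<string> > string_num_put;

// Hypothetical stand-in for the German numpunct facet.
class grouping_punct : public numpunct<char>
{
protected:
    virtual char do_thousands_sep() const { return '.'; }
    virtual string do_grouping() const { return "\3"; }
};

string append_number_standalone(string s, unsigned long value)
{
    StandAloneFacet<string_num_put> fac;   // lives on the stack, in no locale
    basic_ios<char> str(0);
    str.imbue(locale(locale::classic(), new grouping_punct));
    fac.put(back_inserter(s), str, ' ', value);
    return s;
}
```

Note that only the num_put facet is used stand-alone here. The numpunct and ctype facets that it needs are still taken from the locale attached to the basic_ios object.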
CHAPTER 7
Locales and facets are designed to form an extensible framework. User-defined interna-
tionalization services can be added to the locale framework by providing user-defined
facets. Not only can a locale object contain the predefined standard facets that we dis-
cussed in the preceding chapter; it can also maintain any kind of user-defined facet. Natu-
rally, such user-defined facets must meet a couple of requirements and constraints in
order to fit into the locale framework.
In this chapter we explain these requirements and the overall architecture of the
locale framework. The classes representing locales and facets are tightly coupled. We will
see what a locale requires of a facet, how a facet is identified, how facet lookup in a locale
works, and how a locale manages its facets’ lifetime and memory.
312 The Architecture of the Locale Framework
[Figure: class diagram of standard facet templates and their template parameters. Recoverable entries: numpunct (charT), moneypunct (charT, Intl: bool), codecvt (internT, externT, stateT), the message facets with messages (charT), and time_put (charT, OutputIterator).]
example. Facet families are class hierarchies with a facet base class that defines the inter-
face for all its derived facet types. In section 7.2.1, Facet Identification, below, we explain
in detail which requirements a type must meet in order to be a facet type, and we also look
at the properties that facet base classes must have.
A locale bundles the internationalization services and information for a particular
cultural environment. A locale object acts as a container of facet objects and is organized in
such a way that it contains at most one facet object out of a given facet family. In section
7.2.2, Facet Lookup, below we explain how this organization is achieved. We saw in previous
sections that when use_facet() is invoked, we must specify a facet type in order to
receive the facet object of the type that is contained in the locale. Section 7.2.2, Facet Lookup,
explains how facets contained in a locale are identified and how the facet lookup works.
class locale::facet
{
protected:
explicit facet(size_t refs = 0);
virtual ~facet();
private:
facet(const facet&); // not defined
void operator=(const facet&); // not defined
};
class locale::id
{
public:
id();
private:
void operator=(const id&); // not defined
id(const id&); // not defined
};
1. As before, we omit the scope ::std in this book because all names in the standard C++ library are defined
within the ::std namespace. Hence we assume that one would normally add a using namespace ::std;
statement to each translation unit and avoid full qualification of standard library names.
facet type must define a static id member: The locale needs the id member of its facets to
determine into which slot a facet object belongs.
We can see that instantiations of the numpunct template are facet base classes,
because they define the id member, whereas instantiations of the numpunct_byname
template are facets that belong to the same facet family, because they inherit the id mem-
ber and share the facet identification with their base class.
Note that every instantiation of numpunct for a different character type introduces
another facet family with its own facet identification. numpunct<char> has the identification
numpunct<char>::id, and numpunct<wchar_t> has the identification
numpunct<wchar_t>::id, which is a different static variable with a different value. The derived
class numpunct_byname<char> shares the identification numpunct<char>::id with
its base class numpunct<char>, and numpunct_byname<wchar_t> has the identification
numpunct<wchar_t>::id of its base class.
The locale has one slot per facet identification, which means that a locale can contain
one representative of the numpunct<char> family and one representative of the
numpunct<wchar_t> family, either a base class object or a derived class object respec-
tively. However, a locale can never contain a numpunct<char> and a
numpunct_byname<char> facet, because they have the same facet identification and
would compete for the same slot in the locale.
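That a derived facet shares the identification of its base class, whereas a different instantiation has its own, can be verified by comparing the addresses of the static id members. The function template same_family in the sketch below is a hypothetical helper, not a library facility.

```cpp
#include <cassert>
#include <locale>
using namespace std;

// True if both facet types are identified by the very same locale::id
// object, that is, if they belong to the same facet family.
template <class FacetA, class FacetB>
bool same_family()
{
    const locale::id* a = &FacetA::id;
    const locale::id* b = &FacetB::id;
    return a == b;
}
```

same_family<numpunct<char>, numpunct_byname<char> >() is true, because the byname class inherits the id member. same_family<numpunct<char>, numpunct<wchar_t> >() is false, because each instantiation of the base class template defines a static member of its own.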
Let us see what use_facet must do so that the contained facet’s member function
decimal_point() is invoked in the end.
IMPLEMENTING use_facet()
To understand how use_facet() uses the facet identification for retrieval of a facet
from a locale, we will study a tentative implementation of the use_facet() function.
Keep in mind that the C++ standard does not define any implementation issues and that
our implementation only demonstrates the principle and is not meant to be a realistic
implementation.
For exposition, we assume that the locale contains a facet repository as a private
data member. For each facet identification there must be a slot in the facet repository for a
representative of the respective facet family. You can think of the facet repository as a map
with locale::id as the key and const locale::facet* as the value. An implementation
could use the map class template from the standard, that is, an instantiation
map<size_t, const locale::facet*>, assuming that locale::id is convertible to
size_t. However, keep in mind that this is only an example; a real implementation
probably uses a faster data structure for the facet repository.
Let us further assume that class locale has a private member function that implements
retrieval of a facet with a given facet identification from the facet repository
contained in the locale. This helper function might have the following signature:
const locale::facet* get_facet(const locale::id&). Under this assumption, the
function use_facet() must be a friend of class locale, so that it has access to the
private member function get_facet().
Here is a tentative implementation of use_facet():
return (*pd);
The sample code shows that use_facet() first tries to retrieve the facet from the
locale’s facet repository via the interface identification Facet::id. The way the locale
repository is organized (it does not allow multiple entries for the same key), it can contain
no more than one facet with the requested facet identification Facet::id. If such a facet
can be found, a dynamic cast is performed to check if the found facet can be cast down to
the requested facet type const Facet* (see footnote 2). More about the dynamic cast can be found in
section G.9, Dynamic Cast, in appendix G.
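The lookup principle described above can be sketched with a self-contained toy model. Everything in it (toy_id, toy_facet, toy_locale, toy_use_facet, toy_base, toy_derived) is hypothetical and merely imitates locale::id, locale::facet, locale, and use_facet(); it is not how any real library implements them, and it leaves out ownership and reference counting entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <typeinfo>
using namespace std;

class toy_id                      // imitates locale::id
{
public:
    toy_id() : value_(++counter()) {}
    size_t value() const { return value_; }
private:
    static size_t& counter() { static size_t c = 0; return c; }
    size_t value_;
};

class toy_facet                   // imitates locale::facet
{
public:
    virtual ~toy_facet() {}
};

class toy_locale                  // imitates locale (no lifetime management)
{
public:
    // Stores f in the slot that belongs to its facet family.
    template <class Facet>
    void install(const Facet* f) { repository_[Facet::id.value()] = f; }

    template <class Facet>
    friend const Facet& toy_use_facet(const toy_locale&);
private:
    const toy_facet* get_facet(const toy_id& id) const
    {
        map<size_t, const toy_facet*>::const_iterator pos =
            repository_.find(id.value());
        return pos == repository_.end() ? 0 : pos->second;
    }
    map<size_t, const toy_facet*> repository_;
};

template <class Facet>
const Facet& toy_use_facet(const toy_locale& loc)
{
    const toy_facet* pb = loc.get_facet(Facet::id);     // find the slot by id
    const Facet* pd = dynamic_cast<const Facet*>(pb);   // check the actual type
    if (pd == 0)
        throw bad_cast();                               // empty slot or wrong type
    return *pd;
}

// A small facet family for experiments.
struct toy_base : toy_facet { static toy_id id; };
toy_id toy_base::id;
struct toy_derived : toy_base {};
```

Installing a toy_derived object and requesting toy_base succeeds, because both types share toy_base::id and the dynamic cast to the base type is valid. Requesting toy_derived from a toy_locale that holds only a toy_base object finds the slot but fails the dynamic cast, so bad_cast is thrown.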
2. An implementation may also perform a downcast to const Facet&, which is semantically the same.
7.2 Identification and Lookup of Facets in a Locale 317
What will happen if we invoke use_facet() on different classes from such a class
hierarchy? Let’s assume we have the following situation:
public:
virtual string bar() const { return "this is the base class"; }
public:
virtual string bar() const { return "this is the derived class"; }
virtual string bar_2() const { return "hello world"; }
To keep the example simple and concise, neither of the two user-defined facets,
base_facet and derived_facet, is meant to contain any realistic localization services
or information. For the same reason, we ignore the rule that all standard facets follow,
namely, that public member functions of a facet are never virtual, but call protected vir-
tual member functions to do the actual work.
Now let’s examine the different possible cases and discuss them in terms of our sample
implementation of use_facet() above.
EXACT TYPE MATCH. Consider a situation in which the locale object contains a facet of
exactly the same type that is requested in the use_facet() template specification. Say a
locale object loc contains a facet object of type base_facet and we call
BASE REQUESTED, DERIVED AVAILABLE. Let us see what happens when the locale object
contains a facet instance of the derived class, and the type requested in the use_facet()
template specification is the base class. In terms of our example classes: loc contains a
facet of type derived_facet and we call
This effect is really interesting. What we have here is a kind of two-phase polymorphic
dispatch.
First, whatever type of derived class object from the facet family is contained in
the locale is extracted by specifying the base class type in the use_facet template
specification.
Second, the invocation of a virtual function of the determined object is dispatched
by C++ means, that is, by calling a virtual function through a pointer or reference to the
facet.
DERIVED REQUESTED, BASE AVAILABLE. Let’s examine the situation in which the locale
object contains a facet instance of the base class and the type requested in the
use_facet() template specification is the derived class, i.e., loc contains a facet of type
base_facet and we call
With the same argumentation as in the previous situation, use_facet() retrieves the
instance of base_facet from the locale object loc when it uses derived_facet::id
as a search key. The dynamic cast to const derived_facet* will fail, because the
retrieved object is of type const base_facet*. This failure is appropriate, since the base
class pointer is not compatible with the derived class pointer, because the base class
base_facet need not support the full interface of the derived class derived_facet. In
our example, the base class facet does not have a bar_2() member function, and if
use_facet returned successfully, the subsequent call to bar_2() might cause a
program crash.
WRONG ID. A call to use_facet() also fails with a bad_cast exception when the
facet repository in the locale object contains no facet with the requested locale::id.
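The cases discussed above can be tried out with the real locale class. The following sketch completes the two example facets under the assumption that base_facet derives from locale::facet and defines the static id member, and that derived_facet derives from base_facet. The member functions are declared const so that they can be called through the const reference returned by use_facet(). The helper call_bar is a hypothetical name.

```cpp
#include <cassert>
#include <locale>
#include <string>
#include <typeinfo>
using namespace std;

class base_facet : public locale::facet
{
public:
    static locale::id id;
    virtual string bar() const { return "this is the base class"; }
};
locale::id base_facet::id;

class derived_facet : public base_facet
{
public:
    virtual string bar() const { return "this is the derived class"; }
    virtual string bar_2() const { return "hello world"; }
};

// Calls bar() on the facet requested as type Facet. Returns the marker
// string "bad_cast" if use_facet() throws.
template <class Facet>
string call_bar(const locale& loc)
{
    try {
        return use_facet<Facet>(loc).bar();
    } catch (const bad_cast&) {
        return "bad_cast";
    }
}
```

With a locale that contains a derived_facet, call_bar<base_facet>() returns "this is the derived class": the base type selects the family, and the virtual function call selects the behavior. With the classic locale, which contains neither facet, the result is "bad_cast". For the case “derived requested, base available” the result should also be "bad_cast"; it is worth verifying on your platform that the library really performs the type check.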
if (has_facet<derived_facet>(loc))
cout << use_facet<derived_facet>(loc).bar();
What happens? numpunct<char> is the base class of a facet family and the representative
of that facet family is requested. Depending on the locale, a facet of exactly that
type might be contained (exact type match) or, if it is a named locale, a facet of the derived
type numpunct_byname<char> would be contained (base requested, derived available). In
either case, use_facet<numpunct<char> >() will return a reference to the numpunct
facet object that is actually contained in the locale (step 1 of two-phase dispatch). The
member function decimal_point() invokes the virtual member function
do_decimal_point(). Depending on the dynamic type of the contained facet object,
either the base class version of do_decimal_point() or the more specialized derived
class version of it will be called (step 2 of two-phase dispatch). The net effect is that due to the
two-phase dispatch, the right facet is found and the right member function is invoked,
without use_facet() having any knowledge of the actual locale and its contained
facets.
7.2.2.2 Storing Facets in a Locale
We’ve seen above that the functions use_facet() and has_facet() provide the
functionality to retrieve facets from a locale object. How are the facets stored in a locale in the
first place? It all happens when a locale object is created. A locale fills its facet repository
depending on the arguments passed to its constructor, and again the facet identification
plays a role. Here is an example of a locale constructor (see footnote 3):
It creates a locale that is a copy of an existing locale other. If the locale other contains
a facet with the identification Facet::id, this facet is replaced in the copy. The
replacing facet is the one that the pointer f points to. If the locale other does not contain
a facet with the identification Facet::id, the facet that the pointer f points to is added
and extends the copy.
3. The different locale constructors are discussed in detail in section 5.1, Creating Locale Objects.
One interesting aspect of this behavior is that it allows the addition of instances of
new, user-defined facets in a simple way.
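Both effects of this constructor, extension and replacement, can be observed with a small user-defined facet. The facet unit_facet and the function with_unit in the sketch below are hypothetical; they carry no realistic localization service and only serve to make the behavior visible.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>
using namespace std;

class unit_facet : public locale::facet   // hypothetical new facet family
{
public:
    static locale::id id;
    explicit unit_facet(const string& unit, size_t refs = 0)
        : locale::facet(refs), unit_(unit) {}
    string unit() const { return unit_; }
private:
    string unit_;
};
locale::id unit_facet::id;

// Returns a copy of other in which the unit_facet slot holds a new facet.
// The slot is added if other has none, and replaced otherwise.
locale with_unit(const locale& other, const string& unit)
{
    return locale(other, new unit_facet(unit));
}
```

Starting from the classic locale, which has no unit_facet, with_unit() extends the copy. Applying with_unit() to the result replaces the facet in the new copy, while the locale that served as the argument keeps its own facet unchanged.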
7.2.2.3 The Rationale Behind the Use of the Two-Phase Polymorphism
Until now we have described how the two-phase polymorphism is used by a locale to
maintain its facets. Why is this mechanism needed at all? Why is the polymorphism
gained by virtual functions insufficient? The answer lies in the extensibility of the locale
framework. The goal was to allow user-defined facets, which implement new internation-
alization services, to be treated by the locale in the same way as predefined standard
facets.
Imagine a locale that bases the polymorphic behavior of its facets on the facets’ vir-
tual functions alone. Since virtual functions allow polymorphic behavior only for the
classes of a single class hierarchy, the standard would have had to define the base classes
(i.e., the roots of the hierarchies) for all possible internationalization services. No matter
how many conceivable interfaces the standard would have added, the number of inter-
faces and services would still have been limited. Instead of trying to anticipate all conceiv-
able facet interfaces, the design decision was to use the two-phase polymorphism,
because that is an entirely open concept.
The first phase of the dispatch, based on the facet identification, is used to select a
certain set of internationalization services. Each set is represented by the interface that the
corresponding facet base class defines for its facet family. Further sets of internationalization
services can be added by a user in the form of new facet types that are derived from
locale::facet and define a static locale::id data member. Since the locale framework
can generate a (theoretically) unlimited number of unique facet identifications, it is
possible to add an unlimited number of new internationalization services to the locale
framework.
The second phase of the dispatch, based on virtual functions, is used to select the spe-
cific behavior of a service. Each of the services can be implemented differently for each
derived facet class in the previously selected facet family. The virtual function dispatch
ensures that the service of the facet object actually contained in a given locale is chosen. This
way a (theoretically) infinite number of different versions of the same service can be added
to the locale framework by means of derivation and redefinition of virtual functions.
An example that shows how a new internationalization service can be implemented
is given in section 8.2, Defining a New Facet Family; an address-formatting service is
defined as a new facet base class together with two specific implementations of that inter-
face in the form of two derived address facet classes.
responsibility for the lifetime and memory of all facets that it contains. It will delete its
facets once the locale object itself goes out of scope.
In fact, all predefined standard facets are designed for this kind of maintenance by a
locale: they have a protected virtual destructor. Objects of classes without a public destructor
cannot be created on the stack or as global or static variables; they can be created only
on the heap, in the hope that someone who has access to the destructor, such as a friend,
will later delete the heap object. Class locale is a friend of class locale::facet and has
access to the protected destructor of the standard facets. As a consequence, standard facets
are typically created on the heap and handed over to a locale that later deletes them.
For the values 0 and 1 the refs argument has the following effect:
• If refs == 0, the locale takes care of deleting the facet. To be more specific: The
locale performs delete static_cast<locale::facet*>(f), where f is a
pointer to the facet, when the last locale object containing the facet is destroyed. In
this case the facet should be used only in conjunction with a locale, because its lifetime
is tied to the lifetime of the locales it belongs to.
• If refs == 1, the locale does not destroy the facet and the creator of the facet is
fully responsible for the facet’s lifetime and deletion. In this case, the facet can be
used independent of any locale.
The effect of providing values other than 0 or 1 is undefined.
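The difference between the two values can be made visible with a facet that counts its destructions. The class tracked_facet and both functions in the sketch below are hypothetical. In contrast to the standard facets, the destructor is public here, so that the facet can also be created on the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
using namespace std;

class tracked_facet : public locale::facet   // counts its destructions
{
public:
    static locale::id id;
    static int destroyed;
    explicit tracked_facet(size_t refs = 0) : locale::facet(refs) {}
    ~tracked_facet() { ++destroyed; }
};
locale::id tracked_facet::id;
int tracked_facet::destroyed = 0;

// refs == 0: the locale owns the facet. Returns how many facets were
// destroyed by the time the only locale containing it is gone.
int destroyed_after_owned_use()
{
    int before = tracked_facet::destroyed;
    {
        locale loc(locale::classic(), new tracked_facet);   // refs == 0
    }
    return tracked_facet::destroyed - before;
}

// refs == 1: the creator owns the facet. Returns how many facets were
// destroyed by the locale while the creator's object is still alive.
int destroyed_after_standalone_use()
{
    tracked_facet fac(1);
    int before = tracked_facet::destroyed;
    {
        locale loc(locale::classic(), &fac);
    }
    return tracked_facet::destroyed - before;
}
```

The first function returns 1, because the locale deletes the heap object when it goes out of scope. The second function returns 0, because the locale leaves the facet alone; the stack object is destroyed by its creator at the end of the function.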
The locale will probably count the references to facets that were constructed with a
refs argument of value 0, in order to determine how many locale objects refer to the
facet, because it must destroy the facet when the last locale referring to the facet is
destroyed.
The American ctype facet, together with the rest of a German locale, forms the locale
loc. When loc goes out of scope, it will delete the American ctype facet on the heap,
because refs was initialized to 0 when the facet was created. As mentioned above, the
locale can delete the facet because it is a friend of the facet base class locale::facet so
that it can access its virtual destructor.
Note that the base class constructor is called with the value 1 as the initial value for
the facet’s refs argument. Such a facet need not be stuffed into a locale object but can be
used independently. Here is the sample usage from section 6.4.3:
We decided that the base class should follow the pattern demonstrated by the stan-
dard facets in the library, which is to provide the latitude to control deletion of the facet by
setting the constructor argument refs to 1 if necessary. Such a base class facet can be used
either inside a locale or stand-alone, and it allows its derived classes to use both possibili-
ties. It is generally a good idea to provide facet base classes with a refs constructor argu-
ment in order to keep all options open for the derived classes.
For the derived class we are more restrictive. We decide that derived class facets
must not be used independent of any locales, and to ensure this the refs argument is 0.
Whether this is a wise or not-so-wise decision depends on the circumstances and context.
All we want to point out here is that the decision about the refs argument is part of the
design of a new facet type and has an impact on the way the resulting facets can be used.
Here is the completed example:
and decrements whenever it deletes a pointer to the facet. When a locale deletes the last
pointer, it also deletes the facet itself. Again, details of an implementation of the reference
counter are not specified by the standard. No matter how the reference-counting scheme
is implemented, the entire mechanism should be invisible to a user that uses the locale
framework, even when he derives new user-defined facets. Only the refs argument
must be provided, and the locale uses its value to decide whether or not a facet must be
deleted in the end; the rest of the reference-counting scheme is transparent.
For illustration, let us examine an example for the memory management of facets
that are shared between locales. Say we have a function that creates a new locale object by
combining a given locale with a certain facet.
Figure 7-2 shows an arbitrary locale loc provided as an argument to the function.
[Figure 7-2: the locale object loc with its facet slots, among them type_A::id; each contained facet object has a reference counter of 1.]
After creation of the second locale object temp_locale, both locale objects share
almost all of their facets except the one of type type_A that is replaced in the newly con-
structed locale object (see Figure 7-3).
When the locale object temp_locale goes out of scope, its destructor decrements
the reference counters of the locale’s facets. The reference counter of the new type_A
object by then will be 0, and consequently the facet will be deleted. After destruction of
the locale object temp_locale, the situation will be as before.
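The scenario can be replayed with facets that keep track of how many objects of their type exist. The class template counted_facet, the struct facet_counts, and the function counts_while_temporary_exists in the sketch below are hypothetical; the typedefs type_A and type_B correspond to the facet types used in the figures. The reference counters themselves are internal to the library and cannot be observed; the sketch only shows their visible effect.

```cpp
#include <cassert>
#include <locale>
using namespace std;

// Facet that counts the currently existing objects of its type.
template <int Tag>
class counted_facet : public locale::facet
{
public:
    static locale::id id;
    static int alive;
    counted_facet() { ++alive; }
protected:
    ~counted_facet() { --alive; }
};
template <int Tag> locale::id counted_facet<Tag>::id;
template <int Tag> int counted_facet<Tag>::alive = 0;

typedef counted_facet<1> type_A;
typedef counted_facet<2> type_B;

struct facet_counts { int a; int b; };

// Creates a temporary locale that replaces only the type_A facet and
// reports how many facet objects exist while that locale is alive.
facet_counts counts_while_temporary_exists(const locale& loc)
{
    locale temp_locale(loc, new type_A);
    facet_counts c = { type_A::alive, type_B::alive };
    return c;
}
```

While temp_locale exists, there are two type_A objects (the old one in loc and the new one) but still only one type_B object, which both locales share. After the function returns, the new type_A object is gone and the counts are as before.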
[Figure 7-3: the locale objects loc and temp_locale after construction of temp_locale. Each has its own type_A facet object (the old one in loc, the new one in temp_locale), with a reference counter of 1 each. The facet objects identified by type_B::id and type_C::id are shared by both locales and have a reference counter of 2 each.]
references or pointers to locale objects are required for the polymorphic behavior of the
contained facets’ services.
Basically, a locale exhibits value semantics to its users, while internally it acts as a
reference to a container of facets. It can be passed around as a value at almost no cost. At
the same time, a locale object has referencelike properties in that it allows access to poly-
morphic services of its contained facets through locale objects.
Also, keep in mind that the lifetime of a locale object must exceed the lifetime of its
contained facets. With each call to use_facet(), a reference to one of its contained facets
is provided. References to objects in a container always raise the question of how long the
reference to the contained element will stay valid—which leads us to another interesting
property of locales.
User-Defined Facets
Locales and facets form an extensible framework that allows the addition of user-defined
internationalization services. Such services can be encapsulated into user-defined facet
types, which can have arbitrary interfaces and functionality. In chapter 7, “The Architecture
of the Locale Framework,” we explained that facet types must have the following
two properties: Facets have to be subclasses of class locale::facet. Additionally, they
must contain a facet identification in the form of a static data member that is declared as
static locale::id id;. This identification is used for maintenance and retrieval of
facets from a locale and identifies an entire family of facets: All facets with the same iden-
tification belong to the same facet family. A locale cannot contain two facets with identical
identification. Hence, facets from the same family can only be replacements of each other.
New types of facets can be added by either deriving from existing facet types, in
which case the facet identification is inherited and the new facet belongs to an already
existing facet family, or by defining a new facet class that has a facet identification of its
own, in which case a new facet family is introduced.
In the following sections we study both cases in terms of examples.
family have the same interface and the same facet identification. If we want to add a user-
defined facet type to an existing facet family, we must derive the new facet type from one
of the family members so that the new facet type inherits the interface and the facet iden-
tification of its base class. The facet families that we use for demonstration are the prede-
fined standard facets, but the technique is the same for other user-defined facet families.
We study two examples: (1) a facet that provides user-defined names for the Boolean
values true and false (which will belong to the numpunct facet family), and (2) a facet that
recognizes any umlaut in the German alphabet (which will be added to the ctype facet
family).
These two examples demonstrate two different techniques. For the implementation
of the numpunct facet for user-defined Boolean names, we will override a virtual member
function that is inherited from the standard numpunct facet; that is, we redefine existing
facet functionality. For the implementation of the umlaut facet, we will add a member
function and in this way extend the interface inherited from the standard ctype facet; that
is, we add functionality to a facet.
template <class CharT>
class boolnames : public numpunct_byname<CharT>
{
protected:
CharT const * const true_;
CharT const * const false_;
basic_string<CharT> do_truename() const {return true_;}
basic_string<CharT> do_falsename() const {return false_;}
~boolnames() {}
public:
explicit boolnames(const char* locnam, const CharT* t,
const CharT* f, size_t refs = 0)
: numpunct_byname<CharT>(locnam,refs), true_(t), false_(f) {}
};
The new facet type is a byname facet that takes a locale name as a constructor argu-
ment. It also takes the string representations of the Boolean values true and false as
constructor arguments. The remaining constructor argument is a reference counter that is
passed to the base class and has a default of 0. The reference counter determines whether
a containing locale will later delete the facet or not. The new facet type boolnames has
the same interface and the same facet identification as its base class. When installed in a
locale, it will replace the respective numpunct facet that is present in the locale. Note that
the new facet must be explicitly installed in a locale. This would be done as follows:
A facet of the new facet type boolnames<char> is created on the heap. A locale
object is created that is a copy of a U.S. locale with the numpunct<char> facet replaced
by the new boolnames<char> facet. This new locale is attached to cout, and the subse-
quent output to cout prints:
or
Any arguments? no
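The installation step can be sketched as follows. This is a hedged illustration, not the original listing: the function answer() is a hypothetical helper, the locale name "C" is used instead of a U.S. locale name because locale names other than "C" are platform specific, an output string stream stands in for cout, and the facet stores copies of the two names rather than pointers.

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>

template <class CharT>
class boolnames : public std::numpunct_byname<CharT> {
public:
    typedef std::basic_string<CharT> string_type;
    explicit boolnames(const char* locnam, const CharT* t,
                       const CharT* f, std::size_t refs = 0)
        : std::numpunct_byname<CharT>(locnam, refs), true_(t), false_(f) {}
protected:
    // Override the inherited virtual functions that supply the Boolean names.
    string_type do_truename() const override { return true_; }
    string_type do_falsename() const override { return false_; }
    ~boolnames() {}
private:
    string_type true_;
    string_type false_;
};

// Hypothetical helper: formats a bool with user-defined names.
std::string answer(bool b) {
    // The facet is created on the heap with refs == 0; the locale owns it.
    std::locale loc(std::locale::classic(),
                    new boolnames<char>("C", "yes", "no"));
    std::ostringstream os;
    os.imbue(loc);                              // attach the new locale to the stream
    os << std::boolalpha << "Any arguments? " << b;
    return os.str();
}
```

The boolalpha manipulator is needed because the names supplied by a numpunct facet are only used when Boolean values are formatted alphabetically.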
Note that the new facet object is created on the heap and that we used the default
value 0 for the refs constructor argument. This is typical for facet objects that are placed
into a locale, because the locale takes over ownership of its facets and deletes them when
it goes out of scope. As already explained in section 7.3, Memory Management of Facets in
a Locale, user-defined facets can also be created on the stack or data segment and be used
independent of a locale. We demonstrate this in the next example.
}
public:
explicit umlaut(size_t refs = 0) : ctype_byname<CharT>("German",refs) {}
The new umlaut facet can be installed in a locale object, where it would replace the
ctype facet. The umlaut facet can be retrieved from the locale and used like any other
facet. It has all the functionality of a ctype facet plus the additional is_umlaut() function.
The facet can be retrieved as a derived class reference or as a base class reference. In
the code snippet below we demonstrate both alternatives:
if (has_facet<umlaut<char> >(loc))
{ const umlaut<char>& ufac = use_facet<umlaut<char> >(loc);
cout << ufac.is(ctype_base::alpha,'A') << endl;
cout << ufac.is_umlaut('A') << endl;
8.2 Defining a New Facet Family 331
When the umlaut facet is retrieved via its actual derived class type, the
is_umlaut() function is accessible. If we use the umlaut facet as an ordinary ctype
facet and retrieve it by its base class type, only the ctype facet interface is accessible and
is_umlaut() cannot be invoked. Naturally, we can cast the base class reference down to
the derived class type in our example above and then invoke the is_umlaut() function.
Yet we recommend avoiding downcasts by using a base class reference when only the
base class functionality is needed and a derived class reference when the new functional-
ity is accessed.
The umlaut facet need not be contained in a locale but can also be used standing
alone, that is, independent of any locale. Here is an example of this kind of usage:
umlaut<char> fac(1);
cout << fac.is(ctype_base::alpha,'A') << endl;
cout << fac.is_umlaut('A') << endl;
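Both kinds of usage can be combined in one compact sketch. It rests on several assumptions that are not part of the example above: the hypothetical class umlaut_ctype derives from std::ctype<char> with the classic table instead of ctype_byname<CharT>("German"), because that locale name is platform specific, and is_umlaut() assumes the ISO 8859-1 code points of the German umlauts. The three helper functions are likewise introduced only for illustration.

```cpp
#include <cstddef>
#include <locale>

// Extends the ctype interface by one additional member function.
class umlaut_ctype : public std::ctype<char> {
public:
    explicit umlaut_ctype(std::size_t refs = 0)
        : std::ctype<char>(0, false, refs) {}   // null table: use the classic table
    // Assumption: characters are encoded in ISO 8859-1.
    bool is_umlaut(char c) const {
        switch (static_cast<unsigned char>(c)) {
            case 0xC4: case 0xD6: case 0xDC:    // uppercase A, O, U with diaeresis
            case 0xE4: case 0xF6: case 0xFC:    // lowercase a, o, u with diaeresis
                return true;
            default:
                return false;
        }
    }
};

// Retrieval via the derived class type: the added function is accessible.
bool is_umlaut_via_derived(const std::locale& loc, char c) {
    return std::use_facet<umlaut_ctype>(loc).is_umlaut(c);
}

// Retrieval via the base class type: only the ctype interface is accessible.
bool is_alpha_via_base(const std::locale& loc, char c) {
    return std::use_facet<std::ctype<char> >(loc).is(std::ctype_base::alpha, c);
}

// Stand-alone use: refs == 1 makes sure that nobody tries to delete the stack object.
bool is_umlaut_standalone(char c) {
    umlaut_ctype fac(1);
    return fac.is_umlaut(c);
}
```

The first two functions expect a locale in which an umlaut_ctype facet has been installed, for instance std::locale(std::locale::classic(), new umlaut_ctype).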
facets and demonstrate how they can be used in conjunction with IOStreams for imple-
mentation of an address inserter. Eventually we explore how the installation of an address-
formatting facet in a locale object could be automated and suggest a locale factory for that
purpose.
[<Country>] U.S.A.
In Germany addresses have a slightly different format: It is, for instance, not cus-
tomary to print a person’s second name. An optional country code is placed in front of the
Zip code, separated by a hyphen. States are irrelevant. And so on and so forth. Here is the
general pattern and an example of an address in Germany:
<blank line>
template<class charT>
class address
{
friend basic_ostream<charT>&
operator<< (basic_ostream<charT>& os, const address<charT>& ad);
public:
typedef basic_string<charT> String;
address(const String& firstname, const String& secname,
const String& lastname,
const String& address1, const String& address2,
const String& town, const String& zipcode,
const String& state, const String& country,
const String& cntrycode)
: firstname_(firstname), secname_(secname), lastname_(lastname),
address1_(address1), address2_(address2),
town_(town), zipcode_(zipcode), state_(state),
country_(country), cntrycode_(cntrycode) {}
private:
String firstname_;
String secname_;
String lastname_;
String address1_;
String address2_;
String town_;
String zipcode_;
String state_;
String country_;
String cntrycode_;
};
The address class contains private data members that hold the various elements of
an address. The constructor initializes these elements. An operator<<() prints
addresses according to a stream’s current locale object. It is a friend function of the
address class.¹ We will see its implementation later.
1. It would be nicer if the inserter were a template on the same character type, so that output could be written to
narrow- and wide-character streams. We restricted the example to narrow-character streams, because our com-
piler was not capable of coping with friend function templates.
two properties: They have to be subclasses of class locale::facet, and they must contain
a facet identification in the form of a static data member that is declared as
static locale::id id;.
Facet types with the same facet identification belong to the same facet family and
replace each other in a locale. A facet class that has a facet identification of its own intro-
duces a new facet family. In our example, address formatting is present in a locale in addi-
tion to other internationalization facilities and is not meant to replace any existing
information. Hence, we define a new facet family for address formatting by building a
new facet type with an identification of its own.
Following the naming conventions of the standard, we call our address facet
address_put because it handles the formatting of addresses. This is in line with the
names of the standard facets num_put (formatting of numeric values), money_put (formatting
of monetary values), and time_put (formatting of time and date).² The formatting
operation is a member function called put().
For the implementation of address_put we follow the design and implementation
idioms for formatting facets, which are established in the standard library:
OUTPUT ITERATORS. Formatting operations in the standard library, like
num_put<charT>::put(), take an iterator to the beginning of the output sequence as an
argument. This approach allows a flexible solution and fits smoothly into the overall concept
of the entire standard library, where iterators are used as generic connectors between
independent abstractions. In line with this policy, we too use an output iterator to desig-
nate the target location of the formatted address string.
In the standard library, the type of this output iterator is a template argument of the
respective facet class template. By default, the output iterator type is an output stream
buffer iterator. It allows direct access to a stream buffer and is a sensible default for the use
of facets in IOStreams. We adopt this policy for the address_put facet and make it a
class template taking the output iterator type as a template argument.
PUBLIC AND VIRTUAL PROTECTED INTERFACE. In section 6.3.1.1, The Standard Facet Base
Class Templates, we explained that the standard facet types follow a certain idiom: the
public interface consists of nonvirtual member functions that delegate all tasks to protected
virtual member functions. In other words, a public member function foo() calls a
protected virtual function do_foo(), which does the real work.
In our example, the public interface of the address_put class template contains a
member function put(), which calls a protected virtual function do_put(), which does
the real work. These functions take the output iterator that specifies the target location,
and all elements that form the address (name, city, etc.) as parameters.
• XXX_get for parsing facets; they typically have a get() member function
• XXX_byname for derived facet types that can be constructed from a locale name
• XXXpunct for facets that represent information, rather than functionality
public:
typedef OutIter iter_type;
protected:
virtual void do_put(OutIter oi,
const String& firstname, const String& secname, const String& lastname,
const String& address1, const String& address2,
const String& town, const String& zipcode, const String& state,
const String& country, const String& cntrycode) const;
In the code above, you can see the design decisions made so far:
• The new facet type is a class derived from locale::facet with an identification
of its own.
• It’s a class template taking the character type and the output iterator type as
parameters.
• It has a public put() and a protected do_put() function. (The member function
put_string() is a helper function that writes strings to an output iterator.)
For the sake of simplicity, we decided against the following design option:
• The patterns for international address formats could have been encapsulated into
an addresspunct facet, similar to a numpunct or moneypunct facet. The
“punct” facets in the standard library are used by related formatting and parsing
facets for finding rules, patterns, and other information. We decided in favor of an
alternative technique and put the knowledge about specific address patterns
directly into the respective formatting operations rather than factoring it out into a
separate facet. This technique can be found in the standard library, too. It is
demonstrated by the standard time and date facets time_put and time_get,
which, unlike num_put/num_get and money_put/money_get, do not rely on a
timepunct facet.
protected:
virtual void do_put(OutIter oi,
const String& firstname, const String& secname,
const String& lastname,
const String& address1, const String& address2,
const String& town, const String& zipcode,
const String& state,
const String& country, const String& cntrycode) const = 0;
};
protected:
void do_put(OutIter oi,
const String& firstname, const String& secname,
const String& lastname,
const String& address1, const String& address2,
const String& town, const String& zipcode,
const String& state,
const String& country, const String& cntrycode) const
{
String s(firstname);
s.append(" ").append(secname).append(" ").append(lastname)
.append("\n");
s.append(address1).append("\n");
if (!address2.empty()) s.append(address2).append("\n");
s.append(town).append(", ").append(state).append(" ").append(zipcode)
.append("\n");
if (!country.empty()) s.append(country).append("\n");
put_string(oi,s);
}
};
protected:
void do_put(OutIter oi,
const String& firstname, const String& secname,
const String& lastname,
const String& address1, const String& address2,
const String& town, const String& zipcode,
const String& state,
const String& country, const String& cntrycode) const
{
String s(firstname);
s.append(" ").append(lastname).append("\n");
s.append(address1).append("\n");
if (!address2.empty()) s.append(address2).append("\n");
s.append("\n");
if (!cntrycode.empty()) s.append(cntrycode).append("-");
s.append(zipcode).append(" ").append(town).append("\n");
put_string(oi,s);
}
};
The core of these address facets is the implementation of the respective do_put ()
function. do_put() concatenates the address elements into one large address string,
according to U.S. and German address-formatting rules respectively.³ The helper function
address_put<>::put_string() then writes the formatted string to the output iterator.
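The pieces discussed so far can be put together in a condensed sketch. It deviates from the listings above in ways that should be read as assumptions: the ten address elements are bundled in a hypothetical struct address_data, the character type is fixed to char, and the helper format() with its string-based output iterator is introduced only to show that the facets work with any output iterator, not just with stream buffer iterators.

```cpp
#include <cstddef>
#include <iterator>
#include <locale>
#include <string>

// Assumption: one struct instead of ten separate string arguments.
struct address_data {
    std::string firstname, secname, lastname, address1, address2,
                town, zipcode, state, country, cntrycode;
};

template <class OutIter = std::ostreambuf_iterator<char> >
class address_put : public std::locale::facet {
public:
    typedef OutIter iter_type;
    static std::locale::id id;                 // identifies the new facet family
    explicit address_put(std::size_t refs = 0) : std::locale::facet(refs) {}
    // Public nonvirtual function delegates to the protected virtual one.
    void put(OutIter oi, const address_data& a) const { do_put(oi, a); }
protected:
    virtual void do_put(OutIter oi, const address_data& a) const = 0;
    // Helper: writes a string to the output iterator, one character at a time.
    void put_string(OutIter oi, const std::string& s) const {
        for (std::string::size_type i = 0; i < s.size(); ++i) *oi++ = s[i];
    }
};
template <class OutIter> std::locale::id address_put<OutIter>::id;

template <class OutIter = std::ostreambuf_iterator<char> >
class US_address_put : public address_put<OutIter> {
public:
    explicit US_address_put(std::size_t refs = 0) : address_put<OutIter>(refs) {}
protected:
    void do_put(OutIter oi, const address_data& a) const {
        std::string s = a.firstname + " " + a.secname + " " + a.lastname + "\n";
        s += a.address1 + "\n";
        if (!a.address2.empty()) s += a.address2 + "\n";
        s += a.town + ", " + a.state + " " + a.zipcode + "\n";
        if (!a.country.empty()) s += a.country + "\n";
        this->put_string(oi, s);
    }
};

template <class OutIter = std::ostreambuf_iterator<char> >
class German_address_put : public address_put<OutIter> {
public:
    explicit German_address_put(std::size_t refs = 0) : address_put<OutIter>(refs) {}
protected:
    void do_put(OutIter oi, const address_data& a) const {
        std::string s = a.firstname + " " + a.lastname + "\n";   // no second name
        s += a.address1 + "\n";
        if (!a.address2.empty()) s += a.address2 + "\n";
        s += "\n";                                               // blank line
        if (!a.cntrycode.empty()) s += a.cntrycode + "-";
        s += a.zipcode + " " + a.town + "\n";
        this->put_string(oi, s);
    }
};

// Hypothetical helper: formats into a string via a back insert iterator.
typedef std::back_insert_iterator<std::string> StrIter;
std::string format(const address_put<StrIter>& fac, const address_data& a) {
    std::string out;
    fac.put(std::back_inserter(out), a);
    return out;
}
```

Because the facets are used here independently of any locale, they can be created on the stack with refs set to 1, for example US_address_put<StrIter> us(1);.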
try
{
const address_put<charT>& apFacet = use_facet<address_put<charT> > (loc);
apFacet.put(os, ad.firstname_, ad.secname_, ad.lastname_,
3. Of course, the repeated invocation of append() is potentially inefficient, because it might lead to reallocation
of the string’s internal character array with each call to append(). We chose this solution here in order to keep
the example concise and simple. In a more realistic implementation, one could create the string with a suffi-
ciently large initial capacity to avoid any reallocation.
For culture-sensitive address formatting, the inserter must retrieve the address-
formatting facet from the stream’s current locale. Streams have a member function
getloc() that returns the stream’s locale object. From that locale the address facet can be
retrieved via the template function use_facet<Facet>(). Note that the user-defined
address formatting facet address_put is retrieved in exactly the same way as it
would be for any standard facet.
The inserter then calls the facet’s put() function and delegates the actual formatting
to it. All the elements of an address are passed as arguments to the put() function.
The first argument to put() is expected to be the iterator designating the beginning of the
output sequence. A stream buffer iterator pointing to the current position of the output
stream can be created from a reference to an output stream. (See section 2.4, Stream Iterators
and Stream Buffer Iterators, in part I for more information.) Hence we pass in the
stream itself. The implicit conversion mechanism for function arguments in C++ takes care of
the construction of an output stream buffer iterator.
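The interplay of getloc(), use_facet(), and the implicit iterator conversion can be illustrated with a deliberately reduced sketch. The two-field address struct and the single concrete facet class are hypothetical simplifications, and setting failbit when the facet is missing is an assumption about a reasonable error policy, not a statement about the original inserter.

```cpp
#include <cstddef>
#include <ios>
#include <iterator>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>
#include <typeinfo>

struct address { std::string name, town; };      // hypothetical two-field address

typedef std::ostreambuf_iterator<char> osIter;

class address_put : public std::locale::facet {
public:
    static std::locale::id id;
    explicit address_put(std::size_t refs = 0) : std::locale::facet(refs) {}
    void put(osIter oi, const address& a) const { do_put(oi, a); }
protected:
    virtual void do_put(osIter oi, const address& a) const {
        const std::string s = a.name + "\n" + a.town + "\n";
        for (std::string::size_type i = 0; i < s.size(); ++i) *oi++ = s[i];
    }
};
std::locale::id address_put::id;

std::ostream& operator<<(std::ostream& os, const address& ad) {
    std::ostream::sentry ok(os);
    if (ok) {
        try {
            // Retrieve the facet from the stream's current locale.
            const address_put& fac = std::use_facet<address_put>(os.getloc());
            // os is converted implicitly into an ostreambuf_iterator<char>.
            fac.put(os, ad);
        } catch (const std::bad_cast&) {
            // Assumption: a missing facet is reported as a formatting failure.
            os.setstate(std::ios_base::failbit);
        }
    }
    return os;
}
```

A stream prints addresses only after a locale containing an address_put facet has been imbued, for example with os.imbue(std::locale(os.getloc(), new address_put));. Without such a locale, the inserter in this sketch leaves the stream in a failed state.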
EQUIPPING LOCALES WITH ADDRESS FACETS
We have seen above how an address_put facet is retrieved from a locale object. In addi-
tion to retrieval, we need to consider ways and means of storing address facets in locale
objects in the first place. In section 7.3.2, Immutability of Facets in a Locale, we explained
that locales are immutable objects. Facets are stuffed into a locale when the locale object is
created and cannot be replaced or added later on. Locale objects are built by composition:
You start off with the copy of an existing locale and replace and add facets to create a new
locale object.
In our example, we want to equip a locale that contains all standard facets for a cul-
tural environment (in the following example called a standard locale), with an additional
address-formatting facet.
A standard locale can be created by means of the following constructor:
explicit locale(const char* name);
It constructs a locale object that contains all the standard facets for a cultural environment
specified by the locale name.
A new locale object containing all the facets from an existing locale object, plus an
additional new facet, can be composed via the following locale member template
constructor:
template <class Facet> locale(const locale& other, Facet* f);
To add a US_address_put facet object to the locale that contains all standard U.S.
facets, we would have to write
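Composition by means of the member template constructor can be sketched as follows. The facet class address_put_stub is a hypothetical stand-in for an address-formatting facet, the function with_address_facet() is introduced for illustration, and the classic locale takes the place of a named U.S. locale because locale names other than "C" are platform specific.

```cpp
#include <cstddef>
#include <locale>

// Hypothetical stand-in for an address-formatting facet with its own family.
class address_put_stub : public std::locale::facet {
public:
    static std::locale::id id;
    explicit address_put_stub(std::size_t refs = 0) : std::locale::facet(refs) {}
};
std::locale::id address_put_stub::id;

// Returns a new locale: all facets of the given standard locale plus one more.
std::locale with_address_facet(const std::locale& standard) {
    // Uses: template <class Facet> locale(const locale& other, Facet* f);
    return std::locale(standard, new address_put_stub);
}
```

The locale passed in remains unchanged, which reflects the immutability of locales: only the newly composed locale object contains the additional facet, while both share the standard facets.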
A LOCALE FACTORY
For decoupling the potentially troublesome process of locale construction from comparably
straightforward locale use, we build a factory⁴ that handles the construction of locale
objects. The idea is to create locale objects “bynames”: they shall all have standard facets
for the cultural area specified by the name, plus a number of desired, additional nonstandard
facets, such as an address-formatting facet. The extra effort of building such a locale
factory might look like overkill in our simple example, but it will definitely pay off in a
situation where numerous user-defined facets must be available in every locale object.
We suggest building a hierarchy of locale factories: a base locale factory creating
standard locale objects and derived factories for nonstandard locales. The code below
shows an implementation of a base locale factory that has a make_locale() function
that returns a standard locale associated with the cultural environment specified by a
locale name:
class locale_factory
{
public:
virtual locale make_locale(const char* name) const
{ return locale(name); }
};
4. An in-depth discussion of the factory pattern can be found in Design Patterns: Elements of Reusable Object-
Oriented Software by Erich Gamma et al. (for details see the bibliography section of this book).
Note that the factory method make_locale() usually, according to the general
factory pattern, returns a pointer or reference to the created object. This is because derived
factories will be allowed to create objects of derived classes, which can have additional
members or vary in the behavior of existing member functions. Our factory method
make_locale() deviates from the general pattern in that it returns a locale object rather
than a pointer or a reference. This is a general idiom in using locales: They are passed
around as objects, but internally they are only a handle to an arbitrary number of facets
from arbitrary facet families.⁵ No inheritance is involved for creating locales that hold an
arbitrary combination of facets. Hence there is no need for returning references or point-
ers to locales.
The code below shows the implementation of a locale factory that returns a locale
containing all standard facets and, if a U.S. or a German locale is requested, additionally
an address_put facet.
public:
address_locale_factory()
{
facets["En_US"] = new US_address_put<char, osIter>(1);
facets["De_DE"] = new German_address_put<char, osIter>(1);
}
~address_locale_factory()
{
delete facets["En_US"];
delete facets["De_DE"];
}
private:
Map<string, address_put<char, osIter>* > facets;
};
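A complete factory hierarchy along these lines might look like the sketch below. Several details are assumptions made to keep the example short and portable: the hypothetical facet address_style with its single function country() stands in for address_put, the name "C" is registered in place of "En_US" because it is the only locale name that is guaranteed to exist, and copying of the derived factory is disabled to avoid deleting the facets twice.

```cpp
#include <cstddef>
#include <locale>
#include <map>
#include <string>

// Hypothetical stand-in for address_put: one virtual function suffices here.
class address_style : public std::locale::facet {
public:
    static std::locale::id id;
    explicit address_style(std::size_t refs = 0) : std::locale::facet(refs) {}
    virtual ~address_style() {}               // public: the factory deletes its facets
    virtual std::string country() const = 0;
};
std::locale::id address_style::id;

class us_style : public address_style {
public:
    explicit us_style(std::size_t refs = 0) : address_style(refs) {}
    std::string country() const { return "U.S.A."; }
};

class german_style : public address_style {
public:
    explicit german_style(std::size_t refs = 0) : address_style(refs) {}
    std::string country() const { return "Germany"; }
};

// Base factory: creates standard locales by name.
class locale_factory {
public:
    virtual ~locale_factory() {}
    virtual std::locale make_locale(const char* name) const
    { return std::locale(name); }
};

// Derived factory: adds an address facet where one is registered for the name.
class address_locale_factory : public locale_factory {
public:
    address_locale_factory() {
        // refs == 1: the facets belong to the factory, not to the locales.
        facets_["C"] = new us_style(1);       // assumption: "C" replaces "En_US"
        facets_["De_DE"] = new german_style(1);
    }
    ~address_locale_factory() {
        for (facet_map::iterator i = facets_.begin(); i != facets_.end(); ++i)
            delete i->second;
    }
    std::locale make_locale(const char* name) const {
        std::locale base = locale_factory::make_locale(name);
        facet_map::const_iterator i = facets_.find(name);
        return i == facets_.end() ? base : std::locale(base, i->second);
    }
private:
    typedef std::map<std::string, address_style*> facet_map;
    facet_map facets_;
    address_locale_factory(const address_locale_factory&);             // not copyable
    address_locale_factory& operator=(const address_locale_factory&);  // not assignable
};
```

Since the factory owns the facets, locales produced by it must not be used after the factory has been destroyed. Note also that make_locale() propagates the runtime_error that the locale constructor throws for names unknown to the platform.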
5. The details of the locale architecture were explained in chapter 7, “The Architecture of the Locale Framework.”
A locale that has an address facet installed must be passed to the printAddress()
function when it is invoked. Here is an example:
CONCLUSION
In this chapter we demonstrated a technique for adding arbitrary, user-defined facets to
the locale framework in the standard library and their usage in conjunction with
iostreams. The example of choice was an address-formatting facet. The technique itself,
however, is more general and can be applied to arbitrary facet types. Here is a wrapup of
the essentials:
MANDATORY. A user-defined facet type must be derived from class locale::facet
and have a facet identification in the form of a static data member named id of type
locale::id.
RECOMMENDED. We recommend the following:
• For the sake of consistency, a facet name should follow the naming conventions of
the standard facets.
• Formatting and parsing operations should access the source or destination via iterators.
Formatting and parsing facets should be templatized on the iterator type
and should use stream buffer iterators as a default.
• For the sake of consistency with the standard facets, public member functions
should delegate to protected member functions.
REFERENCE GUIDE
INTRODUCTION
The following introduction describes how entries in the reference guide are organized
and described.
Entries are grouped into the following sections: locale, character traits, IOStreams,
stream iterators, and other I/O operations.
Each section starts with its header files, global type definitions, global objects, and
global functions (if they exist), followed by their classes in alphabetical order.
Each class entry describes the header file that contains the class, its base class(es), a
general picture of the class, a synopsis, its nested classes and types, its (constant) data
members, and its member functions grouped according to access specification
(public/protected) and their functionality.
The synopsis and the subsequent descriptions are ordered in the same way: Bold-
face comments in the synopsis reappear as entries on the lefthand margin of the descrip-
tions. Within each group, the entries are ordered by meaning in the synopsis and
alphabetically in the subsequent descriptions. Below is an example:
SYNOPSIS
namespace std {
template <class charT>
class numpunct : public locale::facet {
public:
// type definitions:
typedef charT char_type;
typedef basic_string<charT> string_type;
// data members:
static locale::id id;
// constructors:
explicit numpunct(size_t refs = 0);
// numpunct operations:
char_type decimal_point() const;
char_type thousands_sep() const;
string grouping() const;
string_type truename() const;
string_type falsename() const;
protected:
virtual ~numpunct();
// numpunct operations:
virtual char_type do_decimal_point() const;
virtual char_type do_thousands_sep() const;
virtual string do_grouping() const;
virtual string_type do_truename() const;
virtual string_type do_falsename() const;
};
}
virtual ~numpunct();
For virtual functions the base class gives the description of the general semantics
and the base class behavior. Derived classes repeat these virtual functions only if they pro-
vide new implementations.
Calls to a member function of the same object are always written as this->foo(),
not simply as foo(), to make clear beyond any doubt that foo is a member function and
not a global function.
In a similar way, scope operators (even if syntactically not necessary) are used to
make clear where an item comes from. For instance, constants that are defined as protected
or public in a base class and used in a derived class are named
base_class::constant in the description of the derived class.
To enhance readability, this is always used in the text even if *this would be correct
C++ syntax. Here is an example: Failures are indicated by the state of this.
The last remark is IOStreams-specific. Member function descriptions of stream
classes, e.g., basic_istream<charT>, mention only the error bits that are set in case of
failure. As a general principle of IOStreams, exceptions associated with the
error bits can optionally be thrown. This is omitted from the reference manual for brevity. For
details about the error indication model of IOStreams, see section 1.3, The Stream State.
LOCALE
<locale>
DESCRIPTION
The header file contains all declarations for class locale and all standard facets.
SYNOPSIS
namespace std {
// Class locale:
class locale;
// global functions:
// facet access:
template <class Facet> const Facet& use_facet(const locale&);
template <class Facet> bool has_facet(const locale&) throw();
// character classification:
template <class charT> bool isspace(charT c, const locale& loc);
template <class charT> bool isprint(charT c, const locale& loc);
template <class charT> bool iscntrl(charT c, const locale& loc);
template <class charT> bool isupper(charT c, const locale& loc);
template <class charT> bool islower(charT c, const locale& loc);
template <class charT> bool isalpha(charT c, const locale& loc);
// numeric facets:
template <class charT, class InputIterator> class num_get;
template <class charT, class OutputIterator> class num_put;
template <class charT> class numpunct;
template <class charT> class numpunct_byname;
// collation facets:
template <class charT> class collate;
template <class charT> class collate_byname;
// money facets:
class money_base;
template <class charT, class InputIterator> class money_get;
template <class charT, class OutputIterator> class money_put;
global functions
HEADER
header: <locale>
FACET ACCESS
template <class Facet> bool has_facet(const locale& loc) throw();
Returns true if a facet that can be identified by Facet is available in loc; otherwise
returns false.
template <class Facet> const Facet& use_facet(const locale& loc);
Makes the facet, which is identified by Facet, available from loc, by returning a reference
to this facet. If no such facet can be found there, a bad_cast exception is thrown.
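The two access functions are typically combined in one of two ways, as the following sketch suggests. The facet unit_facet and both helper functions are hypothetical and serve only to contrast a guarded lookup via has_facet() with an unguarded lookup that relies on the bad_cast exception.

```cpp
#include <cstddef>
#include <locale>
#include <string>
#include <typeinfo>

// Hypothetical user-defined facet that forms a facet family of its own.
class unit_facet : public std::locale::facet {
public:
    static std::locale::id id;
    explicit unit_facet(std::size_t refs = 0) : std::locale::facet(refs) {}
    std::string length_unit() const { return "m"; }
};
std::locale::id unit_facet::id;

// Guarded access: asks has_facet() first and falls back if the facet is absent.
std::string length_unit_or(const std::locale& loc, const std::string& fallback) {
    return std::has_facet<unit_facet>(loc)
        ? std::use_facet<unit_facet>(loc).length_unit()
        : fallback;
}

// Unguarded access: use_facet() throws bad_cast if the facet is absent.
bool throws_bad_cast(const std::locale& loc) {
    try {
        std::use_facet<unit_facet>(loc);
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```

For the standard facets a check is unnecessary, because every locale contains them. The question of presence arises only for facet families that were added by the user.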
CHARACTER CLASSIFICATION
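template <class charT> bool isalnum(charT c, const locale& loc);
Checks whether c is an alphabetical or numerical character, using the ctype facet contained
in loc, i.e., calls use_facet<ctype<charT> >(loc).is(ctype_base::alnum,c).
template <class charT> bool isalpha(charT c, const locale& loc);
Checks whether c is an alphabetical character, using the ctype facet contained in loc,
i.e., calls use_facet<ctype<charT> >(loc).is(ctype_base::alpha,c).
template <class charT> bool iscntrl(charT c, const locale& loc);
Checks whether c is a control character, using the ctype facet contained in loc, i.e., calls
use_facet<ctype<charT> >(loc).is(ctype_base::cntrl,c).
template <class charT> bool isdigit(charT c, const locale& loc);
Checks whether c is a digit, using the ctype facet contained in loc, i.e., calls
use_facet<ctype<charT> >(loc).is(ctype_base::digit,c).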
template <class charT> bool islower(charT c, const locale& loc);
Checks whether c is a lowercase character, using the ctype facet contained in loc, i.e.,
calls use_facet<ctype<charT> >(loc).is(ctype_base::lower,c).
template <class charT> bool isprint(charT c, const locale& loc);
Checks whether c is a printable character, using the ctype facet contained in loc,
i.e., calls use_facet<ctype<charT> >(loc).is(ctype_base::print,c).
template <class charT> bool ispunct(charT c, const locale& loc);
Checks whether c is a punctuation character, using the ctype facet contained in
loc, i.e., calls use_facet<ctype<charT> >(loc).is(ctype_base::punct,c).
template <class charT> bool isspace(charT c, const locale& loc);
Checks whether c is a whitespace character, using the ctype facet contained in
loc, i.e., calls use_facet<ctype<charT> >(loc).is(ctype_base::space,c).
template <class charT> bool isupper(charT c, const locale& loc);
Checks whether c is an uppercase character, using the ctype facet contained in loc, i.e.,
calls use_facet<ctype<charT> >(loc).is(ctype_base::upper,c).
template <class charT> bool isxdigit(charT c, const locale& loc);
Checks whether c is a character that represents a hexadecimal digit (i.e., 0-9, a-f, or A-F)
using the ctype facet contained in loc, i.e., calls
use_facet<ctype<charT> >(loc).is(ctype_base::xdigit,c).
CHARACTER CONVERSION
template <class charT> charT tolower(charT c, const locale& loc);
Returns the character that represents the lowercase conversion of c, if it exists; otherwise
c. Uses the ctype facet contained in loc, i.e., calls
use_facet<ctype<charT> >(loc).tolower(c).
template <class charT> charT toupper(charT c, const locale& loc);
Returns the character that represents the uppercase conversion of c, if it exists; otherwise
c. Uses the ctype facet contained in loc, i.e., calls
use_facet<ctype<charT> >(loc).toupper(c).
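The convenience functions save the explicit retrieval of the ctype facet. A brief sketch of their use follows; the helper functions to_upper_copy(), count_digits(), and same_as_facet_call() are hypothetical names chosen for this illustration.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Converts every character to uppercase according to the given locale.
std::string to_upper_copy(const std::string& s, const std::locale& loc) {
    std::string r(s);
    for (std::string::size_type i = 0; i < r.size(); ++i)
        r[i] = std::toupper(r[i], loc);
    return r;
}

// Counts the characters that the given locale classifies as digits.
std::size_t count_digits(const std::string& s, const std::locale& loc) {
    std::size_t n = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (std::isdigit(s[i], loc)) ++n;
    return n;
}

// Compares the convenience function with the facet call it stands for.
bool same_as_facet_call(char c, const std::locale& loc) {
    return std::isalpha(c, loc) ==
           std::use_facet<std::ctype<char> >(loc).is(std::ctype_base::alpha, c);
}
```

When many characters are processed with the same locale, retrieving the ctype facet once and calling its member functions directly avoids repeating the facet lookup for every character.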
codecvt<internT,externT,stateT>
CLASS TEMPLATE
template <class internT, class externT, class stateT> class codecvt
header: <locale>
base class(es): codecvt_base, locale::facet
DESCRIPTION
codecvt is the facet that contains the information about character code conversion and
the functionality to perform the conversions. The following instantiations are required:
codecvt<wchar_t,char,mbstate_t> converts between the native character
sets for narrow and wide characters.
codecvt<char, char, mbstate_t> implements a degenerate conversion; it does
not convert at all.
SYNOPSIS
namespace std {
template <class internT, class externT, class stateT>
class codecvt : public locale::facet, public codecvt_base {
public:
// type definitions:
typedef internT intern_type;
typedef externT extern_type;
typedef stateT state_type;
// data members:
static locale::id id;
// constructors:
explicit codecvt(size_t refs = 0);
// code conversion:
result in(stateT& state,
const externT* from, const externT* from_end,
const externT*& from_next,
internT* to, internT* to_limit, internT*& to_next) const;
result out(stateT& state,
const internT* from, const internT* from_end,
const internT*& from_next,
externT* to, externT* to_limit, externT*& to_next) const;
// miscellaneous:
result unshift(stateT& state,
externT* to, externT* to_limit, externT*& to_next) const;
int encoding() const throw();
bool always_noconv() const throw();
TYPE DEFINITIONS
The type intern_type is the character type that is associated with the internal code set.
The type extern_type is the character type that is associated with the external code set.
The type state_type is a type that is capable of holding the conversion state. It must be
maintained during a conversion from the external to the internal character set and vice
versa and can contain any information that is useful to communicate to or from the
do_in() and do_out() member functions.
CODE CONVERSION
MISCELLANEOUS
Calls do_length(state, from, end, max) and returns the result of this call.
Calls do_unshift(state, to, to_limit, to_next) and returns the result of this call.
virtual ~codecvt();
CODE CONVERSION
Converts characters from the input represented by the range [from, from_end)
and places the result into the output designated by to. Converts no more than
(from_end - from) elements from the input and places no more than (to_limit -
to) elements into the output. Conversion also stops if it encounters a character that can-
not be converted. When the function returns, from_next and to_next are pointing
one beyond the last element successfully handled. If no translation is needed (return
value codecvt_base::noconv), to_next is set to to, and from_next to from.
Returns one of the values from codecvt_base: : result.
Converts characters from the input represented by the range [from, from_end)
and places the result into the output designated by to. Converts no more than
(from_end - from) elements from the input and places no more than (to_limit - to)
elements into the output. Conversion also stops if it encounters a character that cannot be
converted. When the function returns, from_next and to_next are pointing one
beyond the last element successfully handled. If no translation is needed (return value
codecvt_base::noconv), to_next is set to to, and from_next to from. Returns one
of the values from codecvt_base::result.
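The calling convention described above can be sketched with the public in() member. This is only an illustration under stated assumptions: the helper name widen_with_codecvt is hypothetical, the facet is taken from the classic locale by default, and the output buffer is sized on the assumption that one external character never yields more than one internal character.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: converts an external (narrow) sequence to the
// internal (wide) representation with a single call to codecvt::in().
std::wstring widen_with_codecvt(const std::string& src,
                                const std::locale& loc = std::locale::classic())
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> cvt_type;
    const cvt_type& cvt = std::use_facet<cvt_type>(loc);

    if (src.empty())
        return std::wstring();

    // Assumption: at most one internal character per external character.
    std::vector<wchar_t> buf(src.size());
    std::mbstate_t state = std::mbstate_t();   // initial conversion state

    const char* from      = src.data();
    const char* from_end  = from + src.size();
    const char* from_next = from;
    wchar_t*    to        = &buf[0];
    wchar_t*    to_next   = to;

    std::codecvt_base::result r =
        cvt.in(state, from, from_end, from_next, to, to + buf.size(), to_next);

    // Anything other than a complete conversion is treated as a failure here.
    if (r != std::codecvt_base::ok || from_next != from_end)
        throw std::runtime_error("conversion incomplete or failed");

    // [to, to_next) now holds the converted characters.
    return std::wstring(to, to_next);
}
```

A production conversion loop would normally also handle partial results by refilling the input or enlarging the output; that is omitted here for brevity.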
MISCELLANEOUS
virtual bool do_always_noconv() const throw();
Returns the maximum number of externT characters that can be consumed to produce
one internT character, i.e., the maximum value that do_length(st, from,
from_end,1) can return for any valid range [from, from_end) and stateT value st.
Places the characters needed to unshift the conversion state represented by state into
the output designated by to. Typically, these characters will be characters to return the
state to the initial state stateT().
Places no more than (to_limit - to) elements into the output. When the function
returns, to_next is pointing one beyond the last element placed in the output. If no
unshift sequence is needed, to_next is set to to. Returns one of the values from
codecvt_base::result.
The required instantiations codecvt<wchar_t, char, mbstate_t> and
codecvt<char, char, mbstate_t> store no characters.
(The term “unshift” stems from the “shift sequences” that are used in state-dependent
multibyte encodings. In this type of encoding scheme, shift (or escape) sequences are used
to switch between one- and two-byte modes, as well as between different character sets;
“unshift” therefore means “returning to the initial default mode.” For further explanation
see section 4.2.7.3, Character Encoding Schemes.)
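As a small illustration of the miscellaneous members, the following sketch queries the nonconverting codecvt<char, char, mbstate_t> facet of the classic locale and checks that unshift() stores no characters. The struct and function names are hypothetical and exist only for this example.

```cpp
#include <cwchar>
#include <locale>

// Hypothetical record of what the facet reports about itself.
struct codecvt_properties {
    bool always_noconv;          // result of always_noconv()
    int  encoding;               // result of encoding()
    int  max_length;             // result of max_length()
    bool unshift_wrote_nothing;  // true if unshift() left to_next == to
};

codecvt_properties inspect_char_codecvt()
{
    typedef std::codecvt<char, char, std::mbstate_t> cvt_type;
    const cvt_type& cvt = std::use_facet<cvt_type>(std::locale::classic());

    codecvt_properties p;
    p.always_noconv = cvt.always_noconv();
    p.encoding      = cvt.encoding();
    p.max_length    = cvt.max_length();

    char buf[8];
    char* to_next = 0;
    std::mbstate_t state = std::mbstate_t();   // initial state
    cvt.unshift(state, buf, buf + sizeof buf, to_next);
    p.unshift_wrote_nothing = (to_next == buf);
    return p;
}
```

For a stateless encoding such as this one there is nothing to unshift, so to_next is expected to be left at the start of the buffer.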
codecvt_base
CLASS
class codecvt_base
header: <locale>
base class(es): [none]
DESCRIPTION
codecvt_base is the base class of all codecvt facets. It provides an enumerated type that
represents the result of code conversion operations.
SYNOPSIS
namespace std {
class codecvt_base {
public:
enum result { ok, partial, error, noconv };
};
}
TYPE DEFINITIONS
Enumerated type that represents the result of code conversion operations, such as in(),
out(), and unshift().
CONSTANT DEFINITIONS
The following list shows the predefined values and their semantics for the nested enu-
merated type result. The numeric values are implementation-dependent:
ok
partial
Last characters from the input sequence not converted or additional input characters
needed before another output character can be produced (in the case of in() and out()) or
more characters need to be supplied to complete termination (in the case of unshift()).
error
Encountered one or more characters in the input sequence that could not be converted
(in the case of in() and out()) or the state is invalid (in the case of unshift()).
noconv
No conversion is needed (in the case of in() and out()); i.e., this is a nonconverting
code conversion facet, or no termination is needed for the state type (in the case of
unshift()); i.e., a stateless encoding scheme is used, which has no shift states, and there
is no need to unshift anything.
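Callers of in(), out(), and unshift() typically branch on these four values. A minimal sketch follows; the function name describe_result and the wording of the returned texts are illustrative choices, not part of the library.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: maps a conversion result to a short description.
std::string describe_result(std::codecvt_base::result r)
{
    switch (r) {
    case std::codecvt_base::ok:
        return "conversion completed";
    case std::codecvt_base::partial:
        return "input not fully converted";
    case std::codecvt_base::error:
        return "unconvertible character or invalid state";
    case std::codecvt_base::noconv:
        return "no conversion needed";
    }
    return "unknown result";   // not reached for the four standard values
}
```

Because the numeric values are implementation-dependent, the sketch compares against the enumerators only and never against integer literals.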
codecvt_byname<internT,externT,stateT>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class internT, class externT, class stateT>
class codecvt_byname : public codecvt<internT, externT, stateT> {
public:
// constructors:
explicit codecvt_byname(const char*, size_t refs = 0);
protected:
// destructor:
virtual ~codecvt_byname();
// code conversion:
virtual result do_out(stateT& state,
const internT* from, const internT* from_end,
const internT*& from_next,
externT* to, externT* to_limit, externT*& to_next) const;
virtual result do_in(stateT& state,
const externT* from, const externT* from_end,
const externT*& from_next,
internT* to, internT* to_limit, internT*& to_next) const;
// miscellaneous:
virtual result do_unshift(stateT& state,
externT* to, externT* to_limit, externT*& to_next) const;
virtual int do_encoding() const throw();
virtual bool do_always_noconv() const throw();
virtual int do_length(const stateT&,
const externT* from, const externT* end, size_t max) const;
virtual int do_max_length() const throw();
};
}
1. At the time of this writing, the question of whether the instantiations of the codecvt_byname facets are
required or not is under discussion by the standards committee. The final C++ standard currently does not
require them, but there is a corresponding change request pending.
virtual ~codecvt_byname();
collate<charT>
CLASS TEMPLATE
DESCRIPTION
collate is the facet that contains the functionality for string collation (comparison) and
hashing.
The following instantiations are required: collate<char> and collate
<wchar_t>. They provide classic “C” behavior. Details are given with the description of
the virtual protected member functions.
SYNOPSIS
namespace std {
template <class charT>
class collate : public locale::facet {
public:
// type definitions:
typedef charT char_type;
typedef basic_string<charT> string_type;
// data member:
static locale::id id;
// constructors:
explicit collate(size_t refs = 0);
// collate operations:
int compare(const charT* low1, const charT* high1,
const charT* low2, const charT* high2) const;
string_type transform(const charT* low, const charT* high) const;
long hash(const charT* low, const charT* high) const;
protected:
// destructor:
virtual ~collate();
// collate operations:
virtual int do_compare(const charT* low1, const charT* high1,
const charT* low2, const charT* high2) const;
virtual string_type do_transform
(const charT* low, const charT* high) const;
virtual long do_hash (const charT* low, const charT* high) const;
};
}
collate<charT>::id defines the unique identifications of the collate facet interfaces.
Each template instantiation with a different character type defines a different facet
interface with an associated unique id.
Calls do_compare(low1, high1, low2, high2) and returns the result of this call.
Calls do_hash(low, high) and returns the result of this call.
virtual ~collate();
virtual int
do_compare(const charT* low1, const charT* high1,
const charT* low2, const charT* high2) const;
Returns a hash value derived from the character sequence represented by the range
[low, high). Character sequences for which this->compare() yields 0 will produce
the same hash value.
Returns a string that is the transformation of the character sequence represented by the
range [low,high) to an internal representation. The lexicographic comparison of two
strings resulting from do_transform() yields the same result as the comparison of the
original character ranges with do_compare(). This is helpful when a single character
sequence is compared to many other character sequences, because it avoids the transfor-
mation of the single character sequence for each comparison.
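The relationship between compare(), transform(), and hash() can be tried out with the collate<char> facet of the classic locale. The helper names below are hypothetical; the sketch only assumes the public interface shown in the synopsis.

```cpp
#include <locale>
#include <string>

// Hypothetical wrappers around the public collate<char> interface.
int compare_strings(const std::string& a, const std::string& b,
                    const std::locale& loc = std::locale::classic())
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size());
}

// Returns the transformed string; plain lexicographic comparison of two
// such keys agrees with compare_strings() on the original strings.
std::string sort_key(const std::string& s,
                     const std::locale& loc = std::locale::classic())
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.transform(s.data(), s.data() + s.size());
}

long hash_of(const std::string& s,
             const std::locale& loc = std::locale::classic())
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.hash(s.data(), s.data() + s.size());
}
```

When one string is compared against many others, computing its sort_key() once and comparing keys avoids repeating the transformation for every comparison.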
collate_byname<charT>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT>
class collate_byname : public collate<charT> {
public:
// type definitions:
typedef charT char_type;
typedef basic_string<charT> string_type;
// constructors:
explicit collate_byname(const char*, size_t refs = 0);
protected:
// destructor:
virtual ~collate_byname();
// collate operations:
virtual int do_compare(const charT* low1, const charT* high1,
const charT* low2, const charT* high2) const;
virtual string_type do_transform
(const charT* low, const charT* high) const;
virtual long do_hash (const charT* low, const charT* high) const;
};
}
virtual ~collate_byname();
ctype<charT>
CLASS TEMPLATE
DESCRIPTION
ctype is the facet that contains the functionality for character classification and
conversion.
The following instantiations are required: ctype<char> and ctype<wchar_t>.
They implement character classification appropriate to the implementation’s native char-
acter set. The type ctype<char> is provided as a specialization of the ctype<charT>
template.
SYNOPSIS
namespace std {
template <class charT>
class ctype : public locale::facet, public ctype_base {
public:
// type definitions:
typedef charT char_type;
// data members:
static locale::id id;
// constructors:
explicit ctype(size_t refs = 0);
// character classification:
bool is(mask m, charT c) const;
const charT* is(const charT* low, const charT* high, mask* vec) const;
const charT* scan_is(mask m,
const charT* low, const charT* high) const;
const charT* scan_not(mask m,
const charT* low, const charT* high) const;
// character conversion:
charT toupper(charT c) const;
const charT* toupper(charT* low, const charT* high) const;
charT tolower(charT c) const;
const charT* tolower(charT* low, const charT* high) const;
charT widen(char c) const;
const char* widen(const char* low, const char* high, charT* to) const;
char narrow(charT c, char dfault) const;
const charT* narrow(const charT* low, const charT*, char dfault,
char* to) const;
protected:
// destructor:
virtual ~ctype();
// character classification:
virtual bool do_is(mask m, charT c) const;
virtual const charT* do_is(const charT* low, const charT* high,
mask* vec) const;
virtual const charT* do_scan_is(mask m,
const charT* low, const charT* high) const;
virtual const charT* do_scan_not(mask m,
const charT* low, const charT* high) const;
// character conversion:
virtual charT do_toupper(charT) const;
virtual const charT* do_toupper(charT* low, const charT* high) const;
virtual charT do_tolower(charT) const;
virtual const charT* do_tolower(charT* low, const charT* high) const;
virtual charT do_widen(char) const;
virtual const char* do_widen(const char* low, const char* high,
charT* dest) const;
virtual char do_narrow(charT, char dfault) const;
virtual const charT* do_narrow(const charT* low, const charT* high,
char dfault, char* dest) const;
};
}
ctype<charT>::id defines the unique identifications of the ctype facet interfaces. Each
template instantiation with a different character type defines a different facet interface
with an associated unique id.
CHARACTER CLASSIFICATION
bool is(mask m, charT c) const;
Calls do_is(low, high, vec) and returns the result of this call.
Calls do_scan_is(m, low, high) and returns the result of this call.
Calls do_scan_not(m, low, high) and returns the result of this call.
CHARACTER CONVERSION
Calls do_narrow(low, high, dfault, to) and returns the result of this call.
Calls do_tolower(low, high) and returns the result of this call.
Calls do_toupper(low, high) and returns the result of this call.
Calls do_widen(low, high, to) and returns the result of this call.
virtual ~ctype();
CHARACTER CLASSIFICATION
Determines the classification m of type ctype_base::mask for each element in the
range [low, high) and places m into vec. After the call, the array vec contains the bit
masks that characterize each character from the range [low, high). Returns high.
Locates the first element from the range [low, high) that conforms to the classification
defined by m. Returns the address of the found element or high, if none was found.
Locates the first element from the range [low, high) that does not conform to the classi-
fication defined by m. Returns the address of the found element or high, if none was
found.
CHARACTER CONVERSION
Places the char representation that corresponds to the elements from the range
[low, high) into the array designated by to. If an element has no corresponding char
representation, dfault is placed into the array instead. Returns high.
Returns the character that represents the lowercase conversion of c, if it exists; otherwise c.
Replaces each element from the range [low, high) with the character that represents the
lowercase conversion, if it exists. Returns high.
Returns the character that represents the uppercase conversion of c, if it exists; otherwise c.
Replaces each element from the range [low, high) with the character that represents the
uppercase conversion, if it exists. Returns high.
Places the charT representation that corresponds to the elements from the range
[low, high) into the array designated by to. Returns high.
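The range-based members are easiest to understand in use. The following sketch, with hypothetical helper names, applies toupper() and scan_is() from the ctype<char> facet of the classic locale to the contents of a string.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper: uppercases a copy of s in place via the range
// version of toupper().
std::string to_upper(std::string s,
                     const std::locale& loc = std::locale::classic())
{
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    if (!s.empty())
        ct.toupper(&s[0], &s[0] + s.size());
    return s;
}

// Hypothetical helper: index of the first digit in s, or s.size() if
// there is none (scan_is() returns high in that case).
std::size_t first_digit(const std::string& s,
                        const std::locale& loc = std::locale::classic())
{
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    const char* low  = s.data();
    const char* high = low + s.size();
    return static_cast<std::size_t>(
        ct.scan_is(std::ctype_base::digit, low, high) - low);
}
```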
ctype<char>
TEMPLATE SPECIALIZATION
DESCRIPTION
For performance reasons, the ctype facet for the character type char is a template spe-
cialization. Its implementation is based on a table that uses the character as key and the
classification mask that corresponds to the character as value.
SYNOPSIS
namespace std {
template <> class ctype<char> : public locale::facet, public ctype_base{
public:
// type definitions:
typedef char char_type;
// data members and constant definitions:
static locale::id id;
static const size_t table_size = IMPLEMENTATION DEFINED;
// constructors:
explicit ctype(const mask* tab = 0, bool del = false, size_t refs = 0);
// character classification:
bool is(mask m, char c) const;
const char* is(const char* low, const char* high, mask* vec) const;
const char* scan_is(mask m, const char* low, const char* high) const;
const char* scan_not(mask m, const char* low, const char* high) const;
// character conversion:
char toupper(char c) const;
const char* toupper(char* low, const char* high) const;
char tolower(char c) const;
const char* tolower(char* low, const char* high) const;
char widen(char c) const;
const char* widen(const char* low, const char* high, char* to) const;
char narrow(char c, char dfault) const;
const char* narrow(const char* low, const char* high, char dfault,
char* to) const;
protected:
// destructor:
virtual ~ctype();
// table access:
CONSTANT DEFINITIONS
Size of the table used for character classification. The value is implementation-defined,
but at least 256.
Implements all public member functions as described for ctype<charT>, except the fol-
lowing listed:
explicit ctype(const mask* tab = 0, bool del = false, size_t refs = 0);
Constructs an object of type ctype<char>. If tab==0, the new object uses ctype
<char>::classic_table() for character classification. If tab!=0, the new object
uses tab for character classification. If tab!=0 and del==true, the table is destroyed
via delete[] table() during destruction.
If refs==0, the lifetime of this is managed by the locale(s) that contain this. If
refs==1, the memory of this must be explicitly managed. The behavior for refs>1 is
not defined.
CHARACTER CLASSIFICATION
Returns true if c conforms to the classification defined by m; otherwise false. Uses the
table for classification; that is, returns table()[(unsigned char)c] & m. (See
ctype_base::mask for details.)
const char* is
(const char* low, const char* high, mask* vec) const;
Determines the classification for each element in the range [low, high) and places it into
vec. After the call, the array vec contains the bitmask that characterizes each character
from the range [low, high). The classification for an element c is determined as
table()[(unsigned char)c]. Returns high.
Locates the first element from the range [low, high) that conforms to the classification
defined by m. The classification for an element c is determined as
table()[(unsigned char)c]. Returns the address of the found element or high, if none was
found.
Locates the first element from the range [low, high) that does not conform to the
classification defined by m. The classification for an element c is determined as
table()[(unsigned char)c]. Returns the address of the found element or high, if
none was found.
Implements all protected member functions as described for ctype<charT>, except the
following listed:
virtual ~ctype();
TABLE ACCESS
Returns a pointer to the table that is currently used by this for character classification,
that is, the first constructor argument, if it was nonzero, and otherwise
classic_table().
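A common use of the table constructor argument is to supply a modified copy of the classic table. The sketch below is one possible way to do this, assuming a hypothetical facet comma_is_space that additionally classifies the comma as whitespace so that formatted extraction skips it. The table is kept in a function-local static object, so del is passed as false.

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet: like the classic ctype<char>, but ',' counts as space.
struct comma_is_space : std::ctype<char> {
    explicit comma_is_space(std::size_t refs = 0)
        : std::ctype<char>(make_table(), false, refs) {}

private:
    static const mask* make_table()
    {
        // Copy of the classic table with one entry changed; it lives for
        // the rest of the program, so the facet must not delete it.
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[static_cast<unsigned char>(',')] |= space;
        return &table[0];
    }
};

// Reads all integers from a comma-separated text.
std::vector<int> parse_comma_separated(const std::string& text)
{
    std::istringstream in(text);
    in.imbue(std::locale(in.getloc(), new comma_is_space));

    std::vector<int> values;
    int v;
    while (in >> v)
        values.push_back(v);
    return values;
}
```

The facet is created with refs==0, so the locale that receives it manages its lifetime.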
ctype_base
CLASS
class ctype_base
header: <locale>
base class(es): [none]
DESCRIPTION
ctype_base is the base class of all ctype facets. It provides a bitmask that represents dif-
ferent character categories.
SYNOPSIS
namespace std {
class ctype_base {
public:
enum mask { // numeric values are for exposition only.
space=1<<0, print=1<<1, cntrl=1<<2, upper=1<<3, lower=1<<4,
alpha=1<<5, digit=1<<6, punct=1<<7, xdigit=1<<8,
alnum=alpha|digit, graph=alnum|punct
};
};
}
TYPE DEFINITIONS
Enumerated bitmask type that represents the different types of character categories.
CONSTANT DEFINITIONS
The following list shows the predefined values and their semantics for the nested enu-
merated type mask. The numeric values are implementation-dependent:
alpha
alphabetical characters
digit
cntrl
control characters
lower
lowercase characters
print
printable characters
punct
punctuation characters
space
whitespace characters
upper
uppercase characters
xdigit
characters that represent the hexadecimal digits, i.e., 0-9, a-f, or A-F
alnum
alphanumeric characters, that is, the union of alphabetical characters and digits; the value
equals alpha|digit
graph
printing characters, that is, the union of alphanumeric and punctuation characters; the
value equals alnum|punct
ctype_byname<charT>
CLASS TEMPLATE
DESCRIPTION
ctype_byname is the byname ctype facet. It allows the behavior of an object to be speci-
fied to conform to a certain localization environment. (See the description of the construc-
tor for more information.)
The following instantiations are required: ctype_byname<char> and
ctype_byname<wchar_t>. The type ctype_byname<char> is provided as a special-
ization of the ctype_byname<charT> template.
SYNOPSIS
namespace std {
template <class charT>
class ctype_byname : public ctype<charT> {
public:
// type definitions:
typedef ctype<charT>::mask mask;
// constructors:
explicit ctype_byname(const char*, size_t refs = 0);
protected:
// destructor:
virtual ~ctype_byname();
// character classification:
virtual bool do_is(mask m, charT c) const;
virtual const charT* do_is(const charT* low, const charT* high,
mask* vec) const;
virtual const charT* do_scan_is(mask m,
const charT* low, const charT* high) const;
virtual const charT* do_scan_not(mask m,
const charT* low, const charT* high) const;
// character conversion:
virtual charT do_toupper(charT) const;
virtual const charT* do_toupper(charT* low, const charT* high) const;
virtual charT do_tolower(charT) const;
virtual const charT* do_tolower(charT* low, const charT* high) const;
virtual charT do_widen(char) const;
virtual ~ctype_byname();
locale
CLASS
class locale
header: <locale>
base class(es): [none]
DESCRIPTION
locale encapsulates an abstraction that maintains different facet objects, which together
form a certain localization environment. A facet object can be maintained by a locale
object only if it is an instance of a class that is either derived from locale::facet and
declares a static public data member id of type locale::id or is an instance of a class
that is derived from such a class.
Locales can have names. Valid locale names are "C"; ""; and any implementation-
defined locale name. "C" stands for the classic U.S. English ASCII locale. "" stands for
the native locale on your system. The syntax and semantics of other locale names are not
defined by the standard but are entirely implementation-specific. For example, the name
"De_DE" on an X/Open system denotes the same localization environment as
"German_Germany.1252" on a Microsoft platform.
SYNOPSIS
namespace std {
class locale {
public:
// class definitions:
class facet;
class id;
// type definitions:
typedef int category;
// constant definitions:
static const category // values assigned are for exposition only
none = 0,
collate = 0x010, ctype = 0x020,
monetary = 0x040, numeric = 0x080,
time = 0x100, messages = 0x200,
all = collate | ctype | monetary | numeric | time | messages;
// construct/copy/destroy:
locale() throw();
locale(const locale& other) throw();
explicit locale(const char* std_name);
locale(const locale& other, const char* std_name, category);
CLASS DEFINITIONS
class id {
public:
id();
private:
void operator= (const id&); // not defined
id(const id&); // not defined
};
class facet {
protected:
explicit facet(size_t refs = 0);
virtual ~facet();
private:
facet (const facet&); // not defined
void operator=(const facet&); // not defined
};
facet is the base class for all facet instances that can be contained in a locale object. If
refs==0, the lifetime of the facet object is managed by the locale(s) that contain it. If
refs==1, the facet object must be explicitly deleted. The behavior for refs>1 is not
defined.
TYPE DEFINITIONS
CONSTANT DEFINITIONS
The following list shows the predefined values and their semantics for the nested bitmask
type category. The numeric values are implementation-dependent:
none
collate
Corresponding to the C locale category LC_COLLATE. This category contains at least the
facet interfaces associated with collate<char> and collate<wchar_t>.
ctype
Corresponding to the C locale category LC_CTYPE. This category contains at least the facet
interfaces associated with ctype<char>, ctype<wchar_t>, codecvt<char, char,
mbstate_t>, and codecvt<wchar_t,char,mbstate_t>.
messages
Corresponding to the POSIX locale category LC_MESSAGES. This category contains at least
the facet interfaces associated with messages<char> and messages<wchar_t>.
monetary
Corresponding to the C locale category LC_MONETARY. This category contains at least the
facet interfaces associated with moneypunct<char>, moneypunct<wchar_t>,
moneypunct<char,true>, moneypunct<wchar_t,true>, money_get<char>,
money_get<wchar_t>, money_put<char>, and money_put<wchar_t>.
numeric
Corresponding to the C locale category LC_NUMERIC. This category contains at least the
facet interfaces associated with numpunct<char>, numpunct<wchar_t>, num_get
<char>, num_get<wchar_t>, num_put<char>, and num_put<wchar_t>.
time
Corresponding to the C locale category LC_TIME. This category contains at least the facet
interfaces associated with time_get<char>, time_get<wchar_t>, time_put
<char>, and time_put<wchar_t>.
locale() throw();
Constructs a snapshot of the current global locale, that is, an object of type locale, which
is either a copy of the argument passed into the last call to locale::global(const locale&),
if this function has been called before, or a copy of the locale returned by a
call to locale::classic().
Constructs an object of type locale. The name specifies the localization environment
to which the constructed object conforms. Valid locale names are "C"; ""; and any
implementation-defined locale name. Throws the exception runtime_error if the argument
name is not valid, or 0.
Constructs an object of type locale, which is a copy of other, except for the facets identified
by the category c. These facets are the same as those in a locale object constructed by
locale(name). Throws the exception runtime_error if the argument name is not
valid, or 0. The resulting locale has a name if, and only if, other has a name.
template<class Facet>
locale(const locale& other, Facet* f);
Constructs an object of type locale, which is a copy of other, except for the facet identified
by Facet. This facet is the same as *f. If f==0 the newly constructed object is a copy
of other. The resulting locale has no name.
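The facet-replacing constructor can be illustrated with a small user-defined facet. The example below uses numpunct<char> purely as a convenient facet to override; the names comma_decimal_point, make_comma_locale, and format_number are hypothetical.

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: numeric punctuation with ',' as the decimal point.
struct comma_decimal_point : std::numpunct<char> {
    explicit comma_decimal_point(std::size_t refs = 0)
        : std::numpunct<char>(refs) {}
protected:
    virtual char do_decimal_point() const { return ','; }
};

// A copy of the classic locale in which only the numpunct<char> facet is
// replaced. The new facet has refs==0, so the locale owns it.
std::locale make_comma_locale()
{
    return std::locale(std::locale::classic(), new comma_decimal_point);
}

std::string format_number(double value)
{
    std::ostringstream out;
    out.imbue(make_comma_locale());
    out << value;
    return out.str();
}
```

Because the resulting locale has no name, its name() member reports the string "*".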
Constructs an object of type locale, which is a copy of other, except for the facets iden-
tified by the category c. These facets are taken from one. The resulting locale has a name
if, and only if, the two source locales have names.
Creates a copy of rhs that replaces the current value of this. Returns *this.
template<class Facet>
locale combine(const locale& other);
Returns a newly constructed object of type locale, which is a copy of *this, except for
the facet identified by Facet. This facet is taken from other. Throws the exception
runtime_error if has_facet<Facet>(other) returns false. The resulting locale
has no name.
~locale() throw();
OPERATORS
Returns true if both arguments are the same locale, or one is a copy of the other, or each
has a name and the names are identical. Returns false otherwise.
Compares two strings according to the collate facet of this, i.e., returns
This operator, and therefore locale itself, satisfies the requirements for a comparator
predicate for strings. For example, it can be used for instantiation of containers such as
map<string,T,locale> or for invocation of algorithms such as
sort(begin, end, locale("US")).
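A minimal sketch of a locale used as a comparator follows. It uses the classic locale, since names such as "US" are implementation-defined and may not be available; the function names are hypothetical.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: true if a collates before b in loc.
bool collates_before(const std::string& a, const std::string& b,
                     const std::locale& loc = std::locale::classic())
{
    return loc(a, b);   // uses the collate facet contained in loc
}

// Sorts a copy of words, passing the locale object itself as the
// comparison predicate.
std::vector<std::string> sorted_with_locale(
    std::vector<std::string> words,
    const std::locale& loc = std::locale::classic())
{
    std::sort(words.begin(), words.end(), loc);
    return words;
}
```

In the classic locale, uppercase ASCII letters collate before lowercase ones; a named locale may order the same strings differently.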
LOCALE OPERATIONS
The name of this, if this has one. Otherwise, the string "*". Valid locale names are
"C"; ""; and any implementation-defined locale name.
Sets the global locale to loc, which causes future calls to locale() to construct a copy of
loc. Returns the previous value set, if this function has been called before, or a copy of the
locale returned by a call to locale::classic().
Effects on the C locale: If loc has a name, sets the “C” locale accordingly, i.e., calls
setlocale(LC_ALL, loc.name().c_str()). Otherwise the effect on the C locale is
implementation-specific.
Returns a locale object that behaves according to the “C” locale semantics.
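The interplay of global(), classic(), and the default constructor can be sketched as follows. The function name is hypothetical; the sketch restores the previous global locale before returning so that it has no lasting effect.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: temporarily makes the classic locale global and
// reports the name of a default-constructed locale while it is in effect.
std::string name_of_global_after_reset()
{
    // global() returns the previously installed global locale.
    std::locale previous = std::locale::global(std::locale::classic());

    // locale() now constructs a snapshot of the new global locale.
    std::string name = std::locale().name();

    std::locale::global(previous);   // restore the earlier state
    return name;
}
```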
messages<charT>
CLASS TEMPLATE
template<class charT> class messages
header: <locale>
base class(es): locale::facet, messages_base
DESCRIPTION
messages is the facet that contains the functionality for the retrieval of localized message
strings from message catalogs.
The following instantiations are required: messages<char> and messages
<wchar_t>. Their behavior is implementation-specific. The standard specifies neither
how a message catalog is represented and organized nor what the syntax of the contained
messages is.
SYNOPSIS
namespace std {
template <class charT>
class messages : public locale::facet, public messages_base {
public:
// type definitions:
typedef charT char_type;
typedef basic_string<charT> string_type;
// data member:
static locale::id id;
// constructors:
explicit messages(size_t refs = 0);
// messages operations:
catalog open(const basic_string<char>& fn, const locale&) const;
string_type get(catalog c, int set, int msgid,
const string_type& dfault) const;
void close(catalog c) const;
protected:
// destructor:
virtual ~messages();
// messages operations:
virtual catalog do_open(const basic_string<char>&, const locale&) const;
virtual string_type do_get(catalog, int set, int msgid,
const string_type& dfault) const;
virtual void do_close(catalog) const;
};
}
virtual ~messages();
Closes the message catalog identified by c. c must be obtained from a previous opening
of a catalog that is not yet closed.
messages_base
CLASS
class messages_base
header: <locale>
base class(es): [none]
DESCRIPTION
messages_base is the base class of the messages facet. It provides a type definition for
catalog.
SYNOPSIS
namespace std {
class messages_base {
public:
typedef int catalog;
};
}
TYPE DEFINITIONS
Values of type catalog are usable as arguments to the messages facet’s member functions
get() and close() and can be obtained only by calling open().
messages_byname<charT>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT>
class messages_byname : public messages<charT>
{
public:
// type definitions:
typedef messages_base::catalog catalog;
typedef basic_string<charT> string_type;
// constructors:
explicit messages_byname(const char*, size_t refs = 0);
protected:
// destructor:
virtual ~messages_byname();
// messages operations:
virtual catalog do_open(const basic_string<char>&,
const locale&) const;
virtual string_type do_get(catalog, int set, int msgid,
const string_type& dfault) const;
virtual void do_close(catalog) const;
};
}
virtual ~messages_byname();
money_base
CLASS
class money_base
header: <locale>
base class(es): [none]
DESCRIPTION
money_base is the base class of all moneypunct facets. It provides the means to store and
exchange monetary format patterns.
SYNOPSIS
namespace std {
class money_base {
public:
enum part { none, space, symbol, sign, value };
struct pattern { char field[4]; };
};
}
TYPE DEFINITIONS
Enumerated type that represents the elements and their usage of a monetary format.
CONSTANT DEFINITIONS
The following list shows the predefined values and their semantics for the nested enu-
merated type part. The numeric values are implementation-dependent:
none
sign
space
optional whitespace character for parsing, required whitespace character for formatting
symbol
the currency symbol; usage of the currency symbol also depends on
ios_base::showbase (see money_put::do_put() and money_get::do_get())
value
money_get<charT,InputIterator>
CLASS TEMPLATE
DESCRIPTION
money_get is the facet that contains the functionality to parse a character sequence that
represents a monetary value. The bool value template parameter Inter specifies
whether the currency symbol used should be the international currency symbol
(Inter==true) or not (Inter==false).
The following instantiations are required: money_get<char> and money_get
<wchar_t>. The money_get template must be instantiable for all character types and
input iterator types. The behavior of the instantiations is implementation-specific for
character types other than char and wchar_t.
SYNOPSIS
namespace std {
template <class charT,
class InputIterator = istreambuf_iterator<charT> >
class money_get : public locale::facet {
public:
// type definitions:
typedef charT char_type;
typedef InputIterator iter_type;
typedef basic_string<charT> string_type;
// data members:
static locale::id id;
// constructors:
explicit money_get(size_t refs = 0);
// parsing operations:
iter_type get(iter_type s, iter_type end, bool intl, ios_base& f,
ios_base::iostate& err, long double& units) const;
iter_type get(iter_type s, iter_type end, bool intl, ios_base& f,
ios_base::iostate& err, string_type& digits) const;
protected:
// destructor:
virtual ~money_get();
// parsing operations:
virtual iter_type do_get(iter_type, iter_type, bool, ios_base&,
ios_base::iostate& err, long double& units) const;
virtual ~money_get();
Parses characters in the interval [s, end) to construct a monetary value. The monetary
value is constructed according to formatting flags in f.flags() and the
moneypunct<charT, intl> facet from f.getloc(). The parsing is done according to
use_facet<moneypunct<charT, intl> >(f.getloc()).neg_format().
Digit group separators are optional. If present, digit grouping is checked after all
syntactic elements have been read. If no grouping is specified, any thousands separator
characters encountered in the input sequence are not considered part of the numeric
format.
Where money_base::space or money_base::none appear in the format pattern,
except at the end, optional whitespace is consumed.
The interpretation of all elements from str.flags() other than ios_base::showbase
is implementation-defined. If (str.flags() & ios_base::showbase)
== false, the currency symbol is optional. If it appears after all other required syntactic
elements, it is not consumed. If (str.flags() & ios_base::showbase) == true, the
currency symbol is required and always consumed. The expected currency symbol is the
international one if intl == true; otherwise the domestic one.
If the first character of the sign appears in its correct position, any remaining sign
characters are required and consumed.
The result is a pure sequence of digits, representing a count of the smallest unit of
currency, which is then stored in units or digits respectively. If the parsed value is
negative, units is negated and digits is preceded by '-'.
The operation stops when it encounters an error, runs out of characters, or has
constructed a monetary value. The result of the operation is indicated by err.
Returns an iterator pointing immediately beyond the last character recognized as a
part of a valid monetary quantity.
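The parsing rules described above can be tried out with a small sketch. Because the punctuation of the classic locale differs between implementations, the sketch installs its own facet; the class SketchPunct, the function parse_money, and the chosen symbol, signs, and grouping are illustrative assumptions, not part of the library.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical punctuation facet for this sketch: "$" as currency symbol,
// two fraction digits, digit groups of three separated by ','.
class SketchPunct : public std::moneypunct<char, false> {
protected:
    char do_decimal_point() const { return '.'; }
    char do_thousands_sep() const { return ','; }
    std::string do_grouping() const { return "\3"; }
    std::string do_curr_symbol() const { return "$"; }
    std::string do_positive_sign() const { return ""; }
    std::string do_negative_sign() const { return "-"; }
    int do_frac_digits() const { return 2; }
    pattern do_pos_format() const
    { pattern p = { { symbol, sign, none, value } }; return p; }
    pattern do_neg_format() const
    { pattern p = { { symbol, sign, none, value } }; return p; }
};

// Parses text as a monetary value. On success, digits receives the pure
// digit sequence (count of the smallest currency unit), preceded by '-'
// if the value is negative. Returns false if failbit was set in err.
bool parse_money(const std::string& text, std::string& digits)
{
    std::istringstream in(text);
    in.imbue(std::locale(std::locale::classic(), new SketchPunct));

    const std::money_get<char>& mg =
        std::use_facet<std::money_get<char> >(in.getloc());

    std::ios_base::iostate err = std::ios_base::goodbit;
    std::istreambuf_iterator<char> begin(in), end;
    mg.get(begin, end, false, in, err, digits);  // intl == false

    return (err & std::ios_base::failbit) == 0;
}
```

With showbase not set, the currency symbol is optional, so under these assumptions both "$1,234.56" and "1,234.56" should yield the digit sequence "123456".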
moneypunct<charT,inter>
CLASS TEMPLATE
DESCRIPTION
moneypunct is the facet that contains the information about the format and punctuation
of monetary expressions. The bool value template parameter Inter specifies whether
the currency symbol should be the international currency symbol (Inter==true) or
not (Inter==false).
The following instantiations are required: moneypunct<char>, moneypunct
<char,true>, moneypunct<wchar_t>, and moneypunct<wchar_t,true>. Their
behavior is implementation-specific.
SYNOPSIS
namespace std {
template <class charT, bool International = false>
class moneypunct : public locale::facet, public money_base {
public:
// type definitions:
typedef charT char_type;
typedef basic_string<charT> string_type;
// data members and constant definitions:
static locale::id id;
static const bool intl = International;
// constructors:
explicit moneypunct(size_t refs = 0);
// moneypunct operations:
charT decimal_point() const;
charT thousands_sep() const;
string grouping() const;
string_type curr_symbol() const;
string_type positive_sign() const;
string_type negative_sign() const;
int frac_digits() const;
pattern pos_format() const;
pattern neg_format() const;
protected:
// destructor:
virtual ~moneypunct();
// moneypunct operations:
virtual charT do_decimal_point() const;
virtual charT do_thousands_sep() const;
virtual string do_grouping() const;
virtual string_type do_curr_symbol() const;
virtual string_type do_positive_sign() const;
virtual string_type do_negative_sign() const;
virtual int do_frac_digits() const;
virtual pattern do_pos_format() const;
virtual pattern do_neg_format() const;
};
}
virtual ~moneypunct();
Returns a string that represents the currency symbol. The bool value template parameter
Inter specifies whether the currency symbol should be the international currency symbol
(Inter==true) or not (Inter==false). The international currency symbol is always
four characters long, usually three letters and a space.
The returned value of type string is interpreted as an array of integral values of size
sizeof(char). It describes the way in which digits of the integral part of a numeric
value are grouped. Each character in the string is interpreted as an integer and specifies
the number of digits in a group, starting with the rightmost group. The last integer in the
string determines the size of all remaining groups. If the last integer is <= 0 or CHAR_MAX,
the described group is unlimited. If the string is empty, there is no grouping.
moneypunct<charT, Inter> returns the empty string, indicating no grouping.
Returns a value of type money_base::pattern that is the format pattern for positive
monetary values. The pattern specifies the order in which syntactic elements appear in
the monetary format. In this four-element array, each of the values symbol, sign, value, and
either space or none (defined in class money_base) appears exactly once. none, if present,
is not first; space, if present, is neither first nor last. Otherwise, the elements may
appear in any order.
For the required instantiations, namely moneypunct<char>, moneypunct
<wchar_t>, moneypunct<char,true>, and moneypunct<wchar_t,true>,
do_pos_format() returns the format pattern { symbol, sign, none, value }.
Returns a string that represents the indication of a positive monetary value. The first char-
acter of the string, if any, is placed at the position where the sign has to appear according
to the format pattern (see also the description of do_pos_format()). Any remaining
characters are placed after all other format elements.
Returns a string that represents the indication of a negative monetary value. The first
character of the string, if any, is placed at the position where the sign has to appear according
to the format pattern (see also the description of do_neg_format()). Any remaining
characters are placed after all other format elements.
Returns a value of type money_base::pattern that is the format pattern for negative
monetary values. The pattern specifies the order in which syntactic elements appear in
the monetary format. In this four-element array, each of the values symbol, sign, value, and
either space or none appears exactly once. none, if present, is not first; space, if present,
is neither first nor last. Otherwise, the elements may appear in any order.
For the required instantiations, namely moneypunct<char>, moneypunct
<wchar_t>, moneypunct<char,true>, and moneypunct<wchar_t,true>,
do_neg_format() returns the format pattern { symbol, sign, none, value }.
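How the format patterns and the sign strings interact is easiest to see with a derived facet. The following is a minimal sketch under stated assumptions: the class ParenPunct, the function format_money, and the chosen symbol, signs, and patterns are hypothetical and only serve to illustrate the rules above. In particular, the two-character negative sign "()" shows that the first sign character is placed at the sign position and the remaining characters after all other format elements.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical moneypunct facet: negative values are wrapped in parentheses.
class ParenPunct : public std::moneypunct<char, false> {
protected:
    char do_decimal_point() const { return '.'; }
    char do_thousands_sep() const { return ','; }
    std::string do_grouping() const { return "\3"; }
    std::string do_curr_symbol() const { return "$"; }
    std::string do_positive_sign() const { return ""; }
    std::string do_negative_sign() const { return "()"; }
    int do_frac_digits() const { return 2; }
    // Positive values: symbol first, then the value.
    pattern do_pos_format() const
    { pattern p = { { symbol, sign, none, value } }; return p; }
    // Negative values: sign position comes before the symbol.
    pattern do_neg_format() const
    { pattern p = { { sign, symbol, value, none } }; return p; }
};

// Formats a digit sequence (count of the smallest currency unit, optionally
// preceded by '-') using money_put and the facet above.
std::string format_money(const std::string& digits)
{
    std::ostringstream out;
    out.imbue(std::locale(std::locale::classic(), new ParenPunct));
    out.setf(std::ios_base::showbase);  // request the currency symbol

    const std::money_put<char>& mp =
        std::use_facet<std::money_put<char> >(out.getloc());
    mp.put(std::ostreambuf_iterator<char>(out), false, out, out.fill(), digits);
    return out.str();
}
```

Under these assumptions, format_money("123456") should produce "$1,234.56" and format_money("-123456") should produce "($1,234.56)".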
moneypunct_byname<charT,inter>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, bool Intl = false>
class moneypunct_byname : public moneypunct<charT, Intl> {
public:
// type definitions:
typedef money_base::pattern pattern;
typedef basic_string<charT> string_type;
// constructors:
explicit moneypunct_byname(const char*, size_t refs = 0);
protected:
// destructor:
virtual ~moneypunct_byname();
// moneypunct operations:
virtual charT do_decimal_point() const;
virtual charT do_thousands_sep() const;
virtual string do_grouping() const;
virtual string_type do_curr_symbol() const;
virtual string_type do_positive_sign() const;
virtual string_type do_negative_sign() const;
virtual int do_frac_digits() const;
virtual pattern do_pos_format() const;
virtual pattern do_neg_format() const;
};
}
virtual ~moneypunct_byname();
money_put<charT,OutputIterator>
CLASS TEMPLATE
DESCRIPTION
money_put is the facet that contains the functionality to generate a formatted character
sequence that represents a monetary value. The bool value template parameter Inter
specifies whether the currency symbol used should be the international currency symbol
(Inter==true) or not (Inter==false).
The following instantiations are required: money_put<char> and money_put
<wchar_t>. The money_put template must be instantiable for all character types and
output iterator types. The behavior of the instantiations is implementation-specific for
character types other than char and wchar_t.
SYNOPSIS
namespace std {
template <class charT,
class OutputIterator = ostreambuf_iterator<charT> >
class money_put : public locale::facet {
public:
// type definitions:
typedef charT char_type;
typedef OutputIterator iter_type;
typedef basic_string<charT> string_type;
// data members:
static locale::id id;
// constructors:
explicit money_put(size_t refs = 0);
// formatting operations:
iter_type put(iter_type s, bool intl, ios_base& f,
char_type fill, long double units) const;
iter_type put(iter_type s, bool intl, ios_base& f,
char_type fill, const string_type& digits) const;
protected:
// destructor:
virtual ~money_put();
// formatting operations:
virtual iter_type do_put(iter_type, bool, ios_base&, char_type fill,
long double units) const;
virtual iter_type do_put(iter_type, bool, ios_base&, char_type fill,
const string_type& digits) const;
};
}
virtual ~money_put();
num_get<charT,InputIterator>
CLASS TEMPLATE
class num_get
header: <locale>
base class(es): locale::facet
DESCRIPTION
num_get is the facet that contains the functionality to parse a character sequence that
represents a numeric or Boolean value.
The following instantiations are required: num_get<char> and num_get
<wchar_t>. The num_get template must be instantiable for all character types and
input iterator types. The behavior of the instantiations is implementation-specific for
character types other than char and wchar_t.
SYNOPSIS
namespace std {
template <class charT, class InputIterator=istreambuf_iterator<charT> >
class num_get : public locale::facet {
public:
// type definitions:
typedef charT char_type;
typedef InputIterator iter_type;
// data members:
static locale::id id;
// constructors:
explicit num_get(size_t refs = 0);
// parsing operations:
iter_type get(iter_type in, iter_type end, ios_base&,
ios_base::iostate& err, bool& v) const;
iter_type get(iter_type in, iter_type end, ios_base&,
ios_base::iostate& err, long& v) const;
iter_type get(iter_type in, iter_type end, ios_base&,
ios_base::iostate& err, unsigned short& v) const;
iter_type get(iter_type in, iter_type end, ios_base&,
ios_base::iostate& err, unsigned int& v) const;
iter_type get(iter_type in, iter_type end, ios_base&,
ios_base::iostate& err, unsigned long& v) const;
iter_type get(iter_type in, iter_type end, ios_base&,
ios_base::iostate& err, float& v) const;
Calls do_get(in,end,ib,err,b) and returns the result of this call.
Calls do_get(in,end,ib,err,l) and returns the result of this call.
Calls do_get(in,end,ib,err,s) and returns the result of this call.
Calls do_get(in,end,ib,err,ui) and returns the result of this call.
Calls do_get(in,end,ib,err,d) and returns the result of this call.
Calls do_get(in,end,ib,err,ld) and returns the result of this call.
Calls do_get(in,end,ib,err,p) and returns the result of this call.
virtual ~num_get();
Parses characters in the interval [s, end) to construct a bool value, which is then stored in
b. The value is constructed according to ib.flags() and the numpunct<charT> facet
from ib.getloc(). For details, see appendix A. The result of the operation is indicated
by err. Returns an iterator pointing immediately beyond the last character recognized as
part of a valid boolean quantity.
Parses characters in the interval [s, end) to construct a long value, which is then stored
in l. The value is constructed according to ib.flags() and the numpunct
<charT> facet from ib.getloc(). For details, see appendix A. The result of the opera-
tion is indicated by err. Returns an iterator pointing immediately beyond the last charac-
ter recognized as a part of a valid numeric quantity.
Parses characters in the interval [s, end) to construct an unsigned short value, which
is then stored in s. The value is constructed according to ib.flags() and the
numpunct<charT> facet from ib.getloc(). For details, see appendix A. The result of
the operation is indicated by err. Returns an iterator pointing immediately beyond the
last character recognized as part of a valid numeric quantity.
Parses characters in the interval [s, end) to construct an unsigned int value, which is
then stored in ui. The value is constructed according to ib.flags() and the
numpunct<charT> facet from ib.getloc(). For details, see appendix A. The result of
the operation is indicated by err. Returns an iterator pointing immediately beyond the
last character recognized as part of a valid numeric quantity.
Parses characters in the interval [s, end) to construct an unsigned long value, which is
then stored in ul. The value is constructed according to ib.flags() and the
numpunct<charT> facet from ib.getloc(). For details, see appendix A. The result of
the operation is indicated by err. Returns an iterator pointing immediately beyond the
last character recognized as part of a valid numeric quantity.
Parses characters in the interval [s, end) to construct a float value, which is then stored
in f. The value is constructed according to ib.flags() and the numpunct<charT>
facet from ib.getloc(). For details, see appendix A. The result of the operation is indi-
cated by err. Returns an iterator pointing immediately beyond the last character recog-
nized as part of a valid numeric quantity.
Parses characters in the interval [s, end) to construct a double value, which is then
stored in d. The value is constructed according to ib.flags() and the
numpunct<charT> facet from ib.getloc(). For details, see appendix A. The result of
the operation is indicated by err. Returns an iterator pointing immediately beyond the
last character recognized as part of a valid numeric quantity.
Parses characters in the interval [s, end) to construct a long double value, which is then
stored in ld. The value is constructed according to ib.flags() and the
numpunct<charT> facet from ib.getloc(). For details, see appendix A. The result of
the operation is indicated by err. Returns an iterator pointing immediately beyond the
last character recognized as part of a valid numeric quantity.
Parses characters in the interval [s, end) to construct a void* value, which is stored in p.
The value is constructed according to ib.flags() and the numpunct<charT> facet
from ib.getloc(). For details, see appendix A. The result of the operation is indicated
by err. Returns an iterator pointing immediately beyond the last character recognized as
part of a valid pointer.
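The calling convention shared by all num_get parsing operations can be sketched as follows. The helper parse_long is a hypothetical name used only for illustration; it obtains the facet from the stream's locale, passes the stream itself as the ios_base argument, and inspects err afterward. Note that the facet, unlike the stream extractors, does not skip leading whitespace.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Parses a long from the beginning of text by calling num_get directly.
// Returns false if the facet reported failbit through err.
bool parse_long(const std::string& text, long& value)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic());

    const std::num_get<char>& ng =
        std::use_facet<std::num_get<char> >(in.getloc());

    std::ios_base::iostate err = std::ios_base::goodbit;
    std::istreambuf_iterator<char> begin(in), end;

    // The returned iterator points immediately beyond the last character
    // recognized as part of the number; it is not needed in this sketch.
    ng.get(begin, end, in, err, value);

    return (err & std::ios_base::failbit) == 0;
}
```

A call such as parse_long("-17 apples", v) is expected to succeed and store -17, with the iterator left at the blank after the digits.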
numpunct<charT>
CLASS TEMPLATE
DESCRIPTION
numpunct is the facet that contains the information about the format and punctuation of
numeric and boolean expressions.
The following instantiations are required: numpunct<char> and numpunct
<wchar_t>. They provide classic “C” behavior. Details are given with the description of
the virtual protected member functions.
SYNOPSIS
namespace std {
template <class charT>
class numpunct : public locale::facet {
public:
// type definitions:
typedef charT char_type;
typedef basic_string<charT> string_type;
// data members:
static locale::id id;
// constructors:
explicit numpunct(size_t refs = 0);
// numpunct operations:
char_type decimal_point() const;
char_type thousands_sep() const;
string grouping() const;
string_type truename() const;
string_type falsename() const;
protected:
virtual ~numpunct();
// numpunct operations:
virtual char_type do_decimal_point() const;
virtual char_type do_thousands_sep() const;
virtual string do_grouping() const;
virtual string_type do_truename() const;
virtual string_type do_falsename() const;
};
}
virtual ~numpunct();
The returned value of type string is interpreted as an array of integral values of size
sizeof(char). It describes the way in which digits of the integral part of a numeric
value are grouped. Each character in the string is interpreted as an integer and specifies
the number of digits in a group, starting with the rightmost group. The last integer in the
string determines the size of all remaining groups. If the last integer is <= 0 or CHAR_MAX,
the described group is unlimited. If the string is empty, there is no grouping.
The base class implementation returns the empty string, indicating no grouping.
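The grouping rule can be illustrated with a derived facet. This is a minimal sketch; the class SketchGrouping and the function format_grouped are hypothetical names, and ',' is an arbitrarily chosen separator. The grouping string is passed in so that different group sizes can be compared.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical numpunct facet with a configurable grouping string.
class SketchGrouping : public std::numpunct<char> {
public:
    explicit SketchGrouping(const std::string& grouping)
        : grouping_(grouping) {}
protected:
    char do_thousands_sep() const { return ','; }
    // Each character is read as an integer: the size of one digit group,
    // starting with the rightmost group.
    std::string do_grouping() const { return grouping_; }
private:
    std::string grouping_;
};

// Formats value through a stream whose locale carries the facet above.
std::string format_grouped(long value, const std::string& grouping)
{
    std::ostringstream out;
    out.imbue(std::locale(std::locale::classic(),
                          new SketchGrouping(grouping)));
    out << value;
    return out.str();
}
```

With the grouping "\3" the value 1234567 should be written as "1,234,567". With "\2\3" the rightmost group has two digits and all remaining groups have three, giving "12,345,67". An empty grouping string leaves the digits ungrouped.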
numpunct_byname<charT>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT>
class numpunct_byname : public numpunct<charT> {
public:
// type definitions:
typedef charT char_type;
typedef basic_string<charT> string_type;
// constructors:
explicit numpunct_byname(const char*, size_t refs = 0);
protected:
// destructor:
virtual ~numpunct_byname();
// numpunct operations:
virtual char_type do_decimal_point() const;
virtual char_type do_thousands_sep() const;
virtual string do_grouping() const;
virtual string_type do_truename() const;
virtual string_type do_falsename() const;
};
}
virtual ~numpunct_byname();
num_put<charT,OutputIterator>
CLASS TEMPLATE
DESCRIPTION
num_put is the facet that contains the functionality to generate a formatted
character sequence that represents a numeric or Boolean value.
The following instantiations are required: num_put<char> and
num_put<wchar_t>. The num_put template must be instantiable for all character types
and output iterator types. The behavior of the instantiations is implementation-specific
for character types other than char and wchar_t.
SYNOPSIS
namespace std {
template <class charT, class OutputIterator=ostreambuf_iterator<charT> >
class num_put : public locale::facet {
public:
// type definitions:
typedef charT char_type;
typedef OutputIterator iter_type;
// data members:
static locale::id id;
// constructors:
explicit num_put(size_t refs = 0);
// formatting operations:
iter_type put(iter_type s, ios_base& f, char_type fill, bool v) const;
iter_type put(iter_type s, ios_base& f, char_type fill, long v) const;
iter_type put(iter_type s, ios_base& f, char_type fill,
unsigned long v) const;
iter_type put(iter_type s, ios_base& f, char_type fill,
double v) const;
iter_type put(iter_type s, ios_base& f, char_type fill,
long double v) const;
iter_type put(iter_type s, ios_base& f, char_type fill,
const void* v) const;
protected:
// destructor:
virtual ~num_put();
// formatting operations:
virtual iter_type do_put(iter_type, ios_base&, char_type fill,
bool v) const;
virtual iter_type do_put(iter_type, ios_base&, char_type fill,
long v) const;
virtual iter_type do_put(iter_type, ios_base&, char_type fill,
unsigned long v) const;
virtual iter_type do_put(iter_type, ios_base&, char_type fill,
double v) const;
virtual iter_type do_put(iter_type, ios_base&, char_type fill,
long double v) const;
virtual iter_type do_put(iter_type, ios_base&, char_type fill,
const void* v) const;
};
}
Calls do_put(s,ib,fill,b) and returns the result of this call.
Calls do_put(s,ib,fill,l) and returns the result of this call.
Calls do_put(s,ib,fill,d) and returns the result of this call.
Calls do_put(s,ib,fill,ld) and returns the result of this call.
Calls do_put(s,ib,fill,p) and returns the result of this call.
virtual ~num_put();
beyond the last element produced. It makes no provisions for error reporting. Any fail-
ures must be extracted from the returned iterator.
Produces a formatted character sequence that is a representation of the pointer value
contained in p. The formatting is done according to numpunct<charT> and ib.flags().
For details, see appendix B. The operation returns an iterator pointing one beyond the last
element produced. It makes no provisions for error reporting. Any failures must be
extracted from the returned iterator.
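Since the formatting operations report nothing through an error parameter, a caller has to examine the returned iterator. The following sketch shows one way to do this with the default iterator type ostreambuf_iterator, whose failed() member reports a failed write. The helper names put_long and format_long are assumptions made for this example.

```cpp
#include <iterator>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>

// Writes value to os by calling num_put directly. Returns false if the
// returned iterator indicates that a write to the stream buffer failed.
bool put_long(std::ostream& os, long value)
{
    const std::num_put<char>& np =
        std::use_facet<std::num_put<char> >(os.getloc());

    std::ostreambuf_iterator<char> result =
        np.put(std::ostreambuf_iterator<char>(os), os, os.fill(), value);

    return !result.failed();
}

// Convenience wrapper that formats into a string.
std::string format_long(long value)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());
    put_long(out, value);
    return out.str();
}
```

The stream is passed twice: once wrapped in the output iterator as the destination, and once as the ios_base object that supplies the format flags and the locale.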
time_base
CLASS
class time_base
header: <locale>
base class(es): [none]
DESCRIPTION
time_base is the base class of all time_get facets. It provides the means to describe the
order of elements that form a date.
SYNOPSIS
namespace std {
class time_base {
public:
enum dateorder { no_order, dmy, mdy, ymd, ydm };
};
}
TYPE DEFINITIONS
CONSTANT DEFINITIONS
The following list shows the predefined values and their semantics for the nested
enumerated type dateorder:
dmy
mdy
ydm
ymd
no_order
time_get<charT,InputIterator>
CLASS TEMPLATE
class time_get
header: <locale>
base class(es): locale::facet, time_base
DESCRIPTION
time_get is the facet that contains the functionality to parse a character sequence that
represents date and/or time.
The following instantiations are required: time_get<char> and
time_get<wchar_t>. The time_get template must be instantiable for all character
types and input iterator types. The behavior of the instantiations is implementation-
specific for character types other than char and wchar_t.
SYNOPSIS
namespace std {
template <class charT, class InputIterator=istreambuf_iterator<charT> >
class time_get : public locale::facet, public time_base {
public:
// type definitions:
typedef charT char_type;
typedef InputIterator iter_type;
// data members:
static locale::id id;
// constructors:
explicit time_get(size_t refs = 0);
// parsing operations:
iter_type get_time(iter_type s, iter_type end, ios_base& f,
ios_base::iostate& err, tm* t) const;
iter_type get_date(iter_type s, iter_type end, ios_base& f,
ios_base::iostate& err, tm* t) const;
iter_type get_weekday(iter_type s, iter_type end, ios_base& f,
ios_base::iostate& err, tm* t) const;
iter_type get_monthname(iter_type s, iter_type end, ios_base& f,
ios_base::iostate& err, tm* t) const;
iter_type get_year(iter_type s, iter_type end, ios_base& f,
ios_base::iostate& err, tm* t) const;
// miscellaneous:
dateorder date_order() const { return do_date_order(); }
protected:
virtual ~time_get();
// parsing operations:
virtual iter_type do_get_time(iter_type s, iter_type end, ios_base&,
ios_base::iostate& err, tm* t) const;
virtual iter_type do_get_date(iter_type s, iter_type end, ios_base&,
ios_base::iostate& err, tm* t) const;
virtual iter_type do_get_weekday(iter_type s, iter_type end, ios_base&,
ios_base::iostate& err, tm* t) const;
virtual iter_type do_get_monthname(iter_type s, iter_type end, ios_base&,
ios_base::iostate& err, tm* t) const;
virtual iter_type do_get_year(iter_type s, iter_type end, ios_base&,
ios_base::iostate& err, tm* t) const;
// miscellaneous:
virtual dateorder do_date_order() const;
};
}
PARSING OPERATIONS
MISCELLANEOUS
virtual ~time_get();
PARSING OPERATIONS
Parses characters in the interval [s, end) to construct the date-related values of struct
tm, and store them in t. The interpretation of f.flags() is implementation-defined. The
operation stops when it encounters an error, runs out of characters, or has read all
characters that can be consumed to construct the required date values. The result of the
operation is indicated by err. Returns an iterator pointing immediately beyond the last
character consumed as part of a valid date value.
virtual iter_type
do_get_monthname(iter_type s, iter_type end,
ios_base& f, ios_base::iostate& err, tm* t) const;
Parses characters in the interval [s, end) to construct the month-related value of struct
tm, and store it in t. The interpretation of f.flags() is implementation-defined. The
operation stops when it encounters an error, runs out of characters, or has read all
characters that can be consumed to construct the required month value. The input can be the
abbreviation of a month name. If it finds an abbreviation that is followed by characters
that could match a full name, it continues reading until it matches the full name or fails.
The result of the operation is indicated by err. Returns an iterator pointing immediately
beyond the last character consumed.
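A short sketch of calling get_monthname through the public interface follows. The helper parse_month is a hypothetical name; it uses the classic locale, in which month names are the English ones, and it returns the value stored in tm_mon (months since January) or -1 on failure.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Parses a month name or its abbreviation from the beginning of text.
// Returns tm_mon (0 for January ... 11 for December), or -1 if the facet
// reported failbit through err.
int parse_month(const std::string& text)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic());

    const std::time_get<char>& tg =
        std::use_facet<std::time_get<char> >(in.getloc());

    std::ios_base::iostate err = std::ios_base::goodbit;
    std::tm t = std::tm();  // all members zero-initialized
    std::istreambuf_iterator<char> begin(in), end;

    tg.get_monthname(begin, end, in, err, &t);

    return (err & std::ios_base::failbit) ? -1 : t.tm_mon;
}
```

Both the abbreviation and the full name are accepted, so parse_month("Mar") and parse_month("March") are both expected to return 2.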
Parses characters in the interval [s, end) to construct the time-related values of struct
tm and store them in t. Parsing is done against the time representation known to this.
The interpretation of f.flags() is implementation-defined. The operation stops when it
encounters an error, runs out of characters, or has read all characters that can be consumed
to construct the required time values. The result of the operation is indicated by err.
Returns an iterator pointing immediately beyond the last character consumed as part of a
valid time value.
virtual iter_type
do_get_weekday(iter_type s, iter_type end, ios_base& f,
ios_base::iostate& err, tm* t) const;
Parses characters in the interval [s, end) to construct the year-related value of struct tm
and store it in t. The interpretation of f.flags() is implementation-defined. The
operation stops when it encounters an error, runs out of characters, or has read all characters that
can be consumed to construct the required year value. It is implementation-defined
whether or not two-digit year numbers are accepted and, if so, what century they are
assumed to lie in. The result of the operation is indicated by err. Returns an iterator
pointing immediately beyond the last character consumed.
MISCELLANEOUS
Returns one of the values from time_base::dateorder, describing the order of day,
month, and year.
time_get_byname<charT,InputIterator>
CLASS TEMPLATE
class time_get_byname
header: <locale>
base class(es): time_get<charT, InputIterator>
DESCRIPTION
virtual ~time_get_byname();
time_put<charT,OutputIterator>
CLASS TEMPLATE
DESCRIPTION
time_put is the facet that contains the functionality to generate a formatted character
sequence that represents date and/or time.
The following instantiations are required: time_put<char> and
time_put<wchar_t>. The time_put template must be instantiable for all character
types and output iterator types. The behavior of the instantiations is implementation-
specific for character types other than char and wchar_t.
SYNOPSIS
namespace std {
template <class charT, class OutputIterator=ostreambuf_iterator<charT> >
class time_put : public locale::facet {
public:
// type definitions:
typedef charT char_type;
typedef OutputIterator iter_type;
// data members:
static locale::id id;
// constructors:
explicit time_put(size_t refs = 0);
// formatting operations:
iter_type put(iter_type s, ios_base& f, char_type fill, const tm* tmb,
const charT* pattern, const charT* pat_end) const;
iter_type put(iter_type s, ios_base& f, char_type fill,
const tm* tmb, char format, char modifier = 0) const;
protected:
// destructor:
virtual ~time_put();
// formatting operations:
virtual iter_type do_put(iter_type s, ios_base&,char_type, const tm* t,
char format, char modifier) const;
};
}
Calls do_put(s,f,fill,tmb,form,mod) and returns the result of this call.
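The single-character form of put can be exercised with a small sketch. The helpers make_date and format_field are hypothetical names introduced for this example; the format characters are the conversion specifiers known from the C function strftime, and the classic locale is assumed.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Builds a tm value for the given calendar date; all other members are zero.
std::tm make_date(int year, int month, int day)
{
    std::tm t = std::tm();
    t.tm_year = year - 1900;  // years since 1900
    t.tm_mon  = month - 1;    // months since January
    t.tm_mday = day;
    return t;
}

// Formats one field of t, selected by a strftime-style conversion
// character such as 'Y', 'm', or 'd', by calling time_put directly.
std::string format_field(const std::tm& t, char format)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());

    const std::time_put<char>& tp =
        std::use_facet<std::time_put<char> >(out.getloc());
    tp.put(std::ostreambuf_iterator<char>(out), out, out.fill(), &t, format);

    return out.str();
}
```

For the date January 15, 2000, the conversion characters 'Y', 'm', and 'd' are expected to yield "2000", "01", and "15" respectively.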
virtual ~time_put();
time_put_byname<charT,OutputIterator>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class OutputIterator=ostreambuf_iterator<charT> >
class time_put_byname : public time_put<charT, OutputIterator> {
public:
// type definitions:
typedef charT char_type;
typedef OutputIterator iter_type;
// constructors:
explicit time_put_byname(const char*, size_t refs = 0);
protected:
// destructor:
virtual ~time_put_byname();
// formatting operations:
virtual iter_type do_put(iter_type s, ios_base&,char_type, const tm* t,
char format, char modifier) const;
};
}
virtual ~time_put_byname();
tm
STRUCTURE
struct tm
header: <ctime>
DESCRIPTION
tm is a structure from the C Library that is used by the time facets and represents a
time/date value. The tm structure contains at least the following members, in any order.
SYNOPSIS
struct tm {
int tm_sec;
int tm_min;
int tm_hour;
int tm_mday;
int tm_mon;
int tm_year;
int tm_wday;
int tm_yday;
int tm_isdst;
};
int tm_sec;
seconds after the minute, that is, [0, 60]. The range [0, 60] for tm_sec allows for a positive
leap second.
int tm_min;
int tm_hour;
int tm_mday;
int tm_mon;
int tm_year;
int tm_wday;
int tm_yday;
int tm_isdst;
daylight saving time flag. The value of tm_isdst is positive if daylight saving time is in
effect, zero if daylight saving time is not in effect, and negative if the information is not
available.
2
CHARACTER TRAITS
<string>
DESCRIPTION
The header file contains all declarations for the template class char_traits and its
specializations. It also contains all declarations for the C++ string classes.
SYNOPSIS
namespace std {
// general template:
template<class charT> struct char_traits;
char_traits<charT>
CLASS TEMPLATE
template<class charT>
struct char_traits { }
header: <string>
DESCRIPTION
char_traits is a (possibly empty) struct template and serves as a basis for explicit
specializations.
char_traits<char>
TEMPLATE SPECIALIZATION
DESCRIPTION
SYNOPSIS
namespace std {
template<>
struct char_traits<char> {
// type definitions:
typedef char char_type;
typedef int int_type;
typedef streamoff off_type;
typedef streampos pos_type;
typedef mbstate_t state_type;
TYPE DEFINITIONS
Performs c1 = c2.
Returns 0 if s1[i] == s2[i] for each i in the interval [0,n); else a value < 0 if, for some
j in [0,n), s1[j] < s2[j] and, for each i in [0,j), s1[i] == s2[i]. Otherwise
returns a value > 0.
If s2 is not in the interval [s1,s1+n), the operation performs s1[i] = s2[i] for each i
in the interval [0,n) and returns s1; otherwise the behavior is undefined.
Returns the smallest pointer p in the interval [s,s+n), such that *p == a. If no such
pointer exists, the return value is 0.
Returns the smallest value n of type size_t, such that s[n] == char(0).
Returns (c1<c2).
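The operations described above are static member functions and can be called directly. The following sketch combines a few of them; the helper names index_of and compare_sign and the typedef traits are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>

typedef std::char_traits<char> traits;

// Index of the first occurrence of c in the null-terminated string s,
// or -1 if c does not occur.
std::ptrdiff_t index_of(const char* s, char c)
{
    const std::size_t n = traits::length(s);  // smallest n with s[n] == char(0)
    const char* p = traits::find(s, n, c);    // smallest p in [s,s+n) with *p == c, else 0
    return p ? p - s : -1;
}

// Reduces the result of compare() to -1, 0, or +1, since only the sign
// of the returned value is specified.
int compare_sign(const char* s1, const char* s2, std::size_t n)
{
    const int r = traits::compare(s1, s2, n);
    return (r > 0) - (r < 0);
}
```

For instance, index_of("locale", 'a') should yield 3, and compare_sign("abc", "abd", 3) should yield -1 because the strings first differ at index 2, where 'c' is less than 'd'.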
char_traits<wchar_t>
TEMPLATE SPECIALIZATION
DESCRIPTION
SYNOPSIS
namespace std {
template<>
struct char_traits<wchar_t> {
// type definitions:
typedef wchar_t char_type;
typedef wint_t int_type;
typedef streamoff off_type;
typedef wstreampos pos_type;
typedef mbstate_t state_type;
TYPE DEFINITIONS
Performs c1 = c2.
static int compare(const char_type* s1, const char_type* s2, size_t n);
Returns 0 if s1[i] == s2[i] for each i in the interval [0,n); or else a value < 0 if, for
some j in [0,n), s1[j] < s2[j] and, for each i in [0,j), s1[i] == s2[i]. Otherwise
returns a value > 0.
If s2 is not in the interval [s1,s1+n), the operation performs s1[i] = s2[i] for each i
in the interval [0,n) and returns s1; otherwise the behavior is undefined.
Returns the smallest pointer p in the interval [s,s+n), such that *p == a. If no such
pointer exists, the return value is 0.
Returns the smallest value n of type size_t, such that s[n] == wchar_t(0).
Performs s1[i] = s2[i] for each i in the interval [0,n) and returns s1+n.
3
IOSTREAMS
<iosfwd>
DESCRIPTION
The header file contains forward declarations for the following:
• the specializations of the template class char_traits
• all IOStreams class templates
• the type definitions for the narrow- and wide-character stream types
• the template class fpos
• the type definitions for specializations of fpos
SYNOPSIS
namespace std {
template<class charT> class char_traits;
template<> class char_traits<char>;
template<> class char_traits<wchar_t>;
// Class definitions:
class basic_ostream;
class basic_iostream;
class basic_filebuf;
class basic_ifstream;
class basic_fstream;
type definitions:
<iostream>
DESCRIPTION
The header file contains declarations of the predefined global stream objects, namely cin,
cout, cerr, clog, wcin, wcout, wcerr, and wclog.
SYNOPSIS
namespace std {
extern istream cin;
extern ostream cout;
extern ostream cerr;
extern ostream clog;
<ios>
DESCRIPTION
The header file contains declarations for the stream base classes ios_base and
basic_ios<class charT, class traits>.
SYNOPSIS
#include <iosfwd>
namespace std {
typedef OFF_T streamoff;
typedef SZ_T streamsize;
template <class stateT> class fpos;
class ios_base;
template <class charT, class traits = char_traits<charT> >
class basic_ios;
// manipulators:
ios_base& boolalpha (ios_base& str);
ios_base& noboolalpha(ios_base& str);
// adjustfield:
ios_base& internal (ios_base& str);
// basefield:
ios_base& dec (ios_base& str);
ios_base& hex (ios_base& str);
ios_base& oct (ios_base& str);
// floatfield:
ios_base& fixed (ios_base& str);
<streambuf>
DESCRIPTION
The header file contains declarations for the stream buffer class template
basic_streambuf<class charT, class traits> and type definitions for its
specializations.
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_streambuf;
typedef basic_streambuf<char> streambuf;
typedef basic_streambuf<wchar_t> wstreambuf;
<istream>
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_istream;
typedef basic_istream<char> istream;
typedef basic_istream<wchar_t> wistream;
<ostream>
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_ostream;
typedef basic_ostream<char> ostream;
typedef basic_ostream<wchar_t> wostream;
<iomanip>
DESCRIPTION
The header file contains declarations for the standard manipulators, i.e., for the function
templates resetiosflags, setiosflags, setbase, setfill, setprecision, and
setw.
SYNOPSIS
namespace std {
// Types T1, T2, ... are unspecified implementation types
T1 resetiosflags(ios_base::fmtflags mask);
T2 setiosflags (ios_base::fmtflags mask);
T3 setbase(int base);
template<class charT> T4 setfill(charT c);
T5 setprecision(int n);
T6 setw(int n);
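As an illustration of the manipulators declared above, the following sketch formats values into a string stream. The helper names hex_field and fixed_text are assumptions made for this example only.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: renders value as a hexadecimal number that is
// padded with '0' to at least `digits` characters.
std::string hex_field(unsigned long value, int digits) {
    assert(digits > 0);
    std::ostringstream out;
    out << std::setbase(16)     // same effect as the hex manipulator
        << std::setfill('0')    // pad with '0' instead of ' '
        << std::setw(digits)    // applies to the next insertion only
        << value;
    return out.str();
}

// Hypothetical helper: renders value in fixed notation with prec decimals.
std::string fixed_text(double value, int prec) {
    assert(prec >= 0);
    std::ostringstream out;
    out << std::setiosflags(std::ios_base::fixed)
        << std::setprecision(prec)
        << value;
    return out.str();
}
```

Note that setw() only sets a minimum field width: a value that needs more characters than requested is written in full.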
<sstream>
DESCRIPTION
The header file contains declarations for the string stream classes:
• the string stream buffer class template basic_stringbuf<class charT,
class traits, class Allocator>
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT>,
class Allocator = allocator<charT> >
class basic_stringbuf;
typedef basic_stringbuf<char> stringbuf;
typedef basic_stringbuf<wchar_t> wstringbuf;
<fstream>
DESCRIPTION
The header file contains declarations for the file stream classes:
• the file stream buffer class template basic_filebuf<class charT, class traits>
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_filebuf;
typedef basic_filebuf<char> filebuf;
typedef basic_filebuf<wchar_t> wfilebuf;
global type definitions
header: <iosfwd>
TYPE DEFINITIONS
global objects
HEADER
header: <iostream>
OBJECTS
DESCRIPTION
cin is an input stream object that handles characters of type char. Its stream buffer is
initially associated with stdin from the C standard I/O. Also, it is initially tied to cout, i.e.,
cin.tie() returns &cout.
cout is an output stream object that handles characters of type char. Its stream buffer is
initially associated with stdout from the C standard I/O.
cerr is an output stream object that handles characters of type char. Its stream buffer is
initially associated with stderr from the C standard I/O. Also, it is initially configured
to pass the data received by one output operation on the stream level directly to the
external device; i.e., cerr.flags() & unitbuf is nonzero.
clog is an output stream object that handles characters of type char. Its stream buffer is
initially associated with stderr from the C standard I/O.
wcin is an input stream object that handles characters of type wchar_t. Its stream buffer
is initially associated with stdin from the C standard I/O. Also, it is initially tied to
wcout; i.e., wcin.tie() returns &wcout.
wcout is an output stream object that handles characters of type wchar_t. Its stream
buffer is initially associated with stdout from the C standard I/O.
wcerr is an output stream object that handles characters of type wchar_t. Its stream
buffer is initially associated with stderr from the C standard I/O. Also, it is initially
configured to pass the data received by one output operation on the stream level directly to
the external device; i.e., wcerr.flags() & unitbuf is nonzero.
wclog is an output stream object that handles characters of type wchar_t. Its stream
buffer is initially associated with stderr from the C standard I/O.
The predefined global streams listed above are initialized in such a way that they can be
used in constructors and destructors of static and global objects.
The predefined global streams are by default synchronized with their associated C
standard files. You can switch off the synchronization by calling
ios_base::sync_with_stdio(false).
The relationship between the global narrow-character streams and their wide-character
counterparts is undefined, which means that you do not know what is going to
happen if you use both in the same program.
The difference between clog and cerr is that clog is fully buffered, whereas output
to cerr is written to the external device immediately after formatting. The same
holds for wclog and wcerr.
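The initial settings described above can be checked at run time. The following is a minimal sketch; the function names are hypothetical, and the checks assume that the program has not yet changed the ties or format flags of the predefined streams.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical self-check: verifies the documented initial configuration
// of the predefined streams.
void check_initial_stream_setup() {
    assert(std::cin.tie() == &std::cout);                       // cin is tied to cout
    assert(std::wcin.tie() == &std::wcout);                     // wcin is tied to wcout
    assert((std::cerr.flags() & std::ios_base::unitbuf) != 0);  // cerr flushes after each output
    assert((std::clog.flags() & std::ios_base::unitbuf) == 0);  // clog stays buffered
}

// Hypothetical helper: switches off the synchronization with the C standard
// files and returns whether the streams were synchronized before the call.
bool unsync_from_stdio() {
    return std::ios_base::sync_with_stdio(false);
}
```

Switching the synchronization off is best done before any input or output has taken place on the predefined streams.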
basic_filebuf<charT, traits>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_filebuf : public basic_streambuf<charT,traits> {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
typedef traits traits_type;
// constructors/destructor:
basic_filebuf();
virtual ~basic_filebuf();
// open/close:
bool is_open() const;
basic_filebuf<charT,traits>* open(const char* s,
ios_base::openmode mode);
basic_filebuf<charT,traits>* close();
protected:
virtual streamsize showmanyc();
virtual int_type underflow();
virtual int_type uflow();
virtual int_type pbackfail(int_type c = traits::eof());
virtual int_type overflow (int_type c = traits::eof());
virtual basic_streambuf<charT,traits>*
setbuf (char_type* s, streamsize n);
basic_filebuf();
If is_open() == true, the associated file is closed and this is returned. Otherwise the
return value is 0.
Returns true if a previous call to open () succeeded and thereafter no successful call to
close() has been processed. Otherwise the return value is false.
basic_filebuf<char_type,traits_type >*
open(const char *filename, ios_base::openmode mode);
Opens the file specified by filename according to the mode. Returns this if successful,
otherwise 0.
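The return conventions of open(), close(), and is_open() can be sketched as follows. The helper write_with_filebuf and the way it reports errors are assumptions made for this example.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: writes text to the file at path through a raw file
// buffer and reports whether every step succeeded.
bool write_with_filebuf(const char* path, const std::string& text) {
    assert(path != 0);
    std::filebuf fb;
    if (fb.is_open())
        return false;                                   // a new buffer is not open
    if (fb.open(path, std::ios_base::out | std::ios_base::trunc) == 0)
        return false;                                   // open() yields 0 on failure
    std::streamsize wanted = static_cast<std::streamsize>(text.size());
    std::streamsize written = fb.sputn(text.data(), wanted);
    bool closed = (fb.close() != 0);                    // close() yields this on success
    return closed && written == wanted && !fb.is_open();
}
```

Checking the result of close() matters because buffered characters are written to the file at that point at the latest.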
Same functional description as for the base class. Redefined here to deal with the special
constraints of a file buffer.
Makes space available in the put area by transferring characters from the put area to the
external file. Depending on the contained locale, the transferred characters are converted
to an external character representation. It is implementation-dependent how many
characters from the put area are transferred to the external file. It is also
implementation-dependent if c is transferred to the external file, for
traits_type::eq_int_type(c, traits_type::eof()) == false. The return value is
traits_type::not_eof(c) if the operation succeeded. Otherwise it is traits_type::eof().
Same functional description as for the base class. Redefined here to deal with the special
constraints of a file buffer.
virtual pos_type
seekoff(off_type off, ios_base::seekdir dir,
ios_base::openmode mode = ios_base::in | ios_base::out);
Alters the file position. The requested new file position is described by off and dir in the
following way: dir describes the base position and off is used to determine the offset
relative to the base position. If the code conversion facet of the locale contained in this
indicates that each internal character is converted to a constant number of external
characters (i.e., use_facet<codecvt<charT, char, typename traits::state_type>
>(this->getloc()).encoding() > 0), the offset relative to the base position will be
off multiplied by this constant number. Otherwise the repositioning will fail, if off != 0.
The return value is pos_type(off_type(-1)) in case of failure.
virtual pos_type
seekpos (pos_type pos,
ios_base::openmode mode = ios_base::in | ios_base::out);
Alters the file position. The following table shows how mode affects the operation:
CONDITION                       EFFECT
(mode & basic_ios::in) != 0     sets the file position to pos, then
                                updates the input sequence
(mode & basic_ios::out) != 0    sets the file position to pos, then
                                updates the output sequence
otherwise                       operation fails
If pos is an invalid stream position, the operation fails. If pos has not been obtained by a
previous successful call to a positioning function, the effect is undefined. If successful the
return value is pos, otherwise an invalid stream position.
virtual basic_streambuf<char_type,traits_type>*
setbuf(char_type* s, streamsize n);
If setbuf(0,0) is called before any I/O has occurred, the stream that holds this file
buffer becomes unbuffered; i.e., pbase() and pptr() always return 0 and output to
the file appears as soon as possible. The return value is this. For other parameters the
operation behaves in an implementation-dependent manner.
Overrides the base class functionality if it is able to determine more available characters.
Same functional description as for the base class. Redefined here to deal with the special
constraints of a file buffer.
If a put area exists, that is, if there are characters that have not yet been written, calls
filebuf::overflow() to write the characters to the file. If a get area exists, the effect is
implementation-defined.
Same functional description as for the base class, except that it uses the mechanisms as
described in basic_filebuf::underflow() to make characters available.
Reads new characters from the external file and converts them, if necessary, to the internal
character representation. The resulting characters, either from the read or the conversion,
are put into the get area, and the pointers of the get and put areas are updated. The return
value is the first newly read character. If no new characters could be made available, the
return value is traits_type::eof().
basic_fstream<charT,traits>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits=char_traits<charT> >
class basic_fstream : public basic_iostream<charT,traits> {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
typedef traits traits_type;
// constructors:
basic_fstream();
explicit basic_fstream(const char* s,
ios_base::openmode mode = ios_base::in|ios_base::out);
// file stream operations:
basic_filebuf<charT,traits>* rdbuf() const;
bool is_open();
void open(const char* s,
ios_base::openmode mode = ios_base::in|ios_base::out);
void close();
private:
// basic_filebuf<charT,traits> sb; exposition only
};
}
basic_fstream();
explicit
basic_fstream(const char* filename,
ios_base::openmode mode = ios_base::in | ios_base::out);
void close();
bool is_open();
basic_ifstream<charT,traits>
CLASS TEMPLATE
DESCRIPTION
basic_ifstream is an input stream that can be associated with a file.
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_ifstream : public basic_istream<charT,traits> {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
typedef traits traits_type;
// constructors:
basic_ifstream();
explicit basic_ifstream(const char* s,
ios_base::openmode mode = ios_base::in);
// input file stream operations:
basic_filebuf<charT,traits>* rdbuf() const;
bool is_open();
void open(const char* s, ios_base::openmode mode = ios_base::in);
void close();
private:
// basic_filebuf<charT,traits> sb; exposition only
};
}
basic_ifstream();
explicit
basic_ifstream(const char* filename,
ios_base::openmode mode = ios_base::in);
void close();
bool is_open();
basic_ios<charT,traits>
CLASS TEMPLATE
DESCRIPTION
basic_ios is the character-type and traits-type-dependent base class for all streams. It
encapsulates the common character-type-dependent functionality of a stream, e.g.,
setting and getting the stream buffer.
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_ios : public ios_base {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
typedef traits traits_type;
// constructor/destructor:
explicit basic_ios(basic_streambuf<charT,traits>* sb);
virtual ~basic_ios();
// error state:
operator void*() const;
bool operator!() const;
iostate rdstate() const;
void clear(iostate state = goodbit);
void setstate(iostate state);
bool good() const;
bool eof() const;
bool fail() const;
bool bad() const;
// exceptions handling:
iostate exceptions() const;
void exceptions(iostate except);
// locales:
locale imbue(const locale& loc);
char narrow(char_type c, char dfault) const;
explicit
basic_ios(basic_streambuf<char_type,traits_type>* sb);
ERROR STATE
Returns this->fail().
Returns (this->rdstate() == 0).
EXCEPTION HANDLING
Sets this to cause exceptions for all elements specified in except and calls
this->clear(rdstate()).
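The two ways of detecting failures, via the stream state and via the exception mask, can be contrasted in a small sketch. Both helper functions are hypothetical and only serve as an illustration.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: parses an int and reports failure through the
// stream state.
bool parse_int(const std::string& text, int& value) {
    std::istringstream in(text);
    in >> value;
    if (!in) {                  // operator!() returns fail()
        in.clear();             // resets the state to goodbit
        assert(in.good());
        return false;
    }
    return true;
}

// Hypothetical helper: same parse, but failbit and badbit raise exceptions.
bool parse_int_throwing(const std::string& text, int& value) {
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    try {
        in >> value;            // throws ios_base::failure on a parse error
    } catch (const std::ios_base::failure&) {
        return false;
    }
    return true;
}
```

Reaching the end of the input while reading a valid number sets only eofbit, which is not part of the exception mask used here, so no exception is raised in that case.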
LOCALES
Calls ios_base::imbue(loc) and, if rdbuf() != 0, rdbuf()->imbue(loc). Returns
the value returned by ios_base::imbue(loc), i.e., the previously imbued locale.
Returns the char representation that corresponds to c, if it exists; otherwise dfault, using
the ctype facet of the stream’s locale, i.e., calls use_facet<ctype<char_type> >
(getloc()).narrow(c,dfault) and returns the resulting character.
Returns the char_type representation that corresponds to c, using the ctype facet of
the stream’s locale, i.e., calls use_facet<ctype<char_type> >(getloc()).widen(c)
and returns the resulting character.
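A short sketch of narrow() and widen() on a wide-character stream follows. The helpers narrow_all and widen_all are hypothetical; the stream is created only to supply a locale, which is the global locale in effect at that time.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: converts text character by character with the
// stream's narrow(), substituting dfault where no char equivalent exists.
std::string narrow_all(const std::wstring& text, char dfault) {
    std::wostringstream ws;     // supplies the locale whose ctype facet is used
    std::string result;
    for (std::wstring::size_type i = 0; i < text.size(); ++i)
        result += ws.narrow(text[i], dfault);
    assert(result.size() == text.size());
    return result;
}

// Hypothetical helper: the opposite direction, using widen().
std::wstring widen_all(const std::string& text) {
    std::wostringstream ws;
    std::wstring result;
    for (std::string::size_type i = 0; i < text.size(); ++i)
        result += ws.widen(text[i]);
    return result;
}
```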
MISCELLANEOUS
Assigns to the data members of this the corresponding data members of rhs.
this->rdstate() and this->rdbuf() are left unchanged. Before copying any part of rhs,
all registered callbacks cb are called together with their index ix as
(*cb)(erase_event, *this, ix). After all parts except this->exceptions() have
been replaced, all callbacks cb that were copied from rhs are called together with their
index ix as (*cb)(copyfmt_event, *this, ix). Finally this->exceptions() is
altered by calling this->exceptions(rhs.exceptions()). The return value is
*this.
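The effect of copyfmt() can be illustrated with a stream that takes over the formatting of another one. format_like is a hypothetical helper written for this sketch.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: formats value with the settings of model, without
// writing to model itself.
std::string format_like(const std::ostream& model, int value) {
    std::ostringstream out;
    out.copyfmt(model);         // copies flags, fill, precision, width, locale, ...
    // The stream state and the stream buffer are not copied.
    assert(out.rdstate() == std::ios_base::goodbit);
    out << value;
    return out.str();
}
```

Because the field width is among the copied settings, a width set on the model applies to the first insertion into the copy.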
Returns the character used to pad (fill) an output field to the specified width.
Causes this to use the character c for padding (filling). Returns the previous value of
this->fill().
basic_streambuf<charT, traits>*
rdbuf(basic_streambuf<char_type,traits_type>* sb);
sb replaces the current stream buffer pointer contained in this. Additionally,
this->clear() is called. Returns the previous value of this->rdbuf().
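Replacing the stream buffer is a common way to redirect output temporarily. The following sketch assumes a hypothetical helper capture that takes the writing code as a function pointer.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: runs write(out) while the output of out is
// redirected into a string, then restores the original stream buffer.
std::string capture(std::ostream& out, void (*write)(std::ostream&)) {
    assert(write != 0);
    std::ostringstream sink;
    std::streambuf* old = out.rdbuf(sink.rdbuf());  // returns the previous buffer
    write(out);
    out.rdbuf(old);                                 // restore; also calls clear()
    return sink.str();
}
```

The original buffer must be restored before sink is destroyed; otherwise out would keep a pointer to a stream buffer that no longer exists.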
Returns the pointer to an output stream that is tied to, i.e., synchronized with, this.
basic_ostream<char_type, traits_type>*
tie(basic_ostream<char_type, traits_type>* str);
Ties (synchronizes) an output stream to (with) this. Returns the previous value of
this->tie().
basic_ios();
basic_iostream<charT,traits>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_iostream : public basic_istream<charT,traits>,
public basic_ostream<charT,traits> {
public:
// constructor/destructor
explicit basic_iostream(basic_streambuf<charT,traits>* sb);
virtual ~basic_iostream();
};
}
basic_istream<charT,traits>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_istream : virtual public basic_ios<charT,traits> {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
typedef traits traits_type;
// prefix/suffix:
class sentry;
// constructor/destructor:
explicit basic_istream(basic_streambuf<charT,traits>* sb);
virtual ~basic_istream();
// formatted input:
basic_istream<charT, traits>& operator>>(bool& n);
basic_istream<charT, traits>& operator>>(short& n);
basic_istream<charT, traits>& operator>>(unsigned short& n);
basic_istream<charT, traits>& operator>>(int& n);
basic_istream<charT, traits>& operator>>(unsigned int& n);
basic_istream<charT, traits>& operator>>(long& n);
basic_istream<charT, traits>& operator>>(unsigned long& n);
basic_istream<charT, traits>& operator>>(float& f);
basic_istream<charT, traits>& operator>>(double& f);
basic_istream<charT, traits>& operator>>(long double& f);
basic_istream<charT, traits>& operator>>(void*& p);
basic_istream<charT, traits>& operator>>
(basic_streambuf<char_type, traits>* sb);
// unformatted input:
streamsize gcount() const;
int_type get();
basic_istream<charT, traits>& get(char_type& c);
basic_istream<charT, traits>& get(char_type* s, streamsize n);
basic_istream<charT, traits>& get(char_type* s, streamsize n,
char_type delim);
basic_istream<charT, traits>& get (basic_streambuf<char_type,
traits>& sb);
basic_istream<charT, traits>& get (basic_streambuf<char_type,
traits>& sb, char_type delim);
basic_istream<charT, traits>& getline(char_type* s, streamsize n);
basic_istream<charT, traits>& getline(char_type* s, streamsize n,
char_type delim);
basic_istream<charT, traits>& ignore(streamsize n = 1,
int_type delim = traits::eof());
int_type peek();
basic_istream<charT, traits>& read (char_type* s, streamsize n);
streamsize readsome(char_type* s, streamsize n);
// manipulator extractors:
basic_istream<charT, traits>& operator>>
(basic_istream<charT, traits>& (*pf) (basic_istream<charT, traits>&) );
basic_istream<charT, traits>& operator>>
};
// character extraction:
template<class charT, class traits> basic_istream<charT, traits>&
operator>>(basic_istream<charT,traits>&, charT&);
template<class traits> basic_istream<char, traits>&
operator>>(basic_istream<char,traits>&, unsigned char&);
template<class traits> basic_istream<char,traits>&
operator>>(basic_istream<char,traits>&, signed char&);
CLASS DEFINITIONS
PREFIX/SUFFIX
class sentry
{
public:
explicit sentry (basic_istream<char_type,traits_type>& is,
bool noskipws = false);
~sentry();
operator bool();
};
sentry defines a helper class that handles exception-safe preparations for formatted and
unformatted input.
The constructor calls is.tie()->flush(), if is.tie() != 0. If noskipws == false
and (is.flags() & ios_base::skipws) != 0, it extracts and discards each
available whitespace character from the input. The locale that is currently imbued in is is
used to determine if a character from the input is a whitespace character.
If, after these operations, is.good() == true, this->operator bool() returns
true, otherwise false. The constructor may also call is.setstate(ios_base::failbit)
to indicate an error.
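A user-defined input function would typically create a sentry object before it touches the stream buffer. The following is a sketch under that assumption; read_digit is a hypothetical function, not a library facility.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper: extracts one decimal digit after skipping leading
// whitespace. Returns -1 and sets failbit if no digit is available.
int read_digit(std::istream& is) {
    typedef std::istream::traits_type traits;
    int digit = -1;
    std::istream::sentry guard(is);     // flushes is.tie() and skips whitespace
    if (guard) {                        // true only if is is still good
        std::istream::int_type c = is.rdbuf()->sgetc();
        if (!traits::eq_int_type(c, traits::eof()) && c >= '0' && c <= '9') {
            is.rdbuf()->sbumpc();       // consume the digit
            digit = c - '0';
        } else {
            is.setstate(std::ios_base::failbit);
        }
    }
    assert(digit >= -1 && digit <= 9);
    return digit;
}
```

A character that is not a digit stays in the stream, so the caller can call clear() and continue with a different input function.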
FORMATTED INPUT
basic_istream<char_type, traits_type>&
operator>>(bool& b);
basic_istream<char_type,traits_type>&
operator>>(short& s);
basic_istream<char_type, traits_type>&
operator>>(unsigned short& us);
basic_istream<char_type, traits_type>&
operator>>(int& i);
basic_istream<char_type, traits_type>&
operator>>(unsigned int& ui);
basic_istream<char_type, traits_type>&
operator>>(long& l);
basic_istream<char_type, traits_type>&
operator>>(unsigned long& ul);
basic_istream<char_type, traits_type>&
operator>> (float& f);
basic_istream<char_type, traits_type>&
operator>>(double& d);
basic_istream<char_type, traits_type>&
operator>>(long double& ld);
Extracts a value of type long double and stores it in ld by using the
num_get<char_type, istreambuf_iterator<char_type> > facet from the locale contained
in this. The return value is *this. Failures are indicated by the state of *this.
basic_istream<char_type, traits_type>&
operator>>(void*& p);
basic_istream<char_type, traits_type>&
operator>> (basic_streambuf<char_type,traits_type>* sb);
Extracts characters of type char_type from the input stream *this and inserts them
into the stream buffer *sb. Characters are extracted and inserted until extraction from the
input stream *this or insertion into the stream buffer *sb fails, or some other error
occurs. Failures are indicated by the state of *this. Return value is *this.
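This extractor allows the remaining content of one stream to be copied to another in a single expression. The helper copy_stream below is a hypothetical wrapper used to show the idea.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: transfers the rest of in to the stream buffer of
// out. Returns false if no character could be transferred.
bool copy_stream(std::istream& in, std::ostream& out) {
    assert(out.rdbuf() != 0);
    in >> out.rdbuf();      // runs until the input ends or an insertion fails
    return !in.fail();      // failbit is set when nothing was inserted
}
```

Whether leading whitespace is skipped depends on the library version, so the sketch is best used with input whose leading whitespace does not matter.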
UNFORMATTED INPUT
Returns the number of characters of type char_type extracted by the last call to an input
member function of this.
int_type get();
Extracts a character of type char_type if one is available and returns it. Otherwise
ios_base::failbit is set in the state of *this and traits_type::eof() is returned.
basic_istream<char_type,traits_type>&
get (char_type& c);
basic_istream<char_type, traits_type>&
get (char_type* s, streamsize n, char_type delim);
Extracts characters of type char_type and puts them into the array of type
char_type[] whose first element is designated by s. Characters are transferred until
extraction fails, n-1 characters are transferred, or the next input character equals delim
(in which case the delimiter is not extracted). The end-of-string character is written
behind the last character stored in the array. When no characters are transferred, failure
will be indicated by the state of *this, and *s will contain only the end-of-string
character. Return value is *this.
basic_istream<char_type, traits_type>&
get (char_type* s, streamsize n);
Calls this->get(s,n,widen('\n')) and returns the result from this call.
basic_istream<char_type, traits_type>&
get(basic_streambuf<char_type,traits_type>& sb, char_type delim);
Extracts characters of type char_type and inserts them into the output sequence
controlled by sb. Characters are extracted and inserted until extraction or insertion fails, some
other error occurs, or the next character to be extracted equals delim (in which case the
delimiter is not extracted). Failures are indicated by the state of *this. Return value is
*this.
basic_istream<char_type, traits_type>&
get(basic_streambuf<char_type,traits_type>& sb);
Calls this->get(sb,widen('\n')) and returns the result from this call.
basic_istream<char_type, traits_type>&
getline(char_type* s, streamsize n, char_type delim);
Extracts characters of type char_type and puts them into the array of type
char_type[] whose first element is designated by s. Characters are transferred until
extraction fails, n-1 characters are transferred, or the next input character equals delim.
If the transfer is stopped because delim equals the next input character, this character is
extracted but not stored in the array. The end-of-string character is written behind the last
character stored in the array. When no characters are transferred, failure will be indicated
by the state of *this, and *s will contain the end-of-string character. Return value is
*this.
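The difference between get() and getline() lies in the treatment of the delimiter, which the following sketch makes visible with peek(). The struct Field and both helper functions are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type: the extracted field and the character that is
// next in the stream afterwards.
struct Field {
    std::string text;
    int next;
};

// Reads a field with get(): the delimiter stays in the stream.
Field field_with_get(std::istream& in, char delim) {
    char buf[32] = "";
    in.get(buf, sizeof buf, delim);
    assert(in.gcount() <= static_cast<std::streamsize>(sizeof buf));
    Field f;
    f.text = buf;
    f.next = in.peek();
    return f;
}

// Reads a field with getline(): the delimiter is extracted and discarded.
Field field_with_getline(std::istream& in, char delim) {
    char buf[32] = "";
    in.getline(buf, sizeof buf, delim);
    assert(in.gcount() <= static_cast<std::streamsize>(sizeof buf));
    Field f;
    f.text = buf;
    f.next = in.peek();
    return f;
}
```

A loop built on get() therefore has to remove the delimiter itself, for instance with ignore(), before it reads the next field.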
basic_istream<char_type, traits_type>&
getline (char_type* s, streamsize n);
basic_istream<char_type, traits_type>&
ignore(streamsize n = 1, int_type delim = traits_type::eof());
Extracts characters of type char_type and discards them. Characters are extracted until
extraction fails, n characters are extracted, or the next input character equals delim (in
which case the delimiter is extracted). Failures are indicated by the state of *this. Return
value is *this.
int_type peek();
Returns the next available character without consuming it, that is, returns
this->rdbuf()->sgetc(), if the stream state is good, and returns
traits_type::eof() otherwise.
basic_istream<char_type, traits_type>&
read(char_type* s, streamsize n);
Extracts n characters from the external input device, that is, extracts characters of type
char_type and puts them into the array of type char_type[] whose first element is
designated by s. Characters are transferred until extraction fails or n characters are
transferred.
Failures are indicated by the state of *this. Return value is *this.
Extracts up to n characters that are immediately available from the external input device,
that is, extracts characters of type char_type and puts them into the array of type
char_type[] whose first element is designated by s. Characters are transferred until an
error occurs during extraction, n characters are transferred, or
this->rdbuf()->in_avail() == 0. In other words, the number of characters transferred is
min(n, rdbuf()->in_avail()).
readsome() differs from read() in that it does not wait for input (from a terminal
for instance), but returns immediately.
Failures are indicated by the state of *this. If this->rdbuf()->in_avail() == 0
stops the transfer of characters, ios_base::eofbit is set. Returns the number of
characters transferred.
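Since read() returns the stream rather than a count, gcount() is needed to find out how many characters were actually stored. A minimal sketch, with read_block as a hypothetical helper:

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper: reads up to n characters into buf and returns the
// number of characters that were stored.
std::streamsize read_block(std::istream& in, char* buf, std::streamsize n) {
    assert(buf != 0 && n >= 0);
    in.read(buf, n);        // sets eofbit and failbit if fewer than n are available
    return in.gcount();     // count of the last unformatted input operation
}
```

A short read is thus reported as a failure in the stream state, even though the characters that were available have been stored in the array.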
MANIPULATOR EXTRACTORS
basic_istream<char_type, traits_type>&
operator>> (basic_istream<char_type, traits_type>&
(*pf) (basic_istream<char_type, traits_type>&) );
basic_istream<char_type, traits_type>&
operator>> (basic_ios<char_type, traits_type>&
(*pf) (basic_ios<char_type, traits_type>&) );
basic_istream<char_type, traits_type>&
operator>>(ios_base& (*pf)(ios_base&));
BUFFER MANAGEMENT
basic_istream<char_type,traits_type>& putback(char_type c)
Inserts c into the putback sequence by calling this->rdbuf()->sputbackc(c). Failures
are indicated by the state of *this. Returns *this.
int sync()
basic_istream<char_type,traits_type>& unget ()
Makes the last character extracted from this available again by calling
this->rdbuf()->sungetc(). Failures are indicated by the state of *this. Returns
*this.
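The following sketch uses unget() to look at the next character without consuming it. look_ahead is a hypothetical helper; for this particular purpose peek() would serve as well.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper: returns the next character and makes it available
// again, so that the following input operation sees the same character.
int look_ahead(std::istream& in) {
    assert(in.rdbuf() != 0);
    int c = in.get();
    if (c != std::istream::traits_type::eof())
        in.unget();         // steps back by one character
    return c;
}
```

With putback(c), the character that is handed back is best the one that was extracted last; a stream buffer that is open for input only may refuse a different character.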
POSITIONING
basic_istream<char_type,traits_type>&
seekg(pos_type pos);
basic_istream<char_type,traits_type>&
seekg(off_type off, ios_base::seekdir dir);
pos_type tellg();
GLOBAL FUNCTIONS
CHARACTER EXTRACTION
Extracts a character of type charT and stores it in c. Returns is. Failures are
indicated by the state of is.
template<class traits>
basic_istream<char,traits >& operator>>
(basic_istream<char,traits >& is, unsigned char& c);
Extracts a character of type unsigned char and stores it in c. Returns is.
Failures are indicated by the state of is.
template<class traits>
basic_istream<char,traits>& operator>>
(basic_istream<char,traits>& is, signed char& c);
Extracts a character of type signed char and stores it in c. Returns is.
Failures are indicated by the state of is.
Extracts characters of type charT and puts them into the array of type charT[] whose
first element is designated by s. Characters are transferred until extraction fails,
is.width()-1 characters are transferred (if is.width() > 0), or the next input character
is a whitespace character. The end-of-string character is written behind the last character
stored in the array. When no characters are transferred, failure will be indicated by the
state of is, and *s will contain only the end-of-string character. The operation calls
is.width(0) in any case. Returns is.
template<class traits>
basic_istream<char,traits>& operator>>
(basic_istream<char,traits>& is, unsigned char* s);
Extracts characters of type unsigned char and puts them into the array of type
unsigned char[] whose first element is designated by s. Characters are transferred
until extraction fails, is.width()-1 characters are transferred (if is.width() > 0), or
the next input character is a whitespace character. The end-of-string character is written
behind the last character stored in the array. When no characters are transferred, failure
will be indicated by the state of is, and *s will contain only the end-of-string
character. The operation calls is.width(0) in any case. Returns is.
template<class traits>
basic_istream<char,traits>& operator>>
(basic_istream<char,traits>& is, signed char* s);
Extracts characters of type signed char and puts them into the array of type signed
char[] whose first element is designated by s. Characters are transferred until
extraction fails, is.width()-1 characters are transferred (if is.width() > 0), or the next
input character is a whitespace character. The end-of-string character is written behind
the last character stored in the array. When no characters are transferred, failure will be
indicated by the state of is, and *s will contain only the end-of-string character. The
operation calls is.width(0) in any case. Returns is.
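Setting the field width before extracting into a character array protects the array against overflow. The sketch below assumes a hypothetical helper read_short_word with a fixed buffer of eight characters.

```cpp
#include <cassert>
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: extracts the next whitespace-delimited word, but
// never more than 7 characters at a time.
std::string read_short_word(std::istream& is) {
    char buf[8] = "";
    // At most width()-1 characters are stored, followed by the
    // end-of-string character.
    is >> std::setw(static_cast<int>(sizeof buf)) >> buf;
    assert(is.width() == 0);    // the extractor resets the field width
    return buf;
}
```

A word that is longer than the limit is not truncated; the remaining characters are delivered by the next extraction.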
template<class charT, class traits>
basic_istream<charT,traits>& ws
(basic_istream<charT,traits>& is);
Manipulator that extracts whitespace characters from the input stream is. Extraction
stops when the next character in the input stream is is not a whitespace character or no
more characters are available. If the extraction stops because no more characters are
available, ios_base::eofbit is set for is.
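The ws manipulator can be used to find out whether anything other than whitespace is left in a stream. The helper only_whitespace_left is a hypothetical name chosen for this sketch.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper: discards whitespace and reports whether the input
// was exhausted while doing so.
bool only_whitespace_left(std::istream& is) {
    is >> std::ws;          // stops in front of the first non-whitespace character
    assert(!is.bad());
    return is.eof();        // eofbit is set if the input ran out
}
```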
basic_istringstream<charT,traits,Allocator>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT>,
class Allocator = allocator<charT> >
class basic_istringstream : public basic_istream<charT,traits> {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
typedef traits traits_type;
// constructors:
explicit basic_istringstream(ios_base::openmode which = ios_base::in);
explicit basic_istringstream(
const basic_string<charT,traits,Allocator>& str,
ios_base::openmode which = ios_base::in);
// string stream operations:
basic_stringbuf<charT,traits,Allocator>* rdbuf() const;
basic_string<charT,traits,Allocator> str() const;
void str(const basic_string<charT,traits,Allocator>& s);
private:
// basic_stringbuf<charT,traits,Allocator> sb; exposition only
};
}
explicit basic_istringstream
(const basic_string<char_type,traits_type,Allocator>& s,
ios_base::openmode mode = ios_base::in);
basic_string<char_type,traits_type,Allocator> str();
Returns the string contained in the string stream buffer by calling
this->rdbuf()->str().
void str(const basic_string<char_type,traits_type,Allocator>& s);
Sets the string contained in the string stream buffer to s by calling
this->rdbuf()->str(s).
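A short sketch of how an input string stream and its two str() functions might be used. The helper names sum_ints and replace_and_read_back are assumptions made for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Sums the whitespace-separated integers in text and stops at the first
// token that is not an integer. (sum_ints is a hypothetical helper.)
int sum_ints(const std::string& text)
{
    std::istringstream in(text);   // the stream buffer holds a copy of text
    int sum = 0;
    int value = 0;
    while (in >> value)
        sum += value;
    return sum;
}

// str(s) replaces the buffer content; str() returns a copy of it.
std::string replace_and_read_back()
{
    std::istringstream in("old content");
    in.str("10 20 30");
    return in.str();
}
```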
basic_ofstream<charT,traits>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_ofstream : public basic_ostream<charT,traits> {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
typedef traits traits_type;
// constructors:
basic_ofstream();
explicit basic_ofstream(const char* s,
ios_base::openmode mode = ios_base::out);
// file stream operations:
basic_filebuf<charT,traits>* rdbuf() const;
bool is_open();
void open(const char* s, ios_base::openmode mode = ios_base::out);
void close();
private:
// basic_filebuf<charT,traits> sb; exposition only
};
}
explicit
basic_ofstream(const char* filename, ios_base::openmode mode = ios_base::out);
void close();
bool is_open();
basic_ostream<charT,traits>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_ostream : virtual public basic_ios<charT,traits> {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
typedef traits traits_type;
// prefix/suffix:
class sentry;
// constructor/destructor:
explicit basic_ostream(basic_streambuf<char_type,traits>* sb);
virtual ~basic_ostream();
// formatted output:
basic_ostream<charT,traits>& operator<<(bool n);
basic_ostream<charT,traits>& operator<<(short n);
basic_ostream<charT,traits>& operator<<(unsigned short n);
basic_ostream<charT,traits>& operator<<(int n);
basic_ostream<charT,traits>& operator<<(unsigned int n);
basic_ostream<charT,traits>& operator<<(long n);
basic_ostream<charT,traits>& operator<<(unsigned long n);
basic_ostream<charT, traits>& operator<<(float f);
basic_ostream<charT,traits>& operator<<(double f);
basic_ostream<charT,traits>& operator<<(long double f);
basic_ostream<charT,traits>& operator<<(const void* p);
basic_ostream<charT,traits>& operator<<
(basic_streambuf<char_type,traits>* sb);
// unformatted output:
basic_ostream<charT,traits>& put(char_type c);
basic_ostream<charT,traits>& write(const char_type* s, streamsize n);
// manipulator inserters:
basic_ostream<charT,traits>& operator<<
(basic_ostream<charT,traits>& (*pf) (basic_ostream<charT, traits>&));
basic_ostream<charT,traits>& operator<<
(basic_ios<charT,traits>& (*pf) (basic_ios<charT,traits>&));
basic_ostream<charT,traits>& operator<<
(ios_base& (*pf) (ios_base&));
// buffer management:
basic_ostream<charT,traits>& flush();
// positioning:
pos_type tellp();
basic_ostream<charT,traits>& seekp(pos_type);
basic_ostream<charT,traits>& seekp(off_type, ios_base::seekdir);
};
// character inserters:
template<class charT, class traits>
basic_ostream<charT,traits>& operator<<(basic_ostream<charT,
traits>&,
charT);
template<class charT, class traits>
basic_ostream<charT,traits>& operator<<(basic_ostream<charT,
traits>&,
char);
template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&,
char);
template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,
traits>&,
signed char);
template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,
traits>&,
unsigned char);
template<class charT, class traits>
basic_ostream<charT, traits>& operator<<(basic_ostream<charT,
traits>&,
const charT*);
template<class charT, class traits>
basic_ostream<charT,traits>& operator<<(basic_ostream<charT,
traits>&,
const char*);
template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&,
const char*);
template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&,
const signed char*);
template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&,
const unsigned char*);
// output stream manipulators:
template <class charT, class traits>
basic_ostream<charT,traits>& endl(basic_ostream<charT,traits>& os);
template <class charT, class traits>
basic_ostream<charT,traits>& ends(basic_ostream<charT,traits>& os);
template <class charT, class traits>
basic_ostream<charT,traits>& flush(basic_ostream<charT,traits>& os);
CLASS DEFINITIONS
PREFIX/SUFFIX
class sentry
{
public:
explicit sentry (basic_ostream<char_type,traits_type>& os);
~sentry();
operator bool();
};
sentry defines a helper class that handles exception-safe preparations and follow-up
treatment for formatted and unformatted output. The constructor calls
os.tie()->flush(), if os.tie() != 0. If, after these operations, os.good() == true,
this->operator bool() returns true, otherwise false. The constructor may also call
os.setstate(ios_base::failbit) to indicate an error. If (os.flags() &
ios_base::unitbuf) != 0 and there are no uncaught exceptions, the destructor calls
os.flush().
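A user-defined inserter would typically use the sentry as sketched below. The type point and the helper format_point are hypothetical names introduced for this illustration; the sketch shows one possible way to write such an inserter, not a prescribed one.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

struct point { int x; int y; };   // hypothetical user-defined type

std::ostream& operator<<(std::ostream& os, const point& p)
{
    std::ostream::sentry guard(os);   // flushes os.tie(), then checks os.good()
    if (guard) {
        // Format into a temporary stream, then hand the text to the buffer.
        std::ostringstream tmp;
        tmp << '(' << p.x << ',' << p.y << ')';
        const std::string text = tmp.str();
        const std::streamsize n = static_cast<std::streamsize>(text.size());
        if (os.rdbuf()->sputn(text.data(), n) != n)
            os.setstate(std::ios_base::badbit);   // report the failed insertion
    }
    return os;   // the sentry destructor flushes os if unitbuf is set
}

// Convenience wrapper used to show the result of the inserter.
std::string format_point(int x, int y)
{
    point p = { x, y };
    std::ostringstream out;
    out << p;
    return out.str();
}
```

If the stream is not in a good state when the sentry is constructed, the conversion to bool yields false and the inserter writes nothing.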
virtual ~basic_ostream();
FORMATTED OUTPUT
basic_ostream<char_type, traits_type>&
operator<<(bool b);
basic_ostream<char_type,traits_type>&
operator<<(short s);
basic_ostream<char_type, traits_type>&
operator<<(unsigned short us);
basic_ostream<char_type, traits_type>&
operator<<(int i);
basic_ostream<char_type, traits_type>&
operator<< (unsigned int ui);
contained by this. The return value is *this. Failures are indicated by the state of
*this.
basic_ostream<char_type, traits_type>&
operator<<(long l);
basic_ostream<char_type,traits_type>&
operator<<(unsigned long ul);
basic_ostream<char_type, traits_type>&
operator<<(float f);
basic_ostream<char_type, traits_type>&
operator<< (double d);
basic_ostream<char_type, traits_type>&
operator<<(long double ld);
basic_ostream<char_type, traits_type>&
operator<<(const void* p);
basic_ostream<char_type, traits_type>&
operator<< (basic_streambuf<char_type,traits_type>* sb);
Gets characters of type char_type from the stream buffer *sb and inserts them in the
output stream *this. Characters are read from the stream buffer *sb until its end is
reached, insertion to the output stream *this fails, or some other error occurs. Failures
are indicated by the state of *this. Return value is *this.
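The stream buffer inserter is a convenient way to copy all remaining input of one stream to another. A minimal sketch, with hypothetical helper names copy_all and copy_empty_sets_failbit:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Copies everything that is left in the buffer of in to a string.
std::string copy_all(std::istream& in)
{
    std::ostringstream out;
    out << in.rdbuf();   // transfers characters until the end of in's buffer
    return out.str();
}

// When no character at all can be inserted, the output stream's failbit is set.
bool copy_empty_sets_failbit()
{
    std::istringstream empty("");
    std::ostringstream out;
    out << empty.rdbuf();
    return out.fail();
}
```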
UNFORMATTED OUTPUT
basic_ostream<char_type,traits_type>&
put (char_type c);
Inserts a character c of type char_type. Failures are indicated by the state of *this.
Return value is *this.
basic_ostream<char_type, traits_type>&
write(const char_type* s, streamsize n);
Inserts characters of type char_type, which are obtained from successive locations of an
array whose first element is designated by s. Characters are inserted until n characters are
inserted or insertion fails. Failures are indicated by the state of *this. Return value is
*this.
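The following sketch contrasts put() and write() with formatted output: neither interprets the characters nor applies the field width. The function names are made up for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// write() transfers exactly n characters, including embedded null characters.
std::string put_and_write()
{
    std::ostringstream out;
    const char data[] = { 'a', 'b', '\0', 'c' };
    out.put('[');
    out.write(data, sizeof data);   // 4 characters, no formatting
    out.put(']');
    return out.str();
}

// put() is unformatted output: the field width is not applied.
std::string put_ignores_width()
{
    std::ostringstream out;
    out.width(5);
    out.put('x');
    return out.str();
}
```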
MANIPULATOR INSERTERS
basic_ostream<char_type,traits_type>&
operator<< (basic_ostream<char_type, traits_type>&
(*pf) (basic_ostream<char_type, traits_type>&));
basic_ostream<char_type,traits_type>&
operator<<(basic_ios<char_type,traits_type>&
(*pf)(basic_ios<char_type,traits_type>&));
basic_ostream<char_type,traits_type>&
operator<<(ios_base& (*pf)(ios_base&));
BUFFER MANAGEMENT
basic_ostream<char_type,traits_type>& flush();
POSITIONING
basic_ostream<char_type,traits_type>&
seekp(pos_type pos);
basic_ostream<char_type,traits_type>&
seekp(off_type off, ios_base::seekdir dir);
Repositions this to the location designated by off and dir, by calling
this->rdbuf()->pubseekoff(off,dir). Failures are indicated by the state of *this.
Returns *this.
pos_type tellp();
If this->fail() == true, returns pos_type(-1). Otherwise gets the stream position
by calling this->rdbuf()->pubseekoff(0,ios_base::cur,ios_base::out)
and returns the result from this call.
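Positioning on an output stream can be sketched with a string stream, where every position is reachable. The function names overwrite_prefix and tellp_reports_failure are assumptions made for this illustration.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Remembers the write position, overwrites the beginning, then continues
// writing at the remembered position.
std::string overwrite_prefix()
{
    std::ostringstream out;
    out << "0000-data";
    const std::ostringstream::pos_type end = out.tellp();   // after the last write
    out.seekp(0, std::ios_base::beg);                       // back to the start
    out << "1234";                                          // overwrites, does not insert
    out.seekp(end);                                         // restore the write position
    out << "!";
    return out.str();
}

// On a failed stream, the position query returns pos_type(-1).
bool tellp_reports_failure()
{
    std::ostringstream out;
    out.setstate(std::ios_base::failbit);
    return out.tellp() == std::ostringstream::pos_type(-1);
}
```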
GLOBAL FUNCTIONS
template<class charT, class traits>
basic_ostream<charT,traits>& operator<<
(basic_ostream<charT,traits>& os, charT c);
Inserts a character c of type charT into the output stream os. Padding is done according to
the adjustfield setting in os. After the insertion, width(0) is called. The return value
is os. Failures are indicated by the stream state of os.
template<class charT, class traits>
basic_ostream<charT,traits>& operator<<
(basic_ostream<charT,traits>& os, char c);
Inserts a character c of type char into the output stream os. In case c has type char and
the character type of the stream is not char, then the character to be inserted is
os.widen(c); otherwise the character is c. Padding is done according to the
adjustfield setting in os. After the insertion, width(0) is called. The return value is
os. Failures are indicated by the stream state of os.
template<class traits>
basic_ostream<char,traits>& operator<<
(basic_ostream<char,traits>& os, char c);
Inserts a character c of type char into the output stream os. Padding is done according to
the adjustfield setting in os. After the insertion, width(0) is called. The return value
is os. Failures are indicated by the stream state of os.
template<class traits>
basic_ostream<char,traits>& operator<<
(basic_ostream<char,traits>& os, signed char c);
Inserts a character c of type signed char into the output stream os. Padding is done
according to the adjustfield setting in os. After the insertion, width(0) is called. The
return value is os. Failures are indicated by the stream state of os.
template<class traits>
basic_ostream<char,traits>& operator<<
(basic_ostream<char,traits>& os, unsigned char c);
Inserts a character c of type unsigned char into the output stream os. Padding is done
according to the adjustfield setting in os. After the insertion, width(0) is called. The
return value is os. Failures are indicated by the stream state of os.
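The padding behavior of the character inserters, and the reset of the field width after each insertion, can be sketched as follows. The function name padded_chars is made up for this illustration.

```cpp
#include <cassert>
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

std::string padded_chars()
{
    std::ostringstream out;
    out << std::setw(3) << 'a'                  // right-adjusted by default: "  a"
        << 'b'                                  // width was reset to 0: "b"
        << std::left << std::setw(3) << 'c';    // left-adjusted: "c  "
    return out.str();
}
```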
template<class charT, class traits>
basic_ostream<charT,traits>& operator<<
(basic_ostream<charT,traits>& os, const charT* s);
Inserts a character array of type const charT[] into the output stream os. The number
of successive characters taken from the array is determined by traits::length(s).
(Note: The result of char_traits<char>::length(s) is the same as strlen(s).)
Padding is done according to the adjustfield setting in os. After the insertion,
width(0) is called. The return value is os. Failures are indicated by the stream state
of os.
template<class charT, class traits>
basic_ostream<charT,traits>& operator<<
(basic_ostream<charT,traits>& os, const char* s);
Inserts a character array of type const char[] into the output stream os. The number of
successive characters taken from the array is determined by traits::length(s).
(Note: The result of char_traits<char>::length(s) is the same as strlen(s).) In
case s is a sequence of characters of type char and the character type of the stream is not
char, then the characters to be inserted are widened using os.widen().
Padding is done according to the adjustfield setting in os. After the insertion,
width(0) is called. The return value is os. Failures are indicated by the stream state
of os.
template<class traits>
basic_ostream<char,traits>& operator<<
(basic_ostream<char,traits>& os, const char* s);
Inserts a character array of type const char[] into the output stream os. The number of
successive characters taken from the array is determined by traits::length(s).
(Note: The result of char_traits<char>::length(s) is the same as strlen(s).) In
case s is a sequence of characters of type char and the character type of the stream is not
char, then the characters to be inserted are widened using os.widen().
Padding is done according to the adjustfield setting in os. After the insertion,
width(0) is called. The return value is os. Failures are indicated by the stream state
of os.
template<class traits>
basic_ostream<char,traits>& operator<<
(basic_ostream<char,traits>& os, const signed char* s);
Inserts a character array of type const signed char[] into the output stream os. The
number of successive characters taken from the array is determined by
traits::length(s). (Note: The result of char_traits<char>::length(s) is the same as
strlen(s).)
Padding is done according to the adjustfield setting in os. After the insertion,
width(0) is called. The return value is os. Failures are indicated by the stream state
of os.
template<class traits>
basic_ostream<char,traits>& operator<<
(basic_ostream<char,traits>& os, const unsigned char* s);
Inserts a character array of type const unsigned char[] into the output stream os.
The number of successive characters taken from the array is determined by
traits::length(s). (Note: The result of char_traits<char>::length(s) is the same as
strlen(s).)
Padding is done according to the adjustfield setting in os. After the insertion,
width(0) is called. The return value is os. Failures are indicated by the stream state
of os.
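A brief sketch of the character array inserters: a narrow string inserted into a wide stream is widened character by character, and padding applies to the whole array. The function names are hypothetical.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// const char* into a wide stream: each character goes through widen().
std::wstring narrow_into_wide()
{
    std::wostringstream out;
    out << "abc";     // narrow character array, widened on insertion
    out << L"def";    // wide character array, inserted as is
    return out.str();
}

// The field width pads the first array only; it is reset to 0 afterwards.
std::string padded_cstring()
{
    std::ostringstream out;
    out << std::setfill('.') << std::setw(6) << "abc" << "de";
    return out.str();
}
```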
template<class traits>
basic_ostream<char,traits_type>& endl
(basic_ostream<char,traits_type>& os);
template<class traits>
basic_ostream<char,traits_type>& ends
(basic_ostream<char,traits_type>& os);
template<class traits>
basic_ostream<char,traits_type>& flush
(basic_ostream<char,traits_type>& os);
basic_ostringstream<charT,traits,Allocator>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT>,
class Allocator = allocator<charT> >
class basic_ostringstream : public basic_ostream<charT,traits> {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
// constructors/destructor:
explicit basic_ostringstream
(ios_base::openmode which = ios_base::out);
explicit basic_ostringstream
(const basic_string<charT,traits,Allocator>& str,
ios_base::openmode which = ios_base::out);
// string stream operations:
basic_stringbuf<charT,traits,Allocator>* rdbuf() const;
basic_string<charT,traits,Allocator> str() const;
void str(const basic_string<charT,traits,Allocator>& s);
private:
// basic_stringbuf<charT,traits,Allocator> sb; exposition only
};
}
explicit
basic_ostringstream(ios_base::openmode mode = ios_base::out);
explicit
basic_ostringstream
(const basic_string<char_type,traits_type,Allocator>& s,
ios_base::openmode mode = ios_base::out);
basic_string<char_type,traits_type,Allocator> str();
Returns the string contained in the string stream buffer by calling
this->rdbuf()->str().
void str(const basic_string<char_type,traits_type,Allocator>& s);
Sets the string contained in the string stream buffer to s by calling
this->rdbuf()->str(s).
basic_streambuf<charT,traits>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class basic_streambuf {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
typedef traits traits_type;
// destructor:
virtual ~basic_streambuf();
// get area:
streamsize in_avail();
int_type snextc();
int_type sbumpc();
int_type sgetc();
streamsize sgetn(char_type* s, streamsize n);
2. In the description of basic_streambuf member functions below, we deviate from the notational
conventions that we use throughout the rest of the reference section. For the member functions that give access to the
pointers to the get and put area we omit the this pointer; that is, we simply say (gptr() != 0 &&
eback() < gptr()) instead of (this->gptr() != 0 && this->eback() < this->gptr()). This is done to
facilitate readability of the descriptions.
// putback:
int_type sputbackc(char_type c);
int_type sungetc();
// put area:
int_type sputc(char_type c);
streamsize sputn(const char_type* s, streamsize n);
// buffer management:
basic_streambuf<char_type,traits>* pubsetbuf (char_type* s,streamsize n);
int pubsync();
// positioning:
pos_type pubseekoff(off_type off, ios_base::seekdir way,
ios_base::openmode which = ios_base::in | ios_base::out);
pos_type pubseekpos(pos_type sp,
ios_base::openmode which = ios_base::in | ios_base::out);
// locales:
locale pubimbue(const locale &loc);
locale getloc() const;
protected:
// constructor:
basic_streambuf ();
// put area:
char_type* pbase() const;
char_type* pptr() const;
char_type* epptr() const;
void pbump(int n);
void setp(char_type* pbeg, char_type* pend);
virtual streamsize xsputn(const char_type* s, streamsize n);
virtual int_type overflow (int_type c = traits::eof());
// putback:
virtual int_type pbackfail(int_type c = traits::eof());
// get area:
char_type* eback() const;
char_type* gptr() const;
char_type* egptr() const;
void gbump(int n);
void setg(char_type* gbeg, char_type* gnext, char_type* gend);
virtual int showmanyc();
virtual streamsize xsgetn(char_type* s, streamsize n);
virtual int_type underflow();
virtual int_type uflow();
// buffer management:
virtual basic_streambuf<char_type,traits>* setbuf(char_type* s,
streamsize n);
// positioning:
virtual pos_type seekoff(off_type off, ios_base::seekdir way,
ios_base::openmode which = ios_base::in | ios_base::out);
virtual pos_type seekpos(pos_type sp,
ios_base::openmode which = ios_base::in | ios_base::out);
virtual int sync();
// locales:
virtual void imbue(const locale &loc);
};
}
GET AREA
streamsize in_avail();
Returns the number of characters available in the get area, i.e., egptr() - gptr(). If the get
area does not exist or is empty (i.e., gptr() == 0 || gptr() >= egptr()), this->showmanyc() is
called and the result of this call is returned.
int_type sbumpc();
Gets one character from the get area and returns this character converted to int_type,
i.e., traits_type::to_int_type(*gptr()); additionally, the next pointer of the get
area is incremented. If the get area does not exist or is empty (i.e., gptr() == 0 ||
gptr() >= egptr()), this->uflow() is called and the result of this call is returned. In
contrast to sgetc(), sbumpc() additionally increments the get position in the input
sequence, which means that it consumes the character.
int_type sgetc();
Gets one character from the get area and returns this character converted to int_type,
i.e., traits_type::to_int_type(*gptr()). If the get area does not exist or is empty
(i.e., gptr() == 0 || gptr() >= egptr()), this->underflow() is called and the result
of this call is returned.
int_type snextc();
Calls the function sbumpc(). If it returns traits_type::eof(), the return value is
traits_type::eof(). Otherwise this->sgetc() is called and the result of this call
is returned, i.e., it gets the next character from the input sequence, in contrast to sgetc(),
which gets the current character.
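The difference between sgetc(), sbumpc(), and snextc() can be made visible with a string buffer. This is a small sketch; the function names probe_get_area and exhausted_returns_eof are made up for the example.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

typedef std::stringbuf::traits_type traits;

// Records what each get function returns for the input sequence "abc".
std::string probe_get_area()
{
    std::stringbuf buf("abc", std::ios_base::in);
    std::string trace;
    trace += traits::to_char_type(buf.sgetc());    // 'a': peeks, position unchanged
    trace += traits::to_char_type(buf.sbumpc());   // 'a': returns it and advances
    trace += traits::to_char_type(buf.sgetc());    // 'b': now the current character
    trace += traits::to_char_type(buf.snextc());   // 'c': advances past 'b', then peeks
    trace += traits::to_char_type(buf.sbumpc());   // 'c': consumes the last character
    return trace;
}

// Once the input sequence is exhausted, sgetc() reports eof().
bool exhausted_returns_eof()
{
    std::stringbuf buf("x", std::ios_base::in);
    buf.sbumpc();
    return traits::eq_int_type(buf.sgetc(), traits::eof());
}
```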
PUTBACK
int_type sputbackc(char_type c);
If a putback position is available (i.e., gptr() != 0 && eback() < gptr()) and the character
in the putback position (i.e., gptr()[-1]) is equal to c, the next pointer of the
get area is decremented and the element the next pointer is now pointing to is returned.
The element is converted to int_type before the return. Otherwise calls
this->pbackfail(traits_type::to_int_type(c)) and returns the result of this call.
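A sketch of the two outcomes of a putback on a read-only string buffer: it succeeds once a character has been consumed, and it fails at the very beginning of the sequence, where no putback position exists. The function names are assumptions made for this illustration.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

typedef std::stringbuf::traits_type traits;

// After one character has been consumed, it can be put back.
bool putback_after_read()
{
    std::stringbuf buf("ab", std::ios_base::in);
    buf.sbumpc();                                     // consume 'a'
    const traits::int_type r = buf.sputbackc('a');    // matches the consumed character
    return traits::eq_int_type(r, traits::to_int_type('a'))
        && traits::eq_int_type(buf.sgetc(), traits::to_int_type('a'));
}

// At the start of the sequence there is no putback position.
bool putback_at_start_fails()
{
    std::stringbuf buf("ab", std::ios_base::in);
    return traits::eq_int_type(buf.sputbackc('a'), traits::eof());
}
```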
int_type sungetc();
PUT AREA
int_type sputc(char_type c);
Stores the character c in the put area, e.g., (*pptr()) = c. If the put area does not exist or
is full (i.e., pptr() == 0 || pptr() == epptr()), this->overflow(traits_type::to_int_type(c))
is called and the result of this call is returned.
BUFFER MANAGEMENT
basic_streambuf<char_type, traits_type>*
pubsetbuf (char_type* s, streamsize n);
int pubsync();
POSITIONING
pos_type
pubseekoff(off_type off, ios_base::seekdir dir,
ios_base::openmode mode = ios_base::in | ios_base::out);
Calls this->seekoff(off,dir,mode) and returns the result of this call.
pos_type
pubseekpos(pos_type pos,
ios_base::openmode mode = ios_base::in | ios_base::out);
LOCALES
Sets the locale of this to loc and calls this->imbue(loc). The return value is the
previously used locale.
basic_streambuf();
PUT AREA
char_type* epptr() const;
Handles the situation where c cannot be inserted into the put area, because the put area
does not exist or is full (i.e., pptr() == 0 || pptr() >= epptr()). The concrete handling
depends on the specific derived class. basic_streambuf<charT,traits>::overflow()
always returns the failure indication traits_type::eof().
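A derived class might override overflow() as sketched below. The class string_sink is a hypothetical, unbuffered stream buffer that never sets up a put area, so every inserted character arrives in overflow(); null_sink inherits the base class behavior unchanged. Both names are assumptions made for this illustration.

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>
#include <string>

// Unbuffered stream buffer that appends every character to a string.
class string_sink : public std::streambuf {
public:
    const std::string& data() const { return data_; }
protected:
    virtual int_type overflow(int_type c = traits_type::eof())
    {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);   // nothing to store, report success
        data_ += traits_type::to_char_type(c);
        return c;                             // any value other than eof() means success
    }
private:
    std::string data_;
};

// Stream buffer that keeps the base class overflow(), which always fails.
struct null_sink : std::streambuf {};

std::string write_through_sink()
{
    string_sink sink;
    std::ostream out(&sink);
    out << "x=" << 42;
    return sink.data();
}

bool base_overflow_fails()
{
    null_sink sink;
    std::ostream out(&sink);
    out << 'a';          // no put area and overflow() returns eof()
    return out.bad();
}
```

A buffered variant would additionally call setp() to install a put area and would empty that area in overflow() and sync().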
char_type* pbase() const;
Advances the next pointer of the put array by n. n can also be negative. If pptr() + n >=
epptr(), the behavior is undefined. The purpose is that derived classes, which manipulate
the put area themselves, adjust the next pointer accordingly.
char_type* pptr() const;
Sets the begin pointer and the next pointer to beg and the end pointer to end. This func-
tion is typically used by derived classes when they supply a buffer area.
Writes n successive characters beginning with (*s) to the put area. It behaves like
repeated calls to this->sputc(). Writing stops when either n characters have been
written or a call to this->sputc() would return traits_type::eof(). The function
returns the number of written characters.
PUTBACK
virtual int_type pbackfail(int_type c = traits_type::eof());
Makes a character available that will be returned with the next call to this->sgetc(). If
c is not equal to traits_type::eof(), c is the character to be made available. If c is
equal to traits_type::eof(), the character that is in the input sequence before the
characters that are already in the input area is the character to be made available. The
details of making the character available are implementation-specific. Returns
traits_type::eof() if the character cannot be made available. The concrete handling
GET AREA
char_type* eback() const;
char_type* egptr() const;
Advances the next pointer of the get array by n. n can also be negative. If eback() >=
gptr() - n, the behavior is undefined. The purpose is that derived classes, which
manipulate the get area themselves, adjust the next pointer accordingly.
char_type* gptr() const;
Sets the begin pointer to beg, the next pointer to next, and the end pointer to end. This
function is typically used by derived classes when they supply a buffer area.
Returns an estimation of the number of characters that are at least available from the
input sequence or -1. If the operation returns -1, calls to this->underflow() and
this->uflow() will fail. Otherwise the returned number of characters can be made
available by one or more calls to this->underflow() or this->uflow(). The con-
crete handling depends on the specific derived class.
basic_streambuf<charT,traits>::showmanyc() always returns 0.
Handles the situation where the get area does not exist or is empty (i.e., gptr() == 0 ||
gptr() >= egptr()) and behaves like underflow(), but additionally increments the
next pointer of the get area. The concrete handling depends on the specific derived class.
Handles the situation where the get area does not exist or is empty (i.e., gptr() == 0 ||
gptr() >= egptr()). The concrete handling depends on the specific derived class.
basic_streambuf<charT,traits>::underflow() always returns the failure
indication traits_type::eof().
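The following sketch shows one way a derived class could override underflow(). The class chunked_source is hypothetical: it refills a small internal get area from a string, four characters at a time, and the helper read_all drains it through an input stream.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Read-only stream buffer that serves a string in chunks of four characters.
class chunked_source : public std::streambuf {
public:
    explicit chunked_source(const std::string& text) : text_(text), pos_(0) {}
protected:
    virtual int_type underflow()
    {
        if (gptr() != 0 && gptr() < egptr())       // get area still holds characters
            return traits_type::to_int_type(*gptr());
        if (pos_ >= text_.size())                  // external source is exhausted
            return traits_type::eof();
        const std::size_t n =
            std::min<std::size_t>(sizeof chunk_, text_.size() - pos_);
        std::copy(text_.begin() + pos_, text_.begin() + pos_ + n, chunk_);
        pos_ += n;
        setg(chunk_, chunk_, chunk_ + n);          // begin, next, and end pointer
        return traits_type::to_int_type(*gptr());  // the next pointer is not advanced
    }
private:
    std::string text_;
    std::size_t pos_;
    char chunk_[4];
};

// Reads every character the buffer delivers.
std::string read_all(const std::string& text)
{
    chunked_source src(text);
    std::istream in(&src);
    std::string result;
    char c;
    while (in.get(c))
        result += c;
    return result;
}
```

Because underflow() installs a get area, the inherited uflow() is sufficient here: it calls underflow() and then increments the next pointer.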
Assigns n successive characters from the get area to s. It behaves like repeated calls to
this->sbumpc(). Assigning stops when either n characters have been assigned or a call
to this->sbumpc() would return traits_type::eof(). The function returns the
number of assigned characters.
BUFFER MANAGEMENT
Synchronizes the put area with the input and output sequence, e.g., takes the characters
from the put area and makes them available in the output and input sequence. Pointers to
the put area are adjusted. In case of failure, the return value is -1. What constitutes failure
is determined by each derived class. basic_streambuf<charT,traits>::sync()
does nothing but return 0. The effect on the get area is implementation-specific.
POSITIONING
virtual pos_type
seekoff(off_type off, ios_base::seekdir dir,
ios_base::openmode mode = ios_base::in | ios_base::out);
Alters the position in one (i.e., input or output) or both (i.e., input and output) sequences.
The concrete handling depends on the specific derived class. basic_streambuf<charT,
virtual pos_type
seekpos (pos_type pos,
ios_base::openmode mode = ios_base::in | ios_base::out);
Alters the position in one (i.e., input or output) or both (i.e., input and output) sequences.
The concrete handling depends on the specific derived class.
basic_streambuf<charT,traits>::seekpos() always returns the failure indication
pos_type(off_type(-1)).
LOCALES
Allows an object of a derived class to be informed that the locale of this has changed,
by overriding this function in the respective derived class.
basic_streambuf<charT,traits>::imbue() does nothing.
basic_stringbuf<charT,traits,Allocator>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT>,
class Allocator = allocator<charT> >
class basic_stringbuf : public basic_streambuf<charT,
traits> {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
typedef traits traits_type;
// constructors:
explicit basic_stringbuf (ios_base::openmode which
= ios_base::in
| ios_base::out);
explicit basic_stringbuf
(const basic_string<charT, traits,Allocator>& str,
ios_base::openmode which = ios_base::in | ios_base::out);
// get and set:
basic_string<charT,traits,Allocator> str() const;
void str(const basic_string<charT,
traits, Allocator>& s);
protected:
// overridden virtual functions:
virtual int_type underflow();
virtual int_type pbackfail(int_type c = traits::eof());
virtual int_type overflow (int_type c = traits::eof());
virtual basic_streambuf<charT,traits>* setbuf(charT*, streamsize) ;
explicit basic_stringbuf
(ios_base::openmode mode = ios_base::in | ios_base::out);
explicit basic_stringbuf
(const basic_string<char_type,traits_type,Allocator>& s,
Discards the previous content of the internal buffer and then copies s into the buffer.
Initializes the get and put area according to the ios_base::openmode parameter that was
passed to the constructor of this.
Allocates a new internal buffer that can hold the characters from the previous internal
buffer plus one or more additional write positions. Copies the characters from the previ-
ous buffer to the newly allocated one and adjusts the pointers. Then calls
this->sputc(c) and returns the result of this call.
Same functional description as for the base class. Redefined here to deal with the special
constraints of a string buffer.
virtual pos_type
seekoff(off_type off, ios_base::seekdir dir,
ios_base::openmode mode = ios_base::in | ios_base::out);
Alters the stream position in the following way. mode determines in which controlled
sequence the position is altered:
if (mode & ios_base::in) != 0, positions the input sequence,
if (mode & ios_base::out) != 0, positions the output sequence,
virtual pos_type
seekpos (pos_type pos,
ios_base::openmode mode = ios_base::in | ios_base::out);
basic_stringstream<charT,traits,Allocator>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT>,
class Allocator = allocator<charT> >
class basic_stringstream : public basic_iostream<charT,traits> {
public:
// type definitions:
typedef charT char_type;
typedef typename traits::int_type int_type;
typedef typename traits::pos_type pos_type;
typedef typename traits::off_type off_type;
// constructors/destructor:
explicit basic_stringstream(
ios_base::openmode which = ios_base::out|ios_base::in);
explicit basic_stringstream(
const basic_string<charT,traits,Allocator>& str,
ios_base::openmode which = ios_base::out|ios_base::in);
// string stream operations:
basic_stringbuf<charT,traits,Allocator>* rdbuf() const;
basic_string<charT,traits,Allocator> str() const;
void str(const basic_string<charT,traits,Allocator>& str);
private:
// basic_stringbuf<charT, traits> sb ; exposition only
};
}
explicit
basic_stringstream (ios_base::openmode mode =
ios_base::in | ios_base::out);
Constructs an object of type basic_stringstream<charT, traits,Allocator>.
explicit
basic_stringstream
(const basic_string<char_type,traits_type,Allocator>& s,
ios_base::openmode mode = ios_base::in | ios_base::out);
basic_string<char_type,traits_type,Allocator> str();
Returns the string contained in the string stream buffer by calling
this->rdbuf()->str().
void
str(const basic_string<char_type,traits_type,Allocator>& s);
Sets the string contained in the string stream buffer to s by calling
this->rdbuf()->str(s).
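A bidirectional string stream reads what was written to it. The sketch below also shows that str(s) replaces the content but leaves the stream state alone, so the state has to be cleared before the stream is reused. The function names are made up for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Converts a value to text and back through a single stream object.
int double_via_text(int value)
{
    std::stringstream ss;
    ss << value * 2;   // output sequence
    int result = 0;
    ss >> result;      // the input sequence delivers what was just written
    return result;
}

// Reuses one stream for a second string.
std::string reuse_stream()
{
    std::stringstream ss("first");
    std::string word;
    ss >> word;          // reaches the end of the input: eofbit is set
    ss.clear();          // str(s) does not reset the stream state
    ss.str("second");
    ss >> word;
    return word;
}
```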
fpos<stateT>
CLASS TEMPLATE
DESCRIPTION
fpos represents an abstraction that maintains the file position information and its associ-
ated conversion state.
SYNOPSIS
namespace std {
template <class stateT> class fpos {
public:
stateT state() const;
void state(stateT);
private:
// stateT st; exposition only
};
}
fpos(stateT s);
stateT state()
Returns the conversion state that is related to the file position described by this.
void state(stateT s)
Sets the conversion state that is related to the file position described by this to s.
ios_base
CLASS
class ios_base
header: <ios>
base class(es): [none]
DESCRIPTION
ios_base is the character-type and traits-type-independent base class for all streams. It
encapsulates the common character-type-independent functionality of a stream, e.g., set-
ting and getting formatting / parsing flags.
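Setting and getting format flags through the ios_base interface might look as sketched below. The function names hex_with_base and fixed_then_restored are assumptions made for this illustration.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// The two-argument setf() changes one field; the one-argument form sets
// independent flags.
std::string hex_with_base(int value)
{
    std::ostringstream out;
    out.setf(std::ios_base::hex, std::ios_base::basefield);
    out.setf(std::ios_base::showbase | std::ios_base::uppercase);
    out << value;
    return out.str();
}

// Saves the format settings, changes them temporarily, and restores them.
std::string fixed_then_restored()
{
    std::ostringstream out;
    const std::ios_base::fmtflags saved_flags = out.flags();
    const std::streamsize saved_precision = out.precision(2);
    out.setf(std::ios_base::fixed, std::ios_base::floatfield);
    out << 3.14159 << ' ';
    out.flags(saved_flags);             // back to the previous flags
    out.precision(saved_precision);     // back to the previous precision
    out << 0.5;
    return out.str();
}
```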
SYNOPSIS
namespace std {
class ios_base {
public:
// class definitions:
class Init;
class failure;
// type definitions and constants:
// format flags:
typedef T1 fmtflags;
static const fmtflags boolalpha;
static const fmtflags dec;
static const fmtflags fixed;
static const fmtflags hex;
static const fmtflags internal;
static const fmtflags left;
static const fmtflags oct;
static const fmtflags right;
static const fmtflags scientific;
static const fmtflags showbase;
static const fmtflags showpoint;
static const fmtflags showpos;
static const fmtflags skipws;
static const fmtflags unitbuf;
static const fmtflags uppercase;
static const fmtflags adjustfield;
static const fmtflags basefield;
static const fmtflags floatfield;
// stream state:
typedef T2 iostate;
static const iostate badbit;
static const iostate eofbit;
static const iostate failbit;
static const iostate goodbit;
// open modes:
typedef T3 openmode;
static const openmode app;
static const openmode ate;
static const openmode binary;
static const openmode in;
static const openmode out;
static const openmode trunc;
// positioning:
typedef T4 seekdir;
static const seekdir beg;
static const seekdir cur;
static const seekdir end;
// callback events:
enum event { erase_event, imbue_event, copyfmt_event };
// destructor:
virtual ~ios_base();
// format control operations:
fmtflags flags() const;
fmtflags flags(fmtflags fmtfl);
fmtflags setf(fmtflags fmtfl);
fmtflags setf(fmtflags fmtfl, fmtflags mask);
void unsetf(fmtflags mask);
streamsize precision() const;
streamsize precision(streamsize prec);
streamsize width() const;
streamsize width(streamsize wide);
// locales:
locale imbue(const locale& loc);
locale getloc() const;
// user storage:
static int xalloc();
long& iword(int index);
void*& pword(int index);
// callbacks:
typedef void (*event_callback)(event, ios_base&, int index);
void register_callback(event_callback fn, int index);
// miscellaneous:
static bool sync_with_stdio(bool sync = true);
protected:
// constructor:
ios_base();
private:
// static int index; exposition only
// long* iarray; exposition only
// void** parray; exposition only
};
// ios_base manipulators:
// alpha representation of bool:
ios_base& boolalpha  (ios_base& str);
ios_base& noboolalpha(ios_base& str);
// integer base:
ios_base& showbase   (ios_base& str);
ios_base& noshowbase (ios_base& str);
// decimal point:
ios_base& showpoint  (ios_base& str);
ios_base& noshowpoint(ios_base& str);
// sign:
ios_base& showpos    (ios_base& str);
ios_base& noshowpos  (ios_base& str);
// skip whitespace:
ios_base& skipws     (ios_base& str);
ios_base& noskipws   (ios_base& str);
// uppercase:
ios_base& uppercase  (ios_base& str);
ios_base& nouppercase(ios_base& str);
// adjustfield:
ios_base& internal   (ios_base& str);
ios_base& left       (ios_base& str);
ios_base& right      (ios_base& str);
// basefield:
ios_base& dec        (ios_base& str);
ios_base& hex        (ios_base& str);
ios_base& oct        (ios_base& str);
// floatfield:
ios_base& fixed      (ios_base& str);
ios_base& scientific (ios_base& str);
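The manipulators declared above are inserted into a stream rather than called directly. The following is a minimal sketch of that usage; the helper names hex_upper and bool_text are hypothetical and serve only as illustration.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats n with the hex, showbase, and uppercase manipulators.
std::string hex_upper(int n) {
    std::ostringstream os;
    os << std::hex << std::showbase << std::uppercase << n;
    return os.str();
}

// Hypothetical helper: formats a bool as text (boolalpha) or as a number (noboolalpha).
std::string bool_text(bool b, bool alpha) {
    std::ostringstream os;
    if (alpha)
        os << std::boolalpha;
    else
        os << std::noboolalpha;
    os << b;
    return os.str();
}
```

With these helpers, hex_upper(255) yields "0XFF", bool_text(true, true) yields "true", and bool_text(true, false) yields "1".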
CLASS DEFINITIONS
class Init
{
public:
Init();
~Init();
};
The class Init describes an object that controls the construction of the global objects cin,
cout, cerr, clog, wcin, wcout, wcerr, and wclog to take place before they are used by any
function and the destruction to take place after they are used by any function. That means that
the global stream objects can be used in constructors and destructors of static and global
objects.
class failure : public exception
{
public:
explicit failure (const string& msg);
virtual ~failure();
virtual const char* what() const;
};
All objects that are thrown as exceptions in the IOStreams library are instances of class
failure. The member function what() returns a message that describes the exception.
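As an illustration of class failure, the sketch below enables exceptions for failbit on an input stream and catches the resulting exception. It assumes the exceptions() member function of basic_ios, which is not described in this entry; the helper name extraction_throws is hypothetical.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if extracting an int from text
// throws ios_base::failure.
bool extraction_throws(const std::string& text) {
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit);  // throw as soon as failbit is set
    try {
        int n = 0;
        in >> n;
        return false;
    } catch (const std::ios_base::failure& e) {
        (void)e.what();  // the message text is implementation-defined
        return true;
    }
}
```

Here extraction_throws("abc") returns true, whereas extraction_throws("42") returns false.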
TYPE DEFINITIONS
Enumerated types:
enum seekdir
{ beg, cur, end };
enum event
{ erase_event, imbue_event, copyfmt_event };
A type definition:
CONSTANT DEFINITIONS
The following list shows the predefined values and the effect gained by setting a certain
value for the nested bitmask type fmtflags. The numeric values are implementation-dependent.
boolalpha
Extracts and inserts values of bool type in alphabetical format. Relevant for parsing and
formatting Boolean values.
showbase
Output contains a prefix that indicates the numeric base of an integral value. Relevant for
formatting integral values and parsing and formatting monetary values.
showpoint
Output always contains a point character ('.') as radix separator for floating-point numbers
independent of the actual locale. Relevant for formatting floating-point values.
showpos
Output contains a plus character ('+') in front of non-negative numeric output. Relevant
for formatting numerical values.
skipws
Input operations skip leading whitespace characters. Relevant for all input operations.
unitbuf
Output is flushed after each output operation. Relevant for all output operations.
uppercase
Output operations replace certain lowercase letters with their respective uppercase equiv-
alents. Relevant for formatting numerical values.
internal
Output contains fill characters at a designated internal point of certain output operations.
Relevant for formatting numerical and monetary values.
left
Output is left-adjusted; i.e., fill characters are added at the right of certain output
operations. Relevant for formatting numerical and monetary values.
right
Output is right-adjusted; i.e., fill characters are added at the left of certain output
operations. Relevant for formatting numerical and monetary values.
dec
Extracts and inserts values of any integer type in decimal base. Relevant for parsing and
formatting integral values.
hex
Extracts and inserts values of any integer type in hexadecimal base. Relevant for parsing
and formatting integral values.
oct
Extracts and inserts values of any integer type in octal base. Relevant for parsing and for-
matting integral values.
fixed
scientific
adjustfield
Defined as left | right | internal. Can be used as a mask to clear the output adjust-
ment specification.
basefield
Defined as dec | oct | hex. Can be used as a mask to clear the numeric base specification.
floatfield
Defined as scientific | fixed. Can be used as a mask to clear the floating-point number
output specification.
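The group masks adjustfield and basefield are typically passed as the second argument of setf(), so that the old flag of the group is cleared before the new one is set. The sketch below shows this for adjustment and numeric base; the helper names adjusted and based are hypothetical.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats -42 in a field of width 6 with the given adjustment flag.
std::string adjusted(std::ios_base::fmtflags adjust) {
    std::ostringstream os;
    os.setf(adjust, std::ios_base::adjustfield);  // clear the group, then set one flag
    os << std::setw(6) << -42;
    return os.str();
}

// Hypothetical helper: formats 255 in the given base, with a base prefix.
std::string based(std::ios_base::fmtflags base) {
    std::ostringstream os;
    os.setf(std::ios_base::showbase);
    os.setf(base, std::ios_base::basefield);
    os << 255;
    return os.str();
}
```

adjusted(ios_base::left) yields "-42   ", adjusted(ios_base::right) yields "   -42", and adjusted(ios_base::internal) yields "-   42". based(ios_base::hex) yields "0xff" and based(ios_base::oct) yields "0377".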
The following list shows the predefined values and their semantics for the nested bitmask
type iostate. The numeric values are implementation-dependent, except that goodbit
must be 0.
badbit
eofbit
Indicates that an input operation reached the end of the input sequence.
failbit
Indicates that either an input operation failed to read the expected input or an output
operation was unable to generate the desired output.
goodbit
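The iostate values listed above can be observed after an extraction. The following sketch assumes the rdstate() member function of basic_ios, which is not described in this entry; the helper name state_after_read is hypothetical.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: extracts one int from text and returns the resulting stream state.
std::ios_base::iostate state_after_read(const std::string& text) {
    std::istringstream in(text);
    int n = 0;
    in >> n;
    return in.rdstate();
}
```

For the input "42 " the state is goodbit, for "42" it is eofbit (the end of input was reached while reading the digits), for "abc" it is failbit, and for an empty input it is eofbit | failbit.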
The following list shows the predefined values and their semantics for the nested bitmask
type openmode. The numeric values are implementation-dependent:
app
ate
binary
in
Open for input.
out
trunc
POSITIONING CONSTANTS
seekdir is an enumerated type that represents a specified position used for seeking in a
file stream. The following list shows its values and their semantics:
beg
cur
end
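The three seekdir values can be sketched as follows. For the sake of a self-contained example, a string stream is used instead of a file stream, and the seekg() and get() members of basic_istream are assumed; the helper name char_at is hypothetical.

```cpp
#include <ios>
#include <sstream>

// Hypothetical helper: repositions the stream by off relative to dir
// and returns the character found at the new position.
char char_at(std::istringstream& in, std::streamoff off, std::ios_base::seekdir dir) {
    in.seekg(off, dir);
    return static_cast<char>(in.get());
}
```

For a stream constructed from "abcdef", char_at(in, 2, ios_base::beg) returns 'c'; a subsequent char_at(in, 1, ios_base::cur) skips one character and returns 'e'; char_at(in, -1, ios_base::end) returns 'f'.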
CALLBACK EVENT
event is an enumerated type that represents an event that causes a callback. The follow-
ing list shows its values and their semantics:
copyfmt_event
Indicates that new member data has been set by a call to basic_ios<charT,
traits>::copyfmt(basic_ios<charT,traits>&).
erase_event
Indicates that the stream is going to be destroyed or new member data are going to be set
by a call to basic_ios<charT,traits>::copyfmt(const basic_ios&).
imbue_event
Destroys an object of class ios_base. Each registered callback cb and its index ix are
called as (*cb)(erase_event, *this, ix) at such a point during destruction that any
ios_base member function called by cb has defined results.
Sets the control information for formatting and parsing to fmtfg and returns the previously
used control information.
Returns the precision, e.g., the number of digits behind the radix separator, that should be
generated for certain output operations.
Sets the precision, e.g., the number of digits behind the radix separator, that should be
generated for certain output operations. Returns the previously used precision.
Sets the control information for formatting and parsing to fmtfg and returns the previously
used control information, i.e., has the same effect as calling this->flags(fmtfg)
and returning the result from the call.
Clears the mask specified by mask in the control information for formatting and parsing
and then sets the control information to fmtfg & mask, i.e., has the same effect as first
calling this->unsetf(mask) and then this->setf(fmtfg & mask). Returns the
previously used control information.
Clears the mask specified by fmtfg in the control information for formatting and parsing,
i.e., has the same effect as calling this->flags((this->flags()) & (~fmtfg)).
Returns the minimum field width (counted as the number of characters) that should be
generated for certain output operations.
Sets the minimum field width (counted as the number of characters) that should be gener-
ated for certain output operations. Returns the previously used minimum field width.
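The format control operations can be sketched as follows. The example relies on two facts of the standard library that are not stated in this entry: the field width is reset to zero after each formatted insertion, and the default precision is 6. The helper names are hypothetical.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: the width applies to the next insertion only.
std::string width_demo() {
    std::ostringstream os;
    os.width(5);
    os << 42 << 7;  // 42 is padded to five characters, 7 is not padded
    return os.str();
}

// Hypothetical helper: the precision stays in effect until it is changed again.
std::string precision_demo() {
    std::ostringstream os;
    os.setf(std::ios_base::fixed, std::ios_base::floatfield);
    std::streamsize old = os.precision(3);  // returns the previous precision
    os << 3.14159 << ' ' << 2.5;
    os.precision(old);
    return os.str();
}

// Hypothetical helper: flags() can save and restore the complete flag set.
bool flags_restored() {
    std::ostringstream os;
    std::ios_base::fmtflags saved = os.flags();
    os.setf(std::ios_base::hex, std::ios_base::basefield);
    os.flags(saved);
    return os.flags() == saved;
}
```

width_demo() yields "   427" and precision_demo() yields "3.142 2.500".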
LOCALES
If no locale was previously imbued, a copy of the global C++ locale is returned. Otherwise
the previously imbued locale is returned.
Sets a locale used in the stream to loc. After that, all callbacks cb are called together with
their index ix as (*cb) (imbue_event,*this,ix). If no locale was previously
imbued, a copy of the global C++ locale is returned. Otherwise the previously imbued
locale is returned.
USER STORAGE
Used to get and set the user storage of type long at index ix.
Used to get and set the user storage of type void* at index ix.
CALLBACKS
Registers a callback function cb together with its index ix. The callback is invoked when
either ios_base::~ios_base(), or ios_base::imbue(locale&), or basic_ios
<charT,traits>::copyfmt(basic_ios<charT,traits>&) is called.
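User storage and callbacks are usually combined: xalloc() reserves an index, iword() or pword() holds the data, and a callback keeps the data up to date. The following is a minimal sketch under that assumption; it counts how often a locale is imbued, and the names imbue_count_index, count_imbue, and imbue_count_after are hypothetical.

```cpp
#include <ios>
#include <locale>
#include <sstream>

// Hypothetical index into the user storage, reserved once for all streams.
static const int imbue_count_index = std::ios_base::xalloc();

// Callback: increments the counter stored in iword(index) on every imbue_event.
// Other events (erase_event, copyfmt_event) are ignored.
void count_imbue(std::ios_base::event ev, std::ios_base& s, int index) {
    if (ev == std::ios_base::imbue_event)
        ++s.iword(index);
}

// Hypothetical helper: imbues a locale the given number of times
// and returns the counter maintained by the callback.
long imbue_count_after(int calls) {
    std::ostringstream os;
    os.register_callback(count_imbue, imbue_count_index);
    for (int i = 0; i < calls; ++i)
        os.imbue(std::locale::classic());
    return os.iword(imbue_count_index);
}
```

imbue_count_after(3) returns 3, because each call to imbue() triggers the registered callback once.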
MISCELLANEOUS
If any input or output operation has occurred prior to this call, the effect is implementation-
defined. Otherwise, called with a false argument, allows the standard C++ stream
objects to operate independently of their standard C counterparts. This could improve
efficiency. Called with a true argument, the standard C++ stream objects are synchronized
with their respective C counterparts, which is the default behavior. The return value indicates
the previous setting.
ios_base();
Constructs an object of type ios_base. Its members have indeterminate values after the
construction.
GLOBAL FUNCTIONS
FMTFLAGS MANIPULATORS
ADJUSTFIELD MANIPULATORS
BASEFIELD MANIPULATORS
FLOATFIELD MANIPULATORS
manipulators
HEADER
header: <iomanip>
FUNCTIONS
DESCRIPTION
The type smanip is implementation-specific, and the standard allows that it might be a
different type for each manipulator.
Additional requirements are that, for a manipulator m that returns smanip and an
object os of type basic_ostream<charT,traits>, the operation os << m(p); triggers
the invocation of the functionality described for this specific manipulator; and for an
object is of type basic_istream<charT,traits>, the operation is >> m(p); triggers
the invocation of the functionality described for this specific manipulator.
Manipulator that calls s.setf(ios_base::fmtflags(0), mask) when applied to an
object s of type ios_base.
Manipulator that calls s.setf(mask) when applied to an object s of type ios_base.
s.setf(base == 8 ? ios_base::oct :
base == 10 ? ios_base::dec :
base == 16 ? ios_base::hex :
ios_base::fmtflags(0),
ios_base::basefield)
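The behavior described above matches the standard <iomanip> manipulators resetiosflags, setiosflags, and setbase; since the function headings are missing here, that mapping is an assumption. A minimal sketch with hypothetical helper names:

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats n after applying setbase(base).
// Any base other than 8, 10, or 16 clears basefield, which results in decimal output.
std::string with_setbase(int base, int n) {
    std::ostringstream os;
    os << std::setbase(base) << n;
    return os.str();
}

// Hypothetical helper: sets showpos for the first value and clears it for the second.
std::string showpos_roundtrip() {
    std::ostringstream os;
    os << std::setiosflags(std::ios_base::showpos) << 5 << ' '
       << std::resetiosflags(std::ios_base::showpos) << 5;
    return os.str();
}
```

with_setbase(16, 255) yields "ff", with_setbase(8, 8) yields "10", with_setbase(7, 255) yields "255", and showpos_roundtrip() yields "+5 5".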
4
STREAM ITERATORS
<iterator>
DESCRIPTION
This header file contains all declarations for stream iterators and stream buffer iterators. It
also contains all declarations for the container iterators.
SYNOPSIS
The synopsis contains only the part of the header file <iterator> that is relevant for
stream and stream buffer iterators.
namespace std {
// primitives:
template<class Iterator> struct iterator_traits;
template<class T> struct iterator_traits<T*>;
// stream iterators:
template <class T, class charT = char,class traits = char_traits<charT>,
class Distance = ptrdiff_t>
class istream_iterator;
template <class T, class charT, class traits, class Distance>
bool operator==(const istream_iterator<T,charT,traits,Distance>& x,
const istream_iterator<T,charT,traits,Distance>& y);
template <class T, class charT, class traits, class Distance>
bool operator!=(const istream_iterator<T, charT,traits,Distance>& x,
const istream_iterator<T,charT,traits,Distance>& y);
istreambuf_iterator<charT,traits>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template<class charT, class traits = char_traits<charT> >
class istreambuf_iterator
: public iterator<input_iterator_tag, charT,
typename traits::off_type, charT*, charT&> {
public:
// type definitions:
typedef charT char_type;
typedef traits traits_type;
typedef typename traits::int_type int_type;
typedef basic_streambuf<charT,traits> streambuf_type;
typedef basic_istream<charT, traits> istream_type;
// class definition:
// class proxy; exposition only
// constructors:
istreambuf_iterator() throw();
istreambuf_iterator(istream_type& s) throw();
istreambuf_iterator(streambuf_type* s) throw();
istreambuf_iterator(const proxy& p) throw();
// iterator operations:
charT operator*() const;
istreambuf_iterator<charT,traits>& operator++();
proxy operator++(int);
// miscellaneous:
bool equal(istreambuf_iterator& b);
private:
// streambuf_type* sbuf_; exposition only
};
// global operators:
template <class charT, class traits>
bool operator==(const istreambuf_iterator<charT,traits>& a,
const istreambuf_iterator<charT,traits>& b);
template <class charT, class traits>
bool operator!=(const istreambuf_iterator<charT,traits>& a,
const istreambuf_iterator<charT,traits>& b);
Class proxy provides a temporary placeholder as the return value of the postincrement
operator operator++(int). It keeps the character pointed to by the previous
value of the iterator for some possible future access to get the character and allows the
creation of an istreambuf_iterator from the proxy object that uses the proxy object's
stream buffer.
istreambuf_iterator(istream_type& s) throw();
istreambuf_iterator(streambuf_type* s) throw();
ITERATOR OPERATIONS
Provides the current input character by returning the character obtained via the stream
buffer member sgetc().
The result of operator*() on an end-of-stream iterator is undefined.
istreambuf_iterator<charT,traits>& operator++();
Advances the iterator to the next input character by calling the stream buffer
member sbumpc(). If the end of stream is reached, the iterator becomes equal to the
end-of-stream iterator value. Returns *this.
Advances the iterator to the next input character by calling the stream buffer
member sbumpc(). If the end of stream is reached, the iterator becomes equal to the
end-of-stream iterator value. Returns a proxy object constructed as proxy(sbuf_->
sbumpc(), sbuf_), where sbuf_ is a pointer to the stream buffer from which the iterator
reads.
An implementation of the istreambuf_iterator is permitted to provide equivalent
functionality without providing a proxy class. In that case, an iterator (or iterator-like
object) must be returned that provides the character read via sbumpc() when it is
dereferenced and points to the previous stream buffer, that is, to the stream buffer before the
read operation.
MISCELLANEOUS
Returns true if, and only if, both iterators are at end of stream, or if neither is at end of
stream, regardless of what stream buffer object they use.
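A common use of istreambuf_iterator is to read all remaining characters of a stream, including whitespace, into a string. The sketch below pairs an iterator for the stream with a default-constructed end-of-stream iterator; the helper name slurp is hypothetical.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: copies every remaining character of in into a string.
// No whitespace is skipped, because the iterator reads from the stream buffer directly.
std::string slurp(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

For a stream containing "a b\n c", slurp() returns exactly "a b\n c".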
GLOBAL OPERATORS
istream_iterator<T,charT,traits,Distance>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class T, class charT = char,
class traits = char_traits<charT>,
class Distance = ptrdiff_t>
class istream_iterator
: public iterator<input_iterator_tag,T,Distance,const T*,const T&> {
public:
// type definitions:
typedef charT char_type;
typedef traits traits_type;
typedef basic_istream<charT,traits> istream_type;
// constructors/destructor:
istream_iterator();
istream_iterator(istream_type& s);
istream_iterator(const istream_iterator<T,charT,
traits, Distance>& x);
~istream_iterator();
// iterator operations:
const T& operator*() const;
const T* operator->() const;
istream_iterator<T,charT,traits,Distance>& operator++();
istream_iterator<T,charT,traits,Distance> operator++(int);
private:
// basic_istream<charT,traits>* in_stream; exposition only
// T value; exposition only
};
// global operators:
template <class T, class charT, class traits, class Distance>
bool operator==(const istream_iterator<T,charT,traits,Distance>& x,
const istream_iterator<T,charT,traits,Distance>& y);
template <class T, class charT, class traits, class Distance>
bool operator!=(const istream_iterator<T,charT,traits,Distance>& x,
const istream_iterator<T,charT,traits,Distance>& y);
istream_iterator(istream_type& s);
Constructs an istream_iterator that reads from the stream s. The first value may be
read during construction or the first time it is referenced.
~istream_iterator();
ITERATOR OPERATIONS
istream_iterator<T,charT,traits,Distance>& operator++();
Reads the next value from the stream using operator>>() and stores the value inter-
nally. Returns *this.
Reads the next value from the stream using operator>>() and stores the value internally.
Returns a copy of the iterator as it was before the read operation.
GLOBAL OPERATORS
Returns true if both iterators are end-of-stream iterators or both iterators were constructed
from the same stream.
Returns !(x == y).
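The following sketch shows the typical pairing of an istream_iterator with a default-constructed end-of-stream iterator. The helper name read_ints is hypothetical.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical helper: extracts ints via operator>>() until the extraction fails
// or the end of the stream is reached.
std::vector<int> read_ints(std::istream& in) {
    std::istream_iterator<int> first(in), last;  // last is the end-of-stream iterator
    return std::vector<int>(first, last);
}
```

For the input "1 2 3" the result holds 1, 2, and 3. For "4 5 x 6" the result holds only 4 and 5, because the failed extraction of "x" turns the iterator into an end-of-stream iterator.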
iterator<Category,T,Distance,Pointer,Reference>
CLASS TEMPLATE
DESCRIPTION
iterator is the base class of all iterator types and is used to ease the definition of
required types for new iterators. It provides a number of type definitions that are used by
derived classes.
SYNOPSIS
namespace std {
template<class Iterator> struct iterator_traits {
typedef typename Iterator::difference_type difference_type;
typedef typename Iterator::value_type value_type;
typedef typename Iterator::pointer pointer;
typedef typename Iterator::reference reference;
typedef typename Iterator::iterator_category iterator_category;
};
iterator category tags
header: <iterator>
CLASS TEMPLATES
struct forward_iterator_tag
: public input_iterator_tag {};
struct random_access_iterator_tag
: public bidirectional_iterator_tag {};
DESCRIPTION
An iterator category describes an iterator’s capabilities. The iterator category tag classes
are used for algorithm selection, so that a template function can find out what is the most
specific category of its iterator argument and can select the most efficient algorithm at
compile time.
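The selection mechanism described above is often called tag dispatch. The sketch below overloads a hypothetical function category_impl on two tag types and lets iterator_traits pick the overload at compile time; all function names are illustrative.

```cpp
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Fallback for every iterator whose tag converts to input_iterator_tag.
template <class It>
std::string category_impl(It, std::input_iterator_tag) { return "input"; }

// More specific overload for random-access iterators.
template <class It>
std::string category_impl(It, std::random_access_iterator_tag) { return "random access"; }

// Dispatches on the iterator's category tag; overload resolution picks the
// most specific overload that matches.
template <class It>
std::string category(It it) {
    typedef typename std::iterator_traits<It>::iterator_category tag;
    return category_impl(it, tag());
}
```

A vector iterator selects the "random access" overload. A list iterator has the tag bidirectional_iterator_tag, which is derived from input_iterator_tag via forward_iterator_tag, so it selects the "input" overload.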
ostreambuf_iterator<charT,traits>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class charT, class traits = char_traits<charT> >
class ostreambuf_iterator:
public iterator<output_iterator_tag, void, void, void, void> {
public:
// type definitions:
typedef charT char_type;
typedef traits traits_type;
typedef basic_streambuf<charT,traits> streambuf_type;
typedef basic_ostream<charT, traits> ostream_type;
// constructors:
ostreambuf_iterator(ostream_type& s) throw();
ostreambuf_iterator(streambuf_type* s) throw();
ostreambuf_iterator& operator=(charT c);
// iterator operations:
ostreambuf_iterator& operator*();
ostreambuf_iterator& operator++();
ostreambuf_iterator& operator++(int);
// miscellaneous:
bool failed() const throw();
private:
// streambuf_type* sbuf_; exposition only
};
ostreambuf_iterator(ostream_type& s) throw();
ostreambuf_iterator(streambuf_type* s) throw();
ITERATOR OPERATIONS
If failed() yields false, writes the character c to the stream buffer by calling the
stream buffer member sputc(c); otherwise has no effect. Returns *this.
ostreambuf_iterator& operator*();
Returns *this.
ostreambuf_iterator& operator++();
Returns *this.
Returns *this.
MISCELLANEOUS
Returns true if, in any previous assignment to the iterator, the end of the stream was
reached.
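An ostreambuf_iterator is typically used as the destination of an algorithm. The following sketch copies a string to a stream character by character and reports whether all writes succeeded; the helper name copy_chars is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes text to the stream buffer of os.
// Returns true if no write failed.
bool copy_chars(const std::string& text, std::ostream& os) {
    std::ostreambuf_iterator<char> out(os);
    out = std::copy(text.begin(), text.end(), out);  // keep the iterator that did the writing
    return !out.failed();
}
```

The iterator returned by std::copy is assigned back because failed() reports the state of the iterator copy that performed the writes.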
ostream_iterator<T,charT,traits,Distance>
CLASS TEMPLATE
DESCRIPTION
SYNOPSIS
namespace std {
template <class T, class charT = char,
class traits = char_traits<charT> >
class ostream_iterator
: public iterator<output_iterator_tag, void, void, void, void> {
public:
// type definitions:
typedef charT char_type;
typedef traits traits_type;
typedef basic_ostream<charT,traits> ostream_type;
// construction/destruction/assignment
ostream_iterator(ostream_type& s);
ostream_iterator(ostream_type& s, const charT* delimiter);
ostream_iterator(const ostream_iterator<T,charT,traits>& x);
~ostream_iterator();
ostream_iterator<T,charT,traits>& operator=(const T& value);
// iterator operations:
ostream_iterator<T,charT,traits>& operator* ();
ostream_iterator<T,charT,traits>& operator++();
ostream_iterator<T,charT,traits>& operator++(int);
private:
// basic_ostream<charT,traits>* out_stream; exposition only
// const char* delim; exposition only
};
ostream_iterator(ostream_type& s);
~ostream_iterator();
ostream_iterator<T,charT,traits>& operator=(const T& value);
Writes value to the stream using operator<<(), followed by the delimiter string if one
was specified. Returns *this.
ITERATOR OPERATIONS
ostream_iterator<T,charT,traits>& operator*();
Returns *this.
ostream_iterator<T,charT,traits>& operator++();
Returns *this.
ostream_iterator<T,charT,traits>& operator++(int);
Returns *this.
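Since operator*() and both increment operators simply return *this, all the work happens in the assignment operator. The sketch below shows the effect when an algorithm writes through an ostream_iterator with a delimiter; the helper name join is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: writes every element of v to a string stream,
// each followed by the delimiter string.
std::string join(const std::vector<int>& v, const char* delim) {
    std::ostringstream os;
    std::copy(v.begin(), v.end(), std::ostream_iterator<int>(os, delim));
    return os.str();
}
```

For the elements 1, 2, 3 and the delimiter ", " the result is "1, 2, 3, ". Note that the delimiter also follows the last element.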
5
OTHER I/O OPERATIONS
bitset<N>
FILE NAME
<bitset>
DESCRIPTION
The header file contains a template class and several related functions for representing
and manipulating fixed-size sequences of bits, including the following I/O operations:
GLOBAL OPERATORS
Extracts up to N (single-byte) characters from the input stream is. Stores these characters
in a temporary object str of type string, then evaluates the expression x =
bitset<N>(str). Characters are extracted and stored until any of the following occurs:
• N characters have been extracted and stored;
If no characters are stored in str, the failbit is set. Returns a reference to the input
stream is.
Returns the string representation of the bitset object as if obtained by its to_string()
member function; that is, the operation returns os <<
x.template to_string<charT,traits,allocator<charT>>().
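The two bitset operators can be sketched as follows for N = 4. The helper names parse_bits and show_bits are hypothetical.

```cpp
#include <bitset>
#include <sstream>
#include <string>

// Hypothetical helper: extracts a bitset<4> from text; ok reports whether
// the extraction succeeded (failbit not set).
std::bitset<4> parse_bits(const std::string& text, bool& ok) {
    std::istringstream in(text);
    std::bitset<4> b;
    in >> b;
    ok = !in.fail();
    return b;
}

// Hypothetical helper: inserts the bitset into a string stream.
std::string show_bits(const std::bitset<4>& b) {
    std::ostringstream os;
    os << b;
    return os.str();
}
```

parse_bits("1011", ok) yields the value 11 with ok set to true; for the input "x" no character is stored and ok is false. show_bits(std::bitset<4>(5)) yields "0101".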
complex<T>
FILE NAME
<complex>
DESCRIPTION
The header file contains a template class and numerous functions for representing and
manipulating complex numbers including the following I/O operations:
GLOBAL OPERATORS
Extracts a complex number x of the form u, (u), or (u,v), where u is the real part and v
is the imaginary part. The input values must be convertible to type T. If bad input is
encountered, the failbit is set. Returns a reference to the input stream is.
Inserts the complex number x onto the stream os as if by os << '(' << x.real() <<
"," << x.imag() << ')'; using the stream's format flags, precision, and locale. Returns
a reference to the output stream os.
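The accepted input forms and the output format for complex numbers can be sketched as follows. The helper names show and parse are hypothetical.

```cpp
#include <complex>
#include <sstream>
#include <string>

// Hypothetical helper: inserts z into a string stream.
std::string show(const std::complex<double>& z) {
    std::ostringstream os;
    os << z;
    return os.str();
}

// Hypothetical helper: extracts a complex number of the form u, (u), or (u,v).
std::complex<double> parse(const std::string& text) {
    std::istringstream in(text);
    std::complex<double> z;
    in >> z;
    return z;
}
```

show(std::complex<double>(1.5, -2.0)) yields "(1.5,-2)". parse("(3,4)") yields the number with real part 3 and imaginary part 4, whereas parse("7") and parse("(7)") both yield a real part of 7 and an imaginary part of 0.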
basic_string<charT,traits,Allocator>
FILE NAME
<string>
DESCRIPTION
The header file contains all declarations for strings, including the following I/O
operations:
GLOBAL OPERATORS
fill characters are added according to the adjustfield setting. Then the field width is reset
to zero and the sentry object is destroyed. Returns a reference to the output stream os.
The conditions are tested in the order shown. In any case, after the last character is
extracted, the sentry object is destroyed. If the function extracts no characters, the failbit is
set. Returns a reference to the input stream is.
This section describes comprehensively and in detail how the num_get facet’s member
functions parse character sequences and what the resulting numerical or bool values are.
The num_get facet is represented by the class template:
Appendix A: Parsing and Extraction of Numerical and bool Values
Input streams use these member functions for the implementation of the respective
stream extractors. Hence the following text describes not only the num_get facet but also
how the stream extractors parse the input and extract a value.
STAGE 1
A conversion specifier and an optional length modifier are determined according to
str.flags() and the type that will hold the extracted value. The conversion specifier
and length modifier used are the same as for the standard C function scanf(). They are
described in detail in section A.3, Conversion Specifiers and Length Modifiers below.
The conversion specifier is
%g, if the extracted value is stored as a double or long double,
The conversion specifiers for an integral value are listed below. The first specifier,
whose condition is true, applies.
%o, if (str.flags() & ios_base::basefield) == ios_base::oct
A length modifier is added to the conversion specifier according to the specified type:
h, for short and unsigned short
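The effect of the basefield setting on parsing can be observed through the stream extractors. The sketch below relies on the standard rule that an empty basefield selects the %i conversion, so that the prefix of the input determines the base; the helper name parse_long is hypothetical.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: extracts a long from text with the given basefield setting.
// Passing fmtflags(0) clears basefield. Returns -1 if the extraction fails.
long parse_long(const std::string& text, std::ios_base::fmtflags base) {
    std::istringstream in(text);
    in.setf(base, std::ios_base::basefield);
    long n = -1;
    in >> n;
    return in.fail() ? -1 : n;
}
```

With an empty basefield, "0x1F" is parsed as 31 and "010" as 8. With ios_base::dec, "010" is parsed as 10; with ios_base::hex, "ff" is parsed as 255; with ios_base::oct, "17" is parsed as 15.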
STAGE 2
As long as in != end, a character ct of type charT is extracted from in. A related character
c of type char is created according to the rules below; the variable loc, which is mentioned
in the descriptions below, is the locale contained in str, i.e., loc = str.getloc().
If stage 2 is not terminated, the input iterator is advanced by ++in, and pro-
cessing continues at the beginning of stage 2.
STAGE 3
If the sequence of chars accumulated in stage 2 caused the standard C function
scanf() parameterized with the conversion specifier from stage 1 to report an input
failure, ios_base::failbit is assigned to err. This is also done if a position of the
discarded thousands separators does not conform to the specification of
CONVERSION SPECIFIER
LENGTH MODIFIER
h—Together with the conversion specifiers %d, %i, the length modifier h indicates
that the extracted value is a short rather than an int. Together with the
conversion specifiers %o, %u, %x, %X, the length modifier h indicates that the
extracted value is an unsigned short rather than an unsigned int.
l—Together with the conversion specifiers %d, %i, the length modifier l indicates
that the extracted value is a long rather than an int. Together with the
conversion specifiers %o, %u, %x, %X, the length modifier l indicates that the
extracted value is an unsigned long rather than an unsigned int.
Formatting Numerical
and bool Values
This section describes comprehensively and in detail how the num_put facet’s member
functions format a numerical or bool value to a character sequence. The num_put facet is
represented by the class template:
These member functions are used for implementing the inserters of output streams for
numerical and bool values. Hence the following text describes not only the num_put
facet but also how the stream inserters format a numeric or bool value.
STAGE 1
A conversion specification formed by a conversion specifier and an optional qualifier,
length modifier, and precision specifier is determined according to str.flags() and the
type of the value val that is formatted. The conversion specifier, the qualifier, and the
length modifier used are the same as for the standard C function printf(). They are
described in detail in section B.3, Conversion Specifiers, Qualifiers, and Length Modifiers,
below.
The conversion specifier is %p if a value of type void* is formatted.
For the two floating-point types double and long double, the conversion specifiers
are
%G, otherwise
Both lists above are ordered; i.e., the first specifier whose condition is true applies.
An optional qualifier and a length modifier are added to the conversion specifier
according to the rules below. If the type that is formatted is an integral type; that is, either
long, unsigned long, double, or long double, the qualifier that is added is
If the type that is formatted is a floating-point type that is either double or long
double, the qualifier that is added is
STAGE 2
All characters from the sequence resulting from stage 1 are converted to charT according
to the following rules:
If the character is equal to the decimal point '.', the result of the conversion is
use_facet< numpunct<charT> >(str.getloc()).decimal_point();
otherwise the character is converted to a character of type charT by the ctype's
member function widen(); i.e., if c is a character from this sequence unequal
to '.', it is converted via use_facet< ctype<charT> >(loc).widen(c).
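The replacement of the decimal point can be made visible with a custom numpunct facet. The following is a minimal sketch; the facet comma_point and the helper format_with_comma are hypothetical and not part of the library.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: uses ',' instead of '.' as the decimal point.
struct comma_point : std::numpunct<char> {
protected:
    char do_decimal_point() const { return ','; }
};

// Hypothetical helper: formats d with a locale that contains the facet above.
std::string format_with_comma(double d) {
    std::ostringstream os;
    // The locale takes ownership of the facet object.
    os.imbue(std::locale(os.getloc(), new comma_point));
    os << d;
    return os.str();
}
```

format_with_comma(3.5) yields "3,5": stage 1 produces "3.5" in printf() style, and stage 2 replaces the '.' with the facet's decimal point.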
If the value that is formatted is of type long or unsigned long, i.e., of
an integral type, thousands separators are inserted. They are placed accord-
ing to the specification returned by use_facet< numpunct<charT> >
STAGE 3
The location of padding is determined according to the rules below. The rules
are ordered; i.e., the first condition that becomes true determines the rule that
applies.
If (str.flags() & ios_base::adjustfield) == ios_base::left,
padding is done after the character sequence created in stage 2.
If (str.flags() & ios_base::adjustfield) == ios_base::right,
padding is done before the character sequence created in stage 2.
If (str.flags() & ios_base::adjustfield) == ios_base::internal, and a
sign occurs in the character sequence created in stage 2, padding is done after
the sign.
If (str.flags() & ios_base::adjustfield) == ios_base::internal, and the
character sequence created in stage 1 began with 0x or 0X,
padding is done after these two characters.
If none of the above conditions applies, padding is done before the character
sequence created in stage 2.
The character used for padding is the character passed to the put() member function
as parameter fill.
STAGE 4
The sequence of characters resulting from stage 3 is output to out; i.e., if c is a character
from the sequence, *out++ = c is performed. If at any point during this operation
out.failed() becomes true, the operation is terminated.
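The padding rules of stage 3 for internal adjustment can be observed through the stream inserters. The sketch below assumes the fill() member function of basic_ios for setting the padding character; the helper names are hypothetical.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: internal adjustment with a 0x prefix;
// padding is placed after the two prefix characters.
std::string pad_hex_internal(int n) {
    std::ostringstream os;
    os.setf(std::ios_base::hex, std::ios_base::basefield);
    os.setf(std::ios_base::internal, std::ios_base::adjustfield);
    os.setf(std::ios_base::showbase);
    os.fill('0');
    os.width(8);
    os << n;
    return os.str();
}

// Hypothetical helper: internal adjustment with a sign;
// padding is placed after the sign.
std::string pad_sign_internal(int n) {
    std::ostringstream os;
    os.setf(std::ios_base::internal, std::ios_base::adjustfield);
    os.setf(std::ios_base::showpos);
    os.fill('*');
    os.width(6);
    os << n;
    return os.str();
}
```

pad_hex_internal(255) yields "0x0000ff" and pad_sign_internal(42) yields "+***42".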
CONVERSION SPECIFIER
portion of the result, and the decimal-point character appears only if it is fol-
lowed by a digit.
%o, %u, %x, %X—The unsigned int value is converted to a sequence of characters
that represent an unsigned octal (%o), unsigned decimal (%u), or
unsigned hexadecimal (%x, %X) in the style dddd, where d is a digit character. To
represent hexadecimal digits larger than 9, the characters abcdef are used for
the conversion specifier %x and ABCDEF for the conversion specifier %X.
%d—The int value is converted to a sequence of characters that represent a
signed decimal in the style [-]dddd, where d is a digit character.
QUALIFIER
+—The resulting character sequence will always begin with a plus- or minus-
sign character.
LENGTH MODIFIER
L specifies that the following %e, %E, %f, %g, %G conversion specifier applies to
a long double value.
APPENDIX C
which both use the same conversion specifiers as the standard C function strftime(). The
first function parses the interval [pattern, pat_end) and interprets the characters
immediately following a '%' character as conversion specifiers. The second function
interprets the parameter form as a conversion specifier. The list below shows all valid
conversion specifiers and their semantics:
%a—is replaced by the abbreviated weekday name as known to the time_put
facet.
Appendix C: strftime() Conversion Specifiers
%A—is replaced by the full weekday name as known to the time_put facet.
%b—is replaced by the abbreviated month name as known to the time_put
facet.
%B—is replaced by the full month name as known to the time_put facet.
%c—is replaced by the time_put facet’s appropriate date and time
representation.
%d—is replaced by the day of the month as a decimal number, i.e., a number
between 01 and 31.
%H—is replaced by the hour (24-hour clock) as a decimal number, i.e., a num-
ber between 00 and 23.
%I—is replaced by the hour (12-hour clock) as a decimal number, i.e., a num-
ber between 01 and 12.
%j—is replaced by the day of the year as a decimal number, i.e., a number
between 001 and 366.
%m—is replaced by the month as a decimal number, i.e., a number between 01
and 12.
%M—is replaced by the minute as a decimal number, i.e., a number between 00
and 59.
%p—is replaced by the time_put facet’s equivalent to the AM/PM designations
associated with a 12-hour clock.
%S—is replaced by the second as a decimal number, i.e., a number between 00
and 61.
%U—is replaced by the week number of the year (the first Sunday as the first
day of week one) as a decimal number, i.e., a number between 00 and 53.
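The pattern-based put() function of the time_put facet can be sketched as follows. The helper names format_time and sample_time are hypothetical, and the example assumes the classic "C" locale, which is the global locale unless the program has changed it.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: formats t according to a strftime()-style pattern
// using the time_put facet of the stream's locale.
std::string format_time(const std::tm& t, const std::string& pattern) {
    std::ostringstream os;
    const std::time_put<char>& tp =
        std::use_facet<std::time_put<char> >(os.getloc());
    tp.put(std::ostreambuf_iterator<char>(os), os, os.fill(), &t,
           pattern.data(), pattern.data() + pattern.size());
    return os.str();
}

// Hypothetical sample value: Saturday, January 15, 2000, 13:05:09.
std::tm sample_time() {
    std::tm t = std::tm();
    t.tm_year = 100;  // years since 1900
    t.tm_mon = 0;     // months since January
    t.tm_mday = 15;
    t.tm_hour = 13;
    t.tm_min = 5;
    t.tm_sec = 9;
    t.tm_wday = 6;    // days since Sunday
    t.tm_yday = 14;   // days since January 1
    return t;
}
```

format_time(sample_time(), "%H:%M:%S") yields "13:05:09", and format_time(sample_time(), "%d/%m") yields "15/01". Characters that do not follow a '%' are copied to the output unchanged.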
580 Appendix D: Correspondences Between C Stdio and C++ IOStreams
Table D-1: File Stream Open Modes in IOStreams and Their Equivalents in C stdio
binary  in  out  trunc  app   C STDIO
             +                "w"
             +           +    "a"
             +     +          "w"
        +                     "r"
        +    +                "r+"
        +    +     +          "w+"
  +          +                "wb"
  +          +           +    "ab"
  +          +     +          "wb"
  +     +                     "rb"
  +     +    +                "r+b"
  +     +    +     +          "w+b"
Table D-2: Symbolic Stream Positions in IOStreams and Their Equivalents in C stdio
beg SEEK_SET
cur SEEK_CUR
end SEEK_END
APPENDIX E
Standard IOStreams stands in the tradition of the classic IOStreams library that has been
around since the first days of C++. Before the advent of the standard IOStreams, several
implementations of the classic IOStreams library were available to the C++ community,
all of which were similar, yet slightly different. One goal of the standardization was to
specify IOStreams formally, as well as to improve and enhance it. Potentially dangerous
features, like assignment of streams, were removed, and new capabilities, such as
internationalization support, were added. This appendix provides an overview of the
differences between the classic and the standard IOStreams, which is of particular interest
to developers who have existing IOStreams applications and want to migrate to
the standard IOStreams.
To make this appendix understandable even if you have not yet read the entire book,
we include short reviews of topics that are covered in detail elsewhere in the book. Take a
deeper look at the sections given as references if you would like further information.
Here is an overview of the major differences:
The standard IOStreams is a template taking the character type as a parameter.
The base class ios is split into character-type-dependent and character-type-
independent portions.
Standard IOStreams optionally throws exceptions.
Standard IOStreams is internationalized.
582 Appendix E: Differences Between Classic and Standard IOStreams
Additional virtual functions have been added to the stream buffer interface.
The classic IOStreams classes allowed input and output of text that could be repre-
sented as a sequence of narrow characters of type char. This was seen as a restriction,
because not all alphabets and their corresponding character encodings can be conve-
niently expressed in terms of narrow characters. Sequences of wide characters of type
wchar_t are needed to represent larger alphabets, like the Japanese one for instance. The
standards committee decided to turn the traditional IOStreams classes into class tem-
plates in order to eliminate the restriction of narrow-character I/O. The stream class tem-
plates take two template arguments: the character type and an associated character traits
type. Character types and trait types are described in greater detail in section 2.3, Charac-
ter Types and Character Traits. Here is a brief summary:
The character type is usually one of the built-in character types char or wchar_t.
The instantiations for the narrow-character type char are designed to cover the tradi-
tional functionality of classic IOStreams. The instantiations for the wide-character type
wchar_t operate on wide-character sequences and can convert them to external multibyte
character encodings. The character type can also be any other conceivable user-defined
type.¹
The character traits type describes the properties of the character type. These include
the end-of-file value, which is an integral constant called EOF for type char and a
constant called WEOF for type wchar_t, as well as the meaning of equality and comparison
of two characters.
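The following is a minimal sketch of how a traits type supplies this information, using the standard char_traits template. The helper names eof_matches_c_constants and count_equal are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <cstdio>   // EOF
#include <cwchar>   // WEOF
#include <string>   // std::char_traits

// True if the end-of-file values reported by the standard traits
// agree with the corresponding C constants.
inline bool eof_matches_c_constants()
{
    return std::char_traits<char>::eof() == EOF
        && std::char_traits<wchar_t>::eof() == WEOF;
}

// Counts the positions at which two sequences hold equal characters.
// Equality is decided by the traits type, not by operator==.
template <class charT, class Traits>
std::size_t count_equal(const charT* a, const charT* b, std::size_t n)
{
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (Traits::eq(a[i], b[i]))
            ++hits;
    return hits;
}
```

Because the notion of equality is taken from the Traits argument, a user-defined traits type could, for instance, make the comparison case-insensitive without any change to count_equal.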
For ease of use, and for backward compatibility, the standard defines type defini-
tions for the stream class templates instantiated with the character types char and
wchar_t. For type char these are
1. “User-defined” here stands for any character type that is not built into the language. A user-defined character
can be added by a library vendor as well as by a user.
Note that these typedefs define names identical to the class names in the traditional
IOStreams. In other words, there is still an ostream; the only difference is that it now
stands for a basic_ostream<char, char_traits<char> >.
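The equivalence between the classic names and the template instantiations can be checked at compile time. The sketch below assumes a compiler that supports static_assert and <type_traits>, both of which are newer than the language described in this book; the function ostream_is_char_instantiation is a hypothetical helper.

```cpp
#include <fstream>
#include <iostream>
#include <string>
#include <type_traits>

// Each classic name is a typedef for a stream template instantiated on char.
static_assert(std::is_same<std::ostream,
                  std::basic_ostream<char, std::char_traits<char> > >::value,
              "ostream names basic_ostream<char>");
static_assert(std::is_same<std::ifstream,
                  std::basic_ifstream<char, std::char_traits<char> > >::value,
              "ifstream names basic_ifstream<char>");

// The wide-character counterparts carry a leading w.
static_assert(std::is_same<std::wostream,
                  std::basic_ostream<wchar_t, std::char_traits<wchar_t> > >::value,
              "wostream names basic_ostream<wchar_t>");

// Run-time view of the first check, for use in ordinary code.
inline bool ostream_is_char_instantiation()
{
    return std::is_same<std::ostream, std::basic_ostream<char> >::value;
}
```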
While these typedefs help to migrate an implementation that uses classic IOStreams
to the use of standard IOStreams, some points still need attention:
As already mentioned, all definitions of the standard IOStreams reside in the namespace
::std. This is also true for the typedefs listed above; as a result, either a using
declaration or a using directive must be used to refer to the typedefs, or the typedefs
must be qualified with their namespace, e.g., ::std::fstream.
Since these names are now typedefs rather than classes, they cannot be used in forward
declarations, as they could with the classic IOStreams, where these names denoted classes.
We recommend including the header file <iosfwd> instead.
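As a sketch of this recommendation, the block below shows a declaration that needs only <iosfwd>, followed by the definition that needs the full stream headers. In a real project the two parts would live in a header and an implementation file. Point and to_text are hypothetical names used for illustration.

```cpp
// --- header part: a declaration of std::ostream is sufficient ---
#include <iosfwd>

struct Point { int x; int y; };
std::ostream& operator<<(std::ostream& os, const Point& p);

// --- implementation part: the complete stream classes are needed ---
#include <ostream>
#include <sstream>
#include <string>

std::ostream& operator<<(std::ostream& os, const Point& p)
{
    return os << '(' << p.x << ", " << p.y << ')';
}

// Helper that formats a Point through the inserter defined above.
inline std::string to_text(const Point& p)
{
    std::ostringstream out;
    out << p;
    return out.str();
}
```

A hand-written forward declaration such as "class ostream;" would not compile against the standard library, because ostream is no longer a class name.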
2. One might expect that the error-handling functionality, not only the flag definitions, would be contained in
ios_base because error handling is character-independent. However, error indication is done in
basic_ios<class charT, class Traits>, because ios_base is also used in the locale section of the standard
library, where it serves as an abstraction for passing formatting information to the locale. If ios_base contained
the error handling, which in the standard IOStreams includes the indication of errors by throwing exceptions
(see subsequent sections for details), these exceptions could also be raised by the standard locale. This effect was
neither intended nor acceptable. Hence, ios_base contains only the definition of all flags for error indication;
the raising of exceptions and the indication of error states are located in basic_ios<class charT, class
Traits>.
ing. It also manages the user-allocable storage (iword/pword), handles registration and
invocation of callbacks, and allows imbuing of locales.
The advantage of splitting class ios into class ios_base and class template
basic_ios<class charT, class Traits> is that all behavior independent of the
template parameters is factored out into a nontemplate. This minimizes the binary code
size of the library as well as user programs.
Besides the split, the behavior of the new stream base classes differs from the behavior
of the classic IOStreams base class ios. Stream callbacks are a completely new feature
of the standard IOStreams provided by ios_base. They help implement proper resource
management when streams get copied via copyfmt() or destroyed by their destructor.
Another point is the open modes. While implementations of the classic IOStreams typically
offered a nocreate open mode in their ios base class, this mode no longer exists in the
standard IOStreams.
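The callback mechanism can be sketched as follows. A function registered via ios_base::register_callback() is notified when a locale is imbued, when the stream is the target of copyfmt(), and when the stream is destroyed. EventLog, log_event, and run_callback_demo are hypothetical names; the counters only make the notifications observable.

```cpp
#include <ios>
#include <locale>
#include <sstream>

// Counters for the three kinds of events a callback can receive.
struct EventLog { int erase; int imbue; int copyfmt; };

static EventLog g_log = {0, 0, 0};

// Signature required by ios_base::register_callback().
void log_event(std::ios_base::event ev, std::ios_base&, int)
{
    switch (ev) {
    case std::ios_base::erase_event:   ++g_log.erase;   break;
    case std::ios_base::imbue_event:   ++g_log.imbue;   break;
    case std::ios_base::copyfmt_event: ++g_log.copyfmt; break;
    }
}

inline EventLog run_callback_demo()
{
    g_log = EventLog();                    // reset all counters to zero
    {
        std::ostringstream a, b;
        a.register_callback(log_event, 0);
        a.imbue(std::locale::classic());   // notifies a with imbue_event
        b.copyfmt(a);                      // copies a's callback list to b,
                                           // then notifies b with copyfmt_event
    }   // each destructor notifies its own stream with erase_event
    return g_log;
}
```

A callback that manages memory referenced through pword() would typically duplicate the resource on copyfmt_event and release it on erase_event.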
int value;
// some calculation
cout << "The calculated value is: " << value << '\n';
if (!cout)
    handle_error();
As convenient as it may be, it has one drawback: In the example above, the stream state
accumulates the errors of all output operations in the statement, so it is not possible
to check it after each individual output operation. C++ exceptions can help in this
situation, because they allow a more active error indication. For this reason the standard
IOStreams optionally allows error indication via exceptions. The mechanisms for error
indication are described in greater detail in section 1.3, The Stream State. In particular,
IOStreams exceptions are explained in section 1.3.3, Catching Stream Exceptions. Here is
a brief review of both mechanisms:
In IOStreams each stream maintains a stream state that indicates the success or failure
of an operation. The stream state can either be good, or one of the following:
Errors are accumulated in the stream state and must be actively checked by calling
certain member functions such as good(), fail(), bad(), etc.
The standard IOStreams provides means for enabling or disabling exceptions. An
exception mask specifies which of the stream state flags should trigger an exception. If, for
instance, the fail bit is set in the exception mask, an operation that sets the fail bit in the
stream state will also raise an exception of type ios_base::failure. By default, all
exceptions are disabled. The user of IOStreams can actively enable exceptions by modifying
the exception mask. The stream classes offer the exceptions() member function for
retrieval and modification of the exception mask.
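A minimal sketch of both error indication mechanisms: the same failing extraction is performed once with the default (empty) exception mask and once with failbit and badbit enabled. The function name extraction_throws is hypothetical.

```cpp
#include <ios>
#include <sstream>

// Returns true if a failed extraction was reported via an exception.
inline bool extraction_throws(bool enable_exceptions)
{
    std::istringstream in("not-a-number");
    if (enable_exceptions)
        in.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    try {
        int value = 0;
        in >> value;    // no digits can be parsed, so failbit is set
    } catch (const std::ios_base::failure&) {
        return true;    // reached only if failbit is in the exception mask
    }
    return false;       // error is visible only through the stream state
}
```

With the default mask the caller must check the stream state, e.g., via fail(), to notice the error.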
Note that there is no guarantee that all exceptions will be suppressed, even if all bits
in the exception mask are turned off. Errors detected by the stream and the stream buffer
themselves are not indicated via exceptions if the exception mask does not allow it, but
exceptions raised by user-provided operations will be propagated. Examples of user-pro-
vided operations are overridden virtual functions of derived stream buffer classes, regis-
tered callback functions, and operations of user-defined locales and facets.
copy and assignment for objects of these classes, because there are no “right” semantics
for copying or assigning a stream with respect to its stream buffer. There are different pos-
sibilities, e.g., sharing the stream buffer after the assignment, flushing the stream buffer
during the assignment and then providing both streams with entirely independent
buffers, and so on. None of these possibilities is intuitively right, though. Consequently,
copying and assigning are prohibited.
On the other hand, there are situations in which streams need to be assigned. The most
convincing example is the wish to redirect standard output (or any of the other standard
I/O objects) by assigning a valid stream object to cout. In order to satisfy this
requirement, the classic IOStreams had the classes istream_withassign,
ostream_withassign, and iostream_withassign. They implemented a public copy
constructor and assignment operator, which let both streams share the stream buffer after
the copying or assignment. This imposed dependencies between the lifetimes of the two
stream objects involved in the copy construction or assignment, and the correct use of the
_withassign classes was rather complicated.
For this reason the classes istream_withassign, ostream_withassign, and
iostream_withassign no longer exist in the standard IOStreams. To perform
operations equivalent to the copy constructor and the assignment operator of the old
_withassign classes, the user of the standard streams has to explicitly implement this
functionality. Standard streams have the following member functions defined in
basic_ios<class charT, class Traits>, that can be used for this purpose:
iostate rdstate(), which allows retrieval of the stream state
The correct use of these functions is discussed in detail in section 2.1.3, Copying and
Assignment of Streams.
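As an illustration of the redirection use case, here is a hedged sketch that lets cout temporarily share the stream buffer of a string stream. It assumes that rdbuf(), which is defined in basic_ios, is among the member functions meant above; capture_cout and say_hello are hypothetical names.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Calls emit() while std::cout writes into a string stream's buffer,
// and returns everything that was written during the call.
template <class Fn>
std::string capture_cout(Fn emit)
{
    std::ostringstream sink;
    // rdbuf(sb) installs sb and returns the previously used stream buffer.
    std::streambuf* original = std::cout.rdbuf(sink.rdbuf());
    emit();
    std::cout.rdbuf(original);   // restore before sink is destroyed
    return sink.str();
}

inline void say_hello() { std::cout << "hello"; }
```

The lifetime dependency mentioned for the _withassign classes is visible here as well: cout must get its own buffer back before sink goes out of scope. This sketch does not restore the buffer if emit() throws; production code would need a guard object for that.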
length) is available on some UNIX platforms and allows a file to be set to a defined
length; however, this feature was not directly supported by the traditional IOStreams, but
accessible only indirectly through the file descriptor.
The fd() function is omitted from the C++ standard. The simple reason is that the
C++ standard does not want to exclude operating systems without file descriptors from
providing a standard-conforming IOStreams library.
On the other hand, vendors of the standard C++ library are free to extend the library,
as long as these extensions do not conflict with the standard. Hence it is quite possible that
a functionality like fd() will be included as a nonstandard extension in some library
implementations.
and the string stream constructed this way was frozen. In the standard IOStreams the
string is not used as an internal buffer; only its content is copied into an independent
internal buffer area. Again, the internal buffer is not accessible from outside the string
stream, and freezing is not necessary.
nal device, but in contrast to underflow(), it consumes the character. The nonvirtual
stream buffer functions sgetc() and sbumpc() in the standard IOStreams call different
functions: sgetc() calls underflow(), as it used to do in the classic IOStreams;
sbumpc() was changed and now calls uflow(). With these changes, unbuffered stream
buffers too can be derived, because sbumpc() does not require that an internal character
buffer exists. Instead, the functionality of peeking at the next character and consuming
it is virtual and can be overridden for a derived stream buffer class. The default
implementation of uflow() provided in the base class basic_streambuf is built on top of
underflow(): uflow() calls underflow() and increments the next pointer, which is a
reasonable default behavior for buffered stream buffers.
The net effect of this change to the stream buffer base class is that for derived stream
buffer classes with an internal character buffer, only underflow() must be overridden,
pretty much as it was in the classic IOStreams. See section 3.4.1.1.2, A Stream Buffer for
Buffered Character Transport, for an example. For a stream buffer class that does not
buffer the characters internally, underflow() and uflow() must be overridden. See
section 3.4.1.1.1, A Stream Buffer for Unbuffered Character Transport, for an example.
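The buffered case can be sketched as follows; this is not the example from section 3.4.1.1.2. The hypothetical class ChunkedBuf treats a string as the external device and refills a deliberately small internal buffer. Only underflow() is overridden; consumption of characters is left to the inherited uflow().

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Buffered input stream buffer; the string stands in for an external device.
class ChunkedBuf : public std::streambuf {
public:
    explicit ChunkedBuf(const std::string& device)
        : device_(device), pos_(0) {}

protected:
    // Refills the get area when it is exhausted and returns the next
    // character without consuming it.
    virtual int_type underflow()
    {
        if (gptr() < egptr())                  // characters still buffered
            return traits_type::to_int_type(*gptr());
        if (pos_ >= device_.size())            // device exhausted
            return traits_type::eof();
        std::size_t n = device_.size() - pos_;
        if (n > sizeof chunk_)
            n = sizeof chunk_;
        device_.copy(chunk_, n, pos_);         // transfer one chunk
        pos_ += n;
        setg(chunk_, chunk_, chunk_ + n);      // publish the new get area
        return traits_type::to_int_type(*gptr());
    }

private:
    std::string device_;
    std::size_t pos_;
    char chunk_[4];   // tiny on purpose, so that refills actually happen
};

// Reads the whole "device" through an istream on top of ChunkedBuf.
inline std::string read_all(const std::string& text)
{
    ChunkedBuf buf(text);
    std::istream in(&buf);
    std::string result;
    char c;
    while (in.get(c))
        result += c;
    return result;
}
```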
pbackfail()
In the classic IOStreams, the stream buffer function sputbackc(char) directly accessed
the internal character buffer. In the standard IOStreams, sputbackc(char) invokes a
virtual member function pbackfail() instead.
In the classic IOStreams, the nonvirtual member function sputbackc(char)
allowed a character to be put back into the input sequence. That character could either be
the previously extracted one or a different character that had not been obtained from the
external device. As sputbackc(char) was a nonvirtual function, it was impossible to
override its behavior in any derived stream buffer class. Moreover, sputbackc(char)
was implemented so that it directly accessed the internal character buffer in order to store
the putback character in the previous position. This implementation naturally does not
work for stream buffers that do not maintain an internal character array. It also does not
work if the next pointer points to the beginning of the internal character buffer, although
for certain stream buffers it might be possible to make available additional putback posi-
tions even in such a situation.
To overcome these limitations, the standards committee introduced a virtual member
function called pbackfail(), and sputbackc() calls this function if a character
different from the previously extracted one is put back into the input sequence. In such a
case, write access to the internal character buffer, or any equivalent functionality, is
provided by pbackfail(). sputbackc() also calls pbackfail() if the next pointer
points to the beginning of the internal character buffer. In that case, pbackfail() makes
available additional putback positions.
Additionally, a second nonvirtual member function sungetc() was introduced
into the stream buffer classes. Its functionality is a subset of the functionality of
sputbackc(), namely, putting back the previously extracted character into the input sequence.
sungetc() also calls the virtual pbackfail() function if no putback positions are
available.
A side effect of these additions is that under certain circumstances uflow(),
underflow(), and pbackfail() must be overridden in a derived stream buffer class,
because these three functions are semantically interdependent. Specifically,
underflow(), which is a peek without consumption of the character, must have the same
semantics as uflow(), which is a peek with consumption, followed by pbackfail(),
which represents ungetting the consumed character. The implementation of uflow() in
the stream base class basic_streambuf provides a reasonable default behavior that
works for buffered stream buffer classes: uflow(), the peek with consumption, is
implemented as underflow(), which is a peek without consumption, plus increment of the
get area's next pointer, which means consumption of the character. For this reason, it is
enough to override underflow() for a buffered stream buffer class. (See section
3.4.1.1.2, A Stream Buffer for Buffered Character Transport, for an example.) For
unbuffered stream buffers, in contrast, all three functions must be redefined. (See section
3.4.1.1.1, A Stream Buffer for Unbuffered Character Transport, for an example.)
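The interdependence of the three functions can be sketched with a stream buffer that never sets up a get area. This is not the example from section 3.4.1.1.1; UnbufferedBuf and peek_equals_consume_then_unget are hypothetical names, and a string again stands in for the external device.

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Unbuffered input: without a get area, sgetc(), sbumpc(), sungetc(),
// and sputbackc() all end up in the virtual functions below.
class UnbufferedBuf : public std::streambuf {
public:
    explicit UnbufferedBuf(const std::string& device)
        : device_(device), pos_(0) {}

protected:
    // Peek without consumption.
    virtual int_type underflow()
    {
        if (pos_ >= device_.size())
            return traits_type::eof();
        return traits_type::to_int_type(device_[pos_]);
    }

    // Peek with consumption.
    virtual int_type uflow()
    {
        int_type c = underflow();
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            ++pos_;
        return c;
    }

    // Unget: step back one position; store c if a character is supplied.
    virtual int_type pbackfail(int_type c = traits_type::eof())
    {
        if (pos_ == 0)
            return traits_type::eof();          // nothing to put back
        --pos_;
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);     // plain unget succeeded
        device_[pos_] = traits_type::to_char_type(c);
        return c;
    }

private:
    std::string device_;
    std::size_t pos_;
};

// Checks that underflow() equals uflow() followed by pbackfail().
inline bool peek_equals_consume_then_unget(const std::string& text)
{
    UnbufferedBuf buf(text);
    std::streambuf::int_type peeked = buf.sgetc();     // calls underflow()
    std::streambuf::int_type consumed = buf.sbumpc();  // calls uflow()
    buf.sungetc();                                     // calls pbackfail()
    return peeked == consumed && buf.sgetc() == peeked;
}
```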
Relationship Between C
and C++ Locales
The internationalization support defined in standard C and in standard C++ has a lot in
common. In particular, both provide services and information for the same range of
cultural differences. There is also a relationship between the global C++ locale and the
global C locale.
592 Appendix F: Relationship Between C and C++ Locales
There is a global locale in C++ too. It is used as a default locale for operations that do
not explicitly choose a locale object. Streams, for instance, are created with a snapshot of
the global C++ locale and use this snapshot unless another locale is explicitly attached.
The difference between the global locale in C and the global locale in C++ becomes
visible when we consider that the global locale can be changed. In C, when the current
global C locale is replaced by another locale, all internationalized operations from then on
use the new global locale and silently change their behavior accordingly. In C++, when
the current global C++ locale is replaced by another locale, all internationalized opera-
tions work as before. The change affects only new snapshots that are taken of the global
locale. Existing snapshots taken of the previous global locale are not affected. In other
words, snapshots of the global C++ locale are not transparent and do not reflect any
replacement of the global locale.
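The snapshot semantics can be sketched as follows. To stay independent of the locales installed on a particular system, the example builds a locale from the classic locale plus a hypothetical facet CommaPunct that uses a comma as the decimal point. SnapshotResult and format_around_global_change are hypothetical names as well.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Like the classic numpunct facet, but with ',' as the decimal point.
class CommaPunct : public std::numpunct<char> {
protected:
    virtual char do_decimal_point() const { return ','; }
};

struct SnapshotResult { std::string before; std::string after; };

inline SnapshotResult format_around_global_change()
{
    // Start from a known state and remember the previous global locale.
    std::locale old_global = std::locale::global(std::locale::classic());
    std::ostringstream early;    // takes a snapshot of the classic locale

    std::locale::global(std::locale(std::locale::classic(), new CommaPunct));
    std::ostringstream late;     // takes a snapshot of the new global locale

    early << 1.5;                // still formatted with the old snapshot
    late << 1.5;                 // formatted with the comma locale

    std::locale::global(old_global);   // restore the previous global locale
    SnapshotResult r = { early.str(), late.str() };
    return r;
}
```

The stream created before the replacement keeps formatting with the locale it was created with, although the global locale has changed in the meantime.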
Under certain circumstances, setting the global C++ locale has an effect on the global
C locale. If the C++ locale provided has a name, the setlocale() function from the
standard C library is called. As a result, C functions that are called from a C++ program
will be using the same global locale as the calling C++ functions. Note that the reverse is
not true. Setting the global C locale has no effect on the global C++ locale. If the global
C++ locale does not have a name, the effect on the C locale is implementation-defined.
The C++ locale model has the advantage that working with several cultural envi-
ronments in one program is much easier than in C. In C, the global locale must be
switched back and forth each time another cultural area is relevant. In C++, several locale
objects can be used in parallel, and each internationalized operation can use its own
locale. Here is an example:
cin.imbue(locale(""));          // the native locale
cout.imbue(locale::classic());
double f;
while (cin >> f) cout << f << endl;
Traditional C-style localization using the global locale is still easy. Simply use snap-
shots of the global C++ locale in all places:
locale::global(locale(""));     // set the global locale
// imbue it on all the std streams
cin.imbue(locale());
cout.imbue(locale());
double f;
while (cin >> f) cout << f << endl;
On a “German” computer, all input and output will be parsed and formatted
according to German conventions.
The most important difference between the C and C++ locales is that the C++ locale
also provides an extensible framework into which user-defined internationalization ser-
vices can be integrated. The C locale does not allow any extensions.
APPENDIX G
Bitmask types can be implemented in several ways. A bitmask type can be an integer
type, a bitset,¹ or an enumerated type that overloads certain operators. As a user,
1. bitset is a class template defined in the standard C++ library in the header file <bitset>. The template
class bitset<N> represents a fixed-sized sequence of N bits.
596 Appendix G: New C++ Features and Idioms
you need not know what it really is. All you need to care about is the name of the bitmask
type and the names of the associated predefined bit values. It is guaranteed that you can
set, clear, and test the flags as outlined above, and that you can assign combinations of bits
to an object of the bitmask type.
Bitmask types are used in various places throughout the standard C++ library.
Examples in IOStreams are the format flags, the stream state, and the open modes. All of
these bitmask types are implementation-defined.
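A short sketch of setting, clearing, and testing flags of a bitmask type, using the format flags as an example. The code relies only on the operators that every bitmask type must support, not on the underlying representation. The function names bitmask_demo and is_set are hypothetical.

```cpp
#include <ios>

// Builds a combination of format flags step by step.
inline std::ios_base::fmtflags bitmask_demo()
{
    std::ios_base::fmtflags flags =
        std::ios_base::hex | std::ios_base::showbase;  // combine two flags
    flags |= std::ios_base::uppercase;                 // set a flag
    flags &= ~std::ios_base::showbase;                 // clear a flag
    return flags;
}

// Tests whether all bits of 'bit' are set in 'flags'.
inline bool is_set(std::ios_base::fmtflags flags, std::ios_base::fmtflags bit)
{
    return (flags & bit) == bit;
}
```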
class MyString
{
public:
MyString(const char* cString);
MyString(unsigned int capacity);
The first one-argument constructor of class MyString takes a const char* argument
representing a C-style string, which is a pointer to an array of characters terminated
with a '\0'. An automatic conversion by means of this constructor is very convenient,
because it allows use of a C-style string wherever a MyString object is required. This is a
desired effect. For example, when we call the add() member function, we do not have to
explicitly construct a MyString object from the C-style string that we intend to pass as an
argument, but we can directly use the C-style string itself, as shown below:
MyString s("Hello ");
s.add("world !");
MyString s("test");
s.add(5); // oops, supposedly the C-string "5"
Let us suppose we forgot to put the integral value 5 into quotes so that it would be a
C-string literal. Unfortunately, the compiler will not catch this mistake, because it can
implicitly convert the (unsigned) int literal 5 to a MyString object using the second
constructor. In fact, a MyString object with the capacity to hold five characters,
containing no text, is constructed, and this temporary MyString object is passed to the
add() function. This effect is certainly not desired and illustrates the typical pitfall
that stems from implicit conversions based on one-argument constructors. Often, only some
constructors are meant as conversions, while others have entirely different semantics; and
use of the nonconverting constructors for implicit conversions is usually not desired, yet
it cannot be suppressed.
The function-specifier explicit was invented to remedy this shortcoming of the
language. A one-argument constructor that is specified explicit is not included in any
implicit type conversion that is automatically generated by the compiler. By means of the
explicit specification, a programmer can distinguish between one-argument construc-
tors that have conversion semantics and one-argument constructors that are not meant as
conversions. We can benefit from this language improvement by changing our example to
class MyString
{
public:
MyString(const char* cString);
explicit MyString(unsigned int capacity);
With this modification the previous example will not compile any longer and the
supposed error (passing an integer literal in lieu of a C-style string) will now be detected
by the compiler:
MyString s("test");
s.add(5); // error: cannot convert integer to MyString
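For reference, here is a complete, compilable sketch of the class discussed above. The member functions add() and str() as well as the data member text_ are assumptions made for illustration; the source shows only the two constructors. The static_assert checks require a compiler newer than the language described in this book.

```cpp
#include <string>
#include <type_traits>

class MyString {
public:
    // Converting constructor: a C-style string is accepted implicitly.
    MyString(const char* cString) : text_(cString) {}

    // Nonconverting constructor: must be invoked explicitly.
    explicit MyString(unsigned int capacity) { text_.reserve(capacity); }

    void add(const MyString& other) { text_ += other.text_; }
    const std::string& str() const { return text_; }

private:
    std::string text_;
};

// Implicit conversion works for C-style strings only.
static_assert(std::is_convertible<const char*, MyString>::value,
              "const char* converts implicitly");
static_assert(!std::is_convertible<unsigned int, MyString>::value,
              "unsigned int does not convert implicitly");
```

With this definition, s.add("world") compiles and s.add(5) is rejected, whereas s.add(MyString(5)) remains possible.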
Note that the use of explicit prevents use of the one-argument constructor in
implicit conversions performed by the compiler, but the programmer can still use it
for explicit conversions if needed. For illustration, let's assume we have a class X that can
be constructed from a MyString object:
class X {
public:
X(const MyString& s);
X x(MyString(5)); // okay
X x(static_cast<MyString>(5)); // okay
Here the programmer uses a static_cast to tell the compiler that it should do an
explicit conversion from integer to MyString based on the MyString’s explicit
constructor.
In sum, nonconverting constructors must be specified explicit; all one-argument
constructors without an explicit specification are treated as converting constructors,
and the compiler may use them in implicit conversion sequences.
Such a class template can be instantiated for all types T that can be ordered by means
of operator < and compared for equality by means of operator ==. We can, for instance,
use it for sorting or finding objects of type T as in the find() function below:
When we invoke the find () function, we can pass an object of type Order<T> as
an argument, shown below for type int:
int buf[1024];
// fill the integer buffer
if (find(buf, 1024, 0, Order<int>()))
// integer 0 found
string buf[1024];
// fill the string buffer
if (find(buf, 1024, string("xyz"), Order<string>()))
// string "xyz" found
However, it does not work as expected for C-style strings of type const char*,
because the operations provided by class Order<const char*> would compare the
pointers to the C-style strings for equality. Two C-style strings with the same content, but
stored at different memory locations, would not compare as equal. Hence a call to
find() with an Order<const char*> object will find only identical C-style strings, that
is, C-style strings with the same address, but will not be capable of identifying equal C-
style strings, that is, C-style strings with the same content.
What we need here to make our find() function work is a special version of
the Order class template for C-style strings. This is what template specialization is for. We
can define a version of the class template Order for type const char*, as shown below:
template <>
class Order<const char*> {
public:
    bool less(const char* lhs, const char* rhs) const
    { return strcmp(lhs,rhs) < 0; }
    bool equals(const char* lhs, const char* rhs) const
    { return strcmp(lhs,rhs) == 0; }
};
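To see the specialization at work, here is a self-contained sketch. The primary template Order and the function template find() are reconstructions based on the surrounding description (ordering via operator< and equality via operator==, and the calls shown earlier); their exact form is an assumption, not the original listing.

```cpp
#include <cstddef>
#include <cstring>

// Assumed primary template: relies on operator< and operator==.
template <class T>
class Order {
public:
    bool less(const T& lhs, const T& rhs) const   { return lhs < rhs; }
    bool equals(const T& lhs, const T& rhs) const { return lhs == rhs; }
};

// Full specialization: compares the content of C-style strings.
template <>
class Order<const char*> {
public:
    bool less(const char* lhs, const char* rhs) const
    { return std::strcmp(lhs, rhs) < 0; }
    bool equals(const char* lhs, const char* rhs) const
    { return std::strcmp(lhs, rhs) == 0; }
};

// Assumed linear search; returns true if value occurs in buf[0..size).
template <class T>
bool find(const T* buf, std::size_t size, const T& value, const Order<T>& order)
{
    for (std::size_t i = 0; i < size; ++i)
        if (order.equals(buf[i], value))
            return true;
    return false;
}
```

With the specialization in place, a search for a C-style string succeeds even if the sought string and the stored string reside at different addresses.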
PARTIAL SPECIALIZATION
C++ also allows partial specialization of templates. A partial specialization differs from a
full specialization in that it is still a template. If, for instance, we have a class template with
two template parameters and we provide a specialization that binds only one of the two
template arguments, we have partial specialization. If we bind both template parameters,
we have full specialization.
We can use partial specialization in our example above and provide a specialization
of the Order class for pointer types, which compares the objects being pointed to rather
This template would be instantiated whenever a version of the Order class for any
pointer type is needed, except for pointers of type const char*, because the full special-
ization Order<const char*> is even more specialized than the partial specialization
Order<T*>.
template <class T> // the actual class template
class Order { ... };
template <class T> // a partial specialization
class Order<T*> { ... };
template <> // a full specialization
class Order<const char*> { ... };
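Which of the three versions is chosen for a given type can be made visible with a small sketch. The static member kind() is a hypothetical addition for illustration, and the bodies of equals() are assumptions consistent with the description above.

```cpp
#include <cstring>

// The actual class template.
template <class T>
class Order {
public:
    static const char* kind() { return "primary"; }
    bool equals(const T& lhs, const T& rhs) const { return lhs == rhs; }
};

// Partial specialization: compares the objects pointed to.
template <class T>
class Order<T*> {
public:
    static const char* kind() { return "partial"; }
    bool equals(const T* lhs, const T* rhs) const { return *lhs == *rhs; }
};

// Full specialization: compares the content of C-style strings.
template <>
class Order<const char*> {
public:
    static const char* kind() { return "full"; }
    bool equals(const char* lhs, const char* rhs) const
    { return std::strcmp(lhs, rhs) == 0; }
};
```

Order<int> uses the primary template, Order<int*> and Order<char*> use the partial specialization, and only Order<const char*> uses the full specialization.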
template <>
class Order<string> {
public:
    Order(locale l=locale()) : _l(l) {}
    // function call operator, so that an Order<string> can serve as a predicate
    bool operator()(const string& lhs, const string& rhs) const
    { return less(lhs,rhs); }
    // locale::operator() compares two strings using the locale's collate facet
    bool less(const string& lhs, const string& rhs) const
    { return _l(lhs,rhs); }
    bool equals(const string& lhs, const string& rhs) const
    { return !_l(rhs,lhs)&&!_l(lhs,rhs); }
private:
    locale _l;
};
template <>
class Order<string> {
public:
    Order(locale l);
    bool operator()(const string& lhs, const string& rhs) const;
    bool less(const string& lhs, const string& rhs) const;
    bool equals(const string& lhs, const string& rhs) const;
private:
    locale _l;
};
template <>
class Order<const char*> {
public:
bool less(const char* lhs, const char* rhs) const;
bool equals(const char* lhs, const char* rhs) const;
};
With this implementation of Order and its specializations, the find() function can
be invoked only on types for which a specialization of Order is defined, because the
instantiation of Order for an arbitrary type would not yield a meaningful class.
Functions generated from any of the equals() function templates as well as normal
functions with the name equals() can be invoked. The compiler chooses the "best
match." Given the choice between a function generated from a template and a normal
function, if these are otherwise equally good matches, the compiler prefers the normal
function.
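The preference for the normal function can be sketched with one function template and one normal function. Both definitions are assumptions made for illustration, since the original equals() templates are not shown here.

```cpp
#include <cstring>

// Function template: compares by means of operator==.
template <class T>
bool equals(const T& lhs, const T& rhs)
{
    return lhs == rhs;
}

// Normal function: compares the content of C-style strings.
inline bool equals(const char* lhs, const char* rhs)
{
    return std::strcmp(lhs, rhs) == 0;
}
```

For two arguments of type const char*, both candidates match equally well, so the normal function is called and the string contents are compared. Writing equals<const char*>(p, q) forces the function generated from the template, which compares the pointers.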
private:
T _buf[s];
};
The class template Stack below has a template template parameter Container,
which is the container template on top of which the stack is implemented.
template <class T, template <class> class Container>
class Stack {
public:
    // ...
private:
    Container<T> c;
};
All three different kinds of template parameters can have default values. The default
value for a nontype template argument is a constant value of the respective type; the
default for a type template argument must be suitable for instantiation of the class tem-
plate, and the default for a template template argument is a class template. Only trailing
parameters can be omitted. Here are examples.
Let us first examine defaults for nontype template parameters:
private:
T _buf[s];
};
With the default value of 256 specified for the buffer size, the size argument can be
omitted when the Buffer template is instantiated. A Buffer can be specified as
Buffer<string, 100>
or as
Buffer<string>
in which case the buffer would have the default size of 256 entries.
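A compilable sketch of such a Buffer template follows. The type of the nontype parameter (std::size_t) and the members capacity() and operator[] are assumptions for illustration; only the data member _buf appears in the fragment above.

```cpp
#include <cstddef>
#include <string>

// Class template with a defaulted nontype template parameter.
template <class T, std::size_t s = 256>
class Buffer {
public:
    std::size_t capacity() const { return s; }
    T& operator[](std::size_t i) { return _buf[i]; }
private:
    T _buf[s];
};
```

Buffer<std::string, 100> and Buffer<std::string> are different types; the latter is the same type as Buffer<std::string, 256>.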
Here is an example of a default value for a type template argument:
private:
CharType* _str;
};
With the default value of type char specified for the character type of this String
class, the type argument can be omitted when the String template is instantiated. A
String can be specified as
String<wchar_t>
or as
String<>
in which case the string would handle narrow characters of type char. Note that the empty
brackets <> are needed in order to indicate that we refer to the String class template, not
just to a class named String.
Even template template parameters can have a default. Here is the previously men-
tioned Stack example:
private:
Container<ElemT> c;
With the container template deque specified as a default value for the container
template, the template template argument can be omitted when the Stack template is
instantiated. A Stack can be specified as
Stack<string, vector>
or as
Stack<string>
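The Stack template with a defaulted template template parameter can be sketched as shown below. Two points are assumptions: the member functions, which are not part of the fragment above, and the form template <class...> class Container, which requires C++11. The variadic form is used here because the standard containers take an allocator argument in addition to the element type, and compilers differ in whether they accept such templates for a parameter declared with a single type parameter.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Stack implemented on top of a container template; deque is the default.
template <class ElemT, template <class...> class Container = std::deque>
class Stack {
public:
    void push(const ElemT& e)  { c.push_back(e); }
    void pop()                 { c.pop_back(); }
    const ElemT& top() const   { return c.back(); }
    std::size_t size() const   { return c.size(); }
private:
    Container<ElemT> c;   // e.g., std::deque<ElemT> or std::vector<ElemT>
};
```

Note that the template argument is the name of a template, such as vector, not an instantiation such as vector<string>.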
The default for the type template argument Traits is an instantiation of the stan-
dard character traits template for type charT, where charT is the first type argument of
the string template. Hence, if charT is wchar_t, the Traits parameter would have the
default char_traits<wchar_t>, and if charT is MyCharType, the default for
the Traits type would be char_traits<MyCharType>. If the default traits type fits
the need, the string template can be referred to as
basic_string<char>
or
basic_string<MyCharType>.
If nonstandard traits are needed, the defaulted template parameter must be explic-
itly specified, as shown in the example below:
Note that in the language as described here, only class templates can have default
template arguments. Defaults cannot be provided for any argument of a function template;
this restriction was lifted later, in C++11. The following would be illegal:
you usually do not care about instantiation of the function template. You simply use this
function template as in the following example:
int i = 5;
foo(i);
float x = 1.5;
foo(x);
The compiler does the work for you; it examines the arguments to these function
calls, determines the argument types, and deduces that in the above cases the function
templates need to be instantiated for type int and for type float.
Different from the example above, the template parameter Facet does not appear
as a type of function parameter. The only function parameter to use_facet is the locale.
Now consider a call to this function template:
locale loc;
const numpunct<char>& fac = use_facet(loc); // will not compile !!!
The function argument loc does not allow the template argument to be deduced,
because its type has nothing to do with the template argument Facet. The return type of
the function template is not considered for template argument deduction. Hence in the
call to use_facet above, the template argument Facet cannot be deduced. It has to be
explicitly specified.
Explicit template argument specification is done like this:
locale loc;
const numpunct<char>& fac = use_facet<numpunct<char> >(loc);
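As a self-contained sketch of the same call, the following function retrieves the numpunct facet from the classic "C" locale and asks it for the decimal point. The function name classic_decimal_point is an assumption for illustration:

```cpp
#include <locale>

// Facet cannot be deduced from the locale argument, so it is spelled out
// between the angle brackets.
char classic_decimal_point() {
    const std::locale& loc = std::locale::classic();
    const std::numpunct<char>& fac =
        std::use_facet<std::numpunct<char> >(loc);
    return fac.decimal_point();
}
```

For the classic locale the result is the character '.'.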
Note that the syntax for explicit template argument specification of a function tem-
plate is similar to template argument specification of class templates. If you have a class
template
you naturally specify the template arguments whenever you need an instantiation of the
class template:
list<int> counters;
list<float> sizes;
foo<int>();
foo<float>();
if you have to. If the template argument appears in the function argument list, it is more
convenient to let the compiler deduce the template argument for you.
};
This does not compile, because a name following a typename keyword must be a
qualified name that depends on the template parameter. In general, a qualified name is
a name that is preceded by a scope operator; in this case it must be a qualifier containing
a template parameter or a template class name. Therefore, we have to correct our
example to
template <class T>
class B {
public:
typedef int someType_t;
};
One might have expected that the derived class D inherits the type definition of
someType_t. However, when a base class of a class template depends on a template
parameter, as is the case in our example, the compiler cannot inspect the base class while
parsing the template definition of the derived class to see if a name like someType_t is
defined there. Hence we have to reference the type someType_t using a qualified name
such as B<T>::someType_t. The reason is that although the compiler might have seen
the definition of the base class template B, it does not know whether the actual B that will
be used will be an instance from the template or a specialization that the compiler has not
yet seen.
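A minimal sketch of the rule, assuming a hypothetical derived class template D (its members value_t and twice are illustrations only):

```cpp
template <class T>
class B {
public:
    typedef int someType_t;
};

template <class T>
class D : public B<T> {
public:
    // The unqualified name someType_t would not be found here, because the
    // base class B<T> depends on T. The name is qualified and marked as a
    // type with typename.
    typedef typename B<T>::someType_t value_t;
    value_t twice(value_t v) const { return v + v; }
};

int use_dependent_type() {
    D<char> d;
    return d.twice(21);
}
```

Only when D<char> is instantiated does the compiler know which B<char> is meant and can resolve B<char>::someType_t to int.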
All this makes for funny effects that are somewhat surprising. Here’s an example
from the standard library, the base class for unary function objects:
template <class Arg, class Result>
struct unary_function
{
typedef Arg argument_type;
typedef Result result_type;
};
Its purpose is to pass the two types argument_type and result_type on to its
derived classes, so that these types are available in every function object. However,
although you derive from the unary_function base class you cannot use any of these
types unless you fully specify them. Here’s an example of a typical function object. What
you might expect to find is
However, here is what you probably will find in your header file:
In any case, the point is that although you inherit the type definitions from the
unary_function base class, you have to fully qualify them.
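The effect can be reproduced with a small sketch. To keep it independent of any particular library version, it uses a stand-in base class unary_function_like with the same two typedefs; that name and the function object square are assumptions for illustration:

```cpp
// Stand-in for the standard base class, with the same two typedefs.
template <class Arg, class Result>
struct unary_function_like {
    typedef Arg argument_type;
    typedef Result result_type;
};

template <class T>
struct square : public unary_function_like<T, T> {
    // Writing "result_type operator()(const argument_type&) const" would
    // not compile: the base class depends on T, so unqualified names are
    // not looked up in it. The inherited types are fully qualified instead.
    typename unary_function_like<T, T>::result_type
    operator()(const typename unary_function_like<T, T>::argument_type& x) const {
        return x * x;
    }
};

int square_of(int v) { return square<int>()(v); }
```

Users of the function object are not affected; outside the class template, square<int>::result_type can be used as usual.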
The parameter class in our article is a similar case. Fortunately, it is pretty simple; it
does not use the types it defines and inherits. Still, if you wanted to use any of the types
defined in the base class somewhere in the definition of the derived class, you would have
to fully specify the type names or repeat the base class’s type definitions. We defined the
base class as
UPCASTS
An upcast is a cast up the inheritance tree, i.e., from a derived class to a base class. Here is
a simple example:
A pointer to a derived class object is passed to a function that expects a base class
pointer. For argument passing, the derived class pointer is implicitly cast to a base class
pointer. Implicit casts are considered harmless, and indeed an upcast is always well
defined, because an object of a derived class type contains a subobject of the base class
type. For that reason, it cannot happen that the function foo() accesses any members
that do not exist.
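The situation can be sketched as follows. The classes Base and Derived, their members id and cnt, and the helper upcast_demo are assumptions chosen for illustration:

```cpp
class Base {
public:
    Base() : id(1) {}
    int id;
};

class Derived : public Base {
public:
    Derived() : cnt(0) {}
    int cnt;
};

// Expects a base class pointer and touches only base class members.
int foo(Base* p) { return p->id; }

int upcast_demo() {
    Derived d;
    return foo(&d);   // Derived* is implicitly converted to Base*
}
```

No cast notation is needed at the call site; the conversion from Derived* to Base* happens as part of argument passing.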
DOWNCAST
A downcast is a cast down the inheritance tree, i.e., from a base class to a derived class.
Here is an example:
A function taking a base class pointer accesses a member that is available solely in
the derived class. An explicit cast is needed to convert the base class pointer to the desired
derived class pointer. Different from an upcast, a downcast is never performed automati-
cally by the compiler. Instead, the programmer has to cast explicitly, which in our exam-
ple is done using the old-style cast notation.
In our example, the base class pointer points to an object of the derived class type, so
nothing harmful can happen. All access to members of the object pointed to via the
derived class pointer is well defined in this case. However, a downcast is potentially dan-
gerous. Consider the following example:
In this case, the base class pointer does not point to a derived class object, only to a
base class object. Accessing the data member cnt in function foo() is likely to lead to a
program crash, because the object pointed to does not have the required data member.
SAFE DOWNCAST
As an alternative to the potentially hazardous language construct of an old-style down-
cast, C++ has a new language feature, the dynamic_cast, which is a safe downcast. It
does not simply perform the required cast, regardless of the actual type of object pointed
to, but instead allows checking of whether the object pointed to really is of the expected
derived class type. In the previous example, one would rewrite the function foo() so
that it checks the pointer’s runtime type information before it attempts access to any
members of the object pointed to. The rewritten function would look like this:
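A sketch of such a checked version, under the assumption that Base has a virtual destructor (which makes it polymorphic) and that Derived carries the data member cnt. The bool return value and the two helper functions are added only to make the outcome observable:

```cpp
class Base {
public:
    virtual ~Base() {}   // at least one virtual function is required
};

class Derived : public Base {
public:
    Derived() : cnt(0) {}
    int cnt;
};

// Returns true if p points to a Derived object and cnt was incremented.
bool foo(Base* p) {
    if (Derived* d = dynamic_cast<Derived*>(p)) {
        ++d->cnt;        // safe: the object really is a Derived
        return true;
    }
    return false;        // the cast yielded a null pointer
}

bool works_for_derived()  { Derived d; return foo(&d) && d.cnt == 1; }
bool rejects_plain_base() { Base b;    return !foo(&b); }
```

For a pointer, a failed dynamic_cast yields a null pointer, so the function can simply test the result before touching any derived class member.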
AVOIDING DOWNCASTS
Usually, the use of downcasts is considered poor programming style, because it can
almost always be avoided and replaced by proper use of virtual functions and polymor-
phism. Instead of using dynamic_cast, we could fix the base class and introduce virtual
functions that give access to the data member in question. The base class versions of the
access functions would not do anything but return a default value; the derived class ver-
sions of the access functions would really access the data member.
class Base {
public:
  virtual int cnt() { return 0; }
  virtual int cnt(int i) { return 0; }
};
class Derived : public Base {
public:
  virtual int cnt() { return _cnt; }
  virtual int cnt(int i) { int tmp = _cnt; _cnt = i; return tmp; }
private:
  int _cnt;
};
Base* db = new Base;
void foo(Base* p)
{ p->cnt(0); } // polymorphic call of access function
and
Only pointers and references to polymorphic types can be cast via the
dynamic_cast operator. A polymorphic type is a class type that has at least one virtual
member function, either directly defined or inherited from a base class. This is because only
2. Include directives and using statements are omitted, as usual. The standard exception bad_cast is defined in
the header file <typeinfo>.
PEER CAST
The dynamic_cast operator can also be used for upcasts, which is not terribly useful in
the first place. It is, however, interesting in the case of multiple inheritance. In that context
it allows safe peer class casts such as the following:
In this case, a pointer to a derived object with multiple base classes is cast from a
base class pointer type to another base class pointer type. The cast fails if the object
pointed to is not of a type that is derived from both base classes. This kind of cast is nei-
ther an upcast nor a downcast, but a cast from one branch of the inheritance tree to
another. It is sometimes referred to as a peer class cast.
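A peer class cast can be sketched as follows. The class names Left, Right, and Both and the helper functions are assumptions for illustration:

```cpp
class Left  { public: virtual ~Left() {} };
class Right { public: virtual ~Right() {} };
class Both : public Left, public Right {};

// Returns the Right part of the object l points to, or 0 if the object
// has no Right base class.
Right* to_peer(Left* l) { return dynamic_cast<Right*>(l); }

bool peer_cast_succeeds() {
    Both b;
    Left* l = &b;
    Right* r = to_peer(l);
    return r != 0 && r == static_cast<Right*>(&b);
}

bool peer_cast_fails() {
    Left only;                    // not derived from Right
    return to_peer(&only) == 0;
}
```

The cast is resolved at run time from the dynamic type of the object; Left and Right themselves are unrelated, so a static_cast between the two pointer types would not compile.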
class X : public Y
{
public:
  X(int i, int j) : Y(i), z(j) { ... some other code here ... }
private:
  Z z;
};
In prestandard C++ we could only wrap the whole function body of the constructor
into a try block:
It was not possible to include the initialization list in the try block. As a result, it was
impossible to add any functionality that would react to an exception thrown from the
initialization list, i.e., from the base class constructor Y(int) or the member constructor
Z(int). For this reason the standards committee added the function try block to the
C++ programming language.
X’s constructor can be improved in the following way, using a function try block:
}
catch(...) // this ellipsis is correct C++ syntax
{
  ... do something about the error ...
}
As usual in a failed constructor, the fully constructed base classes and members are
destroyed. This happens before the handler is entered, meaning that base classes and non-
static data members cannot be accessed in the handler. Only the arguments to the con-
structor are still accessible.
It is not possible to “handle” the exception and finish the creation of the object,
because it is not possible to “return” from the handler. We must exit from the handler
either via an explicit throw statement or a call to exit(), abort(), or the like. A return
statement in the handler of a constructor’s function try block is rejected by the compiler,
and if we flow off the end of the handler, pretending we had handled the exception, the
caught exception is automatically rethrown.
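The following sketch puts the pieces together. The classes Y and Z are minimal assumed stand-ins, the extra constructor parameter trace is added only so that the handler has something observable to write to, and try_construct is a hypothetical driver:

```cpp
#include <stdexcept>
#include <string>

class Y {
public:
    explicit Y(int) {}
};

class Z {
public:
    explicit Z(int j) {
        if (j < 0) throw std::invalid_argument("Z: negative argument");
    }
};

class X : public Y {
public:
    X(int i, int j, std::string& trace)
    try : Y(i), z(j)               // the initialization list is inside the try
    {
        trace += "constructed";
    }
    catch (const std::exception& e)
    {
        // Only the constructor arguments may be used here; z and the Y
        // subobject no longer exist.
        trace += "failed: ";
        trace += e.what();
        // Flowing off the end rethrows the exception automatically.
    }
private:
    Z z;
};

std::string try_construct(int j) {
    std::string trace;
    try {
        X x(1, j, trace);
    } catch (const std::invalid_argument&) {
        trace += " (rethrown)";    // the caller still sees the exception
    }
    return trace;
}
```

A call such as try_construct(-1) shows both effects: the handler of the function try block runs first, and the exception then reaches the caller although the handler contains no throw statement.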
Function try blocks can be used not only with constructors but also with destructors
and normal functions. Here is an example in which the function try block is used for a
destructor:
~X() try
{
  ... some code here ...
}
catch(...) // this ellipsis is correct C++ syntax
{
  ... do something about the error ...
}
The function try block for a destructor behaves similarly to that of a constructor: If the
control flow reaches the end of the exception handler, the caught exception is
automatically rethrown. In contrast to a constructor, however, the handler of a destructor
may contain a return statement, which leaves the destructor without rethrowing the
exception.
Finally, let us explore how a function try block behaves together with a normal func-
tion. Here is an example of this kind of use:
void foo() try
{
  ... some code here ...
}
catch(...) // this ellipsis is correct C++ syntax
{
  ... do something about the error ...
}
Unlike with constructors and destructors, leaving the handler of a function try block for a
normal function never triggers an automatic rethrow; the handler can be left with a return
statement. Not explicitly returning from the handler, but flowing off the end of the catch
block, is equivalent to a return with no value, which results in undefined behavior in the
case of a value-returning function. Like the function try block for a constructor, the scope
and lifetime of the parameters extend into the handler of the function try block.
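A minimal sketch of a value-returning function with a function try block. The function parse_digit and its parameter fallback are hypothetical and serve only to show that the handler can return a value and can still use the parameters:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: converts the first character of s to a digit value.
int parse_digit(const std::string& s, int fallback)
try
{
    if (s.empty()) throw std::invalid_argument("empty input");
    return s[0] - '0';
}
catch (const std::invalid_argument&)
{
    // The parameters are still in scope; returning here ends the call
    // normally and nothing is rethrown.
    return fallback;
}
```

A call with an empty string therefore yields the fallback value instead of propagating the exception.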
When applied to main() or main(int argc, char* argv[]), the function try
block catches all exceptions raised during execution of the main() function, but it does
not catch exceptions thrown by constructors or destructors of global objects.
class logic_error
  class domain_error
  class invalid_argument
  class length_error
  class out_of_range
class runtime_error
  class range_error
  class overflow_error
  class underflow_error
The IOStreams exception ios_base::failure is the only standard library exception
that is not embedded into this hierarchy of exception types.
All C++ standard exception types are derived from the class exception:
class exception
{
public:
  exception() throw();
  exception(const exception&) throw();
  exception& operator= (const exception&) throw();
  virtual ~exception() throw();
  virtual const char* what() const throw();
};
Hence all exception objects contain a message, which is retrievable via the what()
function. The content of this message is not standardized.
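As a sketch of how the common base class is typically used, the following function catches a standard exception through a reference to exception and returns its message. The function name and the message text are assumptions; because the message is supplied by the program here, its content is known:

```cpp
#include <exception>
#include <stdexcept>
#include <string>

std::string message_of_failure() {
    try {
        throw std::out_of_range("index 9 is out of range");
    } catch (const std::exception& e) {   // catches via the common base class
        return e.what();
    }
}
```

For exceptions thrown by the library itself, the text returned by what() differs between implementations and should be used for diagnostics only.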
MEMORY MANAGEMENT. C++ strings automatically allocate and deallocate their mem-
ory. The memory management is encapsulated into the string class template. The memory
of a C string has to be allocated and deallocated explicitly.
DYNAMIC SIZE. A C++ string internally maintains a character buffer, which is dynamically
resized as needed, whereas C strings are character arrays of fixed size.
RANGE CHECK. C++ strings have access functions that check for range violations and
throw exceptions to indicate such violations. C strings are accessed directly via pointers;
no range check is possible.
VALUE SEMANTICS. C++ strings behave like values, which means that copies of a C++
string object can be treated as independent of each other. Copies of C strings have to be
explicitly created via C library functions like strcpy().
COPY ON WRITE. You can pass a C++ string around without worrying about the overhead
of avoidable copying. Duplication of a C++ string object’s internal data is performed
only if needed; i.e., it is automatically delayed to the actual write access. (Copy on write is
not a feature required by the standard. However, it is a permitted optimization that is likely
to be present in a reasonable implementation.)
ALLOCATORS. C++ strings support different memory allocation models by means of
so-called allocators.
FUNCTIONALITY. C++ strings offer numerous operations for accessing and manipulat-
ing strings. Here is an overview.
element access: operator[](), at()
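Some of the properties listed above (range check, value semantics, dynamic size) can be observed in a short sketch; the two function names are assumptions for illustration:

```cpp
#include <stdexcept>
#include <string>

// Range check: at() reports an invalid index by throwing out_of_range.
bool at_detects_bad_index() {
    std::string s("abc");
    try {
        s.at(10) = 'x';
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Value semantics and dynamic size: the copy is independent of the
// original, and the original grows as needed.
bool copies_are_independent() {
    std::string a("stream");
    std::string b = a;   // copy
    b[0] = 'S';          // changes only b
    a += "s";            // a is resized automatically
    return a == "streams" && b == "Stream";
}
```

Note that operator[]() performs no range check; only at() is required to throw for an invalid index.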