C
C
Information................................................................................................................. 7
A Brief Description...................................................................................................... 7
Preface.................................................................................................................... 7
An Overview of Programs and Programming Languages.........................................7
The Features of C++ as a Language.....................................................................10
History of C++......................................................................................................... 11
C++ Language FAQ.................................................................................................. 12
Tutorials.................................................................................................................... 14
Supplemental papers......................................................................................... 15
C++ Language......................................................................................................... 15
Compilers................................................................................................................. 16
What is a compiler?............................................................................................ 17
Console programs.............................................................................................. 18
Basics of C++........................................................................................................... 19
Structure of a program............................................................................................. 19
Comments.......................................................................................................... 22
Using namespace std......................................................................................... 23
Variables and types.................................................................................................. 24
Identifiers........................................................................................................... 25
Fundamental data types.................................................................................... 26
Declaration of variables..................................................................................... 29
Initialization of variables.................................................................................... 30
Type deduction: auto and decltype....................................................................31
Introduction to strings........................................................................................ 32
Constants................................................................................................................. 33
Literals............................................................................................................... 33
Typed constant expressions...............................................................................39
Preprocessor definitions (#define).....................................................................40
Operators................................................................................................................. 41
Assignment operator (=).................................................................................... 41
Arithmetic operators ( +, -, *, /, % )...................................................................42
Compound assignment (+=, -=, *=, /=, %=, >>=, <<=, &=, ^=, |=).............43
Increment and decrement (++, --).....................................................................44
Relational and comparison operators ( ==, !=, >, <, >=, <= )........................44
Logical operators ( !, &&, || )..............................................................................46
Conditional ternary operator ( ? ).......................................................................47
Comma operator ( , ).......................................................................................... 48
Bitwise operators ( &, |, ^, ~, <<, >> ).............................................................48
Explicit type casting operator............................................................................49
sizeof.................................................................................................................. 49
Other operators.................................................................................................. 50
Precedence of operators.................................................................................... 50
Basic Input/Output.................................................................................................... 52
Standard output (cout)....................................................................................... 53
Standard input (cin)........................................................................................... 55
cin and strings.................................................................................................... 56
stringstream....................................................................................................... 57
Program structure..................................................................................................... 58
Statements and flow control..................................................................................... 58
Selection statements: if and else.......................................................................59
Iteration statements (loops)............................................................................... 60
Jump statements................................................................................................ 65
Another selection statement: switch..................................................................67
Functions.................................................................................................................. 68
Functions with no type. The use of void.............................................................72
The return value of main.................................................................................... 73
Arguments passed by value and by reference...................................................74
Efficiency considerations and const references..................................................76
Inline functions................................................................................................... 77
Default values in parameters.............................................................................78
Declaring functions............................................................................................ 79
Recursivity......................................................................................................... 81
Overloads and templates......................................................................................... 82
Overloaded functions......................................................................................... 82
Function templates............................................................................................. 83
Non-type template arguments...........................................................................86
Name visibility.......................................................................................................... 87
Scopes................................................................................................................ 87
Namespaces....................................................................................................... 89
using.................................................................................................................. 90
Namespace aliasing........................................................................................... 92
The std namespace............................................................................................ 92
Storage classes.................................................................................................. 93
Compound data types.............................................................................................. 94
Arrays....................................................................................................................... 94
Initializing arrays................................................................................................ 95
Accessing the values of an array........................................................................96
Multidimensional arrays..................................................................................... 98
Arrays as parameters....................................................................................... 100
Library arrays................................................................................................... 101
Character sequences.............................................................................................. 102
Initialization of null-terminated character sequences......................................103
Strings and null-terminated character sequences............................................105
Pointers.................................................................................................................. 106
Address-of operator (&)....................................................................................107
Dereference operator (*).................................................................................. 108
Declaring pointers............................................................................................ 109
Pointers and arrays.......................................................................................... 112
Pointer initialization.......................................................................................... 113
Pointer arithmetics........................................................................................... 114
Pointers and const............................................................................................ 116
Pointers and string literals................................................................................ 118
Pointers to pointers.......................................................................................... 119
void pointers.................................................................................................... 120
Invalid pointers and null pointers.....................................................................121
Pointers to functions........................................................................................ 122
Dynamic memory................................................................................................... 122
Operators new and new[].................................................................................123
Operators delete and delete[]..........................................................................125
Dynamic memory in C...................................................................................... 126
Data structures....................................................................................................... 126
Data structures................................................................................................ 126
Pointers to structures....................................................................................... 130
Nesting structures............................................................................................ 132
Other data types.................................................................................................... 132
Type aliases (typedef / using)...........................................................................132
Unions.............................................................................................................. 134
Anonymous unions........................................................................................... 135
Enumerated types (enum)...............................................................................136
Enumerated types with enum class.................................................................137
Classes................................................................................................................... 138
Classes (I)............................................................................................................... 138
Constructors..................................................................................................... 142
Overloading constructors.................................................................................143
Uniform initialization........................................................................................ 144
Member initialization in constructors...............................................................145
Pointers to classes............................................................................................ 147
Classes defined with struct and union..............................................................149
Classes (II).............................................................................................................. 149
Overloading operators..................................................................................... 149
The keyword this.............................................................................................. 152
Static members................................................................................................ 153
Const member functions.................................................................................. 155
Class templates................................................................................................ 157
Template specialization.................................................................................... 159
Special members.................................................................................................... 160
Default constructor.......................................................................................... 161
Destructor........................................................................................................ 163
Copy constructor.............................................................................................. 164
Copy assignment.............................................................................................. 165
Move constructor and assignment...................................................................166
Implicit members............................................................................................. 168
Friendship and inheritance..................................................................................... 170
Friend functions................................................................................................ 170
Friend classes................................................................................................... 171
Inheritance between classes............................................................................ 173
What is inherited from the base class?............................................................176
Multiple inheritance.......................................................................................... 177
Polymorphism......................................................................................................... 178
Pointers to base class....................................................................................... 179
Virtual members............................................................................................... 180
Abstract base classes....................................................................................... 181
Other language features......................................................................................... 185
Type conversions.................................................................................................... 185
Implicit conversion........................................................................................... 185
Implicit conversions with classes.....................................................................186
Keyword explicit............................................................................................... 187
Type casting..................................................................................................... 188
dynamic_cast................................................................................................... 190
static_cast........................................................................................................ 191
reinterpret_cast................................................................................................ 192
const_cast........................................................................................................ 193
typeid............................................................................................................... 193
Exceptions.............................................................................................................. 195
Exception specification.................................................................................... 197
Standard exceptions........................................................................................ 197
Preprocessor directives.......................................................................................... 199
macro definitions (#define, #undef)................................................................199
Conditional inclusions (#ifdef, #ifndef, #if, #endif, #else and #elif)..............201
Line control (#line)........................................................................................... 203
Error directive (#error)..................................................................................... 203
Source file inclusion (#include)........................................................................204
Pragma directive (#pragma)............................................................................204
Predefined macro names..................................................................................204
Standard library...................................................................................................... 206
Input/output with files............................................................................................ 206
Open a file........................................................................................................ 207
Closing a file..................................................................................................... 209
Text files........................................................................................................... 209
Checking state flags......................................................................................... 210
get and put stream positioning........................................................................211
Binary files....................................................................................................... 213
Buffers and Synchronization............................................................................. 215
Information
A brief description:
Some general aspects of this language.
History of C++:
Brief history of the development of this language.
Frequently Asked Questions:
A short list of common questions novice programmers ask.
A Brief Description
Preface
Computers are some of the most versatile tools that we have available. They are
capable of performing stunning feats of computation, they allow information to be
exchanged easily regardless of physical location, they simplify many everyday
tasks, and they allow us to automate many processes that would be tedious or
boring to perform otherwise. However, computers are not "intelligent" as we are.
They have to be told in no uncertain terms exactly what they're supposed to do, and
their native languages are quite unlike anything we speak. Thus, there's a
formidable language barrier between a person who wishes a computer to do
something, and the computer that typically requires instructions in its native
language, machine code, to do anything. So far, computers cannot figure out what
they are supposed to do on their own, and thus they rely on programs which we
create, which are sets of instructions that the computer can understand and follow.
Depending on the type of project, there are many factors that have to be considered
when choosing a language. Here is a list of some of the more noteworthy ones:
port well across operating systems and the compilation process may take a
while.
Interpreted languages are read by a program called an interpreter and are
executed by that program. While they are as portable as their interpreter and
have no long compile times, interpreted languages are usually much slower
than an equivalent compiled program.
Finally, just-in-time compiled (or JIT-compiled) languages are languages
that are quickly compiled when programs written in them need to be run
(usually with very little optimization), offering a balance between
performance and portability.
High or Low Level
Level, in this case, refers to how much the nature of the
language reflects the underlying system. In other words, a programming
language's level refers to how similar the language is to a computer's native
language. The higher the level, the less similar it is.
A low-level language is generally quite similar to machine code, and thus is
more suitable for programs like device drivers or very high performance
programs that really need access to the hardware. Generally, the term is
reserved for machine code itself and assembly languages, though many
languages offer low-level elements. Since a low-level language is subject to
all the nuances of the hardware it's accessing, however, a program written in
a low-level language is generally difficult to port to other platforms. Low-level
languages are practically never interpreted, as this generally defeats the
purpose.
A high-level language focuses more on concepts that are easy to
understand by the human mind, such as objects or mathematical functions. A
high-level language usually is easier to understand than a low-level language,
and it usually takes less time to develop a program in a high-level language
than it does in a low-level language. As a trade-off one generally needs to
sacrifice some degree of control over what the resulting program actually
does. It is not, however, impossible to mix high-level and low-level
functionality in a language.
Type System
A type system refers to the rules that the different types of variables of a
language have to follow. Some languages (including most assembly
languages) do not have types and thus this section does not apply to them.
However, as most languages (including C++) have types, this information is
important.
types of variables. Many languages require variables' types to be
explicitly defined, and thus rely on manifest typing. Some however, will
infer the type of the variable based on the contexts in which it is used,
and thus use inferred typing.
These typing characteristics are not necessarily mutually exclusive, and some
languages mix them.
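As a rough illustration of manifest versus inferred typing in C++ itself, the sketch below sums a list of integers twice: once with every type written out, and once with auto so that the compiler deduces the types. The function names are hypothetical and chosen only for this example; it assumes a compiler that supports the 2011 standard.

```cpp
#include <cassert>
#include <vector>

// Manifest typing: the type of every variable is written out explicitly.
inline int sum_manifest(const std::vector<int>& values) {
    int total = 0;
    for (std::vector<int>::const_iterator it = values.begin();
         it != values.end(); ++it) {
        total += *it;
    }
    return total;
}

// Inferred typing: auto asks the compiler to deduce each type
// from the expression used to initialize the variable.
inline int sum_inferred(const std::vector<int>& values) {
    auto total = 0;              // deduced as int from the literal 0
    for (auto value : values) {  // deduced as int from the element type
        total += value;
    }
    return total;
}

// Hypothetical helper: both styles should compute the same result.
inline void check_typing_styles_agree() {
    const std::vector<int> values = {1, 2, 3};
    assert(sum_manifest(values) == sum_inferred(values));
}
```

In both versions the types are fixed at compile time; auto only changes who writes them down.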
Supported paradigms
A programming paradigm is a methodology or way of programming that a
programming language supports. Here is a summary of a few common
paradigms:
o Declarative
A declarative language will focus more on specifying what a program
is supposed to accomplish rather than by what means it is supposed to
accomplish it. Such a paradigm might be used to avoid undesired side
effects resulting from having to write one's own code.
o Functional
Functional programming is a subset of declarative programming that
tries to express problems in terms of mathematical equations and
functions. It goes out of its way to avoid the concepts of states and
mutable variables which are common in imperative languages.
o Generic
Generic programming focuses on writing skeleton algorithms in terms
of types that will be specified when the algorithm is actually used, thus
allowing some leniency to programmers who wish to avoid strict strong
typing rules. It can be a very powerful paradigm if well-implemented.
o Imperative
Imperative languages allow programmers to give the computer
ordered lists of instructions without necessarily having to explicitly
state the task. It can be thought of as the opposite of declarative
programming.
o Structured
Structured programming languages aim to provide some form of
noteworthy structure to a language, such as intuitive control over the
order in which statements are executed (if X then do Y otherwise do Z,
do X while Y is Z). Such languages generally deprecate "jumps", such
as those provided by the goto statement in C and C++.
o Procedural
Although it is sometimes used as a synonym for imperative
programming, a procedural programming language can also refer to an
imperative structured programming language which supports the
concept of procedures and subroutines (also known as functions in C or
C++).
o Object-Oriented
Object-Oriented programming (sometimes abbreviated to OOP) is a
subset of structured programming which expresses programs in the
terms of "objects", which are meant to model objects in the real world.
Such a paradigm allows code to be reused in remarkable ways and is
meant to be easy to understand.
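Since C++ supports several of these paradigms at once, the same small task can be written in more than one style. The sketch below computes a sum of squares four ways. All names are hypothetical and exist only for this illustration, and the code assumes a compiler that supports the 2011 standard.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Imperative / structured: spell out each step with a loop and mutable state.
inline int sum_of_squares_imperative(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v * v;
    }
    return total;
}

// Functional style: describe the result as a fold over the values,
// without a hand-written loop or explicitly mutated variable.
inline int sum_of_squares_functional(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0,
                           [](int acc, int v) { return acc + v * v; });
}

// Generic: the element type T is only chosen when the template is used.
template <typename T>
T sum_of_squares_generic(const std::vector<T>& values) {
    T total = T();
    for (const T& v : values) {
        total += v * v;
    }
    return total;
}

// Object-oriented: the state and the operations on it live in one class.
class SquareAccumulator {
public:
    void add(int v) { total_ += v * v; }
    int total() const { return total_; }
private:
    int total_ = 0;
};

// Hypothetical helper showing that all four styles agree.
inline void check_paradigms_agree() {
    const std::vector<int> values = {1, 2, 3};
    SquareAccumulator acc;
    for (int v : values) acc.add(v);
    assert(sum_of_squares_imperative(values) == 14);
    assert(sum_of_squares_functional(values) == 14);
    assert(sum_of_squares_generic(values) == 14);
    assert(acc.total() == 14);
}
```

Which style reads best depends on the problem; the point here is only that one language can host all of them.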
Standardization
Does a language have a formal standard? This can be very important to
ensure that programs written to work with one compiler/interpreter will work
with another. Some languages are standardized by the American National
Standards Institute (ANSI), some are standardized by the International
Organization for Standardization (ISO), and some have an informal but de facto
standard not maintained by any standards organization.
...is a strongly-typed unsafe language.
C++ is a language that expects the programmer to know what he or she is
doing, but allows for incredible amounts of control as a result.
...is portable.
As one of the most frequently used languages in the world and as an open
language, C++ has a wide range of compilers that run on many different
platforms that support it. Code that exclusively uses C++'s standard library
will run on many platforms with few to no changes.
History of C++
The C++ programming language has a history going back to 1979, when Bjarne
Stroustrup was doing work for his Ph.D. thesis. One of the languages Stroustrup had
the opportunity to work with was a language called Simula, which as the name
implies is a language primarily designed for simulations. The Simula 67 language -
which was the variant that Stroustrup worked with - is regarded as the first
language to support the object-oriented programming paradigm. Stroustrup found
that this paradigm was very useful for software development; however, the Simula
language was far too slow for practical use.
Shortly thereafter, he began work on "C with Classes", which as the name implies
was meant to be a superset of the C language. His goal was to add object-oriented
programming into the C language, which was and still is a language well-respected
for its portability without sacrificing speed or low-level functionality. His language
included classes, basic inheritance, inlining, default function arguments, and strong
type checking in addition to all the features of the C language.
The first C with Classes compiler was called Cfront, which was derived from a C
compiler called CPre. It was a program designed to translate C with Classes code to
ordinary C. A rather interesting point worth noting is that Cfront was written mostly
in C with Classes, making it a self-hosting compiler (a compiler that can compile
itself). Cfront would later be abandoned in 1993 after it became difficult to integrate
new features into it, namely C++ exceptions. Nonetheless, Cfront made a huge
impact on the implementations of future compilers and on the Unix operating
system.
In 1983, the name of the language was changed from C with Classes to C++. The
++ operator in the C language is an operator for incrementing a variable, which gives
some insight into how Stroustrup regarded the language. Many new features were
added around this time, the most notable of which are virtual functions, function
overloading, references with the & symbol, the const keyword, and single-line
comments using two forward slashes (which is a feature taken from the language
BCPL).
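For readers who have not met these features yet, the following sketch shows each of them in present-day syntax. It is not code from that period, and every class and function name in it is hypothetical, chosen only for this illustration.

```cpp
#include <cassert>

// Single-line comments like this one start with two forward slashes.

class Shape {
public:
    virtual ~Shape() {}
    // Virtual function: derived classes may replace this behavior.
    virtual double area() const { return 0.0; }
};

class Square : public Shape {
public:
    Square(double side) : side_(side) {}
    double area() const { return side_ * side_; }  // overrides Shape::area
private:
    double side_;
};

// Function overloading: the same name with different parameter types.
inline int twice(int x) { return 2 * x; }
inline double twice(double x) { return 2.0 * x; }

// Reference parameter (&): the caller's own variable is modified.
inline void increment(int& x) { ++x; }

// const: the value cannot be changed after it is initialized.
const int kSquareSides = 4;

// Hypothetical helper that exercises each feature once.
inline void check_1983_features() {
    Square square(3.0);
    const Shape& shape = square;   // a base-class reference to a Square
    assert(shape.area() == 9.0);   // the call is dispatched to Square::area
    assert(twice(2) == 4);
    assert(twice(1.5) == 3.0);
    int counter = 0;
    increment(counter);
    assert(counter == 1);
    assert(kSquareSides == 4);
}
```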
In 1990, The Annotated C++ Reference Manual was released. The same year,
Borland's Turbo C++ compiler would be released as a commercial product. Turbo
C++ added a plethora of additional libraries which would have a considerable impact
on C++'s development. Although Turbo C++'s last stable release was in 2006, the
compiler is still widely used.
In 1998, the C++ standards committee published the first international standard for
C++, ISO/IEC 14882:1998, which would be informally known as C++98. The
Annotated C++ Reference Manual was said to be a large influence in the
development of the standard. The Standard Template Library, which began its
conceptual development in 1979, was also included. In 2003, the committee
responded to multiple problems that were reported with their 1998 standard, and
revised it accordingly. The changed language was dubbed C++03.
In 2005, the C++ standards committee released a technical report (dubbed TR1)
detailing various features they were planning to add to the latest C++ standard.
The new standard was informally dubbed C++0x as it was expected to be released
sometime before the end of the first decade. Ironically, however, the new standard
would not be released until mid-2011. Several technical reports were released up
until then, and some compilers began adding experimental support for the new
features.
In mid-2011, the new C++ standard (dubbed C++11) was finished. The Boost
library project made a considerable impact on the new standard, and some of the
new modules were derived directly from the corresponding Boost libraries. Some of
the new features included regular expression support, a comprehensive
randomization library, a new C++ time library, atomics support, a standard
threading library (which up until 2011
both C and C++ were lacking), a new for loop syntax providing functionality similar
to foreach loops in certain other languages, the auto keyword, new container
classes, better support for unions and array-initialization lists, and variadic
templates.
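A few of the language-level additions listed above can be seen together in a short sketch. The function names are hypothetical and exist only for this example, which assumes a compiler with C++11 support.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Variadic template: accepts any number of arguments of any types.
// sizeof...(Args) yields how many arguments were passed.
template <typename... Args>
std::size_t count_args(const Args&...) {
    return sizeof...(Args);
}

inline int sum(const std::vector<int>& values) {
    int total = 0;
    for (auto v : values) {  // new for loop syntax combined with auto
        total += v;
    }
    return total;
}

// Hypothetical helper that exercises the features above.
inline void check_cpp11_features() {
    std::vector<int> values = {1, 2, 3, 4};  // initializer list
    assert(sum(values) == 10);
    assert(count_args(1, 2.5, 'c') == 3);
    assert(count_args() == 0);
}
```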
What is C++?
makes the communication and manipulation of data in a program written in
C++ as simple as in other languages, without losing the power it offers.
There are many ways, depending on the time you have and your preferences.
The language is taught in many kinds of academic settings throughout the
world, and can also be learnt by oneself with the help of tutorials and books.
The documentation section of this Website contains an online tutorial to help
you achieve the objective of learning this language.
No. No one owns the C++ language. Anyone can use the language royalty-
free.
What is ANSI-C++?
ANSI-C++ is the name by which the international ANSI/ISO standard for the
C++ language is known. But before this standard was published, C++ was
already widely used and therefore there is a lot of code out there written in
pre-standard C++. Referring to ANSI-C++ explicitly differentiates it from
pre-standard C++ code, which is incompatible in some ways.
8 return 0;
9 }
If your compiler is able to compile this program, you will be able to compile
most of the existing ANSI-C++ code.
You need a C++ compiler and linker that can generate code for your
windowing environment (Windows, XWindow, MacOS, ...). Windowed
programs do not generally use the console to communicate with the user.
They use a set of functions or classes to manipulate windows instead, which
are specific to each environment. Anyway, the same principles apply both for
console and windowed programs, except for communicating with the user.
Tutorials
Supplemental papers
ASCII Codes
Numerical bases
Boolean operations
C++ Language
These tutorials explain the C++ language from its basics up to the newest features
introduced by C++11. Chapters have a practical orientation, with example
programs in all sections to start practicing what is being explained right away.
Introduction
Compilers
Basics of C++
Structure of a program
Constants
Operators
Basic Input/Output
Program structure
Control Structures
Functions
Name visibility
Arrays
Character sequences
Pointers
Dynamic Memory
Data structures
Classes
Classes (I)
Classes (II)
Special members
Polymorphism
Type conversions
Exceptions
Preprocessor directives
Compilers
The essential tools needed to follow these tutorials are a computer and a compiler
toolchain able to compile C++ code and build the programs to run on it.
C++ is a language that has evolved much over the years, and these tutorials
explain many features added recently to the language. Therefore, in order to
properly follow the tutorials, a recent compiler is needed. It shall support (even if
only partially) the features introduced by the 2011 standard.
Many compiler vendors support the new features to different degrees. See the
bottom of this page for some compilers that are known to support the features
needed. Some of them are free!
If for some reason, you need to use some older compiler, you can access an older
version of these tutorials here (no longer updated).
What is a compiler?
Computers understand only one language and that language consists of sets of
instructions made of ones and zeros. This computer language is appropriately called
machine language.
00000 10011110
A particular computer's machine language program that allows a user to input two
numbers, adds the two numbers together, and displays the total could include these
machine code instructions:
00000 10011110
00001 11110100
00010 10011110
00011 11010100
00100 10111111
00101 00000000
This is a portion of code written in C++ that accomplishes the exact same purpose:
int a, b, sum;

cin >> a;
cin >> b;

sum = a + b;
cout << sum << endl;
Even if you cannot really understand the code above, you should be able to
appreciate how much easier it will be to program in the C++ language as opposed
to machine language.
Because a computer can only understand machine language and humans wish to
write in high-level languages, high-level languages have to be rewritten (translated)
into machine language at some point. This is done by special programs called
compilers, interpreters, or assemblers that are built into the various programming
applications.
Console programs
Console programs are programs that use text to communicate with the user and the
environment, such as printing text to the screen or reading input from a keyboard.
Console programs are easy to interact with, and generally have a predictable
behavior that is identical across all platforms. They are also simple to implement
and thus are very useful to learn the basics of a programming language: The
examples in these tutorials are all console programs.
The way to compile console programs depends on the particular tool you are using.
The easiest way for beginners to compile C++ programs is by using an Integrated
Development Environment (IDE). An IDE generally integrates several development
tools, including a text editor and tools to compile programs directly from it.
Here are instructions on how to compile and run console programs using
different free Integrated Development Environments (IDEs):
IDE            Platform              Console programs
Code::blocks   Windows/Linux/MacOS   Compile console programs using Code::blocks
If you happen to have a Linux or Mac environment with development features, you
should be able to compile any of the examples directly from a terminal just by
including C++11 flags in the command for the compiler:
Compiler   Platform                 Command
GCC        Linux, among others...   g++ -std=c++0x example.cpp -o example_program
Basics of C++
Structure of a program
The left panel above shows the C++ code for this program. The right panel shows
the result when the program is executed by a computer. The grey numbers to the
left of the panels are line numbers to make discussing programs and researching
errors easier. They are not part of the program.
Two slash signs indicate that the rest of the line is a comment inserted by the
programmer but which has no effect on the behavior of the program.
Programmers use them to include short explanations or observations
concerning the code or program. In this case, it is a brief introductory
description of the program.
Lines beginning with a hash sign (#) are directives read and interpreted by
what is known as the preprocessor. They are special lines interpreted before
the compilation of the program itself begins. In this case, the directive
#include <iostream>, instructs the preprocessor to include a section of
standard C++ code, known as header iostream, that allows to perform
standard input and output operations, such as writing the output of this
program (Hello World) to the screen.
The function named main is a special function in all C++ programs; it is the
function called when the program is run. The execution of all C++ programs
begins with the main function, regardless of where the function is actually
located within the code.
The open brace ({) at line 5 indicates the beginning of main's function
definition, and the closing brace (}) at line 7, indicates its end. Everything
between these braces is the function's body that defines what happens when
main is called. All functions use braces to indicate the beginning and end of
their definitions.
This statement has three parts: First, std::cout, which identifies the standard
character output device (usually, this is the computer screen). Second, the
insertion operator (<<), which indicates that what follows is inserted into
std::cout. Finally, a sentence within quotes ("Hello world!"), is the content
inserted into the standard output.
Notice that the statement ends with a semicolon ( ;). This character marks
the end of the statement, just as the period ends a sentence in English. All
C++ statements must end with a semicolon character. One of the most
common syntax errors in C++ is forgetting to end a statement with a
semicolon.
You may have noticed that not all the lines of this program perform actions when
the code is executed. There is a line containing a comment (beginning with //).
There is a line with a directive for the preprocessor (beginning with #). There is a
line that defines a function (in this case, the main function). And, finally, a line with a
statements ending with a semicolon (the insertion into cout), which was within the
block delimited by the braces ( { } ) of the main function.
The program has been structured in different lines and properly indented, in order
to make it easier to understand for the humans reading it. But C++ does not have
strict rules on indentation or on how to split instructions in different lines. For
example, instead of
int main ()
{
  std::cout << " Hello World!";
}
we could have written:
int main () { std::cout << " Hello World!"; }
all in a single line, and this would have had exactly the same meaning as the
preceding code.
In this case, the program performed two insertions into std::cout in two different
statements. Once again, the separation in different lines of code simply gives
greater readability to the program, since main could have been perfectly validly
defined in this way:
int main () { std::cout << " Hello World! "; std::cout << " I'm a C++ program "; }
The source code could have also been divided into more code lines instead:
int main ()
{
  std::cout <<
    "Hello World!";
  std::cout
    << "I'm a C++ program";
}
And the result would again have been exactly the same as in the previous
examples.
Preprocessor directives (those that begin with #) are outside this general rule, since
they are not statements. They are lines read and processed by the preprocessor
before proper compilation begins. Preprocessor directives must be specified on their
own line and, because they are not statements, do not have to end with a semicolon
(;).
Comments
As noted above, comments do not affect the operation of the program; however,
they provide an important tool to document directly within the source code what the
program does and how it operates.
C++ supports two ways of commenting code:
// line comment
/* block comment */
The first of them, known as line comment, discards everything from where the pair
of slash signs (//) are found up to the end of that same line. The second one, known
as block comment, discards everything between the /* characters and the first
appearance of the */ characters, with the possibility of including multiple lines.
If comments are included within the source code of a program without using the
comment character combinations //, /* or */, the compiler takes them as if they
were C++ expressions, most likely causing the compilation to fail with one or
several error messages.
Using namespace std
If you have seen C++ code before, you may have seen cout being used instead of
std::cout. Both name the same object: the first one uses its unqualified name
(cout), while the second qualifies it directly within the namespace std (as
std::cout).
cout is part of the standard library, and all the elements in the standard C++ library
are declared within what is called a namespace: the namespace std.
In order to refer to the elements in the std namespace a program shall either qualify
each and every use of elements of the library (as we have done by prefixing cout
with std::), or introduce visibility of its components. The most typical way to
introduce visibility of these components is by means of using declarations:
using namespace std;
The above declaration allows all elements in the std namespace to be accessed in
an unqualified manner (without the std:: prefix).
With this in mind, the last example can be rewritten to make unqualified uses of
cout as:
// my second program in C++
#include <iostream>
using namespace std;

int main ()
{
  cout << "Hello World! ";
  cout << "I'm a C++ program";
}
Output:
Hello World! I'm a C++ program
Both ways of accessing the elements of the std namespace (explicit qualification
and using declarations) are valid in C++ and produce the exact same behavior. For
simplicity, and to improve readability, the examples in these tutorials will more
often use this latter approach with using declarations, although note that explicit
qualification is the only way to guarantee that name collisions never happen.
Variables and types
The usefulness of the "Hello World" programs shown in the previous chapter is
rather questionable. We had to write several lines of code, compile them, and then
execute the resulting program, just to obtain the result of a simple sentence written
on the screen. It certainly would have been much faster to type the output sentence
ourselves.
However, programming is not limited only to printing simple texts on the screen. In
order to go a little further on and to become able to write programs that perform
useful tasks that really save us work, we need to introduce the concept of variables.
Let's imagine that I ask you to remember the number 5, and then I ask you to also
memorize the number 2 at the same time. You have just stored two different values
in your memory (5 and 2). Now, if I ask you to add 1 to the first number I said, you
should be retaining the numbers 6 (that is 5+1) and 2 in your memory. Then we
could, for example, subtract these values and obtain 4 as result.
The whole process described above is a simile of what a computer can do with two
variables. The same process can be expressed in C++ with the following set of
statements:
a = 5;
b = 2;
a = a + 1;
result = a - b;
Obviously, this is a very simple example, since we have only used two small integer
values, but consider that your computer can store millions of numbers like these at
the same time and conduct sophisticated mathematical operations with them.
Each variable needs a name that identifies it and distinguishes it from the others.
For example, in the previous code the variable names were a, b, and result, but we
could have called the variables any names we could have come up with, as long as
they were valid C++ identifiers.
Identifiers
A valid identifier is a sequence of one or more letters, digits, or underscore
characters (_). Spaces, punctuation marks, and symbols cannot be part of an
identifier. In addition, identifiers shall always begin with a letter. They can also begin
with an underscore character (_), but such identifiers are, in most cases, considered
reserved for compiler-specific keywords or external identifiers, as are identifiers
containing two successive underscore characters anywhere. In no case can they
begin with a digit.
C++ uses a number of keywords to identify operations and data descriptions;
therefore, identifiers created by a programmer cannot match these keywords. The
standard reserved keywords that cannot be used for programmer-created identifiers are:
alignas, alignof, and, and_eq, asm, auto, bitand, bitor, bool, break, case,
catch, char, char16_t, char32_t, class, compl, const, constexpr, const_cast,
continue, decltype, default, delete, do, double, dynamic_cast, else, enum,
explicit, export, extern, false, float, for, friend, goto, if, inline, int,
long, mutable, namespace, new, noexcept, not, not_eq, nullptr, operator, or,
or_eq, private, protected, public, register, reinterpret_cast, return, short,
signed, sizeof, static, static_assert, static_cast, struct, switch, template,
this, thread_local, throw, true, try, typedef, typeid, typename, union,
unsigned, using, virtual, void, volatile, wchar_t, while, xor, xor_eq
Very important: The C++ language is a "case sensitive" language. That means
that an identifier written in capital letters is not equivalent to another one with the
same name but written in small letters. Thus, for example, the RESULT variable is not
the same as the result variable or the Result variable. These are three different
identifiers identifying three different variables.
Fundamental data types are basic types implemented directly by the language that
represent the basic storage units supported natively by most systems. They can
mainly be classified into:
Character types: They can represent a single character, such as 'A' or '$'.
The most basic type is char, which is a one-byte character. Other types are
also provided for wider characters.
Numerical integer types: They can store a whole number value, such as 7
or 1024. They exist in a variety of sizes, and can either be signed or unsigned,
depending on whether they support negative values or not.
Floating-point types: They can represent real values, such as 3.14 or 0.01,
with different levels of precision, depending on which of the three floating-
point types is used.
Boolean type: The boolean type, known in C++ as bool, can only represent
one of two states, true or false.
signed long long int     Not smaller than long. At least 64 bits.
unsigned char
Floating-point types:
float
double                   Precision not less than float
long double              Precision not less than double
* The names of certain integer types can be abbreviated without their signed and
int components - only the part not in italics is required to identify the type, the part
in italics is optional. I.e., signed short int can be abbreviated as signed short,
short int, or simply short; they all identify the same fundamental type.
Within each of the groups above, the difference between types is only their size
(i.e., how much they occupy in memory): the first type in each group is the smallest,
and the last is the largest, with each type being at least as large as the one
preceding it in the same group. Other than that, the types in a group have the same
properties.
Note in the panel above that other than char (which has a size of exactly one byte),
none of the fundamental types has a standard size specified (but a minimum size,
at most). Therefore, the type is not required (and in many cases is not) exactly this
minimum size. This does not mean that these types are of an undetermined size,
but that there is no standard size across all compilers and machines; each compiler
implementation may specify the sizes for these types that best fit the
architecture where the program is going to run. This rather generic size
specification for types gives the C++ language a lot of flexibility to be adapted to
work optimally on all kinds of platforms, both present and future.
Type sizes above are expressed in bits; the more bits a type has, the more distinct
values it can represent, but at the same time, also consumes more space in
memory:
8-bit    256 = 2^8
64-bit   18 446 744 073 709 551 616 = 2^64 (~18 billion billion)
For integer types, having more representable values means that the range of values
they can represent is greater; for example, a 16-bit unsigned integer would be able
to represent 65536 distinct values in the range 0 to 65535, while its signed
counterpart would be able to represent, in most cases, values between -32768 and
32767. Note that the range of positive values is approximately halved in signed
types compared to unsigned types, due to the fact that one of the 16 bits is used for
the sign; this is a relatively modest difference in range, and seldom justifies the use
of unsigned types based purely on the range of positive values they can represent.
For floating-point types, the size affects their precision, by having more or fewer bits
for their significand and exponent.
If the size or precision of the type is not a concern, then char, int, and double are
typically selected to represent characters, integers, and floating-point values,
respectively. The other types in their respective groups are only used in very
particular cases.
The types described above (characters, integers, floating-point, and boolean) are
collectively known as arithmetic types. But two additional fundamental types exist:
void, which identifies the lack of type; and the type of nullptr (std::nullptr_t), which is a special type
of pointer. Both types will be discussed further in a coming chapter about pointers.
C++ supports a wide variety of types based on the fundamental types discussed
above; these other types are known as compound data types, and are one of the
main strengths of the C++ language. We will also see them in more detail in future
chapters.
Declaration of variables
C++ is a strongly-typed language, and requires every variable to be declared with
its type before its first use. This informs the compiler of the size to reserve in memory
for the variable and how to interpret its value. The syntax to declare a new variable
in C++ is straightforward: we simply write the type followed by the variable name
(i.e., its identifier). For example:
int a;
float mynumber;
These are two valid declarations of variables. The first one declares a variable of
type int with the identifier a. The second one declares a variable of type float with
the identifier mynumber. Once declared, the variables a and mynumber can be used
within the rest of their scope in the program.
If declaring more than one variable of the same type, they can all be declared in a
single statement by separating their identifiers with commas. For example:
int a, b, c;
This declares three variables (a, b and c), all of them of type int, and has exactly
the same meaning as:
int a;
int b;
int c;
To see what variable declarations look like in action within a program, let's have a
look at the entire C++ code of the example about your mental memory proposed at
the beginning of this chapter:
Don't worry if something other than the variable declarations themselves looks a
bit strange to you. Most of it will be explained in more detail in coming chapters.
Initialization of variables
When the variables in the example above are declared, they have an undetermined
value until they are assigned a value for the first time. But it is possible for a
variable to have a specific value from the moment it is declared. This is called the
initialization of the variable.
In C++, there are three ways to initialize variables. They are all equivalent and are
reminiscent of the evolution of the language over the years:
The first one, known as c-like initialization (because it is inherited from the C
language), consists of appending an equal sign followed by the value to which the
variable is initialized:
int x = 0;
A second method, known as constructor initialization (introduced by the C++
language), encloses the initial value between parentheses (()):
int x (0);
Finally, a third method, known as uniform initialization, is similar to the above, but
uses curly braces ({}) instead of parentheses (this was introduced by the revision
of the C++ standard in 2011):
int x {0};
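All three ways of initializing variables are valid in C++ and, for simple cases such
as these, equivalent (the braced form additionally rejects narrowing conversions,
such as initializing an int from 2.5).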
// initialization of variables

#include <iostream>
using namespace std;

int main ()
{
  int a=5;               // initial value: 5
  int b(3);              // initial value: 3
  int c{2};              // initial value: 2
  int result;            // initial value undetermined

  a = a + b;
  result = a - c;
  cout << result;

  return 0;
}
Output:
6
When a new variable is initialized, the compiler can deduce its type automatically
from the initializer. For this, it suffices to use auto as the type specifier for the
variable:
int foo = 0;
auto bar = foo;  // the same as: int bar = foo;
Here, bar is declared as having an auto type; therefore, the type of bar is the type of
the value used to initialize it: in this case it uses the type of foo, which is int.
Variables that are not initialized can also make use of type deduction with the
decltype specifier:
int foo = 0;
decltype(foo) bar;  // the same as: int bar;
auto and decltype are powerful features recently added to the language. But the
type deduction features they introduce are meant to be used either when the type
cannot be obtained by other means or when using it improves code readability. The
two examples above were likely neither of these use cases. In fact they probably
decreased readability, since, when reading the code, one has to search for the type
of foo to actually know the type of bar.
Introduction to strings
Fundamental types represent the most basic types handled by the machines where
the code may run. But one of the major strengths of the C++ language is its rich set
of compound types, of which the fundamental types are mere building blocks.
An example of a compound type is the string class. Variables of this type are able to
store sequences of characters, such as words or sentences. A very useful feature!
A first difference with fundamental data types is that in order to declare and use
objects (variables) of this type, the program needs to include the header where the
type is defined within the standard library (header <string>):
// my first string
#include <iostream>
#include <string>
using namespace std;

int main ()
{
  string mystring;
  mystring = "This is a string";
  cout << mystring;
  return 0;
}
Output:
This is a string
As you can see in the previous example, strings can be initialized with any valid
string literal, just like numerical type variables can be initialized to any valid
numerical literal. As with fundamental types, all initialization formats are valid with
strings:
string mystring = "This is a string";
string mystring ("This is a string");
string mystring {"This is a string"};
Strings can also perform all the other basic operations that fundamental data types
can, like being declared without an initial value and having their value changed during
execution:
// my first string
#include <iostream>
#include <string>
using namespace std;

int main ()
{
  string mystring;
  mystring = "This is the initial string content";
  cout << mystring << endl;
  mystring = "This is a different string content";
  cout << mystring << endl;
  return 0;
}
Output:
This is the initial string content
This is a different string content
Note: inserting the endl manipulator ends the line (printing a newline character and
flushing the stream).
The string class is a compound type. As you can see in the example above,
compound types are used in the same way as fundamental types: the same syntax
is used to declare variables and to initialize them.
For more details on standard C++ strings, see the string class reference.
Constants
Literals
Literals are the most obvious kind of constants. They are used to express particular
values within the source code of a program. We have already used some in previous
chapters to give specific values to variables or to express messages we wanted our
programs to print out, for example, when we wrote:
a = 5;
The 5 in this piece of code was a literal constant.
Integer Numerals
1776
707
-273
These are numerical constants that identify integer values. Notice that they are not
enclosed in quotes or any other special character; they are a simple succession of
digits representing a whole number in decimal base; for example, 1776 always
represents the value one thousand seven hundred seventy-six.
In addition to decimal numbers (those that most of us use every day), C++ allows
the use of octal numbers (base 8) and hexadecimal numbers (base 16) as literal
constants. For octal literals, the digits are preceded with a 0 (zero) character. And
for hexadecimal, they are preceded by the characters 0x (zero, x). For example, the
following literal constants are all equivalent to each other:
75         // decimal
0113       // octal
0x4b       // hexadecimal
These literal constants have a type, just like variables. By default, integer literals
are of type int. However, certain suffixes may be appended to an integer literal to
specify a different integer type:
u or U unsigned
l or L long
ll or LL long long
Unsigned may be combined with any of the other two in any order to form unsigned
long or unsigned long long.
For example:
75         // int
75u        // unsigned int
75l        // long
75ul       // unsigned long
75lu       // unsigned long
In all the cases above, the suffix can be specified using either upper or lowercase
letters.
3.14159    // 3.14159
6.02e23    // 6.02 x 10^23
1.6e-19    // 1.6 x 10^-19
3.0        // 3.0
These are four valid numbers with decimals expressed in C++. The first number is
PI, the second one is Avogadro's number, the third is the electric charge of an
electron (an extremely small number), all of them approximated, and the last one
is the number three expressed as a floating-point numeric literal.
The default type for floating-point literals is double. Floating-point literals of type
float or long double can be specified by adding one of the following suffixes:
Suffix Type
f or F float
l or L long double
For example:
3.14159L   // long double
6.02e23f   // float
Any of the letters that can be part of a floating-point numerical constant ( e, f, l) can
be written using either lower or uppercase letters with no difference in meaning.
'z'
'p'
"Hello world"
"How do you do?"
The first two expressions represent single-character literals, and the following two
represent string literals composed of several characters. Notice that to represent a
single character, we enclose it between single quotes ( '), and to express a string
(which generally consists of more than one character), we enclose the characters
between double quotes (").
Both single-character and string literals require quotation marks surrounding them
to distinguish them from possible variable identifiers or reserved keywords. Notice
the difference between these two expressions:
x
'x'
Here, x alone would refer to an identifier, such as the name of a variable or a
compound type, whereas 'x' (enclosed within single quotation marks) would refer to
the character literal 'x' (the character that represents a lowercase x letter).
Character and string literals can also represent special characters that are difficult
or impossible to express otherwise in the source code of a program, like newline ( \n)
or tab (\t). These special characters are all preceded by a backslash
character (\).
\n newline
\r carriage return
\t tab
\v vertical tab
\b backspace
\a alert (beep)
\\ backslash (\)
For example:
'\n'
'\t'
"Left \t Right"
"one\ntwo\nthree"
Several string literals can be concatenated to form a single string literal simply by
separating them by one or more blank spaces, including tabs, newlines, and other
valid blank characters. For example:
Note how spaces within the quotes are part of the literal, while those outside them
are not.
Some programmers also use a trick to include long string literals in multiple lines: In
C++, a backslash (\) at the end of line is considered a line-continuation character
that merges both that line and the next into a single line. Therefore the following
code:
x = "string expressed in \
two lines"
is equivalent to:
x = "string expressed in two lines"
All the character literals and string literals described above are made of characters
of type char. A different character type can be specified by using one of the
following prefixes:
u char16_t
U char32_t
L wchar_t
Note that, unlike type suffixes for integer literals, these prefixes are case sensitive:
lowercase for char16_t and uppercase for char32_t and wchar_t.
For string literals, apart from the above u, U, and L, two additional prefixes exist:
Prefix Description
u8     The string literal is encoded in the executable using UTF-8
R      The string literal is a raw string
In raw strings, backslashes and single and double quotes are all valid characters;
the content of the literal is delimited by an initial R"sequence( and a final )sequence",
where sequence is any sequence of characters (including an empty sequence). The
content of the string is what lies inside the parentheses, ignoring the delimiting
sequence itself. For example:
Both strings above are equivalent to "string with \\backslash". The R prefix can be
combined with any other prefixes, such as u, L or u8.
Other literals
Three keyword literals exist in C++: true, false and nullptr:
true and false are the two possible values for variables of type bool.
We can then use these names instead of the literals they were defined to:
14 cout << newline;
15 }
For example:
#include <iostream>
using namespace std;

#define PI 3.14159
#define NEWLINE '\n'

int main ()
{
  double r=5.0;               // radius
  double circle;

  circle = 2 * PI * r;
  cout << circle;
  cout << NEWLINE;

}
Output:
31.4159
Note that the #define lines are preprocessor directives, and as such are single-line
instructions that -unlike C++ statements- do not require semicolons (;) at the end;
the directive extends automatically until the end of the line. If a semicolon is
included in the line, it is part of the replacement sequence and is also included in all
replaced occurrences.
Operators
Once introduced to variables and constants, we can begin to operate with them by
using operators. What follows is a complete list of operators. At this point, it is likely
not necessary to know all of them, but they are all listed here to also serve as
reference.
x = 5;
This statement assigns the integer value 5 to the variable x. The assignment
operation always takes place from right to left, and never the other way around:
x = y;
This statement assigns to variable x the value contained in variable y. The value of x
at the moment this statement is executed is lost and replaced by the value of y.
Consider also that we are only assigning the value of y to x at the moment of the
assignment operation. Therefore, if y changes at a later moment, it will not affect
the new value taken by x.
For example, let's have a look at the following code - I have included the evolution
of the content stored in the variables as comments:
// assignment operator
#include <iostream>
using namespace std;

int main ()
{
  int a, b;         // a:?,  b:?
  a = 10;           // a:10, b:?
  b = 4;            // a:10, b:4
  a = b;            // a:4,  b:4
  b = 7;            // a:4,  b:7

  cout << "a:";
  cout << a;
  cout << " b:";
  cout << b;
}
Output:
a:4 b:7
This program prints on screen the final values of a and b (4 and 7, respectively).
Notice how a was not affected by the final modification of b, even though we
declared a = b earlier.
Assignment operations are expressions that can be evaluated. That means that the
assignment itself has a value, and -for fundamental types- this value is the one
assigned in the operation. For example:
y = 2 + (x = 5);
In this expression, y is assigned the result of adding 2 and the value of another
assignment expression (which has itself a value of 5). It is roughly equivalent to:
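x = 5;
y = 2 + x;
The following expression is also valid in C++:
x = y = z = 5;
It assigns 5 to all three variables x, y and z, always from right to left.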
Arithmetic operators ( +, -, *, /, % )
The five arithmetical operations supported by C++ are:
operator description
+ addition
- subtraction
* multiplication
/ division
% modulo
Operations of addition, subtraction, multiplication and division correspond literally
to their respective mathematical operators. The last one, modulo operator,
represented by a percentage sign (%), gives the remainder of a division of two
values. For example:
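x = 11 % 3;
results in variable x containing the value 2, since dividing 11 by 3 gives 3 with a remainder of 2.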
Compound assignment (+=, -=, *=, /=, %=, >>=, <<=, &=, ^=, |=)
Compound assignment operators modify the current value of a variable by
performing an operation on it. They are equivalent to assigning the result of an
operation to the first operand:
expression   equivalent to...
y += x;      y = y + x;
x -= 5;      x = x - 5;
x /= y;      x = x / y;
and the same for all other compound assignment operators. For example:
Increment and decrement (++, --)
Some expressions can be shortened even more: the increase operator (++) and the
decrease operator (--) increase or reduce by one the value stored in a variable.
They are equivalent to +=1 and to -=1, respectively. Thus:
++x;
x+=1;
x=x+1;
are all equivalent in their functionality; the three of them increase by one the value of
x.
In the early C compilers, the three previous expressions may have produced
different executable code depending on which one was used. Nowadays, this type of
code optimization is generally performed automatically by the compiler, thus the
three expressions should produce exactly the same executable code.
A peculiarity of this operator is that it can be used both as a prefix and as a suffix.
That means that it can be written either before the variable name (++x) or after it
(x++). In simple expressions like x++ or ++x, both have exactly the same
meaning. In other expressions, in which the result of the increment or decrement
operation is evaluated, they differ in an important way. When the increase operator
is used as a prefix (++x), the expression evaluates to the final value of x, once it
is already increased. When it is used as a suffix (x++), the value is also increased,
but the expression evaluates to the value that x had before being increased. Notice the
difference:
Example 1                       Example 2
x = 3;                          x = 3;
y = ++x;                        y = x++;
// x contains 4, y contains 4   // x contains 4, y contains 3
In Example 1, the value assigned to y is the value of x after being increased. While
in Example 2, it is the value x had before being increased.
Relational and comparison operators ( ==, !=, >, <, >=, <= )
Two expressions can be compared using relational and equality operators. For
example, to know if two values are equal or if one is greater than the other.
The result of such an operation is either true or false (i.e., a Boolean value).
The relational operators in C++ are:
operator  description
==        Equal to
!=        Not equal to
<         Less than
>         Greater than
<=        Less than or equal to
>=        Greater than or equal to
Here are some examples:
(7 == 5)     // evaluates to false
(5 > 4)      // evaluates to true
(3 != 2)     // evaluates to true
(6 >= 6)     // evaluates to true
(5 < 5)      // evaluates to false
Of course, it's not just numeric constants that can be compared, but just any value,
including, of course, variables. Suppose that a=2, b=3 and c=6, then:
Be careful! The assignment operator (operator =, with one equal sign) is not the
same as the equality comparison operator (operator ==, with two equal signs); the
first one (=) assigns the value on the right-hand to the variable on its left, while the
other (==) compares whether the values on both sides of the operator are equal.
Therefore, in the last expression ((b=2) == a), we first assigned the value 2 to b and
then we compared it to a (that also stores the value 2), yielding true.
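Logical operators ( !, &&, || )
The operator ! is the C++ operator for the Boolean operation NOT. It has only one
operand, to its right, and inverts it, producing false if its operand is true, and true if
its operand is false. Basically, it returns the opposite Boolean value of evaluating its
operand. For example: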
The logical operators && and || are used when evaluating two expressions to obtain
a single relational result. The operator && corresponds to the Boolean logical
operation AND, which yields true if both its operands are true, and false otherwise.
The following panel shows the result of operator && evaluating the expression a&&b:
&& OPERATOR (and)
a      b      a && b
true   true   true
true   false  false
false  true   false
false  false  false
The operator || corresponds to the Boolean logical operation OR, which yields true
if either of its operands is true, thus being false only when both operands are false.
Here are the possible results of a||b:
|| OPERATOR (or)
a      b      a || b
true   true   true
true   false  true
false  true   true
false  false  false
For example:
( (5 == 5) && (3 > 6) )  // evaluates to false ( true && false )
( (5 == 5) || (3 > 6) )  // evaluates to true ( true || false )
When using the logical operators, C++ only evaluates what is necessary from left to
right to come up with the combined relational result, ignoring the rest. Therefore, in
the last example ((5==5)||(3>6)), C++ evaluates first whether 5==5 is true, and if
so, it never checks whether 3>6 is true or not. This is known as short-circuit
evaluation, and works like this for these operators:
operator  short-circuit
&&        if the left-hand side expression is false, the combined result is false (the right-hand side expression is never evaluated).
||        if the left-hand side expression is true, the combined result is true (the right-hand side expression is never evaluated).
This is mostly important when the right-hand expression has side effects, such as
altering values:
Here, the combined conditional expression would increase i by one, but only if the
condition on the left of && is true, because otherwise, the condition on the right-
hand side (++i<n) is never evaluated.
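Conditional ternary operator ( ? )
The conditional operator evaluates an expression, returning one value if that
expression evaluates to true, and a different one if it evaluates to false. Its syntax is:
condition ? result1 : result2
If condition is true, the entire expression evaluates to result1, and otherwise to result2.
For example: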
// conditional operator
#include <iostream>
using namespace std;

int main ()
{
  int a,b,c;

  a=2;
  b=7;
  c = (a>b) ? a : b;

  cout << c << '\n';
}

Output:
7
In this example, a was 2, and b was 7, so the expression being evaluated ( a>b) was
not true, thus the first value specified after the question mark was discarded in
favor of the second value (the one after the colon) which was b (with a value of 7).
Comma operator ( , )
The comma operator (,) is used to separate two or more expressions that are
included where only one expression is expected. When the set of expressions has to
be evaluated for a value, only the right-most expression is considered.
For example, the following code:
a = (b=3, b+2);
would first assign the value 3 to b, and then assign b+2 to variable a. So, at the end,
variable a would contain the value 5 while variable b would contain value 3.
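Bitwise operators ( &, |, ^, ~, <<, >> )
Bitwise operators modify variables considering the bit patterns that represent the
values they store.
operator  asm equivalent  description
&         AND             Bitwise AND
|         OR              Bitwise inclusive OR
^         XOR             Bitwise exclusive OR
~         NOT             Unary complement (bit inversion)
<<        SHL             Shift bits left
>>        SHR             Shift bits right
Explicit type casting operator
Type casting operators convert a value of a given type to another type. The simplest
way, inherited from the C language, is to precede the expression to be converted by
the new type enclosed between parentheses (()):
int i;
float f = 3.14;
i = (int) f;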
The previous code converts the floating-point number 3.14 to an integer value (3);
the remainder is lost. Here, the typecasting operator was (int). Another way to do
the same thing in C++ is to use the functional notation preceding the expression to
be converted by the type and enclosing the expression between parentheses:
i = int (f);
sizeof
This operator accepts one parameter, which can be either a type or a variable, and
returns the size in bytes of that type or object:
x = sizeof (char);
Here, x is assigned the value 1, because char is a type with a size of one byte.
Other operators
Later in these tutorials, we will see a few more operators, like the ones referring to
pointers or the specifics for object-oriented programming.
Precedence of operators
A single expression may have multiple operators. For example:
x = 5 + 7 % 2;
From greatest to smallest priority, C++ operators are evaluated in the following
order:
Level  Precedence group              Operator                          Description                       Grouping
1      Scope                         ::                                scope qualifier                   Left-to-right
2      Postfix (unary)               ++ --                             postfix increment / decrement     Left-to-right
3      Prefix (unary)                + -                               unary prefix                      Right-to-left
4      Pointer-to-member             .* ->*                            access pointer                    Left-to-right
5      Arithmetic: scaling           * / %                             multiply, divide, modulo          Left-to-right
6      Arithmetic: addition          + -                               addition, subtraction             Left-to-right
7      Bitwise shift                 << >>                             shift left, shift right           Left-to-right
8      Relational                    < > <= >=                         comparison operators              Left-to-right
9      Equality                      == !=                             equality / inequality             Left-to-right
10     And                           &                                 bitwise AND                       Left-to-right
11     Exclusive or                  ^                                 bitwise XOR                       Left-to-right
12     Inclusive or                  |                                 bitwise OR                        Left-to-right
13     Conjunction                   &&                                logical AND                       Left-to-right
14     Disjunction                   ||                                logical OR                        Left-to-right
15     Assignment-level expressions  = *= /= %= += -= >>= <<= &= ^= |= assignment / compound assignment  Right-to-left
                                     ?:                                conditional operator
16     Sequencing                    ,                                 comma separator                   Left-to-right
When an expression has two operators with the same precedence level, grouping
determines which one is evaluated first: either left-to-right or right-to-left.
Basic Input/Output
The example programs of the previous sections provided little interaction with the
user, if any at all. They simply printed simple values on screen, but the standard
library provides many additional ways to interact with the user via its input/output
features. This section will present a short introduction to some of the most useful.
C++ uses a convenient abstraction called streams to perform input and output
operations in sequential media such as the screen, the keyboard or a file. A stream
is an entity where a program can either insert or extract characters to/from. There is
no need to know details about the media associated to the stream or any of its
internal specifications. All we need to know is that streams are a source/destination
of characters, and that these characters are provided/accepted sequentially (i.e.,
one after another).
The standard library defines a handful of stream objects that can be used to access
what are considered the standard sources and destinations of characters by the
environment where the program runs:
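stream  description
cin     standard input stream
cout    standard output stream
cerr    standard error (output) stream
clog    standard logging (output) stream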
We are going to see in more detail only cout and cin (the standard output and input
streams). cerr and clog are also output streams, so they essentially work like cout;
the only difference is that they identify streams for specific purposes: error
messages and logging. In most environment setups, they do the exact same thing
as cout: they print on screen, although each can also be individually redirected.
Standard output (cout)
On most program environments, the standard output by default is the screen, and
the C++ stream object defined to access it is cout.
For formatted output operations, cout is used together with the insertion operator,
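which is written as << (i.e., two "less than" signs). For example, cout << "Output sentence";
prints Output sentence on screen, cout << 120; prints the number 120, and cout << x;
prints the value of the variable x.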
The << operator inserts the data that follows it into the stream that precedes it. In
the examples above, it inserted the literal string Output sentence, the number 120,
and the value of variable x into the standard output stream cout. Notice that the
sentence in the first statement is enclosed in double quotes ( ") because it is a string
literal, while in the last one, x is not. The double quoting is what makes the
difference; when the text is enclosed between them, the text is printed literally;
when they are not, the text is interpreted as the identifier of a variable, and its
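value is printed instead. For example, these two statements have very different
results: cout << "x"; prints the letter x, while cout << x; prints the value stored
in the variable x.
Multiple insertion operations (<<) may be chained in a single statement:
cout << "This " << "is a " << "single C++ statement";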
This last statement would print the text This is a single C++ statement. Chaining
insertions is especially useful to mix literals and variables in a single statement:
cout << "I am " << age << " years old and my zipcode is " << zipcode;
Assuming the age variable contains the value 24 and the zipcode variable contains
90064, the output of the previous statement would be:
I am 24 years old and my zipcode is 90064
What cout does not do automatically is add line breaks at the end, unless instructed
to do so. For example, take the following two statements inserting into cout:
cout << "This is a sentence.";
cout << "This is another sentence.";
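The output would be in a single line, without any line breaks in between. Something
like:
This is a sentence.This is another sentence.
To insert a line break, a new-line character ('\n') must be inserted at the exact
position where the line should break. For example, inserting a new-line character
after each of three sentences produces the following output:
First sentence.
Second sentence.
Third sentence.
Alternatively, the endl manipulator can also be used to break lines. For example,
inserting endl after each of two sentences prints:
First sentence.
Second sentence.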
The endl manipulator produces a newline character, exactly as the insertion of '\n'
does; but it also has an additional behavior: the stream's buffer (if any) is flushed,
which means that the output is requested to be physically written to the device, if it
wasn't already. This affects mainly fully buffered streams, and cout is (generally) not
a fully buffered stream. Still, it is generally a good idea to use endl only when
flushing the stream would be a feature and '\n' when it would not. Bear in mind
that a flushing operation incurs a certain overhead, and on some devices it may
produce a delay.
Standard input (cin)
In most program environments, the standard input by default is the keyboard, and
the C++ stream object defined to access it is cin.
For formatted input operations, cin is used together with the extraction operator,
which is written as >> (i.e., two "greater than" signs). This operator is then followed
by the variable where the extracted data is stored. For example:
int age;
cin >> age;
The first statement declares a variable of type int called age, and the second
extracts from cin a value to be stored in it. This operation makes the program wait
for input from cin; generally, this means that the program will wait for the user to
enter some sequence with the keyboard. In this case, note that the characters
introduced using the keyboard are only transmitted to the program when the ENTER
(or RETURN) key is pressed. Once the statement with the extraction operation on cin
is reached, the program will wait for as long as needed until some input is
introduced.
The extraction operation on cin uses the type of the variable after the >> operator
to determine how it interprets the characters read from the input; if it is an integer,
the format expected is a series of digits, if a string a sequence of characters, etc.
// i/o example
#include <iostream>
using namespace std;

int main ()
{
  int i;
  cout << "Please enter an integer value: ";
  cin >> i;
  cout << "The value you entered is " << i;
  cout << " and its double is " << i*2 << ".\n";
  return 0;
}

Sample run:
Please enter an integer value: 702
The value you entered is 702 and its double is 1404.
As you can see, extracting from cin seems to make the task of getting input from
the standard input pretty simple and straightforward. But this method also has a big
drawback. What happens in the example above if the user enters something else
that cannot be interpreted as an integer? Well, in this case, the extraction operation
fails. By default, the program then continues without storing a meaningful value in
variable i (since C++11, a failed extraction sets i to zero), so any later use of i
produces meaningless results.
This is very poor program behavior. Most programs are expected to behave in a
predictable manner no matter what the user types, handling invalid values
appropriately. Only very simple programs should rely on values extracted directly
from cin without further checking. A little later we will see how stringstreams can be
used to have better control over user input.
Extractions on cin can also be chained to request more than one datum in a single
statement:
cin >> a >> b;
This is equivalent to:
cin >> a;
cin >> b;
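In both cases, the user is expected to introduce two values, one for variable a, and
another for variable b. Any kind of whitespace can separate two consecutive input
operations; this may be a space, a tab, or a new-line character.
cin and strings
The extraction operator can also be used on cin to get strings of characters in the
same way as with fundamental data types:
string mystring;
cin >> mystring;
However, cin extraction always considers whitespace (spaces, tabs, new-lines) as
terminating the value being extracted, so extracting a string this way always
extracts a single word, not a phrase or an entire sentence.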
To get an entire line from cin, there exists a function, called getline, that takes the
stream (cin) as first argument, and the string variable as second. For example:
// cin with strings
#include <iostream>
#include <string>
using namespace std;

int main ()
{
  string mystr;
  cout << "What's your name? ";
  getline (cin, mystr);
  cout << "Hello " << mystr << ".\n";
  cout << "What is your favorite team? ";
  getline (cin, mystr);
  cout << "I like " << mystr << " too!\n";
  return 0;
}

Sample run:
What's your name? Homer Simpson
Hello Homer Simpson.
What is your favorite team? The Isotopes
I like The Isotopes too!
Notice how in both calls to getline, we used the same string identifier (mystr). What
the program does in the second call is simply replace the previous content with the
new one that is introduced.
The standard behavior that most users expect from a console program is that each
time the program queries the user for input, the user introduces the field, and then
presses ENTER (or RETURN). That is to say, input is generally expected to happen in
terms of lines on console programs, and this can be achieved by using getline to
obtain input from the user. Therefore, unless you have a strong reason not to, you
should always use getline to get input in your console programs instead of
extracting from cin.
stringstream
The standard header <sstream> defines a type called stringstream that allows a
string to be treated as a stream, and thus allowing extraction or insertion operations
from/to strings in the same way as they are performed on cin and cout. This feature
is most useful to convert strings to numerical values and vice versa. For example, in
order to extract an integer from a string we can write:
The code for this declares a string initialized to a value of "1204" and a variable of
type int. It then uses this variable to extract from a stringstream constructed from
the string, as in stringstream(mystr) >> myint; where mystr is the string. This
stores the numerical value 1204 in the variable called myint.
// stringstreams
#include <iostream>
#include <string>
#include <sstream>
using namespace std;

int main ()
{
  string mystr;
  float price=0;
  int quantity=0;

  cout << "Enter price: ";
  getline (cin,mystr);
  stringstream(mystr) >> price;
  cout << "Enter quantity: ";
  getline (cin,mystr);
  stringstream(mystr) >> quantity;
  cout << "Total price: " << price*quantity << endl;
  return 0;
}

Sample run:
Enter price: 22.25
Enter quantity: 7
Total price: 155.75
In this example, we acquire numeric values from the standard input indirectly:
Instead of extracting numeric values directly from cin, we get lines from it into a
string object (mystr), and then we extract the values from this string into the
variables price and quantity. Once these are numerical values, arithmetic
operations can be performed on them, such as multiplying them to obtain a total
price.
With this approach of getting entire lines and extracting their contents, we separate
the process of getting user input from its interpretation as data, allowing the input
process to be what the user expects, and at the same time gaining more control
over the transformation of its content into useful data by the program.
Program structure
A simple C++ statement is each of the individual instructions of a program, like the
variable declarations and expressions seen in previous sections. They always end
with a semicolon (;), and are executed in the same order in which they appear in a
program.
But programs are not limited to a linear sequence of statements. During its process,
a program may repeat segments of code, or take decisions and bifurcate. For that
purpose, C++ provides flow control statements that serve to specify what has to be
done by our program, when, and under which circumstances.
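Many of the flow control statements explained in this section require a generic
(sub)statement as part of their syntax. This statement may either be a simple C++
statement, such as a single instruction terminated with a semicolon (;), or a
compound statement. A compound statement is a group of statements (each of
them terminated by its own semicolon), all grouped together in a block
enclosed in curly braces ({}):
{ statement1; statement2; statement3; }
The entire block is considered a single statement (composed itself of multiple
substatements).
Selection statements: if and else
The if keyword is used to execute a statement or block if, and only if, a condition
is fulfilled. Its syntax is:
if (condition) statement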
Here, condition is the expression that is being evaluated. If this condition is true,
statement is executed. If it is false, statement is not executed (it is simply ignored),
and the program continues right after the entire selection statement.
For example, the following code fragment prints the message (x is 100), only if the
value stored in the x variable is indeed 100:
if (x == 100)
  cout << "x is 100";
If you want to include more than a single statement to be executed when the
condition is fulfilled, these statements shall be enclosed in braces ( {}), forming a
block:
if (x == 100)
{
   cout << "x is ";
   cout << x;
}
As usual, indentation and line breaks in the code have no effect, so the above code
is equivalent to:
if (x == 100) { cout << "x is "; cout << x; }
Selection statements with if can also specify what happens when the condition is
not fulfilled, by using the else keyword to introduce an alternative statement. Its
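syntax is:
if (condition) statement1 else statement2
where statement1 is executed if condition is true, and statement2 is executed otherwise.
For example:
if (x == 100)
  cout << "x is 100";
else
  cout << "x is not 100";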
This prints x is 100, if indeed x has a value of 100, but if it does not, and only if it
does not, it prints x is not 100 instead.
Several if + else structures can be concatenated with the intention of checking a
range of values. For example:
if (x > 0)
  cout << "x is positive";
else if (x < 0)
  cout << "x is negative";
else
  cout << "x is 0";
Iteration statements (loops)
Loops repeat a statement a certain number of times, or while a condition is fulfilled.
They are introduced by the keywords while, do, and for.
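The while loop
The simplest kind of loop is the while-loop. Its syntax is:
while (expression) statement
The while-loop simply repeats statement while expression is true. If, after any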
execution of statement, expression is no longer true, the loop ends, and the
program continues right after the loop. For example, let's have a look at a
countdown using a while-loop:
The first statement in main sets n to a value of 10. This is the first number in the
countdown. Then the while-loop begins: if this value fulfills the condition n>0 (that n
is greater than zero), then the block that follows the condition is executed, and
repeated for as long as the condition (n>0) remains being true.
The whole process of the previous program can be interpreted according to the
following script (beginning in main):
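1. n is assigned a value
2. The while condition is checked (n>0). At this point there are two possibilities:
   - condition is true: the statement is executed (to step 3)
   - condition is false: the statement is skipped and the program continues after it (to step 5)
3. Execute statement:
   cout << n << ", ";
   --n;
   (prints the value of n and decreases n by 1)
4. End of block. Return automatically to step 2.
5. Continue the program right after the block.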
A thing to consider with while-loops is that the loop should end at some point, and
thus the statement shall alter values checked in the condition in some way, so as to
force it to become false at some point. Otherwise, the loop will continue looping
forever. In this case, the loop includes --n, that decreases the value of the variable
that is being evaluated in the condition ( n) by one - this will eventually make the
condition (n>0) false after a certain number of loop iterations. To be more specific,
after 10 iterations, n becomes 0, making the condition no longer true, and ending
the while-loop.
Note that the complexity of this loop is trivial for a computer, and so the whole
countdown is performed instantly, without any practical delay between elements of
the count (if interested, see sleep_for for a countdown example with delays).
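The do-while loop
A very similar loop is the do-while loop, whose syntax is:
do statement while (condition);
It behaves like a while-loop, except that condition is evaluated after the execution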
of statement instead of before, guaranteeing at least one execution of statement,
even if condition is never fulfilled. For example, the following example program
echoes any text the user introduces until the user enters goodbye:
  } while (str != "goodbye");
}
The do-while loop is usually preferred over a while-loop when the statement needs
to be executed at least once, such as when the condition that is checked to end of
the loop is determined within the loop statement itself. In the previous example, the
user input within the block is what will determine if the loop ends. And thus, even if
the user wants to end the loop as soon as possible by entering goodbye, the block in
the loop needs to be executed at least once to prompt for input, and the condition
can, in fact, only be determined after it is executed.
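The for loop
The for loop is designed to iterate a number of times. Its syntax is:
for (initialization; condition; increase) statement;
Like the while-loop, this loop repeats statement while condition is true. But, in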
addition, the for loop provides specific locations to contain an initialization and
an increase expression, executed before the loop begins the first time, and after
each iteration, respectively. Therefore, it is especially useful to use counter variables
as condition.
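It works in the following way:
1. initialization is executed. Generally, this declares a counter variable and sets it to some initial value. This is executed a single time, at the beginning of the loop.
2. condition is checked. If it is true, the loop continues; otherwise, the loop ends, and statement is skipped, going directly to step 5.
3. statement is executed. As usual, it can be either a single statement or a block enclosed in curly braces { }.
4. increase is executed, and the loop gets back to step 2.
5. the loop ends: execution continues by the next statement after it.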
Here is the countdown example using a for loop:
#include <iostream>
using namespace std;

int main ()
{
  for (int n=10; n>0; n--) {
    cout << n << ", ";
  }
  cout << "liftoff!\n";
}
The three fields in a for-loop are optional. They can be left empty, but in all cases
the semicolon signs between them are required. For example, for (;n<10;) is a loop
without initialization or increase (equivalent to a while-loop); and for (;n<10;++n) is
a loop with increase, but no initialization (maybe because the variable was already
initialized before the loop). A loop with no condition is equivalent to a loop with true
as condition (i.e., an infinite loop).
Because each of the fields is executed in a particular time in the life cycle of a loop,
it may be useful to execute more than a single expression as any of initialization,
condition, or increase. Unfortunately, these are not statements, but rather, simple
expressions, and thus cannot be replaced by a block. As expressions, they can,
however, make use of the comma operator (,): This operator is an expression
separator, and can separate multiple expressions where only one is generally
expected. For example, using it, it would be possible for a for loop to handle two
counter variables, initializing and increasing both:
This loop will execute 50 times if neither n nor i is modified within the loop:
n starts with a value of 0, and i with 100, the condition is n!=i (i.e., that n is not
equal to i). Because n is increased by one, and i decreased by one on each
iteration, the loop's condition will become false after the 50th iteration, when both n
and i are equal to 50.
Range-based for loop
The for-loop has another syntax, which is used exclusively with ranges:
for ( declaration : range ) statement;
This kind of for loop iterates over all the elements in range, where declaration
declares some variable able to take the value of an element in this range. Ranges
are sequences of elements, including arrays, containers, and any other type
supporting the functions begin and end; Most of these types have not yet been
introduced in this tutorial, but we are already acquainted with at least one kind of
range: strings, which are sequences of characters.
Note how what precedes the colon (:) in the for loop is the declaration of a char
variable (the elements in a string are of type char). We then use this variable, c, in
the statement block to represent the value of each of the elements in the range.
This loop is automatic and does not require the explicit declaration of any counter
variable.
Range-based loops usually also make use of type deduction for the type of the
elements with auto. Typically, the range-based loop above can also be written with
auto c in place of char c.
Here, the type of c is automatically deduced as the type of the elements in str.
Jump statements
Jump statements allow altering the flow of a program by performing jumps to
specific locations.
The goto statement
goto allows making an absolute jump to another point in the program. This
unconditional jump ignores nesting levels, and does not cause any automatic stack
unwinding. Therefore, it is a feature to use with care, and preferably within the
same block of statements, especially in the presence of local variables.
The destination point is identified by a label, which is then used as an argument for
the goto statement. A label is made of a valid identifier followed by a colon (:).
goto is generally deemed a low-level feature, with no particular use cases in modern
higher-level programming paradigms generally used with C++. But, just as an
example, here is a version of our countdown loop using goto:
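Another selection statement: switch
The switch statement checks an expression against a number of possible constant
values. It is similar to concatenating if-else statements, but limited to constant
expressions. Its most typical syntax is:
switch (expression)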
{
case constant1:
group-of-statements-1;
break;
case constant2:
group-of-statements-2;
break;
.
.
.
default:
default-group-of-statements
}
It works in the following way: switch evaluates expression and checks if it is
equivalent to constant1; if it is, it executes group-of-statements-1 until it finds the
break statement. When it finds this break statement, the program jumps to the end
of the entire switch statement (the closing brace).
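If expression was not equal to constant1, it is then checked against constant2. If it
is equal to this, it executes group-of-statements-2 until a break is found, at which
point it jumps to the end of the switch.
Finally, if the value of expression did not match any of the previously specified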
constants (there may be any number of these), the program executes the
statements included after the default: label, if it exists (since it is optional).
Both of the following code fragments have the same behavior, demonstrating the if-
else equivalent of a switch statement:
switch example:
switch (x) {
  case 1:
    cout << "x is 1";
    break;
  case 2:
    cout << "x is 2";
    break;
  default:
    cout << "value of x unknown";
}

if-else equivalent:
if (x == 1) {
  cout << "x is 1";
}
else if (x == 2) {
  cout << "x is 2";
}
else {
  cout << "value of x unknown";
}
The switch statement has a somewhat peculiar syntax inherited from the early
times of the first C compilers, because it uses labels instead of blocks. In the most
typical use (shown above), this means that break statements are needed after each
group of statements for a particular label. If break is not included, all statements
following the case (including those under any other labels) are also executed, until
the end of the switch block or a jump statement (such as break) is reached.
If the example above lacked the break statement after the first group for case one,
the program would not jump automatically to the end of the switch block after
printing x is 1, and would instead continue executing the statements in case two
(thus printing also x is 2). It would then continue doing so until a break statement
is encountered, or the end of the switch block. This makes it unnecessary to enclose
the statements for each case in braces {}, and can also be useful to execute the
same group of statements for different possible values. For example:
switch (x) {
  case 1:
  case 2:
  case 3:
    cout << "x is 1, 2 or 3";
    break;
  default:
    cout << "x is not 1, 2 nor 3";
}
Notice that switch is limited to comparing its evaluated expression against labels that
are constant expressions. It is not possible to use variables as labels or ranges,
because they are not valid C++ constant expressions.
To check for ranges or values that are not constant, it is better to use
concatenations of if and else if statements.
Functions
In C++, a function is a group of statements that is given a name, and which can be
called from some point of the program. The most common syntax to define a
function is:
type name ( parameter1, parameter2, ...) { statements }
Where:
- type is the type of the value returned by the function.
- name is the identifier by which the function can be called.
- parameters (as many as needed): Each parameter consists of a type followed by an
identifier, with each parameter being separated from the next by a comma. Each
parameter looks very much like a regular variable declaration (for example: int x),
and in fact acts within the function as a regular variable which is local to the
function. The purpose of parameters is to allow passing arguments to the function
from the location where it is called from.
- statements is the function's body. It is a block of statements surrounded by braces
{ } that specify what the function actually does.
Let's have a look at an example:
// function example
#include <iostream>
using namespace std;

int addition (int a, int b)
{
  int r;
  r=a+b;
  return r;
}

int main ()
{
  int z;
  z = addition (5,3);
  cout << "The result is " << z;
}

Output:
The result is 8
This program is divided into two functions: addition and main. Remember that no
matter the order in which they are defined, a C++ program always starts by calling
main. In fact, main is the only function called automatically, and the code in any
other function is only executed if that function is called from main (directly or
indirectly).
In the example above, main begins by declaring the variable z of type int, and right
after that, it performs the first function call: it calls addition. The call to a function
follows a structure very similar to its declaration. In the example above, the call to
addition can be compared to its definition just a few lines earlier:
At the point at which the function is called from within main, control is passed to
function addition: here, execution of main is stopped, and will only resume once the
addition function ends. At the moment of the function call, the values of both
arguments (5 and 3) are copied to the local variables a and b within the
function.
Then, inside addition, another local variable is declared (int r), and by means of
the expression r=a+b, the result of a plus b is assigned to r; which, for this case,
where a is 5 and b is 3, means that 8 is assigned to r.
The final statement within the function:
return r;
ends function addition, and returns control back to the point where the function
was called; in this case, to function main. At this precise moment, the program
resumes its course in main, returning exactly to the same point at which it was
interrupted by the call to addition. But additionally, because addition has a return
type, the call is evaluated as having a value, and this value is the value specified in
the return statement that ended addition: in this particular case, the value of the
local variable r, which at the moment of the return statement had a value of 8.
Therefore, the call to addition is an expression with the value returned by the
function, and in this case, that value, 8, is assigned to z. It is as if the entire function
call (addition(5,3)) was replaced by the value it returns (i.e., 8).
A function can actually be called multiple times within a program, and its arguments
are naturally not limited to literals:
  z = 4 + subtraction (x,y);
  cout << "The fourth result is " << z << '\n';
}
Similar to the addition function in the previous example, this example defines a
subtraction function that simply returns the difference between its two parameters.
This time, main calls this function several times, demonstrating more possible ways
in which a function can be called.
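A minimal sketch of such a function, assuming it mirrors addition from the previous example. The body of subtraction and the helper fourth_result are assumptions made for illustration; fourth_result only wraps the call pattern 4 + subtraction (x,y) so that it can be tried on its own:

```cpp
// Assumed definition: returns the difference between its two parameters.
int subtraction (int a, int b)
{
  int r;
  r = a - b;
  return r;
}

// Hypothetical helper: the function call is used as an operand of an addition.
int fourth_result (int x, int y)
{
  return 4 + subtraction (x, y);
}
```

With x equal to 5 and y equal to 3, subtraction (x,y) evaluates to 2 and fourth_result (x,y) to 6.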
Let's examine each of these calls, bearing in mind that each function call is itself an
expression that is evaluated as the value it returns. Again, you can think of it as if
the function call was itself replaced by the returned value:
z = subtraction (7,2);
cout << "The first result is " << z;
If we replace the function call by the value it returns (i.e., 5), we would have:
z = 5;
cout << "The first result is " << z;
In a call such as subtraction (x,y), the arguments passed to subtraction are variables
instead of literals. That is also valid, and works fine. The function is called with
the values that x and y have at the moment of the call: 5 and 3 respectively,
returning 2 as the result.
The fourth call is again similar:
z = 4 + subtraction (x,y);
The only difference is that now the function call is also an operand of an addition
operation. Again, the result is the same as if the function call was replaced by its
result: 6. Note that, thanks to the commutative property of addition, the above can
also be written as:
z = subtraction (x,y) + 4;
With exactly the same result. Note also that the semicolon does not necessarily go
after the function call, but, as always, at the end of the whole statement. Again, the
logic behind this may be easily seen by replacing the function calls by their
returned value:
z = 4 + 2;    // same as z = 4 + subtraction (x,y);
z = 2 + 4;    // same as z = subtraction (x,y) + 4;
The function definition syntax shown earlier requires the declaration to begin with a
type. This is the type of the value returned
by the function. But what if the function does not need to return a value? In this
case, the type to be used is void, which is a special type to represent the absence of
value. For example, a function that simply prints a message may not need to return
any value:
{
  printmessage ();
}
void can also be used in the function's parameter list to explicitly specify that the
function takes no actual parameters when called. For example, printmessage could
have been declared with void as its parameter list, that is, as void printmessage (void).
In C++, an empty parameter list can be used instead of void with the same meaning,
but the use of void in the parameter list was popularized by the C language, where
this is a requirement.
What is never optional are the parentheses that follow the function name, neither in
its declaration nor when calling it. Even when the function takes no parameters, at
least an empty pair of parentheses shall always be appended to the function name.
See how printmessage was called in an earlier
example:
printmessage ();
The parentheses are what differentiate functions from other kinds of declarations or
statements. The following would not call the function:
printmessage;
Well, there is a catch: if the execution of main ends normally without encountering a
return statement, the compiler assumes the function ends with an implicit return
statement:
return 0;
Note that this only applies to function main for historical reasons. All other functions
with a return type shall end with a proper return statement that includes a return
value, even if this is never used.
Because the implicit return 0; statement for main is a tricky exception, some
authors consider it good practice to explicitly write the statement.
In this case, function addition is passed 5 and 3, which are copies of the values of x
and y, respectively. These values (5 and 3) are used to initialize the variables set as
parameters in the function's definition, but any modification of these variables
within the function has no effect on the values of the variables x and y outside it,
because x and y were themselves not passed to the function on the call, but only
copies of their values at that moment.
In certain cases, though, it may be useful to access an external variable from within
a function. To do that, arguments can be passed by reference, instead of by value.
For example, the function duplicate in this code duplicates the value of its three
arguments, causing the variables used as arguments to actually be modified by the
call:
To gain access to its arguments, the function declares its parameters as references.
In C++, references are indicated with an ampersand (&) following the parameter
type, as in the parameters taken by duplicate.
When a variable is passed by reference, what is passed is no longer a copy but the
variable itself: the variable identified by the function parameter becomes
associated with the argument passed to the function, and any modification of the
corresponding local variable within the function is reflected in the variable
passed as argument in the call.
In fact, a, b, and c become aliases of the arguments passed on the function call (x, y,
and z) and any change on a within the function is actually modifying variable x
outside the function. Any change on b modifies y, and any change on c modifies z.
That is why when, in the example, function duplicate modifies the values of
variables a, b, and c, the values of x, y, and z are affected.
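A minimal sketch of such a function, assuming that "duplicates" means doubling each value; the body shown here is an assumption, while the name duplicate, the parameter names a, b, and c, and the reference parameters come from the text:

```cpp
// The ampersands make a, b and c aliases of the caller's variables.
void duplicate (int& a, int& b, int& c)
{
  a *= 2;   // modifies the first argument itself, not a copy
  b *= 2;
  c *= 2;
}
```

Called as duplicate (x, y, z) with x, y, and z holding 1, 3, and 7, the variables hold 2, 6, and 14 after the call.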
If the parameters of duplicate were declared without the ampersand signs, the
variables would not be passed by reference, but by value, creating instead
copies of their values. In this case, the output of the program would have been the
values of x, y, and z without being modified (i.e., 1, 3, and 7).
Consider a function concatenate that takes two strings as parameters (by value), and
returns the result of concatenating them. By passing the arguments by value, the
function forces a and b to be copies of the arguments passed to the function when it
is called. And if these are long strings, it may mean copying large quantities of
data just for the function call.
But this copy can be avoided altogether if both parameters are made references:
Arguments by reference do not require a copy. The function operates directly on
(aliases of) the strings passed as arguments, and, at most, it might mean the
transfer of certain pointers to the function. In this regard, the version of concatenate
taking references is more efficient than the version taking values, since it does not
need to copy expensive-to-copy strings.
On the flip side, functions with reference parameters are generally perceived as
functions that modify the arguments passed, because that is what reference
parameters are actually for.
The solution is for the function to guarantee that its reference parameters are not
going to be modified by this function. This can be done by qualifying the parameters
as constant:
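A minimal sketch of the const-qualified form next to the by-value form. The name concatenate and the parameter names a and b come from the text; the name concatenate_by_value and both function bodies are assumptions made for illustration:

```cpp
#include <string>

// By value: a and b are copies of the arguments.
std::string concatenate_by_value (std::string a, std::string b)
{
  return a + b;
}

// By const reference: no copy is made, and the function promises
// not to modify the strings it was given.
std::string concatenate (const std::string& a, const std::string& b)
{
  return a + b;
}
```

Both versions return the same result; inside the second one, an attempt to assign to a or b would be rejected by the compiler.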
Inline functions
Calling a function generally causes a certain overhead (stacking arguments, jumps,
etc.), and thus for very short functions, it may be more efficient to simply insert
the code of the function where it is called, instead of performing the process of
formally calling a function.
Preceding a function declaration with the inline specifier informs the compiler that
inline expansion is preferred over the usual function call mechanism for a specific
function. This does not change the behavior of a function at all, but is merely used
to suggest to the compiler that the code generated by the function body be
inserted at each point the function is called, instead of being invoked with a regular
function call.
For example, the concatenate function above may be declared inline as:
This informs the compiler that when concatenate is called, the program prefers the
function to be expanded inline, instead of performing a regular call. inline is only
specified in the function declaration, not when it is called.
Note that most compilers already optimize code to generate inline functions when
they see an opportunity to improve efficiency, even if not explicitly marked with the
inline specifier. Therefore, this specifier merely indicates to the compiler that
inlining is preferred for this function, although the compiler is free to not inline
it, and optimize otherwise. In C++, optimization is a task delegated to the compiler,
which is free to generate any code for as long as the resulting behavior is the one
specified by the code.
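A minimal sketch of an inline declaration, assuming concatenate takes its strings by const reference and joins them with operator+ (the body is an assumption made for illustration):

```cpp
#include <string>

// inline appears only here, in the declaration; calls look like any other call.
inline std::string concatenate (const std::string& a, const std::string& b)
{
  return a + b;
}
```

A call such as concatenate ("in", "line") is written exactly as before; whether the body is actually expanded in place is left to the compiler.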
In this example, there are two calls to function divide. In the first one:
divide (12)
The call only passes one argument to the function, even though the function has
two parameters. In this case, the function assumes the second parameter to be 2
(notice the function definition, which declares its second parameter as int b=2).
Therefore, the result is 6.
In the second call:
divide (20,4)
The call passes two arguments to the function. Therefore, the default value for b
(int b=2) is ignored, and b takes the value passed as argument, that is 4, yielding a
result of 5.
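A minimal sketch of a function with a default value for its second parameter. The name divide and the declaration int b=2 come from the text; the first parameter name and the body are assumptions:

```cpp
// If the call omits the second argument, b takes the default value 2.
int divide (int a, int b = 2)
{
  int r;
  r = a / b;
  return r;
}
```

Here divide (12) behaves like divide (12,2), while divide (20,4) ignores the default.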
Declaring functions
In C++, identifiers can only be used in expressions once they have been declared.
For example, some variable x cannot be used before being declared with a
statement, such as:
int x;
The same applies to functions. Functions cannot be called before they are declared.
That is why, in all the previous examples of functions, the functions were always
defined before the main function, which is the function from where the other
functions were called. If main were defined before the other functions, this would
break the rule that functions shall be declared before being used, and thus would
not compile.
The prototype of a function can be declared without actually defining the function
completely, giving just enough details to allow the types involved in a function call
to be known. Naturally, the function shall be defined somewhere else, like later in
the code. But at least, once declared like this, it can already be called.
The declaration shall include all types involved (the return type and the type of its
arguments), using the same syntax as used in the definition of the function, but
replacing the body of the function (the block of statements) with an ending
semicolon.
The parameter list does not need to include the parameter names, but only their
types. Parameter names can nevertheless be specified, but they are optional, and
do not need to necessarily match those in the function definition. For example, a
function called protofunction with two int parameters can be declared with either of
these statements:
Anyway, including a name for each parameter always improves legibility of the
declaration.
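A minimal sketch of both forms of declaration. The name protofunction and its two int parameters come from the text; the int return type, the parameter names, the helper call_before_definition, and the body of the definition are assumptions made for illustration:

```cpp
// Two equivalent declarations (prototypes) of the same function.
int protofunction (int first, int second);   // with parameter names
int protofunction (int, int);                // types only

// Hypothetical caller: it compiles because the prototype is already known,
// even though the definition only appears further down.
int call_before_definition ()
{
  return protofunction (2, 3);
}

// The definition. Its parameter names need not match those in the prototype.
int protofunction (int a, int b)
{
  return a + b;
}
```

Declaring the same function more than once is allowed, as long as the declarations agree on the types involved.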
This example is indeed not an example of efficiency. You can probably write yourself
a version of this program with half the lines of code. Anyway, this example
illustrates how functions can be declared before their definition.
The prototypes of odd and even already contain all that is necessary to call them:
their name, the types of their arguments, and their return type (void in
this case). With these prototype declarations in place, the functions can be called
before they are entirely defined, allowing, for example, the function from where
they are called (main) to be placed before the actual definition of these functions.
But declaring functions before being defined is not only useful to reorganize the
order of functions within the code. In some cases, such as in this particular case, at
least one of the declarations is required, because odd and even are mutually called;
there is a call to even in odd and a call to odd in even. And, therefore, there is no way
to structure the code so that odd is defined before even, and even before odd.
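A minimal sketch of two mutually calling functions. The names odd and even come from the text, but this version is an assumption in two respects: the bodies are illustrative, and the functions return a string instead of having return type void, purely so that the result is easy to inspect:

```cpp
#include <string>

// Prototypes: at least one of them is required, because each function calls the other.
std::string odd (int x);
std::string even (int x);

std::string odd (int x)
{
  if ((x % 2) != 0) return "It is odd.";
  else return even (x);      // even is already declared, although defined below
}

std::string even (int x)
{
  if ((x % 2) == 0) return "It is even.";
  else return odd (x);
}
```

Whichever function is called first, the answer comes from the one whose test succeeds.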
Recursivity
Recursivity is the ability of a function to call itself. It is
useful for some tasks, such as sorting elements, or calculating the factorial of
numbers. For example, in order to obtain the factorial of a number (n!) the
mathematical formula would be:
n! = n * (n-1) * (n-2) * (n-3) ... * 1
More concretely, 5! (factorial of 5) would be:
5! = 5 * 4 * 3 * 2 * 1 = 120
And a recursive function to calculate this in C++ could be:
#include <iostream>
using namespace std;

long factorial (long a)
{
  if (a > 1)
    return (a * factorial (a-1));
  else
    return 1;
}

int main ()
{
  long number = 9;
  cout << number << "! = " << factorial (number);
  return 0;
}
Notice how in function factorial we included a call to itself, but only if the argument
passed was greater than 1, since, otherwise, the function would perform an infinite
recursive loop, in which once it arrived at 0, it would continue multiplying by all the
negative numbers (probably provoking a stack overflow at some point during
runtime).
Overloaded functions
In C++, two different functions can have the same name if their parameters are
different; either because they have a different number of parameters, or because
any of their parameters are of a different type. For example:
// overloading functions
#include <iostream>
using namespace std;

int operate (int a, int b)
{
  return (a*b);
}

double operate (double a, double b)
{
  return (a/b);
}

int main ()
{
  int x=5,y=2;
  double n=5.0,m=2.0;
  cout << operate (x,y) << '\n';   // prints 10
  cout << operate (n,m) << '\n';   // prints 2.5
  return 0;
}
In this example, there are two functions called operate, but one of them has two
parameters of type int, while the other has them of type double. The compiler
knows which one to call in each case by examining the types passed as arguments
when the function is called. If it is called with two int arguments, it calls the
function that has two int parameters, and if it is called with two doubles, it calls the
one with two doubles.
In this example, the two functions have quite different behaviors: the int version
multiplies its arguments, while the double version divides them. This is generally not
a good idea. Two functions with the same name are generally expected to have at
least a similar behavior, but this example demonstrates that it is entirely possible for
them not to. Two overloaded functions (i.e., two functions with the same name)
have entirely different definitions; they are, for all purposes, different functions that
only happen to have the same name.
Note that a function cannot be overloaded only by its return type. At least one of its
parameters must have a different type.
Function templates
Overloaded functions may have the same definition. For example:
// overloaded functions
#include <iostream>
using namespace std;

int sum (int a, int b)
{
  return a+b;
}

double sum (double a, double b)
{
  return a+b;
}

int main ()
{
  cout << sum (10,20) << '\n';     // prints 30
  cout << sum (1.0,1.5) << '\n';   // prints 2.5
  return 0;
}
Here, sum is overloaded with different parameter types, but with the exact same
body.
The function sum could be overloaded for a lot of types, and it could make sense for
all of them to have the same body. For cases such as this, C++ has the ability to
define functions with generic types, known as function templates. Defining a
function template follows the same syntax as a regular function, except that it is
preceded by the template keyword and a series of template parameters enclosed in
angle-brackets <>:
template <template-parameters> function-declaration
It makes no difference whether the generic type is specified with keyword class or
keyword typename in the template argument list (they are 100% synonyms in
template declarations).
In the code above, declaring SomeType (a generic type within the template
parameters enclosed in angle-brackets) allows SomeType to be used anywhere in the
function definition, just as any other type; it can be used as the type for
parameters, as return type, or to declare new variables of this type. In all cases, it
represents a generic type that will be determined at the moment the template is
instantiated.
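A minimal sketch of a generic sum function along these lines. The names sum and SomeType come from the text; the body is assumed to be the same a+b as in the overloaded versions:

```cpp
// SomeType stands for whatever type the template is instantiated with.
template <class SomeType>
SomeType sum (SomeType a, SomeType b)
{
  return a + b;
}
```

Writing sum<int>(10,20) instantiates the template with int, and sum<double>(1.0,1.5) instantiates it with double; each instantiation is a separate function generated from the same definition.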
x = sum<int>(10,20);
The function sum<int> is just one of the possible instantiations of function template
sum. In this case, by using int as template argument in the call, the compiler
automatically instantiates a version of sum where each occurrence of SomeType is
replaced by int, as if it had been defined as int sum (int a, int b). A complete example:
// function template
#include <iostream>
using namespace std;

template <class T>
T sum (T a, T b)
{
  T result;
  result = a + b;
  return result;
}

int main () {
  int i=5, j=6, k;
  double f=2.0, g=0.5, h;
  k=sum<int>(i,j);
  h=sum<double>(f,g);
  cout << k << '\n';   // prints 11
  cout << h << '\n';   // prints 2.5
  return 0;
}
In this case, we have used T as the template parameter name, instead of SomeType.
It makes no difference, and T is actually a quite common template parameter name
for generic types.
In the example above, we used the function template sum twice. The first time with
arguments of type int, and the second one with arguments of type double. The
compiler has instantiated and then called each time the appropriate version of the
function.
Note how T is also used to declare a local variable of that (generic) type within
sum:
T result;
Therefore, result will be a variable of the same type as the parameters a and b, and
as the type returned by the function.
In this specific case where the generic type T is used as a parameter for sum, the
compiler is even able to deduce the data type automatically without having to
explicitly specify it within angle brackets. Therefore, instead of explicitly specifying
the template arguments with:
k = sum<int> (i,j);
h = sum<double> (f,g);
it is possible to simply write:
k = sum (i,j);
h = sum (f,g);
without the type enclosed in angle brackets. Naturally, for that, the type shall be
unambiguous. If sum is called with arguments of different types, the compiler may
not be able to deduce the type of T automatically.
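A minimal sketch of deduction and of one way around an ambiguous call. The helper functions deduced_int, deduced_double, and mixed are hypothetical and exist only to show the three call forms:

```cpp
template <class T>
T sum (T a, T b)
{
  return a + b;
}

// Both arguments are int, so T is deduced as int.
int deduced_int ()
{
  return sum (5, 6);
}

// Both arguments are double, so T is deduced as double.
double deduced_double ()
{
  return sum (2.0, 0.5);
}

// sum (1, 2.5) would not compile: T cannot be both int and double.
// Naming the type explicitly removes the ambiguity; 1 is converted to double.
double mixed ()
{
  return sum<double> (1, 2.5);
}
```

Specifying the template argument explicitly is only one option; converting one of the arguments so that both have the same type works as well.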
Templates are a powerful and versatile feature. They can have multiple template
parameters, and the function can still use regular non-templated types. For
example:
// function templates
#include <iostream>
using namespace std;

template <class T, class U>
bool are_equal (T a, U b)
{
  return (a==b);
}

int main ()
{
  if (are_equal(10,10.0))
    cout << "x and y are equal\n";
  else
    cout << "x and y are not equal\n";
  return 0;
}
Note that this example uses automatic template parameter deduction in the call to
are_equal:
are_equal(10,10.0)
is equivalent to:
are_equal<int,double>(10,10.0)
// template arguments
#include <iostream>
using namespace std;

template <class T, int N>
T fixed_multiply (T val)
{
  return val * N;
}

int main() {
  std::cout << fixed_multiply<int,2>(10) << '\n';   // prints 20
  std::cout << fixed_multiply<int,3>(10) << '\n';   // prints 30
}
The second argument of the fixed_multiply function template is of type int. It just
looks like a regular function parameter, and can actually be used just like one.
But there exists a major difference: the value of template parameters is determined
at compile time to generate a different instantiation of the function fixed_multiply,
and thus the value of that argument is never passed during runtime. The two calls
to fixed_multiply in main essentially call two versions of the function: one that
always multiplies by two, and one that always multiplies by three. For that same
reason, the second template argument needs to be a constant expression (it cannot
be passed a variable).
Name visibility
Scopes
Named entities, such as variables, functions, and compound types need to be
declared before being used in C++. The point in the program where this declaration
happens influences its visibility:
An entity declared outside any block has global scope, meaning that its name is
valid anywhere in the code, while an entity declared within a block, such as a
function or a selective statement, has block scope, and is only visible within the
specific block in which it is declared, but not outside it.
For example, a variable declared in the body of a function is a local variable whose
visibility extends until the end of the function (i.e., until the brace } that closes the
function definition), but not outside it.
In each scope, a name can only represent one entity. For example, there cannot be
two variables with the same name in the same scope:
int some_function ()
{
  int x;
  x = 0;
  double x;   // wrong: name already used in this scope
  x = 0.0;
}
The visibility of an entity with block scope extends until the end of the block,
including inner blocks. Nevertheless, an inner block, because it is a different block,
can reuse a name existing in an outer scope to refer to a different entity; in this
case, the name will refer to a different entity only within the inner block, hiding the
entity it names outside. Outside the inner block, it will still refer to the original
entity. For example:
Note that y is not hidden in the inner block, and thus accessing y still accesses the
outer variable.
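A minimal sketch of name hiding in an inner block. The function name inner_then_outer and the values 10, 20, and 50 are assumptions made for illustration; only the roles of x (hidden) and y (not hidden) follow the text:

```cpp
// Hypothetical function: returns the sum of the outer x and y after the inner block ends.
int inner_then_outer ()
{
  int x = 10;
  int y = 20;
  {
    int x;     // ok: a new x in the inner block, hiding the outer x
    x = 50;    // sets the inner x only
    y = 50;    // y is not hidden, so this sets the outer y
  }
  return x + y;   // the outer x is still 10; y is now 50
}
```

The assignment to x inside the block is lost when the block ends, while the assignment to y remains visible afterwards.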
Namespaces
Only one entity can exist with a particular name in a particular scope. This is seldom
a problem for local names, since blocks tend to be relatively short, and names have
particular purposes within them, such as naming a counter variable, an argument,
etc.
But non-local names bring more possibilities for name collision, especially
considering that libraries may declare many functions, types, and variables, none
of them local in nature, and some of them very generic.
Namespaces allow us to group named entities that otherwise would have global
scope into narrower scopes, giving them namespace scope. This allows organizing
the elements of programs into different logical scopes referred to by names.
The syntax to declare a namespace is:
namespace identifier
{
named_entities
}
Where identifier is any valid identifier and named_entities is the set of variables,
types and functions that are included within the namespace. For example:
namespace myNamespace
{
  int a, b;
}
In this case, the variables a and b are normal variables declared within a namespace
called myNamespace.
These variables can be accessed from within their namespace normally, with their
identifier (either a or b), but if accessed from outside the myNamespace namespace
they have to be properly qualified with the scope operator ::. For example, to
access the previous variables from outside myNamespace they should be qualified like:
myNamespace::a
myNamespace::b
// namespaces
#include <iostream>
using namespace std;

namespace foo
{
  int value() { return 5; }
}

namespace bar
{
  const double pi = 3.1416;
  double value() { return 2*pi; }
}

int main () {
  cout << foo::value() << '\n';   // prints 5
  cout << bar::value() << '\n';   // prints 6.2832
  cout << bar::pi << '\n';        // prints 3.1416
  return 0;
}
In this case, there are two functions with the same name: value. One is defined
within the namespace foo, and the other one in bar. No redefinition errors happen
thanks to namespaces. Notice also how pi is accessed in an unqualified manner
from within namespace bar (just as pi), while it is again accessed in main, but here it
needs to be qualified as bar::pi.
Namespaces can be split: two segments of code can be declared in the same
namespace:
This declares three variables: a and c are in namespace foo, while b is in namespace
bar. Namespaces can even extend across different translation units (i.e., across
different files of source code).
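A minimal sketch of a namespace declared in two segments. The namespace and variable names follow the text; the initial values and the helper total are assumptions made for illustration:

```cpp
namespace foo { int a = 1; }
namespace bar { int b = 2; }
namespace foo { int c = 3; }   // reopens foo: c joins a in the same namespace

// Hypothetical helper: reaches all three variables through qualified names.
int total ()
{
  return foo::a + bar::b + foo::c;
}
```

Both foo::a and foo::c resolve, even though they were declared in separate segments.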
using
The keyword using introduces a name into the current declarative region (such as a
block), thus avoiding the need to qualify the name. For example:
Notice how in main, the variable x (without any name qualifier) refers to first::x,
whereas y refers to second::y, just as specified by the using declarations. The
variables first::y and second::x can still be accessed, but require fully qualified
names.
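A minimal sketch of using declarations. The namespaces first and second and the variables x and y come from the text; the variable types and values, and the three helper functions that stand in for the body of main, are assumptions made for illustration:

```cpp
namespace first
{
  int x = 5;
  int y = 10;
}

namespace second
{
  double x = 3.1416;
  double y = 2.7183;
}

// Hypothetical helpers standing in for the body of main.
int unqualified_x ()
{
  using first::x;    // from here on, x means first::x
  return x;
}

double unqualified_y ()
{
  using second::y;   // from here on, y means second::y
  return y;
}

double still_reachable ()
{
  return second::x;  // fully qualified names keep working
}
```

Each using declaration introduces exactly one name, so the other variable of the same namespace still requires its qualifier.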
The keyword using can also be used as a directive to introduce an entire namespace.
With using namespace first declared in main, all direct uses of x
and y without name qualifiers are also looked up in namespace first.
using and using namespace have validity only in the same block in which they are
stated or in the entire source code file if they are used directly in the global scope.
For example, it would be possible to first use the objects of one namespace and
then those of another one by splitting the code in different blocks:
// using namespace example
#include <iostream>
using namespace std;

namespace first
{
  int x = 5;
}

namespace second
{
  double x = 3.1416;
}

int main () {
  {
    using namespace first;
    cout << x << '\n';
  }
  {
    using namespace second;
    cout << x << '\n';
  }
  return 0;
}
Namespace aliasing
Existing namespaces can be aliased with new names, with the following syntax:
namespace new_name = current_name;
The directive using namespace std;, which appears in most examples in these
tutorials, introduces direct visibility of all the names of the std namespace into the
code. This is done in these tutorials to facilitate comprehension and shorten the
length of the examples, but many programmers prefer to qualify each of the elements
of the standard library used in their programs. For example, instead of cout, they
write std::cout.
Whether the elements in the std namespace are introduced with using declarations
or are fully qualified on every use does not change the behavior or efficiency of the
resulting program in any way. It is mostly a matter of style preference, although for
projects mixing libraries, explicit qualification tends to be preferred.
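A minimal sketch combining a namespace alias with explicit std:: qualification. All names in it (a_rather_long_namespace_name, short_name, greeting, greet_via_alias) are hypothetical and chosen only for illustration:

```cpp
#include <string>

namespace a_rather_long_namespace_name
{
  // std::string is fully qualified: no using directive is needed.
  std::string greeting () { return "Hello world!"; }
}

// The alias is a second name for the same namespace, not a copy of it.
namespace short_name = a_rather_long_namespace_name;

std::string greet_via_alias ()
{
  return short_name::greeting ();
}
```

Both short_name::greeting () and a_rather_long_namespace_name::greeting () refer to the same function.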
Storage classes
The storage for variables with global or namespace scope is allocated for the entire
duration of the program. This is known as static storage, and it contrasts with the
storage for local variables (those declared within a block). These use what is known
as automatic storage. The storage for local variables is only available during the
block in which they are declared; after that, that same storage may be used for a
local variable of some other function, or used otherwise.
But there is another substantial difference between variables with static storage and
variables with automatic storage:
- Variables with static storage (such as global variables) that are not explicitly
initialized are automatically initialized to zeroes.
- Variables with automatic storage (such as local variables) that are not explicitly
initialized are left uninitialized, and thus have an undetermined value.
For example:
The actual output may vary, but only the value of x is guaranteed to be zero. y can
actually contain just about any value (including zero).
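A minimal sketch of the difference. The helper functions read_global and read_local are hypothetical. Because reading an uninitialized local variable is undefined behavior, this version deliberately gives the local variable an explicit initial value instead of printing an indeterminate one:

```cpp
int x;   // static storage: not explicitly initialized, so it starts at zero

int read_global ()
{
  return x;
}

int read_local ()
{
  int y = 0;   // automatic storage: initialized explicitly, since nothing else would do it
  return y;
}
```

Removing the = 0 from the declaration of y would leave its value undetermined, which is the situation the text warns about.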
Arrays
An array makes it possible, for example, to declare five values of type int
without having to declare 5 different variables (each with its own identifier). Instead,
using an array, the five int values are stored in contiguous memory locations, and
all five can be accessed using the same identifier, with the proper index.
For example, an array containing 5 integer values of type int called foo could be
represented as:
         0     1     2     3     4
foo:  [    ][    ][    ][    ][    ]
where each blank panel represents an element of the array. In this case, these are
values of type int. These elements are numbered from 0 to 4, 0 being the first and
4 the last. In C++, the first element in an array is always numbered with a zero (not
a one), no matter its length.
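Like a regular variable, an array must be declared before it is used. A typical
declaration for an array in C++ is:
type name [elements];
where type is a valid type (such as int, float...), name is a valid identifier and the
elements field (which is always enclosed in square brackets []) specifies the length
of the array in terms of the number of elements.
Therefore, the foo array, with five elements of type int, can be declared as:
int foo [5];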
NOTE: The elements field within square brackets [], representing the number of
elements in the array, must be a constant expression, since arrays are blocks of
static memory whose size must be determined at compile time, before the program
runs.
Initializing arrays
By default, regular arrays of local scope (for example, those declared within a
function) are left uninitialized. This means that none of their elements are set to any
particular value; their contents are undetermined at the point the array is declared.
But the elements in an array can be explicitly initialized to specific values when it is
declared, by enclosing those initial values in braces {}. For example:
int foo [5] = { 16, 2, 77, 40, 12071 };
The number of values between braces {} shall not be greater than the number of
elements in the array. In the example above, foo was declared having
5 elements (as specified by the number enclosed in square brackets, []), and the
braces {} contained exactly 5 values, one for each element. If declared with fewer,
the remaining elements are set to their default values (which for fundamental types
means they are filled with zeroes). The initializer can even have no values, just the
braces: for an array declared with five elements, this creates five int values, each
initialized with a value of zero.
When an initialization of values is provided for an array, C++ allows the possibility
of leaving the square brackets empty []. In this case, the compiler will automatically
assume a size for the array that matches the number of values included
between the braces {}:
int foo [] = { 16, 2, 77, 40, 12071 };
After this declaration, array foo would be 5 int long, since we have provided 5
initialization values.
Finally, the evolution of C++ has led to the adoption of universal initialization also
for arrays. Therefore, there is no longer need for the equal sign between the
declaration and the initializer; the same array can be declared with or without it.
Static arrays, and those declared directly in a namespace (outside any function), are
always initialized. If no explicit initializer is specified, all the elements are
default-initialized (with zeroes, for fundamental types).
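A minimal sketch gathering the initialization forms described in this section. The function names, the array names bar, baz, and qux, and the values 10, 20, 30 are assumptions made for illustration; the braces-without-equal-sign form assumes a compiler supporting C++11 or later:

```cpp
// Fewer initializers than elements: the remaining elements are set to zero.
int partial_initializer_tail ()
{
  int bar [5] = { 10, 20, 30 };
  return bar[3] + bar[4];
}

// Empty braces: every element is set to zero.
int empty_initializer_sum ()
{
  int baz [5] = { };
  int total = 0;
  for (int n = 0; n < 5; ++n)
    total += baz[n];
  return total;
}

// Empty square brackets: the length is taken from the number of initializers.
int deduced_length ()
{
  int foo [] = { 16, 2, 77, 40, 12071 };
  return sizeof (foo) / sizeof (foo[0]);
}

// Universal initialization: same as above, without the equal sign.
int brace_only_length ()
{
  int qux [] { 10, 20, 30 };
  return sizeof (qux) / sizeof (qux[0]);
}
```

Dividing the size of the whole array by the size of one element is a common way to obtain the number of elements of a built-in array.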
The syntax to access an element of an array is:
name[index]
Following the previous examples in which foo had 5 elements and each of those
elements was of type int, the names which can be used to refer to each element are
the following:
foo[0]  foo[1]  foo[2]  foo[3]  foo[4]
For example, the following statement stores the value 75 in the third element of
foo:
foo [2] = 75;
and, for example, the following copies the value of the third element of foo to a
variable called x:
x = foo[2];
Notice that the third element of foo is specified as foo[2], since the first one is foo[0],
the second one is foo[1], and therefore, the third one is foo[2]. For this same
reason, its last element is foo[4]. Therefore, if we write foo[5], we would be
accessing the sixth element of foo, and therefore actually exceeding the size of the
array.
In C++, it is syntactically correct to exceed the valid range of indices for an array.
This can create problems, since accessing out-of-range elements does not cause
errors at compile time, but can cause errors at runtime. The reason for this being
allowed will be seen in a later chapter when pointers are introduced.
At this point, it is important to be able to clearly distinguish between the two uses
that brackets [] have related to arrays. They perform two different tasks: one is to
specify the size of arrays when they are declared; and the second one is to specify
indices for concrete array elements when they are accessed. Do not confuse these
two possible uses of brackets [] with arrays.
The main difference is that the declaration is preceded by the type of the elements,
while the access is not.
Some other valid operations with arrays (where a and b are int variables):
foo[0] = a;
foo[a] = 75;
b = foo [a+2];
foo[foo[a]] = foo[2] + 5;
For example:
#include <iostream>
using namespace std;

int foo [] = {16, 2, 77, 40, 12071};
int n, result=0;

int main ()
{
  for ( n=0 ; n<5 ; ++n )
  {
    result += foo[n];   // add each element to the running total
  }
  cout << result;
  return 0;
}
Multidimensional arrays
Multidimensional arrays can be described as "arrays of arrays". For example, a
bidimensional array can be imagined as a two-dimensional table made of elements,
all of them of the same data type.
For instance, let jimmy represent a bidimensional array of 3 by 5 elements of type
int. The C++ syntax for this is:
int jimmy [3][5];
and, for example, the way to reference the second element vertically and fourth
horizontally in an expression would be:
jimmy[1][3]
(remember that array indices always begin with zero).
Multidimensional arrays are not limited to two indices (i.e., two dimensions). They
can contain as many indices as needed. But be careful: the number of elements is
the product of all the dimensions, so the amount of memory needed grows very
quickly with each dimension added. For example, an array of char with dimensions
[100][365][24][60][60] has an element of type char for each second in a century. This
amounts to more than 3 billion char! So such a declaration would consume more than
3 gigabytes of memory!
In the end, multidimensional arrays are just an abstraction for programmers, since
the same results can be achieved with a simple array whose size is the product of
the dimensions: an array declared as int jimmy [3][5] holds as many elements as one
declared as int jimmy [15] (3 * 5 = 15). The only difference is that, with
multidimensional arrays, the compiler automatically remembers the depth of each
imaginary dimension. With a simple array, the index has to be computed by hand: the
element at row n and column m is found at position n * width + m. Written either
way, code that fills the memory block called jimmy assigns exactly the same values
to exactly the same places.
Such code is best written using defined constants for the width and height, instead
of using their numerical values directly. This makes the code more readable, and
allows changes to be made easily in one place.
Arrays as parameters
At some point, we may need to pass an array to a function as a parameter. In C++,
it is not possible to pass the entire block of memory represented by an array to a
function directly as an argument. But what can be passed instead is its address. In
practice, this has almost the same effect, and it is a much faster and more efficient
operation.
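A parameter declared as int arg[] makes a function accept a parameter of type
"array of int" called arg. In order to pass an array of int called myarray to such a
function (here called procedure), it is enough to write a call like this:
procedure (myarray);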
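Here is a complete example:
#include <iostream>
using namespace std;

void printarray (int arg[], int length) {
  for (int n=0; n<length; ++n)
    cout << arg[n] << ' ';
  cout << '\n';
}

int main ()
{
  int firstarray[] = {5, 10, 15};
  int secondarray[] = {2, 4, 6, 8, 10};
  printarray (firstarray,3);
  printarray (secondarray,5);
}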
In the code above, the first parameter (int arg[]) accepts any array whose
elements are of type int, whatever its length. For that reason, we have included a
second parameter that tells the function the length of each array that we pass to it
as its first parameter. This allows the for loop that prints out the array to know the
range to iterate in the array passed, without going out of range.
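A function parameter can also be a multidimensional array. The format for a
tridimensional array parameter is:
base_type[][depth][depth]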
Notice that the first brackets [] are left empty, while the following ones specify sizes
for their respective dimensions. This is necessary in order for the compiler to be
able to determine the depth of each additional dimension.
Library arrays
The arrays explained above are directly implemented as a language feature,
inherited from the C language. They are a great feature, but because they cannot be
copied and easily decay into pointers, they probably suffer from an excess of
optimization.
To overcome some of these issues with language built-in arrays, C++ provides an
alternative array type as a standard container. It is a type template (a class
template, in fact) defined in header <array>.
Containers are a library feature that falls outside the scope of this tutorial, and thus
the class will not be explained in detail here. Suffice it to say that library arrays
operate in a similar way to built-in arrays, except that they can be copied (a
potentially expensive operation that copies the entire block of memory, and thus
one to use with care) and that they decay into pointers only when explicitly told to
do so (by means of the member function data).
Just as an example, these are two versions of the same example using the language
built-in array described in this chapter, and the container in the library:
Language built-in array:
for (int i=0; i<3; ++i)
  ++myarray[i];
Container library array:
for (int i=0; i<myarray.size(); ++i)
  ++myarray[i];
As you can see, both kinds of arrays use the same syntax to access their elements:
myarray[i]. Other than that, the main differences lie in the declaration of the
array, and the inclusion of an additional header for the library array. Notice also how
easy it is to access the size of the library array.
Character sequences
The string class has been briefly introduced in an earlier chapter. It is a very
powerful class to handle and manipulate strings of characters. However, because
strings are, in fact, sequences of characters, we can represent them also as plain
arrays of elements of a character type.
For example, the following array:
char foo [20];
is an array that can store up to 20 elements of type char. The array does not need
to be completely filled: it can also hold shorter character sequences, such as
"Hello" or "Merry Christmas".
In each case, after the content of the string itself, a null character ('\0') is
added in order to indicate the end of the sequence. The remaining char elements of
the array have undetermined values.
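Because arrays of characters are ordinary arrays, they follow the same rules as
these. For example, an array can be initialized with a predetermined sequence of
characters:
char myword[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
The above declares an array of 6 elements of type char initialized with the
characters that form the word "Hello" plus a null character '\0' at the end.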
But arrays of character elements have another way to be initialized: using string
literals directly.
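In the expressions used in some examples in previous chapters, string literals have
already shown up several times. These are specified by enclosing the text between
double quotes ("), as in "Hello". A string literal is itself a null-terminated array of
characters: a null character ('\0') is automatically appended at its end.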
Therefore, the array of char elements called myword can be initialized with a
null-terminated sequence of characters by either one of these two statements:
char myword[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
char myword[] = "Hello";
In both cases, the array of characters myword is declared with a size of 6 elements of
type char: the 5 characters that compose the word "Hello", plus a final null
character ('\0'), which specifies the end of the sequence and which, in the second
case, when using double quotes ("), is appended automatically.
Please notice that here we are talking about initializing an array of characters at the
moment it is being declared, and not about assigning values to them later (once
they have already been declared). In fact, because string literals are regular arrays,
they have the same restrictions as these, and cannot be assigned values.
Expressions such as the following (once myword has already been declared as above)
would not be valid:
myword = "Bye";
myword[] = "Bye";
This is because arrays cannot be assigned values. Note, though, that each of its
elements can be assigned a value individually. For example, this would be correct:
myword[0] = 'B';
myword[1] = 'y';
myword[2] = 'e';
myword[3] = '\0';
In the standard library, both representations for strings (C-strings and library
strings) coexist, and most functions requiring strings are overloaded to support
both.
For example, cin and cout support null-terminated sequences directly, allowing
them to be directly extracted from cin or inserted into cout, just like strings. For
example:
In any case, null-terminated character sequences and strings are easily converted
into one another:
Pointers
For a C++ program, the memory of a computer is like a succession of memory cells,
each one byte in size, and each with a unique address. These single-byte memory
cells are ordered in a way that allows data representations larger than one byte to
occupy memory cells that have consecutive addresses.
This way, each cell can be easily located in the memory by means of its unique
address. For example, the memory cell with the address 1776 always follows
immediately after the cell with address 1775 and precedes the one with 1777, and is
exactly one thousand cells after 776 and exactly one thousand cells before 2776.
When a variable is declared, the memory needed to store its value is assigned a
specific location in memory (its memory address). Generally, C++ programs do not
actively decide the exact memory addresses where their variables are stored.
Fortunately, that task is left to the environment where the program is run:
generally, an operating system that decides the particular memory locations at
runtime. However, it may be useful for a program to be able to obtain the address
of a variable during runtime in order to access data cells that are at a certain
position relative to it.
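The address of a variable can be obtained by preceding the name of the variable
with an ampersand sign (&), known as the address-of operator. For example:
foo = &myvar;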
This would assign the address of variable myvar to foo; by preceding the name of
the variable myvar with the address-of operator (&), we are no longer assigning the
content of the variable itself to foo, but its address.
The actual address of a variable in memory cannot be known before runtime, but
let's assume, in order to help clarify some concepts, that myvar is placed during
runtime in the memory address 1776.
Now consider the following code fragment:
myvar = 25;
foo = &myvar;
bar = myvar;
The values contained in each variable after the execution of this are the following:
myvar contains 25, foo contains 1776 (the address of myvar), and bar contains 25.
First, we have assigned the value 25 to myvar (a variable whose address in memory
we assumed to be 1776).
The second statement assigns foo the address of myvar, which we have assumed to
be 1776.
Finally, the third statement assigns the value contained in myvar to bar. This is a
standard assignment operation, as already done many times in earlier chapters.
The main difference between the second and third statements is the appearance of
the address-of operator (&).
The variable that stores the address of another variable (like foo in the previous
example) is what in C++ is called a pointer. Pointers are a very powerful feature of
the language that has many uses in lower level programming. A bit later, we will
see how to declare and use pointers.
An interesting property of pointers is that they can be used to access the variable
they point to directly. This is done by preceding the pointer name with the
dereference operator (*). The operator itself can be read as "value pointed to by".
Therefore, following with the values of the previous example, the following
statement:
baz = *foo;
This could be read as: "baz equal to value pointed to by foo", and the statement
would actually assign the value 25 to baz, since foo is 1776, and the value pointed to
by 1776 (following the example above) would be 25.
It is important to clearly differentiate that foo refers to the value 1776, while *foo
(with an asterisk * preceding the identifier) refers to the value stored at address
1776, which in this case is 25. Notice the difference of including or not including the
dereference operator (an explanatory comment has been added on how each of these
two expressions could be read):
baz = foo;    // baz equal to foo (1776)
baz = *foo;   // baz equal to value pointed to by foo (25)
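The address-of and dereference operators are thus complementary:
& is the address-of operator, and can be read simply as "address of"
* is the dereference operator, and can be read as "value pointed to by"
Thus, they have sort of opposite meanings: An address obtained with & can be
dereferenced with *.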
Earlier, we performed the following two assignment operations:
myvar = 25;
foo = &myvar;
Right after these two statements, all of the following expressions would give true as
result:
myvar == 25
&myvar == 1776
foo == 1776
*foo == 25
The first expression is quite clear, considering that the assignment operation
performed on myvar was myvar=25. The second one uses the address-of operator (&),
which returns the address of myvar, which we assumed to be 1776.
The third one is somewhat obvious, since the second expression was true and the
assignment operation performed on foo was foo=&myvar. The fourth expression uses
the dereference operator (*) that can be read as "value pointed to by", and the
value pointed to by foo is indeed 25.
So, after all that, you may also infer that for as long as the address pointed to by
foo remains unchanged, the following expression will also be true:
*foo == myvar
Declaring pointers
Due to the ability of a pointer to directly refer to the value that it points to, a pointer
has different properties when it points to a char than when it points to an int or a
float. Once dereferenced, the type needs to be known. And for that, the declaration
of a pointer needs to include the data type the pointer is going to point to.
The declaration of pointers follows this syntax:
type * name;
where type is the data type pointed to by the pointer. This type is not the type of
the pointer itself, but the type of the data the pointer points to. For example:
int * number;
char * character;
double * decimals;
These are three declarations of pointers. Each one is intended to point to a different
data type, but, in fact, all of them are pointers and all of them are likely going to
occupy the same amount of space in memory (the size in memory of a pointer
depends on the platform where the program runs). Nevertheless, the data to which
they point do not occupy the same amount of space, nor are they of the same type:
the first one points to an int, the second one to a char, and the last one to a double.
Therefore, although these three example variables are all pointers, they
actually have different types: int*, char*, and double* respectively, depending on
the type they point to.
Note that the asterisk (*) used when declaring a pointer only means that it is a
pointer (it is part of its type compound specifier), and should not be confused with
the dereference operator seen a bit earlier, but which is also written with an
asterisk (*). They are simply two different things represented with the same sign.
// my first pointer
#include <iostream>
using namespace std;

int main ()
{
  int firstvalue, secondvalue;
  int * mypointer;

  mypointer = &firstvalue;
  *mypointer = 10;
  mypointer = &secondvalue;
  *mypointer = 20;
  cout << "firstvalue is " << firstvalue << '\n';
  cout << "secondvalue is " << secondvalue << '\n';
  return 0;
}
Output:
firstvalue is 10
secondvalue is 20
Notice that even though neither firstvalue nor secondvalue is directly set to any
value in the program, both end up with a value set indirectly through the use of
mypointer. This is how it happens:
First, mypointer is assigned the address of firstvalue using the address-of operator
(&). Then, the value pointed to by mypointer is assigned a value of 10. Because, at
this moment, mypointer is pointing to the memory location of firstvalue, this in fact
modifies the value of firstvalue.
In order to demonstrate that a pointer may point to different variables during its
lifetime in a program, the example repeats the process with secondvalue and that
same pointer, mypointer.
#include <iostream>
using namespace std;

int main ()
{
  int firstvalue = 5, secondvalue = 15;
  int * p1, * p2;

  p1 = &firstvalue;   // p1 = address of firstvalue
  p2 = &secondvalue;  // p2 = address of secondvalue
  *p1 = 10;           // value pointed to by p1 = 10
  *p2 = *p1;          // value pointed to by p2 = value pointed to by p1
  p1 = p2;            // p1 = p2 (value of pointer is copied)
  *p1 = 20;           // value pointed to by p1 = 20

  cout << "firstvalue is " << firstvalue << '\n';
  cout << "secondvalue is " << secondvalue << '\n';
  return 0;
}
Each assignment operation includes a comment on how each line could be read:
i.e., replacing ampersands (&) by "address of", and asterisks (*) by "value pointed to
by".
Notice that there are expressions with pointers p1 and p2, both with and without the
dereference operator (*). The meaning of an expression using the dereference
operator (*) is very different from one that does not. When this operator precedes
the pointer name, the expression refers to the value being pointed to, while when a
pointer name appears without this operator, it refers to the value of the pointer
itself (i.e., the address of what the pointer is pointing to).
Another thing worth noticing is the line:
int * p1, * p2;
This declares the two pointers used in the previous example. But notice that there is
an asterisk (*) for each pointer, in order for both to have type int* (pointer to int).
This is required due to the precedence rules. Note that if, instead, the code was:
int * p1, p2;
p1 would indeed be of type int*, but p2 would be of type int. Spaces do not matter
at all for this purpose. But anyway, simply remembering to put one asterisk per
pointer is enough for most pointer users interested in declaring multiple pointers
per statement. Or even better: use a different statement for each variable.
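An array can always be implicitly converted to a pointer to its first element. For
example, given an array myarray of 20 elements of type int and a pointer mypointer
of type int*, the following assignment operation would be valid:
mypointer = myarray;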
After that, mypointer and myarray would be equivalent and would have very similar
properties. The main difference is that mypointer can be assigned a different
address, whereas myarray can never be assigned anything, and will always
represent the same block of 20 elements of type int. Therefore, the following
assignment would not be valid:
myarray = mypointer;
// more pointers
#include <iostream>
using namespace std;

int main ()
{
  int numbers[5];
  int * p;
  p = numbers;  *p = 10;
  p++;  *p = 20;
  p = &numbers[2];  *p = 30;
  p = numbers + 3;  *p = 40;
  p = numbers;  *(p+4) = 50;
  for (int n=0; n<5; n++)
    cout << numbers[n] << ", ";
  return 0;
}
Output:
10, 20, 30, 40, 50,
Pointers and arrays support the same set of operations, with the same meaning for
both. The main difference is that pointers can be assigned new addresses, while
arrays cannot.
In the chapter about arrays, brackets ([]) were explained as specifying the index of
an element of the array. Well, in fact these brackets are a dereferencing operator
known as offset operator. They dereference the variable they follow just as * does,
but they also add the number between brackets to the address being dereferenced.
For example:
a[5] = 0;       // a [offset of 5] = 0
*(a+5) = 0;     // pointed to by (a+5) = 0
These two expressions are equivalent and valid, not only if a is a pointer, but also if
a is an array. Remember that, if a is an array, its name can be used just like a
pointer to its first element.
Pointer initialization
Pointers can be initialized to point to specific locations at the very moment they are
defined:
int myvar;
int * myptr = &myvar;
The resulting state of variables after this code is the same as after:
int myvar;
int * myptr;
myptr = &myvar;
When pointers are initialized, what is initialized is the address they point to (i.e.,
myptr), never the value being pointed (i.e., *myptr). Therefore, the code above shall
not be confused with:
1 int myvar;
2 int * myptr;
3 *myptr = &myvar;
Which anyway would not make much sense (and is not valid code).
The asterisk (*) in the pointer declaration (line 2) only indicates that it is a pointer;
it is not the dereference operator (as in line 3). Both things just happen to use the
same sign: *. As always, spaces are not relevant, and never change the meaning of
an expression.
Pointers can be initialized either to the address of a variable (such as in the case
above), or to the value of another pointer (or array):
int myvar;
int *foo = &myvar;
int *bar = foo;
Pointer arithmetics
To conduct arithmetical operations on pointers is a little different than to conduct
them on regular integer types. To begin with, only addition and subtraction
operations are allowed; the others make no sense in the world of pointers. But both
addition and subtraction have a slightly different behavior with pointers, according
to the size of the data type to which they point.
When fundamental data types were introduced, we saw that types have different
sizes. For example: char always has a size of 1 byte, short is generally larger than
that, and int and long are even larger; the exact size of these being dependent on
the system. For example, let's imagine that in a given system, char takes 1 byte,
short takes 2 bytes, and long takes 4.
Suppose now that we define three pointers in this system:
char *mychar;
short *myshort;
long *mylong;
and that we know that they point to the memory locations 1000, 2000, and 3000,
respectively.
Therefore, if we write:
++mychar;
++myshort;
++mylong;
mychar, as one would expect, would contain the value 1001. But not so obviously,
myshort would contain the value 2002, and mylong would contain 3004, even though
they have each been incremented only once. The reason is that, when adding one
to a pointer, the pointer is made to point to the following element of the same type,
and, therefore, the size in bytes of the type it points to is added to the pointer.
This is applicable both when adding a number to a pointer and when subtracting one
from it. Exactly the same would happen if we wrote:
mychar = mychar + 1;
myshort = myshort + 1;
mylong = mylong + 1;
Regarding the increment (++) and decrement (--) operators, they both can be used
as either prefix or suffix of an expression, with a slight difference in behavior: as a
prefix, the increment happens before the expression is evaluated, and as a suffix,
the increment happens after the expression is evaluated. This also applies to
expressions incrementing and decrementing pointers, which can become part of
more complicated expressions that also include dereference operators ( *).
Remembering operator precedence rules, we can recall that postfix operators, such
as increment and decrement, have higher precedence than prefix operators, such
as the dereference operator (*). Therefore, the following expression:
*p++
is equivalent to *(p++). And what it does is to increase the value of p (so it now
points to the next element), but because ++ is used as postfix, the whole expression
is evaluated as the value pointed originally by the pointer (the address it pointed to
before being incremented).
Essentially, these are the four possible combinations of the dereference operator
with both the prefix and suffix versions of the increment operator (the same being
applicable also to the decrement operator):
A typical, but not so simple, statement involving these operators is:
*p++ = *q++;
Because ++ has a higher precedence than *, both p and q are incremented, but
because both increment operators (++) are used as postfix and not prefix, the value
assigned to *p is *q before both p and q are incremented. And then both are
incremented. It would be roughly equivalent to:
*p = *q;
++p;
++q;
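A pointer can also be declared so that it can read the value it points to, but not
modify it. For this, it is enough to qualify the type pointed to by the pointer as
const. For example:
int x;
int y = 10;
const int * p = &y;
x = *p;          // ok: reading p
*p = x;          // error: modifying p, which is const-qualified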
Here p points to a variable, but points to it in a const-qualified manner, meaning
that it can read the value pointed, but it cannot modify it. Note also, that the
expression &y is of type int*, but this is assigned to a pointer of type const int*.
This is allowed: a pointer to non-const can be implicitly converted to a pointer to
const. But not the other way around! As a safety feature, pointers to const are not
implicitly convertible to pointers to non-const.
// pointers as arguments:
#include <iostream>
using namespace std;

void increment_all (int* start, int* stop)
{
  int * current = start;
  while (current != stop) {
    ++(*current);  // increment value pointed
    ++current;     // increment pointer
  }
}

void print_all (const int* start, const int* stop)
{
  const int * current = start;
  while (current != stop) {
    cout << *current << '\n';
    ++current;     // increment pointer
  }
}

int main ()
{
  int numbers[] = {10,20,30};
  increment_all (numbers,numbers+3);
  print_all (numbers,numbers+3);
  return 0;
}
Output:
11
21
31
Note that print_all uses pointers that point to constant elements. These pointers
point to constant content they cannot modify, but they are not constant
themselves: i.e., the pointers can still be incremented or assigned different
addresses, although they cannot modify the content they point to.
And this is where a second dimension to constness is added to pointers: Pointers
can also be themselves const. And this is specified by appending const to the
pointed type (after the asterisk):
int x;
      int *       p1 = &x;  // non-const pointer to non-const int
const int *       p2 = &x;  // non-const pointer to const int
      int * const p3 = &x;  // const pointer to non-const int
const int * const p4 = &x;  // const pointer to const int
The syntax with const and pointers is definitely tricky, and recognizing the cases
that best suit each use tends to require some experience. In any case, it is
important to get constness with pointers (and references) right sooner rather than
later, but you should not worry too much about grasping everything if this is the
first time you are exposed to the mix of const and pointers. More use cases will
show up in coming chapters.
To add a little bit more confusion to the syntax of const with pointers, the const
qualifier can either precede or follow the pointed type, with the exact same
meaning:
As with the spaces surrounding the asterisk, the order of const in this case is simply
a matter of style. This chapter uses a prefix const, as for historical reasons this
seems to be more widespread, but both are exactly equivalent. The merits of each
style are still intensely debated on the internet.
String literals can also be accessed directly. They are arrays of the proper array
type to contain all their characters plus the terminating null character, with each of
the elements being of type const char (as literals, they can never be modified). For
example:
const char * foo = "hello";
This declares an array with the literal representation for "hello", and then a pointer
to its first element is assigned to foo. If we imagine that "hello" is stored at the
memory locations that start at address 1702, then foo is a pointer that contains the
value 1702, and not 'h', nor "hello", although 1702 indeed is the address of both of
these.
The pointer foo points to a sequence of characters. And because pointers and
arrays behave essentially in the same way in expressions, foo can be used to
access the characters in the same way arrays of null-terminated character
sequences are. For example:
*(foo+4)
foo[4]
Both expressions have a value of 'o' (the fifth element of the array).
Pointers to pointers
C++ allows the use of pointers that point to pointers, which, in turn, point to
data (or even to other pointers). The syntax simply requires an asterisk (*) for each
level of indirection in the declaration of the pointer:
char a;
char * b;
char ** c;
a = 'z';
b = &a;
c = &b;
Assuming the randomly chosen memory locations 7230, 8092, and 10502 for the
variables a, b, and c respectively, the result is that a holds the value 'z' at address
7230, b holds the value 7230 (the address of a) at address 8092, and c holds the
value 8092 (the address of b) at address 10502.
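The new thing in this example is variable c, which is a pointer to a pointer, and can
be used in three different levels of indirection, each one of which corresponds
to a different value:
c is of type char** and has a value of 8092
*c is of type char* and has a value of 7230
**c is of type char and has a value of 'z'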
void pointers
The void type of pointer is a special type of pointer. In C++, void represents the
absence of type. Therefore, void pointers are pointers that point to a value that has
no type (and thus also an undetermined length and undetermined dereferencing
properties).
This gives void pointers great flexibility, since they are able to point to any data
type, from an integer value or a float to a string of characters. In exchange, they
have a great limitation: the data pointed to by them cannot be directly dereferenced
(which is logical, since we have no type to dereference to), and for that reason, any
address in a void pointer needs to be transformed into some other pointer type that
points to a concrete data type before being dereferenced.
One of its possible uses may be to pass generic parameters to a function. For
example:
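#include <iostream>
using namespace std;

void increase (void* data, int psize)
{
  if ( psize == sizeof(char) )
  { char* pchar; pchar=(char*)data; ++(*pchar); }
  else if (psize == sizeof(int) )
  { int* pint; pint=(int*)data; ++(*pint); }
}

int main ()
{
  char a = 'x';
  int b = 1602;
  increase (&a,sizeof(a));
  increase (&b,sizeof(b));
  cout << a << ", " << b << '\n';
  return 0;
}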
sizeof is an operator integrated in the C++ language that returns the size in bytes
of its argument. For non-dynamic data types, this value is a constant. Therefore, for
example, sizeof(char) is 1, because char has always a size of one byte.
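In principle, pointers are meant to point to valid addresses, such as the address of
a variable or the address of an element in an array. But pointers can actually point
to any address, including addresses that do not refer to any valid element. Typical
examples are an uninitialized pointer p and a pointer q to a nonexistent element of
an array. Neither p nor q points to an address known to contain a value, but
declaring them does not cause an error. In C++, pointers are allowed to take any
address value, no matter whether there actually is something at that address or not.
What can cause an error is to dereference such a pointer (i.e., actually accessing the
value it points to). Dereferencing such a pointer causes undefined behavior, ranging
from an error during runtime to accessing some random value.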
But, sometimes, a pointer really needs to explicitly point to nowhere, and not just an
invalid address. For such cases, there exists a special value that any pointer type
can take: the null pointer value. This value can be expressed in C++ in two ways:
either with an integer value of zero, or with the nullptr keyword:
int * p = 0;
int * q = nullptr;
Here, both p and q are null pointers, meaning that they explicitly point to nowhere,
and they both actually compare equal: all null pointers compare equal to other null
pointers. It is also quite usual to see the defined constant NULL be used in older code
to refer to the null pointer value:
int * r = NULL;
NULL is defined in several headers of the standard library, and is defined as an alias
of some null pointer constant value (such as 0 or nullptr).
Do not confuse null pointers with void pointers! A null pointer is a value that any
pointer can take to represent that it is pointing to "nowhere", while a void pointer is
a type of pointer that can point to somewhere without a specific type. One refers to
the value stored in the pointer, and the other to the type of data it points to.
Pointers to functions
C++ allows operations with pointers to functions. The typical use of this is for
passing a function as an argument to another function. Pointers to functions are
declared with the same syntax as a regular function declaration, except that the
name of the function is enclosed between parentheses () and an asterisk ( *) is
inserted before the name:
  m = operation (7, 5, addition);
  n = operation (20, m, minus);
  cout << n;
  return 0;
}
In the example above, minus is a pointer to a function that has two parameters of
type int. It is directly initialized to point to the function subtraction:
int (* minus)(int,int) = subtraction;
Dynamic memory
In the programs seen in previous chapters, all memory needs were determined
before program execution by defining the variables needed. But there may be cases
where the memory needs of a program can only be determined during runtime. For
example, when the memory needed depends on user input. In these cases,
programs need to dynamically allocate memory, for which the C++ language
integrates the operators new and delete.
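Dynamic memory is allocated using operator new, followed by a data type specifier
and, if a sequence of more than one element is required, the number of these within
brackets []. It returns a pointer to the beginning of the new block of memory
allocated. Its syntax is:
pointer = new type
pointer = new type [number_of_elements]
The first expression is used to allocate memory to contain one single element of
type type. The second one is used to allocate a block (an array) of elements of type
type, where number_of_elements is an integer value representing the amount of
these. For example: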
int * foo;
foo = new int [5];
In this case, the system dynamically allocates space for five elements of type int
and returns a pointer to the first element of the sequence, which is assigned to foo
(a pointer). Therefore, foo now points to a valid block of memory with space for five
elements of type int.
Here, foo is a pointer, and thus, the first element pointed to by foo can be accessed
either with the expression foo[0] or the expression *foo (both are equivalent). The
second element can be accessed either with foo[1] or *(foo+1), and so on...
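The equivalence between the two access forms can be sketched as follows.
sum_dynamic is a hypothetical helper introduced only for illustration; it fills
the block with foo[k] and reads it back with *(foo+k).

```cpp
#include <cstddef>

// Sketch: allocate n ints, write them with the index form,
// and read them back with the pointer-arithmetic form.
int sum_dynamic (std::size_t n)
{
  int * foo = new int [n];              // foo points to the first of n elements
  for (std::size_t k = 0; k < n; ++k)
    foo[k] = static_cast<int>(k) + 1;   // stores 1, 2, ..., n

  int total = 0;
  for (std::size_t k = 0; k < n; ++k)
    total += *(foo + k);                // same element as foo[k]

  delete[] foo;                         // array allocations are released with delete[]
  return total;
}
```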
The dynamic memory requested by our program is allocated by the system from the
memory heap. However, computer memory is a limited resource, and it can be
exhausted. Therefore, there are no guarantees that all requests to allocate memory
using operator new are going to be granted by the system.
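C++ provides two standard mechanisms to check whether the allocation was successful:
One is by handling exceptions. With this method, an exception of type bad_alloc is
thrown when the allocation fails; if no handler catches it, the program is terminated.
This exception method is the one used by default by new, and is the one used in
a declaration like:
foo = new int [5];  // if allocation fails, an exception is thrown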
The other method is known as nothrow, and what happens when it is used is that
when a memory allocation fails, instead of throwing a bad_alloc exception or
terminating the program, the pointer returned by new is a null pointer, and the
program continues its execution normally.
This method can be specified by using a special object called nothrow, declared in
header <new>, as argument for new:
foo = new (nothrow) int [5];
In this case, if the allocation of this block of memory fails, the failure can be
detected by checking if foo is a null pointer:
int * foo;
foo = new (nothrow) int [5];
if (foo == nullptr) {
  // error assigning memory. Take measures.
}
This nothrow method is likely to produce less efficient code than exceptions, since it
implies explicitly checking the pointer value returned after each and every
allocation. Therefore, the exception mechanism is generally preferred, at least for
critical allocations. Still, most of the coming examples will use the nothrow
mechanism due to its simplicity.
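The two mechanisms can be put side by side in a short sketch. The helper names
can_allocate_throwing and can_allocate_nothrow are hypothetical; each one simply
reports whether a block of n ints could be obtained.

```cpp
#include <cstddef>
#include <new>      // std::bad_alloc, std::nothrow

// Default form: a failed allocation throws std::bad_alloc.
bool can_allocate_throwing (std::size_t n)
{
  try {
    int * foo = new int [n];
    delete[] foo;
    return true;
  } catch (const std::bad_alloc &) {
    return false;                 // the failure arrives as an exception
  }
}

// nothrow form: a failed allocation yields a null pointer instead.
bool can_allocate_nothrow (std::size_t n)
{
  int * foo = new (std::nothrow) int [n];
  if (foo == nullptr) return false;   // the failure must be checked explicitly
  delete[] foo;
  return true;
}
```

With the exception form, one handler can cover many allocations; with nothrow,
every returned pointer needs its own check.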
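Memory that is no longer needed can be released, so that it becomes available
again for other requests, with operator delete, whose syntax is:
delete pointer;
delete[] pointer;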
The first statement releases the memory of a single element allocated using new,
and the second one releases the memory allocated for arrays of elements using
new and a size in brackets ([]).
The value passed as argument to delete shall be either a pointer to a memory block
previously allocated with new, or a null pointer (in the case of a null pointer, delete
produces no effect).
#include <iostream>
#include <new>
using namespace std;

int main ()
{
  int i,n;
  int * p;
  cout << "How many numbers would you like to type? ";
  cin >> i;
  p= new (nothrow) int[i];
  if (p == nullptr)
    cout << "Error: memory could not be allocated";
  else
  {
    for (n=0; n<i; n++)
    {
      cout << "Enter number: ";
      cin >> p[n];
    }
    cout << "You have entered: ";
    for (n=0; n<i; n++)
      cout << p[n] << ", ";
    delete[] p;
  }
  return 0;
}
Sample run:
How many numbers would you like to type? 5
Enter number: 75
Enter number: 436
Enter number: 1067
Enter number: 8
Enter number: 32
You have entered: 75, 436, 1067, 8, 32,
Notice how the value within brackets in the new statement is a variable value
entered by the user (i), not a constant expression:
p= new (nothrow) int[i];
There always exists the possibility that the user introduces a value for i so big that
the system cannot allocate enough memory for it. For example, when I tried to give
a value of 1 billion to the "How many numbers" question, my system could not
allocate that much memory for the program, and I got the text message we
prepared for this case (Error: memory could not be allocated).
Dynamic memory in C
C++ integrates the operators new and delete for allocating dynamic memory. But
these were not available in the C language; instead, it used a library solution, with
the functions malloc, calloc, realloc and free, defined in the header <cstdlib>
(known as <stdlib.h> in C). The functions are also available in C++ and can also be
used to allocate and deallocate dynamic memory.
Note, though, that the memory blocks allocated by these functions are not
necessarily compatible with those returned by new, so they should not be mixed;
each one should be handled with its own set of functions or operators.
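For comparison, a minimal sketch of the C-style approach used from C++.
sum_with_malloc is a hypothetical helper; it returns -1 when the allocation
fails, which is a convention chosen only for this example.

```cpp
#include <cstddef>
#include <cstdlib>   // std::malloc, std::free

// Sketch: the library counterpart of new int[n] / delete[].
int sum_with_malloc (std::size_t n)
{
  // malloc returns void*, so the result must be cast in C++
  int * p = static_cast<int*>(std::malloc (n * sizeof(int)));
  if (p == nullptr) return -1;          // malloc reports failure with a null pointer

  int total = 0;
  for (std::size_t k = 0; k < n; ++k) {
    p[k] = static_cast<int>(k) + 1;     // stores 1, 2, ..., n
    total += p[k];
  }
  std::free (p);                        // pair malloc with free, never with delete
  return total;
}
```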
Data structures
Data structures
A data structure is a group of data elements grouped together under one name.
These data elements, known as members, can have different types and different
lengths. Data structures can be declared in C++ using the following syntax:
struct type_name {
member_type1 member_name1;
member_type2 member_name2;
member_type3 member_name3;
.
.
} object_names;
Where type_name is a name for the structure type, and object_names can be a set of
valid identifiers for objects that have the type of this structure. Within braces {},
there is a list of the data members, each one specified with a type and a valid
identifier as its name.
For example:
struct product {
  int weight;
  double price;
} ;

product apple;
product banana, melon;
This declares a structure type, called product, and defines it having two members:
weight and price, each of a different fundamental type. This declaration creates a
new type (product), which is then used to declare three objects (variables) of this
type: apple, banana, and melon. Note how once product is declared, it is used just like
any other type.
Right at the end of the struct definition, and before the ending semicolon ( ;), the
optional field object_names can be used to directly declare objects of the structure
type. For example, the structure objects apple, banana, and melon can be declared at
the moment the data structure type is defined:
struct product {
  int weight;
  double price;
} apple, banana, melon;
In this case, where object_names are specified, the type name (product) becomes
optional: struct requires either a type_name or at least one name in object_names,
but not necessarily both.
Once the three objects of a given structure type are declared (apple, banana,
and melon), their members can be accessed directly. The syntax for that is simply to
insert a dot (.) between the object name and the member name. For example, we
could operate with any of these elements as if they were standard variables of their
respective types:
apple.weight
apple.price
banana.weight
banana.price
melon.weight
melon.price
Each one of these has the data type corresponding to the member they refer to:
apple.weight, banana.weight, and melon.weight are of type int, while apple.price,
banana.price, and melon.price are of type double.
Here is a complete example with a structure type in action:
// example about structures
#include <iostream>
#include <string>
#include <sstream>
using namespace std;

struct movies_t {
  string title;
  int year;
} mine, yours;

void printmovie (movies_t movie);

int main ()
{
  string mystr;

  mine.title = "2001 A Space Odyssey";
  mine.year = 1968;

  cout << "Enter title: ";
  getline (cin,yours.title);
  cout << "Enter year: ";
  getline (cin,mystr);
  stringstream(mystr) >> yours.year;

  cout << "My favorite movie is:\n ";
  printmovie (mine);
  cout << "And yours is:\n ";
  printmovie (yours);
  return 0;
}

void printmovie (movies_t movie)
{
  cout << movie.title;
  cout << " (" << movie.year << ")\n";
}
The example shows how the members of an object act just as regular variables. For
example, the member yours.year is a valid variable of type int, and mine.title is a
valid variable of type string.
But the objects mine and yours are also variables with a type (of type movies_t). For
example, both have been passed to function printmovie just as if they were simple
variables. Therefore, one of the features of data structures is the ability to refer
either to their members individually or to the entire structure as a whole, in both
cases starting from the same identifier: the name of the structure object.
Because structures are types, they can also be used as the type of arrays to
construct tables or databases of them:
// array of structures
#include <iostream>
#include <string>
#include <sstream>
using namespace std;

struct movies_t {
  string title;
  int year;
} films [3];

void printmovie (movies_t movie);

int main ()
{
  string mystr;
  int n;

  for (n=0; n<3; n++)
  {
    cout << "Enter title: ";
    getline (cin,films[n].title);
    cout << "Enter year: ";
    getline (cin,mystr);
    stringstream(mystr) >> films[n].year;
  }

  cout << "\nYou have entered these movies:\n";
  for (n=0; n<3; n++)
    printmovie (films[n]);
  return 0;
}

void printmovie (movies_t movie)
{
  cout << movie.title;
  cout << " (" << movie.year << ")\n";
}
Sample output (after three titles and years have been entered):
You have entered these movies:
Blade Runner (1982)
The Matrix (1999)
Taxi Driver (1976)
Pointers to structures
Like any other type, structures can be pointed to by their own type of pointers:
struct movies_t {
  string title;
  int year;
};

movies_t amovie;
movies_t * pmovie;
Here amovie is an object of structure type movies_t, and pmovie is a pointer to point
to objects of structure type movies_t. Therefore, the following code would also be
valid:
pmovie = &amovie;
The value of the pointer pmovie would be assigned the address of object amovie.
Now, let's see another example that mixes pointers and structures, and will serve to
introduce a new operator: the arrow operator (->):
// pointers to structures
#include <iostream>
#include <string>
#include <sstream>
using namespace std;

struct movies_t {
  string title;
  int year;
};

int main ()
{
  string mystr;

  movies_t amovie;
  movies_t * pmovie;
  pmovie = &amovie;

  cout << "Enter title: ";
  getline (cin, pmovie->title);
  cout << "Enter year: ";
  getline (cin, mystr);
  stringstream(mystr) >> pmovie->year;

  cout << "\nYou have entered:\n";
  cout << pmovie->title;
  cout << " (" << pmovie->year << ")\n";

  return 0;
}
Sample run:
Enter title: Invasion of the body snatchers
Enter year: 1978

You have entered:
Invasion of the body snatchers (1978)
The arrow operator (->) is a dereference operator that is used exclusively with
pointers to objects that have members. This operator serves to access the member
of an object directly from its address. For example, in the example above:
pmovie->title
is, for all purposes, equivalent to:
(*pmovie).title
Both expressions, pmovie->title and (*pmovie).title, are valid, and both access the
member title of the data structure pointed to by a pointer called pmovie. This is
definitely something different from:
*pmovie.title
which is equivalent to:
*(pmovie.title)
This would access the value pointed to by a hypothetical pointer member called title
of the structure object pmovie (which is not the case, since title is not a pointer
type). The following panel summarizes possible combinations of the operators for
pointers and for structure members:
Expression  What is evaluated                            Equivalent
a.b         Member b of object a
a->b        Member b of object pointed to by a           (*a).b
*a.b        Value pointed to by member b of object a     *(a.b)
Nesting structures
Structures can also be nested in such a way that an element of a structure is itself
another structure:
struct movies_t {
  string title;
  int year;
};

struct friends_t {
  string name;
  string email;
  movies_t favorite_movie;
} charlie, maria;

friends_t * pfriends = &charlie;
After the previous declarations, all of the following expressions would be valid:
charlie.name
maria.favorite_movie.title
charlie.favorite_movie.year
pfriends->favorite_movie.year
(where, by the way, the last two expressions refer to the same member).
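A minimal sketch of nested access through a pointer, using the same two
structure types. favorite_year is a hypothetical helper added for illustration.

```cpp
#include <string>

struct movies_t {
  std::string title;
  int year;
};

struct friends_t {
  std::string name;
  std::string email;
  movies_t favorite_movie;   // a structure nested inside another structure
};

// Reads the nested member through a pointer to the outer structure.
int favorite_year (const friends_t * pfriends)
{
  // -> reaches favorite_movie, then . reaches year;
  // (*pfriends).favorite_movie.year would be equivalent
  return pfriends->favorite_movie.year;
}
```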
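Type aliases (typedef / using)
A type alias is a different name by which a type can be identified. In C++, there
are two syntaxes for creating type aliases. The first, inherited from the C language,
uses the typedef keyword:
typedef existing_type new_type_name ;
For example:
typedef char C;
typedef unsigned int WORD;
typedef char * pChar;
typedef char field [50];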
This defines four type aliases: C, WORD, pChar, and field as char, unsigned int, char*
and char[50], respectively. Once these aliases are defined, they can be used in any
declaration just like any other valid type:
C mychar, anotherchar, *ptc1;
WORD myword;
pChar ptc2, ptc3;
field name;
More recently, a second syntax to define type aliases was introduced in the C++
language:
using new_type_name = existing_type ;
For example, the same type aliases as above could be defined as:
using C = char;
using WORD = unsigned int;
using pChar = char *;
using field = char [50];
Aliases defined with typedef and aliases defined with using are semantically
equivalent. The only difference is that typedef has certain limitations in the
realm of templates that using does not have (for instance, only using can declare
an alias template). Therefore, using is more generic, although typedef has a longer
history and is probably more common in existing code.
Note that neither typedef nor using create new distinct data types. They only create
synonyms of existing types. That means that the type of myword above, declared
with type WORD, can as well be considered of type unsigned int; it does not really
matter, since both are actually referring to the same type.
Type aliases can be used to reduce the length of long or confusing type names, but
they are most useful as tools to abstract programs from the underlying types they
use. For example, by using an alias of int to refer to a particular kind of parameter
instead of using int directly, it allows for the type to be easily replaced by long (or
some other type) in a later version, without having to change every instance where
it is used.
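That aliases are only synonyms can be checked at compile time. The sketch below
uses std::is_same from <type_traits>; WORD2 and Ptr are hypothetical names
introduced for this example.

```cpp
#include <type_traits>

typedef unsigned int WORD;      // alias declared with typedef
using WORD2 = unsigned int;     // the same alias declared with using

// Both aliases name exactly the same type as unsigned int
static_assert (std::is_same<WORD, unsigned int>::value, "WORD is unsigned int");
static_assert (std::is_same<WORD, WORD2>::value, "typedef and using agree");

// using can also declare an alias template, which typedef cannot do
template <class T>
using Ptr = T *;

static_assert (std::is_same<Ptr<char>, char *>::value, "Ptr<char> is char*");
```

Since no new type is created, a function overloaded on WORD and on unsigned int
would be a redefinition rather than two distinct overloads.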
Unions
Unions allow one portion of memory to be accessed as different data types. Their
declaration and use are similar to those of structures, but their functionality is
totally different:
union type_name {
member_type1 member_name1;
member_type2 member_name2;
member_type3 member_name3;
.
.
} object_names;
This creates a new union type, identified by type_name, in which all its member
elements occupy the same physical space in memory. The size of this type is the
size of its largest member element (plus any padding the system requires). For example:
union mytypes_t {
  char c;
  int i;
  float f;
} mytypes;
declares an object (mytypes) with three members:
mytypes.c
mytypes.i
mytypes.f
Each of these members is of a different data type. But since all of them are referring
to the same location in memory, the modification of one of the members will affect
the value of all of them. It is not possible to store different values in them in a way
that each is independent of the others.
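The shared storage can be observed by comparing member addresses, as in this
sketch. members_share_storage is a hypothetical helper. Note that it only takes
addresses: reading a member other than the one most recently written is not
something standard C++ guarantees to work.

```cpp
union mytypes_t {
  char c;
  int i;
  float f;
};

// All members of a union start at the same address.
bool members_share_storage (mytypes_t & u)
{
  return static_cast<void*>(&u.c) == static_cast<void*>(&u.i)
      && static_cast<void*>(&u.i) == static_cast<void*>(&u.f);
}

// The union is at least as large as its largest member
static_assert (sizeof(mytypes_t) >= sizeof(int), "holds an int");
static_assert (sizeof(mytypes_t) >= sizeof(float), "holds a float");
```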
One of the uses of a union is to be able to access a value either in its entirety or as
an array or structure of smaller elements. For example:
union mix_t {
  int l;
  struct {
    short hi;
    short lo;
  } s;
  char c[4];
} mix;
If we assume that the system where this program runs has an int type with a size
of 4 bytes, and a short type of 2 bytes, the union defined above gives three names
to the same group of 4 bytes: mix.l, mix.s and mix.c. Which one we use depends on
how we want to access these bytes: as a single value of type int, as two values of
type short, or as an array of char elements, respectively. The example mixes types,
arrays, and structures in the union to demonstrate different ways to access the
data. For a little-endian system, this union could be represented as:
byte offset    0       1       2       3
mix.l          [            l             ]
mix.s          [    hi     ]   [    lo    ]
mix.c          c[0]    c[1]    c[2]    c[3]
The exact alignment and order of the members of a union in memory depend on
the system, with the possibility of creating portability issues.
Anonymous unions
When unions are members of a class (or structure), they can be declared with no
name. In this case, they become anonymous unions, and their members are directly
accessible from objects by their member names. For example, see the differences
between these two structure declarations:
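A sketch of what two such declarations can look like. The type names book1_t and
book2_t and the member types (char arrays, float, int) are assumptions made for
illustration; only price, dollars and yen are named in the text.

```cpp
// Structure with a regular (named) member union
struct book1_t {
  char title[50];
  char author[50];
  union {
    float dollars;
    int yen;
  } price;            // the union is a member called price
};

// Structure with an anonymous union
struct book2_t {
  char title[50];
  char author[50];
  union {
    float dollars;
    int yen;
  };                  // no member name: dollars and yen are accessed directly
};
```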
The only difference between the two types is that in the first one, the member union
has a name (price), while in the second it has not. This affects the way to access
members dollars and yen of an object of this type. For an object of the first type
(with a regular union), it would be:
1 book1.price.dollars
2 book1.price.yen
whereas for an object of the second type (which has an anonymous union), it would
be:
1 book2.dollars
2 book2.yen
Again, remember that because it is a member union (not a member structure), the
members dollars and yen actually share the same memory location, so they cannot
be used to store two different values simultaneously. The price can be set in
dollars or in yen, but not in both simultaneously.
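Enumerated types (enum)
Enumerated types are types defined with a set of custom identifiers, known as
enumerators, as possible values. Objects of these types can take any of these
enumerators as value. Their syntax is:
enum type_name {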
value1,
value2,
value3,
.
.
} object_names;
This creates the type type_name, which can take any of value1, value2, value3, ... as
value. Objects (variables) of this type can directly be instantiated as object_names.
For example, a new type of variable called colors_t could be defined to store colors
with the following declaration:
enum colors_t {black, blue, green, cyan, red, purple, yellow, white};
Notice that this declaration includes no other type, neither fundamental nor
compound, in its definition. In other words, it creates a whole new data type from
scratch without basing it on any other existing type. The possible values that
variables of this new type colors_t may take are the enumerators listed within
braces. For example, once the colors_t enumerated type is declared, the following
expressions will be valid:
colors_t mycolor;

mycolor = blue;
if (mycolor == green) mycolor = red;
Values of enumerated types declared with enum are implicitly convertible to an
integer type; the opposite conversion, from an integer to the enumerated type,
requires an explicit cast. In fact, the elements of such an enum are always
assigned an integer numerical equivalent internally, to which they can be
implicitly converted. If it is not specified otherwise, the integer value equivalent
to the first possible value is 0, the equivalent to the second is 1, to the third is 2,
and so on... Therefore, in the data type colors_t defined above, black would be
equivalent to 0, blue would be equivalent to 1, green to 2, and so on...
A specific integer value can be specified for any of the possible values in the
enumerated type. If the enumerator that follows it is not given its own value, it is
automatically assumed to be the same value plus one. For example:
enum months_t { january=1, february, march, april,
                may, june, july, august,
                september, october, november, december} y2k;
In this case, the variable y2k of the enumerated type months_t can contain any of
the 12 possible values that go from january to december and that are equivalent to
the values between 1 and 12 (not between 0 and 11, since january has been made
equal to 1).
Because values of enumerated types declared with enum are implicitly convertible
to int, january can be used wherever the integer 1 is expected, and the compiler
will not object when the two are mixed. The reasons for this are historical,
inherited from the C language.
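The conversions can be sketched with two small functions. month_number and
month_from_number are hypothetical helpers; the enumeration repeats the months_t
type described above.

```cpp
enum months_t { january = 1, february, march, april, may, june,
                july, august, september, october, november, december };

// An enumerator of a plain enum converts to int implicitly...
int month_number (months_t m) { return m; }

// ...whereas going from int to the enumerated type needs an explicit cast.
months_t month_from_number (int n) { return static_cast<months_t>(n); }
```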
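Enumerated types with enum class
In C++, it is also possible to create enum types that are not implicitly convertible
to int and whose enumerator values are of the enum type itself, thus preserving
type safety. They are declared with enum class (or enum struct) instead of just enum:
enum class Colors {black, blue, green, cyan, red, purple, yellow, white};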
Each of the enumerator values of an enum class type needs to be scoped into its
type (this is actually also possible with enum types, but it is only optional). For
example:
Colors mycolor;

mycolor = Colors::blue;
if (mycolor == Colors::green) mycolor = Colors::red;
Enumerated types declared with enum class also have more control over their
underlying type; it may be any integral data type, such as char, short or unsigned
int, which essentially serves to determine the size of the type. This is specified by a
colon and the underlying type following the enumerated type. For example:
enum class EyeColor : char {blue, green, brown};
Here, EyeColor is a distinct type with the same size as a char (1 byte).
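A short sketch of both properties of enum class: the need for an explicit
conversion and the choice of underlying type. The enumerators of EyeColor and the
helper to_int are assumptions made for illustration.

```cpp
enum class Colors { black, blue, green, cyan, red, purple, yellow, white };
enum class EyeColor : char { blue, green, brown };   // underlying type is char

// There is no implicit conversion to int: the cast must be written out.
int to_int (Colors c) { return static_cast<int>(c); }

// The underlying type determines the size of the enumerated type
static_assert (sizeof(EyeColor) == sizeof(char), "EyeColor occupies one char");
```

Colors::blue and EyeColor::blue can coexist because each enumerator is scoped
inside its own type.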
Classes
Classes (I)
Classes are an expanded concept of data structures: like data structures, they can
contain data members, but they can also contain functions as members.
Classes are defined using either keyword class or keyword struct, with the
following syntax:
class class_name {
access_specifier_1:
member1;
access_specifier_2:
member2;
...
} object_names;
Where class_name is a valid identifier for the class, object_names is an optional list of
names for objects of this class. The body of the declaration can contain members,
which can either be data or function declarations, and optionally access specifiers.
Classes have the same format as plain data structures, except that they can also
include functions and have these new things called access specifiers. An access
specifier is one of the following three keywords: private, public or protected. These
specifiers modify the access rights for the members that follow them:
private members of a class are accessible only from within other members of
the same class (or from their "friends").
protected members are accessible from other members of the same class (or
from their "friends"), but also from members of their derived classes.
Finally, public members are accessible from anywhere where the object is
visible.
By default, all members of a class declared with the class keyword have private
access. Therefore, any member that is declared before any other access specifier
has private access automatically. For example:
class Rectangle {
    int width, height;
  public:
    void set_values (int,int);
    int area (void);
} rect;
This declares a class (i.e., a type) called Rectangle and an object (i.e., a variable)
of this class, called rect. This class contains four members: two data members of
type int (member width and member height) with private access (because private is
the default access level) and two member functions with public access: the functions
set_values and area, of which for now we have only included their declaration, but
not their definition.
Notice the difference between the class name and the object name: In the previous
example, Rectangle was the class name (i.e., the type), whereas rect was an object
of type Rectangle. It is the same relationship int and a have in the following
declaration:
int a;
where int is the type name (the class) and a is the variable name (the object).
After the declarations of Rectangle and rect, any of the public members of object
rect can be accessed as if they were normal functions or normal variables, by
simply inserting a dot (.) between object name and member name. This follows the
same syntax as accessing the members of plain data structures. For example:
rect.set_values (3,4);
myarea = rect.area();
The only members of rect that cannot be accessed from outside the class are width
and height, since they have private access and they can only be referred to from
within other members of that same class.
Here is the complete example of class Rectangle:
// classes example
#include <iostream>
using namespace std;

class Rectangle {
    int width, height;
  public:
    void set_values (int,int);
    int area() {return width*height;}
};

void Rectangle::set_values (int x, int y) {
  width = x;
  height = y;
}

int main () {
  Rectangle rect;
  rect.set_values (3,4);
  cout << "area: " << rect.area();
  return 0;
}
Output:
area: 12
This example reintroduces the scope operator (::, two colons), seen in earlier
chapters in relation to namespaces. Here it is used in the definition of function
set_values to define a member of a class outside the class itself.
Notice that the definition of the member function area has been included directly
within the definition of class Rectangle given its extreme simplicity. Conversely,
set_values is merely declared with its prototype within the class, but its definition
is outside it. In this outside definition, the scope operator (::) is used to specify
that the function being defined is a member of the class Rectangle and not a regular
non-member function.
The scope operator (::) specifies the class to which the member being defined
belongs, granting exactly the same scope properties as if this function definition
were directly included within the class definition. For example, the function
set_values in the previous example has access to the variables width and height,
which are private members of class Rectangle, and thus only accessible from other
members of the class, such as this one.
The only difference between defining a member function completely within the class
definition and just including its declaration in the class and defining it later
outside the class is that in the first case the function is automatically considered
an inline member function by the compiler, while in the second it is a normal
(not-inline) class member function. This causes no differences in behavior, only in
possible compiler optimizations.
Members width and height have private access (remember that if nothing else is
specified, all members of a class defined with keyword class have private access).
By declaring them private, access from outside the class is not allowed. This makes
sense, since we have already defined a member function to set values for those
members within the object: the member function set_values. Therefore, the rest of
the program does not need to have direct access to them. In an example as simple
as this one, it may be difficult to see how restricting access to these variables is
useful, but in larger projects it may be very important that values cannot be
modified in an unexpected way (unexpected from the point of view of the object).
The most important property of a class is that it is a type, and as such, we can
declare multiple objects of it. For example, continuing with the previous example of
class Rectangle, we could have declared the object rectb in addition to object rect:
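A minimal sketch of two objects of the same class, reusing the Rectangle class
defined earlier. The dimensions passed to set_values and the helper
area_difference are assumptions made for illustration.

```cpp
class Rectangle {
    int width, height;
  public:
    void set_values (int,int);
    int area () {return width*height;}
};

void Rectangle::set_values (int x, int y) {
  width = x;
  height = y;
}

// Hypothetical helper: two independent objects of the same class
int area_difference ()
{
  Rectangle rect, rectb;
  rect.set_values (3,4);               // rect holds its own width and height
  rectb.set_values (5,6);              // rectb holds a separate pair
  return rectb.area() - rect.area();   // each call uses its own object's data
}
```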
In this particular case, the class (type of the objects) is Rectangle, of which there are
two instances (i.e., objects): rect and rectb. Each one of them has its own member
variables and member functions.
Notice that the call to rect.area() does not give the same result as the call to
rectb.area(). This is because each object of class Rectangle has its own variables
width and height, and, in a sense, its own function members set_values and area
that operate on the object's own member variables.
Classes allow programming using object-oriented paradigms: Data and functions are
both members of the object, reducing the need to pass and carry handlers or other
state variables as arguments to functions, because they are part of the object
whose member is called. Notice that no arguments were passed on the calls to
rect.area or rectb.area. Those member functions directly used the data members
of their respective objects rect and rectb.
Constructors
What would happen in the previous example if we called the member function area
before having called set_values? An undetermined result, since the members width
and height had never been assigned a value.
In order to avoid that, a class can include a special function called its constructor,
which is automatically called whenever a new object of this class is created,
allowing the class to initialize member variables or allocate storage.
This constructor function is declared just like a regular member function, but with a
name that matches the class name and without any return type; not even void.
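A minimal sketch of such a class, assuming the same Rectangle used so far with
set_values replaced by a constructor that takes the two dimensions:

```cpp
class Rectangle {
    int width, height;
  public:
    Rectangle (int,int);                 // constructor: class name, no return type
    int area () {return (width*height);}
};

// Definition of the constructor outside the class
Rectangle::Rectangle (int a, int b) {
  width = a;
  height = b;
}
```

Objects would then be created as Rectangle rect (3,4); and Rectangle rectb (5,6);
(the second pair of values is an assumption), so that width and height are set from
the moment each object exists.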
The results of this example are identical to those of the previous example. But now,
class Rectangle has no member function set_values, and has instead a constructor
that performs a similar action: it initializes the values of width and height with the
arguments passed to it.
Notice how these arguments are passed to the constructor at the moment at which
the objects of this class are created:
Rectangle rect (3,4);
Notice also that neither the constructor declaration (within the class) nor the
later constructor definition has a return type, not even void: constructors never
return values, they simply initialize the object.
Overloading constructors
Like any other function, a constructor can also be overloaded with different versions
taking different parameters: with a different number of parameters and/or
parameters of different types. The compiler will automatically call the one whose
parameters match the arguments:
int main () {
  Rectangle rect (3,4);
  Rectangle rectb;
  cout << "rect area: " << rect.area() << endl;
  cout << "rectb area: " << rectb.area() << endl;
  return 0;
}
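The class used by the main function above is not shown; a minimal sketch of what
it could look like follows. The values set by the default constructor (5 by 5) are
chosen purely as an assumption.

```cpp
class Rectangle {
    int width, height;
  public:
    Rectangle ();            // default constructor: takes no parameters
    Rectangle (int,int);     // overload taking two arguments
    int area () {return (width*height);}
};

// Assumed default dimensions, for illustration only
Rectangle::Rectangle () {
  width = 5;
  height = 5;
}

Rectangle::Rectangle (int a, int b) {
  width = a;
  height = b;
}
```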
In the above example, two objects of class Rectangle are constructed: rect and
rectb. rect is constructed with two arguments, like in the example before.
But this example also introduces a special kind of constructor: the default
constructor. The default constructor is the constructor that takes no parameters,
and it is special because it is called when an object is declared but is not
initialized with any arguments. In the example above, the default constructor is
called for rectb. Note how rectb is not even constructed with an empty set of
parentheses - in fact, empty parentheses cannot be used to call the default
constructor:
Rectangle rectb;   // ok, default constructor called
Rectangle rectc(); // oops, default constructor NOT called
This is because the empty set of parentheses would make of rectc a function
declaration instead of an object declaration: It would be a function that takes no
arguments and returns a value of type Rectangle.
Uniform initialization
The way of calling constructors by enclosing their arguments in parentheses, as
shown above, is known as functional form. But constructors can also be called with
other syntaxes:
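First, constructors with a single parameter can be called using the variable
initialization syntax (an equal sign followed by the argument):
class_name object_name = initialization_value;
More recently, C++ introduced the possibility of calling constructors using uniform
initialization, which essentially is the same as the functional form, but using
braces ({}) instead of parentheses (()):
class_name object_name { value, value, value, ... }
Optionally, this last syntax can include an equal sign before the braces.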
Here is an example with four ways to construct objects of a class whose constructor
takes a single parameter:
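A minimal sketch of the four forms, assuming a hypothetical Circle class whose
constructor takes a radius. get_radius and sum_of_radii are helpers added only so
the result can be inspected.

```cpp
class Circle {
    double radius;
  public:
    Circle (double r) { radius = r; }
    double get_radius () { return radius; }
};

// Builds one object per syntax and adds up the radii
double sum_of_radii ()
{
  Circle foo (10.0);     // functional form
  Circle bar = 20.0;     // assignment-like initialization
  Circle baz {30.0};     // uniform initialization
  Circle qux = {40.0};   // uniform initialization with an equal sign
  return foo.get_radius() + bar.get_radius()
       + baz.get_radius() + qux.get_radius();
}
```

All four declarations call the same constructor; only the spelling differs.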
The choice of syntax to call constructors is largely a matter of style. Most existing
code currently uses functional form, and some newer style guides suggest choosing
uniform initialization over the others, even though it has its own potential
pitfalls, notably its preference for constructors that take an initializer_list.
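Member initialization in constructors
When a constructor is used to initialize other members, these members can be
initialized directly, without resorting to statements in its body. This is done by
inserting, before the constructor's body, a colon (:) and a list of initializations for
class members. For example, consider a class with the following declaration:
class Rectangle {
    int width,height;
  public:
    Rectangle(int,int);
    int area() {return width*height;}
};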
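The constructor for this class could be defined, as usual, as:
Rectangle::Rectangle (int x, int y) { width=x; height=y; }
But it could also be defined using member initialization as:
Rectangle::Rectangle (int x, int y) : width(x) { height=y; }
Or even:
Rectangle::Rectangle (int x, int y) : width(x), height(y) { }
Note how in this last case, the constructor does nothing else than initialize its
members, hence it has an empty function body.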
For members of fundamental types, it makes no difference which of the ways above
is used to define the constructor, because they are not initialized by default.
Member objects (those whose type is a class), however, are default-constructed if
they are not initialized after the colon.
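Default construction is not always possible (for instance, when the member's class
has no default constructor); in that case, the member shall be initialized in the
member initialization list. For example:
// member initialization
#include <iostream>
using namespace std;

class Circle {
    double radius;
  public:
    Circle(double r) : radius(r) { }
    double area() {return radius*radius*3.14159265;}
};

class Cylinder {
    Circle base;
    double height;
  public:
    Cylinder(double r, double h) : base (r), height(h) {}
    double volume() {return base.area() * height;}
};

int main () {
  Cylinder foo (10,20);

  cout << "foo's volume: " << foo.volume() << '\n';
  return 0;
}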
In this example, class Cylinder has a member object whose type is another class
(base's type is Circle). Because objects of class Circle can only be constructed with
a parameter, Cylinder's constructor needs to call base's constructor, and the only
way to do this is in the member initializer list.
These initializations can also use uniform initializer syntax, using braces {} instead
of parentheses ():
Cylinder::Cylinder (double r, double h) : base{r}, height{h} { }
Pointers to classes
Objects can also be pointed to by pointers: once declared, a class becomes a valid
type, so it can be used as the type pointed to by a pointer. For example:
Rectangle * prect;
is a pointer to an object of class Rectangle.
As with plain data structures, the members of an object can be accessed
directly from a pointer by using the arrow operator (->). Here is an example with
some possible combinations:
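A minimal sketch of such combinations, assuming a Rectangle class with a
two-argument constructor. The object names, the dimensions and the helper
total_area are assumptions made for illustration.

```cpp
class Rectangle {
    int width, height;
  public:
    Rectangle (int x, int y) : width(x), height(y) {}
    int area () { return width * height; }
};

// Exercises the operators ., *, &, -> and [] on objects and pointers
int total_area ()
{
  Rectangle obj (3, 4);
  Rectangle * foo = &obj;                               // address of obj
  Rectangle * bar = new Rectangle (5, 6);               // one dynamic object
  Rectangle * baz = new Rectangle[2] { {2,5}, {3,6} };  // dynamic array of two

  int total = obj.area()        // member of an object
            + (*foo).area()     // member of the object pointed to by foo
            + bar->area()       // same idea, written with the arrow operator
            + baz[0].area()     // first object pointed to by baz
            + baz[1].area();    // second object pointed to by baz

  delete bar;                   // single object: delete
  delete[] baz;                 // array: delete[]
  return total;
}
```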
This example makes use of several operators to operate on objects and pointers
(operators *, &, ., ->, []). They can be interpreted as:
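expression  can be read as
*x          pointed to by x
&x          address of x
x.y         member y of object x
x->y        member y of object pointed to by x
(*x).y      member y of object pointed to by x (equivalent to the previous one)
x[0]        first object pointed to by x
x[1]        second object pointed to by x
x[n]        (n+1)th object pointed to by x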
Most of these expressions have been introduced in earlier chapters. Most notably,
the chapter about arrays introduced the offset operator ( []) and the chapter about
plain data structures introduced the arrow operator ( ->).
The keyword struct, generally used to declare plain data structures, can also be
used to declare classes that have member functions, with the same syntax as with
keyword class. The only difference between both is that members of classes
declared with the keyword struct have public access by default, while members of
classes declared with the keyword class have private access by default. For all
other purposes both keywords are equivalent in this context.
Conversely, the concept of unions is different from that of classes declared with
struct and class, since unions only store one data member at a time, but
nevertheless they are also classes and can thus also hold member functions. The
default access in union classes is public.
Classes (II)
Overloading operators
Classes essentially define new types to be used in C++ code. Types in C++
interact with code not only by means of construction and assignment, but also
by means of operators. For example, take the following operation on
fundamental types:
int a, b, c;
a = b + c;
Here, the addition operator and then the assignment operator are applied to
variables of a fundamental type (int). For a fundamental arithmetic type, the
meaning of such operations is generally obvious and unambiguous, but it may not
be so for certain class types. For example:
struct myclass {
  string product;
  float price;
} a, b, c;
a = b + c;
Here, it is not obvious what the result of the addition operation on b and c is. In
fact, this code alone would cause a compilation error, since the type myclass has no
defined behavior for additions. However, C++ allows most operators to be
overloaded so that their behavior can be defined for just about any type, including
classes. Here is a list of all the operators that can be overloaded:
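Overloadable operators
+    -    *    /    =    <    >    +=   -=   *=   /=   <<   >>
<<=  >>=  ==   !=   <=   >=   ++   --   %    &    ^    !    |
~    &=   ^=   |=   &&   ||   %=   []   ()   ,    ->*  ->   new
delete    new[]     delete[]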
  result = foo + bar;
  cout << result.x << ',' << result.y << '\n';
  return 0;
}
The function operator+ of class CVector overloads the addition operator (+) for that
type. Once declared, this function can be called either implicitly using the operator,
or explicitly using its functional name:
c = a + b;
c = a.operator+ (b);
The operator overloads are just regular functions which can have any behavior;
there is actually no requirement that the operation performed by an overload
bear a relation to the mathematical or usual meaning of the operator, although it is
strongly recommended. For example, a class that overloads operator+ to actually
subtract, or that overloads operator== to fill the object with zeros, is perfectly valid,
although using such a class could be challenging.
The parameter expected for a member function overload for operations such as
operator+ is naturally the operand on the right-hand side of the operator. This is
common to all binary operators (those with one operand to the left and one operand
to the right). But operators can come in diverse forms. The following table
summarizes the parameters needed for each of the different operators that can be
overloaded (replace @ by the operator in each case):
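Expression   Operator                                Member function          Non-member function
@a           + - * & ! ~ ++ --                       A::operator@()           operator@(A)
a@           ++ --                                   A::operator@(int)        operator@(A,int)
a@b          + - * / % ^ & | < > == != <= >= << >>   A::operator@(B)          operator@(A,B)
             && || ,
a@b          = += -= *= /= %= ^= &= |= <<= >>= []    A::operator@(B)          -
a(b,c...)    ()                                      A::operator()(B,C...)    -
a->b         ->                                      A::operator->()          -
(TYPE) a     TYPE                                    A::operator TYPE()       -
Here a is an object of class A, b is an object of class B and c is an object of class C.
TYPE is just any type (that row describes overloading the conversion to type TYPE).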
Notice that some operators may be overloaded in two forms: either as a member
function or as a non-member function. The first case has been used in the example
above for operator+. But some operators can also be overloaded as non-member
functions; in this case, the operator function takes an object of the proper class as
first argument.
For example:
// non-member operator overloads
#include <iostream>
using namespace std;

class CVector {
  public:
    int x,y;
    CVector () {}
    CVector (int a, int b) : x(a), y(b) {}
};


CVector operator+ (const CVector& lhs, const CVector& rhs) {
  CVector temp;
  temp.x = lhs.x + rhs.x;
  temp.y = lhs.y + rhs.y;
  return temp;
}

int main () {
  CVector foo (3,1);
  CVector bar (1,2);
  CVector result;
  result = foo + bar;
  cout << result.x << ',' << result.y << '\n';
  return 0;
}
The keyword this
The keyword this represents a pointer to the object whose member function is
being executed. It is used within a class's member function to refer to the object
itself.
One of its uses can be to check if a parameter passed to a member function is the
object itself. For example:
// example on this
#include <iostream>
using namespace std;

class Dummy {
  public:
    bool isitme (Dummy& param);
};

bool Dummy::isitme (Dummy& param)
{
  if (&param == this) return true;
  else return false;
}

int main () {
  Dummy a;
  Dummy* b = &a;
  if ( b->isitme(a) )
    cout << "yes, &a is b\n";
  return 0;
}
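The keyword this is also frequently used in operator= member functions that return
objects by reference (by returning *this). In fact, such a function is very similar to
the code that the compiler generates implicitly for a class for operator=.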
Static members
A class can contain static members, either data or functions.
A static data member of a class is also known as a "class variable", because there is
only one common variable for all the objects of that same class, sharing the same
value: i.e., its value is not different from one object of this class to another.
For example, it may be used for a variable within a class that can contain a counter
with the number of objects of that class that are currently allocated, as in the
following example:
In fact, static members have the same properties as non-member variables but they
enjoy class scope. For that reason, and to avoid their being declared several times,
they cannot be initialized directly in the class, but need to be initialized somewhere
outside it. As in the previous example:
int Dummy::n=0;
Because it is a common variable value for all the objects of the same class, it can be
referred to as a member of any object of that class or even directly by the class
name (of course this is only valid for static members). For an object a of class
Dummy, these are the expressions a.n and Dummy::n.
Both expressions refer to the same variable: the static variable n within
class Dummy, shared by all objects of this class.
Again, it is just like a non-member variable, but with a name that must be
accessed like a member of a class (or an object).
Classes can also have static member functions. These represent the same:
members of a class that are common to all objects of that class, acting exactly as
non-member functions but being accessed like members of the class. Because they
are like non-member functions, they cannot access non-static members of the class
(neither member variables nor member functions). Nor can they use the
keyword this.
When an object of a class is qualified as const, access to its data members from
outside the class is restricted to read-only, as if all its data members were const
for those accessing them from outside the class. Note, though, that the constructor
is still called and is allowed to initialize and modify these data members:
The member functions of a const object can only be called if they are themselves
specified as const members; in the example above, member get (which is not
specified as const) cannot be called from foo. To specify that a member is a const
member, the const keyword shall follow the function prototype, after the closing
parenthesis for its parameters:
int get() const {return x;}
Note that const can be used to qualify the type returned by a member function. This
const is not the same as the one which specifies a member as const. Both are
independent and are located at different places in the function prototype:
Member functions specified to be const cannot modify non-static data members nor
call other non-const member functions. In essence, const members shall not modify
the state of an object.
const objects are limited to access only member functions marked as const, but
non-const objects are not restricted and thus can access both const and non-const
member functions alike.
You may think that anyway you are seldom going to declare const objects, and thus
marking all members that don't modify the object as const is not worth the effort,
but const objects are actually very common. Most functions taking classes as
parameters actually take them by const reference, and thus, these functions can
only access their const members:

int main() {
  MyClass foo (10);
  print(foo);

  return 0;
}
If, in this example, get were not specified as a const member, the call to arg.get() in
the print function would not be possible, because const objects only have access to
const member functions.
Member functions can be overloaded on their constness: i.e., a class may have two
member functions with identical signatures except that one is const and the other is
not. In this case, the const version is called only when the object is itself const, and
the non-const version is called when the object is itself non-const.
Class templates
Just like we can create function templates, we can also create class templates,
allowing classes to have members that use template parameters as types. For
example:
template <class T>
class mypair {
    T values [2];
  public:
    mypair (T first, T second)
    {
      values[0]=first; values[1]=second;
    }
};
The class that we have just defined serves to store two elements of any valid type.
For example, if we wanted to declare an object of this class to store two integer
values of type int with the values 115 and 36 we would write:
mypair<int> myobject (115, 36);
This same class could also be used to create an object to store any other type, such
as:
mypair<double> myfloats (3.0, 2.18);
The constructor is the only member function in the previous class template and it
has been defined inline within the class definition itself. When a member
function is defined outside the definition of the class template, it shall be preceded
with the template <...> prefix:

int main () {
  mypair <int> myobject (100, 75);
  cout << myobject.getmax();
  return 0;
}
Confused by so many T's? There are three T's in this declaration: The first one is the
template parameter. The second T refers to the type returned by the function. And
the third T (the one between angle brackets) is also a requirement: It specifies that
this function's template parameter is also the class template parameter.
Template specialization
It is possible to define a different implementation for a template when a specific
type is passed as template argument. This is called a template specialization.
For example, let's suppose that we have a very simple class called mycontainer that
can store one element of any type and that has just one member function called
increase, which increases its value. But we find that when it stores an element of
type char it would be more convenient to have a completely different
implementation with a function member uppercase, so we decide to declare a class
template specialization for that type:
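// template specialization
#include <iostream>
using namespace std;

// class template:
template <class T>
class mycontainer {
    T element;
  public:
    mycontainer (T arg) {element=arg;}
    T increase () {return ++element;}
};

// class template specialization:
template <>
class mycontainer <char> {
    char element;
  public:
    mycontainer (char arg) {element=arg;}
    char uppercase ()
    {
      if ((element>='a')&&(element<='z'))
        element+='A'-'a';
      return element;
    }
};

int main () {
  mycontainer<int> myint (7);
  mycontainer<char> mychar ('j');
  cout << myint.increase() << endl;
  cout << mychar.uppercase() << endl;
  return 0;
}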
First of all, notice that we precede the class name with template<> , including an
empty parameter list. This is because all types are known and no template
arguments are required for this specialization, but still, it is the specialization of a
class template, and thus it requires to be noted as such.
But more important than this prefix is the <char> specialization parameter after the
class template name. This specialization parameter itself identifies the type for
which the class template is being specialized (char). Notice the differences between
the generic class template and the specialization:
template <class T> class mycontainer { ... };
template <> class mycontainer <char> { ... };
The first line is the generic template, and the second one is the specialization.
When we declare specializations for a class template, we must also define all its
members, even those identical to the generic class template, because there is no
"inheritance" of members from the generic template to the specialization.
Special members
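Special member functions are member functions that are implicitly defined as
members of classes under certain circumstances. There are six:
Member function        Typical form for class C
Default constructor    C::C();
Destructor             C::~C();
Copy constructor       C::C (const C&);
Copy assignment        C& operator= (const C&);
Move constructor       C::C (C&&);
Move assignment        C& operator= (C&&);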
Default constructor
The default constructor is the constructor called when objects of a class are
declared, but are not initialized with any arguments.
If a class definition has no constructors, the compiler assumes the class to have an
implicitly defined default constructor. Therefore, after declaring a class like this:
class Example {
  public:
    int total;
    void accumulate (int x) { total += x; }
};
The compiler assumes that Example has a default constructor. Therefore, objects of
this class can be constructed by simply declaring them without any arguments:
Example ex;
But as soon as a class has some constructor taking any number of parameters
explicitly declared, the compiler no longer provides an implicit default constructor,
and no longer allows the declaration of new objects of that class without arguments.
For example, the following class:
class Example2 {
  public:
    int total;
    Example2 (int initial_value) : total(initial_value) { }
    void accumulate (int x) { total += x; }
};
Here, we have declared a constructor with a parameter of type int. Therefore the
following object declaration would be correct:
Example2 ex (100);   // ok: calls constructor
But the following:
Example2 ex;         // not valid: no default constructor
would not be valid, since the class has been declared with an explicit constructor
taking one argument, and that replaces the implicit default constructor taking none.
Here, Example3 has a default constructor (i.e., a constructor without parameters)
defined as an empty block:
Example3() {}
This allows objects of class Example3 to be constructed without arguments (like foo
was declared in this example). Normally, a default constructor like this is implicitly
defined for all classes that have no other constructors and thus no explicit definition
is required. But in this case, Example3 has another constructor (one that takes a
string), and when any constructor is explicitly declared in a class, no implicit
default constructor is provided automatically.
Destructor
Destructors fulfill the opposite functionality of constructors: They are responsible for
the necessary cleanup needed by a class when its lifetime ends. The classes we
have defined in previous chapters did not allocate any resource and thus did not
really require any clean up.
But now, let's imagine that the class in the last example allocates dynamic memory
to store the string it had as data member; in this case, it would be very useful to
have a function called automatically at the end of the object's life in charge of
releasing this memory. To do this, we use a destructor. A destructor is a member
function very similar to a default constructor: it takes no arguments and returns
nothing, not even void. It also uses the class name as its own name, but preceded
with a tilde sign (~):
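// destructors
#include <iostream>
#include <string>
using namespace std;

class Example4 {
    string* ptr;
  public:
    // constructors:
    Example4() : ptr(new string) {}
    Example4 (const string& str) : ptr(new string(str)) {}
    // destructor:
    ~Example4 () {delete ptr;}
    // access content:
    const string& content() const {return *ptr;}
};

int main () {
  Example4 foo;
  Example4 bar ("Example");

  cout << "bar's content: " << bar.content() << '\n';
  return 0;
}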
The destructor for an object is called at the end of its lifetime; in the case of foo and
bar this happens at the end of function main.
Copy constructor
When an object is passed a named object of its own type as argument, its copy
constructor is invoked in order to construct a copy.
If a class has no custom copy nor move constructors (or assignments) defined, an
implicit copy constructor is provided. This copy constructor simply performs a copy
of its own members. For example, for a class such as:
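class MyClass {
  public:
    int a, b; string c;
};
an implicit copy constructor is automatically defined. The definition assumed for
this function performs a shallow copy, roughly equivalent to:
MyClass::MyClass(const MyClass& x) : a(x.a), b(x.b), c(x.c) {}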
This default copy constructor may suit the needs of many classes. But shallow
copies only copy the members of the class themselves, and this is probably not
what we expect for classes like class Example4 we defined above, because it
contains a pointer whose storage it manages. For that class, performing a
shallow copy means that the pointer value is copied, but not the content itself. This
means that both objects (the copy and the original) would be sharing a single
string object (they would both be pointing to the same object), and at some point
(on destruction) both objects would try to delete the same block of memory,
probably causing the program to crash at runtime. This can be solved by defining
a custom copy constructor that performs a deep copy:
The deep copy performed by this copy constructor allocates storage for a new
string, which is initialized to contain a copy of the original object. In this way, both
objects (copy and original) have distinct copies of the content stored in different
locations.
Copy assignment
Objects are not only copied on construction, when they are initialized: They can also
be copied on any assignment operation. See the difference:
MyClass foo;
MyClass bar (foo);   // object initialization: copy constructor called
MyClass baz = foo;   // object initialization: copy constructor called
foo = bar;           // object already initialized: copy assignment called
Note that baz is initialized on construction using an equal sign, but this is not an
assignment operation! (although it may look like one): The declaration of an object
is not an assignment operation, it is just another of the syntaxes to call single-
argument constructors.
The copy assignment operator is also a special function and is also defined implicitly
if a class has no custom copy nor move assignments (nor move constructor)
defined.
But again, the implicit version performs a shallow copy, which is suitable for many
classes, but not for classes with pointers to objects whose storage they manage, as is
the case in Example5. In this case, not only does the class incur the risk of deleting
the pointed object twice, but the assignment also creates a memory leak by not
deleting the object pointed to by the object before the assignment. These issues
could be solved with a copy assignment that deletes the previous object and
performs a deep copy:
Or even better, since its string member is not constant, it could re-utilize the same
string object:
Unnamed objects are objects that are temporary in nature, and thus haven't even
been given a name. Typical examples of unnamed objects are return values of
functions or type-casts.
Using the value of a temporary object such as these to initialize another object or to
assign its value, does not really require a copy: the object is never going to be used
for anything else, and thus, its value can be moved into the destination object.
These cases trigger the move constructor and move assignments:
Both the value returned by fn and the value constructed with MyClass are unnamed
temporaries. In these cases, there is no need to make a copy, because the unnamed
object is very short-lived and can be acquired by the other object when this is a
more efficient operation.
The move constructor and move assignment are members that take a parameter of
type rvalue reference to the class itself:
MyClass (MyClass&&);             // move-constructor
MyClass& operator= (MyClass&&);  // move-assignment
An rvalue reference is specified by following the type with two ampersands (&&). As
a parameter, an rvalue reference matches arguments that are temporaries of this type.
The concept of moving is most useful for objects that manage the storage they use,
such as objects that allocate storage with new and delete. In such objects, copying
and moving are really different operations:
- Copying from A to B means that new memory is allocated to B and then the entire
content of A is copied to this new memory allocated for B.
- Moving from A to B means that the memory already allocated to A is transferred to
B without allocating any new storage. It involves simply copying the pointer.
For example:
// move constructor/assignment
#include <iostream>
#include <string>
using namespace std;

class Example6 {
    string* ptr;
  public:
    Example6 (const string& str) : ptr(new string(str)) {}
    ~Example6 () {delete ptr;}
    // move constructor
    Example6 (Example6&& x) : ptr(x.ptr) {x.ptr=nullptr;}
    // move assignment
    Example6& operator= (Example6&& x) {
      delete ptr;
      ptr = x.ptr;
      x.ptr=nullptr;
      return *this;
    }
    // access content:
    const string& content() const {return *ptr;}
    // addition:
    Example6 operator+(const Example6& rhs) {
      return Example6(content()+rhs.content());
    }
};


int main () {
  Example6 foo ("Exam");
  Example6 bar = Example6("ple");   // move-construction

  foo = foo + bar;                  // move-assignment

  cout << "foo's content: " << foo.content() << '\n';
  return 0;
}
Note that even though rvalue references can be used for the type of any function
parameter, it is seldom useful for uses other than the move constructor. Rvalue
references are tricky, and unnecessary uses may be the source of errors quite
difficult to track.
Implicit members
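The six special member functions described above are members implicitly declared
on classes under certain circumstances:
Member function       implicitly defined:                            default definition:
Default constructor   if no other constructors                       does nothing
Destructor            if no destructor                               does nothing
Copy constructor      if no move constructor and no move             copies all members
                      assignment
Copy assignment       if no move constructor and no move             copies all members
                      assignment
Move constructor      if no destructor, no copy constructor and      moves all members
                      no copy nor move assignment
Move assignment       if no destructor, no copy constructor and      moves all members
                      no copy nor move assignment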
Notice how not all special member functions are implicitly defined in the same
cases. This is mostly due to backwards compatibility with C structures and earlier
C++ versions, and in fact some include deprecated cases. Fortunately, each class
can select explicitly which of these members exist with their default definition or
which are deleted by using the keywords default and delete, respectively. The
syntax is either one of:
function_declaration = default;
function_declaration = delete;
For example:
// default and delete implicit members
#include <iostream>
using namespace std;

class Rectangle {
    int width, height;
  public:
    Rectangle (int x, int y) : width(x), height(y) {}
    Rectangle() = default;
    Rectangle (const Rectangle& other) = delete;
    int area() {return width*height;}
};

int main () {
  Rectangle foo;
  Rectangle bar (10,20);

  cout << "bar's area: " << bar.area() << '\n';
  return 0;
}
Here, Rectangle can be constructed either with two int arguments or be
default-constructed (with no arguments). It cannot, however, be copy-constructed from
another Rectangle object, because this function has been deleted. Therefore,
assuming the objects of the last example, the following statement would not be
valid:
Rectangle baz (foo);
It could, however, be made explicitly valid by declaring its copy constructor in the
class as:
Rectangle (const Rectangle& other) = default;
Which would be essentially equivalent to:
Rectangle (const Rectangle& other) : width(other.width), height(other.height) {}
Note that the keyword default does not define a member function equal to the
default constructor (i.e., where default constructor means constructor with no
parameters), but equal to the constructor that would be implicitly defined if not
deleted.
In general, and for future compatibility, classes that explicitly define one copy/move
constructor or one copy/move assignment but not both, are encouraged to specify
either delete or default on the other special member functions they don't explicitly
define.
Friend functions
In principle, private and protected members of a class cannot be accessed from
outside the same class in which they are declared. However, this rule does not
apply to "friends".
A non-member function can access the private and protected members of a class if
it is declared a friend of that class. That is done by including a declaration of this
external function within the class, and preceding it with the keyword friend:
  res.height = param.height*2;
  return res;
}

int main () {
  Rectangle foo;
  Rectangle bar (2,3);
  foo = duplicate (bar);
  cout << foo.area() << '\n';
  return 0;
}
Typical use cases of friend functions are operations that are conducted between two
different classes accessing private or protected members of both.
Friend classes
Similar to friend functions, a friend class is a class whose members have access to
the private or protected members of another class:
void Rectangle::convert (Square a) {
  width = a.side;
  height = a.side;
}

int main () {
  Rectangle rect;
  Square sqr (4);
  rect.convert(sqr);
  cout << rect.area();
  return 0;
}
There is something else new in this example: at the beginning of the program, there
is an empty declaration of class Square. This is necessary because class Rectangle
uses Square (as a parameter in member convert), and Square uses Rectangle
(declaring it a friend).
Another property of friendships is that they are not transitive: The friend of a friend
is not considered a friend unless explicitly specified.
For example, let's imagine a series of classes to describe two kinds of polygons:
rectangles and triangles. These two polygons have certain common properties, such
as the values needed to calculate their areas: they both can be described simply
with a height and a width (or base).
This could be represented in the world of classes with a class Polygon from which we
would derive the two other ones: Rectangle and Triangle:
The Polygon class would contain members that are common for both types of
polygon. In our case: width and height. And Rectangle and Triangle would be its
derived classes, with specific features that are different from one type of polygon to
the other.
Classes that are derived from others inherit all the accessible members of the base
class. That means that if a base class includes a member A and we derive a class
from it with another member called B, the derived class will contain both member A
and member B.
The inheritance relationship of two classes is declared in the derived class. Derived
class definitions use the following syntax:
class derived_class_name: public base_class_name
{ /*...*/ };
// derived classes
#include <iostream>
using namespace std;

class Polygon {
  protected:
    int width, height;
  public:
    void set_values (int a, int b)
      { width=a; height=b; }
};

class Rectangle: public Polygon {
  public:
    int area ()
      { return width * height; }
};

class Triangle: public Polygon {
  public:
    int area ()
      { return width * height / 2; }
};

int main () {
  Rectangle rect;
  Triangle trgl;
  rect.set_values (4,5);
  trgl.set_values (4,5);
  cout << rect.area() << '\n';
  cout << trgl.area() << '\n';
  return 0;
}
The objects of the classes Rectangle and Triangle each contain members inherited
from Polygon. These are: width, height and set_values.
The protected access specifier used in class Polygon is similar to private. Its only
difference occurs in fact with inheritance: When a class inherits another one, the
members of the derived class can access the protected members inherited from the
base class, but not its private members.
By declaring width and height as protected instead of private, these members are
also accessible from the derived classes Rectangle and Triangle, instead of just
from members of Polygon. If they were public, they could be accessed just from
anywhere.
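We can summarize the different access types according to which functions can
access them in the following way:
Access                       public   protected   private
members of the same class    yes      yes         yes
members of derived class     yes      yes         no
not members                  yes      no          no
Where "not members" represents any access from outside the class, such as from
main, from another class or from a function.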
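In the example above, the members inherited by Rectangle and Triangle have the
same access permissions as they had in their base class Polygon: width is protected
in both Polygon and Rectangle, and set_values is public in both.
This is because the inheritance relation has been declared using the public keyword
on each of the derived classes:
class Rectangle: public Polygon { /* ... */ }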
This public keyword after the colon (:) denotes the most accessible level the
members inherited from the class that follows it (in this case Polygon) will have from
the derived class (in this case Rectangle). Since public is the most accessible level,
by specifying this keyword the derived class will inherit all the members with the
same levels they had in the base class.
With protected, all public members of the base class are inherited as protected in
the derived class. Conversely, if the most restricting access level is specified
(private), all the base class members are inherited as private.
For example, if Daughter were a class derived from Mother that we defined as:
class Daughter: protected Mother { /* ... */ };
This would set protected as the least restrictive access level for the members of
Daughter that it inherited from Mother. That is, all members that were public in
Mother would become protected in Daughter. Of course, this would not restrict
Daughter from declaring its own public members. That least restrictive access level
is only set for the members inherited from Mother.
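If no access level is specified for the inheritance, the compiler assumes private for
classes declared with keyword class and public for those declared with struct.
Actually, most use cases of inheritance in C++ should use public inheritance. When
other access levels are needed for base classes, they can usually be better
represented as member variables instead.
In principle, a publicly derived class inherits access to every member of a base
class except its constructors and destructor, its assignment operator members
(operator=), its friends, and its private members.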
Even though access to the constructors and destructor of the base class is not
inherited as such, they are automatically called by the constructors and destructor
of the derived class.
Unless otherwise specified, the constructors of a derived class call the default
constructor of its base classes (i.e., the constructor taking no arguments). Calling a
different constructor of a base class is possible, using the same syntax used to
initialize member variables in the initialization list:
derived_constructor_name (parameters) : base_constructor_name (parameters) {...}
For example:
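// constructors and derived classes
#include <iostream>
using namespace std;

class Mother {
  public:
    Mother () { cout << "Mother: no parameters\n"; }
    Mother (int a) { cout << "Mother: int parameter\n"; }
};

class Daughter : public Mother {
  public:
    Daughter (int a) { cout << "Daughter: int parameter\n\n"; }
};

class Son : public Mother {
  public:
    Son (int a) : Mother (a)
      { cout << "Son: int parameter\n\n"; }
};

int main () {
  Daughter kelly(0);
  Son bud(0);

  return 0;
}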
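Notice the difference between which of Mother's constructors is called when a new
Daughter object is created and which when it is a Son object. The difference is due to
the different constructor declarations of Daughter and Son:
Daughter (int a)          // nothing specified: call default constructor
Son (int a) : Mother (a)  // constructor specified: call this specific constructor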
Multiple inheritance
A class may inherit from more than one class by simply specifying more base
classes, separated by commas, in the list of a class's base classes (i.e., after the
colon). For example, if the program had a specific class to print on screen called
Output, and we wanted our classes Rectangle and Triangle to also inherit its
members in addition to those of Polygon we could write:
class Rectangle: public Polygon, public Output { /* ... */ };
class Triangle: public Polygon, public Output { /* ... */ };
Here is the complete example:
// multiple inheritance
#include <iostream>
using namespace std;

class Polygon {
  protected:
    int width, height;
  public:
    Polygon (int a, int b) : width(a), height(b) {}
};

class Output {
  public:
    static void print (int i);
};

void Output::print (int i) {
  cout << i << '\n';
}

class Rectangle: public Polygon, public Output {
  public:
    Rectangle (int a, int b) : Polygon(a,b) {}
    int area ()
      { return width*height; }
};

class Triangle: public Polygon, public Output {
  public:
    Triangle (int a, int b) : Polygon(a,b) {}
    int area ()
      { return width*height/2; }
};

int main () {
  Rectangle rect (4,5);
  Triangle trgl (4,5);
  rect.print (rect.area());
  Triangle::print (trgl.area());
  return 0;
}
Polymorphism
Before getting any deeper into this chapter, you should have a proper
understanding of pointers and class inheritance. If you are not really sure of the
meaning of expressions such as a->b or class A: public B {};, you should review the
sections on pointers, data structures and inheritance before continuing.
The example about the rectangle and triangle classes can be rewritten using
pointers, taking into account that a pointer to a derived class is type-compatible
with a pointer to its base class:
Function main declares two pointers to Polygon (named ppoly1 and ppoly2). These
are assigned the addresses of rect and trgl, respectively, which are objects of type
Rectangle and Triangle. Such assignments are valid, since both Rectangle and
Triangle are classes derived from Polygon.
Dereferencing ppoly1 and ppoly2 (with *ppoly1 and *ppoly2) is valid and allows us to
access the members of their pointed objects. For example, the following two
statements would be equivalent in the previous example:
ppoly1->set_values (4,5);
rect.set_values (4,5);
But because the type of ppoly1 and ppoly2 is pointer to Polygon (and not pointer to
Rectangle nor pointer to Triangle), only the members inherited from Polygon can be
accessed, and not those of the derived classes Rectangle and Triangle. That is why
the program above accesses the area members of both objects using rect and trgl
directly, instead of the pointers; the pointers to the base class cannot access the
area members.
Member area could have been accessed with the pointers to Polygon if area were a
member of Polygon instead of a member of its derived classes, but the problem is
that Rectangle and Triangle implement different versions of area, therefore there is
not a single common version that could be implemented in the base class.
Virtual members
A virtual member is a member function that can be redefined in a derived class,
while preserving its calling properties through references. The syntax for a function
to become virtual is to precede its declaration with the virtual keyword:
  Polygon * ppoly1 = &rect;
  Polygon * ppoly2 = &trgl;
  Polygon * ppoly3 = &poly;
  ppoly1->set_values (4,5);
  ppoly2->set_values (4,5);
  ppoly3->set_values (4,5);
  cout << ppoly1->area() << '\n';
  cout << ppoly2->area() << '\n';
  cout << ppoly3->area() << '\n';
  return 0;
}
In this example, all three classes (Polygon, Rectangle and Triangle) have the same
members: width, height, and functions set_values and area.
The member function area has been declared as virtual in the base class because
it is later redefined in each of the derived classes. Non-virtual members can also be
redefined in derived classes, but non-virtual members of derived classes cannot be
accessed through a reference of the base class: i.e., if virtual is removed from the
declaration of area in the example above, all three calls to area would return zero,
because in all cases, the version of the base class would have been called instead.
Note that despite the virtuality of one of its members, Polygon is a regular
class, of which even an object was instantiated (poly), with its own definition of
member area that always returns 0.
// abstract class Polygon
class Polygon {
  protected:
    int width, height;
  public:
    void set_values (int a, int b)
      { width=a; height=b; }
    virtual int area () =0;
};
Notice that area has no definition; this has been replaced by =0, which makes it a
pure virtual function. Classes that contain at least one pure virtual function are
known as abstract base classes.
Abstract base classes cannot be used to instantiate objects. Therefore, this last
abstract base class version of Polygon could not be used to declare objects like:
Polygon mypolygon;   // not allowed: Polygon is an abstract base class
But an abstract base class is not totally useless. It can be used to create pointers to
it, and take advantage of all its polymorphic abilities. For example, the following
pointer declarations would be valid:
Polygon * ppoly1;
Polygon * ppoly2;
class Triangle: public Polygon {
  public:
    int area (void)
      { return (width * height / 2); }
};

int main () {
  Rectangle rect;
  Triangle trgl;
  Polygon * ppoly1 = &rect;
  Polygon * ppoly2 = &trgl;
  ppoly1->set_values (4,5);
  ppoly2->set_values (4,5);
  cout << ppoly1->area() << '\n';
  cout << ppoly2->area() << '\n';
  return 0;
}
In this example, objects of different but related types are referred to using a unique
type of pointer (Polygon*) and the proper member function is called every time, just
because they are virtual. This can be really useful in some circumstances. For
example, it is even possible for a member of the abstract base class Polygon to use
the special pointer this to access the proper virtual members, even though Polygon
itself has no implementation for this function:
int main () {
  Rectangle rect;
  Triangle trgl;
  Polygon * ppoly1 = &rect;
  Polygon * ppoly2 = &trgl;
  ppoly1->set_values (4,5);
  ppoly2->set_values (4,5);
  ppoly1->printarea();
  ppoly2->printarea();
  return 0;
}
Virtual members and abstract classes grant C++ polymorphic characteristics, most
useful for object-oriented projects. Of course, the examples above are very simple
use cases, but these features can be applied to arrays of objects or dynamically
allocated objects.
Here is an example that combines some of the features in the latest chapters, such
as dynamic memory, constructor initializers and polymorphism:
  ppoly2->printarea();
  delete ppoly1;
  delete ppoly2;
  return 0;
}
The pointers ppoly1 and ppoly2 are declared as being of type "pointer to Polygon",
but the objects allocated have been declared with the derived class types directly
(Rectangle and Triangle).
Type conversions
Implicit conversion
Implicit conversions are automatically performed when a value is copied to a
compatible type. For example:
short a=2000;
int b;
b=a;
Here, the value of a is promoted from short to int without the need of any explicit
operator. This is known as a standard conversion. Standard conversions affect
fundamental data types, and allow the conversions between numerical types (short
to int, int to float, double to int...), to or from bool, and some pointer conversions.
Converting to int from some smaller integer type, or to double from float is known
as promotion, and is guaranteed to produce the exact same value in the destination
type. Other conversions between arithmetic types may not always be able to
represent the same value exactly:
The conversions from/to bool consider false equivalent to zero (for numeric
types) and to null pointer (for pointer types); true is equivalent to all other
values and is converted to the equivalent of 1.
If the conversion is from a floating-point type to an integer type, the value is
truncated (the decimal part is removed). If the result lies outside the range of
representable values by the type, the conversion causes undefined behavior.
Some of these conversions may imply a loss of precision, which the compiler can
signal with a warning. This warning can be avoided with an explicit conversion.
For non-fundamental types, arrays and functions implicitly convert to pointers, and
pointers in general allow the following conversions:
For example:
public:
  // conversion from A (constructor):
  B (const A& x) {}
  // conversion from A (assignment):
  B& operator= (const A& x) {return *this;}
  // conversion to A (type-cast operator)
  operator A() {return A();}
};

int main ()
{
  A foo;
  B bar = foo;    // calls constructor
  bar = foo;      // calls assignment
  foo = bar;      // calls type-cast operator
  return 0;
}
The type-cast operator uses a particular syntax: it uses the operator keyword
followed by the destination type and an empty set of parentheses. Notice that the
return type is the destination type and thus is not specified before the operator
keyword.
Keyword explicit
On a function call, C++ allows one implicit conversion to happen for each argument.
This may be somewhat problematic for classes, because it is not always what is
intended. For example, if we add the following function to the last example:
void fn (B arg) {}
This function takes an argument of type B, but it could as well be called with an
object of type A as argument:
fn (foo);
This may or may not be what was intended. But, in any case, it can be prevented by
marking the affected constructor with the explicit keyword:
class B {
public:
  explicit B (const A& x) {}
  B& operator= (const A& x) {return *this;}
  operator A() {return A();}
};

void fn (B x) {}

int main ()
{
  A foo;
  B bar (foo);
  bar = foo;
  foo = bar;

//  fn (foo);  // not allowed for explicit ctor.
  fn (bar);

  return 0;
}
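Additionally, constructors marked with explicit cannot be called with the
assignment-like syntax; in the example above, bar could not have been constructed
with:
B bar = foo;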
Type-cast member functions (those described in the previous section) can also be
specified as explicit. This prevents implicit conversions in the same way as
explicit-specified constructors do for the destination type.
Type casting
C++ is a strongly typed language. Many conversions, especially those that imply a
different interpretation of the value, require an explicit conversion, known in C++ as
type-casting. There exist two main syntaxes for generic type-casting: functional and
c-like:
double x = 10.3;
int y;
y = int (x);    // functional notation
y = (int) x;    // c-like cast notation
The functionality of these generic forms of type-casting is enough for most needs
with fundamental data types. However, these operators can be applied
indiscriminately on classes and pointers to classes, which can lead to code that
-while being syntactically correct- can cause runtime errors. For example, the
following code compiles without errors:
// class type-casting
#include <iostream>
using namespace std;

class Dummy {
    double i,j;
};

class Addition {
    int x,y;
  public:
    Addition (int a, int b) { x=a; y=b; }
    int result() { return x+y;}
};

int main () {
  Dummy d;
  Addition * padd;
  padd = (Addition*) &d;
  cout << padd->result();
  return 0;
}
Unrestricted explicit type-casting allows converting any pointer into any other
pointer type, independently of the types they point to. The subsequent call to
member result will produce either a run-time error or some other unexpected
results.
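In order to control these types of conversions between classes, C++ has four
specific casting operators: dynamic_cast, reinterpret_cast, static_cast and
const_cast. Their format is the new type enclosed between angle-brackets (<>),
followed by the expression to be converted between parentheses:
dynamic_cast <new_type> (expression)
reinterpret_cast <new_type> (expression)
static_cast <new_type> (expression)
const_cast <new_type> (expression)
The traditional type-casting equivalents to these expressions would be:
(new_type) expression
new_type (expression)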
dynamic_cast
dynamic_cast can only be used with pointers and references to classes (or with
void*). Its purpose is to ensure that the result of the type conversion points to a
valid complete object of the destination pointer type.
// dynamic_cast
#include <iostream>
#include <exception>
using namespace std;

class Base { virtual void dummy() {} };
class Derived: public Base { int a; };

int main () {
  try {
    Base * pba = new Derived;
    Base * pbb = new Base;
    Derived * pd;

    pd = dynamic_cast<Derived*>(pba);
    if (pd==0) cout << "Null pointer on first type-cast.\n";

    pd = dynamic_cast<Derived*>(pbb);
    if (pd==0) cout << "Null pointer on second type-cast.\n";

  } catch (exception& e) {cout << "Exception: " << e.what();}
  return 0;
}
Output:
Null pointer on second type-cast.
Compatibility note: This type of dynamic_cast requires Run-Time Type Information
(RTTI) to keep track of dynamic types. Some compilers support this feature as an
option which is disabled by default. This needs to be enabled for runtime type
checking using dynamic_cast to work properly with these types.
The code above tries to perform two dynamic casts from pointer objects of type
Base* (pba and pbb) to a pointer object of type Derived*, but only the first one is
successful. Notice their respective initializations:
Base * pba = new Derived;
Base * pbb = new Base;
Even though both are pointers of type Base*, pba actually points to an object of type
Derived, while pbb points to an object of type Base. Therefore, when their respective
type-casts are performed using dynamic_cast, pba is pointing to a full object of class
Derived, whereas pbb is pointing to an object of class Base, which is an incomplete
object of class Derived.
When dynamic_cast cannot cast a pointer because it is not a complete object of the
required class -as in the second conversion in the previous example- it returns a null
pointer to indicate the failure. If dynamic_cast is used to convert to a reference type
and the conversion is not possible, an exception of type bad_cast is thrown instead.
dynamic_cast can also perform the other implicit casts allowed on pointers: casting
null pointers between pointer types (even between unrelated classes), and casting
any pointer of any type to a void* pointer.
static_cast
static_cast can perform conversions between pointers to related classes, not only
upcasts (from pointer-to-derived to pointer-to-base), but also downcasts (from
pointer-to-base to pointer-to-derived). No checks are performed during runtime to
guarantee that the object being converted is in fact a full object of the destination
type. Therefore, it is up to the programmer to ensure that the conversion is safe. On
the other hand, it does not incur the overhead of the type-safety checks of
dynamic_cast.
class Base {};
class Derived: public Base {};
Base * a = new Base;
Derived * b = static_cast<Derived*>(a);
This would be valid code, although b would point to an incomplete object of the
class and could lead to runtime errors if dereferenced.
Therefore, static_cast is able to perform with pointers to classes not only the
conversions allowed implicitly, but also their opposite conversions.
static_cast is also able to perform all conversions allowed implicitly (not only those
with pointers to classes), and is also able to perform the opposite of these. It can:
Convert from void* to any pointer type. In this case, it guarantees that if the
void* value was obtained by converting from that same pointer type, the
resulting pointer value is the same.
reinterpret_cast
reinterpret_cast converts any pointer type to any other pointer type, even of
unrelated classes. The operation result is a simple binary copy of the value from one
pointer to the other. All pointer conversions are allowed: neither the content pointed
nor the pointer type itself is checked.
It can also cast pointers to or from integer types. The format in which this integer
value represents a pointer is platform-specific. The only guarantee is that a pointer
cast to an integer type large enough to fully contain it (such as intptr_t) can be
cast back to a valid pointer.
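The conversions that can be performed by reinterpret_cast but not by static_cast are
low-level operations based on reinterpreting the binary representations of the
types, which in most cases results in code that is system-specific, and thus
non-portable. For example: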
class A { /* ... */ };
class B { /* ... */ };
A * a = new A;
B * b = reinterpret_cast<B*>(a);
This code compiles, although it does not make much sense, since now b points to an
object of a totally unrelated and likely incompatible class. Dereferencing b is unsafe.
const_cast
This type of casting manipulates the constness of the object pointed to by a pointer,
either to set it or to remove it. For example, in order to pass a const pointer to a
function that expects a non-const argument:
// const_cast
#include <iostream>
using namespace std;

void print (char * str)
{
  cout << str << '\n';
}

int main () {
  const char * c = "sample text";
  print ( const_cast<char *> (c) );
  return 0;
}
Output:
sample text
The example above is guaranteed to work because function print does not write to
the pointed object. Note though, that removing the constness of a pointed object to
actually write to it causes undefined behavior.
typeid
typeid allows checking the type of an expression:
typeid (expression)
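This operator returns a reference to a constant object of type type_info, which is
defined in the standard header <typeinfo>. A value returned by typeid can be
compared with another value returned by typeid using operators == and !=, or can
serve to obtain a null-terminated character sequence representing the data type or
class name by using its name() member.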
// typeid
#include <iostream>
#include <typeinfo>
using namespace std;

int main () {
  int * a,b;
  a=0; b=0;
  if (typeid(a) != typeid(b))
  {
    cout << "a and b are of different types:\n";
    cout << "a is: " << typeid(a).name() << '\n';
    cout << "b is: " << typeid(b).name() << '\n';
  }
  return 0;
}
Output:
a and b are of different types:
a is: int *
b is: int
When typeid is applied to classes, typeid uses the RTTI to keep track of the type of
dynamic objects. When typeid is applied to an expression whose type is a
polymorphic class, the result is the type of the most derived complete object:
Note: The string returned by member name of type_info depends on the specific
implementation of your compiler and library. It is not necessarily a simple string
with its typical type name, like in the compiler used to produce this output.
Notice how the type that typeid considers for pointers is the pointer type itself (both
a and b are of type class Base *). However, when typeid is applied to objects (like
*a and *b) typeid yields their dynamic type (i.e. the type of their most derived
complete object).
If the type typeid evaluates is a pointer preceded by the dereference operator ( *),
and this pointer has a null value, typeid throws a bad_typeid exception.
Exceptions
An exception is thrown by using the throw keyword from inside the try block.
Exception handlers are declared with the keyword catch, which must be placed
immediately after the try block:
// exceptions
#include <iostream>
using namespace std;

int main () {
  try
  {
    throw 20;
  }
  catch (int e)
  {
    cout << "An exception occurred. Exception Nr. " << e << '\n';
  }
  return 0;
}
Output:
An exception occurred. Exception Nr. 20
The code under exception handling is enclosed in a try block. In this example this
code simply throws an exception:
throw 20;
A throw expression accepts one parameter (in this case the integer value 20), which
is passed as an argument to the exception handler.
The exception handler is declared with the catch keyword immediately after the
closing brace of the try block. The syntax for catch is similar to a regular function
with one parameter. The type of this parameter is very important, since the type of
the argument passed by the throw expression is checked against it, and only in the
case they match, the exception is caught by that handler.
Multiple handlers (i.e., catch expressions) can be chained; each one with a different
parameter type. Only the handler whose argument type matches the type of the
exception specified in the throw statement is executed.
If an ellipsis (...) is used as the parameter of catch, that handler will catch any
exception no matter what the type of the exception thrown. This can be used as a
default handler that catches all exceptions not caught by other handlers:
try {
  // code here
}
catch (int param) { cout << "int exception"; }
catch (char param) { cout << "char exception"; }
catch (...) { cout << "default exception"; }
In this case, the last handler would catch any exception thrown of a type that is
neither int nor char.
After an exception has been handled, program execution resumes after the
try-catch block, not after the throw statement.
It is also possible to nest try-catch blocks within more external try blocks. In these
cases, we have the possibility that an internal catch block forwards the exception to
its external level. This is done with the expression throw; with no arguments. For
example:
try {
  try {
      // code here
  }
  catch (int n) {
      throw;
  }
}
catch (...) {
  cout << "Exception occurred";
}
Exception specification
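Older code may contain dynamic exception specifications. They were deprecated in
C++11 and removed in C++17 (the empty form throw() was removed in C++20), so
current compilers may reject them. A dynamic exception specification follows the
declaration of a function, appending a throw specifier to it. For example:
double myfunction (char param) throw (int);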
This declares a function called myfunction, which takes one argument of type char
and returns a value of type double. If this function throws an exception of some type
other than int, the function calls std::unexpected instead of looking for a handler
or calling std::terminate.
If this throw specifier is left empty with no type, this means that std::unexpected is
called for any exception. Functions with no throw specifier (regular functions) never
call std::unexpected, but follow the normal path of looking for their exception
handler.
Standard exceptions
The C++ Standard library provides a base class specifically designed to declare
objects to be thrown as exceptions. It is called std::exception and is defined in the
<exception> header. This class has a virtual member function called what that
returns a null-terminated character sequence (of type const char *) and that can be
overridden in derived classes to contain some sort of description of the exception.
  virtual const char* what() const throw()
  {
    return "My exception happened";
  }
} myex;

int main () {
  try
  {
    throw myex;
  }
  catch (exception& e)
  {
    cout << e.what() << '\n';
  }
  return 0;
}
We have placed a handler that catches exception objects by reference (notice the
ampersand & after the type), so it also catches objects of classes derived from
exception, like our myex object of type myexception.
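All exceptions thrown by components of the C++ Standard library are derived from
this exception class. These are:
exception           description
bad_alloc           thrown by new on allocation failure
bad_cast            thrown by dynamic_cast when it fails in a dynamic cast
bad_exception       thrown by certain dynamic exception specifiers
bad_typeid          thrown by typeid
bad_function_call   thrown by empty function objects
bad_weak_ptr        thrown by shared_ptr when passed a bad weak_ptr
Also deriving from exception, header <stdexcept> defines two generic exception
types that can be inherited by custom exceptions to report errors:
exception           description
logic_error         error related to the internal logic of the program
runtime_error       error detected during runtime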
A typical example where standard exceptions need to be checked for is on memory
allocation:
The exception that may be caught by the exception handler in this example is a
bad_alloc. Because bad_alloc is derived from the standard base class exception, it
can be caught (capturing by reference, captures all related classes).
Preprocessor directives
These preprocessor directives extend only across a single line of code. As soon as a
newline character is found, the preprocessor directive ends. No semicolon (;) is
expected at the end of a preprocessor directive. The only way a preprocessor
directive can extend through more than one line is by preceding the newline
character at the end of the line by a backslash (\).
When the preprocessor encounters a directive of the form #define identifier
replacement, it replaces any occurrence of identifier in the rest of the code by
replacement. This replacement can be an expression, a statement, a block or simply
anything. The preprocessor does not understand C++ proper; it simply replaces any
occurrence of identifier by replacement.
After the preprocessor has replaced TABLE_SIZE, the code becomes equivalent to:
int table1[100];
int table2[100];
This would replace any occurrence of getmax followed by two arguments by the
replacement expression, but also replacing each argument by its identifier, exactly
as you would expect if it was a function:
// function macro
#include <iostream>
using namespace std;

#define getmax(a,b) ((a)>(b)?(a):(b))

int main()
{
  int x=5, y;
  y= getmax(x,2);
  cout << y << endl;
  cout << getmax(7,x) << endl;
  return 0;
}
Output:
5
7
Defined macros are not affected by block structure. A macro lasts until it is
undefined with the #undef preprocessor directive:
This would generate the same code as:
int table1[100];
int table2[200];
Function macro definitions accept two special operators ( # and ##) in the
replacement sequence:
The operator #, followed by a parameter name, is replaced by a string literal that
contains the argument passed (as if enclosed between double quotes):
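#define str(x) #x
cout << str(test);
This would be translated into:
cout << "test";
The operator ## concatenates two arguments, leaving no blank spaces between them:
#define glue(a,b) a ## b
glue(c,out) << "test";
This would also be translated into:
cout << "test";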
Because preprocessor replacements happen before any C++ syntax check, macro
definitions can be a tricky feature. Be careful: code that relies heavily on
complicated macros becomes less readable, since the syntax expected is on many
occasions different from the normal expressions programmers expect in C++.
The conditional directives allow including or discarding part of the code of a
program if a certain condition is met.
#ifdef allows a section of a program to be compiled only if the macro that is
specified as the parameter has been defined, no matter what its value is. For
example:
#ifdef TABLE_SIZE
int table[TABLE_SIZE];
#endif
In this case, the line of code int table[TABLE_SIZE]; is only compiled if TABLE_SIZE
was previously defined with #define, independently of its value. If it was not
defined, that line will not be included in the program compilation.
#ifndef serves for the exact opposite: the code between #ifndef and #endif
directives is only compiled if the specified identifier has not been previously
defined. For example:
#ifndef TABLE_SIZE
#define TABLE_SIZE 100
#endif
int table[TABLE_SIZE];
In this case, if the TABLE_SIZE macro has not been defined yet when this piece of
code is reached, it is defined with a value of 100. If it already existed, it
keeps its previous value, since the #define directive is not executed.
The #if, #else and #elif (i.e., "else if") directives serve to specify some condition to
be met in order for the portion of code they surround to be compiled. The condition
that follows #if or #elif can only evaluate constant expressions, including macro
expressions. For example:
#if TABLE_SIZE>200
#undef TABLE_SIZE
#define TABLE_SIZE 200

#elif TABLE_SIZE<50
#undef TABLE_SIZE
#define TABLE_SIZE 50

#else
#undef TABLE_SIZE
#define TABLE_SIZE 100
#endif

int table[TABLE_SIZE];
Notice how the entire structure of #if, #elif and #else chained directives ends with
#endif.
The behavior of #ifdef and #ifndef can also be achieved by using the special
operators defined and !defined respectively in any #if or #elif directive:
The #line directive allows us to control both the line numbers within the code
files and the file name that appears when an error takes place. Its format is:
#line number "filename"
Where number is the new line number that will be assigned to the next code line. The
line numbers of successive lines will be increased one by one from this point on.
"filename" is an optional parameter that allows redefining the file name that will be
shown. For example:
This code will generate an error that will be shown as error in file "assigning
variable", line 20.
Error directive (#error)
This directive aborts the compilation process when it is found, generating a
compilation error that can be specified as its parameter:
#ifndef __cplusplus
#error A C++ compiler is required!
#endif
This example aborts the compilation process if the macro name __cplusplus is not
defined (this macro name is defined by default in all C++ compilers).
#include <header>
#include "file"
In the first case, a header is specified between angle-brackets <>. This is used to
include headers provided by the implementation, such as the headers that compose
the standard library (iostream, string,...). Whether the headers are actually files or
exist in some other form is implementation-defined, but in any case they shall be
properly included with this directive.
The syntax used in the second #include uses quotes, and includes a file. The file is
searched for in an implementation-defined manner, which generally includes the
current path. In the case that the file is not found, the compiler interprets the
directive as a header inclusion, just as if the quotes ("") were replaced by angle-
brackets (<>).
If the compiler does not support a specific argument for #pragma, it is ignored - no
syntax error is generated.
Predefined macro names
The following macro names are always defined (they all begin and end with two
underscore characters, _):
macro      value
__LINE__   Integer value representing the current line in the source code file being compiled.
macro value
In C:
For example:
Standard library
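C++ provides the following classes to perform output and input of characters
to/from files:
ofstream: Stream class to write on files
ifstream: Stream class to read from files
fstream: Stream class to both read and write from/to files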
These classes are derived directly or indirectly from the classes istream and
ostream. We have already used objects whose types were these classes: cin is an
object of class istream and cout is an object of class ostream. Therefore, we have
already been using classes that are related to our file streams. In fact, we can
use our file streams in the same way we already use cin and cout, with the
only difference that we have to associate these streams with physical files. Let's see
an example:
This code creates a file called example.txt and inserts a sentence into it in the same
way we are used to do with cout, but using the file stream myfile instead.
Open a file
The first operation generally performed on an object of one of these classes is to
associate it with a real file. This procedure is known as opening a file. An open file is
represented within a program by a stream (i.e., an object of one of these classes; in
the previous example, this was myfile) and any input or output operation performed
on this stream object will be applied to the physical file associated with it.
In order to open a file with a stream object we use its member function open:
open (filename, mode);
Where filename is a string representing the name of the file to be opened, and mode
is an optional parameter with a combination of the following flags:
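ios::in       Open for input operations.
ios::out      Open for output operations.
ios::binary   Open in binary mode.
ios::ate      Set the initial position at the end of the file. If this flag is not set, the initial position is the beginning of the file.
ios::app      All output operations are performed at the end of the file, appending the content to the current content of the file.
ios::trunc    If the file is opened for output operations and it already existed, its previous content is deleted and replaced by the new one.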
All these flags can be combined using the bitwise operator OR ( |). For example, if
we want to open the file example.bin in binary mode to add data we could do it by
the following call to member function open:
1 ofstream myfile;
2 myfile.open ("example.bin", ios::out | ios::app | ios::binary);
Each of the open member functions of classes ofstream, ifstream and fstream has a
default mode that is used if the file is opened without a second argument:
class      default mode parameter
ofstream   ios::out
ifstream   ios::in
fstream    ios::in | ios::out
For ifstream and ofstream classes, ios::in and ios::out are automatically and
respectively assumed, even if a mode that does not include them is passed as
second argument to the open member function (the flags are combined).
For fstream, the default value is only applied if the function is called without
specifying any value for the mode parameter. If the function is called with any value
in that parameter the default mode is overridden, not combined.
File streams opened in binary mode perform input and output operations
independently of any format considerations. Non-binary files are known as text files,
and some translations may occur due to formatting of some special characters (like
newline and carriage return characters).
Since the first task that is performed on a file stream is generally to open a file,
these three classes include a constructor that automatically calls the open member
function and has the exact same parameters as this member. Therefore, we could
also have declared the previous myfile object and conducted the same opening
operation in our previous example by writing:
ofstream myfile ("example.bin", ios::out | ios::app | ios::binary);
To check whether a file stream has successfully opened a file, call its member
is_open. This member function returns a bool value of true if the stream object
is indeed associated with an open file, or false otherwise:
if (myfile.is_open()) { /* ok, proceed with output */ }
Closing a file
When we are finished with our input and output operations on a file we shall close it
so that the operating system is notified and its resources become available again.
For that, we call the stream's member function close. This member function
flushes the associated buffers and closes the file:
myfile.close();
Once this member function is called, the stream object can be re-used to open
another file, and the file is available again to be opened by other processes.
In case that an object is destroyed while still associated with an open file, the
destructor automatically calls the member function close.
Text files
Text file streams are those where the ios::binary flag is not included in their
opening mode. These files are designed to store text and thus all values that are
input or output from/to them can suffer some formatting transformations, which do
not necessarily correspond to their literal binary value.
Writing operations on text files are performed in the same way we operated with
cout:
Reading from a file can also be performed in the same way that we did with cin:
  return 0;
}
This last example reads a text file and prints out its content on the screen. We have
created a while loop that reads the file line by line, using getline. The value
returned by getline is a reference to the stream object itself, which when
evaluated as a boolean expression (as in this while-loop) is true if the stream is
ready for more operations, and false if either the end of the file has been reached
or if some other error occurred.
bad()
Returns true if a reading or writing operation fails. For example, in the case
that we try to write to a file that is not open for writing or if the device where
we try to write has no space left.
fail()
Returns true in the same cases as bad(), but also in the case that a format
error happens, like when an alphabetical character is extracted when we are
trying to read an integer number.
eof()
Returns true if a file open for reading has reached the end.
good()
It is the most generic state flag: it returns false in the same cases in which
calling any of the previous functions would return true. Note that good and
bad are not exact opposites (good checks more state flags at once).
The member function clear() can be used to reset the state flags.
get and put stream positioning
ifstream, like istream, keeps an internal get position with the location of the
element to be read in the next input operation.
ofstream, like ostream, keeps an internal put position with the location where the
next element has to be written.
Finally, fstream, keeps both, the get and the put position, like iostream.
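These internal stream positions point to the locations within the stream where the
next reading or writing operation is performed. These positions can be observed and
modified using the following member functions:
tellg() and tellp()
These two member functions take no parameters and return a value of type streampos
representing the current get position (tellg) or put position (tellp).
seekg() and seekp()
These functions change the location of the get and put positions. Both are
overloaded with two different prototypes. The first form is: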
seekg ( position );
seekp ( position );
Using this prototype, the stream pointer is changed to the absolute position
position (counting from the beginning of the file). The type for this parameter is
streampos, which is the same type as returned by functions tellg and tellp.
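The other form for these functions is:
seekg ( offset, direction );
seekp ( offset, direction );
Using this prototype, the get or put position is set to an offset value relative to some
specific point determined by the parameter direction. offset is of type streamoff,
and direction is of type seekdir, an enumerated type that determines the point from
which offset is counted. It can take any of the following values:
ios::beg    offset counted from the beginning of the stream
ios::cur    offset counted from the current position
ios::end    offset counted from the end of the stream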
The following example uses the member functions we have just seen to obtain the
size of a file:
Notice the type we have used for variables begin and end:
streampos begin, end;
streampos is a specific type used for buffer and file positioning and is the type
returned by file.tellg(). Values of this type can safely be subtracted from other
values of the same type, and can also be converted to an integer type large enough
to contain the size of the file.
These stream positioning functions use two particular types: streampos and
streamoff. These types are also defined as member types of the stream class:
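Type        Member type     Description
streampos   ios::pos_type   Defined as fpos<mbstate_t>. It can be converted to/from
                            streamoff, and values of these types can be added to or
                            subtracted from it.
streamoff   ios::off_type   An alias of one of the fundamental integral types (such as
                            int or long long).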
Each of the member types above is an alias of its non-member equivalent (they are
the exact same type). It does not matter which one is used. The member types are
more generic, because they are the same on all stream objects (even on streams
using exotic types of characters), but the non-member types are widely used in
existing code for historical reasons.
Binary files
For binary files, reading and writing data with the extraction and insertion operators
(<< and >>) and functions like getline is not efficient, since we do not need to format
any data and data is likely not formatted in lines.
File streams include two member functions specifically designed to read and write
binary data sequentially: write and read. The first one (write) is a member function
of ostream (inherited by ofstream). And read is a member function of istream
(inherited by ifstream). Objects of class fstream have both. Their prototypes are:
write ( memory_block, size );
read ( memory_block, size );
Where memory_block is of type char* (pointer to char), and represents the address of
an array of bytes where the read data elements are stored or from where the data
elements to be written are taken. The size parameter is an integer value that
specifies the number of characters to be read or written from/to the memory block.
In this example, the entire file is read and stored in a memory block. Let's examine
how this is done:
First, the file is opened with the ios::ate flag, which means that the get pointer will be
positioned at the end of the file. This way, when we call the member function tellg(),
we directly obtain the size of the file.
Once we have obtained the size of the file, we request the allocation of a memory
block large enough to hold the entire file:
Right after that, we proceed to set the get position at the beginning of the file
(remember that we opened the file with this pointer at the end), then we read the
entire file, and finally close it:
At this point we could operate with the data obtained from the file. But our program
simply announces that the content of the file is in memory and then finishes.
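Buffers and synchronization
File streams are associated with an internal buffer object of type streambuf, which
acts as an intermediary between the stream and the physical file.
The operating system may also define other layers of buffering for reading and
writing to files.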
When the buffer is flushed, all the data contained in it is written to the physical
medium (if it is an output stream). This process is called synchronization and takes
place under any of the following circumstances:
When the file is closed: before closing a file, all buffers that have not yet
been flushed are synchronized and all pending data is written or read to the
physical medium.
When the buffer is full: Buffers have a certain size. When the buffer is full
it is automatically synchronized.
ASCII Codes
It is a very well-known fact that computers can manage internally only 0s (zeros)
and 1s (ones). This is true, and by means of sequences of 0s and 1s the computer
can express any numerical value as its binary translation, which is a very simple
mathematical operation (as explained in the paper numerical bases).
Nevertheless, there is no such obvious way to represent letters and other non-numeric
characters with 0s and 1s. Therefore, in order to do that, computers use
ASCII tables, which are tables or lists that contain all the letters in the Latin
alphabet plus some additional characters. In these tables each character is always
represented by the same order number. For example, the ASCII code for the capital
letter "A" is always represented by the order number 65, which is easily
representable using 0s and 1s in binary: 65 expressed as a binary number is
1000001.
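The standard ASCII table defines 128 character codes (from 0 to 127). The first 32
are control codes (non-printable), as is the last one (127, DEL); the remaining 95
character codes are representable characters. In the table below, the row label is
the first hexadecimal digit of the code and the column label is the second (SP is
the space character):
*  0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
0  NUL SOH STX ETX EOT ENQ ACK BEL BS  TAB LF  VT  FF  CR  SO  SI
1  DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM  SUB ESC FS  GS  RS  US
2  SP  !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
3  0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
4  @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
5  P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
6  `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
7  p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   DEL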
Because most systems nowadays work with 8-bit bytes, which can represent 256
different values, in addition to the 128 standard ASCII codes there are another 128
codes known as extended ASCII, which are platform- and locale-dependent. So
there is more than one extended ASCII character set.
The two most used extended ASCII character sets are the one known as OEM, which
comes from the default character set of the IBM PC, and the ANSI extended ASCII
set, which is used by most recent operating systems.
The first of them, the OEM character set, is the one used by the hardware of the
immense majority of PC-compatible machines, and was also used under the old DOS
system. It includes some foreign signs, some accented characters and pieces to
draw boxes and panels.
The ANSI character set is a standard that many systems incorporate, like Windows,
some UNIX platforms and many standalone applications. It includes many more
local symbols and accented letters, so that it can be used in many more languages
without needing to be redefined.
Boolean Operations
A bit is the minimum amount of information that we can imagine, since it only
stores either value 1 or 0, which represents either YES or NO, activated or
deactivated, true or false, etc. That is, two possible states, each one opposite to the
other, with no possibility of shades in between. We are going to consider that the two
possible values of a bit are 0 and 1.
Several operations can be performed with bits, either in conjunction with other bits
or on their own. These operations receive the name of boolean operations, a
word that comes from the name of one of the mathematicians who contributed the
most to this field: George Boole (1815-1864).
All these operations have an established behavior and all of them can be applied to
any bit no matter which value it contains (either 0 or 1). Below is a list of
the basic boolean operations and a table with the behavior of each operation for
every possible combination of bits.
AND
This operation is performed between two bits, which we will call a and b. The result
of applying this AND operation is 1 if both a and b are equal to 1, and 0 in all other
cases (i.e., if one or both of the variables is 0).
AND (&)
a b a&b
0 0 0
0 1 0
1 0 0
1 1 1
OR
This operation is performed between two bits ( a and b). The result is 1 if either one
of the two bits is 1, or if both are 1. If none is equal to 1 the result is 0.
OR (|)
a b a|b
0 0 0
0 1 1
1 0 1
1 1 1
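XOR (exclusive or)
This operation is performed between two bits (a and b). The result is 1 if exactly
one of the two bits is 1, and 0 if both are 0 or both are 1.
XOR (^)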
a b a^b
0 0 0
0 1 1
1 0 1
1 1 0
NOT
This operation is performed on a single bit. Its result is the inversion of the actual
value of the bit: if it was set to 1 it becomes 0, and if it was 0 it becomes 1:
NOT (~)
a ~a
0 1
1 0
These are the 4 basic boolean operations (AND, OR, XOR and NOT). Combining
these operations we can obtain any possible result from two bits.
In C++, these operators can be used with variables of any integer data type; the
boolean operation is performed on every bit of each variable involved. For
example, suppose two variables, a and b, both of type unsigned char, where a
contains 195 (11000011 in binary) and b contains 87 (01010111 in binary). If we
write the following code:
c = a & b;
we conduct a bitwise AND operation between a and b. The operation is performed
between the bits of the two variables that are located at the same position: the
rightmost bit of c will contain the result of conducting the AND operation between
the rightmost bits of a and b:
a    11000011   (195)
b    01010111   (87)
c    01000011   (67)
The same operation is also performed between the second bits of both variables,
and the third, and so on, until the operation is performed between all bits of both
variables (each bit only with the bit at the same position in the other variable).
Numerical bases
Since we were kids, we have all used decimals to express quantities. This
nomenclature that seems so logical to us may not seem so to an inhabitant of
Classical Rome. For them, each symbol that they wrote to express a number always
represented the same value:
I 1
II 2
III 3
IV 4
V 5
All the I signs always represent the value 1 (one) wherever they are placed, and
the V sign always represents a value of 5 (five). That is not the case in our
decimal system. When we write the decimal symbol 1 we are not
always talking about a value of one (I in Roman numerals). For example:
1 I
10 X
100 C
In these cases, our symbol 1 does not always have a value of one (I in Roman
numerals). In the second case, the symbol 1 represents a value of ten (X in Roman)
and in the third one, 1 represents a value of one hundred (C).
In the same way, the number 275 is decomposed as 200 + 70 + 5:
  200
+  70
    5
  ---
  275
Therefore, the first sign, "2", is equivalent to 200 (2 x 100), the second sign, "7", is
equivalent to 70 (7 x 10), and the last sign corresponds to the value 5 (5 x 1).
This is because our system is a positional numeral system: the value of a
given digit depends on its position within the entire number being represented. All
the above can be mathematically represented in a very simple way. For example, to
represent the value 182736 we can assume that each digit is multiplied by 10 raised
to the power of its place, beginning from the right with 10^0, following with
10^1, 10^2, and so on:
182736 = 1 x 10^5 + 8 x 10^4 + 2 x 10^3 + 7 x 10^2 + 3 x 10^1 + 6 x 10^0
Octal numbers (base 8)
Just as our "normal" numbers are base 10 (or radix 10) because we have 10 different
digits (from 0 to 9):
0123456789
octal numbers include only the representations for the values from 0 to 7:
01234567
and, therefore, their mathematical base is 8. In C++, octal numbers are denoted by
always beginning with a 0 digit. Let's see how we would write the first numbers in
octal:
octal decimal
----- -------
0 0 (zero)
01 1 (one)
02 2 (two)
03 3 (three)
04 4 (four)
05 5 (five)
06 6 (six)
07 7 (seven)
010 8 (eight)
011 9 (nine)
012 10 (ten)
013 11 (eleven)
014 12 (twelve)
015 13 (thirteen)
016 14 (fourteen)
017 15 (fifteen)
020 16 (sixteen)
021 17 (seventeen)
Thus, for example, the number 17 (seventeen, or XVII in Roman) is expressed as 021
as an octal number in C++. We can apply the same mechanism that we saw
previously for decimal numbers to octal numbers simply by considering that their
base is 8. For example, taking the octal number 071263:
071263 = 7 x 8^4 + 1 x 8^3 + 2 x 8^2 + 6 x 8^1 + 3 x 8^0 = 29363
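Hexadecimal numbers (base 16)
Hexadecimal numbers have 16 different digits, represented by the numbers from 0 to 9
and the letters A to F. In C++, hexadecimal numbers are preceded by 0x (zero, x):
hexadecimal decimal
----------- -------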
0 0 (zero)
0x1 1 (one)
0x2 2 (two)
0x3 3 (three)
0x4 4 (four)
0x5 5 (five)
0x6 6 (six)
0x7 7 (seven)
0x8 8 (eight)
0x9 9 (nine)
0xA 10 (ten)
0xB 11 (eleven)
0xC 12 (twelve)
0xD 13 (thirteen)
0xE 14 (fourteen)
0xF 15 (fifteen)
0x10 16 (sixteen)
0x11 17 (seventeen)
Once again, we can use the same method to translate a number from one base to
another.
Binary representations
Octal and hexadecimal numbers have a considerable advantage over our decimal
numbers in the world of bits: their bases (8 and 16) are exact powers
of 2 (2^3 and 2^4, respectively), which allows us to make easier conversions from these
bases to binary than from decimal numbers (whose base is 2 x 5). For example,
suppose that we want to translate the following binary sequence to numbers of
other bases:
110011111010010100
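Translating it to decimal requires a computation like the ones shown previously
for octal and hexadecimal, which gives the decimal number 212628.
Nevertheless, passing this sequence to octal takes only a few seconds, and
even those less skilled in mathematics can do it at a glance: since 8 is 2^3, we
separate the binary value into groups of 3 digits:
110 011 111 010 010 100
and now we just have to translate each group separately to octal:
110 011 111 010 010 100
 6   3   7   2   2   4
giving the number 637224 as a result. This same process can be performed in reverse
to pass from octal to binary.
To conduct the operation with hexadecimal numbers we only have to
perform the same process, but separating the binary value into groups of 4 digits,
because 16 = 2^4:
11 0011 1110 1001 0100
 3    3    E    9    4
Therefore, the binary sequence 110011111010010100 can be written in C++ as
212628 (decimal), 0637224 (octal) or 0x33E94 (hexadecimal).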
Reference
C Library
The elements of the C language library are also included as a subset of the C++
Standard library. These cover many aspects, from general utility functions and
macros to input/output functions and dynamic memory management functions:
<cassert> (assert.h)
<cctype> (ctype.h)
<cerrno> (errno.h)
C Errors (header)
<cfenv> (fenv.h)
<cfloat> (float.h)
<cinttypes> (inttypes.h)
<ciso646> (iso646.h)
<climits> (limits.h)
<clocale> (locale.h)
<cmath> (math.h)
<csetjmp> (setjmp.h)
<csignal> (signal.h)
<cstdarg> (stdarg.h)
<cstdbool> (stdbool.h)
<cstddef> (stddef.h)
<cstdint> (stdint.h)
<cstdio> (stdio.h)
<cstdlib> (stdlib.h)
<cstring> (string.h)
C Strings (header)
<ctgmath> (tgmath.h)
<ctime> (time.h)
<cuchar> (uchar.h)
<cwchar> (wchar.h)
<cwctype> (wctype.h)
Containers
<array>
<bitset>
<deque>
<forward_list>
<list>
<map>
<queue>
<set>
<stack>
<unordered_map>
Unordered map header (header)
<unordered_set>
<vector>
Atomics and threading library
<atomic>
Atomic (header)
<condition_variable>
<future>
Future (header)
<mutex>
Mutex (header)
<thread>
Thread (header)
Miscellaneous headers
<algorithm>
<chrono>
<codecvt>
<complex>
<exception>
<functional>
<initializer_list>
<iterator>
<limits>
<locale>
<memory>
Memory elements (header)
<new>
<numeric>
<random>
Random (header)
<ratio>
<regex>
<stdexcept>
<string>
Strings (header)
<system_error>
<tuple>
<typeindex>
<typeinfo>
<type_traits>
type_traits (header)
<utility>
Utility components (header)
<valarray>
C Language Library
The C++ library includes the same definitions as the C language library organized in
the same structure of header files, with the following differences:
Each header file has the same name as the C language version but with a "c"
prefix and no extension. For example, the C++ equivalent for the C language
header file <stdlib.h> is <cstdlib>.
Every element of the library is defined within the std namespace.
Nevertheless, for compatibility with C, the traditional header names name.h (like
stdlib.h) are also provided with the same definitions within the global namespace.
In the examples provided in this reference, this version is used so that the examples
are fully C-compatible, although its use is deprecated in C++.
wchar_t, char16_t, char32_t and bool are fundamental types in C++ and
therefore are not defined in the corresponding header where they appear in
C. The same applies to several macros in the header <iso646.h>, which are
keywords in C++.
The functions atexit, exit and abort, defined in <cstdlib> have additions to
their behavior in C++.
Note on versions
C++98 includes the C library as described by the 1990 ISO C standard and its
amendment #1 (ISO/IEC 9899:1990 and ISO/IEC 9899:1990/DAM 1).
C++11 includes the C library as described by the 1999 ISO C standard and its
Technical Corrigenda 1, 2 and 3 (ISO/IEC 9899:1999 and ISO/IEC
9899:1999/Cor.1,2,3), plus <cuchar> (as by ISO/IEC 19769:2004).
Other introductions by the 2011 ISO C standard are not compatible with C++.
Headers
C90 (C++98)
<cassert> (assert.h)
C Diagnostics Library (header)
<cctype> (ctype.h)
<cerrno> (errno.h)
C Errors (header)
<cfloat> (float.h)
<ciso646> (iso646.h)
<climits> (limits.h)
<clocale> (locale.h)
<cmath> (math.h)
<csetjmp> (setjmp.h)
<csignal> (signal.h)
<cstdarg> (stdarg.h)
<cstddef> (stddef.h)
<cstdio> (stdio.h)
<cstdlib> (stdlib.h)
<cstring> (string.h)
C Strings (header)
<ctime> (time.h)
Amendment 1 to ISO-C 90 added two additional headers: <cwchar> and
<cwctype>.