Digital Image Processing with C++: Implementing Reference Algorithms with the CImg Library
presents the theory of digital image processing and implementations of algorithms using a dedicated library. Processing a digital image means transforming its content (denoising, stylizing, etc.) or extracting information to solve a given problem (object recognition, measurement, motion estimation, etc.). This book presents the mathematical theories underlying digital image processing as well as their practical implementation through examples of algorithms implemented in the C++ language using the free and easy-to-use CImg library.
Chapters cover the field of digital image processing in a broad way and propose practical and
functional implementations of each method theoretically described. The main topics covered
include filtering in spatial and frequency domains, mathematical morphology, feature extraction
and applications to segmentation, motion estimation, multispectral image processing and 3D
visualization.
Students or developers wishing to discover or specialize in this discipline, as well as teachers and researchers hoping to quickly prototype new algorithms or develop courses, will all find in this book material to discover image processing or deepen their knowledge of the field.
David Tschumperlé is a permanent CNRS research scientist heading the IMAGE team at the
GREYC Laboratory in Caen, France. He’s particularly interested in partial differential equations
and variational methods for processing multi-valued images in a local or non-local way. He has
authored more than 40 papers in journals or conferences and is the project leader of CImg and
G’MIC, two open-source software projects.
David Tschumperlé
Christophe Tilmant
Vincent Barra
First edition published 2023
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
Title of the original French edition, Le traitement numérique des images en C++. Implémentation d’algorithmes avec la bibliothèque CImg, published by Ellipses. Copyright 2021, Édition Marketing S. A.
Reasonable efforts have been made to publish reliable data and information, but the author and publisher
cannot assume responsibility for the validity of all materials or the consequences of their use. The authors
and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any future
reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying, microfilming, and recording, or in any information storage or retrieval
system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com or
contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-
8400. For works that are not available on CCC please contact [email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used
only for identification and explanation without intent to infringe.
DOI: 10.1201/9781003323693
Publisher’s note: This book has been prepared from camera-ready copy provided by the author.
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
I INTRODUCTION TO CImg
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
4 Mathematical Morphology . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.1 Binary images 54
4.1.1 Dilation and erosion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1.2 Opening and closing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.2 Gray-level images 58
4.3 Some applications 59
4.3.1 Kramer-Bruckner filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.3.2 Alternating sequential filters . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.3 Morphological gradients . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.3.4 Skeletonization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.1 Spatial filtering 69
5.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.1.2 Low-pass filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.1.3 High-pass filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.1.4 Adaptive filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.1.5 Adaptive window filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.2 Recursive filtering 84
5.2.1 Optimal edge detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.2.2 Deriche filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.3 Frequency filtering 94
5.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.3.2 The Fourier transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.3.3 Frequency filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.3.4 Processing a Moiré image . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.4 Diffusion filtering 110
5.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.4.2 Physical basis of diffusion . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.4.3 Linear diffusion filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
7 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
7.1 Edge-based approaches 151
7.1.1 Introduction to implicit active contours . . . . . . . . . . . . . . . . 151
7.1.2 Implicit representation of a contour . . . . . . . . . . . . . . . . . . . 156
7.1.3 Evolution equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
7.1.4 Discretization of the evolution equation . . . . . . . . . . . . . . . . 160
7.1.5 Geodesic model propagation algorithm . . . . . . . . . . . . . . . 161
7.2 Region-based approaches 163
7.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.2.2 Histogram-based methods . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.2.3 Thresholding by clustering . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7.2.4 Transformation of regions . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
7.2.5 Super-pixels partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
10 3D Visualisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
10.1 Structuring of 3D mesh objects 227
10.2 3D plot of a function z = f (x, y) 229
10.3 Creating complex 3D objects 233
10.3.1 Details on vertex structuring . . . . . . . . . . . . . . . . . . . . . . . . . 233
10.3.2 Details on primitive structuring . . . . . . . . . . . . . . . . . . . . . . . 234
10.3.3 Details on material structuring . . . . . . . . . . . . . . . . . . . . . . . 235
10.3.4 Details on opacity structuring . . . . . . . . . . . . . . . . . . . . . . . . 235
10.4 Visualization of a cardiac segmentation in MRI 236
10.4.1 Description of the input data . . . . . . . . . . . . . . . . . . . . . . . . 236
10.4.2 Extraction of the 3D surface of the ventricle . . . . . . . . . . . . 237
10.4.3 Adding 3D motion vectors . . . . . . . . . . . . . . . . . . . . . . . . . . 238
10.4.4 Adding cutting planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
10.4.5 Final result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
Preface
It is with great pleasure that I accepted to preface this book, which is the successful
outcome of several years of research and experience by a trio of authors who are
experts in digital image processing, combining both its theoretical aspects and its
software implementations.
After promising results, David continued his work in a PhD thesis, under my supervision. And very quickly, it became obvious that we needed to develop a reference
C/C++ library to process images with more than three channels or volume images
with any values (matrices, tensors, . . . ) to develop our research work.
Through the implemented algorithms, tests, successes and failures, David gradually built his own personal C++ library of reusable features, in order to complete
his thesis work. The originality of David’s research work, the need to optimize and
develop software that survives his PhD period and that is “reusable” by the members
of the team constitute in my opinion the basis of the CImg library’s genesis.
At the end of David’s thesis, the ease of use of CImg had already seduced the new
PhD students and permanent members of the team. At the end of 2003, we decided,
in agreement with Inria’s development department, to distribute CImg more widely
as free software, naturally using the new French free license CeCILL, which had just
been created jointly by Inria, CEA and CNRS.
More than 20 years after its first lines of code, CImg is now an image processing
library used by thousands of people around the world, at the heart of dozens of free
projects, and just as importantly, continuously and actively maintained.
At the origin of this remarkable success is, first of all, the nature and quality of the methodological work carried out throughout the doctoral program, as well as its implementation, guided by the development of processing algorithms that must work on images of many types and modalities: from the field of computer vision (cameras, video, velocity fields) as well as from the satellite or medical fields, in particular neuroimaging, with magnetic resonance diffusion imaging and its well-known model, the diffusion tensor.
This aspect of data genericity was very quickly a central element in the design and
success of the library. With a focus on simplicity of design and use, and a constant
and coherent development of the library API, the authors have clearly succeeded in
coupling ease of use with the genericity of the processing that the library allows. The
free distribution of the library has allowed the academic world, as well as the research
and industrial world, to discover the prototyping and implementation of efficient image
processing algorithms in a gentle and enjoyable way.
For teachers, researchers, students or engineers, this book will provide you with
an introduction to the vast field of image processing, as well as an introduction to the
CImg library for the development of state-of-the-art algorithms.
This book is expected to spark new passions for image processing, e.g., for beginners or more experienced C++ developers who are interested in getting started in this
discipline. But this book will also shed new light on the field of image processing for
users and readers interested in recent advances in artificial intelligence, deep learning,
and neural networks. You will learn, for example, that it is not necessary to have a
neural network with 500 million weights, nor a million training images, to extract
the edges of an image, to segment it, to detect geometric features as segments and
circles, or objects located in it, to estimate displacement vectors in video sequences,
etc. And even better, you will be able to study the implementations of corresponding
algorithms, disseminated and explained throughout this book, made with the CImg
library, while testing them on your own data.
Rachid Deriche
Sophia Antipolis, June 22, 2022.
Preamble
The practical realization of these methods depends on the nature of the image. In the vast majority of cases, images are in digital form, i.e., sampled and quantized signals. Digital image processing then consists in running processing algorithms on digital machines (computers or dedicated circuits).
WHAT IS AN IMAGE?
An image is a d-dimensional signal. In order to process it, we associate this signal with an abstract mathematical object. In this book we will only consider deterministic signals, to which we associate a function; random or stochastic signals would instead be modeled, for example, by random processes.
Continuous image:   I : Z ⊂ R^d → R^c,   (x_1, . . . , x_d) ↦ I(x_1, . . . , x_d)
Digital (or numerical) image:   I : Ω ⊂ N^d → Z^c,   [i_1, . . . , i_d] ↦ I[i_1, . . . , i_d]        (1)

For example, a 2D color image goes, through sampling and quantization, from
I : Z ⊂ R^2 → R^3, (x, y) ↦ I(x, y)   to   I : Ω ⊂ N^2 → Z^3, [i, j] ↦ I[i, j].
The conversion of a continuous signal (or image) to a digital (or numerical) signal
(or image) is carried out in two stages (Fig. 2):
• Sampling: discretize the evolution parameters (time, distances) ;
• Quantization: discretize the signal values.
(Figure 2: illustration of the sampling and quantization of a one-dimensional signal.)
With this book, we would like to offer you an enchanting, yet pragmatic walk
through the wonderful world of image processing:
This intertwining of theory and implementation is the essence of this book, and its
content is therefore intended for a variety of readers:
It is important to underline that we will only use simple concepts of the C++ lan-
guage and that the proposed programs will therefore be readable enough to be easily
transcribed into other languages if necessary. The CImg library, on which we rely,
has been developed for several years by researchers in computer science and image
processing (from CNRS - French National Centre for Scientific Research, INRIA -
French National Institute for Research in Digital Science and Technology, and the
University), mainly to allow rapid prototyping of new algorithms. It is also used as
a development tool in the practical work of several courses given at the bachelor’s,
master’s or engineering school level. Its use is therefore perfectly adapted to the
pedagogical approach that we wish to develop in this book.
The book is structured to allow, on the one hand, a quick appropriation of the
concepts of the CImg library (which motivates the first part of this book), and on
the other hand, its practical use in many fields of image processing, through various
workshops (constituting the second part of the book). The set of examples proposed,
ranging from the simplest applications to more advanced algorithms, helps develop a joint know-how in theory, algorithms and implementation in the field of image processing. Note that all the source codes published in this book are also available in
digital format, on the following repository:
https://github.com/CImg-Image-Processing-Book.
Image processing took off in the 1960s, with the advent of computers and the devel-
opment (or rediscovery) of signal processing techniques (the Fourier transform, for
example). From then on, in all the domains that this book proposes to approach in its
second part, many algorithms have been developed, ever more powerful and precise, able to process images of ever increasing size and in ever greater numbers.
In parallel to this development, since the 2000s, machine learning, and more
particularly deep learning, has achieved unequalled performance in computer vision,
even surpassing human capacities in certain areas. A deep neural network is now able
to annotate a scene by identifying all the objects, to realistically colorize a grayscale
image, or to restore highly noisy images.
So why are we still interested in “classical” image processing ? The shift from
image processing to deep learning is accompanied by a paradigm shift in data process-
ing: classical techniques first compute features on the original images (Chapter 6) and
then use them for the actual processing (segmentation: Chapter 7, tracking: Chapter
8, . . . ). The computation of these features is possibly preceded by pre-processing
(Chapters 3, 4 and 5) facilitating their extraction. In contrast, deep learning learns
these features, most often through convolution layers in deep networks, and uses these
learned features to perform processing.
And this is where the major difference comes in: to perform its task, the deep
network must learn. And to do this, it must have a training set made up of several
thousands (or even millions) of examples, telling it what it must do. For example, to
be able to recognize images of cats and dogs, the network must learn on thousands of
pairs (x, y), where x is an image of a cat or a dog and y is the associated label, before
being able to decide on an unknown image (Fig. 1.1).
However, obtaining this data is far from being an easy task. If, in some domains
(such as object recognition), well-established labeled databases are available (for
example, ImageNet1 , composed of more than 14 million different images distributed
in 21,800 categories), it is most often very difficult, if not impossible, to build a sufficiently large and well-curated training set to train a neural network.
Moreover, beyond this problem of training data availability, deep neural networks
often require, during their learning phase, significant computing power and hardware resources (via the use of GPUs - Graphics Processing Units - and TPUs - Tensor Processing Units), power that the student, or the engineer in search of a quick result, will not necessarily have at their disposal.
1 http://www.image-net.org
(Figure 1.1: the learning step and the inferring step of a deep neural network.)
So, even if the field of deep neural network learning has been expanding rapidly
for the last twenty years, and provides impressive results, classical image processing
has certainly not said its last word!
Among the plethora of existing programming languages, the C++ language has
the following advantages:
• The use of C++ templates eases the manipulation of generic image data, for
example, when the pixel values of images you process have different numerical
types (Boolean, integer, floating point, etc.).
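As a minimal sketch of this point (anticipating the CImg<T> class presented later; the function name and image sizes are illustrative), a single function template can operate on images of any pixel type:
#include "CImg.h"
using namespace cimg_library;

// A single generic function usable with any pixel type T.
template<typename T>
double dynamic_range(const CImg<T>& img) {
  return (double)img.max() - (double)img.min();
}

int main() {
  CImg<unsigned char> img8(64,64,1,1,0); // 8-bit grayscale image
  CImg<double> imgd(64,64,1,1,0.0);      // Double-precision grayscale image
  img8.rand(0,255);
  imgd.rand(0,1);
  return (dynamic_range(img8)>=0 && dynamic_range(imgd)>=0) ? 0 : 1;
}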
One has to realize that writing such features from scratch is actually a tedious
task. Today, classic file formats have a rather complex binary structure: images are mostly stored on disk in compressed form, and each format uses its own compression method, which can be lossy or lossless. In practice, each image format is associated with an advanced third-party library (e.g., libjpeg, libpng, libtiff, . . . ), each being focused on loading and saving image data in its own file format. Similarly, displaying an image in a window is a more complex task than it seems, and is always done through the use of third-party libraries, either specialized in “raw” display
(libX11, Wayland under Unix, or gdi32 under Windows), or in the display of
more advanced graphical interfaces with widgets (GTK, Qt, . . . ).
Finally, basic processing algorithms themselves are not always trivial to implement,
especially when optimized versions are required. For all these reasons, one usually
resorts to a high-level third-party library specialized in image processing, to work
comfortably in this domain in C++.
Why only these six libraries? Because they are well-established ones (all of them
existing for more than 15 years), widely used in the image processing community
and therefore well-proven in terms of performance and robustness. They are also still
under active development, free to use, multi-platform, and extensive enough to allow
the development of complex and diversified image processing programs. We have
voluntarily put aside libraries that are either distributed under a proprietary license,
or that are too young, not actively maintained, or have too restrictive an application domain (for example, libraries that are only capable of reading/writing images in a few file formats, or that offer too limited a set of image processing algorithms).
This diversity of choice actually reflects the various application domains that were
initially targeted by the authors of these different libraries.
Figure 1.2 – Logos of the main open-source C++ libraries for image processing (note that the libvips library does not have an official logo).
CImg is a lightweight C++ library, which has been around for more than 20
years. It is a free library, whose source code is open (distributed under the CeCILL-C
open-source license), and which runs on various operating systems (Windows, Linux,
Mac OSX, FreeBSD, etc.). CImg gives the programmer access to classes and methods
for manipulating images or sequences of images, an image being defined here in the
broadest sense of the term, as a volumetric array with up to three spatial coordinates
(x, y, z) and containing vector values of any size and type. The library allows the
programmer to be relieved of the usual “low-level” tasks of manipulating images on
a computer, such as managing memory allocations and I/O, accessing pixel values,
displaying images, user interaction, etc. It also offers a fairly complete range of
common processing algorithms, in several areas including:
Compared to its competitors, the properties of the CImg library make it particularly
interesting in a pedagogical context such as the one we want to develop with this book:
• CImg is powerful. Most of its algorithms can be run in parallel, using the
different cores of the available processor(s). Parallelization is done through the
use of the OpenMP library, which can be optionally activated when compiling a
CImg-based program.
• CImg is an open source library, whose development is currently led by the
GREYC (Research lab in digital science of the CNRS), a public research lab-
oratory located in Caen, France. This ensures that the development of CImg
is scientifically and financially independent from any private interest. The
source code of CImg is and will remain open, freely accessible, studyable by
anyone, and thus favoring the reproducibility and sharing of image processing
algorithms. Its permissive free license (CeCILL-C) authorizes its use in any
type of computer program (including those with closed source code, intended
to be distributed under a proprietary license).
All these features make it an excellent library for practicing image processing in C++,
either to develop and prototype new algorithms from scratch, or to have a complete and
powerful collection of image processing algorithms already implemented, immediately
usable in one’s own programs.
The CImg API is simple: the library exposes four classes (two of them with a
template parameter) and two namespaces (Fig. 1.3).
• cimg_library: this namespace includes all the classes and functions of the
library. A source code using CImg usually starts with the following two lines:
#include "CImg.h"
using namespace cimg_library;
Thus, the programmer will have direct access to the library classes, without
having to prefix them with the namespace identifier cimg_library::.
• cimg: this namespace contains some utility functions of the library, which
are not linked to particular classes, and which can be useful for the devel-
oper. For example, functions cimg::sqr() (returns the square of a number),
cimg::factorial() (returns the factorial of a number), cimg::gcd()
(returns the greatest common divisor between two numbers) or
cimg::maxabs() (compute the maximum absolute value between two num-
bers) are some of the functions defined in the cimg:: namespace.
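A short illustrative sketch of calling these utility functions directly through the namespace (the printed values are indicative):
#include <cstdio>
#include "CImg.h"
using namespace cimg_library;

int main() {
  double s = cimg::sqr(3.0);      // 9
  double f = cimg::factorial(5);  // 120
  long g = cimg::gcd(12,18);      // 6
  std::printf("sqr: %g, factorial: %g, gcd: %ld\n",s,f,g);
  return 0;
}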
CImg defines four classes:
• CImg<T>: this is the most essential and populated class of the library. An
instance of CImg<T> represents an “image” that the programmer can manip-
ulate in his C++ program. The numerical type T of the pixel values can be
anything. The default type T is float, so we can write CImg<> instead of
CImg<float>.
• CImgList<T>: this class represents a list of CImg<T> images. It is used for
example to store sequences of images, or sets of images (that may have different
sizes). The default type T is float, so you can write CImgList<> instead of
CImgList<float>.
• CImgDisplay: this class represents a window that can display an image on
the screen, and interact through user events. It can be used to display animations
or to create applications requiring some user interactions (e.g., placement of
key points on an image, moving them, . . . ).
• CImgException: this is the class used to handle library exceptions, i.e.,
errors that occur when classes and functions of the library are misused. The pro-
grammer never instantiates objects of this class, but can catch the corresponding
exceptions raised with this class by the library to manage errors.
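The following sketch (window title, image sizes and values are arbitrary) shows how these four classes typically appear together in a small program:
#include <cstdio>
#include "CImg.h"
using namespace cimg_library;

int main() {
  try {
    CImg<unsigned char> img(320,200,1,3,0);    // A 320x200 RGB image filled with 0
    img.rand(0,255);                           // Random pixel values
    CImgList<> list(4,64,64,1,1);              // A list of four 64x64 float-valued images
    list[0].fill(1);                           // Fill the first image of the list with 1
    CImgDisplay disp(img,"Four classes demo"); // A window displaying 'img'
    while (!disp.is_closed()) disp.wait();     // Minimal event loop
  } catch (CImgException &e) {                 // Errors raised by the library
    std::fprintf(stderr,"CImg error: %s\n",e.what());
  }
  return 0;
}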
This concise design of the library makes it easy to learn, even for novice C++ pro-
grammers.
At first sight, this conception may seem surprising: in C/C++, the libraries that
one encounters are generally organized in the form of one or more header files (most
often, one header file per different structure or class defined by the library), completed
by a binary file (.a or .so files under Linux, .lib or .dll files under Windows),
which contains the library’s functions in compiled form.
Our teaching experience with CImg has shown that the first question raised by
new users of the library is: “Why is everything put in one file?”. Here we answer this
frequent question, by listing the technical choices justifying this kind of structuring,
and by pointing out the advantages (and disadvantages) of it. The global answer takes
into account several different aspects of the C++ language, and requires consideration
of the following points:
Because the library is generic. The CImg<T> image and CImgList<T> image
structures exposed by the library have a template parameter T, which corresponds
to the type of pixels considered for these images. However, the types T that will be
selected by the user of CImg classes are a priori unknown.
Of course, the most commonly used types T are in practice the basic C++ types
for representing numbers, i.e.,: bool (Boolean), unsigned char (unsigned 8-bit
integer), unsigned short (unsigned 16-bit integer), short (signed 16-bit inte-
ger), unsigned int (unsigned 32-bit integer), int (signed 32-bit integer), float
(float value, 32-bit), double (float value, 64-bit), etc. However, it is not uncommon to see source codes that use images of other types, such as CImg<void*>, CImg<unsigned long long> or CImg<std::complex<float>>.
One might think that pre-compiling the methods of the two classes CImg<T> and
CImgList<T> for these ten or so most common types T would be a good idea. This
is to overlook the fact that many of these methods take as arguments values whose
types are themselves CImg<t> images or CImgList<t> image lists, with another
template parameter t potentially different from T.
One can easily see that the multiplicity of possible combinations of types for the
arguments of the library’s methods makes it unwise to precompile these functions in
the form of binary files. The size of the generated file(s) would simply be huge, and
the functions actually used by the programmer would in practice only represent a tiny
portion of the pre-compiled functions.
The correct approach is therefore to let the compiler instantiate the methods and
functions of the CImg classes only for those combinations of template types that are
actually exploited in the user’s program. In this way, lighter and more optimized
binary objects or executables are generated, compared to what would be done with
a static binding to a large pre-compiled library. The main disadvantage is that the
functions of the CImg library used in the program must be compiled at the same time
as those of the program itself, which leads to an additional compilation overhead.
First, because unlike the C++ standard library, CImg defines only four different
classes, which turn out to be strongly interdependent. Moreover, the algorithms
operating on these class instances are defined as methods of the classes, not as external
functions acting on “containers”. This differs a lot from how the C++ standard library
is designed.
In practice, methods of the CImg<T> class need methods of CImgList<T>
(even if this is sometimes invisible to the user), simply because implementations
of CImg<T> methods require the functionality of CImgList<T> (and vice versa).
Similarly, CImgException is a ubiquitous class in CImg, since it is used to handle
errors that occur when library functions are misused. If the programmer does not want
to handle these errors, this class might seem useless to include. However, it is required
during compilation, since it is obviously used by the library core, which is, after all,
compiled at the same time as the user’s program.
This class interdependence means that if we wanted to have one header file per
CImg class, the first thing it would do is probably include the header files for the other
classes. From a purely technical point of view, the gain from such a split would be
null: the four header files would be systematically included as soon as only one of
the classes in the library is used. In consequence, CImg proposes only one header file,
rather than one per class, without any real consequences on the compilation time.
But the fact that CImg is distributed in the form of a single header file is not only
due to the satisfaction of technical constraints bound to the C++ language. In practice,
this is indeed an undeniable advantage for the library user:
• Easy to install: copying a single file to a folder to get access to the functions
of a complete image processing library is comfortable (a thing that few current
libraries actually offer).
will tell CImg to use the functions of the libtiff and libjpeg libraries
when it needs to read or write images in TIFF or JPEG format (it is then of
course necessary to link the generated binary, statically or dynamically, with
these two libraries). There are a lot of such configuration macros that can be set
to activate specific features of CImg, when compiling a program.
On the other hand, this means that it is also possible to compile a CImg-based
program without activating any dependencies on external libraries. This flexibil-
ity in the control of dependencies is very important: using CImg does not imply
an automatic dependency on dozens of “low-level” third-party libraries, whose
functionalities might not be used by the programmer. Yet this is what happens
with most of the competing image processing libraries!
But one of the great strengths of the CImg library is its ease of use, and its ability to
express image processing algorithms in a clear and concise way in C++. This is what
we will show you in the rest of this book.
2. Getting Started with the CImg Library
Figure 2.1 – Goal of our first CImg-based program: decompose an image into blocks
of different sizes, and visualize the result in an interactive way.
The most informed readers will notice a parallel with the so-called quadtree
decomposition. The same type of decomposition is proposed here, but by putting
aside the tree structure of the decomposition (so, in practice, we only keep the leaves
of the quadtree).
All the necessary third-party libraries are expected to be installed, so let’s write
our first code:
Code 2.1 – A first code using CImg.
// first_code.cpp:
// My first code using CImg.
#include "CImg.h"
using namespace cimg_library;
int main() {
CImg<unsigned char> img("kingfisher.bmp");
img.display("Hello World!");
return 0;
}
As you can guess, the purpose of this first program is to instantiate an image object of
type CImg<unsigned char> (i.e., each pixel channel stored as an 8-bit unsigned
integer), by reading the image from the file kingfisher.bmp (the bmp format is
generally uncompressed, so it doesn’t require any additional external dependencies),
and displaying this image in a window.
In order to compile this program, we must specify that the program has to be
linked with the necessary libraries for display. Under Linux, with the g++ compiler,
we will for instance write the following minimal compilation command:
$ g++ -o first_code first_code.cpp -lX11 -lpthread
Under Windows, with the Visual C++ compiler, we will write in a similar way:
> cl.exe /EHsc first_code.cpp /link gdi32.lib user32.lib
shell32.lib
Running the corresponding binary does indeed display the image kingfisher.bmp
in an interactive window, and it allows us to explore the pixel values and zoom in
to see the details of the image (Fig. 2.2). At this point, we are ready to use more
advanced features of CImg to decompose this image into several blocks.
With CImg, obtaining such an image of smoothed and normalized brightness can
be written as:
CImg<> lum = img.get_norm().blur(sigma).normalize(0,255);
Here, we notice that the calculation of the lum image is realized by pipelining three
calls to methods of the CImg<T> class:
Thus, with a single line of code, we have defined a processing pipeline that returns
an image of type CImg<> (i.e., CImg<float>), providing information about the
luminosity of the colors in the original image (Fig. 2.3). The CImg architecture makes
it very easy to write this kind of pipelines, which are often found in source codes
based on this library.
Figure 2.3 – Computation of the lum image of color brightness, from an input image.
Now let’s look at the variations of this brightness image. Since the calculation
of the gradient ∇I is a basic operation in image processing, it is already implemented
in CImg via the method CImg<>::get_gradient(), that we are going to use
here:
CImgList<> grad = lum.get_gradient("xy");
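The gradient norm image (named normGrad in the rest of this chapter) can then be written, for instance, as:
CImg<> normGrad = (grad[0].get_sqr() + grad[1].get_sqr()).sqrt();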
CImg has many methods for applying mathematical functions to pixel values, and
the usual arithmetic operators are redefined to allow writing such expressions. Here,
calls to the CImg<float>::get_sqr() method return images where each pixel
value has been squared. These two images are then summed via the CImg method
CImg<float>::operator+() which returns a new image of the same size. Finally, CImg<float>::sqrt() replaces each value of this summed image by its square root.
Here again, we chose to use the get and non-get versions of the methods in
order to minimize the number of image copies. With this in mind, we can even use
CImg<float>::operator+=(), which can be seen as the non-get version of
CImg<float>::operator+(), and which avoids an additional creation of an
image in memory:
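// One possible form (a sketch equivalent to the previous expression):
CImg<> normGrad = grad[0].get_sqr(); // Squared x-derivative (new image)
normGrad += grad[1].get_sqr();       // In-place sum: no extra temporary image
normGrad.sqrt();                     // In-place square root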
Figure 2.4 shows a detail of the different images of variations that we obtain. Re-
member: at any time, in a CImg program, the content of an image or a list of images can
be displayed by calling the CImg<T>::display() and CImgList<T>::display()
methods, which proves to be very useful for checking the correct step-by-step implementation of a program.
Figure 2.4 – Computation of the gradient image and its norm, from image lum (detail): a) lum, b) grad[0] = ∂I/∂x, c) grad[1] = ∂I/∂y, d) normGrad = ‖∇I‖.
To store the whole set of blocks, we use a CImgList<int>: each element (of
type CImg<int>) of this list will represent a block, stored as an image of size 1 × 4
containing the coordinates (x0, y0, x1, y1) of the top-left and bottom-right corners of
the block. To initialize our algorithm, we can therefore write:
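CImgList<int> blocks;   // Same initialization as in the final listing of Section 2.7
CImg<int>::vector(0,0,img.width() - 1,img.height() - 1).move_to(blocks);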
In a second step, we iterate over the existing blocks in the list blocks, inserting
the new blocks resulting from the successive subdivisions at the end of the list (and
deleting from the list the blocks that have been subdivided):
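// Sketch of this subdivision loop: the 'threshold' variable (the user-defined
// variation threshold) and the exact loop structure are assumptions consistent
// with the description below.
for (unsigned int l = 0; l<blocks.size(); ) {
  int
    x0 = blocks[l][0], y0 = blocks[l][1],
    x1 = blocks[l][2], y1 = blocks[l][3];
  if (std::min(x1 - x0,y1 - y0)>8 &&
      normGrad.get_crop(x0,y0,x1,y1).max()>threshold) {
    int xc = (x0 + x1)/2, yc = (y0 + y1)/2;                  // Center of the block
    CImg<int>::vector(x0,y0,xc,yc).move_to(blocks);          // Top-left sub-block
    CImg<int>::vector(xc + 1,y0,x1,yc).move_to(blocks);      // Top-right sub-block
    CImg<int>::vector(x0,yc + 1,xc,y1).move_to(blocks);      // Bottom-left sub-block
    CImg<int>::vector(xc + 1,yc + 1,x1,y1).move_to(blocks);  // Bottom-right sub-block
    blocks.remove(l);                    // Delete the block that has been subdivided
  } else ++l;
}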
The central part of this code portion concerns the block subdivision test, where
two criteria must be met. On the one hand,
std::min(x1 - x0,y1 - y0)>8
requires that both the width and height of a block must be greater than 8 pixels.
On the other hand, the condition
normGrad.get_crop(x0,y0,x1,y1).max()>threshold
selects the blocks whose maximum variation value ‖∇I‖ is greater than the threshold
(set by the user). Note the chaining of CImg methods in this last expression:
CImg<float>::get_crop() first returns the portion of the normGrad image
that corresponds only to the considered block (in the form of a new temporary im-
age CImg<float>). Then CImg<float>::max() returns the maximum value
encountered in this sub-image.
The subdivision of a block is done by calculating first the coordinates of its center
(xc,yc), then splitting it into four equal parts, which are added to the list of blocks
to be further examined.
Now we have a list of blocks and their coordinates. What remains is visualizing
them as an image such as the one presented in Fig. 2.1b.
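A plausible form of the allocation of this rendering image (named res in the following) is:
CImg<unsigned char> res(img.width(),img.height(),1,3,0);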
Here we have used one of the basic constructors of the library, which takes as argu-
ments the desired dimension of the image to allocate, according to its different axes:
• Its width, defining the number of columns of the image (here chosen to be W ),
i.e., its dimension along the x-axis.
• Its height, defining the number of rows of the image (here chosen to be H),
i.e., its dimension along the y-axis.
• Its depth, defining the number of slices of the image (here chosen to be 1),
i.e., its dimension along the z-axis.
A number of slices greater than 1 is found when storing and manipulating
volumetric images, composed of voxels (rather than pixels). This type of data is
often encountered in medical imaging (MRI images for instance). It can also be
encountered when processing image sequences, the key images of the animation
being stored as a set of slices in a volumetric image. For “conventional” images
(in two dimensions), the number of slices is always equal to 1.
• Its number of channels, denoted as spectrum, defining its dimension along
the c-axis (here chosen to be 3). A scalar image has a single channel, a color
image has three, if it is stored in the form of RGB pixels (more rarely 4, in the
case of images encoded in CMY K or RGBA, i.e., with an alpha channel). Here,
we wish to draw RGB colored blocks in the image res, so we provide it with 3
channels.
• And finally, the default value of its pixels (here chosen at 0, corresponding to a
default background colored in black).
Keep in mind that a CImg<T> image always has four dimensions, no more, no less:
three of these dimensions have a spatial meaning (width, height and depth), the remaining dimension has a spectral meaning (number of channels). Furthermore, CImg
will never try to store the meaning of what the pixel or voxel values represent in an
image. Thus, the pixels of a three-channel CImg<T> image can represent as well
RGB colors, as 3D point coordinates (x, y, z), or as probabilities of occurrence of three
distinct events. It is up to the user of the library to know the meaning of the pixels he
manipulates. The good news is that he actually knows it 100% of the time! In the
case of the image res, we decide that the pixels of the image represent colors stored
in RGB, with 256 possible integer values per component (hence the choice of type
CImg<unsigned char>, with 3 channels to define this image).
Now that the image res is allocated, let’s draw in it the different blocks calcu-
lated in the previous step, using CImg<T>::draw_rectangle(), which is the
appropriate method for this task:
Code 2.3 – Rendering of the block decomposition of the image.
cimglist_for(blocks,l)
{
CImg<int>& block = blocks[l];
int
x0 = block[0], y0 = block[1],
x1 = block[2], y1 = block[3];
CImg<unsigned char> color(1,1,1,3);
color.rand(0,255);
res.draw_rectangle(x0,y0,x1,y1,color.data(),1);
}
In this way, we will go through all the elements of the list of blocks.
CImg defines many useful macros that simplify the writing of loops iterating
over objects of type CImg<T> and CImgList<T>. The most common loop macros
are cimg_forX() (loop over image columns), cimg_forY() (loop over rows),
cimg_forXY() (loop over rows and columns), cimg_forC() (loop over channels), and cimglist_for() (loop over image list items). In practice, more than
300 loop macros are defined in CImg and using them helps having a code that looks
more concise and readable.
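For instance, a short sketch using two of these macros to fill a small image with a horizontal gradient (the image size is arbitrary):
CImg<unsigned char> ramp(256,64,1,3,0);
cimg_forXY(ramp,x,y)              // Loop over all pixels (x,y)
  cimg_forC(ramp,c)               // Loop over the channels
    ramp(x,y,0,c) = (unsigned char)x;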
In order to better visualize the different blocks, especially neighboring blocks with
similar colors, we now add a one-pixel thick black border around each block. To do
that, we want to fill in black all pixels whose direct neighbor to the right or bottom
has a different color. This problem can be approached in several ways with CImg. We
propose here three variations which introduce different concepts of the library.
1. A first method would be to multiply our resulting color image by a mask image,
whose pixels are either 0 (on the borders) or 1 (elsewhere). This mask image
can be calculated in a clever way, from the arithmetic difference between the color image and its copy translated by one pixel up and one pixel to the left. All
the non-zero vectors of this difference image correspond to the desired border
pixels. The norm of this difference image can therefore be calculated as:
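// Norm of the difference image, clamped to [0,1]: non-zero exactly on border pixels
// (the same expression appears in the one-line version given below).
CImg<> mask = (res.get_shift(1,1,0,0,0) - res).norm().cut(0,1);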
In our final rendering, we want the opposite (pixels with value 0 on the borders, and 1 elsewhere), so we can multiply our rendered color image by 1 - mask. All this can be done in a single line of code:
res.mul(1 - (res.get_shift(1,1,0,0,0) - res).norm().cut(0,1));
2. A second, even more concise method, uses one of the variants of the
CImg<unsigned char>::fill() methods, capable of evaluating a math-
ematical expression, potentially complex, for each pixel of an image.
Thus we can write:
res.fill("I*(I!=J(1,0) || I!=J(0,1)?0:1)",true);
3. The third method introduces a particular type of CImg loops: loops on neighborhoods. We have already mentioned the existence of simple loop macros,
but there are also more sophisticated macros, allowing not only to explore each
pixel independently, but also to give access at any time to the values of the
neighboring pixels. For instance, we can fill our image mask this way:
CImg<unsigned char>
mask(res.width(),res.height(),1,1,1),
V(3,3);
cimg_forC(res,c) cimg_for3x3(res,x,y,0,c,V,unsigned char)
if (V[4]!=V[5] || V[4]!=V[7]) mask(x,y) = 0;
Once our mask image has been computed, we just have to multiply it point to
point with the res image, obtained by the block decomposition.
res.mul(mask);
Note that we don’t use the multiplication operator * between two images. This
operator exists in CImg, but corresponds to the matrix multiplication of two
images (seen as matrices).
Note also that this kind of loop macro cimg_forNxN() exists for a large
number of different N. 3D versions of these neighborhood loops are also
available, with macros such as cimg_forNxNxN().
(Figure: a) rendering of the blocks with random colors; b) rendering of the blocks with averaged colors.)
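The display window itself is an instance of CImgDisplay; a minimal sketch of its construction (the window title is arbitrary) is:
CImgDisplay disp(res,"Block decomposition",0);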
The constructor takes as argument the image to be displayed, the desired title appearing
in the display window bar, as well as a normalization mode for the values, chosen here
to be 0, which indicates that we do not want CImgDisplay to manage the normalization of the values before displaying them (because we will take care of it ourselves).
normalization is done internally to the CImgDisplay class and is not really applied to the pixels of the image being displayed (i.e., the values of the CImg<T> or CImgList<T> instance displayed remain unchanged in practice).
• normalization=2: a linear normalization of the values is also performed in
this mode, but with normalization parameters calculated at the first display only.
The CImgDisplay class then reuses these parameters when displaying other
images in the same display (even if the min/max values of these images are
different). This is useful when one wants to keep a coherence of the gray levels
or colors when displaying sequences of images whose minimum and maximum
values may change over time.
• normalization=3: this is the automatic mode (also the default mode in
CImgDisplay constructors). It is equivalent to one of the three previous
normalization behaviors, chosen according to the type T of the image values.
More precisely, in this mode, the display of a CImg<T> in a CImgDisplay
will correspond to the mode:
• normalization=0, if type T is an 8-bit unsigned integer (i.e., an
unsigned char).
• normalization=1, if T is a floating-point value type (float, double,
. . . ).
• normalization=2, for other integer types (char, int, . . . ).
Once the CImgDisplay instance has been constructed, the image res is dis-
played in a new window that appears on the screen. On the contrary, when the
destructor ~CImgDisplay() is called, this window disappears. Between these two
states, we can manage an event loop, which will drive the behavior of the window
according to the user actions. The simplest event loop can be written as:
while (!disp.is_closed() && !disp.is_keyESC()) {
disp.wait(); // Wait for an user event
}
This loop does nothing but wait for the user to close the window, or press the Escape
key on the keyboard. The call to disp.wait() is blocking and returns only when an event occurs. While waiting, it does not consume any CPU resources.
We now detect if the mouse cursor is over a block. The cursor position is retrieved
by methods CImgDisplay::mouse_x() and CImgDisplay::mouse_y():
int
x = disp.mouse_x(),
y = disp.mouse_y();
if (x>=0 && y>=0) {
// Mouse cursor is over the display window.
}
If the cursor is above the displayed window, the values of x and y give the coordinates
of the cursor (relative to the top-left corner of the displayed image, having coordinates
(0, 0)). If the cursor is outside the window, x and y are both equal to -1.
We want to display the content of the original color image corresponding to the
block under which the mouse is positioned, as shown in Fig. 2.6. How can we simply
find the coordinates (x0, y0) − (x1, y1) of the block under the mouse cursor? We will
slightly modify our block image rendering code, to fill, in addition to the res image,
a second image, named coords, defined as follows:
CImg<int> coords(img.width(),img.height(),1,4,0);
The coords image has four channels, with integer values, which contain for each
pixel the coordinates of the decomposed block to which it belongs, namely the values
(x0, y0, x1, y1):
cimglist_for(blocks,l)
{
CImg<int>& block = blocks[l];
int
x0 = block[0], y0 = block[1],
x1 = block[2], y1 = block[3];
CImg<unsigned char> color(1,1,1,3);
color.rand(0,255);
res.draw_rectangle(x0,y0,x1,y1,color.data(),1);
// Filling image ’coords’:
coords.draw_rectangle(x0,y0,x1,y1,block.data());
}
In the event loop, we can now easily retrieve the coordinates (x0, y0) − (x1, y1) of the
block pointed by the mouse, as well as the center of this block (xc, yc):
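// Sketch: read the four channels of 'coords' at the cursor position
// (the variable names follow the text; the exact listing may differ).
int
  x0 = coords(x,y,0,0), y0 = coords(x,y,0,1),
  x1 = coords(x,y,0,2), y1 = coords(x,y,0,3),
  xc = (x0 + x1)/2, yc = (y0 + y1)/2;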
All that remains is to retrieve the portions of the original image img and the image
of the variations norm corresponding to the current block, to resize them to a fixed
size (here 128 × 128), and draw them on a copy of the image res, intended to be
displayed in the display window disp:
CImg<unsigned char>
pImg = img.get_crop(x0,y0,x1,y1).resize(128,128,1,3,1),
pGrad = normGrad.get_crop(x0,y0,x1,y1).resize(128,128,1,3,1).
normalize(0,255).
map(CImg<unsigned char>::hot_LUT256());
Note that in the meantime, the values of the region extracted from the normGrad
variation image are normalized, mapped to colors using a hot colormap, i.e., a classic
palette of 256 colors ranging from black to white through red and yellow. The pGrad
image thus becomes a normalized color image ready to be displayed (whereas the normGrad image was a scalar image, with a range of values different from ⟦0, 255⟧).
We can now create the visual rendering of our image to be displayed:
Code 2.5 – Visual rendering for interactive display.
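The listing chains several drawing calls on a copy of res, as in the final source code of Section 2.7; the definitions of the white and black colors are assumptions (8-bit RGB triplets):
unsigned char white[] = { 255,255,255 }, black[] = { 0,0,0 }; // Assumed color definitions
(+res).
draw_text(10,3,"X, Y = %d, %d",white,0,1,24,x,y).
draw_rectangle(x0,y0,x1,y1,black,0.25f).
draw_line(74,109,xc,yc,white,0.75,0xCCCCCCCC).
draw_line(74,264,xc,yc,white,0.75,0xCCCCCCCC).
draw_rectangle(7,32,140,165,white).
draw_rectangle(7,197,140,330,white).
draw_image(10,35,pImg).
draw_image(10,200,pGrad).
display(disp);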
The unary operator operator+() is used here to make a copy of the image
res, in which we will draw several graphic elements. A few things to note:
• Line 4: the call to draw_rectangle() decreases the luminosity of the block
pointed by the mouse, by drawing a black rectangle over it, with a 25% opacity.
• Lines 5 and 6: note here the use of an opacity of 0.75 for the drawing of
the lines going from the center of the block to the two left-hand thumbnails,
Figure 2.6 illustrates the result of the different methods for plotting graphical primitives.
Figure 2.6 – Detail of the role of each method for drawing graphical primitives.
/*
  File: decompose_blocks.cpp
*/
CImgList<int> blocks;
CImg<int>::vector(0,0,img.width() - 1,img.height() - 1).
move_to(blocks);
CImg<unsigned char>
pImg = img.get_crop(x0,y0,x1,y1).resize(128,128,1,3,1),
pGrad = normGrad.get_crop(x0,y0,x1,y1).resize(128,128,1,3,1).
normalize(0,255).
map(CImg<unsigned char>::hot_LUT256());
(+res).
draw_text(10,3,"X, Y = %d, %d",white,0,1,24,x,y).
draw_rectangle(x0,y0,x1,y1,black,0.25f).
draw_line(74,109,xc,yc,white,0.75,0xCCCCCCCC).
draw_line(74,264,xc,yc,white,0.75,0xCCCCCCCC).
draw_rectangle(7,32,140,165,white).
draw_rectangle(7,197,140,330,white).
draw_image(10,35,pImg).
draw_image(10,200,pGrad).
display(disp);
}
return 0;
}
Now you have a better idea of the features offered by the CImg library, as well as
the practical use of its classes and their associated methods. The technical reference
documentation, available on the CImg1 website, lists all the methods and operators
available for each of the four classes CImg<T>, CImgList<T>, CImgDisplay
and CImgException exposed by the library. Reading it will be useful to dive into
the details of all the available methods. So will the developments that follow in this
book!
1 http://cimg.eu/reference
II- Image Processing Using CImg
Introduction
In this section, we illustrate various aspects of the CImg library for image processing,
through several workshops.
In the classic workflow of digital image processing, several steps are generally
used:
• The pre-processing step (i.e., how to improve the image to reach the goal
of the processing): it can be a matter of denoising, improving the contrast,
transforming the initial image into another image, easier to process . . . ;
• The processing step as such: the objective may be to find objects in the image,
to follow objects in a temporal sequence, to classify pixels into different homogeneous groups, to reconstruct a 3D scene, to extract compressed information
from an image composed of many channels . . . ;
• The quantization step, which uses the previous step to provide quantitative
results to measure the processing (e.g., what is the surface of the isolated object?
What is the average speed of the moving object? What is the depth of the object
in the image?. . . ).
Without pretending to be exhaustive, we propose here a few workshops, organized in chapters, that approach classical methods from the wide range of algorithms in the field of image processing. Each workshop starts with a short but necessary introduction to its theoretical aspects, followed by the implementation in CImg of useful algorithms. The source codes presented make it possible to draw an immediate parallel with the equations described in the theoretical part. The results of the algorithms are illustrated with many figures.
3. Point Processing Transformations
where T acts directly on the image I, or on the histogram of I. Gonzalez and Woods
[15] propose many examples of ad hoc transformations. In the following, we give an
overview of some of them.
CImg<> imgIn("butterfly.bmp");
CImgList<> mathOps(imgIn,
imgIn.get_exp()/50,
10*imgIn.get_sqrt(),
(1 + imgIn.get_abs()).log(),
imgIn.get_pow(3));
mathOps.display();
Function        Name
Power           sqr(), sqrt(), pow(double p), pow(CImg<t>& img), pow(const char *expression)
Logarithm       log(), log2(), log10(), exp()
Trigonometry    cos(), sin(), tan(), acos(), asin(), sinc(), atan(), atan2(CImg<t>& img)
Hyperbolic      cosh(), sinh(), tanh(), acosh(), asinh(), atanh()
Min-Max         min(T& value), max(T& value), minabs(T& value), maxabs(T& value)
Various         abs(), sign()
Figure 3.1 – Gamma transform c·i^γ, c ∈ R (curves T(i) plotted for γ ∈ {0.05, 0.2, 0.5, 1, 2, 5, 20}).
For γ < 1, the transformation compresses the dynamic range of I; for γ > 1, it compresses the high grayscale values, and stretches the range of small values.
In the case of non-binary images, these operators act at the level of the bits encoding the intensity values of each pixel.
This is done by setting the least significant bits of these images to 0, then applying a
bitwise OR between the output and the image of the text to produce an image, visually
indistinguishable from the original one, which contains the hidden text (Fig. 3.6c). To
decode the message, a bitwise AND is applied with the value 1 to recover the message
(Fig. 3.6d). Code 3.3 illustrates how easily this operation is performed with CImg.
CImg<unsigned char>
imgIn("butterfly.bmp"),
mess("message.bmp");
// Decoding step.
CImg<unsigned char> res = (imgOut&1) *= 255;
∀ j ∈ [g_i, g_{i+1}],   T_i(j) = a_i · j + b_i
This set of transformations can be used, for example, to calculate the negative
of an image, or to perform thresholding operations. They can be visualized by a
Look-Up Table (LUT) which gives the correspondence between old and new gray
levels (Fig. 3.7). Of course, the mathematical transformations previously described
also define LUTs, corresponding to the graph of the function on the image intensity
domain.
(Fig. 3.7: LUTs of a negative image, a thresholding and a histogram enhancement, mapping original gray levels to new gray levels.)
3.2.1 Definition
The probability distributions of the gray levels can be estimated by measurements on
the histogram of the image. For example, the first order distribution
P (I(x, y) = k)
can be evaluated using a set of D images, D large, representative of a given image
class. The estimated distribution is then
h(k, x, y) = Num(k) / D
where Num(k) is the number of images for which I(x, y) = k. Using the ergodicity
hypothesis, the measure on D images can be replaced by a spatial measure: the first
order probability distribution can then be estimated by
H(k) = Num(k) / |I|
where |I| is the total number of pixels and Num(k) the number of times in I where
I(x, y) = k, for all (x, y).
Definition 3.2.1
H : ⟦0, N − 1⟧ → N,   k ↦ H(k) = Num(k) / |I|
In the case of color images, the histogram generally refers either to the intensity
(luminance) of the pixels or to the histograms of the different color channels.
∑_{i=1}^{N_i} P(I_i = l_i) = 1   and   ∑_{j=1}^{N_j} P(J_j = k_j) = 1
∑_{p=1}^{j} P(J_p = k_p) = ∑_{q=1}^{i} P(I_q = l_q)
The second term is nothing but the cumulative distribution of I and can thus be replaced by the cumulative histogram: ∑_{p=1}^{j} P(J_p = k_p) = ∑_{q=1}^{i} H_I(q).
with continuous densities. If p_j is the output density and H_I the cumulative histogram of I, then
p_j(j) = 1 / (j_max − j_min)   ⇒   j = (j_max − j_min) H_I(i) + j_min   (equalization)
p_j(j) = a·e^{−a(j − j_min)}   ⇒   j = j_min − (1/a) ln(1 − H_I(i))   (exponential output)
Code 3.4 presents a simple version of the histogram equalization. Figure 3.8
illustrates some results. The histograms are plotted using the method:
CImg<T>& display_graph(CImgDisplay &disp, unsigned int plot_type=1,
unsigned int vertex_type=1,
const char *labelx=0, double xmin=0, double xmax=0,
const char *labely=0, double ymin=0, double ymax=0)
The CImg<T> class also proposes a method to directly compute this equalization:
CImg<T>& equalize(unsigned int nb_levels, T value_min, T value_max)
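A usage sketch of this method (the file name is hypothetical) on an 8-bit image:

CImg<unsigned char> img("image.bmp");
// In-place histogram equalization over 256 levels in the range [0,255].
img.equalize(256,0,255);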
/*
  Histogram equalization (Code 3.4).
  The function header and the declarations below are a sketch of the parts of
  the listing not reproduced here.
*/
CImg<> equalizeHistogram(CImg<>& imgIn, unsigned int nb)
{
  float vmin = imgIn.min(), vmax = imgIn.max(), vdiff = vmax - vmin;
  unsigned long size = imgIn.size(), cumul = 0;
  CImg<>
    hist = imgIn.get_histogram(nb,vmin,vmax),
    imgOut(imgIn);
  // Cumulated histogram.
  cimg_forX(hist,pos)
  {
    cumul += hist[pos];
    hist[pos] = cumul;
  }
  if (cumul==0) cumul = 1;
  // Equalized image.
  cimg_foroff(imgIn,off)
  {
    int pos = (int)((imgIn[off] - vmin)*(nb - 1)/vdiff);
    if (pos>=0 && pos<(int)nb)
      imgOut[off] = vmin + vdiff*hist[pos]/size;
  }
  return imgOut;
}
4. Mathematical Morphology
The shape operator Ψ to be designed must then be distributed over the set of
unions and intersections (equivalent to linearity), that is:
• Ψ_δ(X₁ ∪ X₂) = Ψ_δ(X₁) ∪ Ψ_δ(X₂)
• Ψ_ε(X₁ ∩ X₂) = Ψ_ε(X₁) ∩ Ψ_ε(X₂)
The first operation will be called morphological dilation and the second morpho-
logical erosion. These two operations are the basis of mathematical morphology, from
which more complex morphological operators can be designed.
In the following, we specify the notions of erosion, dilation and various other
operations in the context of binary images, and then we extend the discussion to the
more general case of grayscale images.
A ⊕ B = {x : (B′)_x ∩ A ≠ ∅}
A ⊖ B = {x : (B)_x ⊆ A}
B is the shape operator, also known as the structuring element in mathematical mor-
phology.
To be clear, for a binary object A and a binary and symmetric structuring element B,
the simple operations of mathematical morphology consist in going through the image
and considering B as a binary mask: if, centered in x, B intersects A, then the value of
the dilation of A by B in x is 1, and 0 otherwise. Similarly if B is not fully enclosed in
A, then the value of the erosion of A by B in x is 0, and 1 otherwise (Fig. 4.1). Thus,
erosion shrinks A, and dilation extends it, according to B.
When B is not symmetric, the reflection of the structuring element must be taken into
account (Fig. 4.2).
It is easy to show that erosion is the dual transformation of dilation with respect to complementation: A ⊖ B = (Aᶜ ⊕ B)ᶜ. Thus, it is equivalent to erode an object or to dilate its complement.
The implementation in CImg uses the dedicated methods CImg<T>::erode()
and CImg<T>::dilate() (code 4.1).
CImg<unsigned char>
imgErode = img.get_erode(B), // Erosion
imgDilate = img.get_dilate(B); // Dilation
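The structuring element B is itself a small binary image; a sketch of a possible choice (a 3 × 3 square, an arbitrary example not necessarily the one used in the book's figures):

// 3x3 square structuring element filled with 1s.
CImg<unsigned char> B(3,3,1,1,1);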
Figure 4.2 – Erosion and dilation. The object is in white, the background in black,
the erosion and the dilation are represented by the pixels filled with gray disks. The
structuring element B, not symmetrical, is centered on the black box.
The size of the structuring element and its shape greatly influence the results of
morphological operations (Fig. 4.3). In particular, erosion tends to remove details of
object A whose characteristic size is smaller than that of the structuring element (e.g.,
the thinnest lines, or the white circle that disappears almost entirely). Conversely,
dilation enlarges the object.
The opening generally smoothes the contours of a binary shape, breaks the closed
links between objects (isthmuses), and eliminates “small” isolated objects (“small” in
the sense of the structuring element B). The type and the amount of smoothing are
determined by the shape and the size of B. Closing also tends to smooth contours,
but gathers “close” objects (in the sense of B), eliminates “small” holes (in the sense
of B) and connects contours. Figure 4.4 shows the result of opening and closing the
previous image, with a square structuring element of size 11 × 11.
Figure 4.5 presents these same operations with a non-symmetric structuring ele-
ment.
Figure 4.5 – Opening and closing. The object is in white, the background in black,
the opening and closing are represented by the pixels filled with gray disks. The
structuring element B, not symmetrical, is centered on the black box.
CImg<unsigned char>
imgErode = img.get_erode(B), // Erosion
imgDilate = img.get_dilate(B), // Dilation
imgOpening = imgErode.get_dilate(B), // Opening
imgClosing = imgDilate.get_erode(B); // Closing
Opening and closing are morphological filters. The notion of filtering becomes important when considering a shape and a size adapted to B: Fig. 4.6a presents a painting by Henri Matisse (Woman with Amphora and Pomegranates, 1953), Fig. 4.6b a version of the original image corrupted by a vertical noise, and Fig. 4.6c the image restored by morphological opening using an adapted structuring element.
where DI (resp. DB ) is the image (resp. the structuring element) domain. Serra [36]
proposed to illustrate this definition on 1D functions (Fig. 4.7, where I = f ), for which
the previous formula is rewritten
One can define as well the gray-level erosion of an image by a gray-level structuring element. The contrast-enhancement filter of Kramer and Bruckner then computes, at each point, the midpoint between the eroded and the dilated values,

M(x, y) = ½ ((f ⊖ b)(x, y) + (f ⊕ b)(x, y))

and the output image J is defined as

J(x, y) = (f ⊖ b)(x, y) if I(x, y) ≤ M(x, y),  and (f ⊕ b)(x, y) otherwise.
Code 4.3 performs this enhancement, and Fig. 4.9 shows the result for two sizes of
structuring element.
/*
  Kramer-Bruckner filter. The header, erosion and dilation below are a
  sketch of parts of the listing not reproduced here.
*/
CImg<> KramerBruckner(CImg<>& imgIn, CImg<unsigned char>& B)
{
  CImg<>
    imgOut(imgIn),
    imgErode = imgIn.get_erode(B),
    imgDilate = imgIn.get_dilate(B);
  cimg_forXY(imgOut,x,y)
  {
    float M = 0.5f*(imgErode(x,y) + imgDilate(x,y));
    imgOut(x,y) = (imgIn(x,y)<=M ? imgErode(x,y) : imgDilate(x,y));
  }
  return imgOut;
}
(· · ·((((I ∘ B₁) • B₁) ∘ B₂) • B₂)· · · ∘ Bₙ) • Bₙ
We thus obtain increasing and idempotent operations (thus morphological filters). The
last structuring element (of size n) is determined according to the minimal size of the
objects of the image that we want to keep after the filtering process (Code 4.4 and
Fig. 4.10).
/*
  Alternating sequential filters.
  The function header and the loop over the structuring element size are a
  sketch of parts of the listing not reproduced here.
*/
CImg<> ASF(CImg<>& imgIn, int n)
{
  CImg<> imgOut(imgIn);
  for (int i = 1; i<=n; ++i)
  {
    // Square structuring element of size 2*i+1 (an arbitrary choice).
    CImg<unsigned char> mask(2*i + 1,2*i + 1,1,1,1);
    // Opening.
    imgOut.erode(mask).dilate(mask);
    // Closing.
    imgOut.dilate(mask).erode(mask);
  }
  return imgOut;
}
Figure 4.11 – Some morphological gradients (square structuring element of size 3).
/*
  Morphological gradients.
  The declarations of the erosion, dilation, opening, closing and of the
  gradients gradE/gradD are a sketch of parts of the listing not reproduced here.
*/
CImg<>
  imgErode = imgIn.get_erode(B),
  imgDilate = imgIn.get_dilate(B),
  imgOpening = imgErode.get_dilate(B),
  imgClosing = imgDilate.get_erode(B),
  // Gradients by erosion and by dilation.
  gradE = imgIn - imgErode,
  gradD = imgDilate - imgIn;
// Top Hat.
CImg<> whiteTopHat = imgIn - imgOpening,
       blackTopHat = imgClosing - imgIn;
// Edge detector.
CImg<> contourMin = gradE.get_min(gradD),
       contourMax = gradE.get_max(gradD);
// Nonlinear laplacian.
CImg<> Laplacien = gradD - gradE;
4.3.4 Skeletonization
Skeletonization (or thinning) is a classical technique in mathematical morphology whose objective is to reduce binary objects to a structure a single pixel thick, without breaking the topology and connectivity of the objects. This process is achieved by the iterative application of conditional erosions in a local neighborhood of each point: a point is eroded only if a sufficient "thickness" remains, so as not to modify the connectivity of the objects. This implies a local decision process which can make this skeletonization costly. Many
skeletonization algorithms exist in the literature and we propose in the following to
implement Zhang and Suen’s algorithm [44].
Let I be a binary image containing a white object on a dark background. For each
pixel in I, a 3×3 neighborhood N is analyzed, following Fig. 4.12.
N3 N2 N1
N4 (x, y) N0
N5 N6 N7
Figure 4.12 – Neighborhood N of pixel (x, y). Neighbors are labeled from N0 to N7 .
R1 (N) = {(2 ≤ B(N) ≤ 6) AND (C(N) = 1) AND (N6 N0 N2 = 0) AND (N4 N6 N0 = 0)}
R2 (N) = {(2 ≤ B(N) ≤ 6) AND (C(N) = 1) AND (N0 N2 N4 = 0) AND (N2 N4 N6 = 0)}
Depending on the values of R1 (N) and R2 (N), the pixel (x, y) is eroded or marked as
non-erodible in two passes.
Code 4.6 details the implementation of this skeletonization. Note the use of the CImg neighborhood loop cimg_for3x3(), which allows easy access to the neighborhood of the current pixel.
// Neighborhood.
CImg_3x3(N,unsigned char);
// Image of the pixels marked for deletion (sketch: declared in the part of
// the listing not reproduced here).
CImg<unsigned char> D(imgIn.width(),imgIn.height(),1,1,0);
// Pass 1.
int n1 = 0;
cimg_for3x3(imgIn,x,y,0,0,N,unsigned char)
{
  if (imgIn(x,y))
  {
    unsigned char
      // B: number of non-zero neighbors of (x,y).
      B = Npp + Ncp + Nnp + Npc + Nnc + Npn + Ncn + Nnn,
      // C: number of 1 -> 0 transitions around the neighborhood.
      C = Nnc*(Nnc-Nnp) + Nnp*(Nnp-Ncp) +
          Ncp*(Ncp-Npp) + Npp*(Npp-Npc) +
          Npc*(Npc-Npn) + Npn*(Npn-Ncn) +
          Ncn*(Ccn=Ncn,Ncn-Nnn) + Nnn*(Nnn-Nnc);
    // Rule R1 (sketch): mark the pixel for deletion.
    if (B>=2 && B<=6 && C==1 && Ncn*Nnc*Ncp==0 && Npc*Ncn*Nnc==0)
    {
      D(x,y) = 1; ++n1;
    }
  }
}
imgIn -= D; // Remove the pixels marked in pass 1.
// Pass 2.
int n2 = 0;
D.fill(0);
cimg_for3x3(imgIn,x,y,0,0,N,unsigned char)
{
  if (imgIn(x,y))
  {
    unsigned char
      B = Npp + Ncp + Nnp + Npc + Nnc + Npn + Ncn + Nnn,
      C = Nnc*(Nnc-Nnp) + Nnp*(Nnp-Ncp) +
          Ncp*(Ncp-Npp) + Npp*(Npp-Npc) +
          Npc*(Npc-Npn) + Npn*(Npn-Ncn) +
          Ncn*(Ncn-Nnn) + Nnn*(Nnn-Nnc);
    // Rule R2 (sketch): mark the pixel for deletion.
    if (B>=2 && B<=6 && C==1 && Nnc*Ncp*Npc==0 && Ncp*Npc*Ncn==0)
    {
      D(x,y) = 1; ++n2;
    }
  }
}
imgIn -= D; // Remove the pixels marked in pass 2.
Code 4.7 uses the previous function in a loop, which ends after a maximum number of iterations or when no more pixels are deleted.
Code 4.7 – Zhang and Suen’s algorithm.
/*
  Thinning algorithm. The do/while body is a sketch: zhangSuenPass() is a
  hypothetical name for the function of Code 4.6, returning the number of
  pixels deleted in one iteration.
*/
int i = 0, n;
do {
  n = zhangSuenPass(imgIn);
  ++i;
} while (n>0 && i<maxiter);
return i;
}
Figure – Skeletonization result at iterations 1, 5, 8 and 15.
J(x, y) = ∑_{i=−(M−1)/2}^{(M−1)/2}  ∑_{j=−(N−1)/2}^{(N−1)/2}  w_{i,j} I(x + i, y + j)      (5.1)
which performs the discrete convolution of the image instance with a mask (kernel) given as a parameter. Code 5.1 presents an example of discrete convolution by a mask W, estimating the vertical gradient of an image.
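A minimal sketch in the same spirit (an assumption, not the book's Code 5.1; the 3 × 3 mask and the file name are arbitrary choices), using CImg<T>::get_convolve():

CImg<> img("image.bmp");
// 3x3 mask estimating the vertical derivative (Sobel-like coefficients).
CImg<> W(3,3,1,1,
         -1,-2,-1,
          0, 0, 0,
          1, 2, 1);
CImg<> gradV = img.get_convolve(W);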
A convolution mask cannot be applied entirely on the pixels located at the edges of the image: (M−1)/2 rows at the top and bottom of the image and (N−1)/2 columns on the left and right cannot be processed using Equation 5.1. Several strategies are then possible:
• do not calculate the convolution values for these rows and columns;
• assign a value (typically zero or the maximum gray-level) on the pixels that
should be outside the definition domain of the image when computing the
convolution (padding);
• assign to the outer pixels the values of their nearest neighbor;
• truncate the result image J into a smaller image, by suppressing the correspond-
ing rows and columns in I.
Linear filtering
Figure – Principle of the discrete convolution of an image I by a 3 × 3 mask W of coefficients w_{i,j}: each pixel of the output image J is the weighted sum of the corresponding neighborhood of I.
Smoothing the image generates a blurring effect. Since the coefficients of the
mask sum up to 1, smoothing will preserve any area of the image where the gray level
is almost constant. Therefore, this type of filtering does not modify the areas with
small variations of gray levels and conversely, attenuates the rapid variations. The
size of the convolution mask has, of course, a considerable influence on the result. Figure 5.3 shows filtering results for n × n smoothing masks with constant coefficients (all equal to 1/n²), for n ∈ {3, 7, 11}.
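A sketch of such a mask and its application (an assumption, not a book listing):

int n = 7;
CImg<> img("image.bmp"),
       // n x n mask with constant coefficients 1/n^2 (sums to 1).
       W(n,n,1,1,1.0f/(n*n)),
       smoothed = img.get_convolve(W);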
Non-linear filtering
One of the main drawbacks of averaging filter smoothing methods is that the
smoothing also applies to edge boundaries and high-frequency details, which intro-
duces “blurring” into the image. If the objective of the processing is only noise
reduction, one would like to preserve the natural contours in the image as much as
possible.
Section 5.4 will introduce a technique based on partial differential equations to
address this problem. Here we are interested in simpler methods, which can neverthe-
less be effective depending on the type of degradation (noise) encountered in the image.
Median filtering answers this problem for impulse noise. This filter also has the
good property of leaving unchanged the monotonic transitions between neighboring
regions of different intensities (Fig. 5.5). The principle is simple: replace the value of
I(x, y) by the median of the gray levels in a neighborhood of (x, y). Although the computation is trivial, the method has some drawbacks. For noises with a low concentration
linear filter. In addition, it can still affect the geometry of the image regions: areas with
an acute angle (corners of structures) tend to be rounded. Thus, geometric information
is lost on angular points.
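With CImg, median filtering is available directly (a usage sketch; the file name is hypothetical):

CImg<> img("noisy.bmp"),
       // Median filter over a 3x3 neighborhood.
       filtered = img.get_blur_median(3);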
statistics ∑_{k=1}^{N} c_k a_k in replacement of I(x, y).
In order not to modify the intensity of the homogeneous areas, the coefficients are chosen to satisfy ∑_{k=1}^{N} c_k = 1. In the case where the image is homogeneous without
transition and if the noise can locally be modeled by a white noise of density f and
distribution function F, it is possible to optimize the choice of the ck in the sense of
a quadratic error criterion. It can be shown that the noise power at the output of an
optimal L-filter is always less than or at worst equal to that of the best linear filter
(mean filter). The values of the optimal coefficients depend on the shape of f :
• For a Gaussian noise of variance σ², ∀k, c_k = 1/N (mean filtering, with variance σ²/N);
• For a uniform noise, c_k = 1/2 if k = 1 or k = N, 0 otherwise.
Moreover, if ck = χ{k=1} (χA indicator function of A), the filtering process is an
erosion, and if ck = χ{k=N} it is a dilation. We thus find the two basic operations of
mathematical morphology (cf. chapter 4), from which we can also perform non-linear
filtering.
Linear filtering
Sharpening consists in subtracting the smoothed image from the original one. We
therefore calculate, for an image I and a smoothing mask W
I − (W ∗ I) = (δ −W ) ∗ I
5.1 Spatial filtering 75
where δ is the Kronecker delta (neutral element of the discrete convolution) defined by:

δ[i, j] = 1 if i = j,  0 if i ≠ j
This is equivalent to applying to I the mask W′ = δ − W, which then has the following properties:
Average filtering, modeled by the filter (a) in Fig. 5.2 tends to blur the image. This
averaging is the approximation of an integration, and so we expect that differentiation
will have the opposite effect, namely a strengthening of the image contours. The
simplest differentiation operator to implement in image processing is the gradient. As
a reminder, the gradient of a function f : R × R → R is the function:
∇ : ℝ × ℝ → ℝ²
(x, y) ↦ ( ∂f/∂x (x, y), ∂f/∂y (x, y) )ᵀ
Since the image is discretized, the partial derivatives ∂/∂x and ∂/∂y must be approximated by left, right or centered finite differences, for example (Fig. 5.6):

∂I/∂x (x, y) ≈ I(x, y) − I(x + 1, y)   and   ∂I/∂y (x, y) ≈ I(x, y) − I(x, y + 1)
The CImg<T> class proposes the method
CImgList<> get_gradient(const char *axes=0, int scheme=0)
to compute the gradient of an image. Parameter axes specifies the set of axes along which the partial derivatives are to be computed (e.g., "xy"), and scheme specifies
the desired discretization scheme (finite left, finite right, centered differences, etc.).
The method returns a list of images CImgList<> where each element is an estimate
of the partial derivative along each of the requested directions.
The gradient can also be decomposed at each point (Fig. 5.7) using:
• its norm k∇I(x, y)k, which defines the amplitude of the local variations (“con-
trast” of a contour);
• its phase φ(x, y) = atan( ∂I/∂y (x, y) / ∂I/∂x (x, y) ), which defines the direction of the
local variations (direction locally orthogonal to that of the contours).
// Sketch of the elided computation of the gradient, its norm and its phase.
CImgList<> grad = img.get_gradient("xy",1);
CImg<> norme = (grad[0].get_sqr() + grad[1].get_sqr()).sqrt(),
       phi = grad[1].get_atan2(grad[0]);
// Display.
(grad[0],grad[1],norme,phi).display();
return 1;
}
or

|I(x, y) − I(x + 1, y + 1)| + |I(x + 1, y) − I(x, y + 1)|

Figure 5.6 – Approximation of the partial derivatives ∂I/∂x and ∂I/∂y using right finite differences.
By analyzing the intensity profiles of the images, contours can be seen as ramps
between two areas having locally almost constant values. The first derivative on the
ramp is constant, and depending on the “length” of this ramp, the contours will be
more or less thick. The study of the second derivative, and more particularly of its
zeros, thus makes it possible to detect edges, and in a sense to specify their orientation.
Among all the second order derivative operators, the Laplacian is the most widely
used. It is an isotropic operator, rotation invariant (the Laplacian of a rotated image
is the rotated Laplacian of the original image) and can easily be approximated in a
discrete way.
Formally the Laplacian, denoted ∇2 , is defined by
∇² : ℝ × ℝ → ℝ
(x, y) ↦ ∂²f/∂x² (x, y) + ∂²f/∂y² (x, y)
Approximating the second order derivatives using finite differences, for an image I:
∂²I/∂x² (x, y) ≈ I(x + 1, y) − 2I(x, y) + I(x − 1, y)
∂²I/∂y² (x, y) ≈ I(x, y + 1) − 2I(x, y) + I(x, y − 1)

and ∇²I(x, y) ≈ I(x + 1, y) + I(x − 1, y) + I(x, y + 1) + I(x, y − 1) − 4I(x, y), which can be expressed using the convolution mask presented in Fig. 5.9a.
-1 -1 -1      -1  0  1
 0  0  0      -1  0  1
 1  1  1      -1  0  1
Two of the 8 Prewitt filters (up to a rotation)

-1 -2 -1      -1  0  1
 0  0  0      -2  0  2
 1  2  1      -1  0  1
Two of the 8 Sobel filters (up to a rotation)

 0  1  0       0 -1  0
 1 -4  1      -1  5 -1
 0  1  0       0 -1  0
a) Laplacian mask      b) Edge enhancer
cimg_for3x3(rap,x,y,0,0,I,float)
imgOut(x,y) = (Ipp + Ipc + Ipn + Icp + Icc +
Icn + Inp + Inc + Inn)/(Sgrad(x,y) + epsilon);
return imgOut;
}
It is a variant of the Nagao filter, where only the definition of the neighborhoods changes (Fig. 5.11). Code 5.5 implements this filter, and Fig. 5.12 proposes a comparison of these two filters on a noisy image.
Wu filter
Figure 5.10 – Three of the nine Nagao windows. Three are deduced from the left
window by iterated rotations of π/2, three from the center window using the same
principle.
//Nagao.
CImgList<unsigned char> Nagao(9,5,5,1,1,0);
Nagao(0,0,0) = Nagao(0,0,1) = Nagao(0,0,2) = Nagao(0,0,3) =
Nagao(0,0,4) = Nagao(0,1,1) = Nagao(0,1,2) = Nagao(0,1,3) =
Nagao(0,2,2) = 1;
for (int i = 1; i<4; ++i) Nagao[i] = Nagao[0].get_rotate(i*90);
// Neighborhood analysis.
CImg<>
mu(9,1,1,1,0),
sigma(9,1,1,1,0),
st,
N(5,5);
CImg<int> permutations;
cimg_for5x5(imgIn,x,y,0,0,N,float)
{
CImgList<> res(9);
for (int i = 0; i<9; ++i)
{
res[i] = N.get_mul(Nagao[i]);
st = res[i].get_stats();
mu[i] = st[2];
sigma[i] = st[3];
}
// Searching minimal variance.
sigma.sort(permutations);
imgOut(x,y) = mu[permutations[0]];
}
return imgOut;
}
// Kuwahara.
CImgList<unsigned char> Kuwahara(4,5,5,1,1,0);
cimg_for_inXY(Kuwahara[0],0,0,2,2,i,j) Kuwahara(0,i,j) = 1;
for (int i = 1; i<4; ++i)
Kuwahara[i] = Kuwahara[0].get_rotate(i*90);
// Neighborhood analysis.
CImg<>
mu(9,1,1,1,0),
sigma(9,1,1,1,0),
st,
N(5,5);
CImg<int> permutations;
cimg_for5x5(imgIn,x,y,0,0,N,float)
{
CImgList<> res(4);
for (int i = 0; i<4; ++i)
{
res[i] = N.get_mul(Kuwahara[i]);
st = res[i].get_stats();
mu[i] = st[2];
sigma[i] = st[3];
}
sigma.sort(permutations);
imgOut(x,y) = mu[permutations[0]];
}
return imgOut;
}
2. Good localization: the position of the detected contour must be as close as possible to the position of the true contour, which amounts to maximizing:

Λ = h′_opt(0) / √( ∫_{−∞}^{+∞} h′_opt(x)² dx ) ;
3. No multiplicity of responses: ensure that for a contour there will be only one
detection. It can be shown that this is equivalent to optimizing the expression:
x_max = √( ∫_{−∞}^{+∞} h′(x)² dx / ∫_{−∞}^{+∞} h″(x)² dx ).
This criterion corresponds to the average distance between two local maxima
detected in response to a contour.
The detection and localization criteria being antinomic, we combine them by maxi-
mizing the product ΣΛ, called Canny’s criterion, under the constraint of the third one.
This optimization leads to a differential equation which admits as a general solution:
hopt (x) = c0 + c1 eαx sin ωx + c2 eαx cos ωx + c3 e−αx sin ωx + c4 e−αx cos ωx (5.2)
where the coefficients ci and ω are determined from the filter size.
The scale parameter α is important since it indicates the maximum distance for
two parallel contours to be merged into one through the detector.
Canny looks for a FIR filter of width L, i.e., defined on the interval [[−L, +L]], with slope s at the origin and with the following boundary conditions: h(0) = 0, h(L) = 0, h′(0) = s and h′(L) = 0. Using numerical optimization, the optimal filter yields
a Canny criterion of value 1.12. For implementation issues and ease of use, an
h_opt(x) = h_deriche(x) = −c·x·e^{−α|x|}   with   c = (1 − e^{−α})² / e^{−α}      (5.4)
where c is a normalization coefficient of the function. In the literature, it is common to
find different values of c, depending on the type of normalization desired. Figure 5.14
illustrates the Canny (Equation 5.3) and the Deriche (Equation 5.4) filters.
This filter allows us to set up a contour detector based on a first order derivative
approach. We can deduce a smoothing filter which is obtained by integrating the opti-
mal filter hderiche (x). This smoothing filter, noted fderiche (x), is given by Equation 5.5.
We can also deduce a second order derivative filter by derivation of the optimal filter.
f_deriche(x) = k (α|x| + 1) e^{−α|x|}   with   k = (1 − e^{−α})² / (1 + 2αe^{−α} − e^{−2α})      (5.5)
Figure 5.14 – The Canny filter and the Deriche filter h_opt(x) in continuous form for the calculation of the first derivative.
The Deriche contour filter not only corresponds to the exact solution of the optimization problem of the Canny criterion, it also has the remarkable property (also shared by its smoothing and derivative filters) of being implementable recursively with few operations, independently of the filter width.
• FIR filters are filters whose finite size allows to implement their numerical
convolution with a relation between the output, denoted Io [i], and the input,
denoted Ii [i], which is written as follows:
I_o[i] = ∑_{k=−L}^{L} H[k] × I_i[i − k].
• IIR filters, which can be exactly recursively implemented, have a lower compu-
tational cost than the previous ones, but it is necessary to pay attention to their
stability. The relation between the output, denoted Io [i], and the input, denoted
Ii [i], is written as follows:
I_o[i] = ∑_{k=0}^{N} b_k × I_i[i − k]  −  ∑_{k=1}^{M} a_k × I_o[i − k]      (5.6)
where the coefficients ai and bi are to be calculated from the impulse response
of the filter H[i]. This type of filter uses a recursive scheme and requires
(N + M + 1) operations, which is smaller and independent of the size of the
filter. It should also be noted that a convolution with the Deriche filter or its
derivatives and its smoothing filter can be implemented exactly in the form
of Equation 5.6. This remarkable property is not shared by all IIR filters (the
Gaussian filter and its derivatives, like the Canny filter cannot). A recursive
implementation by filters of order 2 to 4 approximating in a least squares sense
the convolution with the Gaussian and its derivatives has been developed by
Deriche [11] and by other authors [41, 43].
Smoothing filter:
Code 5.6 applies the Deriche filter along the X axis. The parameter order selects either the smoothing filter (order 0), the first derivative (order 1) or the second derivative (order 2).
/*
  Deriche filter on a 1D signal (along the X axis).
*/
switch (order)
{
case 0 : { // Order 0 (smoothing)
float k = (1 - ema)*(1 - ema)/(1 + 2*alpha*ema - ema2);
a0 = k;
a1 = k*(alpha - 1)*ema;
a2 = k*(alpha + 1)*ema;
a3 = -k*ema2;
} break;
case 1 : { // Order 1 (first derivative)
float k = -(1 - ema)*(1 - ema)*(1 - ema)/(2*(ema + 1)*ema);
a0 = a3 = 0;
a1 = k*ema;
a2 = -a1;
} break;
case 2 : { // Order 2 (second derivative)
float
ea = std::exp(-alpha),
k = -(ema2 - 1)/(2*alpha*ema),
kn = -2*(-1 + 3*ea - 3*ea*ea + ea*ea*ea)/
(3*ea + 1 + 3*ea*ea + ea*ea*ea);
a0 = kn;
a1 = -kn*(1 + k*alpha)*ema;
a2 = kn*(1 - k*alpha)*ema;
a3 = -kn*ema2;
} break;
}
coefp = (a0 + a1)/(1 + b1 + b2);
coefn = (a2 + a3)/(1 + b1 + b2);
cimg_forYC(imgIn,y,c)
{
float *X = imgIn.data(0,y,0,c);
xn = xa = X[imgIn.width() - 1];
yn = ya = coefn*xn;
}
Several interesting and fast processings can be performed using the Deriche filter.
Smoothing of an image (order 0 filter)
Due to its recursive implementation, the Deriche smoothing filter has a constant
computational complexity regardless of the width of the filter specified by the scale
parameter α. Using the separability property of the filter, we can smooth a 2D image
I first by filtering I along the X axis, then by transposing the image (equivalent to a
90◦ rotation), reapplying the filter and transposing the image again (Code 5.7).
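A sketch of this separable smoothing (an assumption, not the book's Code 5.7), using the deriche() function of Code 5.6:

CImg<> img("image.bmp");
deriche(img,alpha,0);   // Smoothing along X
img.transpose();
deriche(img,alpha,0);   // Smoothing along Y (on the transposed image)
img.transpose();        // Back to the original orientation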
Figure 5.15 presents the result of the smoothing filter for several values of α. The
larger this scale parameter is, the less the image is smoothed.
Figure 5.15 – Results of the order 0 Deriche filter (smoothing filter) for several values
of the scale parameter α.
To compute the norm of the gradient, we can compute the first derivative along X
and the first derivative along Y (by transposing the image), then compute the norm
of the gradient from the two filtered images. For each derivative, it is convenient to
pre-smooth the image along the opposite axis, with an order 0 Deriche filter, in order
to obtain derivatives computed on a spatially smoothed image in an isotropic way
(Code 5.8).
∂I/∂x (x, y) = (I ∗ f_deriche(y)) ∗ h_deriche(x)
∂I/∂y (x, y) = (I ∗ f_deriche(x)) ∗ h_deriche(y)

|∇I|(x, y) = √( (∂I/∂x (x, y))² + (∂I/∂y (x, y))² )
CImg<>
  imgIn("image.bmp"),              // Loading the image
  imgX = imgIn.get_transpose(),
  imgY = imgIn;
// Pre-smoothing along the opposite axis (sketch of the lines not reproduced here).
deriche(imgX,alpha,0);             // Smoothing along 'Y' (transposed image)
deriche(imgY,alpha,0);             // Smoothing along 'X'
imgX.transpose(); imgY.transpose();
deriche(imgX,alpha,1);             // First derivative along 'X'
deriche(imgY,alpha,1);             // First derivative along 'Y'
imgY.transpose();
Figure 5.16 illustrates the result of the Deriche filter for the computation of the gradient
norm, for several values of the parameter α. Extracting the local maxima of the norm
of the gradient of the image allows us to detect the contours.
Figure 5.16 – Results for order 1 Deriche filtering (gradient norm calculation) for
several values of the scaling parameter α.
The Laplacian is equal to the sum of the second derivatives of the image along the
X and Y axes. It can be easily and quickly computed with the order 2 Deriche filters
(Code 5.9).
CImg<>
  imgIn("test.bmp"),               // Loading the image
  img2X = imgIn,
  img2Y = imgIn.get_transpose();
deriche(img2X,alpha,2);            // Second derivative along 'X'
deriche(img2Y,alpha,2);            // Second derivative along 'Y'
img2Y.transpose();
// Laplacian as the sum of the two second derivatives (sketch).
CImg<> laplacian = img2X + img2Y;
Figure 5.17 shows the result of Code 5.9 for several values of α. The zero crossings
of the Laplacian allow us to detect the contours.
Figure 5.17 – Results for order 2 Deriche filtering (Laplacian computation) for several
values of the scaling parameter α.
5.3.1 Introduction
Linear filters are defined as linear time-invariant systems. It can be shown that the rela-
tion between the input signal, denoted Ie (x, y), and the output signal, denoted Is (x, y),
of such a filter can be written as a convolution product. The quantity characterizing
the filter, denoted H (x, y), is called the impulse response. This convolution approach
has been described in Section 5.1.
This processing can also be seen in the frequency domain by using the Fourier transform of the images (Fig. 5.18). The Fourier transform of the output image, denoted Î_s(f_x, f_y), is equal to the multiplication between the Fourier transform of the input image, denoted Î_e(f_x, f_y), and the frequency response, denoted Ĥ(f_x, f_y), defined as the Fourier transform of the impulse response H(x, y).
In the same way as for spatial filtering, notions of smoothing filters and edge
enhancers are defined. However, in the frequency domain, these notions are more
easily understood. Indeed, the frequency content (obtained by the Fourier transform)
Figure 5.18 – Equivalence of linear filtering in the spatial and frequency domains: Î_s(f_x, f_y) = Ĥ(f_x, f_y) × Î_e(f_x, f_y).
can be decomposed into low and high spatial frequency domains (Fig. 5.19). The first refers to the domain of low spatial gray-level variations (the rather homogeneous areas of the image), while the second refers to the high spatial variations (the contours).
Figure 5.19 – Decomposition of the (f_x, f_y) frequency plane into low-frequency and high-frequency domains.
We will thus find the filters known as “low-pass” which attenuate or eliminate high
frequencies in the Fourier domain without modifying the low frequencies. Since the high-frequency components characterize the contours and other abrupt details in the image, the overall effect of the filter is a smoothing. Similarly, a “high-pass” filter attenuates or eliminates low-frequency components. Because these frequencies are responsible
for slow variations in the image such as overall contrast or average intensity, the
effect of this filter is to reduce these characteristics and thus provide an apparent edge
enhancement. The third type of filter is called a “bandpass” filter, which retains only
the intermediate frequency components.
Definition 5.3.1 The Fourier transform of a continuous image I(x, y), assumed to
have infinite support is given by
TF : ℝ × ℝ → ℂ
(f_x, f_y) ↦ Î(f_x, f_y) = ∬ I(x, y) e^{−j(2π f_x x + 2π f_y y)} dx dy
Figure 5.20 – Example of amplitude spectrum of the rectangular function, with two
different orientations.
To compute the Fourier transform on digital images, the Fast Fourier Transform (FFT) is classically used and computed in CImg using CImg<T>::get_FFT():
Parameter axis is a string specifying the axes along which the Fourier transform must
be computed (for example "xy" for a 2D image) and the argument is_inverse
allows to specify if one wishes to calculate a direct or an inverse transform. Computing
the Fourier transform in this way returns an object of type CImgList<T> containing
two images, because this transformation returns a complex function. This list of
images allows to store the real part (first image [0] of the list) and the imaginary part
(second image [1]) of the transform.
Code 5.10 presents an example, where we create a synthetic image by mixing the
modulus of the Fourier transform of a first image with the argument of the Fourier
transform of a second image. The result is given in Fig. 5.22.
cimg_forXY(R_img3,x,y) {
R_img3(x,y) = Img1_Mag(x,y)*std::cos(Img2_Arg(x,y));
I_img3(x,y) = Img1_Mag(x,y)*std::sin(Img2_Arg(x,y));
}
The Fourier transform of a discrete signal is periodic with period F_e = 1/T_e on all axes. In practice, only one period is computed. Graphically and theoretically, it is classical to represent the frequency domain in the range [−F_e/2, F_e/2] for a 2D image. One can also choose normalized frequencies f̃ = f/F_e and so the range [−1/2, 1/2]. The FFT algorithm outputs data in the range [0, F_ex] × [0, F_ey], or in [0, 1] × [0, 1] when using normalized frequencies (Fig. 5.23). It is necessary to take this data arrangement into account when computing or visualizing.
Code 5.11 visualizes the spectrum (in decibels) of an image both from the raw FFT data and after rearranging the FFT data so that the zero frequency is centered in the middle of the interval.
CImg<> img("image.bmp");
// Sketch of the elided spectrum computation: FFT, magnitude in dB, and a
// periodic shift to center the zero frequency (assumes a grayscale image).
CImgList<> F = img.get_FFT();
CImg<> imgS = (F[0].get_sqr() + F[1].get_sqr() + 1e-10f).sqrt().log10()*20,
       imgSR = imgS.get_shift(imgS.width()/2,imgS.height()/2,0,0,2);
// Image display
(img,imgS,imgSR).display("Input image - "
                         "Spectrum (Raw FFT data) - "
                         "Spectrum (Rearranged FFT data)");
Figure 5.23 – Interpreting FFT output. A digital image has a periodic Fourier trans-
form. We represent the spectrum on a period with the zero frequency in the middle.
The FFT algorithm focuses on another period and the zero is found in the corners.
Figure 5.24 shows an image spectrum with raw and rearranged FFT data.
In the remainder of this section, the natural arrangement of the data from the FFT
algorithm will be taken into account.
Figure 5.24 – Visualizing an image spectrum from FFT: raw and centered data.
PRINCIPLE
where fc is the cutoff frequency and ρ( fx , fy ) is the distance from ( fx , fy ) to the origin
of the frequency plane:
ρ(f_x, f_y) = √(f_x² + f_y²)
Figure 5.25 shows a representation of Ĥ_I(f_x, f_y).
It is an ideal filter because all frequencies inside a circle of radius fc are restored
without attenuation, while all others are cancelled. Low-pass filters considered in this
chapter have radial symmetry, so it is sufficient to know the profile of the filter on one
of its radii. The disadvantage of the ideal low-pass filter is that it introduces ripples
in the spatial domain: its impulse response is nothing but the Bessel function of order 1 (the inverse Fourier transform of the frequency response). This phenomenon is called the Gibbs effect. To reduce these ripples, it is necessary to avoid having too abrupt
variations of the filter in the frequency domain, and in this context, Butterworth filters
are a possible solution.
BUTTERWORTH FILTERS

Ĥ_B(f_x, f_y) = 1 / (1 + [ρ(f_x, f_y)/f_0]^{2n})
where f0 is related to the cutoff frequency. Figure 5.26 shows the frequency response
of Butterworth filters for n = 1 and n = 10, for a given value of f0 . In contrast to the
theoretical low-pass filter, the Butterworth filter does not have a sharp cutoff between
low and high frequencies. This variation of the passband at the cutoff band is set by
the value of n (Fig. 5.27a). In general, the cutoff frequency is defined at the point where the transfer function goes below 1/√2 of the maximum (equivalent to −3 dB). The filter is then written:

Ĥ_B(f_x, f_y) = 1 / (1 + [√2 − 1][ρ(f_x, f_y)/f_c]^{2n})
Figure 5.27 – Frequency profiles Ĥ_n(f̃) of Butterworth filters: a) influence of the order n (n ∈ {1, 2, 4, 6, 8, 10}); b) influence of the cutoff frequency f̃_c (f̃_c ∈ {0.05, 0.10, 0.15, 0.20, 0.25, 0.30}).
Filtered images show fewer ripples than in the ideal filter case. Moreover, the smoothing rate varies more slowly as f_c decreases (Fig. 5.27b).
GAUSSIAN FILTER

The frequency response is also a Gaussian function (Fig. 5.28), with the standard deviation proportional to the inverse of the spatial standard deviation:

Ĥ(f_x, f_y) = e^{−2π²σ_s²(f_x² + f_y²)} = e^{−(f_x² + f_y²)/(2σ_f²)}   where   σ_f = 1/(2πσ_s)
This is the optimal filter for removing additive Gaussian noise. Code 5.12 imple-
ments Gaussian filtering in the frequency domain. The frequency response of this
filter is illustrated in Fig. 5.28.
// Filtering
cimglist_for(fImg,k)
fImg[k].mul(gaussMask);
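For context, a more complete sketch of this kind of frequency-domain filtering (an assumption, not the book's Code 5.12; it builds a hypothetical gaussMask from the frequency response above and assumes a grayscale image):

CImg<> img("image.bmp");
CImgList<> fImg = img.get_FFT();     // [0]: real part, [1]: imaginary part
float sigma_s = 2;                   // Spatial standard deviation
CImg<> gaussMask(img.width(),img.height());
cimg_forXY(gaussMask,x,y)
{
  // Normalized frequencies, following the raw FFT arrangement
  // (zero frequency in the corners).
  float fx = (x<=gaussMask.width()/2 ? x : x - gaussMask.width())/(float)gaussMask.width(),
        fy = (y<=gaussMask.height()/2 ? y : y - gaussMask.height())/(float)gaussMask.height();
  gaussMask(x,y) = std::exp(-2*cimg::sqr((float)cimg::PI*sigma_s)*(fx*fx + fy*fy));
}
// Filtering in the frequency domain.
cimglist_for(fImg,k) fImg[k].mul(gaussMask);
// Back to the spatial domain (real part of the inverse FFT).
CImg<> filtered = fImg.get_FFT(true)[0];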
The result of this processing is represented in Fig. 5.29 where the image is
corrupted with a Gaussian noise.
Figure 5.28 – Frequency response of the Gaussian filter with a spatial standard deviation σ_s = 1, corresponding to a frequency standard deviation of σ_f = 1/(2πσ_s).
T can be written as the convolution product of a Dirac comb along the y axis, of period P, with the rectangular function of width P/2 in the y direction, with

1_E : F → {0, 1},  m ↦ 1 if m ∈ E, 0 if m ∉ E,    and    Ш_{(∞,P)} : ℝ² → ℝ,  (x, y) ↦ ∑_{j=−∞}^{+∞} δ(x, y − jP)
To carry out this calculation, it is first necessary to compute the Fourier transform T̂ of T. Let us recall the following Fourier transform pairs:

Ш_P(x, y)  ⟷  f_p Ш_{f_p}(f_x, f_y)    and    1_{[−∞,+∞]×[−P/4, P/4]}(x, y)  ⟷  (P/2) sinc(π f_y P/2),    with f_p = 1/P.
Figure 5.29 – Result of the Gaussian filtering for different standard deviations (input image, σ_s = 1, σ_s = 2, σ_s = 6). The first row visualizes the images, the second row their spectrum in the frequency domain.
Knowing that (F ∗ G)(x, y) ⟷ F̂(f_x, f_y) × Ĝ(f_x, f_y), the Fourier transform of T is expressed as:

T̂(f_x, f_y) = (1/2) Ш_{f_p}(f_x, f_y) × sinc(π f_y P/2)
            = (1/2) ∑_{i=−∞}^{+∞} sinc(π f_y P/2) δ(f_x, f_y − i f_p)

T̂(f_x, f_y) = (1/2) ∑_{i=−∞}^{+∞} sinc(i π/2) δ(f_x, f_y − i f_p)      (5.9)

Î_t(f_x, f_y) = (1/2) ∑_{i=−∞}^{+∞} sinc(i π/2) (Î_r ∗ δ)(f_x, f_y − i f_p)      (5.10)

Î_t(f_x, f_y) = (1/2) ∑_{i=−∞}^{+∞} sinc(i π/2) Î(f_x, f_y − i f_p)      (5.11)
The graphical representation of the modulus of the Fourier transform of the Moiré image, considering that the image we wish to restore is band-limited in [−f_0, f_0] × [−f_0, f_0], is given in Fig. 5.30. The shape of Î in this figure is arbitrary.
Figure 5.30 – Spatial and frequency models of a Moiré image. First row: spatial
modeling as the multiplication by a pattern. Second row: frequency modeling as
convolution with set of Dirac distributions. Third row: Image spectrum. Here, P = 8.
/*
  Moiré removal.
  The function header and the direct FFT are a sketch of parts of the listing
  not reproduced here.
*/
CImg<> removeMoire(CImg<>& imgIn, int period)
{
  CImgList<> F_Img = imgIn.get_FFT();
  // Cutoff frequency.
  int Freq_c = imgIn.height()/(2*period);
  // Frequency filtering.
  cimg_forXYC(F_Img[0],x,y,c)
  {
    if (y>Freq_c && y<F_Img[0].height() - Freq_c)
      F_Img[0](x,y,c) = F_Img[1](x,y,c) = 0;
  }
  // Inverse FFT and real part.
  return F_Img.get_FFT(true)[0].normalize(0,255);
}
Figure 5.31 shows an example on a synthetic image. In the original image, half of the lines are cancelled with a pattern of period 6 pixels. The result shows that there is a significant difference with the original image. This can be explained by the fact that the image is not band-limited (the frequency filtering has therefore also removed a part of the useful spectrum).
Figure 5.32 presents the results on a real image extracted from a video sequence
where the lines are interlaced. After separating the odd and even lines, a frequency
filter is applied taking into account that the pattern has a period of 2 pixels.
non-linear character of the diffusion process, thus allowing to obtain locally dif-
ferentiated anisotropic smoothing according to the type of structures encountered.
Diffusion filtering is a process allowing in particular to attenuate the noise of an image,
while preserving the important information, in particular the contours of the objects.
j = −D ∇U,
where
• D is the diffusion tensor, a symmetric and positive definite matrix;
• U(x,t) is the concentration of matter in x at time t, U : Rd × [0; +∞[ → R;
• ∇U is the spatial gradient of the concentration of matter;
• j is the diffusion flux.
D = d I_d (I_d is the identity matrix),   or   D = d (∇U ∇Uᵀ)/‖∇U‖².
In image processing, we can imagine that the concentration U(x, y) at a given point
is given by the gray level I (x, y) of the image at this point, and the initial conditions
of the evolution of the diffusion equation, by the input image that we want to filter.
In this case, the diffusion tensor (or diffusivity) defines how the diffusion should be
done and is not necessarily constant. In practice, there is an advantage in choosing
this tensor as a function of the local characteristics of the image. Two cases can be
considered (Fig. 5.33):
• the linear isotropic diffusion filter, using a constant diffusivity;
• the non-linear anisotropic diffusion filter using a diffusion tensor or a diffusivity
adapted to the local characteristics of the image.
In this case, it is not necessary to use Equation 5.12, since one can directly carry
out a Gaussian filtering by using the CImg<T>::get_blur() method:
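A usage sketch (an assumption, relying on the equivalence σ = √(2t) between the diffusion time t and the Gaussian standard deviation, as used in Fig. 5.37):

float t = 12.5f, sigma = std::sqrt(2*t);   // t = 12.5 gives sigma = 5
CImg<> img("image.bmp"),
       smoothed = img.get_blur(sigma);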
Figure 5.33 – Illustration of the different diffusion filters. Left: in the homogeneous
linear case, we have a constant diffusion. Right: in the anisotropic case, the diffusion
is done parallel to the contours, by exploiting for example the orientation of the spatial
gradient.
Historically, this is the first multi-scale filtering that has been studied. However, if this filtering reduces noise, it also blurs the image and attenuates the contours. Figure 5.34 illustrates the application of a Gaussian filter with increasing diffusion times (i.e., increasing values of the standard deviation σ).
The idea is then to adapt the diffusivity to a local “measure” of the contours (for
example, based on the spatial gradient of the image), leading to non-linear diffusion
filtering techniques.
The idea of Perona and Malik is to smooth the image in the homogeneous areas,
and not to make the image evolve along the contours, or even to enhance them, as we
will see more precisely. The corresponding equation is written:
∂I/∂t (x, y, t) = div( g(‖∇I‖(x, y, t)) ∇I(x, y, t) )

where g is a decreasing function, with g(0) = 1 and lim_{x→∞} g(x) = 0. The equation is therefore close to the heat equation at the points where ‖∇I‖ is close to 0. As an example, we will consider the following function, originally proposed in [32]:

g₁(s) = 1 / (1 + (λs)²)      (5.14)
This function imposes a lower diffusion for higher gradient norms, so the contours are preserved. The parameter λ sets how strongly the diffusion is reduced with respect to the norm of the gradient: a high λ value preserves even contours whose gradients have a lower norm.
DISCRETIZATION SCHEME

The explicit discretization scheme of Perona-Malik's method for all points [i, j] and at time t + Δt is the following:

I_{i,j}^{t+Δt} = I_{i,j}^t + Δt ( c_{E i,j}^t ∇_E I_{i,j}^t + c_{W i,j}^t ∇_W I_{i,j}^t + c_{N i,j}^t ∇_N I_{i,j}^t + c_{S i,j}^t ∇_S I_{i,j}^t )
where N, S, E and W represent the north, south, east and west spatial directions, the symbol ∇ denotes the spatial gradient in the direction indicated by the subscript, and the coefficients c are defined by:

c_{N i,j}^t = g(∇_N I_{i,j}^t)    c_{S i,j}^t = g(∇_S I_{i,j}^t)    c_{E i,j}^t = g(∇_E I_{i,j}^t)    c_{W i,j}^t = g(∇_W I_{i,j}^t)
The directions to the cardinal points are shown in Fig. 5.35.
              I[i, j−1] (N)
I[i−1, j] (W)   I[i, j]   I[i+1, j] (E)
              I[i, j+1] (S)
Figure 5.35 – Relationship between discrete positions (i, j) and spatial directions
North, South, East, and West.
In order to compute the spatial gradients in these directions, we use numerical partial derivatives obtained by left and right finite differences, using the CImg<T>::get_gradient() method:
// Gradient calculated by finite left differences.
CImgList<> gradient_g = img.get_gradient("xy",-1);
// Gradient calculated by finite right differences.
CImgList<> gradient_d = img.get_gradient("xy",1);
The relationships between the spatial gradients in the north, south, east and west
directions and the finite differences are then:
∇_N I[i, j] = I[i, j − 1] − I[i, j] = (−1) × left finite difference along y
∇_S I[i, j] = I[i, j + 1] − I[i, j] = right finite difference along y
∇_W I[i, j] = I[i − 1, j] − I[i, j] = (−1) × left finite difference along x
∇_E I[i, j] = I[i + 1, j] − I[i, j] = right finite difference along x
/*
  2D diffusion filtering (Perona and Malik method). NW/SE (left/right finite
  differences), dt and the update step are a sketch of parts not reproduced here.
*/
CImgList<>
  NW = imgIn.get_gradient("xy",-1),
  SE = imgIn.get_gradient("xy",1);
cimg_forXY(imgOut,x,y)
{
  float
    cN = 1/(1 + cimg::sqr(lambda*NW(1,x,y))),
    cS = 1/(1 + cimg::sqr(lambda*SE(1,x,y))),
    cE = 1/(1 + cimg::sqr(lambda*SE(0,x,y))),
    cW = 1/(1 + cimg::sqr(lambda*NW(0,x,y)));
  imgOut(x,y) = imgIn(x,y) + dt*(cS*SE(1,x,y) - cN*NW(1,x,y) +
                                 cE*SE(0,x,y) - cW*NW(0,x,y));
}
RESULTS

Figure 5.36 presents the result of such a filtering for increasing diffusion times (t₀ < t₁ < t₂ < t₃) and for decreasing parameters of the diffusivity function (λ₀ > λ₁ > λ₂ > λ₃). This expression of the diffusivity is not the only one: for example, Perona and Malik also proposed g₂(s) = e^{−(λs)²}. The parameter λ defines the level of contrast (through the norm of the gradient of the image) which reduces the diffusion phenomenon. This is the non-linear character of this processing. It sets the type of contour preserved during filtering (Fig. 5.36).
Note that it is very easy to generalize Perona and Malik’s method for images with a
higher number of dimensions (e.g., 3D volumetric images or 2D+t video sequences).
The example Code 5.15 illustrates a modification of Code 5.14 for processing several consecutive images from a video sequence. Here, we take into account the temporal variation of light intensities. Algorithmically, this adds terms linked to the gradients along the z axis, which is used to store the different time steps of the sequence.
Figure 5.37 compares results of the linear and non-linear isotropic diffusion
filtering in the case of a video sequence.
/*
  2D+T diffusion filtering (Perona and Malik method).
*/
cimg_forXYZ(seqOut,x,y,t)
{
float
cW = 1/(1 + cimg::sqr(lambda*NWP(0,x,y,t))),
cE = 1/(1 + cimg::sqr(lambda*SEF(0,x,y,t))),
cN = 1/(1 + cimg::sqr(lambda*NWP(1,x,y,t))),
cS = 1/(1 + cimg::sqr(lambda*SEF(1,x,y,t))),
cP = 1/(1 + cimg::sqr(lambda*NWP(2,x,y,t))),
cF = 1/(1 + cimg::sqr(lambda*SEF(2,x,y,t)));
Figure 5.37 – Results of the diffusion filtering. Top left: the noisy input image. Top
right: linear isotropic diffusion filtering with a diffusion time of t = 12.5 s, equivalent
to a Gaussian filtering of standard deviation σ = 5. Bottom left: non-linear isotropic
filtering in two dimensions by the Perona-Malik method. Bottom right: non-linear
isotropic filtering in two dimensions + time by the Perona-Malik method. (Source of
the video sequence: SBI Database [27].)
6. Feature Extraction
Features play a central role in image processing. These numerical values, derived
from direct calculations on the image, allow:
• to represent each pixel i by a vector xi ∈ Rd , each of the components reflecting
a feature relevant to the problem at hand;
• to detect points of interest in the image, for further processing (registration,
matching, . . . ).
We illustrate the first aspect in Sections 7.2 and 6.3. We focus on the second aspect in
the two other sections of this chapter, and we more particularly search for particular
geometric objects (points, shapes) in the image. First, we detail a method dedicated
to the detection of points of interest and more particularly corners. Then we focus
on a method for the detection of parametric shapes in images, allowing to locate for
example lines, circles or ellipses.
6.1.1 Introduction
A corner in an image can be seen as a point for which there are two dominant and dif-
ferent contour directions in a local neighborhood. Even if this definition seems simple,
and even if our visual system is very good at detecting these features, automatic corner
detection is not an easy task for a computer. A good detector must meet many criteria, including a low false-positive rate, robustness to illumination changes, noise and partial occlusions, accurate localization, and an implementation allowing real-time detection.
Numerous corner detection methods have been proposed in the literature and most
of them are based on the following principle: while a contour is classically defined by
a strong gradient in one direction of space and weak in the others (see Section 5.1 of
Chapter 5), a corner is defined by the location of high gradient points along several
directions simultaneously. Many algorithms exploit the first and second derivatives of
the image to perform this detection. We present in the following two of these methods.
For a 2D image I and x = (x, y)> , this sum is computed for a shift δ = (δ x, δ y)>
by
S(x) = ∑_{(x_i, y_i) ∈ W} G_x(x_i, y_i) (I(x_i, y_i) − I(x_i + δx, y_i + δy))²
A corner (or more generally a point of interest) is expected to have a large variation
of S in all spatial directions. The matrix M(x) being real symmetric, its spectral
factorization is Q(x)Λ(x)QT (x), with:
• Λ(x) = diag (λ1 (x), λ2 (x)) the diagonal matrix of eigenvalues, assuming
λ1 (x) ≥ λ2 (x) ≥ 0;
• Q(x) = (q1 (x) q2 (x)) the orthogonal matrix of the eigenvectors.
In the eigenvectors’ basis, with coordinates (X,Y ), the first order approximation of
S is then S(X,Y ) ≈ λ1 (x)X 2 + λ2 (x)Y 2 . Depending on the respective value of these
eigenvalues, three cases have to be considered in the neighborhood W (Fig. 6.1):
1. If λ1 (x) and λ2 (x) are small, S(x) is small in all directions and the image on W
can be considered as constant in terms of intensity;
2. If λ1(x) ≫ λ2(x), a small shift in the direction of q2(x) slightly changes S(x),
whereas a small shift in the direction of q1 (x) makes S(x) change significantly.
An edge therefore passes through the point x;
3. If λ1 (x) and λ2 (x) are large, a small variation of position makes S(x) vary
significantly. The point x is then a point of interest (a corner).
The exact computation of λ1 (x), λ2 (x), x ∈ I can be expensive, and it has been sug-
gested in [16] to use
R(x) = det (M(x)) − k.Tr (M(x))2 = λ1 (x)λ2 (x) − k(λ1 (x) + λ2 (x))2
where k allows to tune the detector sensitivity (k ∈ [0.04, 0.15] empirically gives good
results). Then,
1. on an homogeneous area, R(x) ≈ 0;
2. on an edge, R(x) < 0 and |R(x)| is high;
3. on a point of interest, R(x) > s where s is a (high) threshold value.
It is also possible to detect a pre-defined number n of points of interest, to avoid giving
an arbitrary value to s.
To compute such a detector with CImg, we start by writing a windowing function
W, which we will suppose to be Gaussian of standard deviation sigma.
Code 6.1 – Gaussian window.
CImg<> W(int size, float sigma)
{
  // Sketch of the elided header: centered Gaussian window of given size.
  CImg<> res(size,size);
  float center = (size - 1)/2.0f, sigma2 = cimg::sqr(sigma);
  cimg_forXY(res,i,j)
    res(i,j) = std::exp(-(cimg::sqr(center - i) +
                          cimg::sqr(center - j))/(2*sigma2));
  return res;
}
Figure 6.1 – Classification in the (λ₁, λ₂) plane: flat area when λ₁ and λ₂ are small, edge when λ₁ ≫ λ₂ or λ₂ ≫ λ₁, corner when λ₁ and λ₂ are both high (λ₁ ∼ λ₂).
We then write Code 6.2, which implements the detection algorithm itself. We give
as parameters the k value, the number of points of interest to detect and the variance
of the Gaussian kernel. Figure 6.2 presents two outputs of this algorithm.
/*
  Corner detector using Harris and Stephens algorithm.
*/
// Windowing.
CImg<> G = W(7,sigma);
// Structure tensor.
CImg<>
Ixx = gradXY[0].get_mul(gradXY[0]).get_convolve(G),
Iyy = gradXY[1].get_mul(gradXY[1]).get_convolve(G),
Ixy = gradXY[0].get_mul(gradXY[1]).get_convolve(G);
// R function.
CImg<>
det = Ixx.get_mul(Iyy) - Ixy.get_sqr(),
trace = Ixx + Iyy,
R = det - k*trace.get_sqr();
// Local maxima of R.
CImgList<> imgGradR = R.get_gradient();
CImg_3x3(I,float);
CImg<> harrisValues(imgIn.width()*imgIn.height(),1,1,1,0);
CImg<int>
harrisXY(imgIn.width()*imgIn.height(),2,1,1,0),
perm(imgIn.width()*imgIn.height(),1,1,1,0);
int nbHarris = 0;
cimg_for3x3(R,x,y,0,0,I,float) {
if (imgGradR[0](x,y)<eps && imgGradR[1](x,y)<eps) {
float
befx = Ipc - Icc,
befy = Icp - Icc,
afty = Icn - Icc,
aftx = Inc - Icc;
if (befx<0 && befy<0 && aftx<0 && afty<0) {
harrisValues(nbHarris) = R(x,y);
harrisXY(nbHarris,0) = x;
harrisXY(nbHarris++,1) = y;
}
}
}
// Sorting.
harrisValues.sort(perm,false);
is to use:

(λ₁(x) + λ₂(x))² − (λ₁(x) − λ₂(x))² = 4 λ₁(x) λ₂(x)

and then (λ₁(x) − λ₂(x))² = (λ₁(x) + λ₂(x))² − 4 λ₁(x) λ₂(x), or

λ₁(x) − λ₂(x) = √( Tr²(M(x)) − 4 det(M(x)) )
Knowing the sum and the difference of the eigenvalues, we can then determine the
latter easily for all the pixels of the image:
CImg<>
diff = (trace.get_sqr() - 4*det).sqrt(),
lambda1 = (trace + diff)/2,
lambda2 = (trace - diff)/2,
R = lambda1.min(lambda2);
Suppose that x is a corner, detected in the image I, e.g., using one of the former
detectors, and let Vx denote a neighborhood of size n of this point. Since x is a corner,
any point y ∈ Vx can be of two types: either a point on a homogeneous region, or an
edge point:
• if y is in a homogeneous region, then ∇I(y) = 0 and thus ∇I(y)ᵀ(x − y) = 0;
• if y is on an edge, then x − y follows this edge, and since the gradient is orthogonal to the edge in y, we also have ∇I(y)ᵀ(x − y) = 0.
The idea is then to form and solve, for all y_i ∈ V_x, the linear system of equations ∇I(y_i)ᵀ(x − y_i) = 0. If A is the matrix whose rows are the ∇I(y_i)ᵀ and b is the vector of components ∇I(y_i)ᵀ y_i, the overdetermined system is Ax = b and x is the solution of the system with normal equations AᵀAx = Aᵀb. Since n ≫ 2, A (and thus AᵀA) is most likely of rank 2 and x = (AᵀA)⁻¹Aᵀb. Code 6.3 implements such a sub-pixel corner detection
algorithm.
/*
  Improvement of the position of the points of interest.
*/
// Image gradients.
CImgList<> grad = imgIn.get_gradient();
values of a and b such that “as many” of the possible x = (x, y)> contour points in
I satisfy the equation. This approach in practice is unfeasible, since it is impossible
from a combinatorial point of view to determine the number of points belonging to a
given line.
Figure 6.3 – A set of lines ℓ₁, . . . , ℓ₅ passing through x₀. For each of these lines, the equation y₀ = a_j x₀ + b_j is verified, for a certain pair (a_j, b_j).
Hough adopts another strategy: the algorithm examines all the lines that pass
through a given point x0 = (x0 , y0 )> of I. Each line ` j : y = a j x + b j that passes
through x0 verifies y0 = a j x0 + b j . This equation is underdetermined and the set of
solutions (a j , b j ) is infinite (Fig. 6.3). Note that for a fixed a j , b j = y0 − a j x0 and
a j , b j are now variables, (x0 , y0 ) being fixed parameters. The set of solutions of this
last equation then describes the set of parameters of all lines ` j which pass through x0 .
However, this parametrization poses a problem, since it does not allow to represent
the vertical lines (a = ∞). Moreover, a priori a, b ∈ R and the parameter space Θ is
therefore R2 . For this reason, the Hough transform uses more specifically the Hessian
normal form, or polar representation of a line (Fig. 6.4):

r = x·cos α + y·sin α,   or   y = −(cos α / sin α) x + r / sin α

A line ℓ_j is then described by θ = (r_j, α_j)ᵀ.
Figure 6.4 – Polar representation of a line ℓ_j. r ∈ ℝ is the distance of the line from the origin and α ∈ [0, π[ is the angle that the line makes with the x-axis.
to Θ = R × [0, π[, a point x0 of I being matched to all the lines that can pass through
this point. In the case of the polar parametrization, since these lines can be written as
r(α) = x0 cosα + y0 sinα
the projection of x0 is represented in Θ by a sinusoidal curve (Figs. 6.5a and 6.5b).
When more than one point of I is considered, the result is a set of sinusoidal curves
(Fig. 6.5c) in Θ.
Figure 6.5 – (a) Original image and x₀; (b) set of lines passing through x₀; (c) projection of several points of I into Θ.
Once this matching is done, the Hough algorithm proceeds according to the
following steps:
1. Detect edge points in I: set of points B;
2. Discretize the Hough space Θ into a grid of accumulators;
3. Project points of B into Θ. For each projection, increment the accumulators of
all the lines that pass through this point. The accuracy with which the lines can
be detected is determined by the resolution of the grid in Θ;
4. Threshold the grid: the accumulators with higher values represent the lines
most likely to be present in I. A simple thresholding is often not sufficient
(detection of close but distinct lines possible, since close accumulators can
have approximately the same value). A local analysis of the accumulators must
therefore be defined. For example, for each accumulator, we search if there is a
larger value in its neighborhood. If this is not the case, we leave this value; if
not, we set the accumulator to 0;
5. Convert infinite lines into segments: since Θ = R × [0, π[ no information on the
length of the lines is available in the Hough space. If the objective is to detect
segments, rather than lines, it is necessary to use algorithms in Θ to limit the
length of the detected lines: for example, scan the lines in the edge image of I
to detect segment boundaries, or directly integrate the length constraint in the
Hough transform [13].
IMPLEMENTATION
Code 6.4 details the function to threshold the accumulator grid. Each accumulator
is compared to the values of its neighborhood in a window of size 2*wsize+1 after a
thresholding of value th. The number accnb, as well as the list of positions accXY
are returned by the function. This function is then used in the line detection algorithm
itself (Code 6.5). Figure 6.6 shows the result of the line detection by the Hough
transform.
/*
  Accumulator thresholding.
  Sketch: only the end of the loop is shown; accumulators above the threshold
  th that are local maxima in their (2*wsize+1)-window are stored in accXY.
*/
      accXY(accnb,0) = x;
      accXY(accnb++,1) = y;
    }
  }
/*
  Line detection using the Hough transform.
*/
// Hough space.
cimg_forXY(imgIn,x,y)
{
float
X = (float)x - wx/2,
Y = (float)y - wy/2,
gx = grad(0,x,y),
gy = grad(1,x,y),
theta = std::atan2(gy,gx),
rho = std::sqrt(X*X + Y*Y)*std::cos(std::atan2(Y,X) - theta);
if (rho<0)
{
rho *= -1;
theta += cimg::PI;
}
theta = cimg::mod(theta,thetamax);
acc((int)(theta*acc.width()/thetamax),
(int)(rho*acc.height()/rhomax)) += (float)std::sqrt(gx*gx
+ gy*gy);
}
// Line display.
unsigned char col1[3] = { 255,255,0 };
for (unsigned i = 0; i<accNumber; ++i)
{
float
rho = coordinates(i,1)*rhomax/acc.height(),
theta = coordinates(i,0)*thetamax/acc.width(),
x = wx/2 + rho*std::cos(theta),
y = wy/2 + rho*std::sin(theta);
int
x0 = (int)(x + 1000*std::sin(theta)),
y0 = (int)(y - 1000*std::cos(theta)),
x1 = (int)(x - 1000*std::sin(theta)),
y1 = (int)(y + 1000*std::cos(theta));
imgOut.
draw_line(x0,y0,x1,y1,col1,1.0f).
draw_line(x0+1,y0,x1+1,y1,col1,1.0f).
draw_line(x0,y0+1,x1,y1+1,col1,1.0f);
}
return imgOut;
}
The solution, defined if α1 ≠ α2 (i.e., if the two lines are not parallel, which makes sense) is given by
\[
\mathbf{x}_i = \frac{1}{\sin(\alpha_2 - \alpha_1)}
\begin{pmatrix} r_1\sin(\alpha_2) - r_2\sin(\alpha_1) \\ r_2\cos(\alpha_1) - r_1\cos(\alpha_2) \end{pmatrix}
\]
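As an illustration, a minimal sketch of this computation (an assumption, not one of the book's listings), returning the intersection point as a CImg vector:

// Intersection of two lines given in polar form (r1,alpha1) and (r2,alpha2),
// assuming they are not parallel (alpha1 != alpha2).
CImg<> intersection(float r1, float alpha1, float r2, float alpha2)
{
  float d = std::sin(alpha2 - alpha1);
  return CImg<>::vector((r1*std::sin(alpha2) - r2*std::sin(alpha1))/d,
                        (r2*std::cos(alpha1) - r1*std::cos(alpha2))/d);
}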
The principle of the Hough transform can be applied to the detection of circles in images. A circle can be parameterized in a three-dimensional Θ space by (Fig. 6.8a):
(x − x0)² + (y − y0)² = r²
and then
x0 = x − r cos α and y0 = y − r sin α
Given the gradient orientation φ at point x = (x, y)^⊤, the parameter r can be eliminated to lead to
y0 = x0 tan φ − x tan φ + y
/*
  Accumulator for circle detection.
*/
// Gradient
CImgList<> imgGrad = ImgIn.get_gradient();
imgGrad[0].blur(a);
imgGrad[1].blur(a);
cimg_forXY(ImgIn,x,y)
{
float
gx = imgGrad(0,x,y),
gy = imgGrad(1,x,y),
norm = std::sqrt(gx*gx + gy*gy);
if (norm>s)
{
cimg_forZ(Acc,r)
{
// Center in the direction of the gradient
int
xc = (int)(x + (r + Rmin)*gx/norm),
yc = (int)(y + (r + Rmin)*gy/norm);
// Voting scheme
if (xc>=0 && xc<Acc.width() && yc>=0 && yc<Acc.height())
Acc(xc, yc,r) += norm;
Figure 6.7 – Detection of circles by Hough transform. (a) shows the result for r = 17.
ELLIPSES
In the case of ellipse detection, five parameters are a priori needed (Fig. 6.8b): the
center O, the half-lengths of the axis ra , rb and the orientation α of the major axis with
respect to the X axis. The Θ space is therefore a 5D space and it must be discretized
in a sufficiently fine way to be able to detect ellipses in I. A quick calculation gives an idea of the memory occupation of such an algorithm: with a grid of resolution 128 along each dimension, Θ has to be discretized into 2^35 accumulators which, if coded on long integers, require no less than 128 GB of memory for their storage.
Figure 6.8 – Parameterization of (a) a circle (center O, radius r) and (b) an ellipse (center O, half-axes ra and rb, orientation α of the major axis).
In the literature, many works try to describe textures by their frequency and/or spatial
aspects (e.g., Gabor filters or wavelets) or statistical aspects (Markov field models).
We propose here to describe simple methods that compute, for each pixel of an image, quantities characterizing the underlying texture. These quantities are then used locally to segment textures (by a region segmentation method using them as features, see Section 7.2), or computed on the whole image to allow Content-Based Image Retrieval (CBIR).
Texture means elementary spatial pattern, and these methods are therefore based
on indices computed in spatial neighborhoods of each pixel.
The vector (E1 · · · E8)^⊤ gives a texture unit. Each Ei can take 3 values, so there are 3^8 = 6561 possible texture units. We choose to identify a given unit by its representation in base 3: N(x) = ∑_{i=1}^{8} E_i 3^{i−1}, and we assign this value to the central pixel (Code 6.7). The distribution S of the N(x)'s is called the texture spectrum of I.
The parameter τ allows to encode the notion of homogeneity in an area, in a different
way than the equality I(x) = I(xi ) (Fig. 6.9).
N(x,y) = E(0);
for (int j = 1; j<8; ++j)
N(x,y) += E(j)*pow(3,j);
}
}
return N;
}
Figure 6.9 – From left to right: original image, N with τ = 5, N with τ = 30.
CONTRAST
The contrast measures the distribution of gray levels in the image. It is defined as the ratio between the standard deviation and the kurtosis (raised to a power n, a parameter whose usual value is around 0.25) of the empirical distribution (histogram, see Chapter 3) of the gray levels (Code 6.8).
/*
  Tamura's contrast.
*/
{
CImg<> h = imgIn.get_histogram(nbins);
float
mean = h.mean(),
variance = h.variance(),
kurtosis = 0;
cimg_forX(h,x)
kurtosis += cimg::sqr(cimg::sqr((h(x) - mean)));
kurtosis /= (nbins*cimg::sqr(variance));
return std::sqrt(variance)/std::pow(kurtosis,n);
}
COARSENESS
For an efficient implementation, we use the integral image (an image of the same size as the original, each pixel containing the sum of the pixels located above and to the left of this point). Code 6.9 computes the mean at the point (x, y) for a neighborhood of size 2^k.
/*
  Integral image of a point.
*/
float
l1 = startx - 1<0 ? 0 : imgInt(startx - 1,stopy,0),
l2 = starty - 1<0 ? 0 : imgInt(stopx,starty - 1,0),
l3 = starty - 1<0 || startx - 1<0 ? 0 :
imgInt(startx - 1,starty - 1,0),
l4 = imgInt(stopx,stopy,0);
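The integral image imgInt used above can itself be obtained, for instance, with CImg's cumulative sums (a minimal sketch, assuming imgIn is the input gray-level image):

// Integral image: cumulative sums along x then y, so that imgInt(x,y)
// contains the sum of imgIn over the rectangle [0,x] x [0,y].
CImg<> imgInt = imgIn.get_cumulate("xy");

The local mean then follows from the four corner values l1, l2, l3 and l4 divided by the area of the neighborhood, as in Code 6.9.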
2. For each k ∈ [[1, 5]] absolute differences Ek (x) between pairs of non-overlapping
neighborhood means in the horizontal and vertical directions are then computed
(Code 6.11):
/*
Absolute difference computation
Ak : Image of local means
Ekh,Ekv : Images of horizontal and vertical differences
the k scale is encoded in z
*/
void ComputeE(CImg<>& Ak, CImg<>& Ekh, CImg<>& Ekv)
{
int kp = 1;
cimg_forZ(Ekh,k)
{
int k2 = kp;
kp *= 2;
cimg_forXY(Ekh,x,y)
{
int
posx1 = x + k2,
posx2 = x - k2,
posy1 = y + k2,
posy2 = y - k2;
      if (posx1<Ak.width() && posx2>=0)
        Ekh(x,y,k) = cimg::abs(Ak(posx1,y,k) - Ak(posx2,y,k));
      // Vertical differences (symmetric to the horizontal case).
      if (posy1<Ak.height() && posy2>=0)
        Ekv(x,y,k) = cimg::abs(Ak(x,posy1,k) - Ak(x,posy2,k));
    }
  }
}
3. We look for the value of k maximizing these absolute differences in one or the other direction. We then note S(x) = 2^k and we deduce the coarseness of I by averaging S over the whole image (Code 6.12, a single coarseness value for the texture present in the image) or by studying the histogram of S (several possible coarseness values for the texture).
/*
  Tamura's coarseness.
*/
cimg_forXY(Ekh,x,y)
{
float maxE = 0;
int maxk = 0;
cimg_forZ(Ekh,k)
if (std::max(Ekh(x,y,k),Ekv(x,y,k))>maxE)
{
maxE = std::max(Ekh(x,y,k),Ekv(x,y,k));
maxk = k + 1;
}
sum += pow(2,maxk);
}
return sum/(Ekh.width()*Ekh.height());
}
DIRECTIONALITY
where n is the number of peaks, φ_p is the position of the p-th peak, w_p is the phase range of the peak, and r is a normalizing factor related to the angular discretization of the phase space. Code 6.13 implements this feature when w_p covers the whole phase space.
CImg<> h = phi.get_histogram(100);
h /= (imgIn.width()*imgIn.height());
float D = 0;
for (int p = 0; p<nb_pics; ++p)
cimg_forX(h,x) D -= h(x)*(cimg::sqr(x - perm(p)));
float r = 1;
D *= r*nb_pics;
return D + 1;
}
\[
\mathrm{LBP}_{p,R} =
\begin{cases}
\displaystyle\sum_{i=0}^{p-1} \sigma\big(I(\mathbf{x}_i) - I(\mathbf{x})\big) & \text{if } U(\mathrm{LBP}_{p,R}) \le 2\\[6pt]
p+1 & \text{otherwise}
\end{cases}
\]
where the uniformity function is
\[
U(\mathrm{LBP}_{p,R}) = \big|\sigma(I(\mathbf{x}_{p-1}) - I(\mathbf{x})) - \sigma(I(\mathbf{x}_0) - I(\mathbf{x}))\big| + \sum_{i=1}^{p-1} \big|\sigma(I(\mathbf{x}_i) - I(\mathbf{x})) - \sigma(I(\mathbf{x}_{i-1}) - I(\mathbf{x}))\big|
\]
and σ(·) is the sign function. The uniformity function U(LBP_{p,R}) corresponds to the
number of spatial transitions in the neighborhood of x: the larger it is, the more spatial transitions occur. While LBP_{p,R} captures the spatial structure of the texture, it does not express its contrast. We therefore add an additional feature
\[
C_{p,R} = \frac{1}{p}\sum_{i=0}^{p-1}\big(I(\mathbf{x}_i) - \bar{I}\big)^2
\quad\text{where}\quad
\bar{I} = \frac{1}{p}\sum_{i=0}^{p-1} I(\mathbf{x}_i)
\]
A texture can then be characterized by the joint distribution of LBPp,R and Cp,R .
Code 6.14 implements this version of LBP and Fig. 6.10 shows a result on a grass-like
texture.
/*
  LBP and contrast - rotation invariant version.
*/
cimg_for_insideXY(imgIn,x,y,(int)(R + 1))
{
float
Ibar = 0,
Vc = imgIn(x,y);
// Computing U.
float U = 0;
for (int n = 1; n<p; ++n)
{
float Vj = imgIn.linear_atXY(xi(n - 1,0),xi(n - 1,1));
    U += cimg::abs((V(n) - Vc>0 ? 1 : 0) -
                   (Vj - Vc>0 ? 1 : 0));
}
float
Vi = imgIn.linear_atXY(xi(p - 1,0),xi(p - 1,1)),
Vj = imgIn.linear_atXY(xi(0,0),xi(0,1));
U += cimg::abs((Vi - Vc>0 ? 1 : 0) -
(Vj - Vc>0 ? 1 : 0));
if (U>2) lbp(x,y) = p + 1;
else
cimg_forX(V,n)
lbp(x,y) += (V(n) - Vc>0 ? 1 : 0);
cimg_forX(V,n)
C(x,y) += cimg::sqr(V(n) - Ibar);
C(x,y) /= p;
}
}
Figure 6.10 – LBP and contrast image for R=2 and p=20.
6.3.4 Application
Section 7.2 will focus on the creation of local features for the segmentation of an
image into regions. We propose here the use of these texture features for image search
in large databases. To be more precise, we propose to search in an image database the
“closest” image to a given target image. Many methods exist, and we implement in the
following a method based on [2]. The idea is simple and is specified in Algorithm 3:
the image I is cut into patches (i.e., small rectangular thumbnails of the same size), on
which the LBP are computed, from which we then draw the distributions. The set of
distributions (histograms) is then concatenated to form a global feature of I.
The interest of representing the image by the concatenated histogram is that it offers three levels of representation:
• at the pixel level, through the LBP coefficients;
• at the regional level, through the histograms of the patches;
• at the global level, through their concatenation.
The algorithm depends on a metric between histograms, and we choose in the following
the simplest one, the L1 norm of the difference:
\[
d(h_I, h_J) = \sum_{k=1}^{\text{nbins}} |h_I(k) - h_J(k)|
\]
Code 6.15 implements the first part of the algorithm (and uses Code 6.14).
// Concatenated histogram.
CImg<> hglobal(nbins*nbX*nbY);
for (int i = 0; i<nbX; ++i)
{
for (int j = 0; j<nbY; ++j)
{
CImg<>
patch = imgIn.get_crop(i*dx/nbX,j*dy/nbY,0,0,
(i + 1)*dx/nbX,(j + 1)*dy/nbY,0,0),
lbp(patch.width(),patch.height(),1,1,0),
C(lbp);
LBP(patch,R,p,lbp,C);
CImg<> hlbp = lbp.get_histogram(nbins);
cimg_forX(hlbp,x) hglobal((j + i*nbY)*nbins + x) = hlbp(x);
}
}
return hglobal;
}
Computing d is simple:
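For instance, with the two concatenated histograms stored as CImg<> vectors of the same size, a minimal sketch (an assumption, not one of the book's listings) is:

// L1 distance between two concatenated LBP histograms.
float distanceL1(const CImg<>& hI, const CImg<>& hJ)
{
  return (float)((hI - hJ).abs().sum());
}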
Results (Fig. 6.11) are presented using UTK1 dataset. This database contains
20,000 images of people, from 0 to 116 years old, covering a wide range of poses,
expressions and lighting conditions. All ethnicities and genders are represented.
Figure 6.11 – Results of the CBIR algorithm using LBP histograms. On the first and
third columns are the query images, on the second and fourth are the closest images in
the database.
1 https://www.kaggle.com/abhikjha/utk-face-cropped
7. Segmentation
the contour to move. γ is thus a function of time, γ(t). The objective is to make the contour converge toward the structures of interest that we wish to extract. This method has the advantage, among others, of controlling the shape of the solution curve, which can be forced to be closed using an adapted contour representation. The principle of active contours is illustrated in Fig. 7.1. The method is initialized by defining an initial contour γ(t = 0), to which we associate an energy that is then minimized. The convergence of the motion of the curve yields a segmentation of the image. The rest of this study focuses on the so-called implicit representation of active contours.
Figure 7.1 – Presentation of the active contour method. The iterative minimization of the energy associated with a contour γ makes γ move.
This implicit representation also handles curves whose topology can even vary during the convergence (by separation or fusion of curves).
From a chronological point of view, the original active contour model was proposed by Kass et al. [22] and consists of an intrinsically closed deformable model that evolves toward the boundaries of the desired region. The deformation is based on a Lagrangian formulation of energy minimization, expressed as the sum of a term related to the image data and a regularization term. This method does not allow simple topological changes, and the value of the energy depends strongly on the parametrization of the curve.
Many other models deriving from geometric and geodesic approaches have been
proposed. Rather than presenting a catalog of the different variants described in
the literature, the following paragraph unifies them in a single energy minimization
formalism. The link between energy decay and velocity field generating a displacement
is also presented, in order to have both an energetic and a geometrical approach.
The main disadvantage of an implicit representation is a higher computation time. In this section, we will not discuss the fast algorithms for solving implicit active contour methods, which exist but are more complex to implement.
GENERAL PRINCIPLE
object to be segmented Ri . The problem is therefore to find the vector p which verifies
Rγ = Ri . In order to constrain the progression of the model, an energy E is associated
to it. It is composed of the sum of a data attachment term and a regularization term.
The system minimizes E by converting it into kinetic energy until a stopping criterion
is verified. This energy problem can be formulated by a force balance, corresponding
to a numerical gradient descent problem:
\[
\frac{\partial \gamma(p,t)}{\partial t} = F
\quad\text{with}\quad
F = \frac{\delta E}{\delta \gamma}, \qquad \gamma(p, t=0) = \gamma_0(p)
\]
In this model, three degrees of freedom are left to the user:
• the representation of the curve (p), which can be explicit, parametric or implicit;
• the definition of the energy function (E or F by calculation), which is based on
geometrical constraints and/or data extracted from the image (gradient, texture
parameters, temporal information . . . ) ;
• the initialization of the method (γ (p,t = 0)) which influences the solution found
because of the possible non-convexity of the energy function.
In most cases, the energy attached to the model can be described as a combination of
integral, curvilinear and/or surface functions. The partial differential equation for the
evolution of the γ contour can be deduced mathematically by calculus of variations.
ENERGY FORMULATION
This contour energy can be expressed as a simple integral along the contour of a
function f depending on the characteristics of the image (boundaries for example):
\[
E_b = \int_{\gamma(p)} f(m)\, da(m)
\]
In [6], the authors have shown that this form of energy leads to the velocity field
\[
F_b = \big(f(m)\,\kappa + \nabla f(m)\cdot n\big)\, n
\]
where n is the unit normal vector, κ the Euclidean curvature and ∇ the gradient operator.
In the literature, the special case of independence with respect to the region considered, i.e., g(m, R_γ) = g(m), is widely used:
\[
F_r = g(m)\, n
\]
Regularization energies
• When f (m) = 1, the energy is equivalent to the Euclidean length of the curve γ.
This case is widely used for its regularization properties and leads to the field:
Fb = κn
• When g(m) = µ is a constant, the region energy is proportional to the area of R_γ and leads to the field:
F_r = µn
Each point on the contour experiences a constant force in the normal direction
of the contour, which is equivalent to applying a homothetic motion to the shape.
This energy is commonly used to push the curve toward the valleys of potential.
The information from the image can be derived from data extracted along the γ
contour (curvilinear integral) or within Rγ (surface integral). The associated energies
are called “boundary” energy or “region” energy, respectively.
For the boundary term, the function f usually defined in the curvilinear integral
is called the stopping function. It depends on a derivative operator, for example the
norm of the gradient, which has a maximum response on the amplitude jumps in the
image; it decreases when the operator increases and theoretically tends toward zero
when it tends to infinity. In practice, the following form is commonly used:
\[
f(m) = \frac{1}{1 + \|\nabla \hat{I}(m)\|^{p}}
\]
One of the main advantages of the implicit contour description is that it intrinsically handles topological changes during convergence. For example, if a segmentation process is initialized with multiple seeds, collisions and merges of connected components are automatically managed without any modification of the algorithm, which is not the case, for example, with active contours using an explicit representation. The counterpart of this flexibility is an additional cost in computation time, which requires in practice the use of optimized algorithms [37].
In the level set formalism, the γ curve is not parametrized, but implicitly defined
through a higher dimensional function:
\[
\psi : \mathbb{R}^2 \times \mathbb{R} \to \mathbb{R}, \qquad (m, t) \mapsto \psi(m,t)
\]
The 2D front at time t is then defined as the zero isolevel of the function ψ at this time:
\[
\gamma(t) = \{\, m \in \mathbb{R}^2 \;|\; \psi(m,t) = 0 \,\}
\]
Code 7.1 extracts the contour γ from an implicit function ψ, by looking for the zero crossings of ψ.
/*
  ExtractContour : Calculation of the approximate position of the
  contour from the levelset map (psi): Isocontour with value 0.
*/
return Contour;
}
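As an illustration, a minimal zero-crossing test (an assumption, not the book's Code 7.1) could look like:

// Mark the pixels where psi changes sign with one of its right/bottom
// neighbors, i.e., an approximate position of the isocontour psi = 0.
CImg<unsigned char> Contour(psi.width(),psi.height(),1,1,0);
cimg_for_insideXY(psi,x,y,1)
  if (psi(x,y)*psi(x + 1,y)<0 || psi(x,y)*psi(x,y + 1)<0)
    Contour(x,y) = 1;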
Figure 7.2 – Implicit representation of a curve γ in the form of a level set: γ = ψ −1 (0).
In this example, the contour is a circle so the function ψ is a cone.
What makes this formalism interesting is that the normal n at each point of the
contour γ, as well as its curvature κ, are easily computable using differential operators
applied to the implicit map ψ:
\[
n = \frac{\nabla\psi}{\|\nabla\psi\|}
\quad\text{and}\quad
\kappa = \nabla\cdot\frac{\nabla\psi}{\|\nabla\psi\|},
\]
/*
  InitLevelSet: Initialization of the LevelSet (psi) using the
  signed euclidean distance. The initial contour is a circle of
  center (x0,y0) and radius r.
*/
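A minimal implementation consistent with this description (a sketch; the sign convention, negative inside the circle, is an assumption):

void InitLevelSet(CImg<>& psi, int x0, int y0, float r)
{
  // Signed euclidean distance to the circle of center (x0,y0) and radius r.
  cimg_forXY(psi,x,y)
    psi(x,y) = std::sqrt((float)(cimg::sqr(x - x0) + cimg::sqr(y - y0))) - r;
}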
In order to extract the desired region Ri from the image, it is necessary to construct
a velocity field F adapted from various sources. As an example, in the level set
formulation, the geodesic model of Caselles et al. proposes the equation:
\[
\frac{\partial \psi(m,t)}{\partial t} = f(m)\,\|\nabla\psi(m,t)\|\,\kappa + \nabla f(m)\cdot\nabla\psi(m,t).
\]
Note that whatever the rate of evolution F, it is important that the implicit representa-
tion ψ keeps its property k∇ψ(m)k = 1 during its evolution, so that the equivalence
(Equation 7.2) between the evolution of ψ and the γ curve it represents remains valid.
In practice, a normalization of the ψ gradients is performed at regular time intervals.
This normalization is not algorithmically simple to achieve, and we will not go into
the theoretical details of its implementation.
\[
\psi_{ij}^{n+1} = \psi_{ij}^{n} + \Delta t\; F_{ij}^{n}\,\|\nabla\psi_{ij}^{n}\|
\]
A complete first-order convex numerical scheme can be found in [37] for estimating
the spatial derivatives of the previous equation when the velocity function can be
written as:
F = Fprop + Fcurv + Fadv = F0 − εκ + U (m,t) · n
where Fprop = F0 is a propagation velocity, Fcurv = −εκ is a curvature-dependent veloc-
ity term and Fadv = U (m,t)·n is an advection term where U (m,t) = (u (m,t) , v (m,t))> .
The numerical calculation of the boundary and region velocity terms uses this scheme, because the two terms of the geodesic velocity can be identified with F_curv and F_adv, while the region velocity field is analogous to F_prop:
\[
\psi_{ij}^{n+1} = \psi_{ij}^{n} + \Delta t
\left[
\begin{array}{l}
-\Big( \max(F_{0\,ij},0)\,\nabla^{+}\psi_{ij}^{n} + \min(F_{0\,ij},0)\,\nabla^{-}\psi_{ij}^{n} \Big)\\[6pt]
+\; \varepsilon\,\kappa_{ij}^{n}\,\sqrt{\big(D^{0x}\psi_{ij}^{n}\big)^{2} + \big(D^{0y}\psi_{ij}^{n}\big)^{2}}\\[6pt]
-\Big( \max(u_{ij}^{n},0)\,D^{-x}\psi_{ij}^{n} + \min(u_{ij}^{n},0)\,D^{+x}\psi_{ij}^{n}
      + \max(v_{ij}^{n},0)\,D^{-y}\psi_{ij}^{n} + \min(v_{ij}^{n},0)\,D^{+y}\psi_{ij}^{n} \Big)
\end{array}
\right]
\]
where min(•, •) and max(•, •) correspond to the minimum and maximum operators, and D^{+β}(•), D^{−β}(•) and D^{0β}(•) are respectively the right, left and centered finite difference derivatives with respect to the variable β. Finally, ∇^{+}(•) and ∇^{−}(•) are the following derivative operators:
\[
\nabla^{+}(\bullet) = \sqrt{\max(D^{-x}(\bullet),0)^{2} + \min(D^{+x}(\bullet),0)^{2} + \max(D^{-y}(\bullet),0)^{2} + \min(D^{+y}(\bullet),0)^{2}}\,,
\]
\[
\nabla^{-}(\bullet) = \sqrt{\max(D^{+x}(\bullet),0)^{2} + \min(D^{-x}(\bullet),0)^{2} + \max(D^{+y}(\bullet),0)^{2} + \min(D^{-y}(\bullet),0)^{2}}\,.
\]
There are also higher order schemes that apply to convex or non-convex functionals to
iteratively solve the problem. However, since the application of image segmentation
does not require a high accuracy in the resolution of the contour propagation, these
schemes are not necessary in practice.
The sign of the data-driven function will cause the contour to expand or contract. Algorithmically, we can thus force the contour to expand or to contract by fixing the sign of this function. Code 7.3 sets up the iterative scheme computing the propagation of the contour with a propagation speed term and an advection speed term.
Note that the unit normalization step of the ψ gradient norms is performed every
20 iterations, by the call to CImg<T>::distance_eikonal():
CImg<T>& distance_eikonal(unsigned int nb_iterations, float
band_size=0, float time_step=0.5f)
/*
  Propagate : Propagation algorithm of an implicit contour
  (geodesic model).
*/
cimg_forXY(LevelSet,x,y)
{
float
Dxm = GradLS_moins(0,x,y), Dxp = GradLS_plus(0,x,y),
Dym = GradLS_moins(1,x,y), Dyp = GradLS_plus(1,x,y);
}
if (!(iter%20)) LevelSet.distance_eikonal(10,3);
}
}
Figure 7.3 – Region-based approach (synthetic case). (a) original image, (b) using
gradient, (c) using local variance.
A feature is a quantity (most often a vector x_i ∈ ℝ^d) attached to each pixel i of the image. The computation of features consists in characterizing the pixel by values relative to its intensity (Chapter 3), its gradient (Section 5.1), attributes of geometric type (Section 6.1), texture (Section 6.3) or any other value depending on the application. Several types of features can coexist in the vector.
If some knowledge about the classes we are looking for is available (in particular the probability of occurrence of a gray level for each class, and a priori probabilities of the classes), then it is possible to use Bayesian decision theory and choose thresholds which minimize the cost of decision errors (Bayesian thresholding, or Neyman-Pearson for example). If not, we look for the threshold(s) (in the case of several regions or classes) from an analysis of the histogram:
from an analysis of the histogram:
• Direct thresholding: thresholds are automatically derived from statistical calcu-
lations on the gray levels histogram (modes, zero crossings, maximum entropy,
maximum variance, conservation of moments, etc.) and a priori constraints on
the number of classes;
• Adaptive thresholding, which uses the previous computation techniques, but
with a focus on the local study of the criteria, i.e., on sub-regions (managing the
size of the regions, overlaps);
• Hysteresis thresholding, where four threshold values are defined interactively or automatically, determining three classes of pixels: rejected pixels (range s1–s2), accepted pixels (range s3–s4), and candidate pixels (range s2–s3). A connectivity test between the sets (s2–s3) and (s3–s4) validates or invalidates the candidate pixels: (s3–s4) is used as a marker for the reconstruction of (s2–s3). It is also possible to control the extension of the reconstruction of (s2–s3) by geodesic reconstruction.
OTSU ALGORITHM
where h is the histogram of I. Means and variances of both classes are then:
\[
\mu_0(s) = \frac{1}{n_0(s)}\sum_{g=0}^{s} g\,h(g)
\ ;\quad
\mu_1(s) = \frac{1}{n_1(s)}\sum_{g=s+1}^{M-1} g\,h(g)
\]
and
\[
\sigma_0^2(s) = \frac{1}{n_0(s)}\sum_{g=0}^{s} \big(g - \mu_0(s)\big)^2 h(g)
\ ;\quad
\sigma_1^2(s) = \frac{1}{n_1(s)}\sum_{g=s+1}^{M-1} \big(g - \mu_1(s)\big)^2 h(g).
\]
The sum of intra-class variances is then:
\[
\sigma_w^2(s) = P_0(s)\,\sigma_0^2(s) + P_1(s)\,\sigma_1^2(s)
= \frac{n_0(s)\,\sigma_0^2(s) + n_1(s)\,\sigma_1^2(s)}{n_x \times n_y}
\]
with
\[
P_0(s) = \frac{1}{n_x \times n_y}\sum_{g=0}^{s} h(g)
\quad\text{and}\quad
P_1(s) = \frac{1}{n_x \times n_y}\sum_{g=s+1}^{M-1} h(g).
\]
Similarly, the inter-class variance is calculated by:
\[
\sigma_b^2(s) = P_0(s)\,\big(\mu_0(s) - \mu_I\big)^2 + P_1(s)\,\big(\mu_1(s) - \mu_I\big)^2
\]
where μ_I is the mean gray level of I.
The total variance of I is the sum of σ_b²(s) and σ_w²(s). Since it is constant for a given I, the threshold s can be found either by minimizing σ_w²(s), or by maximizing σ_b²(s). This latter choice is preferred since it depends only on first-order statistics. Rewriting σ_b²(s) using the expression of μ_I gives:
\[
\sigma_b^2(s) = P_0(s)\,P_1(s)\,\big(\mu_0(s) - \mu_1(s)\big)^2
\]
Code 7.4 implements the Otsu algorithm and Fig. 7.4 shows some results of the
algorithm.
/*
  Otsu algorithm.
*/
{
if (i<nb_levels - 1)
// If i==nb_levels - 1, all the pixels belong to class 0.
{
n0 += hist[i];
n1 = imgIn.size() - n0;
if (n0*n1>0)
{
float sigmaB = n0*n1*cimg::sqr(mu0 - mu1)/
cimg::sqr(imgIn.size());
if (sigmaB>sigmaBmax)
{
sigmaBmax = sigmaB;
th = i;
}
}
}
}
return (float)th;
}
The Bernsen algorithm [4] computes for each pixel (x, y) a threshold based on the minimum and maximum intensities in a neighborhood V(x, y). If the local contrast c(x, y) = Imax(x, y) − Imin(x, y) is greater than a limit cmin, the threshold is simply defined as the mean s(x, y) = (Imax(x, y) + Imin(x, y))/2. Otherwise, the pixels of V(x, y) are all considered to belong to the same region. Code 7.5 implements this algorithm in a very efficient way, thanks to the neighborhood structures of CImg.
/*
  Bernsen algorithm.
*/
CImg<> N(5,5);
cimg_for5x5(imgIn,x,y,0,0,N,float)
{
float min, max = N.max_min(min);
imgOut(x,y) = max - min>cmin ? (max + min)/2 : valClass;
}
return imgOut;
}
One might think that it is sufficient to enumerate all possible partitions of X and
to select only the best one, but the combinatorial explosion of this approach makes
this implementation impossible. The number of partitions of a set of n elements into g classes, denoted S_n^g, is the Stirling number of the second kind. With S_0^0 = 1 and, for all n > 0, S_n^0 = S_0^n = 0, it satisfies the recurrence S_n^g = S_{n−1}^{g−1} + g S_{n−1}^{g}, and
\[
S_n^g = \frac{1}{g!}\sum_{i=1}^{g} C_g^i\,(-1)^{g-i}\, i^n
\]
so that S_n^g ∼ g^n / g! when n → ∞. Using a computer calculating 10^6 partitions per second, it would take 126,000 years to compute all the partitions of a set with n = 25 elements!
squared distance between each x_i ∈ X and the centroid of its class. Since
\[
c_i = \arg\min_{c} \sum_{x \in P_i} d(x, c)^2,
\]
the optimization bears both on the centroids and on the memberships of the points x to the classes.
• k-medoids objective function: similar to the previous one, this function requires
that the class centers are elements of X.
• k-median objective function: here again, the formulation is the same, the distance
is no longer squared:
\[
f_{k\text{-median}} = \sum_{i=1}^{g} \sum_{x \in P_i} d(x, c_i)
\]
These functions search for class centers ci , and assign each point x j to the nearest
center. Other functions do not use class centers as a goal, such as:
g
• the sum of inter-class distances fSOD = ∑ ∑ d(x, y)
i=1 x,y∈Pi
• MinCut: f_cut = ∑_{i=1}^{g} ∑_{x∈P_i, y∉P_i} W_{x,y}, where W_{x,y} is a similarity measure between x and y.
As an example, we implement below the k-means, the feature space being R2 , where
each vector xi describing the pixel i has for components:
• the mean gray level in a 5×5 neighborhood;
• the variance of gray levels in a 5×5 neighborhood.
Each feature is normalized to avoid differences in the components’ amplitude. Code
7.6 describes how a set of features is computed. Of course, a relevant set of features is
always defined according to the application considered.
CImg<> N(5,5);
cimg_for5x5(imgIn,x,y,0,0,N,float)
{
features(x,y,0) = N.mean();
features(x,y,1) = N.variance();
}
features.get_shared_slice(0).normalize(0,255);
features.get_shared_slice(1).normalize(0,255);
return features;
}
Code 7.7 implements step (i), i.e., the assignment of points to classes. The
Euclidean metric in R2 is used here.
Code 7.7 – Assigning points to classes.
/*
  Assignment of points to classes.
*/
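A minimal sketch of this assignment step (an assumption, not the book's Code 7.7), with d2 the squared Euclidean distance in feature space and F the assignment function called in Code 7.10:

// Squared Euclidean distance between a class center g[i] and the feature
// vector of pixel (x,y) (features stored along the z-axis of data).
float d2(const CImg<>& gi, const CImg<>& data, int x, int y)
{
  float d = 0;
  cimg_forX(gi,dim) d += cimg::sqr(gi(dim) - data(x,y,dim));
  return d;
}

// Assign each pixel to the class of its nearest center.
void F(const CImg<>& data, const CImgList<>& g, CImg<>& label)
{
  cimg_forXY(data,x,y)
  {
    float dmin = d2(g[0],data,x,y);
    int imin = 0;
    for (unsigned int i = 1; i<g.size(); ++i)
    {
      float d = d2(g[i],data,x,y);
      if (d<dmin) { dmin = d; imin = (int)i; }
    }
    label(x,y) = (float)imin;
  }
}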
Code 7.8 implements step (ii), where new class centers are computed.
/*
  Computing class centers.
*/
g[i].fill(0);
cimg_forX(g,i)
g[i] /= npc(i);
}
The stopping criterion can take various forms. Code 7.9 computes the sum of the distances of the feature vectors to their assigned class centers; a small variation of this sum serves as the stopping criterion.
Code 7.9 – Stopping criterion.
/*
  K-means stopping criterion.
*/
cimg_forXY(data,x,y)
d += d2(g[label(x,y)],data,x,y);
return d;
}
/*
  K-means algorithm.
*/
// Class centers.
CImgList<> g(ncl,attributs.depth());
// Initialization.
cimg_forX(g,i)
{
int
x = (int)(rand()%attributs.width()),
y = (int)(rand()%attributs.height());
cimg_forX(g[i],dim)
g[i](dim) = attributs(x,y,dim);
}
// Initial partition.
F(attributs,g,imgOut);
w = W(attributs,g,imgOut);
return imgOut;
}
Figure 7.6 presents results with respect to the number of classes and the feature
space. Three cases are considered:
• D1: x = I(x, y), the pixel is described by its gray level;
• D2: x = (‖∇I(x, y)‖, φ(x, y))^⊤, the pixel is described by the norm and phase of its gradient;
• D3: x = (Ī(x, y), σ²(I(x, y)))^⊤, the pixel is described by the mean and variance of the gray levels in a 5×5 neighborhood.
For each image, the pixel value represents the class to which it has been assigned.
(Image grid: one result per feature space D1, D2, D3 in rows, and per number of classes ncl = 2, 3, 4 in columns.)
Figure 7.6 – k-means results with respect to the number of classes and the feature
space.
When all the regions of an image verify P, we say that the partition verifies P. Of course, a very large number of partitions verify this property (for example, it is enough to subdivide any region of a partition verifying P to obtain a new valid partition), and we know neither how to find all these partitions, nor how to theoretically choose, among the partitions verifying P, the one which best solves the segmentation problem.
Empirical criteria are often used, such as the cardinality of the partition (to be minimized), the size of the smallest region (to be maximized), an inter-region distance, etc. In the absence of a defined strategy, here are some general methods for region transformation.
The core of the SLIC method is based on a k-means classification of all the pixels
of the image (Algorithm 4, Section 7.2.3), with the consideration of an objective
function that depends on both the color difference between the pixels to be classified
and the colors of the centroids, and their respective spatial distance. More precisely,
the authors propose to compute the distance D(x1 , x2 ) between two points x1 and x2
by:
\[
D(\mathbf{x}_1, \mathbf{x}_2) = \sqrt{d_c^2 + m^2\left(\frac{d_s}{S}\right)^{2}}
\quad\text{where}\quad
d_c = \sqrt{(L_1 - L_2)^2 + (a_1 - a_2)^2 + (b_1 - b_2)^2}
\]
and d_s = √((x_1 − x_2)² + (y_1 − y_2)²).
The measure ds corresponds to the spatial distance between the two measured
points, and dc to the color distance, where the color of each point is expressed in
the CIE L∗ a∗ b∗ color space. The two constants S, m ∈ R are user-defined parameters,
which indicate the approximate size of the resulting super-pixels and the regularity of
their shape, respectively.
float D(float x1, float y1, float L1, float a1, float b1,
float x2, float y2, float L2, float a2, float b2,
float S, float m) {
return std::sqrt(cimg::sqr(L1 - L2) +
cimg::sqr(a1 - a2) +
cimg::sqr(b1 - b2) +
m*m/(S*S)*(cimg::sqr(x1 - x2) +
cimg::sqr(y1 - y2)));
}
where the values L1 , a1 , b1 , L2 , a2 , b2 come from the image lab, which is a copy of
the input color image img, expressed in the CIE L∗ a∗ b∗ color space:
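For instance, using CImg's built-in conversion (one possible way to obtain it):

// Copy of the input RGB image converted to the CIE Lab color space.
CImg<> lab = img.get_RGBtoLab();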
CENTROID INITIALIZATION
Unlike the classical k-means algorithm, the position of the centroids ci is not
chosen randomly in the SLIC method, but initialized in multiple steps:
1. Centroids ci are first set at the center of the rectangles that correspond to the
subdivision of the image, as a grid of S × S cells.
2. The position of each ci is then shifted toward the point having minimal gradient,
in the neighborhood of size S × S.
3. The Lab color of this point in the image is finally assigned to the centroid ci .
Each centroid is then defined by five feature values ci = (x, y, L, a, b).
cimg_forXY(centroids,x,y) {
int
xc = x*S + S1,
yc = y*S + S1,
x0 = std::max(xc - S1,0),
y0 = std::max(yc - S1,0),
x1 = std::min(xc + S2,img.width() - 1),
y1 = std::min(yc + S2,img.height() - 1);
centroids.resize(centroids.width()*centroids.height(),1,1,5,-1);
Once the centroids are initialized, it is possible to assign to each pixel of the image
a label i attaching it to the centroid ci . We take advantage of the fact that the centroids
are scattered in the image, not to test the attachment of each pixel to all the existing
centroids, but only to those present in a neighborhood of this pixel (of size 2S × 2S).
This greatly reduces the number of evaluations of the objective function to label these
pixels. However, some pixels may fall outside these neighborhoods; they are then labeled in a second step in a more conventional way, by testing them against all existing centroids (Code 7.13).
Code 7.13 – Assignment of the centroids to the image pixels for SLIC.
cimg_forXY(labels,x,y) if (labels(x,y)>=centroids.width())
{
// The pixel (x,y) is not assigned yet.
float
L = lab(x,y,0),
a = lab(x,y,1),
b = lab(x,y,2);
float distmin = 1e20;
int kmin = 0;
return labels;
}
The resulting label map is a CImg<float> having two channels: the first one
gives the centroid label associated to each pixel. The second one keeps the minimal
cost D obtained for this pixel.
Once the image pixels have been labeled, the positions and colors of each centroid
are updated, using the positions and colors of the pixels that make up the corresponding
class. The residual error E, defined as the L1 norm of the differences between the
old centroids and the updated ones, is calculated and the labeling/updating process is
repeated as long as the centroids change significantly (Code 7.14).
At the end of this loop, we have a stable set of centers for the super-pixels of the
image.
To visualize the resulting partition, we must recompute first the label map from
the positions of the centroids:
CImg<> labels = get_labels(lab,centroids,S,m).channel(0);
Then, we generate a color rendering out of it, as a new image visu, by assigning
to each pixel the average RGB color of its corresponding centroid (color that has been
stored in the channels 2 to 4 of the image centroids):
CImg<unsigned char>
visu = labels.get_map(centroids.get_channels(2,4)).LabtoRGB();
The detection of neighboring pixels having different labels, in the image labels, allows drawing a black border around the contours of the super-pixels in the image visu.
CImg<> N(9);
cimg_for3x3(labels,x,y,0,0,N,float)
if (N[4]!=N[1] || N[4]!=N[3])
visu.fillC(x,y,0,0,0,0);
Finally, we mark the center of each centroid in visu with a semi-transparent red
dot:
unsigned char red[] = { 255,0,0 };
cimg_forX(centroids,k)
visu.draw_circle((int)centroids(k,0),(int)centroids(k,1),
2,red,0.5f);
Figure 7.7 illustrates the partitioning into SLIC super-pixels, obtained from the
800 × 800 kingfisher color image, for two different values of the regularization param-
eter m = 1 and m = 10, and for S = 30.
Figure 7.7 – Results of partitioning a color image with the SLIC algorithm, with
different regularization values m.
8. Motion Estimation
One of the critical points of motion estimation is therefore the search for correspon-
dences in the frames of a sequence. This includes the correspondence between points
(calculated for example by a point detector, cf. Section 6.1), lines or curves (cf.
Section 6.2), or even regions (cf. Section 7.2). This first amounts to estimating a sparse displacement field. Alternatively, these correspondences can also be considered pixelwise, in a dense approach to motion estimation.
Figure 8.1 – Example of optical flow computation. The points located on the rotating tray have zero velocity although they rotate in 3D.
where I(m,t) is the grayscale at the point m = (x, y)> at time t > 0, and v(m,t) =
(u(m,t), v(m,t))> is the velocity vector. Under the assumptions of small displace-
ments and spatiotemporal differentiability of luminance, the luminance conservation
constraint is generally expressed by the first-order constraint of the motion equation
[18]:
I_x u + I_y v + I_t = ∇I^⊤ v + I_t = 0
where Ix , Iy , It represent respectively the horizontal/vertical components of the spatial
gradient of the luminance I and the temporal gradient of the luminance. This equation
only determines the projection of the velocity vector onto the direction of the spatial gradient of the luminance. This projection being locally orthogonal to the photometric boundaries, it is called the normal component of the velocity vector. To find the second, tangential, component, it is necessary to regularize the estimation, i.e., to reduce the space of solutions by introducing an additional constraint.
where F is a real positive function, with value 0 when the optical flow constraints
hold. ∗x (respectively ∗y ) indicates the partial derivative of ∗ along the axis x (resp.
y). The minimization is performed by iterative schemes which are defined from the
Euler-Lagrange equations related to the functional E(v):
\[
\frac{\partial F}{\partial u} - \frac{d}{dx}\frac{\partial F}{\partial u_x} - \frac{d}{dy}\frac{\partial F}{\partial u_y} = 0
\quad\text{and}\quad
\frac{\partial F}{\partial v} - \frac{d}{dx}\frac{\partial F}{\partial v_x} - \frac{d}{dy}\frac{\partial F}{\partial v_y} = 0
\]
Horn and Schunck [18] combine the luminance conservation equation with a
global regularization to estimate the velocity field v, minimizing:
\[
\int_{D} \big(\nabla I^{\top} v + I_t\big)^2 + \alpha\big(\|\nabla u\|^2 + \|\nabla v\|^2\big)\, dm
\]
defined on the domain D, where the constant α adjusts the influence of the regularization term. It is thus a special case of the functional (8.1), with F = (∇I^⊤ v + I_t)² + α(‖∇u‖² + ‖∇v‖²).
Note also that Lucas and Kanade [24] propose a similar minimization, in a discretized
and localized way, which will be discussed in Section 8.1.2.
Horn and Schunck propose to rewrite these equations with a discrete approximation of the Laplacians, ∇²u = 4(ū − u) and ∇²v = 4(v̄ − v), with ū and v̄ being spatially averaged versions of u and v. This leads to a linear system of two equations with two unknowns u and v, at each point (x, y) of the field v:
\[
\begin{cases}
(I_x^2 + 4\alpha)\,u + I_x I_y\, v = 4\alpha\,\bar{u} - I_x I_t\\[4pt]
I_x I_y\, u + (I_y^2 + 4\alpha)\,v = 4\alpha\,\bar{v} - I_y I_t
\end{cases}
\]
Solving this linear system gives the following expressions for u and v:
\[
u = \bar{u} - \frac{I_x\,[\,I_x\bar{u} + I_y\bar{v} + I_t\,]}{I_x^2 + I_y^2 + 4\alpha}
\quad\text{and}\quad
v = \bar{v} - \frac{I_y\,[\,I_x\bar{u} + I_y\bar{v} + I_t\,]}{I_x^2 + I_y^2 + 4\alpha}
\]
Code 8.1 proposes the implementation of this method. The input of the function
is the initial estimate V of the displacement field (here, an image with two channels,
with values set to 0), as well as a sequence seq of two images stacked together
along the z-axis, in order to be able to easily estimate the temporal gradient with the
function CImg<T>::get_gradient(). As an output, the image V is filled with
the estimated displacement field.
Code 8.1 – Horn and Schunck method for estimating the displacement field.
/*
Horn and Schunck method
V : Displacement field
seq : Sequence of two images, stacked along z
nb_iters: Number of iterations for the numerical scheme
alpha : Regularization weight for the displacement field
*/
void HornSchunck(CImg<>& V, CImg<>& seq,
unsigned int nb_iters, float alpha)
{
// Compute the gradient along the axes ’x’,’y’ and ’t’.
CImgList<> grad = (seq.get_slice(0).get_gradient("xy"),
seq.get_gradient("z",1));
// Iteration loop.
for (unsigned int iter = 0; iter<nb_iters; ++iter)
{
CImg<> Vavg = V.get_convolve(avg_kernel);
cimg_forXY(V,x,y) {
float tmp = (grad[0](x,y)*Vavg(x,y,0) +
grad[1](x,y)*Vavg(x,y,1) +
grad[2](x,y))/denom(x,y);
V(x,y,0) = Vavg(x,y,0) - grad[0](x,y)*tmp;
V(x,y,1) = Vavg(x,y,1) - grad[1](x,y)*tmp;
}
}
}
DIRECT METHOD
A variant of the Horn and Schunck algorithm consists in solving the direct problem,
rather than the linearized problem, i.e., minimizing the functional:
\[
\int_{D} \big(I(x+u, y+v, t) - I(x, y, t+dt)\big)^2 + \alpha\big(\|\nabla u\|^2 + \|\nabla v\|^2\big)\, dm
\]
The associated Euler-Lagrange equations define the gradient descent which minimizes
this functional:
\[
\frac{\partial u}{\partial k} = I_x(x+u, y+v, t)\,\delta I + \alpha\nabla^2 u,
\qquad
\frac{\partial v}{\partial k} = I_y(x+u, y+v, t)\,\delta I + \alpha\nabla^2 v
\]
/*
Direct method
V : Displacement field
seq : Sequence of two images, stacked along z
nb_iters: Number of iterations for the numerical scheme
alpha : Regularization weight for the displacement field
*/
void DirectMotion(CImg<>& V, CImg<>& seq,
unsigned int nb_iters, float alpha)
{
// Normalize the input sequence
// (improve convergence of the numerical scheme).
CImg<> nseq = seq.get_normalize(0,1);
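  // Spatial gradient used in the update below (an assumption: here the
  // gradient of the second image of the normalized sequence, which is the
  // one sampled at the displaced positions (X,Y)).
  CImgList<> grad = nseq.get_slice(1).get_gradient("xy");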
// Iteration loop.
float denom = 1 + 4*alpha;
for (unsigned int iter = 0; iter<nb_iters; ++iter)
{
CImg<> Vavg = V.get_convolve(avg_kernel);
cimg_forXY(V,x,y) {
float
X = x + V(x,y,0),
Y = y + V(x,y,1),
deltaI = nseq(x,y,0) - nseq.linear_atXY(X,Y,1);
V(x,y,0) = (V(x,y,0) + deltaI*grad[0].linear_atXY(X,Y,0,0,0) +
4*alpha*Vavg(x,y,0))/denom;
V(x,y,1) = (V(x,y,1) + deltaI*grad[1].linear_atXY(X,Y,0,0,0) +
4*alpha*Vavg(x,y,1))/denom;
}
}
}
In practice, this kind of method does not provide a correct estimation of the velocity field in the case of large displacements. Indeed, the expressions to be iterated are based only on the first derivatives of the image, giving by nature a very local information on the intensity variations (and thus on the motion). It is then classical to use a multi-scale resolution scheme to get rid of this problem:
CImg<> V;
for (int N = nb_scales - 1; N>=0; --N)
{
float factor = (float)std::pow(2,(double)N);
int
s_width = std::max(1,(int)(seq.width()/factor)),
s_height = std::max(1,(int)(seq.height()/factor));
CImg<> scale_seq = seq.get_resize(s_width,s_height,-100,-100,2).
blur(N,N,0,true,false);
if (V) (V *= 2).resize(s_width,s_height,1,-100,3);
else V.assign(s_width,s_height,1,2,0);
DirectMotion(V,scale_seq,nb_iters<<N,alpha);
}
The smaller the scale, the more iterations are performed, as they are less time consuming (fewer pixels to process). We also apply a Gaussian smoothing filter on the input images at each scale, with a standard deviation that increases as the image scale gets smaller. This multi-scale resolution scheme is particularly efficient for minimizing the direct problem and, moreover, it allows estimating motions with large displacement vectors.
ESTIMATION RESULTS
which draws the vector field stored in the image flow (the components of each vector being stored as separate channels of the image). The pixels for which the motion vectors are displayed are sampled according to the parameter sampling. The length of the vectors is scaled by the parameter factor.
Figure 8.2 – Results of motion estimation between two frames by the method of Horn and Schunck: (a) solving the linearized problem, (b) solving the direct problem.
The solution of this weighted least squares problem is the field v which verifies the normal equations system A^⊤W²A v = A^⊤W²b, where, for the n points m_i ∈ Ω at time t:
\[
A = [\nabla I(m_1), \ldots, \nabla I(m_n)]^{\top}, \quad
W = \mathrm{diag}\,[W(m_1), \ldots, W(m_n)], \quad
b = -\big(I_t(m_1), \ldots, I_t(m_n)\big)^{\top}
\]
// Velocity field.
CImg<> field(seq.width(),seq.height(),1,2,0);
// Windowing function.
float
sigma = 10,
color = 1;
W.draw_gaussian(n2,n2,sigma,&color);
cimg_for_insideXY(seq,i,j,n)
{
B.fill(0);
C.fill(0);
// Matrix computation.
for (int k = -n2; k<=n2; ++k)
for (int l = -n2; l<=n2;++l)
{
float temp = cimg::sqr(W(k + n2,l + n2));
B(0,0) += temp*cimg::sqr(grad(0,i + k,j + l));
B(1,1) += temp*cimg::sqr(grad(1,i + k,j + l));
B(0,1) += temp*(grad(0,i + k,j + l)*grad(1,i + k,j + l));
C(0) += temp*(grad(0,i + k,j + l)*grad(2,i + k,j + l));
C(1) += temp*(grad(1,i + k,j + l)*grad(2,i + k,j + l));
}
B(1,0) = B(0,1);
B.invert();
CImg<> v = -B*C;
field(i,j,0,0) = v(0);
field(i,j,0,1) = v(1);
}
return field;
}
Code 8.5 shows the algorithm using the amplitudes of the eigenvalues of B. Incidentally, note the use of CImg instances as matrices in the computation of the eigenvalues/eigenvectors and in the matrix/vector product (CImg<> C(1,2) designates a one-column, two-row matrix, in image notation, so that the matrix product B*C makes sense). Figure 8.3 compares the two displacement fields obtained, with and without the analysis of the spectrum of B.
Figure 8.3 – Displacement field with the method of Lucas and Kanade, estimated on the images of Fig. 8.1: without (left) and with (right) the analysis of the eigenvalues.
CImgList<> eig;
float
  epsilon = 1e-8f,
  tau_D = 300;
// Velocity field.
CImg<> field(seq.width(),seq.height(),1,2,0);
// Windowing function.
float
sigma = 10,
color = 1;
W.draw_gaussian(n2,n2,sigma,&color);
cimg_for_insideXY(seq,i,j,n)
{
B.fill(0);
C.fill(0);
// Compute M and b in a neighborhood n*n.
for (int k = -n2; k<=n2; ++k)
for (int l = -n2; l<=n2; ++l)
{
float temp = cimg::sqr(W(k + n2,l + n2));
B(0,0) += temp*cimg::sqr(grad(0,i + k,j + l));
B(1,1) += temp*cimg::sqr(grad(1,i + k,j + l));
B(0,1) += temp*(grad(0,i + k,j + l)*grad(1,i + k,j + l));
C(0) += temp*(grad(0,i + k,j + l)*grad(2,i + k,j + l));
C(1) += temp*(grad(1,i + k,j + l)*grad(2,i + k,j + l));
}
B(1,0) = B(0,1);
B.invert();
CImg<> v = -B*C;
const float tmp = v(0)*vec(1,0) + v(1)*vec(0,0);
field(i,j,0,0) = -tmp*vec(1,0);
field(i,j,0,1) = tmp*vec(0,0);
}
}
return field;
}
with
\[
A(m) = \begin{pmatrix} x - x_0 & y - y_0 & 0 & 0 & 1 & 0\\ 0 & 0 & x - x_0 & y - y_0 & 0 & 1 \end{pmatrix}
\quad\text{and}\quad a = (a_i) \in \mathbb{R}^6
\]
At a lower cost, this model can be used to search for similarities (uniform scale change
α, rotation of angle θ and translation of vector t). For this:
\[
v = \alpha \begin{pmatrix} \cos(\theta) & -\sin(\theta)\\ \sin(\theta) & \cos(\theta) \end{pmatrix} (m - m_0) + t = A\,a
\]
where
\[
A = \begin{pmatrix} x - x_0 & -y + y_0 & 1 & 0\\ y - y_0 & x - x_0 & 0 & 1 \end{pmatrix}
\quad\text{and}\quad
a = \big(\alpha\cos(\theta)\;\;\alpha\sin(\theta)\;\; t_1\;\; t_2\big)^{\top}.
\]
This can be easily done with CImg, using here again images as matrices. Code 8.6
proposes, for a given point m = (x, y)> , and a neighborhood of size n, the procedure
to estimate the parameters of the similarity for a sequence of images seq.
// Windowing function.
float
sigma = 10,
color = 1;
W.draw_gaussian(n2,n2,sigma,&color);
M.fill(0);
b.fill(0);
M += temp*At*g*g.get_transpose()*A;
b += temp*At*g*grad(2,x + k,y + l);
}
CImg<> t = CImg<>::vector(a(2),a(3));
1. In the case of two images only, the analysis consists in finding the disparities
between two consecutive images in the sequence. For a sequence of several
images, each pair is then considered.
2. Tracking the movement of a primitive through a sequence of images as a whole.
Code 8.7 presents the calculation of the correlation between the patches extracted
from the image I(t2 ) = I(1), in an area defined by the dimensions area, and a
reference image T. The patch center which maximizes the correlation, as well as the
corresponding correlation value, are used to display the optimal patch containing the
object to track, having size size, initially located at coordinates pos in the image
I(t1 ) = I(0). The method CImg<T>::draw_rectangle is used to draw the
bounding rectangle on the image.
CImg<T>& draw_rectangle(int x0, int y0, int x1, int y1,
const tc *color, float opacity, unsigned int
pattern)
Figure 8.4 shows a result obtained from two non-consecutive images of a video
sequence. The first cyclist, identified by its position pos in the initial image (arrow) is
tracked in the image I(t2 ) by the search of the maximal correlation in a neighborhood
of size area centered around the initial position. A bounding box is used to identify
the tracked object. Figure 8.5 shows the result of the tracking on nine consecutive
images.
/*
Object tracking by cross-correlation.
I : Image sequence
pos : Initial position of the object to track
size : Size of the template window around p
area : Size of the lookup window
*/
CImg<> SuiviCC(CImgList<>& I, CImg<int>& pos,
CImg<int>& size, CImg<int>& area)
{
CImg<int>
prevPos(pos),
currPos(1,2);
// Normalized reference.
CImg<> T = I[0].get_crop(pos(0) - size(0),pos(1) - size(1),
pos(0) + size(0),pos(1) + size(1));
int
w = T.width(),
h = T.height();
T -= T.sum()/(w*h);
float
norm = T.magnitude(),
correlation,
corr = -1;
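  // One possible search loop (an assumption): scan the lookup window around
  // the previous position and keep the patch of I[1] that maximizes the
  // normalized cross-correlation with the reference T.
  for (int dy = -area(1); dy<=area(1); ++dy)
    for (int dx = -area(0); dx<=area(0); ++dx)
    {
      int
        cx = prevPos(0) + dx,
        cy = prevPos(1) + dy;
      CImg<> P = I[1].get_crop(cx - size(0),cy - size(1),
                               cx + size(0),cy + size(1));
      P -= P.sum()/(w*h);
      correlation = (float)(P.dot(T)/(norm*P.magnitude() + 1e-8f));
      if (correlation>corr)
      {
        corr = correlation;
        currPos(0) = cx;
        currPos(1) = cy;
      }
    }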
// Region of interest.
unsigned char red [3] = {255,0,0};
CImg<> imgOut(I[1]);
imgOut.draw_rectangle(currPos(0) - size(0),currPos(1) - size(1),
currPos(0) + size(0),currPos(1) + size(1),
red,1,~0U);
return imgOut;
}
If, instead of R, n points are to be matched (e.g., corners detected by Harris, cf.
Section 6.1), the idea is to find a function to match the n points in I(t1 ) with the n
points of I(t2 ), so that no two points in the initial image are matched with the same
point in I(t2 ).
The difficulty lies in translating these heuristics into quantitative expressions that allow the problem to be solved mathematically. As a result, dynamic programming is often used to solve these optimization tasks.
While these methods are easy to implement and effective on well-detected regions
or points, they rely on an assumption of regularity that is ill-suited to track multiple
objects, especially since these objects may overlap. In the case of false detection,
moreover, the tracking may have difficulties in re-aligning with the object of interest
(Fig. 8.6).
Let I1 and I2 be two images, of identical size, acquired with the same acquisition system. We are looking for the transformation such that, after registration, the two images have the same gray levels at identical positions. We are therefore looking for a translation, noted t = (tx, ty)^⊤, verifying the relation I2(x, y) = I1(x + tx, y + ty) for all (x, y).
Figure 8.5 – Object tracking by searching for maximal correlation. Global view.
The use of the cross-power spectrum isolates this phase difference. It is calculated from the two images I1 and I2 by:
\[
R(f_x, f_y) = \frac{\hat{I}_1(f_x, f_y)\,\overline{\hat{I}_2(f_x, f_y)}}{\big\|\hat{I}_1(f_x, f_y)\,\overline{\hat{I}_2(f_x, f_y)}\big\|}.
\]
Its inverse Fourier transform is then
\[
r(x, y) = \delta(x + t_x, y + t_y).
\]
Finally, to estimate the translation between two images, it is sufficient to search for
the maximum of their phase correlation. Algorithm 5 summarizes this method.
Code 8.8 implements the calculation of the phase correlation for the estimation of the
translation between two images.
/*
  Image registration by phase correlation.
  IS : Source image
  IC : Target image
  tx : Horizontal component of the translation (output)
  ty : Vertical component of the translation (output)
*/
void CorrelationPhase(CImg<>& IS, CImg<>& IC, int& tx, int& ty)
{
float eps = 1.0e-8f;
// Compute the Fourier transform of the images.
CImgList<> fft_S = IS.get_FFT("xy"),
fft_T = IC.get_FFT("xy");
cimg_forXY(r,x,y)
{
if (r(x,y)>r_max) {
r_max = r(x,y);
tx = -((x - 1 + (w/4))%(w/2) - (w/4) + 1);
ty = -((y - 1 + (h/4))%(h/2) - (h/4) + 1);
}
}
}
Figure 8.7 – (c) Image I1 (noisy). (d) Image I2 (noisy). (e) Phase correlation: the maximum gives the translation between the two images.
In Fig. 8.7, Code 8.8 is first tested on synthetic displacements. We extract two images (Fig. 8.7a) from an input image, then we add noise (Fig. 8.7c and 8.7d) to show the good performance of this approach even on noisy images. Indeed, the phase is less sensitive to noise than the correlation techniques previously seen in Section 8.2.1. In Fig. 8.7e we can observe that the phase correlation has a sharp
maximum. Figure 8.7b illustrates the final result, with images I1 , I2 , the displacement
estimated by the method as well as the starting image merged on the three color
channels of an image.
REPRESENTATION CHANGES
The previous technique estimates a translation between two images, but it is also
possible to use it for other transformations. Using variable substitution, we can convert
for example a rotation or a similarity into a translation:
• Rotation: we consider that the image I2 is a rotation of angle θ0 of the image I1. By changing the variable from a Cartesian space to a polar space, we get the following relation: I2(ρ, θ) = I1(ρ, θ − θ0). With this new representation, estimating a rotation is equivalent to estimating a translation vector (0, θ0) in the polar space (see the polar resampling sketch after this list);
• Similarity: we consider that I2 is a scaling of factor (sx, sy) of the image I1. By changing the variable from a Cartesian space to a logarithmic space, we have I2(log(x), log(y)) = I1(log(x) + log(sx), log(y) + log(sy)). With this new representation, estimating the scaling is equivalent to estimating a translation vector (log(sx), log(sy))^⊤ in the logarithmic space.
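For the rotation case, a minimal polar resampling sketch (an assumption, not one of the book's listings; the function name and the sampling resolutions nrho and ntheta are arbitrary):

// Resample an image into polar coordinates (rho,theta) around its center,
// so that a rotation of the image becomes a translation along theta.
CImg<> to_polar(const CImg<>& img, int nrho, int ntheta)
{
  CImg<> res(nrho,ntheta);
  float
    xc = img.width()/2.0f,
    yc = img.height()/2.0f,
    rmax = std::sqrt(xc*xc + yc*yc);
  cimg_forXY(res,r,t)
  {
    float
      rho = r*rmax/nrho,
      theta = (float)(t*2*cimg::PI/ntheta);
    res(r,t) = img.linear_atXY(xc + rho*std::cos(theta),
                               yc + rho*std::sin(theta),0,0,0.0f);
  }
  return res;
}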
OBJECT TRACKING
Image registration by phase correlation can be used for sparse motion estimation.
Code 8.9 shows how to perform object tracking using CImg.
/*
  Object tracking by phase correlation.
*/
int main(int argc, const char **argv)
{
cimg_usage("Object tracking by image registration (phase
correlation)");
int
roi_x0 = cimg_option("-x",102,"X-coordinate of the ROI"),
roi_y0 = cimg_option("-y",151,"Y-coordinate of the ROI"),
roi_w = cimg_option("-w",64,"Width or the ROI"),
roi_h = cimg_option("-h",64,"Height of the ROI");
bool save = cimg_option("-o",1,"Save (0 or 1)");
img.append(CImg<>("frame_0140.bmp").channel(0),’z’);
int
tx,ty, // Translation
x0 = roi_x0, // Object position (X-coordinate)
y0 = roi_y0; // Object position (Y-coordinate)
out.draw_image(0,0,0,0,img.get_slice(f + 1),1).
draw_image(0,0,0,1,img.get_slice(f + 1),1).
draw_image(0,0,0,2,img.get_slice(f + 1),1).
draw_rectangle(x0,y0,
x0 + roi_w - 1,y0 + roi_h - 1,mycolor,0.5f);
Figure 8.8 shows the results obtained using Code 8.9. This approach has several
limitations. In order to robustly estimate the translation of the object, the background
of the object must be uniform. Otherwise, there would be two moving areas (the
object and the background) and therefore the theoretical results previously given are
no longer valid. Finally, if the object is occluded, this tracking method can “lose” the object, because the method does not take into account the dynamics of the object. This motivates the use of a model of the object dynamics, such as the Kalman filter presented below.
Figure 8.8 – Object tracking using image registration based on phase correlation.
Let m_i = (x_i, y_i)^⊤ be a point of interest detected in the i-th frame, moving with a speed v_i = (u_i, v_i)^⊤ (e.g., estimated by the optical flow). In the image plane, one can describe the motion of a point by a state vector s_i = (x_i, y_i, u_i, v_i)^⊤. We will denote by z_i the measure of the position of the point of interest in the image i.
The goal of the Kalman filter is to compute s_i, knowing s_{i−1} and z_i. Under the assumption of Gaussian noise, the prediction of the current state is s_i ∼ N(D_i s_{i−1}; Σ_{d,i}), where D_i is a dynamical model of the state evolution. Similarly, we suppose that the measure z_i ∼ N(M_i s_i; Σ_{m,i}), where M_i is the model of the measurement process.
s0 and Σd,0 , Σm,0 are assumed to be known. We denote by Σi the a priori estimation
of the error.
s_i = s_i + K_i (z_i − M_i s_i)
Σ_i = (I − K_i M_i) Σ_i
The prediction step derives an estimate of the state at time i from the prediction at
time i − 1 , as well as the error covariance matrix based on the error covariance matrix
of the system at the previous time. The correction step corrects the estimate provided
by the prediction step.
/*
Linear Kalman filter.
I : List of images
pos : Initial position of the tracked object
size : Size of the window surrounding the object
*/
void Kalman(CImgList<>& I, CImg<int>& pos, CImg<int>& size)
{
CImg<>
prevState(1,4),
currState(1,4);
CImg<int>
prevPos(pos),
currPos(1,2),
estimPos(1,2);
prevState(0,0) = pos(0);
prevState(0,1) = pos(1);
prevState(0,2) = 0;
prevState(0,3) = 0;
// Matrix M.
CImg<> M(4,2,1,1,0);
M(0,0) = M(1,1) = 1;
CImg<> Mt = M.get_transpose();
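  // Dynamics matrix D (an assumption: constant-velocity model with unit
  // time step, consistent with the state vector (x,y,u,v)).
  CImg<> D = CImg<>::identity_matrix(4);
  D(2,0) = D(3,1) = 1;  // x <- x + u, y <- y + v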
// Correction.
CImg<> Kt = PkCurrent*Mt*((M*PkCurrent*Mt + SigmaM).get_invert());
currState = currState + Kt*(estimPos - M*currState);
CImg<> Id = CImg<>::identity_matrix(PkCurrent.width());
PkCurrent = (Id - Kt*M)*PkCurrent;
In general, the Kalman filter quantifies the uncertainty in the state estimate. This
information allows the point detector to automatically size the region of interest for
the primitive in the next image. This region is centered on the best estimated position
and its width is proportional to the uncertainty. In a properly designed filter, the value
of this uncertainty decreases rapidly with time and the region shrinks accordingly.
In this context, dimension reduction methods are often applied [25]. We propose
in the following a simple example of linear dimension reduction in multi-spectral
imaging, and its implementation in CImg.
For quantitative data, Principal Component Analysis (PCA) [19] is one of the most widely used methods. It builds new variables that are uncorrelated linear combinations of the initial ones.
In the following, the data will be matrices of M_{n,p}(ℝ) of quantitative variables, a row being a pixel of the image, and the columns describing the parameters (the channels) measured on this pixel.
PRINCIPLE
PCA often works on centered and/or standardized variables. Let g ∈ ℝ^p denote the vector of arithmetic means of each channel:
\[
g = X^{\top} D \mathbf{1}
\]
where D is a diagonal matrix, each d_{ii} giving the importance of the individual i in the data (most often D = (1/n) I, which we will assume in the sequel), and 1 is the vector of ℝ^n whose components are all equal to 1. The matrix Y = X − 1 g^⊤ is the centered array associated with X.
If D_{1/σ} is the diagonal matrix of the inverses of the standard deviations of the variables, then Z = Y D_{1/σ} is the matrix of standardized centered data and
\[
R = Z^{\top} Z
\]
is the correlation matrix of the standardized centered data (up to a multiplicative constant); it summarizes the structure of the linear dependencies between the p variables (the PCA can also reason on the variance/covariance matrix R = Y^⊤ Y).
Using CImg<> as matrices, and assuming that the p-channel multispectral image is contained in a CImg<> of size (w,h,1,p), the matrix R is computed by:
CImg<> mean(p,1), var(p,1);
cimg_forC(imgIn,c)
{
mean(c) = imgIn.get_channel(c).mean();
var(c) = 1./std::sqrt(imgIn.get_channel(c).variance());
}
// Data matrix.
CImg<> X(p,imgIn.width()*imgIn.height());
cimg_forXYC(imgIn,x,y,c)
X(c,x + y*imgIn.width()) = imgIn(x,y,0,c);
// Centered/standardized data.
CImg<> ones(1,imgIn.width()*imgIn.height(),1,1,1);
var.diagonal();
CImg<> Z = (X - ones*mean)*var;
The inertia of the data projected on the axis generated by a unit vector u is P_u^⊤ P_u = u^⊤ Z^⊤ Z u, where P_u = Zu denotes this projection.
Seeking to maximize the inertia on the line F1 , the vector u1 to choose is the
eigenvector of ZT Z associated to the largest eigenvalue λ1 . For this vector, vT Λv = λ1
is the inertia of individuals on the line generated by u1 . If this variance is not large
enough (we do not explain enough of the initial variation of X), then we start the
procedure again in the space orthogonal to u1 . We obtain a second line, generated by
u2 , which explains part of the variation of the initial data. And we iterate possibly in
the orthogonal of (u1 , u2 ). . . until we obtain a sufficient amount of explained variance.
This finally amounts to searching for Fk, the eigensubspace of Zᵀ Z generated by the first k
eigenvectors, which define the principal axes (or principal components, PC). The inertia
explained by this subspace is Σ_{i=1}^{k} λi.
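The diagonalization and the projection of the data on the first principal axes can then be sketched as follows (this snippet is not the book's code; it assumes R and Z computed as above, and that CImg<T>::get_symmetric_eigen() returns the eigenvalues in decreasing order and the eigenvectors stored column-wise):

CImgList<> eig = R.get_symmetric_eigen();     // eig[0]: eigenvalues, eig[1]: eigenvectors
int nb_pca = 3;                               // Number of components kept
CImg<>
  U = eig[1].get_columns(0,nb_pca - 1),       // First nb_pca principal axes
  Xpca = Z*U;                                 // Projected data (one row per pixel)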
INTERPRETATION
The primary goal of PCA is to reduce the dimension to allow an efficient visual-
ization of the data, while preserving the information (here represented by the inertia).
Therefore, we need tools to answer the question: what dimension for Fk ? There is
no universal theoretical answer, the main thing being to have a sufficiently expressive
representation to allow a correct interpretation of the variation of the data. Commonly
used criteria include:
• the total percentage of explained variance on the first k axes, given by
  (Σ_{j=1}^{k} λj) / (Σ_{j=1}^{p} λj). A threshold (e.g., 90%) of the total explained
  inertia gives a corresponding value of k (a small helper implementing this criterion
  is sketched after this list). However, the percentage of inertia must take into
  account the number of initial variables;
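As a complement, the criterion above can be implemented by a small helper (not part of the book's code), assuming the eigenvalues are available as a column vector sorted in decreasing order, e.g., as returned by CImg<T>::get_symmetric_eigen():

// val: column vector of eigenvalues in decreasing order;
// threshold: fraction of the total inertia to be explained (e.g., 0.9f).
int choose_k(const CImg<>& val, float threshold)
{
  float total = val.sum(), cum = 0;
  cimg_forY(val,i)
  {
    cum += val(0,i);
    if (cum/total>=threshold) return i + 1;
  }
  return val.height();
}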
9.1.2 Example
As an illustration, we propose to process images of the Sun, from the Solar Dynamics
Observatory1 project [33], whose main objective is the understanding and prediction
of solar variations that influence life on Earth and technological systems. The mission
includes three scientific investigations (Atmospheric Imaging Assembly (AIA), Extreme
Ultraviolet Variability Experiment (EVE) and Helioseismic and Magnetic Imager
(HMI)), giving access to p = 19 solar images under different wavelengths (Fig. 9.1).
Applying the previous method, three components capture more than 90% of the
initial variation of the data. The code
// Display the first nb_pca principal components as 256x256 images.
CImgList<> imgOut(nb_pca,256,256);
cimg_forXYC(imgIn,x,y,c)
  if (c<nb_pca) imgOut(c,x,y) = Xpca(c,x + y*imgIn.width());
imgOut.display();
1 https://sdo.gsfc.nasa.gov/
Figure 9.1 – Example of solar images from the Solar Dynamics Observatory.
RGB SPACE
RGB is an additive color space, which remains by far the most used color space
for digital images. Each color is coded by a triplet of values (most often in the integer
range [[0,255]]), representing the quantities of red, green and blue which make up the
color. The storage and the display of the color images use most of the time a coding of
the pixel values in the RGB space. It is “the default color space” in image processing.
The XY Z color space was defined in order to correct certain defects of the RGB
space. This space consists of three primary colors X, Y and Z, known as virtual, ob-
tained by linear transformations of the (R, G, B) triplet.
L∗a∗b∗ SPACE
RGB and XY Z spaces are perceptually non-uniform spaces, in the sense that the
value of the Euclidean distance between two colors in these spaces is not necessarily
characteristic of the perception of these colors by the human visual system: two colors
distant in the RGB space can be perceived as close to the eye (and vice versa).
The L∗a∗b∗ space is a system created in 1976 by the CIE, which seeks to be
perceptually uniform, obtained by non-linear relations computed in the XY Z space.
Components are computed from a reference white (X0 ,Y0 , Z0 ), by:
    L∗ = 116 f(Y/Y0) − 16
    a∗ = 500 [ f(X/X0) − f(Y/Y0) ]
    b∗ = 200 [ f(Y/Y0) − f(Z/Z0) ]

with f(x) = x^(1/3) if x > (6/29)³, and f(x) = (1/3)(29/6)² x + 4/29 otherwise.
YUV SPACE
Y represents the luma (luminance with gamma correction), U and V represent color
chrominance. CImg proposes the following methods CImg<T>::RGBtoYUV(),
CImg<T>::YUVtoRGB() (and the get_* versions), CImg<T>::load_yuv()
and CImg<T>::save_yuv() to manipulate color images in the YUV color space.
YCbCr SPACE
The YCbCr space is the international standard for the coding of digital television
images. It is currently part of the JPEG standard, for the lossy compression of
color images. It is conceptually very close to YUV . The associated mathematical
transformation is:
    [ Y  ]   [  0.299   0.587   0.114 ] [ R ]
    [ Cb ] = [ -0.169  -0.331   0.500 ] [ G ]
    [ Cr ]   [  0.500  -0.419  -0.081 ] [ B ]
Similar to HSI, the HSV (Hue, Saturation, Value) model is built upon RGB values:
if M = max(R, G, B) and m = min(R, G, B), then
    S = 1 − m/M,   V = M,   and

    H = ( 60 (G − B)/(M − m) + 360 ) mod 360   if M = R
    H =   60 (B − R)/(M − m) + 120             if M = G
    H =   60 (R − G)/(M − m) + 240             if M = B
HSL REPRESENTATION
Like HSI and HSV , HSL (Hue, Saturation, Lightness) representation is built from
the RGB values: if M = max(R, G, B) and m = min(R, G, B), then
    L = (M + m)/2,
    S = (M − m)/(2L) if L < 0.5, and S = (M − m)/(2 − 2L) otherwise,

and H is defined as for HSV:

    H = ( 60 (G − B)/(M − m) + 360 ) mod 360   if M = R
    H =   60 (B − R)/(M − m) + 120             if M = G
    H =   60 (R − G)/(M − m) + 240             if M = B
CMY (Cyan, Magenta, Yellow) is the space dedicated to the printing of digital color
images on paper. It results from a subtractive synthesis of the colors and is represented
by the inverse cube of the RGB space: the origin (0, 0, 0) color is white, and it has
cyan, magenta and yellow as axes. Assuming that an RGB color has components
coded in the interval [0, 1], then:
C = 1 − R, M = 1 − G and Y = 1 − B
For material needs (mainly for the printers), this space was extended by adding
a fourth component, K (for Key black), which, on the one hand, comes to fill the
difficulty for a printer to restore correctly gray levels by mixing cyan, magenta and
yellow inks (which are in practice imperfect), and on the other hand, makes it possible
to minimize the quantity of ink necessary to print gray levels (colors very frequently
met in printing). For this reason, printers have a black ink cartridge, in addition to the
cyan, magenta and yellow ink cartridges. The CMY K components are calculated as
follows:
    K = min(C, M, Y),   C′ = (C − K)/(1 − K),   M′ = (M − K)/(1 − K)   and   Y′ = (Y − K)/(1 − K)
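As an illustration, a minimal sketch of this conversion (not the book's code; CImg also provides CImg<T>::RGBtoCMYK()), assuming a hypothetical floating-point image imgRGB with values already normalized to [0, 1]:

// imgRGB: hypothetical RGB input image, values assumed in [0,1].
CImg<>
  C = 1 - imgRGB.get_channel(0),
  M = 1 - imgRGB.get_channel(1),
  Y = 1 - imgRGB.get_channel(2),
  K = C.get_min(M).min(Y);             // K = min(C,M,Y), pixel by pixel
CImg<> denom = (1 - K) + 1e-6f;        // Avoid a division by zero for pure black
C = (C - K).div(denom);                // C' = (C - K)/(1 - K)
M = (M - K).div(denom);                // M' = (M - K)/(1 - K)
Y = (Y - K).div(denom);                // Y' = (Y - K)/(1 - K)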
Figures 9.3 and 9.4 illustrate the values of the various color channels obtained in
these colorimetric spaces, for the kingfisher image, originally coded by RGB values
(first row of the first figure). All channels have been normalized for visualization.
Following this definition, the transposition to the vector case is then immediate: if
X = (x1 · · · xn ) is a set of vectors of Rd (in our case, d = 3 for a color image, or d = 4
for the CMY K case), then we can define the median element xm of X as the one
satisfying:
    (∀ j ∈ [[1, n]])    Σ_{i=1}^{n} ‖xm − xi‖  ≤  Σ_{i=1}^{n} ‖xj − xi‖
Code 9.1 implements this filter for the L2 norm. Note that, in the basic version
presented here, the complexity of the algorithm is significant: for a neighborhood
containing p pixels, computing the sums of distances for all the candidates requires
O(p²) operations and the search for the smallest sum is in O(p), i.e., a complexity of
O(p²) for each point processed.
// (Excerpt of Code 9.1: imgIn is assumed to have been converted to L*a*b* beforehand.)
CImg<>
  imgOut(imgIn),
  V(5,5,1,3),                             // 5x5 color neighborhood (L*a*b* values)
  vL = V.get_shared_channel(0),
  va = V.get_shared_channel(1),
  vb = V.get_shared_channel(2);
cimg_for5x5(imgIn,x,y,0,0,vL,float)
{
  cimg_get5x5(imgIn,x,y,0,1,va,float);
  cimg_get5x5(imgIn,x,y,0,2,vb,float);
  float dmin = cimg::type<float>::max();  // Smallest sum of distances found so far
  cimg_forXY(V,i,j)
  {
    float d = 0;
    CImg<> z = V.get_vector_at(i,j);
    cimg_forXY(V,u,v)
    {
      CImg<> zi = V.get_vector_at(u,v);
      d += (zi -= z).magnitude();
    }
    if (d<dmin)
    {
      dmin = d;
      imgOut(x,y,0,0) = V(i,j,0);
      imgOut(x,y,0,1) = V(i,j,1);
      imgOut(x,y,0,2) = V(i,j,2);
    }
  }
}
return imgOut.LabtoRGB();
}
Figure 9.5 shows a median filtering result of a color image. A salt and pepper noise
(b) is added to color test pattern (a). The resulting image is filtered by the multispectral
median filter (d) and by a more classical median filtering applied channel by channel
(c) (see Section 5.1.2). We can clearly see on details (e) and (f) that the median filtering
designed specifically for the color image minimizes the appearance of “false colors”
(the only remaining false colors coming from the noise), unlike the filtering applied on each channel
independently, and also generates fewer geometric distortions (see the white and black grid).
Let us note finally that, since this filtering depends on the definition of a distance
in the color space, we preferentially use the perceptually uniform space L∗ a∗ b∗ .
The norm of this vector is called the local contrast and is denoted Sθ (I, x).
Directions maximizing Sθ (I, x) can be searched by zeroing the partial derivative
of the local contrast with respect to θ (di Zenzo gradient [12]). It is also possible
to compute the maximum local contrast as the square root of the largest eigenvalue of the
symmetric matrix M(x) = JI(x)ᵀ JI(x):
    M(x) = [ ‖Ix(x)‖²       Ix(x)·Iy(x) ]   =   [ A(x)  C(x) ]
           [ Ix(x)·Iy(x)    ‖Iy(x)‖²    ]       [ C(x)  B(x) ]
It is the equivalent in color imaging of the structure tensor, already met in Section 6.1.
The largest eigenvalue of M(x), giving the maximum local contrast, is then computed by

    λ1(x) = ( A(x) + B(x) + √( (A(x) − B(x))² + 4C²(x) ) ) / 2
and the corresponding eigenvector is

    q1(x) = ( A(x) − B(x) + √( (A(x) − B(x))² + 4C²(x) ) ,  2C(x) )ᵀ
The other eigenvector of M(x) is orthogonal to q1 (x) (M(x) is symmetric), and is
therefore tangent to the contour at the point x. The phase of the gradient in x is:
    tan(θ(x)) = 2C(x) / ( A(x) − B(x) + √( (A(x) − B(x))² + 4C²(x) ) )

which can be written as tan(2θ(x)) = 2C(x)/(A(x) − B(x)), and then, if A(x) ≠ B(x) and C(x) ≠ 0,

    θ(x) = (1/2) arctan( 2C(x) / (A(x) − B(x)) )
Code 9.2 illustrates the application of the previous method, and Fig. 9.6 shows a
result. Instead of using the explicit calculation of λ1 described above, we use
CImg<T>::get_symmetric_eigen().
/*
  Gradient of a color image (Code 9.2, excerpt): E and Phi are assumed to be
  scalar images of the same width and height as imgIn, receiving respectively
  the norm and the orientation of the color gradient.
*/
// Gradients.
CImgList<> grad = imgIn.get_gradient();
cimg_forXY(imgIn,x,y)
{
  CImg<>
    Ix = grad[0].get_vector_at(x,y),
    Iy = grad[1].get_vector_at(x,y),
    M(2,2);
  M(0,0) = Ix.dot(Ix);                  // A(x)
  M(1,1) = Iy.dot(Iy);                  // B(x)
  M(0,1) = M(1,0) = Iy.dot(Ix);         // C(x)
  CImgList<> eig = M.get_symmetric_eigen();
  E(x,y) = std::sqrt(eig[0](0));        // Maximum local contrast sqrt(lambda_1)
  Phi(x,y) = 0.5f*std::atan2(2*M(0,1),M(0,0) - M(1,1)); // theta(x) = 1/2 arctan(2C/(A-B))
}
If only the norm of the gradient matters in a given processing algorithm, then computing
the gradient on the vector-valued image or channel by channel gives almost equivalent
results. On the other hand, if the phase (contour orientation) plays an important
role, then the multi-spectral approach generally gives a more consistent and reliable
result.
(Figures 9.3 and 9.4: the channels of the kingfisher image in the RGB, HSI, L∗a∗b∗, YUV, XYZ and YCbCr color spaces.)
(Figure: original image, tangent vectors, and tangents to the contours.)
The CImg library has basic capabilities for visualizing and rendering 3D mesh objects,
which can help the developer represent complex image data. In this chapter, we
propose a quick tour of these possibilities, first describing the general principles
governing the creation of 3D mesh objects, and then detailing the way these objects
are internally represented in the library. Several examples are proposed to illustrate
different practical use cases.
3. A set of P materials, defining either the color or the texture of each primitive.
This set is stored as a CImgList<T>, each image of the list representing the
material of a single primitive. The type T is usually unsigned char, because
3D objects are most often drawn or visualized in 8-bit per component
CImg<unsigned char> images.
With these four datasets defined, it is possible to interactively view the corresponding
3D object directly via the CImg<T>::display_object3d() method or to draw
it in an image, via the CImg<T>::draw_object3d() method. In addition, there
is a set of CImg<T> methods returning pre-defined 3D mesh objects, such as
parallelepipeds (CImg<T>::box3d()), spheres (CImg<T>::sphere3d()), tori
(CImg<T>::torus3d()), cylinders (CImg<T>::cylinder3d()), etc. Two
separate 3D objects can be merged into one, with the CImg<T>::append_object3d()
method. Note that the sets of materials and opacities do not appear as arguments in
the signature of these functions, since they only participate in the decoration of the 3D
object, not in the definition of its structure per se.
Code 10.1 illustrates the use of these different functions for the creation and
visualization of a compound 3D mesh object (a chain, with a ball), obtained by
concatenating several simple 3D primitives (deformed and rotated torii and a sphere).
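Code 10.1 itself is not reproduced here; the following minimal sketch, built on the same methods, merges a torus and a sphere into a single object and displays it (the sizes and the shift are arbitrary):

// Build a torus and a sphere, merge them and display the result.
CImgList<unsigned int> primitives, prim_sphere;
CImg<>
  points = CImg<>::torus3d(primitives,60,20),
  sphere = CImg<>::sphere3d(prim_sphere,30);
sphere += 100;                                   // Shift the sphere away from the torus
points.append_object3d(primitives,sphere,prim_sphere);
CImg<unsigned char> visu(640,480,1,3,0);
visu.display_object3d("Torus + sphere",points,primitives);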
Figure 10.1 – Result of Code 10.1: Creation and visualization of a simple 3D mesh
object.
First, we retrieve the different values of the parameters passed on the command
line (by defining default values for the parameters not specified by the user), and
we initialize the minimum and maximum bounds (x0 , y0 ) − (x1 , y1 ) of the evaluation
domain of the function f :
const char
  *expr = cimg_option("-z","sinc(sqrt(x^2+y^2)/4)*abs(sin(x/2)*cos(y/2))",
                      "Expression of f(x,y)"),
  *xyrange = cimg_option("-xy","-30,-30,30,30","Definition domain");
int
  resolution = cimg_option("-r",256,"3D plot resolution");
float
  sigma = cimg_option("-s",0.0f,"Smoothness of the function f"),
  factor = cimg_option("-f",150.0f,"Scale factor");
The evaluation of the function f (x, y) on the whole domain (x0 , y0 ) − (x1 , y1 ) is
done thanks to the following version of the constructor of CImg<T>:
CImg(unsigned int size_x, unsigned int size_y,
unsigned int size_z, unsigned int size_c,
const char *values, bool repeat_values);
The string we are going to pass here to the constructor consists of the concatenation
of three formulas: the first two lines set new values for the pre-defined variables x
and y so that they become linearly normalized in the intervals [x0, x1] and [y0, y1],
respectively, and the last line is simply the function f(x, y) to evaluate. We also
allow the resulting image to be smoothed and its values to be normalized:
CImg<char> s_expr(1024);
std::sprintf(s_expr,"x = lerp(%g,%g,x/(w-1));"
"y = lerp(%g,%g,y/(h-1));"
"%s",
x0,x1,y0,y1,expr);
CImg<> elevation(resolution,resolution,1,1,s_expr,true);
elevation.blur(sigma).normalize(0,factor);
In a second step, we use this same constructor to generate a gradient color palette
built by linear interpolation of four basic colors (red, orange, yellow and white) (Code
10.4).
Code 10.4 – Create a color gradient (red, orange, yellow, white).
CImg<unsigned char>
lut(4,1,1,3,"!x?[255,79,106]:" // Red
"x==1?[196,115,149]:" // Orange
"x==2?[231,250,90]:" // Yellow
"[255,255,255]",true); // White
lut.resize(256,1,1,3,3);
At this point, we have a 2D scalar image elevation, which gives the value of the
function f (x, y) at each point, as well as a colored version c_elevation, (Fig. 10.2),
which we will use afterwards to color the 3D object.
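The colored version is built by Code 10.5, not reproduced here. A minimal sketch of the idea, assuming elevation and lut defined as above:

// Map the normalized elevation values through the 256-entry color LUT.
CImg<unsigned char> c_elevation = elevation.get_normalize(0,255).get_map(lut);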
The only thing left to do is to build a background image, with a color gradient, and call
the CImg<T>::display_object3d() method to visualize f in 3D (Code 10.6).
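A minimal sketch in the spirit of Code 10.6 (not the book's exact code), using CImg<T>::get_elevation3d() to build the 3D object and a simple gradient image as background:

CImgList<unsigned int> primitives;
CImgList<unsigned char> colors;
CImg<> points = c_elevation.get_elevation3d(primitives,colors,elevation);
CImg<unsigned char> background(640,480,1,3,"y*200/h",true);  // Vertical gray gradient
background.display_object3d("z = f(x,y)",points,primitives,colors);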
Figure 10.2 – Result of Codes 10.3 and 10.4: images elevation (scalar, left) and
c_elevation (color, right), for the expression defined by default as
f(x, y) = sinc(√(x² + y²)/4) · |sin(x/2) cos(y/2)|.
The rendering of this code can be seen in Fig. 10.3, with different functions f (x, y)
and different evaluation ranges passed as parameters. The first image (top left) corre-
sponds to the function defined by default in the program (i.e., the elevation illustrated
in Fig. 10.2). The second row of Fig. 10.3 illustrates the evaluation of the Mandelbrot
fractal function, and shows that the mathematical expression evaluator integrated
in CImg is able to evaluate complex expressions, which are programs in their own
right (here, a program evaluating the convergence of a series of complex numbers).
Figure 10.3 – Result of our program for plotting functions z = f (x, y), as 3D elevation
maps.
Similarly, the dilation of the 3D coordinates along each of the axes x, y and z, with
scale factors 0.2, 1.2 and 2.3, respectively, can be done with:
points = CImg<>::diagonal(0.2,1.2,2.3)*points;
• When S = 1: the primitive is a colored point (or sprite). The image img p = (i1 )
contains a single value which is the index i1 ∈ J0, N − 1K of the associated vertex.
In order to display a sprite instead, the associated material must be an image
(rather than a simple color vector).
• When S = 2: the primitive is a colored segment. The image img p = (i1 , i2 )
contains the two indices of the starting and ending vertices.
• When S = 3: the primitive is a colored triangle. The image img p = (i1 , i2 , i3 )
contains the three indices of the triangle vertices.
• When S = 4: the primitive is a colored quadrangle. The image img p =
(i1 , i2 , i3 , i4 ) contains the four indices of the quadrangle vertices.
• When S = 5: the primitive is a colored sphere. The image img p = (i1 , i2 , 0, 0, 0)
contains two indices of vertices that define the diameter of the sphere.
• When S = 6: the primitive is a textured segment. The image
img p = (i1 , i2 ,tx1 ,ty1 ,tx2 ,ty2 ) contains the two indices of the segment vertices,
followed by the coordinates of the corresponding texture (the texture being
given as the material associated to this primitive).
• When S = 9: the primitive is a textured triangle. The image
img p = (i1 , i2 , i3 ,tx1 ,ty1 ,tx2 ,ty2 ,tx3 ,ty3 ) contains the three indices of the tri-
angle vertices, followed by the coordinates of the corresponding texture (the
texture being given as the material associated to this primitive).
• When S = 12: the primitive is a textured quadrangle. The image
img p = (i1 , i2 , i3 , i4 ,tx1 ,ty1 ,tx2 ,ty2 ,tx3 ,ty3 ,tx4 ,ty4 ) contains the four indices of
the quadrangle vertices, followed by the coordinates of the corresponding tex-
ture (the texture being given as the material associated to this primitive).
The set of these primitives, associated to the set of vertices, entirely define the 3D
structure of the mesh object.
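To make this encoding concrete, here is a minimal sketch (not taken from the book) that builds and displays a single colored triangle:

// Three vertices stored column-wise, one primitive of size S = 3, one material.
CImg<> points = CImg<>::vector(0,0,0).append(CImg<>::vector(100,0,0),'x')
                                     .append(CImg<>::vector(50,80,0),'x');
CImgList<unsigned int> primitives;
CImg<unsigned int>::vector(0,1,2).move_to(primitives);   // Indices of the triangle vertices
CImgList<unsigned char> colors;
CImg<unsigned char>::vector(255,0,0).move_to(colors);    // Red material
CImg<unsigned char> visu(400,400,1,3,0);
visu.display_object3d("Single triangle",points,primitives,colors);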
• When the primitive is a sprite, img_p is the sprite image to draw. If we are
looking for a sprite with transparent pixels, we must associate a corresponding
opacity image, with the same dimensions, to the list of primitive opacities.
The knowledge of these few rules that define the set of possible properties of 3D
vector objects with CImg allows to build complex objects in a procedural way: by
iteratively adding in CImgList<T> new vertices, associated primitives and colored
or textured materials, we can easily create elaborate 3D visualizations. This is what
we illustrate in the next section, with an application to cardiac image segmentation.
The visualization of dense volumetric images such as these is never easy: the
number of points is often too large for the simultaneous display of all the voxels in 3D
to visually highlight the relevant structures of the images. Here, we therefore propose
to extract and display three features of interest:
1. The 3D surface corresponding to the inner side of the ventricle, to be extracted
from frame1.
2. The estimated motion vectors, displayed for the side points.
3. One of the full XY slice planes of frame1, allowing to locate the side of the
ventricle with respect to the original image data.
A 3D vector object will be constructed for each of these features. The union of these
objects will constitute the complete visualization.
The extraction of the 3D surface of the ventricle is performed in two steps: the ven-
tricle is first segmented in a purely volumetric way, by the region growth algorithm
implemented by the CImg<T>::draw_fill() method. This function starts from
a point in the image, with coordinates specified by the user, and fills all surrounding
pixels whose values are sufficiently close to the value of the initial point. This method
is usually called for filling shapes in classical 2D images, but it can be used for volu-
metric images as well.
Here we choose an origin point of coordinates (55, 55, 7), located in the bright
structure from which we want to extract the shape. We call one of the versions of
CImg<T>::draw_fill() allowing a region image to be passed as an argument.
This method will construct the set of voxels visited during the region growth. When
returning from CImg<T>::draw_fill(), the image region is a binary volu-
metric image with all voxels defining the interior of the 3D segmented region set to 1
(0 everywhere else).
CImg<> region;
float value = 0;
(+frame1).draw_fill(55,55,7,&value,1,region,40,true);    // Region growing on a copy of frame1
g_points = region.blur(1.f,1.f,3.f).threshold(0.5f).
  get_isosurface3d(g_primitives,0.5f);                    // Triangulated surface of the region
g_colors.insert(g_primitives.size(),
                CImg<unsigned char>::vector(255,128,0));  // One color per primitive
g_opacities.insert(g_primitives.size(),
                   CImg<>::vector(1));                    // Fully opaque primitives
CImg<>::vector(x,y,z).move_to(points);
CImg<>::vector(x + u, y + v, z + w).move_to(points);
CImg<unsigned int>::vector(ind,ind + 1).
move_to(g_primitives);
g_points.append_object3d(g_primitives,c_points,c_primitives);
g_colors.insert(c_colors);
g_opacities.insert(c_primitives.size(),CImg<>::vector(1));
All that remains is to generate a colored background, and launch CImg’s inter-
active 3D viewer using the CImg<T>::display_object3d() method, exactly
as we did in previous examples (Code 10.6). The final visualization is displayed in
Figure 10.5.
Figure 10.5 – 3D visualization of one ventricular side of the heart, from a sequence of
MRI volumetric images.
11. And So Many Other Things. . .
The field of digital image processing is so large that it is actually impossible to embrace
all its aspects exhaustively in a single book. In this last chapter, we therefore propose
some other possible applications, more original or more rarely discussed, and which
point out once again the flexibility and genericity of the CImg library to deal with
various problems. Thus, we will illustrate here how to efficiently compress an image,
how to reconstruct an object from its tomographic projections, how to recover the 3D
geometry of a scene from a pair of stereoscopic images, or how to develop an original
user interface to warp images (e.g., portraits) captured from the webcam. Afterwards,
we trust you to invent or discover other aspects of image processing that have not been
covered in this book!
(Figure: principle of compression by transform. The image I is split into sub-bands J1, ..., JM by a transform T; each sub-band Jm is quantized by a quantizer Qm into J̃m, forming the compressed image; the inverse transform T⁻¹ reconstructs Ĩ from the J̃m. For JPEG, decompression chains Huffman and RLE decoding, inverse quantization, inverse DCT, over-sampling, inverse color transformation and image restitution to obtain the reconstructed image.)
Code 11.1 computes the discrete cosine transform of a square image of size N × N.
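For reference, the forward transform computed by this code (and consistent with the inverse transform given below) is, for all (u, v) ∈ [[0, N − 1]]²:

    J[u, v] = (2/N) C[u] C[v] Σ_{i=0}^{N−1} Σ_{j=0}^{N−1} I[i, j] cos((2i + 1)uπ / 2N) cos((2j + 1)vπ / 2N),

with C[0] = 1/√2 and C[u] = 1 otherwise.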
/*
  JPEG_DCT: Compute the discrete cosine transform (DCT) of an NxN block.
  (Signature completed for readability: 'block' is the input block and
  'cosvalues' the table of precomputed cosines of Code 11.6.)
*/
CImg<> JPEG_DCT(CImg<>& block, CImg<>& cosvalues)
{
  int N = block.width();
  CImg<> dct(N,N,1,1,0);
  cimg_forXY(dct,i,j)
  {
    float
      ci = i==0 ? 1/std::sqrt(2.0f) : 1,
      cj = j==0 ? 1/std::sqrt(2.0f) : 1;
    cimg_forXY(block,x,y)
      dct(i,j) += block(x,y)*cosvalues(x,i)*cosvalues(y,j);
    dct(i,j) *= 2.0f/N*ci*cj;
  }
  return dct;
}
The inverse transform (Code 11.2) is defined for all (i, j) ∈ [[0, N − 1]]² by:

    I[i, j] = (2/N) Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} C[u] C[v] J[u, v] cos((2i + 1)uπ / 2N) cos((2j + 1)vπ / 2N)
/*
  JPEG_IDCT: Compute the inverse discrete cosine transform (iDCT) of a block.
  (Signature completed for readability, following the use of JPEG_DCT above.)
*/
CImg<> JPEG_IDCT(CImg<>& dct, CImg<>& cosvalues)
{
  int N = dct.width();
  CImg<> img(N,N,1,1,0);
  cimg_forXY(img,x,y)
  {
    cimg_forXY(dct,i,j)
    {
      float
        ci = (i==0) ? 1/std::sqrt(2.0f) : 1,
        cj = (j==0) ? 1/std::sqrt(2.0f) : 1;
      img(x,y) += ci*cj*dct(i,j)*cosvalues(x,i)*cosvalues(y,j);
    }
    img(x,y) *= 2.0f/N;
  }
  return img;
}
    ∀ (u, v) ∈ [[0, N − 1]]²,   J̃[u, v] = round( J[u, v] / (γ × Q[u, v]) )
where the quantization matrix (defined by the JPEG standard) is given by:
        [ 16  11  10  16  24  40  51  61 ]
        [ 12  12  14  19  26  58  60  55 ]
        [ 14  13  16  24  40  57  69  56 ]
    Q = [ 14  17  22  29  51  87  80  62 ]
        [ 18  22  37  56  68 109 103  77 ]
        [ 24  35  55  64  81 104 113  92 ]
        [ 49  64  78  87 103 121 120 101 ]
        [ 72  92  95  98 112 100 103  99 ]
The γ parameter adjusts the compression ratio (or the quality of the reconstructed
image). The larger γ is, the more loss the image will suffer, thus leading to a better
compression ratio. The implementation of the quantization step in the transformed
space is given by Code 11.3.
/*
  JPEGEncoder: block-wise DCT and quantization of an image.
  (Excerpt completed for readability: the loops over the 8x8 blocks are
  reconstructed; the quantization matrix Q, assumed to be already scaled by
  the gamma factor 'quality', is defined in the omitted part of the listing.)
*/
CImg<> JPEGEncoder(CImg<>& image, float quality, CImg<>& cosvalues)
{
  unsigned int N = 8;                            // Resolution of a block
  CImg<> comp(image.width(),image.height()),
         block(N,N), dct(N,N);
  for (unsigned int l = 0; l<image.height()/N; ++l)
    for (unsigned int k = 0; k<image.width()/N; ++k)
    {
      block = image.get_crop(k*N,l*N,(k + 1)*N - 1,(l + 1)*N - 1);
      dct = JPEG_DCT(block,cosvalues);
      cimg_forXY(dct,i,j)
        comp(k*N + i,l*N + j) = cimg::round(dct(i,j)/Q(i,j));
    }
  return comp;
}
Code 11.4 then illustrates the coefficient reconstruction. Of course, there is a loss
of information between the quantization and reconstruction steps.
/*
  JPEGDecoder: Compute the reconstructed image from the quantized DCT.
  (Excerpt: the loops over the blocks, the de-quantization and the call to
  JPEG_IDCT producing the block 'blk' are in the omitted part of the listing.)
*/
      cimg_forXY(blk,i,j)
        decomp(k*N + i,l*N + j) = blk(i,j);
    }
  return decomp;
}
The information loss in the compressed image can be evaluated by the distortion

    ε = (1/(N × M)) Σ_{i=0}^{N−1} Σ_{j=0}^{M−1} ( Ĩ[i, j] − I[i, j] )²,

which is the average quadratic error between the reconstructed and the initial images
(Code 11.5).
/*
  distortionRate: Compute the quadratic deviation (average quadratic error).
*/
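The body of Code 11.5 is not reproduced in this excerpt; following the definition of ε above, it may simply look like:

float distortionRate(const CImg<>& imgIn, const CImg<>& imgRec)
{
  // Mean squared error between the original and the reconstructed image.
  return (imgIn - imgRec).get_sqr().mean();
}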
The corresponding algorithm is given in Code 11.7, and uses a function (Code 11.6)
that precomputes the cosine values.
CImg<> genCosValues()
{
  // Precompute cosvalues(x,i) = cos((2x + 1) i pi / (2N)), with N = 8 as before.
  CImg<> cosinusvalues(N,N);
  cimg_forXY(cosinusvalues,i,x)
    cosinusvalues(x,i) = std::cos(((2*x+1)*i*cimg::PI)/(2*N));
  return cosinusvalues;
}
// Image reconstruction.
CImg<> comp_image = JPEGDecoder(dct_image,quality,cos_values);
// Display images.
(imgIn,dct_image,comp_image).display("Input image - "
"Image of the DCT blocks - "
"Decompressed image");
Figure 11.3 – Processing of a block by the DCT for the JPEG compression.
Figure 11.4 illustrates the compression by the DCT transform. The algorithm
is a simplified version of the JPEG compression in which the entropy coding step
(Huffman coding) is not implemented. The size of the compressed information can be
bounded by the number of non-zero coefficients in the DCT space, divided by the total
number of coefficients. The loss of quality is not proportional to the reduction in the
size of the compressed information.
Figure 11.4 – Compression by the DCT transform: reconstructed images for γ = 0.01, 1, 5, 10, 20 and 40, together with plots of the distortion ‖Ĩ − I‖² and of the percentage of non-zero DCT coefficients as functions of γ (91.2% of non-zero coefficients for γ = 0.01, 11% for γ = 1, 3.8% for γ = 5, 2.4% for γ = 10, 1.5% for γ = 20 and 0.7% for γ = 40).
(Figure: tomographic acquisition — a source of radiation S rotated by an angle θ around the object f(x, y); σ and τ are the coordinates in the rotated frame and pθ(σ) is the projection recorded for the angle θ; stacking the projections pθ(σ) for all angles forms the sinogram.)
RADON TRANSFORM
The link between the image f (x, y) and the projections pθ (σ ) is defined by the Radon
transform (RT):
    f(x, y)  ↔(RT)  pθ(σ) = ∫_R f(σ, τ) dτ
/*
  RadonTransform: Calculation of projections (<=> Radon transform).
*/
To do this, a first step is to establish a link between the projections and the image;
for that, we work in the Fourier domain of the projections.
    Pθ(q) = ∫_{−∞}^{+∞} pθ(σ) e^{−j2πσq} dσ                          (definition via the 1D Fourier transform)
          = ∫_{−∞}^{+∞} [ ∫_{−∞}^{+∞} f(σ, τ) dτ ] e^{−j2πσq} dσ
          = ∫∫_{R²} f(σ, τ) e^{−j2πσq} dσ dτ                         (change of variables (σ, τ) ⇒ (x, y))
          = ∫∫_{R²} f(x, y) e^{−j2π(xq cos θ + yq sin θ)} dx dy
          = F(u, v)   with u = q cos θ and v = q sin θ
/*
  BackProjTransform: Calculation of a simple back projection.
  (Excerpt: only the end of the listing is reproduced here.)
*/
  return bp;
}
The simple back projection is not the solution to the inverse Radon transform. On
the other hand, we can show that:
    pθ(σ)  ↔(RT⁻¹)  f(x, y) = ∫_0^π [ ∫_{−∞}^{+∞} Pθ(q) |q| e^{j2πqσ} dq ] dθ = ∫_0^π hθ(σ) dθ

with hθ(σ) ↔(FT1D) |q| Pθ(q). The image to be reconstructed f(x, y) is thus the back
projection of a quantity noted hθ (σ ). From the definition of hθ (σ ), we notice that it
is a filtered version of the projections pθ (σ ). The filter to be used has a frequency
response which is a ramp G (q) = |q|. This equation allows to propose a reconstruction
algorithm (Algorithm 6).
The filtering of the projections, arranged in the sinogram, is performed by Code 11.10.
The whole algorithm of tomographic reconstruction by filtered back projection is then
given in Code 11.11.
Code 11.10 – Filtering of the projections by the ramp filter.
/*
  SinogramFiltering: Filtering of the projections (sinogram) by the ramp filter.
  (Excerpt: the forward 1D FFT of the sinogram along y, stored in the list
  FFTy_sinog = [real, imaginary], is computed in the omitted part.)
*/
cimg_forX(FFTy_sinog[0],o)
{
  for (int sigma = 0; sigma<sinog.height()/2; ++sigma)
  {
    float coeff = sigma/(sinog.height()/2.0f);  // Ramp filter |q| (normalized frequency)
    FFTy_sinog[0](o,sigma) *= coeff;
    FFTy_sinog[0](o,sinog.height() - 1 - sigma) *= coeff;
    FFTy_sinog[1](o,sigma) *= coeff;
    FFTy_sinog[1](o,sinog.height() - 1 - sigma) *= coeff;
  }
}
// Calculation of the 1D inverse Fourier transform.
CImgList<> iFFTy_sinog = FFTy_sinog.get_FFT('y',true);
return iFFTy_sinog[0];
}
/*
Main function for analytical tomographic reconstruction
*/
In Fig. 11.6, the results of the analytical tomographic reconstruction are shown for
different numbers of projections. Theoretically, it is necessary to have a number of projections
(thus measurements) of the same order of magnitude as the size of the image. One of
the particularities of this reconstruction is the presence of artifacts due to the use of
back projection.
In Equation 11.1, the values of the N pixels of the image to be reconstructed are stored
in the vector f. They are linked to the M measurements which are stored in the vector
p through the projection matrix A.
The projection matrix models the acquisition system: the coefficient ai,j gives the
contribution of the pixel j to the measurement i. This modeling is illustrated in
Fig. 11.7. Its major interest is that it can take into account the geometry of the
beams (non-parallel geometries), the physical phenomena (attenuation, scattering, ...)
and the response of the acquisition system.
These different elements are illustrated in Fig. 11.8: the vector p which corresponds to
the measurements (in vectorized form), the projection matrix A which shows the link
between the pixels and the measurements and finally the image to be reconstructed f
which must be as close as possible to the original image.
/*
  ART algorithm for algebraic tomographic reconstruction.

  A      : Projection matrix
  p      : Measurements (or projections)
  nbiter : Number of iterations
*/
CImg<> ART(CImg<>& A,CImg<>& p, int nbiter)
{
float eps = 1e-8f;
CImg<> f(1,A.width(),1,1,0);
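The iterative update itself is not reproduced in this excerpt. A minimal sketch of a Kaczmarz-type update using the variables declared above (an assumption about the omitted body, not the book's exact code):

for (int it = 0; it<nbiter; ++it)
  cimg_forY(A,i)                                    // Loop over the M measurements
  {
    CImg<> ai = A.get_shared_row(i);                // i-th row of the projection matrix
    float corr = (p(0,i) - ai.dot(f))/(ai.get_sqr().sum() + eps);
    f += corr*ai.get_transpose();                   // Update of the N pixel values
  }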
To test this code, it is necessary to simulate the data acquisition process. Code 11.13
computes the projections of the image to be reconstructed; it is almost identical to
Code 11.8, except that the data are vectorized in the vector p.
/*
Simulation of p projections (ART algorithm)
imgIn : Original image
nbProj : Number of projection orientations
*/
CImg<> Projections(CImg<> imgIn, int nbProj)
{
  int size = imgIn.width();
  CImg<> p(1,nbProj*size,1,1,0);
  // ... (computation of the projections, similar to Code 11.8, omitted here)
Code 11.14 – Computation of the projection matrix for the ART algorithm.
/*
  Simulation of the projection matrix A (ART algorithm).
  (Excerpt: the declarations of A, imgPixel, size and nbProj are in the
  omitted part of the listing.)
*/
cimg_forXY(imgPixel,x,y)
{
  imgPixel(x,y) = 1;                        // Unit image: only pixel (x,y) is lit
  for (int o = 0; o<nbProj; ++o)
  {
    float orient = o*180.0f/nbProj;
    CImg<> rot_img = imgPixel.get_rotate(orient,
                                         size/2.0f,size/2.0f,3);
    cimg_forXY(rot_img,i,j)
      A(x + y*size,i + o*size) += rot_img(i,j); // Contribution of pixel (x,y) to measurement (i,o)
  }
  imgPixel(x,y) = 0;
}
return A;
}
Figure 11.9 shows the evolution of the ART algorithm over the iterations. On the
image reconstructed after 0.1% of the total number of iterations, we can see the update
of the pixels on the first projections that have been processed by the algorithm. The
image then tends iteratively towards the starting image.
Code 11.15 – Main program for the ART algorithm.
/*
Main program (ART algorithm)
*/
int main(int argc, const char * argv[])
{
cimg_usage("ART algorithm (tomographic reconstruction)");
const char *file = cimg_option("-i","phantom.bmp","Input image");
// (Declarations of the remaining parameters, such as nbProj, are truncated in this excerpt.)
// Simulation of measurements.
CImg<> p = Projections(img,nbProj);
return 0;
}
11.3 Stereovision
Stereovision is a set of image analysis techniques aiming at retrieving information
about the 3D structure and depth information of the real world from two (or more)
images acquired from different points of view (parallax). Far from being exhaustive,
we address two classical problems: the search for matching points between two images
and the estimation of the depth of objects.
EPIPOLAR CONSTRAINT
The epipolar constraint characterizes the fact that the corresponding point m2 of a point
m1 lies on a line ℓm1 in I2. Indeed, m2 necessarily belongs to the plane defined by
m1, C1 and C2. The line ℓm1 is called the epipolar line of the point m1 in I2. This
constraint is symmetrical. The epipolar lines of an image all intersect at a point e
called epipole, which corresponds to the projection of the center of projection of the
other image. Thus e1 = P1 C2 and e2 = P2 C1 .
FUNDAMENTAL MATRIX
(focal length, scale factors, optical axis) of the cameras. It is defined up to a scale factor.
The matrix F can be estimated from at least 8 point matches between images I1
and I2 . These points can be obtained automatically (using a point of interest detector
for example, see Section 6.1), or clicked by the user.
/*
  Manual selection of interest point pairs on stereo images.
*/
{
CImgList<> m(n,2,3);
int DG = 1, i = 0;
unsigned char red[] = { 255,0,0 }, gre[] = { 0,255,0 };
CImgDisplay Dis1(I1,"I1"), Dis2(I2,"I2");
/*
8-points algorithm.
n : Number of points
*/
CImg<> FundamentalMatrix(CImgList<>& m, int n) {
CImg<> F(3,3),
A(9,n);
// Matrix A.
for (int i = 0; i<n; ++i)
{
A(0,i) = m(i,1,0)*m(i,0,0);
A(1,i) = m(i,1,0)*m(i,0,1);
A(2,i) = m(i,1,0);
A(3,i) = m(i,1,1)*m(i,0,0);
A(4,i) = m(i,1,1)*m(i,0,1);
A(5,i) = m(i,1,1);
A(6,i) = m(i,0,0);
A(7,i) = m(i,0,1);
A(8,i) = 1;
}
// SVD.
CImg<> U, S, V;
A.SVD(U,S,V);
// F = last column of V.
CImg<> f = V.get_column(8);
F(0,0) = f(0); F(1,0) = f(1); F(2,0) = f(2);
F(0,1) = f(3); F(1,1) = f(4); F(2,1) = f(5);
F(0,2) = f(6); F(1,2) = f(7); F(2,2) = f(8);
return F;
}
F is generally not a rank 2 matrix (it would be of rank 2 if there were no measure-
ment, model or computational error). To impose the rank, we go again through the
singular value decomposition. Let F be the matrix previously computed. We look for
the matrix F̂ of rank 2 closest to F by computing its SVD F = UΣVᵀ, by imposing the
smallest singular value to be zero, and by reconstructing F̂ = UΣ̂Vᵀ, where Σ̂ differs
from Σ only by its smallest singular value, set to zero.
CImg<> U, S, V;
F.SVD(U,S,V);
S(2) = 0;
F = U*S.diagonal()*V.transpose();
The epipolar line in I2 of a point m1 ∈ I1 is then given by ℓm1 = F m1, and the
equation of the line in the image plane is ℓm1ᵀ (x  y  1)ᵀ = 0 (Fig. 11.11), that is:
CImgDisplay Disp2(I2,"I2");
CImg<> l = F*m1;
float
l1x = -l(2)/l(0),
l1y = 0,
l2x = 0,
l2y = -l(2)/l(1);
I2.draw_line(l1x,l1y,l2x,l2y,red,1.0).display(Disp2);
Figure 11.11 – Epipolar line computed using the fundamental matrix: the two images I1
and I2, the clicked point m1 and the corresponding epipolar line. (Source of the
original image: [34].)
MATCHING
A possible use of the epipolar line is the matching of points detected on images I1 and
I2 .
Let’s suppose that we have ni points of interest detected on the image Ii , 1 ≤ i ≤ 2
by a dedicated algorithm (Harris and Stephens detector type, see Section 6.1). Among
these points, we assume that at least eight correspondences are known, so that the fundamental
matrix F can be estimated. The question that arises is the following: for a point of
interest P1j detected on I1 , is it possible to find its corresponding point on I2 , if it has
indeed been detected?
A simple solution to this matching problem is to look for the point on the epipolar
line associated with P1j with maximum correlation. For this purpose, we define around
P1j a neighborhood of size t, and we look for the window of the same size, centered at
a point on the epipolar line, having a maximum correlation. This technique, similar to
motion estimation by correlation (see Section 8.2.1), is very simple to implement and
is robust to errors in the estimation of F. It assumes a conservation of luminance, and
no geometrical deformation of the two images.
Other methods exist, not discussed here, such as the RANSAC algorithm, the phase
shift approach or methods based on dynamic programming.
R Epipolar lines have no reason to be parallel, but parallel epipolar lines would make
it easier to find correspondences and to estimate the depth. We can rectify the
images, i.e., make the epipolar lines parallel, if we know how to compute the images
corresponding to the same scene with parallel image planes. For this, we can
generate an image of the scene by rotation around the optical center without knowing
the scene: it amounts to projecting the images on a plane parallel to the baseline (the
line passing through C1 and C2) in order to obtain rectified views.
The matching can be done by epipolar matching, as we have seen. It is also possible
to generate dense disparity maps, by computing the disparity of each pixel using its
neighborhood.

Figure 11.12 – Calculating the depth z of a point m = (x, y, z)ᵀ from the geometry: m
projects to m1 = (x1, y1)ᵀ in I1 and to m2 = (x2, y2)ᵀ in I2; b is the distance between
the two optical centers C1 and C2 of the cameras and f their focal length.

The matching costs of each pixel with each disparity level in a given
interval (disparity interval) are calculated. These costs determine the probability of a
good matching, with these two quantities being inversely proportional. The costs are
then aggregated into a neighborhood of a given size. The best match for each pixel is
then sought by selecting the minimum cost match, independently of the other pixels.
Different measures are classically used for the matching costs, the most common
being the Sum of Absolute Differences (SAD), the Sum of Squared Differences (SSD),
the Normalized Cross-Correlation (NCC) and the Sum of Hamming Distances (SHD).
Code 11.18 implements the calculation of the disparity by the sum of squared
differences, for disparity values in the range [-dbound,dbound] and a neighbor-
hood of size h.
Figure 11.13 shows as a result of this algorithm the estimation of the depth,
knowing b and f .
/*
  Computing the disparity using SSD.
  (Excerpt: the declarations of I1, I2, disparityMap, the patches and the
  half-size h2 of the neighborhood are in the omitted part of the listing.)
*/
cimg_forXY(I1,x,y)
{
  patch1 = I1.get_crop(x - h2,y - h2,x + h2,y + h2);
  float min = 1e8f;
  int arg = 0;                          // Best disparity found so far
  for (int dx = -dbound; dx<=dbound; ++dx)
  {
    patch2 = I2.get_crop(x + dx - h2,y - h2,x + dx + h2,y + h2);
    float ssd = (patch1 - patch2).get_sqr().sum();
    if (ssd<min)
    {
      min = ssd;
      arg = dx;
    }
  }
  disparityMap(x,y) = arg;
}
return disparityMap;
}
Figure 11.13 – Depth map calculated on the two rectified images I1 and I2. (Source of
the original images: [34].)
a) Image from the webcam, and interactive placement of keypoints. b) Image deformation
obtained after moving the keypoints.
1. The ability to access the image stream from the webcam. With CImg, this is
made easy by using the CImg<T>::load_camera() method, which inter-
nally uses functions of the OpenCV library to access the image stream. When
compiling the program, all you need to do is enable the cimg_use_opencv
macro, and link the executable with the OpenCV libraries to make use of this
functionality.
where φ : R → R is a pre-defined radial function, which will be evaluated for all the
L2 distances between each point x for which we want to estimate f (x) and the known
points xk of f . Classically, we choose φ (r) = r2 log(r) for an interpolation known as
Thin Plate Spline.
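The function phi() used by the code below is not shown in this excerpt; for the Thin Plate Spline choice above, it may simply be written as (requires <cmath>):

// Thin Plate Spline radial basis function, with the convention phi(0) = 0.
float phi(float r) { return r>0 ? r*r*std::log(r) : 0; }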
The weights wl being estimated, the function f can then be evaluated at any point
x ∈ Rn of the space, with Equation 11.2.
The RBF interpolation works in the same way for any dimension n of the definition
space of f: it interpolates the known values in a continuous way, and these values
do not have to be located on a regular grid of Rⁿ. From an algorithmic point of view,
the method relies mainly on the inversion of a K × K symmetric matrix, for which
there exists an algorithm in O(K³). This means that the computation time increases
very quickly with the number K of known samples.
Figure 11.15 shows the application of the RBF interpolation to the reconstruction
of a gray-level image of size 224×171, from a small selection of 1187 value samples
(representing 3% of the total image data).
We first initialize this set of keypoints with four points, corresponding to the four
corners of the image:
CImgList<> keypoints;
CImg<>::vector(0,0,0,0).move_to(keypoints);
CImg<>::vector(100,0,100,0).move_to(keypoints);
CImg<>::vector(100,100,100,100).move_to(keypoints);
CImg<>::vector(0,100,0,100).move_to(keypoints);
CImg<> M(keypoints.size(),keypoints.size());
cimg_forXY(M,p,q) { // Filling the matrix M
float
xp = keypoints(p,0)*(img.width()-1)/100,
yp = keypoints(p,1)*(img.height()-1)/100,
xq = keypoints(q,0)*(img.width()-1)/100,
yq = keypoints(q,1)*(img.height()-1)/100,
r = cimg::hypot(xq - xp,yq - yp);
M(p,q) = phi(r);
}
CImg<> F(2,keypoints.size());
cimg_forY(F,p) { // Filling the matrix F
float
xp = (keypoints(p,0) - keypoints(p,2))*(img.width()-1)/100,
yp = (keypoints(p,1) - keypoints(p,3))*(img.height()-1)/100;
F(0,p) = xp;
F(1,p) = yp;
}
The variable img is a CImg<unsigned char> containing the image from the
webcam. The function cimg::hypot(a,b), used to calculate r, simply returns
√(a² + b²). With these two matrices defined, the system MW = F can be solved using
the method CImg<T>::solve(), which returns the 2 × N image that contains the
weights W:
CImg<> W = F.get_solve(M); // Get the RBF weights (solves M*W = F)
CImg<> warp(img.width()/4,img.height()/4,1,2);
cimg_forXY(warp,x,y) {
float u = 0, v = 0;
cimglist_for(keypoints,p) {
float
xp = keypoints(p,0)*(img.width()-1)/100,
yp = keypoints(p,1)*(img.height()-1)/100,
r = cimg::hypot(4*x - xp,4*y - yp),
phi_r = phi(r);
u += W(0,p)*phi_r;
v += W(1,p)*phi_r;
}
warp(x,y,0) = 4*x - u;
warp(x,y,1) = 4*y - v;
}
warp.resize(img.width(),img.height(),1,2,3);
The desired image warp is then obtained by applying the displacement field warp
to the original image img, using the method CImg<T>::warp():
img.warp(warp,0,1,1);
The creation of an empty CImgDisplay instance does not directly display a window
on the screen. It will only appear when an image is displayed for the first time. We
can then write this first simple event loop associated to the display disp:
Code 11.21 – Event loop for the display of the image and its keypoints.
do {
  // Get the webcam image.
  img.load_camera(0,640,480,0,false);
  // ... (drawing of the keypoints and update of the display 'disp' are omitted here)
} while (!disp.is_closed());  // Assumed termination condition for this excerpt
This event loop simply displays in a window the image stream from the webcam
(in 640 × 480 resolution). It also displays the keypoints defined in the keypoints list
(so for the moment, only the four corners of the image).
The user management of these keypoints (adding, moving, deleting) is not very
difficult to implement, as shown in Code 11.22, to be added at the end of the loop.
Code 11.22 – Keypoint management.
where
int selected = -1;
is a variable that is defined at the beginning of the program (outside the event
loop), and which gives the index of the selected keypoint (if any), or -1 if no keypoint
has been selected.
All these blocks of code, put together, make it possible to build a fun and interactive
distorting mirror application.
List of CImg Codes
CHAPTER 2
CHAPTER 3
CHAPTER 4
CHAPTER 5
CHAPTER 6
CHAPTER 7
CHAPTER 8
8.1: Horn and Schunck method for estimating the displacement field. . . . 186
8.2: Direct method for estimating the displacement field. . . . . . . . . . 187
8.3: Multi-scale resolution scheme for the variational methods. . . . . . . 189
8.4: Lucas and Kanade algorithm without eigenelement analysis. . . . . . 191
8.5: Lucas and Kanade algorithm with eigenelement analysis. . . . . . . . 192
8.6: Calculus of the parameters of a similarity at a point m. . . . . . . . . 195
8.7: Estimate the object position by correlation. . . . . . . . . . . . . . . 196
8.8: Estimation of a translation by phase correlation. . . . . . . . . . . . . 200
8.9: Object tracking by phase correlation. . . . . . . . . . . . . . . . . . 203
8.10: Kalman filter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
CHAPTER 9
CHAPTER 10
CHAPTER 11
[25] Y. Ma and Y. Fu. Manifold Learning Theory and Applications. CRC Press,
2012 (cited on page 210).
[26] J. Macqueen. “Some methods for classification and analysis of multivariate
observations”. In: In 5-th Berkeley Symposium on Mathematical Statistics and
Probability. 1967, pages 281–297 (cited on page 170).
[27] L. Maddalena and A. Petrosino. “Towards Benchmarking Scene Background
Initialization”. In: New Trends in Image Analysis and Processing – ICIAP
2015 Workshops. Edited by Vittorio Murino et al. Cham: Springer International
Publishing, 2015, pages 469–476 (cited on page 119).
[28] T. Ojala, M. Pietikäinen, and T. Mäenpää. “Multiresolution Gray-Scale and
Rotation Invariant Texture Classification with Local Binary Patterns”. In:
IEEE Transactions on Pattern Analysis and Machine Intelligence 24.7 (2002),
pages 971–987. ISSN: 0162-8828 (cited on pages 145, 146).
[29] S. Osher and J. A. Sethian. “Fronts Propagating with Curvature-Dependent
Speed: Algorithms Based on Hamilton-Jacobi Formulations”. In: J. Comput.
Phys. 79.1 (Nov. 1988), pages 12–49. ISSN: 0021-9991 (cited on pages 153,
157).
[30] N. Otsu. “A Threshold Selection Method from Gray-Level Histograms”. In:
IEEE Transactions on Systems, Man and Cybernetics 9.1 (1979), pages 62–66
(cited on page 164).
[31] P. V. C. Hough. A Method and Means for Recognizing Complex Patterns. US
Patent: 3,069,654. Dec. 1962 (cited on page 128).
[32] P. Perona and J. Malik. “Scale-Space and Edge Detection Using Anisotropic Dif-
fusion”. In: IEEE Trans. Pattern Anal. Mach. Intell. 12.7 (July 1990), pages 629–
639. ISSN: 0162-8828 (cited on page 114).
[33] W. Pesnell, B. Thompson, and P. Chamberlin. “The Solar Dynamics Observa-
tory (SDO)”. In: Solar Physics 275 (Nov. 2012), pages 3–15 (cited on page 213).
[34] D. Scharstein and R. Szeliski. “A Taxonomy and Evaluation of Dense Two-
Frame Stereo Correspondence Algorithms”. In: International Journal of Com-
puter Vision 47.1/2/3 (Apr. 2002), pages 7–42 (cited on pages 269, 272).
[35] C. Schmid, R. Mohr, and C. Bauckhage. “Evaluation of Interest Point Detec-
tors”. In: Int. J. Comput. Vis. 37.2 (2000), pages 151–172 (cited on page 122).
[36] J. Serra. Image Analysis and Mathematical Morphology. USA: Academic Press,
Inc., 1983. ISBN: 0126372403 (cited on pages 53, 58, 59).
[37] J.A. Sethian. Level Set Methods and Fast Marching Methods: Evolving Inter-
faces in Computational Geometry, Fluid Mechanics, Computer Vision, and
Materials Science. Cambridge Monographs on Applied and Computational
Mathematics. Cambridge University Press, 1999. ISBN: 9780521645577 (cited
on pages 156, 158–160).
[38] C. E. Shannon. “A Mathematical Theory of Communication”. In: The Bell Sys-
tem Technical Journal 27 (July 1948), pages 379–423, 623–656 (cited on pages 4,
244).
[39] J. Shi and C. Tomasi. “Good Features to Track”. In: IEEE Conference on
Computer Vision and Pattern Recognition (1994), pages 593–600 (cited on
page 126).
[40] H. Tamura, S. Mori, and T. Yamawaki. “Texture features corresponding to
visual perception”. In: IEEE Transactions on Systems, Man and Cybernetics
8.6 (1978) (cited on page 140).
[41] B. Triggs and M. Sdika. “Boundary conditions for Young-van Vliet recursive
filtering”. In: IEEE Transactions on Signal Processing 54.6 (2006), pages 2365–
2367 (cited on page 88).
[42] D. Tschumperlé and B. Besserer. “High Quality Deinterlacing Using Inpainting
and Shutter-Model Directed Temporal Interpolation”. In: Computer Vision and
Graphics: International Conference, ICCVG 2004, Warsaw, Poland, September
2004, Proceedings. Edited by K Wojciechowski et al. Dordrecht: Springer
Netherlands, 2006, pages 301–307 (cited on page 111).
[43] I.T. Young and L.J. Van Vliet. “Recursive implementation of the Gaussian
filter”. In: Signal processing 44.2 (1995), pages 139–151 (cited on page 88).
[44] T. Y. Zhang and Ching Y. Suen. “A Fast Parallel Algorithm for Thinning Digital
Patterns.” In: Commun. ACM 27.3 (1984), pages 236–239 (cited on page 64).
Index

A
accumulator 130
active contour, geodesic 156
active contours 151

B
Bernsen, algorithm 167
Butterworth, filter 103

C
CImg (methods): append_object3d 228; blur 21, 112; box3d 228; CMYKtoCMY 218;
CMYKtoRGB 218; CMYtoCMYK 218; CMYtoRGB 218; crop 26; cut 29; cylinder3d 228;
diagonal 234; display_graph 50; distance_eikonal 161; draw_fill() 175, 237;
draw_object3d 228; draw_quiver 189; draw_rectangle 27, 196; eigen 126;
equalize 50; fill 29, 43; get_FFT 97; get_dilate 54; get_elevation3d 231, 239;
get_erode 54; get_gradient 22, 75; get_isosurface3d 237; get_norm 20;
get_structure_tensors 126; get_symmetric_eigen 212; HSItoRGB 217; HSLtoRGB 217;
HSVtoRGB 217; LabtoRGB 216; LabtoXYZ 216; RGBtoCMY 218; RGBtoCMYK 218;
RGBtoHSI 217; RGBtoHSL 217; RGBtoHSV 217; RGBtoLab 216; RGBtoXYZ 215;
RGBtoYCbCr 216; RGBtoYUV 216; XYZtoLab 216; XYZtoRGB 215; XYZtoxyY 215;
YCbCrtoRGB 216; YUVtoRGB 216

T
texture, coarseness 141