Deformable Models
Biomedical and Clinical Applications

Jasjit S. Suri
Eigen LLC
Grass Valley, CA 95945
USA
[email protected]

Aly A. Farag
Professor of Electrical and Computer Engineering
Computer Vision and Image Processing Laboratory
University of Louisville
Louisville, KY 40292
USA
[email protected]

Series Editor
Evangelia Micheli-Tzanakou
Professor and Chair
Department of Biomedical Engineering
Rutgers University
Piscataway, NJ 08854-8014
[email protected]

springer.com
PREFACE
Chapter 2 examines the distance transform and other distance measures and
provides an approach to evaluate them. Interpolation and skeletonization are dis-
cussed as applications for the distance transform in image analysis and processing.
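As a rough illustration of the family of algorithms the chapter evaluates (a generic sketch, not the chapter's implementation), the classic two-pass chamfer scan approximates the Euclidean distance transform of a binary image; the function name below is hypothetical.

```python
import numpy as np

def chamfer_distance(binary):
    """Approximate distance to the nearest foreground (True) pixel using
    the classic two-pass scan over a 3x3 neighborhood."""
    INF = 1e9
    d = np.where(binary, 0.0, INF)
    rows, cols = d.shape
    a, b = 1.0, np.sqrt(2.0)  # orthogonal / diagonal step costs
    # Forward pass: top-left to bottom-right
    for i in range(rows):
        for j in range(cols):
            if i > 0:
                d[i, j] = min(d[i, j], d[i - 1, j] + a)
                if j > 0:
                    d[i, j] = min(d[i, j], d[i - 1, j - 1] + b)
                if j < cols - 1:
                    d[i, j] = min(d[i, j], d[i - 1, j + 1] + b)
            if j > 0:
                d[i, j] = min(d[i, j], d[i, j - 1] + a)
    # Backward pass: bottom-right to top-left
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if i < rows - 1:
                d[i, j] = min(d[i, j], d[i + 1, j] + a)
                if j > 0:
                    d[i, j] = min(d[i, j], d[i + 1, j - 1] + b)
                if j < cols - 1:
                    d[i, j] = min(d[i, j], d[i + 1, j + 1] + b)
            if j < cols - 1:
                d[i, j] = min(d[i, j], d[i, j + 1] + a)
    return d
```

The two raster scans propagate distances outward from every foreground pixel; with step costs (1, √2) the result is exact along rows, columns, and diagonals, and a slight overestimate in other directions.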
Chapter 3 deals with structural inversion for modeling structural information
in medical imaging. It provides a brief introduction to some techniques that have
been recently developed for solving structural inverse problems using level sets.
Chapter 4 deals with shape- and texture-based deformable models as applied
to facial image analysis. The chapter examines various characteristics of active
shape, appearance, and morphable models.
Chapter 5 describes a method for identification of the breast boundary in
mammograms and its use in CAD systems for the breast and other applications.
Chapter 6 describes the use of statistical deformable models for cardiac segmentation and functional analysis in gated Single Photon Emission Computed Tomography (SPECT) perfusion studies.
Chapter 7 presents an implicit formulation for dual snakes based on the level
set approach. The key idea is to view the inner/outer contours as a level set
of a suitable embedding function. Details of the approach are provided with
applications to segmentation of cell images.
Chapter 8 describes a generalized approach for monotonically tracking ad-
vancing fronts using a multistencil fast marching (MSFM) method, which com-
putes a solution at each grid point by solving the eikonal equation along several
stencils and then picks the solution that satisfies the fast marching causality rela-
tionship.
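For context, the standard single-stencil first-order fast marching update (not the multistencil variant the chapter generalizes) solves, at each grid point, a quadratic built from upwind neighbor values, falling back to a one-sided update when the causality condition fails. The helper below is a hypothetical sketch; the name and signature are not from the chapter.

```python
import math

def eikonal_update(t_x, t_y, h, F):
    """First-order fast marching update at one 2D grid point.
    t_x, t_y: smallest known arrival times of the horizontal and
    vertical neighbors (math.inf if none is known yet).
    Solves (T - t_x)^2 + (T - t_y)^2 = (h/F)^2 when the causality
    condition holds, otherwise uses the one-sided solution."""
    a, b = min(t_x, t_y), max(t_x, t_y)
    rhs = h / F
    if b == math.inf or b - a >= rhs:
        # Two-sided quadratic would violate causality (T <= b):
        # use the one-dimensional upwind update instead
        return a + rhs
    # Largest root of 2T^2 - 2(a + b)T + a^2 + b^2 - rhs^2 = 0
    disc = 2 * rhs ** 2 - (b - a) ** 2
    return 0.5 * (a + b + math.sqrt(disc))
```

With both neighbors known at time 0 and unit speed, the update returns h/√2, the root of 2T² = h².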
Chapter 9 examines the use of deformable models for image segmentation
and introduces an approach to reduce the dependency on initialization that utilizes
object and background differentiation through watershed theories.
Chapter 10 describes the use of deformable models for detection of renal re-
jection as detected by Dynamic Contrast Enhanced Magnetic Resonance Images
(DCE-MRI). The approach involves segmentation of the kidney from surround-
ing tissues and alignment of the segmented cross-sections to remove the motion
artifacts. The renogram describing the kidney perfusion is constructed from the gray-level distribution of the aligned cross-sections.
Chapter 11 deals with a class of physically and statistically based deformable
models and their use in medical image analysis. These models are used for seg-
mentation and shape modeling.
Chapter 12 reviews deformable organisms, a decision-making framework for medical image analysis that complements bottom-up, data-driven deformable models with top-down, knowledge-driven model-fitting strategies in a layered fashion inspired by artificial life modeling concepts.
Chapter 13 provides a detailed description and analysis for use of PDE meth-
ods for path planning with application to virtual colonoscopy. The method works
in two passes: the first identifies the important topological nodes, while the second
pass computes the flight path of organs by tracking them starting from each
identified topological node.
Chapter 14 describes an approach for object tracking in a sequence of ultra-
sound images using the Hausdorff distance and entropy in level sets. The approach
tracks the region of interest (ROI) using information in previous and current slices
and accomplishes segmentation with Tsallis entropy. The Hausdorff distance is
used to match candidate regions against the ROI in the previous image. This
information is then used in a level set formulation to obtain the final output.
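As a generic sketch of the matching measure (not the chapter's implementation), the Hausdorff distance between two point sets follows directly from its definition:

```python
import numpy as np

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two point sets A and B,
    given as arrays of shape (n, d) and (m, d)."""
    # Pairwise Euclidean distances, shape (n, m)
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    h_ab = D.min(axis=1).max()  # farthest A-point from its nearest B-point
    h_ba = D.min(axis=0).max()  # farthest B-point from its nearest A-point
    return max(h_ab, h_ba)
```

The brute-force pairwise matrix is O(nm); for contours with a few hundred points, as in slice-to-slice tracking, this is usually acceptable.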
Chapter 15 describes a deformable model-based approach for image registration. A nonuniform interpolation function is used in estimating the joint histogram between the target and reference scans. A segmentation-guided nonrigid registration framework is described.
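To make the joint-histogram idea concrete, a minimal mutual-information estimate between two aligned images can be computed with plain binning (the chapter's nonuniform interpolation is more refined; this sketch is generic and the function name is hypothetical):

```python
import numpy as np

def mutual_information(img1, img2, bins=32):
    """Estimate mutual information between two aligned images from
    their joint gray-level histogram (simple binning sketch)."""
    hist, _, _ = np.histogram2d(img1.ravel(), img2.ravel(), bins=bins)
    pxy = hist / hist.sum()               # joint probability
    px = pxy.sum(axis=1, keepdims=True)   # marginal of img1
    py = pxy.sum(axis=0, keepdims=True)   # marginal of img2
    nz = pxy > 0                          # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())
```

Registration then amounts to maximizing this quantity over the transformation parameters applied to one of the images.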
The authors of these chapters deserve great credit for preparing first-class manuscripts that will stand the test of time and guarantee the long-term value of these two volumes. Several people at Springer deserve special credit for carrying out this project in such a professional fashion. In particular, Aaron Johnson, Senior Editor; Beverly Rivero, Editorial Assistant; Tim Oliver, Project Manager; and Amy Hendrickson, LaTeX Consultant, made every effort to bring this project smoothly to its completion.
Finally, Jasjit Suri and Aly Farag acknowledge the support of their families
and express their gratitude to their collaborators and graduate students.
Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Chapter 1
SIMULATING BACTERIAL BIOFILMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
David L. Chopp
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2. General Mathematical Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Example: Quorum-Sensing in P. aeruginosa Biofilms . . . . . . . . . . . . . . . . 6
4. Numerical Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
5. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
6. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Chapter 2
DISTANCE TRANSFORM ALGORITHMS AND THEIR
IMPLEMENTATION AND EVALUATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
George J. Grevera
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2. Distance Transform Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3. Evaluating Distance Transform Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4. Results of Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5. Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
6. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
7. Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Chapter 3
LEVEL SET TECHNIQUES FOR STRUCTURAL INVERSION
IN MEDICAL IMAGING . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Oliver Dorn and Dominique Lesselier
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2. Level Set Techniques for Linear Inverse Problems . . . . . . . . . . . . . . . . . . . . 63
3. Level Set Techniques for Nonlinear Inverse Problems . . . . . . . . . . . . . . . . 76
4. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Chapter 4
SHAPE AND TEXTURE-BASED DEFORMABLE MODELS
FOR FACIAL IMAGE ANALYSIS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Stan Z. Li, Zhen Lei, Ying Zheng, and Zeng-Fu Wang
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
2. Classical Deformable Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3. Motivations for Improvements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4. Direct Appearance Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5. Texture-Constrained Active Shape Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6. Evaluation for Face Alignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7. Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
8. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
10. Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
11. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Chapter 5
DETECTION OF THE BREAST CONTOUR IN MAMMOGRAMS
BY USING ACTIVE CONTOUR MODELS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Ricardo J. Ferrari, Rangaraj M. Rangayyan, J.E. Leo Desautels,
Annie F. Frère and Rejane A. Borges
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
2. Method 1: Identification of the breast boundary using a traditional
active deformable contour model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
3. Method 2: Identification of the breast boundary using an adaptive
active deformable contour model (AADCM). . . . . . . . . . . . . . . . . . . . . . . . . 142
4. Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
6. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
7. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
Chapter 6
STATISTICAL DEFORMABLE MODELS FOR CARDIAC
SEGMENTATION AND FUNCTIONAL ANALYSIS IN
GATED-SPECT STUDIES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
C. Tobon-Gomez, S. Ordas, A.F. Frangi, S. Aguade and J. Castell
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
2. Three-Dimensional Active Shape Models (3D-ASM) . . . . . . . . . . . . . . . . . 172
3. Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
4. Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
5. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
6. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Chapter 7
LEVEL SET FORMULATION
FOR DUAL SNAKE MODELS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Gilson A. Giraldi, Paulo S.S. Rodrigues, Rodrigo L.S. Silva,
Antonio L. Apolinário Jr. and Jasjit S. Suri
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
2. Background Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
3. T-Snakes Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
4. Dual-T-Snakes Algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
5. Level Set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
6. Dual-Level-Set Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
7. Segmentation Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
8. Dual-Level-Set Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
9. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
10. Conclusions and Future Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
12. Appendix: Theoretical Perspectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Chapter 8
ACCURATE TRACKING OF MONOTONICALLY
ADVANCING FRONTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
M. Sabry Hassouna and A. A. Farag
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
2. The Fast Marching Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
3. Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
4. Numerical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
5. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
6. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
7. Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
Chapter 9
TOWARD CONSISTENTLY BEHAVING DEFORMABLE MODELS
FOR IMPROVED AUTOMATION IN IMAGE SEGMENTATION . . . . . . . . 259
Rongxin Li and Sébastien Ourselin
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
2. Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
3. Skeleton by Influence Zones Based on Topographic Distance . . . . . . . . . . 264
4. Computing the GTD Transform and Swamping Transform . . . . . . . . . . . . 266
5. Image Partitioning Based on GTD transforms . . . . . . . . . . . . . . . . . . . . . . . . 268
6. Integration into Deformable Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
7. Qualitative Evaluations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
8. Quantitative Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
9. Application: Determination of Tissue Density . . . . . . . . . . . . . . . . . . . . . . . 285
10. Discussion and Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
12. Notes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
Chapter 10
APPLICATION OF DEFORMABLE MODELS FOR THE DETECTION
OF ACUTE RENAL REJECTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
Ayman El-Baz, Aly A. Farag, Seniha E. Yuksel, Mohamed E.A. El-Ghar,
Tarek A. Eldiasty, and Mohamed A. Ghoneim
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
2. Related Work in Renal Image Analysis Using DCE-MRI . . . . . . . . . . . . . 296
3. Related Work in Shape-Based Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . 298
4. Methods and Data Acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
5. Kidney Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
6. Model for the local deformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
7. Cortex Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
8. Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
9. Conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
Chapter 11
PHYSICALLY AND STATISTICALLY BASED DEFORMABLE
MODELS FOR MEDICAL IMAGE ANALYSIS . . . . . . . . . . . . . . . . . . . . . . . . . 335
Ghassan Hamarneh and Chris McIntosh
1. Energy-Minimizing Deformable Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
2. Smart Snakes: Incorporating Knowledge about Shape . . . . . . . . . . . . . . . . 351
3. Statistically Constrained Snakes: Combining ACMs and ASMs . . . . . . . 363
4. Deformable Spatiotemporal Shape Models:
Extending Active Shape Models to 2D+Time . . . . . . . . . . . . . . . . . . . . . . . . 372
5. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
Chapter 12
DEFORMABLE ORGANISMS FOR MEDICAL IMAGE ANALYSIS . . . . . 387
Ghassan Hamarneh and Chris McIntosh
1. Introduction and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
2. Deformable Organisms:
An Artificial Life Modeling Paradigm for Medical Image Analysis. . . . . 391
3. The Layered Architecture of Deformable Organisms . . . . . . . . . . . . . . . . . 395
4. Results and Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
5. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
6. Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
7. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
Chapter 13
PDE-BASED THREE DIMENSIONAL PATH PLANNING
FOR VIRTUAL ENDOSCOPY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
M. Sabry Hassouna, Aly A. Farag, and Robert Falk
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
2. Previous Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
3. Limitation of Existing Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
4. The Proposed Medial Curve Extraction Framework . . . . . . . . . . . . . . . . . . 449
5. Validation and Sensitivity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
6. Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
7. Conclusion and Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
Chapter 14
OBJECT TRACKING IN IMAGE SEQUENCE COMBINING
HAUSDORFF DISTANCE, NON-EXTENSIVE ENTROPY IN
LEVEL SET FORMULATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
Paulo S. Rodrigues, Gilson A. Giraldi, Ruey-Feng Chang and Jasjit S. Suri
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478
2. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481
3. Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
4. Proposed Hausdorff-Tsallis Level Set Algorithm . . . . . . . . . . . . . . . . . . . . . 494
5. Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
6. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
7. Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
Chapter 15
DEFORMABLE MODEL-BASED IMAGE REGISTRATION . . . . . . . . . . . . 517
Jundong Liu
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
2. Mutual Information Metric and Artifact Effects . . . . . . . . . . . . . . . . . . . 522
3. Segmentation-Guided Deformable Image Registration Frameworks . . . . 531
4. Discussion and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 538
5. References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
CONTRIBUTORS
Ruey-Feng Chang
Department of Computer Science and Information Engineering
National Chung Cheng University
Chiayi, Taiwan

David L. Chopp
Department of Engineering Sciences and Applied Mathematics
Northwestern University
Evanston, Illinois, USA

Tarek A. Eldiasty
Urology and Nephrology Department
University of Mansoura
Mansoura, Egypt

Mohamed E.A. El-Ghar
Urology and Nephrology Department
University of Mansoura
Mansoura, Egypt

Ricardo J. Ferrari
Department of Electrical and Computer Engineering
University of Calgary
Calgary, Alberta, Canada

A.F. Frangi
Computational Imaging Laboratory
Universitat Pompeu Fabra
Barcelona, Spain

Annie F. Frère
Department of Electrical and Computer Engineering
University of Calgary
Calgary, Alberta, Canada

Mohamed A. Ghoneim
Urology and Nephrology Department
University of Mansoura
Mansoura, Egypt

Gilson A. Giraldi
National Laboratory for Scientific Computing
Petropolis, Brazil

George J. Grevera
Mathematics and Computer Science Department
St. Joseph’s University
Philadelphia, Pennsylvania, USA

Zhen Lei
National Laboratory of Pattern Recognition
Institute of Automation
Chinese Academy of Sciences
Beijing, China

Rongxin Li
BioMedIA Lab
Autonomous Systems Laboratory
CSIRO ICT Centre
Marsfield, New South Wales 2122, Australia

Stan Z. Li
National Laboratory of Pattern Recognition
Institute of Automation
Chinese Academy of Sciences
Beijing, China

Jundong Liu
School of Electrical Engineering and Computer Science
Ohio University
Athens, Ohio, USA

Chris McIntosh
School of Computing Science
Simon Fraser University
Burnaby, British Columbia, Canada

S. Ordas
Computational Imaging Laboratory
Universitat Pompeu Fabra
Barcelona, Spain
SIMULATING BACTERIAL
BIOFILMS
David L. Chopp
Department of Engineering Sciences and Applied Mathematics
Northwestern University, Evanston, Illinois
Biofilms are the most ubiquitous form of life on the planet. More than 90% of bacteria live in
biofilms, which are aggregates of cells attached to both biotic and abiotic surfaces [6, 13].
Biofilms are responsible for nitrogen loss from agricultural fertilizers, and they deplete
oxygen in streams, cause disease in humans and plants, and foul pipes, heat exchangers,
and ship hulls. Biofilms are responsible for a number of human diseases, including cystic
fibrosis and Legionnaires’ disease, and are a potential source of nosocomial infections.
According to The Biofilm Institute, biofilms cost U.S. industry billions of dollars annually
in equipment and product damage, energy losses, and human infections.
On the other hand, biofilms are also exploited for their good properties. Biofilms
are employed for treating sewage, industrial waste streams, and contaminated groundwa-
ter. Biofilms can also be used to improve nutrient cycling through plant roots to improve
agricultural productivity, and are used to produce a wide variety of biochemicals used in
medicines, food additives, and cleaning products. Because of their immense impact on
society, both positively and negatively, understanding biofilms is an important goal. In
this chapter, we describe how the level set method is used to simulate biofilm growth and
development.
1. INTRODUCTION
Address correspondence to David L. Chopp, Engineering Sciences and Applied Mathematics Depart-
ment, Northwestern University, Evanston, IL 60208-3125, USA. [email protected].
are used to treat nitrogen-contaminated water and wastewater [54]. The nitrifying
bacteria are autotrophs that consume inorganic carbon sources and produce mi-
crobial products that can be consumed by heterotrophic bacteria. Heterotrophic
bacteria rely on organic carbon sources such as glucose and typically grow much
faster than the autotrophs. In these types of biofilms, the heterotrophs shield the
nitrifiers from the external environment while the autotrophs provide an additional
source of nutrients. Biofilms are also composed of inert biomass produced by dead
and decaying cells, and extracellular polymeric substances (EPS), which comprise
the glue that holds the biofilm together and bonds it to surfaces [35].
The biofilm structure has many important properties that give cells in biofilms
an advantage over free-floating cells. For example, biofilms are well known to have
a much higher resistance to antimicrobial treatment [13, 14, 60, 67]. While the
precise reason is not known, several different hypotheses have been proposed for
how biofilm formation enhances antimicrobial resistance [67]. For example, the
EPS may have a large binding-site capacity that inactivates antimicrobials before
they are able to penetrate the full depth of the biofilm. This gives bacteria deep in
the biofilm protection from treatment, and they are consequently more difficult to
eradicate.
In addition to the structural components of biofilms, there are other important
soluble diffusing substances present in biofilms. Some substances, such as sub-
strates, may come from external sources, while other substances, such as certain
soluble microbial products, are produced by the biofilms themselves. For example,
several bacterial species are known to produce specialized molecules, called sig-
nals, which allow the bacteria to monitor their local population densities [23, 24].
When the signal concentration reaches a critical threshold, the bacteria are said to
be quorum-sensing, and upregulate certain genes, leading to a change in behavior
[15, 51].
Computer simulations of biofilms have generally been done using discrete-type models. The most notable among these methods is the individual-based model (IbM) developed by the Delft group [32, 33, 68]. In this model, biofilms
are represented by a collection of packed spheres. Each sphere is divided into
mass fractions representing bacteria, EPS, and other residual biomass. Diffusing
substances, such as substrates, are modeled using continuum equations. Biomass
reproduction is simulated through two distinct mechanisms: growth of spheres and
sphere division. Because the growth model leads to overlaps between neighboring
spheres, an elastic relaxation problem is solved to allow overlapping spheres to
push apart until all overlaps are resolved. While this approach has produced
interesting simulations, e.g., [74], the growth rules can be viewed as arbitrary.
Other discrete-type models include cellular automata-based methods [5, 20,
21, 27–29, 48–50, 52, 73]. In this case, simple rules are applied to each cell,
or collection of cells, for substrate consumption, soluble products production,
and reproduction. As in the IbM method, the substrate and soluble products are
typically modeled as a continuum, while the biomass is represented discretely.
where $\mu_i^j$ is the rate at which species $i$ is produced by species $j$, and $\mathbf{v}$ is the velocity of the biomass. Note that in most cases $\mu_i^j \equiv 0$ for $i \neq j$, but not always. For example, if species $i$ produces EPS, then $\mu_{\mathrm{EPS}}^i \neq 0$. Likewise, EPS is not able to reproduce itself; hence $\mu_{\mathrm{EPS}}^{\mathrm{EPS}} = 0$.
The velocity, v, is generated by changes in total biomass, which are not able
to go through the substratum, the surface on which the biofilm is growing. The
velocity is computed from the mass balance equations. Since the sum of the
volume fractions is constant, then
$$
\begin{aligned}
0 &= \frac{\partial}{\partial t} \sum_{i=1}^{n} X_i \\
&= \sum_{i=1}^{n} \left[ \sum_{j=1}^{n} \frac{\rho_j}{\rho_i}\, \mu_i^j(S_1, \ldots, S_m)\, X_j - \nabla \cdot (\mathbf{v} X_i) \right] \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\rho_j}{\rho_i}\, \mu_i^j(S_1, \ldots, S_m)\, X_j - \nabla \cdot \left( \mathbf{v} \sum_{i=1}^{n} X_i \right) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\rho_j}{\rho_i}\, \mu_i^j(S_1, \ldots, S_m)\, X_j - \nabla \cdot \mathbf{v}.
\end{aligned}
$$
Next, we make the assumption that the velocity field is irrotational so that
v can be derived from a potential, v = ∇Φ for some function Φ. This gives an
equation to determine the velocity field:
$$
\nabla^2 \Phi = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\rho_j}{\rho_i}\, \mu_i^j(S_1, \ldots, S_m)\, X_j. \tag{3}
$$
This equation is coupled with boundary conditions including no flux through solid
surfaces, i.e., ∂Φ/∂n = 0, and constant Φ = 0 at the biofilm/liquid interface.
Once the velocity potential is computed, then the normal speed of the biofilm/
liquid interface is given by ∂Φ/∂n.
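A minimal numerical sketch of this boundary-value problem (with an assumed discretization and plain Gauss-Seidel iteration; not code from the chapter): solve the Poisson equation for the potential on a rectangular grid, taking the biofilm/liquid interface as the top row (where Φ = 0) and imposing no-flux conditions on the substratum and side walls.

```python
import numpy as np

def velocity_potential(g, h, iters=5000):
    """Solve grad^2 Phi = g on a rectangular grid of spacing h (nx >= 2).
    Top row: biofilm/liquid interface, Phi = 0 (Dirichlet).
    Bottom row and sides: no-flux (dPhi/dn = 0, via mirrored ghost nodes).
    Plain Gauss-Seidel iteration; a sketch, not production code."""
    ny, nx = g.shape
    phi = np.zeros((ny, nx))
    for _ in range(iters):
        for i in range(1, ny):          # row 0 stays Phi = 0
            for j in range(nx):
                # Mirror ghost nodes enforce dPhi/dn = 0 on closed sides
                up = phi[i - 1, j]
                down = phi[i + 1, j] if i < ny - 1 else phi[i - 1, j]
                left = phi[i, j - 1] if j > 0 else phi[i, j + 1]
                right = phi[i, j + 1] if j < nx - 1 else phi[i, j - 1]
                phi[i, j] = 0.25 * (up + down + left + right - h * h * g[i, j])
    return phi
```

The normal speed of the interface can then be approximated by a one-sided difference of Φ across the top row.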
$$
\frac{\partial S_j}{\partial t} = \sum_{i=1}^{n} \eta_i^j(S_1, \ldots, S_m)\, X_i - \beta_j S_j + D_j \nabla^2 S_j, \tag{4}
$$
where $\eta_i^j$ is the rate of production ($\eta_i^j > 0$) or consumption ($\eta_i^j < 0$) of substance $S_j$ by $X_i$, $\beta_j$ is the rate of degradation of $S_j$ due to hydrolysis or other effects, and $D_j$ is the diffusion coefficient. The diffusion coefficient may vary by substance,
and also by location, e.g., inside or outside the biofilm [59].
For the length scales at work in biofilms, the time scale for diffusion is typically
on the order of seconds, while the time scale for growth is on the order of days.
Consequently, the mass balance equations for the diffusing substances are taken
to be quasi-steady state. Thus, the equations to be solved are
$$
D_j \nabla^2 S_j = -\sum_{i=1}^{n} \eta_i^j(S_1, \ldots, S_m)\, X_i + \beta_j S_j. \tag{5}
$$
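The claimed separation of time scales can be checked with representative order-of-magnitude values; the numbers below are illustrative assumptions, not taken from the chapter.

```python
# Order-of-magnitude check of the quasi-steady assumption.
# All values are illustrative assumptions, not from the chapter.
L = 100e-6          # biofilm thickness: ~100 micrometers, in meters
D = 1e-9            # diffusion coefficient of a small solute in water, m^2/s
mu = 1.0 / 86400.0  # biomass growth rate: ~1 per day, in 1/s

t_diffusion = L ** 2 / D  # characteristic diffusion time, ~10 seconds
t_growth = 1.0 / mu       # characteristic growth time, one day in seconds

# Diffusion equilibrates thousands of times faster than the biomass grows,
# which justifies treating the substrate fields as quasi-steady.
ratio = t_growth / t_diffusion
```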
Note that the summation in (5) is nonzero only inside the biofilm, and is zero outside it. Furthermore, since the diffusion coefficient is typically smaller inside the biofilm than outside, (5) is really a combination of two elliptic equations coupled through a shared boundary, the interface of the biofilm, along with the interface conditions:
∂Sj
Dj = 0, [Sj ] = 0. (6)
∂n
Here, the brackets [·] indicate the jump across the biofilm/liquid interface. The
first condition is continuity of flux, and the second condition is continuity of
concentration.
The boundary conditions for (5) depend on the particular system being
modeled. One example of boundary conditions is given in the next section.
All the equations discussed in this section are summarized in Figure 1.
6 DAVID L. CHOPP
concentration of the signal builds up. When the concentration reaches a threshold,
the bacteria upregulate certain gene expressions and are said to be quorum-sensing
[15, 51]. Quorum-sensing bacteria increase signal molecule production tenfold,
and also release virulence factors which lead to tissue damage. Understanding
quorum-sensing is an important goal in the effort to treat cystic fibrosis.
For this model, the biofilm is assumed to consist of two components: a single
bacterial species, X1 , and EPS, XEPS . In this model, inert biomass will be included
in the XEPS term. The substrate is assumed to be saturating, so the reaction rates are
limited by availability of O2 represented by S1 . The signal molecule concentration
will be given by S2 . Following the description in the previous section and using
the reactions detailed in [9, 10], the structure equations are
\[
\frac{\partial X_1}{\partial t} = \left(Y_{x/o}\,\hat{q}_o\,\frac{S_1}{K_o + S_1} - b\right)X_1 - \nabla\cdot(vX_1), \tag{7}
\]
\[
\frac{\partial X_{\mathrm{EPS}}}{\partial t} = \frac{\rho_1}{\rho_{\mathrm{EPS}}}\left((1 - f_D)\,b + Y_{w/o}\,\hat{q}_o\,\frac{S_1}{K_o + S_1}\right)X_1 - \nabla\cdot(vX_{\mathrm{EPS}}). \tag{8}
\]
In (7), the first term in µ_1^1 gives the rate of reproduction and the second term gives
the rate of endogenous decay. The reactions require the presence of oxygen; hence
the Monod term S_1/(K_o + S_1). In (8), the first term in µ_EPS^1 represents the accumulation
of inert biomass from the nonbiodegradable portions of cells undergoing
endogenous decay, and the second term represents the direct production of EPS
by the bacteria. Note that both equations have been divided by the corresponding
densities ρ1 and ρEPS .
The quasi-steady state substrate and signal equations are
\[
D_o\,\nabla^2 S_1 = \rho_1\left(\hat{q}_o\,\frac{S_1}{K_o + S_1} + \gamma f_D b\right)X_1, \tag{9}
\]
\[
D_a\,\nabla^2 S_2 = -\rho_1\left(\beta_1\,\frac{S_1}{K_o + S_1} + \beta_2 + \beta_3 H(S_2 - A)\right)X_1 + \beta_4 S_2. \tag{10}
\]
In (9), the two terms correspond to consumption of O2 for reproduction and en-
dogenous decay as described above. In (10), the first term corresponds to signal
production by bacteria that have access to O2 . The second term corresponds to
signal production independent of the environment. The third term corresponds to
the increased signal production of bacteria that are quorum-sensing, the function
H(S2 − A) is the Heaviside function, and A is the signal concentration thresh-
old for quorum-sensing. The final term corresponds to hydrolysis of the signal
molecules.
The boundary conditions for (9), (10) are no flux through the attachment
surface, i.e., ∂S_j/∂n = 0, and a fixed constant at the edge of the boundary layer,
S_j|_top = S̄_j. The O2 concentration in the bulk flow is S̄_1, while the signal
concentration in the bulk flow is S̄_2 = 0.
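The reaction terms of (7)–(10) can be sketched as plain functions of the concentrations; every parameter value below is a placeholder I chose for illustration, not a value from the chapter.

```python
# Hypothetical sketch of the reaction terms in (7)-(10); all parameter values
# are made-up placeholders, not taken from the chapter.

def monod(S1, Ko):
    """Monod oxygen-limitation factor S1 / (Ko + S1)."""
    return S1 / (Ko + S1)

def growth_rates(S1, S2, p):
    """Return (mu_11, mu_1EPS, signal production) per unit of X1."""
    m = monod(S1, p["Ko"])
    mu_11 = p["Yxo"] * p["qo"] * m - p["b"]                    # reproduction - decay, eq (7)
    mu_1eps = (1 - p["fD"]) * p["b"] + p["Ywo"] * p["qo"] * m  # inert + EPS production, eq (8)
    quorum = 1.0 if S2 >= p["A"] else 0.0                      # Heaviside H(S2 - A)
    signal = p["b1"] * m + p["b2"] + p["b3"] * quorum          # signal terms of eq (10)
    return mu_11, mu_1eps, signal

params = {"Ko": 0.5, "Yxo": 0.5, "qo": 8.0, "b": 0.1, "fD": 0.8,
          "Ywo": 0.3, "A": 1.0, "b1": 0.01, "b2": 0.001, "b3": 0.1}

# Crossing the quorum threshold A raises signal production by b3, mirroring
# the increased production of quorum-sensing bacteria described in the text.
sig_quiet = growth_rates(1.0, 0.5, params)[2]
sig_quorum = growth_rates(1.0, 2.0, params)[2]
```

The Heaviside switch is what makes the signal equation (10) nonlinear, which is why an accurate elliptic solver is needed later.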
4. NUMERICAL IMPLEMENTATION
The simulations presented in this chapter use the level set method to track
the location of the biofilm surface as it grows. This information is important for a
number of the steps in the numerical solution of the mathematical model described
in the previous section. We shall also see that certain steps in the process are
sensitive and require specialized numerical tools in conjunction with the level set
method. These new tools will be described in detail in Section 4.2.
Both of the first two steps in the above algorithm involve solving a set of
nonlinear elliptic equations. In Step 1, the equations are solved on the
whole domain, but with an interface internal to the domain that imposes
additional constraints. In Step 2, the equations are solved only inside the
region where φ < 0, hence it is an additional nonlinear elliptic equation
solved on an irregularly shaped domain. Both of these steps will require
an accurate solver since there are often sharp boundary layers near the
biofilm/fluid interface due to stiff reactions.
Step 3 will use standard conservative upwind methods for propagating the
volume fractions as the biofilm expands.
In order to advance φ in Step 4, we will require evaluation of the normal
derivative of the velocity potential, and then use velocity extensions.
In the remainder of this section, we will detail how these steps are imple-
mented.
a growing number of applications, including crack growth [11, 18, 40, 62, 65],
material interfaces and voids [64], solidification [22, 30], and moving solids in
fluid flow [70].
Consider the elliptic equation
\[
\nabla\cdot(\beta\nabla u) + \kappa u = f, \tag{11}
\]
which is approximated in the X-FEM by an enriched finite-element expansion of the form
\[
u^h(x) = \sum_{i\in N} u_i\,\varphi_i(x) + \sum_{j\in N_E} a_j\,\varphi_j(x)\,\psi(\phi(x)), \tag{12}
\]
where N is the set of all nodes in the domain, N_E is the set of enriched nodes,
ϕ_i is a standard finite-element basis function corresponding to node n_i, ψ is an
enrichment function, and φ is the signed distance function from Γ. Coefficients
u_i and a_j are the unenriched and enriched degrees of freedom, respectively. In
practice, set N_E is only a small subset of N, consisting of those nodes bounding
elements that are cut by Γ.
Following the ordinary transformation of (11) into its equivalent weak formulation,
coefficients u_i, a_j are computed by solving system Ax = b, where matrix
A can be broken down into four submatrices:
\[
A = \begin{pmatrix} A^{UU} & A^{UE} \\ A^{EU} & A^{EE} \end{pmatrix}. \tag{13}
\]
Table 1. Size and density of the linear systems generated by the X-FEM and the IIM

            X-FEM                       IIM
   n   System size   Density     System size   Density
  19        520      2.07396%         400      1.09250%
  39      1,840      0.54277%       1,600      0.29234%
  79      6,880      0.13840%       6,400      0.07558%
 159     26,560      0.03490%      25,600      0.01921%
 319    104,320      0.00876%     102,400      0.00484%
The most common, and simplest, enrichment functions are a step function
[66] and a linear ramp function [31]. However, enrichment functions
specific to a particular application can also be used, for example a square
root singularity function around crack tips [3]. In each case, the enrichment
functions are given as functions of the signed distance to the interface. This
is easily evaluated when the method is coupled with the level set method.
The resulting matrix that must be inverted for the X-FEM is a little larger,
and not quite as sparse as the one generated by the IIM. In Table 1, the
dimensions and sparsity for a sample problem are compared. The computational
cost of solving the two systems is comparable.
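The two common enrichment functions named above can be sketched directly as functions of the signed distance; the 1D element and the nodal values below are hypothetical choices for illustration, not from the chapter.

```python
# Illustrative sketch of the step and linear ramp enrichment functions, both
# evaluated through the signed distance phi to the interface. The 1D element
# and nodal values are hypothetical.

def psi_step(phi):
    """Step enrichment: lets the solution jump across the interface."""
    return 1.0 if phi > 0.0 else 0.0

def psi_ramp(phi):
    """Linear ramp enrichment |phi|: continuous, with a kink in the gradient
    at the interface (phi = 0)."""
    return abs(phi)

def enriched_field(x, nodes, u, a, phi, psi):
    """Evaluate u^h(x) = sum u_i N_i(x) + sum a_j N_j(x) psi(phi(x)) on a
    1D element [nodes[0], nodes[1]] with linear hat functions."""
    x0, x1 = nodes
    N0 = (x1 - x) / (x1 - x0)  # hat function at node 0
    N1 = (x - x0) / (x1 - x0)  # hat function at node 1
    phi_x = N0 * phi[0] + N1 * phi[1]  # interpolated signed distance at x
    return (u[0] * N0 + u[1] * N1) + (a[0] * N0 + a[1] * N1) * psi(phi_x)

# Interface at x = 0.5 (where phi changes sign): with the step enrichment the
# field jumps across it even though the hat functions are continuous.
left = enriched_field(0.499, (0.0, 1.0), (1.0, 1.0), (2.0, 2.0), (-0.5, 0.5), psi_step)
right = enriched_field(0.501, (0.0, 1.0), (1.0, 1.0), (2.0, 2.0), (-0.5, 0.5), psi_step)
```

Swapping `psi_step` for `psi_ramp` yields a field that is continuous but has a kink at the interface, the behavior needed for discontinuous coefficients rather than discontinuous solutions.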
\[
\nabla^2 u = 0 \tag{14}
\]
in the domain [−1, 1] × [−1, 1], with Γ consisting of a circle of radius 1/2 in the
center. At the interface, the following jump conditions are specified:
Table 2. Global max-norm errors for the X-FEM and the IIM solving system (14)–(16)

    n        X-FEM            IIM
   19   1.7648 × 10^−4   3.6253 × 10^−3
   39   6.0109 × 10^−5   4.6278 × 10^−4
   79   1.7769 × 10^−5   3.0920 × 10^−4
  159   4.8626 × 10^−6   1.1963 × 10^−4
  319   1.2362 × 10^−6   4.5535 × 10^−5
The exact solution is
\[
u(x, y) = \begin{cases} e^x \cos y, & r \le \tfrac{1}{2}, \\[2pt] 0, & r > \tfrac{1}{2}. \end{cases} \tag{17}
\]
Both the X-FEM and the IIM were used to solve system (14)–(16). In Table 2,
the global accuracy of the two methods is compared. This measures the accuracy
over the whole domain Ω. While the convergence rates are the same, Table 2 shows
how the X-FEM produces an order of magnitude better solution than the IIM. If
the error is measured only around the interface, the X-FEM again outperforms the
IIM as shown in Table 3.
As noted earlier, while the global error is important, it is critical that the
gradient of the solution be accurately computed from the solution. In this regard,
the X-FEM performs dramatically better, as shown in Table 4. Taking the derivative
of the numerical solution is expected to produce lower accuracy as is observed.
However, while the X-FEM produces two or three digits of accuracy, the IIM is
unable to produce any digits of accuracy for this problem. It is worth noting that
this problem is typical of the difficulty faced in (11). One explanation for the
dramatic difference in the results is that the immersed interface method is only
computed at the grid points, so interpolation is required to obtain values at the
interface, which does not generally pass directly through grid points. On the other
Table 4. Max-norm errors in the gradient of the solution for the X-FEM and the IIM
solving system (14)–(16)

    n        X-FEM            IIM
   19   5.6520 × 10^−2   3.0009 × 10^+1
   39   2.4190 × 10^−2   5.5185 × 10^+1
   79   9.4512 × 10^−3   1.2034 × 10^+2
  159   7.1671 × 10^−3   2.6466 × 10^+2
  319   2.6865 × 10^−3   5.2870 × 10^+2
Table 5. Errors in the normal derivative at the interface for the X-FEM and the IIM
solving (18)

    n      X-FEM          IIM
   19   1.9 × 10^−1   2.8 × 10^+2
   39   4.1 × 10^−2   6.9 × 10^+2
   79   1.1 × 10^−2   8.3 × 10^+2
  159   3.6 × 10^−3   8.0 × 10^+2
  319   2.3 × 10^−3   8.6 × 10^+2
hand, the X-FEM, through the use of suitable enrichment functions, can provide
more accurate solutions between the grid points.
Comparable results are obtained for a problem much more similar to (11). In
this example, the two methods are compared solving
\[
\nabla^2 u - \lambda^2 u = 0, \tag{18}
\]
where λ ≫ 1 inside a circle, and λ = 0 outside the circle. Table 5 shows
the errors in computing the normal derivative to the interface. This example is
significantly more difficult, because of the large value of λ, and hence an additional
enrichment function was used to improve the accuracy of the X-FEM. In this case,
an exponential function was used, since it is a reasonable approximation to the
solution in a neighborhood of the interface. A side view of the solution for a circle
of radius 1/4 is shown in Figure 2. With the inclusion of this extra enrichment
function, the X-FEM is able to produce at least two digits of accuracy, while the
IIM is not.
While the X-FEM does outperform the IIM in terms of accuracy and solvabil-
ity of the resulting linear system, this does come at a price. The IIM is significantly
easier to program than the X-FEM. In particular, generating the entries in the
Figure 2. Sample solution of the Helmholtz equation on a circle using the X-FEM with
exponential enrichment. See attached CD for color version.
The X-FEM, given the right enrichment functions, can accurately compute
solutions of elliptic equations that are often required for computing the
interface velocity.
The nodes to be enriched are easily identified using the signed distance
construction of the level set function [61, 63, 64, 66].
Figure 3. Graph of the substrate concentration, S1, in and around the biofilm.
The velocity potential equation that is obtained from (7), (8) is given by
\[
\nabla^2\Phi = \left(Y_{x/o}\,\hat{q}_o\,\frac{S_1}{K_o + S_1} - b + \frac{\rho_1}{\rho_{\mathrm{EPS}}}\left((1 - f_D)\,b + Y_{w/o}\,\hat{q}_o\,\frac{S_1}{K_o + S_1}\right)\right)X_1. \tag{19}
\]
Figure 4 shows a sample graph of the solution of (19) for the substrate concentration
computed in Figure 3. Note that this equation is solved only in the interior of
the biofilm, and is constant zero outside the biofilm. Larger negative values are
indicative of greater local growth rates.
Figure 4. Graph of the velocity potential, Φ, computed from the substrate concentration
in Figure 3.
Inside the biofilm, the volume fractions evolve according to
\[
\frac{\partial X_i}{\partial t} = \sum_{j=1}^{n}\frac{\rho_j}{\rho_i}\,\mu_i^j X_j - \nabla\cdot(vX_i)
 = \sum_{j=1}^{n}\frac{\rho_j}{\rho_i}\,\mu_i^j X_j - X_i\,\nabla\cdot v - v\cdot\nabla X_i
\]
\[
 = \sum_{j=1}^{n}\frac{\rho_j}{\rho_i}\,\mu_i^j X_j - X_i\,\nabla^2\Phi - v\cdot\nabla X_i
 = \sum_{j=1}^{n}\frac{\rho_j}{\rho_i}\,\mu_i^j X_j - X_i\sum_{k=1}^{n}\sum_{j=1}^{n}\frac{\rho_j}{\rho_k}\,\mu_k^j X_j - v\cdot\nabla X_i. \tag{20}
\]
Equation (20) is only valid inside the biofilm, but as described below, the values
of Xi are extended outside the domain. Thus, the finite-difference approximation
in (21) is valid for all points inside the biofilm.
Note that the volume fractions are updated before the interface is advanced
according to the velocity, v. Thus, while Xi are updated inside the biofilm here,
there may be points that are currently outside the biofilm, which will subsequently
be inside the biofilm after the interface is advanced. To account for this, the values
of Xi are extended outside the biofilm. To do this, it is assumed that ∂Xi /∂n = 0.
In terms of the level set function, φ, which encapsulates the biofilm/liquid interface,
the unit normal to the interface is given by
\[
n = \frac{\nabla\phi}{|\nabla\phi|}. \tag{22}
\]
The condition ∂X_i/∂n = 0 then becomes
\[
0 = \nabla X_i \cdot n = \nabla X_i \cdot \frac{\nabla\phi}{|\nabla\phi|}. \tag{23}
\]
This equation is easily recognized as the equation required for computing velocity
extensions in the level set method [1]. However, in this instance, the Xi data
need only be extended one layer of additional grid points outside the biofilm, as
illustrated in Figure 5.
The extended value of Xi is computed by discretizing (23) in the same
manner as is done for velocity extensions in the level set method. For example,
consider point (xj , yk ) indicated in Figure 5. This grid point has two neighboring
interior grid points. For this point, the discretized version of (23) will be
\[
\frac{X_{j+1,k} - X_{j,k}}{\Delta x}\,\frac{\phi_{j+1,k} - \phi_{j,k}}{\Delta x} + \frac{X_{j,k} - X_{j,k-1}}{\Delta y}\,\frac{\phi_{j,k} - \phi_{j,k-1}}{\Delta y} = 0, \tag{24}
\]
Figure 5. Illustration of where Xi values must be extended. The black dots are inside the
biofilm, and the gray dots are where the extension values must be computed.
where here we have dropped the subscript i from Xi for clarity. Equation (24)
is easily solved for the extension value X_{j,k}:
\[
X_{j,k} = \frac{X_{j,k-1}\,\dfrac{\phi_{j,k} - \phi_{j,k-1}}{\Delta y^2} - X_{j+1,k}\,\dfrac{\phi_{j+1,k} - \phi_{j,k}}{\Delta x^2}}{\dfrac{\phi_{j,k} - \phi_{j,k-1}}{\Delta y^2} - \dfrac{\phi_{j+1,k} - \phi_{j,k}}{\Delta x^2}}. \tag{25}
\]
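Solving (24) for the outside value can be sketched as follows; the neighbor configuration follows Figure 5 (interior neighbors to the right and below), and all numeric values are hypothetical.

```python
# Sketch of the extension value solved from (24) for the configuration in
# Figure 5: grid point (j, k) lies just outside the biofilm with interior
# neighbors at (j+1, k) and (j, k-1). All values below are hypothetical.

def extend_value(X_right, X_below, phi_jk, phi_right, phi_below, dx, dy):
    """Solve the discretized extension equation (grad X . grad phi = 0) for
    the outside value X_{j,k}, given the two interior neighbor values."""
    a = (phi_right - phi_jk) / dx**2  # x-direction phi difference (negative here)
    b = (phi_jk - phi_below) / dy**2  # y-direction phi difference (positive here)
    return (X_below * b - X_right * a) / (b - a)

# A constant field extends to the same constant, consistent with dX/dn = 0;
# unequal neighbors give an average weighted by the phi differences.
c = extend_value(3.0, 3.0, 0.1, -0.1, -0.1, 1.0, 1.0)
v = extend_value(2.0, 6.0, 0.1, -0.1, -0.1, 1.0, 1.0)
```

The denominator is positive whenever the point is outside and both neighbors are inside (φ decreases into the biofilm), so the formula is well defined in exactly the configuration where it is applied.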
The interface is advanced using the level set evolution equation
\[
\phi_t + F\,|\nabla\phi| = 0. \tag{26}
\]
The key to all level set method applications is the generation of speed function F.
Once F is determined, φ is easily updated via (26) using upwind finite-difference
methods borrowed from techniques used to solve hyperbolic conservation laws.
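A minimal upwind update of (26) can be sketched in 1D; the Godunov upwinding below and the grid parameters are my assumed details, not the chapter's implementation.

```python
# Minimal sketch (assumed details, not the chapter's code) of one upwind Euler
# step of phi_t + F |grad phi| = 0 in 1D, with Godunov upwinding for F > 0.
import math

def level_set_step(phi, F, dx, dt):
    n = len(phi)
    new = phi[:]
    for i in range(n):
        # One-sided differences (falling back to one side at the domain edges).
        dm = (phi[i] - phi[i - 1]) / dx if i > 0 else (phi[i + 1] - phi[i]) / dx
        dp = (phi[i + 1] - phi[i]) / dx if i < n - 1 else (phi[i] - phi[i - 1]) / dx
        # Godunov scheme for F > 0.
        grad = math.sqrt(max(dm, 0.0) ** 2 + min(dp, 0.0) ** 2)
        new[i] = phi[i] - dt * F * grad
    return new

# phi = |x| - 0.5 has its zero level set at x = +/-0.5; with F = 1 the
# interface moves outward at unit speed, so after one step of size dt the
# right crossing sits at 0.5 + dt.
dx, dt = 0.01, 0.005
xs = [i * dx - 1.0 for i in range(201)]
phi = [abs(x) - 0.5 for x in xs]
phi1 = level_set_step(phi, 1.0, dx, dt)
```

Away from the kink at x = 0 the upwind gradient evaluates to exactly 1 for this φ, so the whole profile simply shifts down by dt, which is the expected outward motion of both interfaces.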
In the case of the biofilm application presented here, speed function F may
only be determined on the interface, and is given by
\[
F\big|_\Gamma = \frac{\partial\Phi}{\partial n}, \tag{27}
\]
where Φ is the velocity potential discussed in Section 4.2. Once F|_Γ is computed,
it must be extended off the interface to the rest of the domain so that we can use
(26). This is accomplished through a velocity extension, discussed later in this
section.
Since Φ is only defined on the inside of the biofilm, ∂Φ/∂n must be computed
using a one-sided finite-difference approximation, (28), evaluated at points
x ∈ Γ on the interface.
The interface location itself is found from a bicubic interpolant, p, of the level
set function, constructed on the grid cell containing the interface so as to match
φ and its derivatives at the surrounding grid points (x_k, y_ℓ):
\[
p(x_k, y_\ell) = \phi(x_k, y_\ell), \qquad
\frac{\partial p}{\partial x}(x_k, y_\ell) = \frac{\partial\phi}{\partial x}(x_k, y_\ell),
\]
\[
\frac{\partial p}{\partial y}(x_k, y_\ell) = \frac{\partial\phi}{\partial y}(x_k, y_\ell), \qquad
\frac{\partial^2 p}{\partial x\,\partial y}(x_k, y_\ell) = \frac{\partial^2\phi}{\partial x\,\partial y}(x_k, y_\ell).
\]
Given a grid point (x_0, y_0) near the interface, the nearest interface point (x_1, y_1) satisfies
\[
p(x_1, y_1) = 0, \tag{30}
\]
\[
\nabla p(x_1, y_1) \times \big((x_0, y_0) - (x_1, y_1)\big) = 0. \tag{31}
\]
Equation (30) is a requirement that (x1 , y1 ) must be on the interface. Equation (31)
is a requirement that the interface normal, given by ∇p(x1 , y1 ), must be aligned
with the line through points (x0 , y0 ) and (x1 , y1 ). Equations (30), (31) are solved
simultaneously using Newton’s method. Typically, less than five iterations are
necessary in order to achieve sufficient accuracy.
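The Newton iteration for (30)–(31) can be sketched as follows. For illustration, p is taken as an exact level set of a circle of radius 1/2 rather than a bicubic interpolant, and the Jacobian is approximated by finite differences; both choices are my assumptions.

```python
# Hypothetical sketch of solving (30)-(31) with Newton's method for the point
# (x1, y1) on the interface nearest a grid point (x0, y0). Here p is an exact
# level set of a circle of radius 1/2 instead of a bicubic interpolant, and
# the Jacobian is approximated by finite differences.

def p(x, y):
    return x * x + y * y - 0.25  # circle of radius 1/2

def nearest_interface_point(x0, y0, tol=1e-10):
    def G(x, y):
        h = 1e-7
        px = (p(x + h, y) - p(x - h, y)) / (2 * h)
        py = (p(x, y + h) - p(x, y - h)) / (2 * h)
        # (30): point lies on the interface; (31): 2D cross product of grad p
        # with the direction (x0, y0) - (x1, y1).
        return (p(x, y), px * (y0 - y) - py * (x0 - x))
    x, y = x0, y0  # start Newton from the grid point itself
    for _ in range(20):
        g1, g2 = G(x, y)
        if abs(g1) < tol and abs(g2) < tol:
            break
        h = 1e-6  # finite-difference Jacobian of G
        a11 = (G(x + h, y)[0] - g1) / h; a12 = (G(x, y + h)[0] - g1) / h
        a21 = (G(x + h, y)[1] - g2) / h; a22 = (G(x, y + h)[1] - g2) / h
        det = a11 * a22 - a12 * a21
        x -= (a22 * g1 - a12 * g2) / det
        y -= (a11 * g2 - a21 * g1) / det
    return x, y

# From (0.8, 0.6), at distance 1 from the origin, the nearest point on the
# circle of radius 1/2 lies along the same ray, at (0.4, 0.3).
x1, y1 = nearest_interface_point(0.8, 0.6)
```

Consistent with the text, only a handful of Newton steps are needed before the residual is negligible.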
To compute the speed at grid point (xi , yj ) off the interface, a bicubic inter-
polant is generated for the box containing the interface and grid point (xi , yj ).
The point on the interface nearest to (xi , yj ), labeled (x1 , y1 ) in the discussion
above, is determined by solving (30), (31). Given the interface speed F (x1 , y1 )
from (28), the speed function value is then F (xi , yj ) = F (x1 , y1 ). These points
are labeled as accepted points for purposes of the Fast Marching Method described
below. Once all the points near the interface have been given initial speed function
values, the remainder of the grid points have their speed function values computed
using the velocity extension method described next.
The fast marching method solves equations of the form
\[
G\,|\nabla\phi| = 1, \tag{32}
\]
where G is the speed of the interface. What makes the fast marching method fast
is the fact that (32) can be solved with one pass over the mesh. For the purposes
of the biofilm application, we only require the fast marching method to compute
the velocity extension, and hence (32) will be solved with G ≡ 1.
The key to solving (32) in one pass is to traverse the mesh in the proper order.
The grid points must be evaluated in the order of increasing φ. This is accomplished
by using a sorted heap that always keeps track of which grid point is to be evaluated
next. To begin, the set of grid points is divided into three disjoint sets, accepted
points A, tentative points T , and distant points D. The accepted points in A are
points (x_i, y_j) for which the computed value of φ_{i,j} is already determined. The
tentative points in T are the points (x_i, y_j) for which a tentative value for φ_{i,j} is
computed. The remainder of the points are in set D. One by one, points in T are
taken, in order of increasing value of φ_{i,j}, from set T into A. Each time, points
(x_i, y_j) in D that become adjacent to points in set A are moved into set T and
a tentative value for φ_{i,j} is computed using a finite-difference approximation for
(32). The algorithm terminates when all points have migrated into set A. See
Figure 7 for an illustration of sets A, T, and D.
The implementation of the fast marching method uses upwind finite differences,
where the direction of upwind differences is taken toward smaller values of
φ. For example, suppose points (x_{i−1}, y_j), (x_i, y_{j+1}) are in set A; then φ_{i−1,j},
φ_{i,j+1} are already determined, and we wish to compute a tentative value for φ_{i,j}.
Equation (32) is discretized using one-sided differences toward these accepted
neighbors, yielding a quadratic equation for the tentative value φ_{i,j}.
Figure 7. Illustration of the sets A, T , D, associated with the fast marching method. This
figure reprinted from [8].
1. Initialize all the points adjacent to the initial interface with an initial value
using the bicubic interpolation, and put those points in set A. All points
(x_i, y_j) ∉ A adjacent to a point in A are given initial estimates for φ_{i,j} by
solving (32). These points are tentative points and are put in the set T. All
remaining points unaccounted for are placed in D and given initial values
of φ_{i,j} = +∞.
2. Choose the point (x_i, y_j) ∈ T that has the smallest value of φ_{i,j} and move it
into A.
3. Any point that is adjacent to (x_i, y_j) (i.e., points (x_{i−1}, y_j), (x_i, y_{j−1}),
(x_{i+1}, y_j), (x_i, y_{j+1})) and that is in T has its value φ_{i,j} recalculated using
(32). Any point adjacent to (x_i, y_j) and in D has its value φ_{i,j} computed
using (32) and is moved into set T.
4. If T ≠ ∅, go to step 2.
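The heap-based procedure above can be sketched compactly as follows; the grid size and the single seed point are illustrative assumptions, not values from the chapter.

```python
# Compact sketch of the fast marching method for (32) with G = 1 on a uniform
# grid, using a heap to always accept the smallest tentative value. Grid size
# and seed are illustrative only.
import heapq, math

def fast_march(seeds, nx, ny, h):
    INF = float("inf")
    phi = {(i, j): INF for i in range(nx) for j in range(ny)}
    accepted = set()
    heap = []
    for (i, j), v in seeds.items():
        phi[(i, j)] = v
        heapq.heappush(heap, (v, i, j))

    def update(i, j):
        """Solve the one-sided discretization of |grad phi| = 1 at (i, j)."""
        a = min(phi.get((i - 1, j), INF), phi.get((i + 1, j), INF))  # upwind x
        b = min(phi.get((i, j - 1), INF), phi.get((i, j + 1), INF))  # upwind y
        if abs(a - b) >= h:  # only one direction has a usable neighbor
            return min(a, b) + h
        # Both directions contribute: solve ((p-a)/h)^2 + ((p-b)/h)^2 = 1.
        return (a + b + math.sqrt(2 * h * h - (a - b) ** 2)) / 2

    while heap:
        v, i, j = heapq.heappop(heap)
        if (i, j) in accepted or v > phi[(i, j)]:
            continue  # stale heap entry
        accepted.add((i, j))
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < nx and 0 <= nj < ny and (ni, nj) not in accepted:
                t = update(ni, nj)
                if t < phi[(ni, nj)]:
                    phi[(ni, nj)] = t
                    heapq.heappush(heap, (t, ni, nj))
    return phi

# Marching outward from a single accepted point approximates the distance to
# it: the axis neighbor gets h, the diagonal neighbor h (1 + 1/sqrt(2)).
phi = fast_march({(0, 0): 0.0}, 8, 8, 1.0)
```

Because the heap always yields the smallest remaining value, each point is accepted exactly once, which is what makes the single-pass property of the method possible.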
As observed in [1], if the level set speed function is F , then the velocity that
preserves the signed distance function solves
\[
\nabla F \cdot \nabla\phi = 0, \tag{35}
\]
where φ is the level set function. If we assume Fi,j is given for all points (xi , yj )
initially in set A, then the remainder of the values for Fi,j are computed using the
same upwind direction finite difference as the fast marching method. Thus, considering
again the example where (x_{i−1}, y_j) and (x_i, y_{j+1}) ∈ A, (35) is discretized
to become
\[
\frac{F_{i,j} - F_{i-1,j}}{\Delta x}\,\frac{\phi_{i,j} - \phi_{i-1,j}}{\Delta x} + \frac{F_{i,j+1} - F_{i,j}}{\Delta y}\,\frac{\phi_{i,j+1} - \phi_{i,j}}{\Delta y} = 0,
\]
which is solved for F_{i,j}.
1. Initialize the location of the biofilm with the surface at φ = 0, the interior
of the biofilm indicated by φ < 0, where φ is the signed distance function
to the interface. Also initialize all volume fractions, 0 ≤ Xi ≤ 1, inside
the biofilm.
2. Solve (5) for diffusing substances, Sj. Use the X-FEM with exponential
and step enrichment functions at the interface to ensure accuracy.
3. Solve velocity potential equation (3) again using the X-FEM with step
enrichment.
4. Update the volume fractions inside the biofilm using (20). Use the velocity
extension equation, (23), to extend the volume fraction values outside the
biofilm.
6. Advance the interface using the level set evolution equation, (26).
7. Go to step 2.
5. EXAMPLES
Figure 9. Substrate concentration contours for the given biofilm shown with bold lines.
Each contour line indicates half the concentration of the one above it as labeled.
Figure 10. Contours of the velocity potential, Φ, which also indicates the rate of substrate
consumption. Consumption is concentrated at the tops of the tallest colonies.
Figure 11. Graph of the volume fraction, X1 , of the bacteria in the biofilm. The composition
is very uniform.
Figure 12. Signal concentration contours in and around the biofilm just prior to quorum-
sensing.
6. CONCLUSION
In this chapter we showed how the level set method can be coupled with other
numerical methods to model bacterial biofilms. One of the key methods that was
coupled to the level set method is the eXtended Finite Element Method (X-FEM).
The combination of methods is a powerful tool extending the range of problems
to which the level set method can be applied.
The simulated biofilms generated by the level set method algorithm behave
qualitatively very similarly to real biofilms observed in experiments. Models such as
this will be used to explore a variety of important biofilm phenomena. Considering
that biofilms are so critically intertwined with the environment, industry, and
society, tools such as the one presented here may have applications across a wide
spectrum of natural and manmade processes.
7. ACKNOWLEDGMENTS
This research was supported by a grant from the National Institutes of Health
through contract R01-GM067248. Thanks to the many people who contributed
to this work, including Mary Jo Kirisits, Brian Moran, Matt Parsek, Bryan Smith,
and Benjamin Vaughan.
8. REFERENCES
1. Adalsteinsson D, Sethian JA. 1999. The fast construction of extension velocities in level set
methods. J Comput Phys 148(1):2–22.
2. Bakke R, Characklis WG, Turakhia MH, Yeh A. 1990. Modeling a monopopulation biofilm
system: Pseudomonas aeruginosa. In Biofilms. New York: John Wiley & Sons.
3. Belytschko T, Black T. 1999. Elastic crack growth in finite elements with minimal remeshing. Int
J Num Meth Eng 45:601–620.
4. Bramble J, King J. 1996. A finite element method for interface problems in domains with smooth
boundaries and interfaces. Adv Comput Math 6:109–138.
5. Chang I, Gilbert ES, Eliashberg N, Keasling JD. 2003. A three-dimensional, stochastic sim-
ulation of biofilm growth and transport-related factors that affect structure. Microbiol–SGM
149(10):2859–2871.
6. Characklis WG, Marshall KC. 1990. Biofilms. New York: John Wiley & Sons.
7. Chen Z, Zou J. 1998. Finite element methods and their convergence for elliptic and parabolic
interface problems. J Num Math 79:175–202.
8. Chopp DL. 2001. Some improvements of the fast marching method. SIAM J Sci Comp 23(1):230–
244.
9. Chopp DL, Kirisits MJ, Moran B, Parsek M. 2002. A mathematical model of quorum sensing in
a growing P. aeruginosa biofilm. J Ind Microbiol Biotechnol 29(6):339–346.
10. Chopp DL, Kirisits MJ, Parsek MR, Moran B. 2003. The dependence of quorum sensing on the
depth of a growing biofilm. Bull Math Biol. To appear.
11. Chopp DL, Sukumar N. 2003. Fatigue crack propagation of multiple coplanar cracks with the
coupled extended finite element/fast marching method. Int J Eng Sci 41:845–869.
12. Chopp DL, Sukumar N. 2003. Fatigue crack propagation of multiple coplanar cracks with the
coupled extended finite element/fast marching method. Int J Eng Sci 41(8):845–869.
13. Costerton JW, Lewandowski Z, Caldwell DE, Korber DR, Lappin-Scott HM. 1995. Microbial
biofilms. Annu Rev Microbiol 49:711–745.
14. Costerton JW, Stewart PS, Greenberg EP. 1999. Bacterial biofilms: A common cause of persistent
infections. Science 284:1318–1322.
15. Davies DG, Parsek MR, Pearson JP, Iglewski BH, Costerton JW, Greenberg EP. 1998. The in-
volvement of cell-to-cell signals in the development of a bacterial biofilm. Science 280:295–298.
16. De Kievit TR, Gillis R, Marx S, Brown C, Iglewski BH. 2001. Quorum-sensing genes in
Pseudomonas aeruginosa biofilms: their role and expression patterns. Appl Environ Microbiol
67:1865–1873.
17. Dockery J, Klapper I. 2001. Finger formation in biofilm layers. SIAM J Appl Math 62(3):853–869.
18. Dolbow JE, Moës N, Belytschko T. 2000. Discontinuous enrichment in finite elements with a
partition of unity method. Finite Elem Anal Des 36:235–260.
19. Dolbow JE, Moës N, Belytschko T. 2001. An extended finite element method for modeling crack
growth with frictional contact. Comput Meth Appl Mech Eng 190:6825–6846.
20. Eberl HJ, Parker DF, van Loosdrecht MCM. 2001. A new deterministic spatiotemporal continuum
model for biofilm development. J Theor Med 3:161–175.
21. Eberl HJ, Picioreanu C, Heijnen JJ, van Loosdrecht MCM. 2000. A three-dimensional numerical
study on the correlation of spatial structure, hydrodynamic conditions, and mass transfer and
conversion in biofilms. Chem Eng Sci 55:6209–6222.
22. Chessa J, et al. 2002. The extended finite element method (XFEM) for solidification problems. Int
J Num Meth Eng 53:1959–1977.
23. Fuqua C, Greenberg EP. 1995. Self perception in bacteria: quorum sensing with acylated
homoserine lactones. Curr Opin Microbiol 118(2):269–277.
24. Fuqua C, Parsek MR, Greenberg EP. 2001. Regulation of gene expression by cell-to-cell com-
munication: acyl-homoserine lactone quorum sensing. Ann Rev Genet 35:439–468.
25. Gaul L, Kögl M, Wagner M. 2003. Boundary element methods for engineers and scientists. New
York: Springer.
26. Gravouil A, Moës N, Belytschko T. 2002. Non-planar 3d crack growth by the extended finite
element and the level sets, II: level set update. Int J Num Meth Eng 53(11):2569–2586.
27. Hermanowicz SW. 1999. Two-dimensional simulations of biofilm development: effect of external
environmental conditions. Water Sci Technol 39(7): 107–114.
28. Hermanowicz SW. 2001. A simple 2D biofilm model yields a variety of morphological features.
Math Biosci 169:1–14.
29. Indekeu JO, Giuraniuc CV. 2004. Cellular automaton for bacterial towers. Phys A 336(1–2):14–26.
30. Ji H, Chopp D, Dolbow JE. 2002. A hybrid extended finite element/level set method for modeling
phase transformations. Int J Num Meth Eng 54:1209–1233.
31. Ji H, Chopp D, Dolbow JE. 2002. A hybrid extended finite element/level set method for modeling
phase transformations. Int J Num Meth Eng 54(8):1209–1233.
32. Kreft JU, Booth G, Wimpenny JWT. 1998. BacSim, a simulator for individual-based modeling
of bacterial colony growth. Microbiology 144:3275–3287.
33. Kreft JU, Picioreanu C, Wimpenny JWT, van Loosdrecht MCM. 2001. Individual-based modeling
of biofilms. Microbiology–SGM 147:2897–2912.
34. Laspidou CS, Rittmann BE. 2002. Non-steady state modeling of extracellular polymeric sub-
stances, soluble microbial products, and active and inert biomass. Water Res 36:1983–1992.
35. Laspidou CS, Rittmann BE. 2002. Non-steady state modeling of microbial products and active
and inert biomass. Water Res 36:1983–1992.
36. LeVeque R, Li Z. 1994. The immersed interface method for elliptic equations with discontinuous
coefficients and singular sources. SIAM J Num Anal 31:1019–1044.
37. Li Z. 2003. An overview of the immersed interface method and its applications. Taiwan J Math
7:1–49.
38. Lide DR, ed. 1990. CRC handbook of chemistry and physics. Boca Raton, FL: CRC Press.
39. Mobarry BK, Wagner M, Urbain V, Rittmann BE, Stahl DA. 1996. Phylogenetic probes for analyz-
ing abundance and spatial organization of nitrifying bacteria. Appl Environ Microb 62(6):2156–
2162.
40. Moës N, Dolbow J, Belytschko T. 1999. A finite element method for crack growth without
remeshing. Int J Num Meth Eng 46(1):131–150.
41. Moës N, Gravouil A, Belytschko T. 2002. Non-planar 3d crack growth by the extended finite
element and the level sets, I: mechanical model. Int J Num Meth Eng 53(11):2549–2568.
42. Osher S, Sethian JA. 1988. Fronts propagating with curvature dependent speed: algorithms based
on Hamilton-Jacobi formulations. J Comput Phys 79:12–49.
43. Parsek MR. Unpublished data.
44. Pesci EC, Iglewski BH. 1997. The chain of command in Pseudomonas quorum sensing. Trends
Microbiol 5(4):132–135.
45. Pesci EC, Pearson JP, Seed PC, Iglewski BH. 1997. Regulation of las and rhl quorum sensing in
Pseudomonas aeruginosa. J Bacteriol 179(10):3127–3132.
46. Peskin CS. 1977. Numerical analysis of blood flow in the heart. J Comput Phys 25:220–252.
47. Peskin CS. 1981. Lectures on mathematical aspects of physiology. Lectures Appl Math 19:69–107.
48. Picioreanu C, van Loosdrecht MCM, Heijnen JJ. 1998. Mathematical modeling of biofilm
structure with a hybrid differential-discrete cellular automaton approach. Biotechnol Bioeng
58(1):101–116.
49. Picioreanu C, van Loosdrecht MCM, Heijnen JJ. 1999. Discrete-differential modeling of biofilm
structure. Water Sci Technol 39(7):115–122.
50. Picioreanu C, van Loosdrecht MCM, Heijnen JJ. 2000. A theoretical study on the effect of surface
roughness on mass transport and transformation in biofilms. Biotechnol Bioeng 68(4):355–369.
51. Piper KR, Beck von Bodman S, Farrand SK. 1993. Conjugation factor of Agrobacterium tume-
faciens regulates Ti plasmid transfer by autoinduction. Nature 362:448–450.
52. Pizarro G, Griffeath D, Noguera DR. 2001. Quantitative cellular automaton model for biofilms.
J Environ Eng 127(9):782–789.
53. Rittmann BE. 2002. Personal communication.
54. Rittmann BE, McCarty P. 2001. Environmental Biotechnology. New York: McGraw Hill.
55. Rosenfeld M, Ramsey B. 1992. Evolution of airway microbiology in the infant with cystic fibrosis:
role of nonpseudomonal and pseudomonal pathogens. Semin Respir Infect 7:158–167.
56. Schaefer AL, Hanzelka BL, Parsek MR, Greenberg EP. 2000. Detection, purification and structural
elucidation of acylhomoserine lactone inducer of Vibrio fischeri luminescence and other related
molecules. Meth Enzymol 305:288–301.
57. Sethian JA. 1996. A marching level set method for monotonically advancing fronts. Proc Natl
Acad Sci USA 93(4):1591–1595.
58. Sethian JA. 1999. Fast marching methods. SIAM Rev 41(2):199–235.
59. Stewart PS. 2003. Diffusion in biofilms. J Bacteriol 185(5):1485–1491.
60. Stewart PS, Costerton JW. 2001. Antibiotic resistance of bacteria in biofilms. Lancet
358(9276):135–138.
61. Stolarska M, Chopp DL. 2003. Modeling spiral cracking due to thermal cycling in integrated
circuits. Int J Eng Sci 41(20):2381–2410.
62. Stolarska M, Chopp DL, Moës N, Belytschko T. 2001. Modelling crack growth by level sets in
the extended finite element method. Int J Num Meth Eng 51:943–960.
63. Stolarska M, Chopp DL, Moës N, Belytschko T. 2001. Modelling crack growth by level sets in
the extended finite element method. Int J Num Meth Eng 51(8):943–960.
64. Sukumar N, Chopp DL, Moës N, Belytschko T. 2001. Modeling holes and inclusions by level
sets in the extended finite element method. Comput Meth Appl Mech Eng 190(46–47):6183–6200.
65. Sukumar N, Chopp DL, Moran B. 2003. Extended finite element method and fast marching
method for three-dimensional fatigue crack propagation. Eng Fracture Mech 70:29–48.
66. Sukumar N, Chopp DL, Moran B. 2003. Extended finite element method and fast marching
method for three-dimensional fatigue crack propagation. Eng Fracture Mech 70(1):29–48.
67. Szomolay B, Klapper I, Dockery J, Stewart PS. 2005. Adaptive response to antimicrobial agents
in biofilms. Environ Microbiol 7(8):1186–1191.
68. van Loosdrecht MCM, Heijnen JJ, Eberl HJ, Kreft JU, Picioreanu C. 2002. Mathematical mod-
eling of biofilm structures. Antonie van Leeuwenhoek 81:245–256.
69. Vaughan BL, Smith BG, Chopp DL. 2005. A comparison of the extended finite element method
and the immersed interface method for elliptic equations with discontinuous coefficients and
singular sources. Preprint available at http://www.esam.northwestern.edu/chopp.
70. Wagner GJ, Moës, N, Liu WK, Belytschko T. 2001. The extended finite element method for rigid
particles in Stokes flow. Int J Num Meth Eng 51:293–313.
71. Wanner O, Gujer W. 1986. A multispecies biofilm model. Biotechnol Bioeng 28:314–328.
72. Williamson KJ, McCarty PL. 1976. Verification studies of the biofilm model for bacterial substrate
utilization. J Water Pol Control Fed 48:281–289.
73. Wimpenny JWT, Colasanti R. 1997. A unifying hypothesis for the structure of microbial biofilms
based on cellular automaton models. FEMS Microbiol Ecol 22(1):1–16.
74. Xavier JB, Picioreanu C, van Loosdrecht MCM. 2004. Assessment of three-dimensional biofilm
models through direct comparison with confocal microscopy imaging. Water Sci Technol 49(11–
12):177–185.
75. Zik O, Moses E. 1999. Fingering instability in combustion: an extended view. Phys. Rev. E
60(1):518–530.
2

DISTANCE TRANSFORM ALGORITHMS

George J. Grevera
Saint Joseph’s University, Philadelphia, Pennsylvania, USA
1. INTRODUCTION
George J. Grevera, BL 215, Mathematics and Computer Science, 5600 City Avenue, Saint
Joseph’s University, Philadelphia, PA 19131, USA. Phone: (610) 660-1535; Fax: (610) 660-3082.
[email protected].
We note that some points within objects are noteworthy in that they are positioned
on the border of the object with at least one of the outside (background) points.
Adopting terminology from digital topology [1], we call these points elements of
the set of points that form the immediate interior (II) of some object. Similarly,
some background points are notable in that they are positioned on the border or
interface of some object as well. Once again adopting terminology from digital
topology, we call these background points elements of the set of points that form
the immediate exterior (IE). Together, the union of the sets II and IE forms a set
of points called border points (or boundary elements), B. We may now define a
distance transform as an algorithm that, given I, produces a transformed image, I′,
by assigning to each point in I the minimum distance from that point to all border
points.
A number of issues arise when dealing with distance transform algorithms.
The first and probably the most important issue is one of accuracy. Does the
distance transform produce results with minimal errors (or is the distance transform
error free)? If that is the case, it is important to develop a methodology to verify
this claim (and we will do so in this chapter). Even for those algorithms that
are theoretically proven to be error free, from a software engineering standpoint
it is important to be able to validate the implementation of the algorithm. A
second important issue concerns the computation time required by the method.
It is relatively straightforward to develop an exhaustive method that requires a
great deal of processing time. It is important to evaluate processing time as well.
And yet another issue concerns the dimensionality of I. Since medical
imagery is typically three- or four-dimensional, it is important in the medical arena for a
distance transform algorithm to generalize beyond two dimensions. Another
issue that arises when dealing with medical images is that of anisotropic sampling.
Medical images are typically acquired as three-dimensional volumes of data (stacks
of slices) with the sampling within each plane or slice at a higher rate than the
sampling across slices. This yields data with finer spacing between neighboring
pixels (or, more generally, voxels) within a slice (e.g., 0.5 mm) than between
neighboring voxels in adjacent slices (e.g., 1.0 mm).
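As a concrete illustration, any distance computed on such anisotropic data must scale each index difference by the physical spacing along that axis before applying the Euclidean formula. The sketch below is ours; the function name is illustrative, and the default spacing values are the example figures quoted above.

```python
import math

def physical_distance(p, q, spacing=(0.5, 0.5, 1.0)):
    """Euclidean distance in mm between voxel indices p and q, given the
    (x, y, z) voxel spacing in mm: finer within a slice (0.5 mm) than
    across slices (1.0 mm), as in the example above."""
    return math.sqrt(sum(((a - b) * s) ** 2
                         for a, b, s in zip(p, q, spacing)))
```

A transform that ignores the spacing would report both single-voxel steps below as distance 1; scaling by the spacing distinguishes the in-plane step (0.5 mm) from the between-slice step (1.0 mm).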
Distance transform algorithms are, like most other computationally intensive
algorithms, of interest in and by themselves and have been the subject of at least
one PhD dissertation [2]. Many distance transform algorithms have been proposed,
with [3] and [4] most likely being the earliest. In general, distance transform algo-
rithms exhibit varying degrees of accuracy of the result, computational complexity,
hardware requirements (such as parallel processors), and conceptual complexity of
the algorithms themselves. In [5], the author proposed an algorithm that produces
extremely accurate results for 2D images by sweeping through the data a number
of times, propagating vectors that approximate the distance via a local
mask in a manner similar to convolution. In [6] the author presented the Chamfer
distance algorithm (CDA), which propagates scalar, integer values to efficiently
and accurately calculate the distance transform of 2D and 3D images (again in a
manner similar to convolution). Borgefors [6] also presented an error analysis for
the CDA for various neighborhood sizes and integer values. More recently in [7]
an analysis of 3D distance transforms employing 3x3x3 neighborhoods of local
distances was presented. In [8] an analysis of the 2D Chamfer distance algorithm
using 3x3, 5x5, and larger neighborhoods employing both integer and real values
was presented. Marchand-Maillet and Sharaiha [9] also present an analysis of
Chamfer distance using topological order, rather than the approximation to the
Euclidean distance, as the evaluation criterion. Because of the conceptual elegance
of the CDA and because of its widespread popularity, we feel that the CDA family
of algorithms is important and worthy of further study.
Of course, distance transforms outside of the Chamfer family also have been
presented. A technique from Artificial Intelligence, namely A∗ heuristic search
[10], has been used as the basis for a distance transform algorithm [11]. A multiple-
pass algorithm using windows of various configurations (along the lines of [5] and
other raster scanning algorithms such as the CDA) was presented in [12] and
[13]. A method of distance assignment called ordered propagation was presented
in [14]. The basis of that algorithm and others such as A∗ (used in [11]) is to
propagate distance between pixels, which can be represented as nodes in a graph.
These algorithms typically employ sorted lists to order the propagation among the
graph nodes. Guan and Ma [15] and Eggers [16] employ lists as well. In [17] the
authors present four algorithms to perform the exact, Euclidean, n-dimensional
distance transform via the serial composition of n-dimensional filters. Algorithms
for the efficient computation of distance transforms using parallel architectures are
presented in [18] and [19]. In [19] the authors present an algorithm that consists
of two phases, with each phase consisting of both a forward scan and a backward
scan. In the first phase columns are scanned; in the second phase rows are scanned.
They note that since the scanning of a particular column (or row) is independent of
the scanning of the other columns (or rows), each column (row) may be scanned
independently (i.e., in parallel). A distance transform employing a graph search
algorithm is also presented in [20].
Since the early formulation of distance transform algorithms [3, 4], applica-
tions employing distance transforms have also become widespread. For example,
distance transforms have been used for skeletonization of images [21, 22, 23, 24].
Distance transforms are also useful for the (shape-based) interpolation of both
binary images [25, 26] as well as gray image data [27]. In [28] the authors em-
ploy distance transform information in multidimensional image registration. An
efficient ray tracing algorithm also employs distance transform information [29].
Distance transforms have also been shown to be useful in calculating the medial
axis transform, with [30, 31] employing the Chamfer distance algorithm specifi-
cally. In addition to the usefulness of distance transforms for the interpolation of
3D gray medical image data [32, 33], they have also been used for the automatic
classification of plant cells [34] and for measuring cell walls [35]. The Chamfer
distance was also employed in a method to characterize spinal cord atrophy [36].
where xSize is the number of columns in I and I′, and ySize is the number of rows.
We note that some distance transform algorithms including [5] restrict the definition
of border points to elements of II only. Our framework easily accommodates this
via a simple change to the algorithm above, as illustrated below. We point out,
however, that this definition will not preserve the property of symmetry under
complement [37]. Consider the complement C of the binary image I such that
C(p) = 1 if I(p) = 0 and C(p) = 0 otherwise. A distance transform that preserves
symmetry under complement produces the same result given either C or I (i.e.,
C′(p) = I′(p) for all p), although the sign may be opposite by convention; in that
case, |C′(p)| = |I′(p)|.
for (y=1; y<ySize-1; y++)
    for (x=1; x<xSize-1; x++)
        if (I(x,y)==1) //restrict border points to II only
            if ( I(x-1,y) != I(x,y) or I(x+1,y) != I(x,y) or
                 I(x,y-1) != I(x,y) or I(x,y+1) != I(x,y) )
                then (x,y) is a 4-adjacent border element.
            if ( I(x-1,y-1) != I(x,y) or I(x+1,y-1) != I(x,y) or
                 I(x-1,y+1) != I(x,y) or I(x+1,y+1) != I(x,y) )
                then (x,y) is a remaining 8-adjacent border element.
Note that, without loss of generality, we assume that no object extends to the edge
of the discrete matrix in which it is represented. Otherwise, the description of the
algorithms would be unnecessarily complicated by additional boundary condition
checks. If it is the case that an object extends to the edge of the matrix, one may
simply embed that matrix and the objects that are represented within it in a larger
matrix with an additional layer of surrounding background elements.
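Combining the border-detection pseudocode above with the padding convention just described, a runnable transcription might look as follows. This is our own rendering: the function name and the list-of-rows image representation are illustrative, and we restrict border points to elements of II.

```python
def find_border_points(I):
    """Return the set of object points (x, y) of binary image I (a list of
    rows of 0/1 values) that have at least one background 4-neighbor,
    i.e., the 4-adjacent border elements of the immediate interior (II).
    I is assumed padded so that no object touches the matrix edge."""
    ySize, xSize = len(I), len(I[0])
    border = set()
    for y in range(1, ySize - 1):
        for x in range(1, xSize - 1):
            if I[y][x] == 1:  # restrict border points to II only
                if (I[y][x - 1] != 1 or I[y][x + 1] != 1 or
                        I[y - 1][x] != 1 or I[y + 1][x] != 1):
                    border.add((x, y))
    return border
```

For a 3x3 block of object points, this marks the eight perimeter points and excludes the center, whose four axial neighbors all belong to the object.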
if (I’(x1,y1)==0) {
//s=(x’,y’)
//calculate the distance to this border element
d = sqrt( (x-x1)*(x-x1) + (y-y1)*(y-y1) );
//is it better than what’s already been assigned?
if (d < I’(x,y)) {
//yes, then update the distance from this
//point, t, to the border element, s
I’(x,y) = d;
}
} //end if
} //end for x1
} //end for y1
} //end if
} //end for x
} //end for y
y1 = list[i]->y;
//calculate the distance to this border element
d = sqrt( (x-x1)*(x-x1) + (y-y1)*(y-y1));
//is it better than what’s already been assigned?
if (d < I’(x,y)) {
//yes, then change to this border element
I’(x,y) = d;
}
}
} //end if
} //end for x
} //end for y
This method is also important for testing other methods as it will form the basis
of our testing procedure.
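The exhaustive list-based approach can be sketched in self-contained form as follows. This is our own Python rendering of the idea, not the chapter's code: gather the distance-zero border elements into a list once, then assign every point the minimum Euclidean distance to any element of that list.

```python
import math

def simple_list_dt(I):
    """Brute-force distance transform of binary image I (a list of rows of
    0/1 values). Border points (here: object points with a background
    4-neighbor) get distance 0; every other point gets the minimum
    Euclidean distance to some border point."""
    ySize, xSize = len(I), len(I[0])
    # build the list of border elements once
    border = []
    for y in range(ySize):
        for x in range(xSize):
            if I[y][x] == 1 and any(
                    0 <= y + dy < ySize and 0 <= x + dx < xSize
                    and I[y + dy][x + dx] == 0
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))):
                border.append((x, y))
    # exhaustively assign each point its minimum distance to the list
    out = [[math.inf] * xSize for _ in range(ySize)]
    for y in range(ySize):
        for x in range(xSize):
            for (x1, y1) in border:
                d = math.hypot(x - x1, y - y1)
                if d < out[y][x]:
                    out[y][x] = d
    return out
```

Because every point is checked against every border element, the result is exact by construction, which is what makes this sketch usable as a gold standard despite its cost.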
where check compares, and updates if necessary, the current distance I′(u), u =
(x, y), against the value at the specified neighbor plus the offset from neighbor to u,
I′(n) + a. Pseudo code for the 8SED algorithm follows.
//perform the first pass ("first picture scan")
for (y=1; y<=ySize-1; y++) {
for (x=0; x<=xSize-1; x++) {
if (x>0) { //** boundary condition not checked in original
// but needed
check( x-1, y-1, dxy );
}
check( x, y-1, dy );
if (x<xSize-1) { //** not checked in original but needed
check( x+1, y-1, dxy );
}
}
for (x=1; x<=xSize-1; x++) check( x-1, y, dx );
for (x=xSize-2; x>=0; x--) check( x+1, y, dx );
}
Pseudo code for Grevera’s improved 8SED algorithm follows. Note: * indi-
cates a difference from the original 8SED algorithm.
[38] from digital signal processing using 3x3 windows for CDA 3x3, chessboard,
cityblock, and Euclidean, 5x5 windows for CDA 5x5, and 7x7 windows for CDA
7x7. The current distance assignment to each point under consideration, u, is
compared to the current assignments to its neighbors plus the distance, a_n, for that
specific neighbor n taken from Figure 1, from the specific neighbor, n, to u. If the
current distance assignment, I′(u), is greater than I′(n) + a_n, then I′(u) is updated
to I′(n) + a_n, which results in minimizing the distance to u. Resulting distance
transform errors diminish with increasing window size while computation cost
increases with increasing window size. Note that different window configurations
(but of the same size) are employed for the forward and backward passes.
Pseudo code for CDA 3x3, cityblock, chessboard, and Euclidean 3x3 appears
below. dx, dy, and dxy are assigned a_n values according to the table entries
in Figure 1 corresponding to the desired method. CDA 5x5 and CDA 7x7 are
analogous.
where check compares, and updates if necessary, the current distance I′(u), u =
(x, y), against the value at the specified neighbor plus the offset from neighbor to u, I′(n) + a_n.
Borgefors cleverly demonstrated that: (i) using a small window and propagating
distance in this manner introduces errors in the assigned distance values even if
double-precision floating point is used to represent distance values; (ii) these errors
may be minimized by using values other than 1 and √2 for the distances between
neighboring pixels; and, surprisingly, (iii) using integer window values such as 3
and 4 yields more accurate results than using window values of 1 and √2, and does
not require floating-point arithmetic.
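To make point (iii) concrete, the following is a sketch (ours, not the chapter's code) of a two-pass 3x3 Chamfer transform using Borgefors's integer local distances 3 (axial) and 4 (diagonal). For simplicity it seeds all object points at 0 and propagates into the background; dividing the integer result by 3 recovers an approximation to the Euclidean distance.

```python
INF = 10 ** 9

def chamfer_3_4(I):
    """Two-pass 3x3 Chamfer distance transform of binary image I (a list
    of rows of 0/1 values) with integer local distances 3 (axial) and
    4 (diagonal). Object points are seeded with 0."""
    ySize, xSize = len(I), len(I[0])
    D = [[0 if I[y][x] == 1 else INF for x in range(xSize)]
         for y in range(ySize)]

    def relax(y, x, ny, nx, w):
        # propagate distance from neighbor (ny, nx) with local distance w
        if 0 <= ny < ySize and 0 <= nx < xSize and D[ny][nx] + w < D[y][x]:
            D[y][x] = D[ny][nx] + w

    # forward pass: consider neighbors above and to the left
    for y in range(ySize):
        for x in range(xSize):
            relax(y, x, y, x - 1, 3)
            relax(y, x, y - 1, x, 3)
            relax(y, x, y - 1, x - 1, 4)
            relax(y, x, y - 1, x + 1, 4)
    # backward pass: consider neighbors below and to the right
    for y in range(ySize - 1, -1, -1):
        for x in range(xSize - 1, -1, -1):
            relax(y, x, y, x + 1, 3)
            relax(y, x, y + 1, x, 3)
            relax(y, x, y + 1, x + 1, 4)
            relax(y, x, y + 1, x - 1, 4)
    return D
```

Along purely axial or diagonal directions the scaled result is exact (3n/3 = n and 4n/3 ≈ n√2); the small characteristic Chamfer errors appear only for intermediate directions.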
[Figure 1. Various windows used by the Chamfer distance algorithm (forward- and
backward-pass windows are shown for each method). ‘u’ indicates the center of the
window; ‘-’ indicates that the point is not used (considered) during that pass of the
algorithm. Local distances: cityblock, 1 (axial); chessboard, 1 (axial and diagonal);
Euclidean 3x3, 1 (axial) and √2 (diagonal); CDA 5x5, 5, 7, and 11; CDA 7x7,
12, 17, 27, 38, and 43.]
and −1 <= dy <= 1 and |dx + dy| <= 2}. E consists of all edges from each point
p to each of its neighbors, with the associated cost (distance) defined as either 1 or
√2 depending on the Euclidean distance from p to the particular neighbor. As in
previous algorithms, this method begins by initially assigning a distance value of 0
to all p in B and a value of infinity to all p not in B. Those points p for which
I′(p) = 0 are also initially placed on an ordered list L that is sorted from smallest
to largest according to the distance assignment, I′(p). The algorithm then proceeds
as follows:
    else if |px(p)| = |py(p)| then
        check( p, sgn(px(p)), sgn(py(p)) );  //diagonal
        check( p, sgn(px(p)), 0 );           //horizontal
        check( p, 0, sgn(py(p)) );           //vertical
    else if |px(p)| > |py(p)| then
        check( p, sgn(px(p)), sgn(py(p)) );  //diagonal
        check( p, sgn(px(p)), 0 );           //horizontal
    else
        check( p, sgn(px(p)), sgn(py(p)) );  //diagonal
        check( p, 0, sgn(py(p)) );           //vertical
    L1 = L2;
check( p, dx, dy )
    let n = (px+dx, py+dy);
    let d = sqrt( dx*dx + dy*dy );
    if (I'(n) > I'(p)+d) {
        I'(n) = I'(p)+d;
        put n in L2;
    }
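A runnable sketch of this list-driven propagation follows (ours; for simplicity it seeds object points at 0 rather than border points). It illustrates the key property: distances accumulate along grid paths in steps of 1 and √2, so they only approximate the true Euclidean distance for off-axis points.

```python
import math

def ordered_propagation_dt(I):
    """List-based distance propagation in the spirit of the check()
    pseudocode above. Seed points get distance 0; 8-neighbors are
    repeatedly relaxed through alternating lists L1/L2 until no
    assignment improves. Returns a dict mapping (x, y) to distance."""
    ySize, xSize = len(I), len(I[0])
    D = {}
    L1 = []
    for y in range(ySize):
        for x in range(xSize):
            D[(x, y)] = 0.0 if I[y][x] == 1 else math.inf
            if I[y][x] == 1:
                L1.append((x, y))
    while L1:
        L2 = []
        for (px, py) in L1:
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue
                    n = (px + dx, py + dy)
                    if 0 <= n[0] < xSize and 0 <= n[1] < ySize:
                        d = math.hypot(dx, dy)  # 1 or sqrt(2)
                        if D[n] > D[(px, py)] + d:
                            D[n] = D[(px, py)] + d
                            L2.append(n)
        L1 = L2
    return D
```

For a single seed, a point two steps to the right receives exactly 2, while a point at offset (2, 1) receives 1 + √2 ≈ 2.414 (the shortest grid path) rather than the true √5 ≈ 2.236, which is precisely the kind of error the exact variants are designed to remove.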
Ragnemalm also presents an error-free version of the CSED algorithm. Unfortunately,
our implementation of that algorithm, which we believe is faithful to the
description in the paper, allows a few points to remain initialized at infinity in our
tests. This severely skews the results. Therefore, the software that accompanies
this article includes our implementation of the error-free version, but the results of
executing that implementation will not be included in this paper.
Figure 2. Sample test images consisting of (a) a single, solitary point-object, (b) a con-
figuration of three single point-objects that is a known problematic configuration, (c) and
(d) randomly generated test images by sampling from a normal distribution (with different
standard deviations).
To evaluate these methods, we need to (i) choose a suite of test cases (input binary images), (ii) develop a “gold
standard” or “ground truth” for each of the test cases, (iii) choose a set of metrics
to compare the result of a distance transform method with ground truth, and then
(iv) compare the results of a method under test with the gold standard using the
metrics.
The simplest test case consists of an image that contains a solitary object con-
sisting of a single point at the center of the image as shown in Figure 2a. Another
test case has been described [2] as being extremely problematic for algorithms
that sweep through the data using local windows (such as 4SED, 8SED, CDA,
DRA, and others). It consists of the three single point-objects as shown in Figure
2b. Although input images consisting of a few solitary point-objects are useful for
understanding algorithms, they are not reflective of real-world objects. To simu-
late real-world objects, we also include images consisting of randomly generated
objects by sampling from a normal distribution, as shown in Figures 2c and 2d.
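A test image of this kind can be generated as sketched below. This is our own construction; the point count, the seed, and the choice to scale the stated standard deviation by the image size are illustrative assumptions, not the chapter's exact procedure.

```python
import random

def random_test_image(size=1000, sigma=0.20, n_points=500, seed=42):
    """Generate a binary test image by sampling point-object locations
    from a normal distribution centered on the image center (in the
    spirit of Figures 2c and 2d). sigma is the standard deviation as a
    fraction of the image size."""
    rng = random.Random(seed)
    I = [[0] * size for _ in range(size)]
    for _ in range(n_points):
        x = int(rng.gauss(size / 2, sigma * size))
        y = int(rng.gauss(size / 2, sigma * size))
        if 0 <= x < size and 0 <= y < size:  # discard out-of-bounds draws
            I[y][x] = 1
    return I
```

Fixing the seed makes the test suite reproducible, so repeated timing and accuracy runs see identical inputs.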
With regard to a gold standard, we chose the SimpleList algorithm because it is
straightforward, easy to verify, and is exhaustive in its determination of the correct
distance assignments. The Simple algorithm could be used instead of SimpleList
but in practice, Simple is too slow to be useful. (For example, for a rather small
image of 300x300 consisting of a single center point object, SimpleList required
0.03 s of CPU time on a 2-GHz Pentium 4 under Linux. Simple required 71.98 s.
The remaining methods required less than 1 s.)
The result of the distance transform, I′, may be regarded as a grey image
where the grey value at each location is the distance value assigned by the particular
algorithm. Given I′, the result of some distance transform algorithm, and
I′_SimpleList (the result of applying SimpleList to I), we can compute the magnitude
of the differences between I′ and I′_SimpleList and determine the RMS (root mean
squared) error as well as the location of the (magnitude of the) single largest difference
between I′ and I′_SimpleList. Additionally, we also calculate the number
of pixels that exhibit any difference whatsoever (regardless of the magnitude of
the difference) and express this as a percentage of the whole. More qualitative
insights can also be gained by viewing difference images (|I′ − I′_SimpleList|) or by
simply thresholding I′ to create a binary image and viewing the result as shown
in Figure 7 as applied to the input binary image consisting of a single center point.
The expected thresholded result should appear as a circular isocontour with radius
equal to the distance from the center point. Which isocontour is observed depends
upon the selected threshold value. Note that the thresholded results of CDA 3x3,
CDA 5x5, Chessboard, Cityblock, Euclidean 3x3, and MD exhibit significant vis-
ible errors in the form of polygonal approximations to the circular isocontour. The
more accurate of these methods exhibit polygons with more sides (while the least
accurate have fewer). Chessboard has only four sides, while the thresholded results of
CDA 3x3, Cityblock, Euclidean 3x3, and MD have eight sides. Careful examination
of the thresholded results of CDA 5x5 and CDA 7x7 yields 16- and 20-sided
polygons for the selected threshold, respectively. The remaining, most accurate
methods do not have any noticeable artifacts. In addition to accuracy, it is also
important to report the CPU time required to perform the distance transform as
well.
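The three quantitative metrics just described — RMS error, the single largest absolute difference, and the percentage of differing pixels — can be sketched as one small helper (ours; the function name is illustrative):

```python
import math

def compare_to_gold(D, G, tol=0.0):
    """Compare distance image D against gold standard G (same-size lists
    of rows). Returns (rms_error, max_abs_diff, percent_differing),
    where a pixel counts as differing if |D - G| exceeds tol."""
    n = 0
    sq = 0.0
    max_diff = 0.0
    differing = 0
    for row_d, row_g in zip(D, G):
        for d, g in zip(row_d, row_g):
            diff = abs(d - g)
            sq += diff * diff
            n += 1
            if diff > max_diff:
                max_diff = diff
            if diff > tol:
                differing += 1
    return math.sqrt(sq / n), max_diff, 100.0 * differing / n
```

Reporting the percentage of differing pixels alongside the RMS error separates "a few large errors" from "many tiny errors," which the RMS value alone cannot do.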
4. RESULTS OF EVALUATION
The times reported are user mode CPU time plus kernel mode CPU time. We
feel that this is a better measure than simple elapsed time, especially on modern,
multiprogrammed operating systems. No other users were logged onto the system
during the tests. Four input test images were employed to evaluate the various
distance transform algorithms: (1) a solitary object consisting of a single solitary
point at the center of the image (Figure 2a), (2) the extremely problematic image
consisting of 3 point-objects (Figure 2b), (3) a randomly generated set of objects
created by sampling from a normal distribution with a mean at the center of the
image and a standard deviation of 0.20 (Figure 2c), and (4) another randomly
generated set of objects created by sampling from a normal distribution with a
different standard deviation of 0.05 (Figure 2d). Each of the input test images was
1000x1000 pixels in size. As previously mentioned, the SimpleList algorithm was
used as the gold standard. RMS error as well as the magnitude of the single largest
difference are reported. The results of the evaluation are shown in Table
1 for the central single-point object and the three single-point objects, and in Table
2 for the two sets of randomly generated objects.
5. CONCLUDING REMARKS
Although the results in Table 1 appear promising for the gold standard method,
SimpleList, with regard to CPU time, Table 2 demonstrates that SimpleList is not
practical for most applications because its time is two to three orders of magnitude
worse than other methods. The best-performing methods with regard to CPU time
took as little as 0.1 seconds. Of these fastest methods, DRA 3x3 exhibited minimal
error for the randomly generated images.
With regard to accuracy, DV and SimpleList were the only methods that ex-
hibited 0 errors. The performance of SimpleList precludes it from being used in
practice but the performance of DV is quite good for practical use. For applications
that can tolerate small errors, the modified 8SED algorithm had a very low error
rate and excellent performance.
6. ACKNOWLEDGMENTS
7. APPENDIX
Chamfer2D 3x3
Chamfer2D 5x5
Chamfer2D 7x7
Chessboard2D
Cityblock2D
CSED
DeadReckoning 3x3
DeadReckoning 7x7
DijkstraVectors
EightSED
EightSED modified
errorfreeCSED
Euclidean2D
FourSED
ModifiedDijkstra
Simple
SimpleList
CLUT — CLUT (Color LookUp Table) class for writing some color TIFF
image files
Timer — Timer class for reporting elapsed time and CPU time
Normal — Normal class which samples random numbers from a normal
distribution using the Box-Muller transform
TIFFWriter — This class contains methods that write 8-bit color rgb
images or float, double, 8-bit, or 16-bit grey images
DistanceTransform — an abstract base class from which all distance
transform classes inherit
8. REFERENCES
1. Udupa JK. 1994. Multidimensional digital boundaries. Comput Vision Graphics Image Process:
Graphical Models Image Process 56(4):311–323.
2. Cuisenaire O. 1999. Distance transformations: fast algorithms and applications to medical image
processing. PhD thesis. Université Catholique de Louvain.
3. Rosenfeld A, Pfaltz JL. 1968. Distance functions on digital pictures. Pattern Recognit 1(1):33–61.
4. Montanari U. 1968. A method for obtaining skeletons using a quasi-Euclidean distance. J Assoc
Comput Machin 15:600–624.
5. Danielsson P-E. 1980. Euclidean distance mapping. Comput Graphics Image Process 14:227–248.
6. Borgefors G. 1986. Distance transformations in digital images. Comput Vision Graphics Image
Process 34:344–371.
7. Borgefors G. 1996. On digital distance transforms in three dimensions. Comput Vision Image
Understand 64(3):368–376.
8. Butt MA, Maragos P. 1998. Optimum design of chamfer distance transforms. IEEE Trans Image
Process 7(10):1477–1484.
9. Marchand-Maillet S, Sharaiha YM. 1999. Euclidean ordering via Chamfer distance calculations.
Comput Vision Image Understand 73(3):404–413.
10. Nilsson NJ. 1998. Artificial intelligence: a new synthesis. San Francisco: Morgan Kaufmann.
11. Verwer BJH, Verbeek PW, Dekker ST. 1989. An efficient uniform cost algorithm applied to distance
transforms. IEEE Trans Pattern Anal Machine Intell 11(4):425–429.
12. Leymarie F, Levine MD. 1992. Fast raster scan distance propagation on the discrete rectangular
lattice. Comput Vision Graphics Image Process: Image Understand 55(1):84–94.
13. Satherley R, Jones MW. 2001. Vector-city vector distance transform. Comput Vision Image Un-
derstand 82:238–254.
14. Ragnemalm I. 1992. Neighborhoods for distance transformations using ordered propagation. Com-
put Vision Graphics Image Process: Image Understand 56(3):399–409.
15. Guan W, Ma S. 1998. A list-processing approach to compute Voronoi diagrams and the Euclidean
distance transform. IEEE Trans Pattern Anal Machine Intell 20(7):757–761.
16. Eggers H. 1998. Two fast Euclidean distance transformations in Z2 based on sufficient propagation.
Comput Vision Image Understand 69(1):106–116.
17. Saito T, Toriwaki J-I. 1994. New algorithms for Euclidean distance transformation of an n-
dimensional digitized picture with application. Pattern Recognit 27(11):1551–1565.
18. Boxer L, Miller R. 2000. Efficient computation of the Euclidean distance transform. Comput Vision
Image Understand 80:379–383.
19. Meijster A, Roerdink JBTM, Hesselink WH. 2000. A general algorithm for computing distance
transforms in linear time. In Mathematical morphology and its applications to image and signal
processing, pp. 331–340. Ed. J Goutsias, L Vincent, DS Bloomberg. New York: Kluwer.
20. Lotufo RA, Falcao AA, Zampirolli FA. 2000. Fast Euclidean distance transform using a graph-
search algorithm. SIBGRAPI 2000:269–275.
21. da Fontoura Costa L. 2000. Robust skeletonization through exact Euclidean distance transform
and its application to neuromorphometry. Real-Time Imaging 6:415–431.
22. Pudney C. 1998. Distance-ordered homotopic thinning: a skeletonization algorithm for 3D digital
images. Comput Vision Image Understand 72(3):404–413.
23. Sanniti di Baja G. 1994. Well-shaped, stable, and reversible skeletons from the (3,4)-distance
transform. J Visual Commun Image Represent 5(1):107–115.
24. Svensson S, Borgefors G. 1999. On reversible skeletonization using anchor-points from distance
transforms. J Visual Commun Image Represent 10:379–397.
25. Herman GT, Zheng J, Bucholtz CA. 1992. Shape-based interpolation, IEEE Comput Graphics
Appl 12(3):69–79.
26. Raya SP, Udupa JK. 1990. Shape-based interpolation of multidimensional objects. IEEE Trans
Med Imaging 9(1):32–42.
27. Grevera GJ, Udupa JK. 1996. Shape-based interpolation of multidimensional grey-level images.
IEEE Trans Med Imaging 15(6):881–892.
28. Kozinska D. 1997. Multidimensional alignment using the Euclidean distance transform. Graphical
Models Image Process 59(6):373–387.
29. Paglieroni DW. 1997. Directional distance transforms and height field preprocessing for efficient
ray tracing. Graphical Models Image Process 59(4):253–264.
30. Remy E, Thiel E. 2000. Computing 3D medial axis for Chamfer distances. Discrete Geom Comput
Imagery pp. 418–430.
31. Remy E, Thiel E. 2002. Medial axis for chamfer distances: computing look-up tables and neigh-
bourhoods in 2D or 3D. Pattern Recognit Lett 23(6):649–662.
32. Grevera GJ, Udupa JK. 1998. An objective comparison of 3D image interpolation methods. IEEE
Trans Med Imaging 17(4):642–652.
33. Grevera GJ, Udupa JK, Miki Y. 1999. A task-specific evaluation of three-dimensional image inter-
polation techniques. IEEE Trans Med Imaging 18(2):137–143.
34. Travis AJ, Hirst DJ, Chesson A. 1996. Automatic classification of plant cells according to tissue
type using anatomical features obtained by the distance transform. Ann Botany 78:325–331.
35. Van Der Heijden GWAM, Van De Vooren JG, Van De Wiel CCM. 1995. Measuring cell wall
dimensions using the distance transform. Ann Botany 75:545–552.
36. Schnabel JA, Wang L, Arridge SR. 1996. Shape description of spinal cord atrophy in patients with
MS. Comput Assist Radiol ICS 1124:286–291.
37. Grevera GJ. 2004. The “dead reckoning” signed distance transform. Comput Vision Image Under-
stand 95:317–333.
38. Oppenheim AV, Schafer RW, Buck JR. 1999. Discrete-time signal processing, 2d ed. Englewood
Cliffs: Prentice Hall.
39. Cormen TH, Leiserson CE, Rivest RL, Stein C. 2001. Introduction to algorithms, 2d ed. Cambridge:
MIT Press.
40. Svensson S, Borgefors G. 2002. Digital distance transforms in 3D images using information from
neighbourhoods up to 5x5x5. Comput Vision Image Understand 88:24–53.
3

LEVEL SET TECHNIQUES FOR STRUCTURAL INVERSION

Oliver Dorn
Universidad Carlos III de Madrid, España

Dominique Lesselier
Laboratoire des Signaux et Systèmes, Gif-sur-Yvette, France
Most biological bodies are structured in the sense that they contain quite well-defined in-
terfaces between regions of different types of tissue or anatomical material. Extracting
structural information from medical or biological images has been an important research
topic for a long time. Recently, much attention has been devoted to quite novel techniques
for the direct recovery of structural information from physically measured data. These tech-
niques differ from more traditional image processing and image segmentation techniques
by the fact that they try to recover structured images not from already given pixel or voxel-
based reconstructions (obtained, e.g., using traditional medical inversion techniques), but
directly from the given raw data. This has the advantage that the final result is guaranteed
to satisfy the imposed criteria of data fitness as well as those of the given structural prior
information. The level set technique [1–3] plays an important role in many of these novel
structural inversion approaches, due to its capability of modeling topological changes during
the typically iterative inversion process. In this text we provide a brief introduction to
some techniques that have been developed recently for solving structural inverse problems
using a level set technique.
Address all correspondence to: Oliver Dorn, Departamento de Matemáticas, Universidad Carlos III de
Madrid, Avenida de la Universidad, 30, 28911 Leganés, Madrid, España. Phone: (+34)91-6248825.
Fax: (+34)91-6249129. [email protected]. http://www.athena.uc3m.es/˜dorn/.
1. INTRODUCTION
Level set techniques for solving inverse problems were first proposed by Santosa
[4]. Further examples of early contributions include (without claim of completeness)
[5–11]. By now, many more results in a variety of applications
can be found in the literature. We refer to the recent overview articles [12–14],
each of them illuminating the recent progress in this exciting research topic with
a different viewpoint. An overview of level set techniques in medical imaging
can be found in [15]. In the present text, we intend to give a general introduction
to level set techniques in medical imaging in the above-described sense of
direct reconstruction of structured images from raw data. We will follow two typ-
ical examples for this purpose, namely X-ray computerized tomography (CT) and
diffuse optical tomography (DOT). Both are representative of broader classes of
inverse problems, the first representing linear tomography problems and the
second nonlinear ones. The theory for both can be developed largely in parallel,
with some characteristic differences. We will point out these analogies as well as
the differences.
This chapter is organized as follows. In Section 2 we introduce the use of
level set techniques for linear inverse problems, in particular X-ray CT. First, we
describe the more traditional pixel-based filtered backprojection technique, which
admits an interesting geometric interpretation, to be compared then with an al-
ternative gradient-based scheme for pixel-based inversion. We then extend this
gradient-based scheme to the situation of structural inversion using a level set
technique. We first concentrate on the search for descent directions for finding
only unknown shapes from given data, which will then be generalized to joint
inversion for interfaces and structural information in each individual region. Then,
some popular geometric regularization techniques are described for shape inver-
sion using level sets. At the end of the section we give a few hints for work in
the literature addressing linear inverse problems using level sets. The second part
of the text is concerned with nonlinear inverse problems, which are discussed in
Section 3. It starts by generalizing the concepts already seen for the linear case
to this more general situation. The main example is diffuse optical tomography.
First, the mathematical and physical background of this novel imaging technique
is explained briefly. An important component of nonlinear inverse problems by
iterative gradient-based schemes is efficient calculation of linearized parameter-
to-data mappings in each step. One powerful way of doing so (the so-called adjoint
scheme) will be described for our specific application. Then, some numerical ex-
amples are presented that illustrate the general behavior of shape reconstruction
schemes using level sets. Finally, some more literature references will be given
for level set techniques in nonlinear medical imaging problems, which concludes
the second part of the text.
where n(x) ∈ S¹ (S¹ being the unit circle in ℝ²) denotes the outward unit normal
to ∂Ω at the point x ∈ ∂Ω. In this model, u(x, θ) denotes the density of X-ray
photons propagating at position x ∈ Ω in direction θ ∈ S¹. Boundary condition
(2) indicates that there are no X-ray photons entering the domain. Instead, in
our model the photons are created by the source term, q(x, θ), in (1), which can
be located at the boundary of the domain. In fact, we have a duality between
boundary sources and incoming boundary conditions, as pointed out in [16], so
we can choose between putting either an inhomogeneity in (2) or using a source
term in (1) for creating X-ray photons at the boundary of the domain. Usually, there
is an additional energy variable involved when modeling X-ray photons, which is
neglected in (1), (2) for simplicity.
For x-ray CT, the quantity of interest is the attenuation, µ(x), in (1), which
models loss of particles propagating in direction θ due to various physical pro-
cesses. We will assume here that this quantity does not depend on the direction
θ in which the x-rays propagate at position x, but it will depend on position x.
Knowing the spatial distribution of µ(x) in the human body, the physician can
gain important information about the structure of the body. Usually, in clinical applications of this imaging technique, a large amount of high-quality x-ray data is available, such that reconstruction of the function µ(x) in the form of a pixel-based array
64 OLIVER DORN and DOMINIQUE LESSELIER
which are the line integrals of the attenuation over the lines Ljk . In the continuous
setting, when obtaining the line integrals over all possible lines, L(s, θ) = {x :
x · θ = s}, through Ω for s ∈ IR1 and θ ∈ S 1 , the data g(s, θ) are given by the
Radon transform of the attenuation µ(x):
(Tµ)(s, θ) = g(s, θ) = ∫_Ω δ_{L(s,θ)}(x) µ(x) dx,    (7)
technique, which takes into account the sampling rate of the data, the measure-
ment geometry that is used, and other details of the experimental setup. The
backprojection operator T ∗ is just the adjoint of T and is defined by [18, 19]
∗
(T g)(x) = g(x · θ, θ)dθ. (8)
S1
and by P the parameter space (of attenuation functions) equipped with the inner product

⟨µ_1, µ_2⟩_P = ∫_{IR^2} µ_1(x) µ_2(x) dx.    (10)
Then we have

⟨(Tµ)(θ, s), g(θ, s)⟩_Z = ⟨µ(x), (T*g)(x)⟩_P.    (11)
The backprojection operator can be interpreted physically as a process that takes all gathered data and 'projects them back' uniformly over the lines that have contributed to them. Summing up all these contributions yields T*g. The filtered backprojection reconstruction scheme can then be described as an implementation of the formula
(V ∗ µ)(x) = (T*(v ∗ g))(x).    (12)
Here, v is a filtering operator that is chosen such that V = T* v approximates the two-dimensional Dirac delta distribution concentrated in x. This will have the effect that V ∗ µ is an approximation to µ. In physical terms, the filtered data v ∗ g are backprojected by application of T* such that the result V ∗ µ approximates as much as possible the sought attenuation µ.
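The duality between projection and backprojection is easy to check numerically once T is discretized as a matrix, since T* then becomes the plain matrix transpose. The following sketch is our own illustration, not taken from the chapter; the grid size, the eight view angles, and the nearest-bin line model are arbitrary choices made only to verify the adjoint identity (11):

```python
import numpy as np

def projection_matrix(n, n_angles=8):
    """Discrete Radon transform T: each row sums the pixels whose
    projection x.theta falls into one offset bin (nearest-bin line model)."""
    xs = np.arange(n) - (n - 1) / 2.0
    X, Y = np.meshgrid(xs, xs)
    half = n / np.sqrt(2.0)            # offsets s range over [-half, half]
    rows = []
    for k in range(n_angles):
        th = np.pi * k / n_angles
        s = X * np.cos(th) + Y * np.sin(th)      # x . theta for every pixel
        bins = np.clip(((s + half) / (2 * half) * n).astype(int), 0, n - 1)
        for b in range(n):
            rows.append((bins == b).ravel().astype(float))
    return np.array(rows)

n = 16
T = projection_matrix(n)
mu = np.zeros((n, n)); mu[5:10, 6:12] = 1.0   # a simple attenuation "blob"
g = T @ mu.ravel()                            # sinogram-like data, cf. (7)
bp = (T.T @ g).reshape(n, n)                  # unfiltered backprojection T*g, cf. (8)

# Adjoint identity (11): <T mu, g>_Z equals <mu, T* g>_P up to round-off.
lhs = np.dot(T @ mu.ravel(), g)
rhs = np.dot(mu.ravel(), T.T @ g)
```

Without the filter v of (12), T*g is only a blurred version of µ: the backprojection smears each measured value uniformly along its line, which is exactly the physical picture described above.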
Notice that the filtered backprojection scheme calculates for each pixel in
the domain an individual value of the attenuation µ from the given data g using
formula (12). This usually yields good results if sufficient data of high quality are
available. In Section 2.4 we will introduce the shape-based reconstruction idea, which takes into account additional prior information about the structure of the tissue. As already mentioned, this is useful in cases where the quality or quantity of the given data is not sufficient for a successful application of formula (12). Adding
prior information to a reconstruction is often called ‘regularization.’ In this sense,
the shape-based approach can be considered as a specific form of regularization,
which is practically achieved by using a level set technique. Before introducing
the shape-based approach, we want to derive in the following section an iterative
gradient scheme for data inversion in CT, which will then lead us directly to the
level set based strategies.
J(µ) = (1/2) ‖Tµ − g‖²_Z = (1/2) ⟨Tµ − g, Tµ − g⟩_Z,    (13)

J(µ + δµ) = J(µ) + ⟨Tµ − g, Tδµ⟩_Z + (1/2) ‖Tδµ‖²_Z
          = J(µ) + ⟨T*(Tµ − g), δµ⟩_P + O(‖Tδµ‖²_Z),    (14)
where we have used the fact that the backprojection operator T*, as defined in (8), is the adjoint operator of T. We call

gradJ(µ) = T*(Tµ − g)

the gradient direction of J in µ, and we update µ → µ − λ gradJ(µ) for a sufficiently small step-size λ > 0. Doing so, the cost J(µ) is reduced in each step, as can be seen by plugging δµ = −λT*(Tµ − g) into (14). Certainly,
such an iterative gradient approach would be much slower than applying the fil-
tered backprojection scheme when sufficient high-quality data can be used for the
reconstruction. However, as mentioned previously, there are situations where only a few noisy and irregularly sampled data are available. In these situations, iterative algorithms become more interesting due to their capability of incorporating a priori information in a very simple and flexible way. Certainly, our interest in
the gradient scheme is not motivated so much by its applicability to classical pixel-
based attenuation reconstruction from x-ray data, but by the fact that it directly
leads us to the most basic level set reconstruction technique for finding attenuation
distributions of the form (18) from the data in an iterative fashion. How to do this will be demonstrated in the following.
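The iterative gradient scheme sketched above can be tried out on any discretized linear operator. In the following sketch (our own toy setup, not the chapter's: a small random matrix serves as a stand-in for the Radon transform T, and the data are noise-free), the monotone decrease of the cost predicted by (14) is easy to observe:

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.normal(size=(40, 25))            # stand-in for a discretized Radon transform
mu_true = rng.normal(size=25)
g = T @ mu_true                          # noise-free synthetic data

def cost(m):                             # J(mu) = 0.5 ||T mu - g||^2, eq. (13)
    r = T @ m - g
    return 0.5 * r @ r

mu = np.zeros(25)                        # initial guess
lam = 0.9 / np.linalg.norm(T, 2) ** 2    # step size below 1/||T||^2 ensures descent
costs = [cost(mu)]
for _ in range(200):
    mu = mu - lam * (T.T @ (T @ mu - g)) # delta mu = -lam T*(T mu - g)
    costs.append(cost(mu))
```

Each iteration costs one application of T and one of T*, so for abundant high-quality data the scheme is indeed far slower than a single filtered backprojection; its value lies in the flexibility with which prior information can be injected into the loop.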
The level set representation of a given shape, D, is not unique, since it can
be modified some distance away from the interfaces without changing the shape.
We assume that the velocity field v(x) is regular enough that the basic structure of D remains preserved during this evolution. Then, the points located on the boundary, Γ = ∂D, will move to the new locations x′ = x + y(x), and boundary Γ will be deformed into a new boundary, Γ′ = ∂D′. Assuming furthermore that the parameter distribution in Ω has the special form (18), it will change as well.
In the following, we want to quantify this change in the parameter distribution
where ds(x) is the incremental arclength. We have used the fact that in the limit
δµ(x) = µi (x) − µe (x) at boundary point x ∈ ∂D due to (18).
We arrive at the result
which is a distribution defined in the entire domain Ω but concentrated along ∂D.
Plugging now (23) into (27), we get for t ∈ [0, τ] the corresponding change in the parameters:

δµ(x; t) = (µ_i − µ_e) v(x)·n(x) t δ_∂D(x).    (28)
Plugging expression (28) into (14) and neglecting terms of higher than linear order,
we arrive at
at the boundary ∂D of the shape D is of relevance for the change in the cost (30).
This is because tangential components do not contribute to shape deformations.
In order to solve the inverse problem, we now have to choose a velocity field
v(x) such that the cost is reduced during the corresponding shape evolution. One
obvious choice for such a velocity is v(x) = F_SD(x) n(x) with the normal speed

F_SD(x) = −(µ_i − µ_e) [T*(Tµ − g)](x),    (32)

which is often called the gradient (or steepest-descent) direction with respect to the cost. Plugging this into (30) yields a descent direction for the cost J. What
remains to be done now is to numerically implement the resulting flow equation
in order to follow the steepest descent flow. The practical procedure for this is
standard [1–3]. We formulate the basic Hamilton-Jacobi-type equation

∂φ/∂t + F_SD(x, t) |∇φ| = 0    (33)
for the representing level set function φ. Notice that the velocity function (32)
needs to be recalculated at each step of the artificial shape evolution (33) for each
point of the current shape boundary x ∈ ∂D. This means that a Radon transform
needs to be applied to the current attenuation distribution µ(x; t) at time step t,
and then the backprojection operator T ∗ needs to be applied to the difference in
the data T µ(x; t) − g in order to calculate (32). Moreover, appropriate extension
velocities need to be constructed at each step for the numerical implementation of
the shape evolution equation (33).
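A minimal explicit discretization of the Hamilton-Jacobi equation (33) can be sketched as follows. This is our own illustrative first-order upwind scheme on a periodic grid; the circle test case, grid size, and step sizes are arbitrary choices, and practical implementations add the reinitialization and extension velocities mentioned above:

```python
import numpy as np

def hj_step(phi, F, dt, h=1.0):
    """One explicit step of  d(phi)/dt + F |grad phi| = 0  (Godunov upwinding,
    valid for normal speeds F of either sign)."""
    dxm = (phi - np.roll(phi, 1, axis=1)) / h    # backward difference in x
    dxp = (np.roll(phi, -1, axis=1) - phi) / h   # forward difference in x
    dym = (phi - np.roll(phi, 1, axis=0)) / h
    dyp = (np.roll(phi, -1, axis=0) - phi) / h
    gp = np.sqrt(np.maximum(dxm, 0)**2 + np.minimum(dxp, 0)**2 +
                 np.maximum(dym, 0)**2 + np.minimum(dyp, 0)**2)
    gm = np.sqrt(np.minimum(dxm, 0)**2 + np.maximum(dxp, 0)**2 +
                 np.minimum(dym, 0)**2 + np.maximum(dyp, 0)**2)
    return phi - dt * (np.maximum(F, 0) * gp + np.minimum(F, 0) * gm)

n = 64
y, x = np.mgrid[:n, :n]
phi = np.sqrt((x - n / 2)**2 + (y - n / 2)**2) - 20.0  # signed distance to a circle
area0 = int((phi < 0).sum())
for _ in range(10):
    phi = hj_step(phi, F=np.ones_like(phi), dt=0.5)    # constant outward speed
area = int((phi < 0).sum())     # the shape D = {phi < 0} has grown
```

In the inversion setting, F would be the recomputed speed F_SD of (32) rather than a constant, which is why each evolution step costs one Radon transform and one backprojection.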
with basis functions aj (x) and bk (x) and expansion coefficients αj and βk , re-
spectively. The cost (data misfit) will now not only depend on the shape D (i.e.,
the describing level set function φ), but also on the set of expansion coefficients
{αj } and {βk }. We write this relationship in the form
J(φ, {α_j}, {β_k}) = (1/2) ‖Tµ(φ, {α_j}, {β_k}) − g‖²_Z.    (35)
We will now take a slightly different approach than before. We want to find a
general time-evolution law for the unknowns of the problem that reduces gradually
the cost (35). We make the ansatz
dφ/dt = f(x, t),    dα_j/dt = f̂_j(t),    dβ_k/dt = f̃_k(t),    (36)

where f(x, t) depends on the spatial position x but f̂_j(t) and f̃_k(t) do not.
With evolving time, the cost will evolve as well. Using the chain rule, we get
dJ/dt = (dJ/dµ)(∂µ/∂φ)(dφ/dt) + Σ_{j=1}^{N_i} (dJ/dµ)(∂µ/∂α_j)(dα_j/dt) + Σ_{k=1}^{N_e} (dJ/dµ)(∂µ/∂β_k)(dβ_k/dt).    (37)
dµ/dφ = (µ_e − µ_i) δ(φ) = (Σ_{k=1}^{N_e} β_k b_k(x) − Σ_{j=1}^{N_i} α_j a_j(x)) δ(φ),    (39)

dµ/dα_j = a_j(x)(1 − H(φ)),    dµ/dβ_k = b_k(x) H(φ).    (40)
⟨dJ/dµ, δµ⟩ = ⟨T*(Tµ − g), δµ⟩_P.    (41)
We can now combine the above expressions and obtain from (37)

dJ/dt = ⟨T*(Tµ − g), (∂µ/∂φ)(dφ/dt) + Σ_{j=1}^{N_i} (∂µ/∂α_j)(dα_j/dt) + Σ_{k=1}^{N_e} (∂µ/∂β_k)(dβ_k/dt)⟩_P    (42)
dJ/dt = ⟨T*(Tµ − g), (µ_e − µ_i) (δ_∂D(x)/|∇φ(x)|) f(x, t)⟩_P    (44)
      + Σ_{j=1}^{N_i} f̂_j(t) ⟨T*(Tµ − g), a_j(x)(1 − H(φ))⟩_P
      + Σ_{k=1}^{N_e} f̃_k(t) ⟨T*(Tµ − g), b_k(x) H(φ)⟩_P.    (45)
where FSD (x, t) is defined as in (32). Equation (49) is again the Hamilton-Jacobi-
type equation already found in (33) for level set function φ. Equations (50) and
(51) are evolution laws for the individual expansion parameters α_j, j = 1, . . . , N_i,
and βk , k = 1, . . . , Ne . In order to calculate the right-hand sides of the evolution
system (49)–(51) in a given time step t, one Radon transform needs to be calcu-
lated for the current attenuation distribution µ(x, t) in order to calculate T µ − g,
and then this result needs to be backprojected by applying T ∗ in order to calculate
T ∗ (T µ − g). Once this expression (which is a function in the space of attenua-
tions) has been calculated, all three right-hand sides of (49)–(51) can be computed
from it simultaneously by just calculating weighted integrals of this expression
over the individual regions D, ∂D, and Ω\D. We mention that a related approach
has been proposed earlier in [21], where numerical results can also be found.
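The simultaneous updates can be prototyped with a smeared Heaviside/delta pair. In the sketch below, everything except H and δ is a made-up stand-in (the residual field r for T*(Tµ − g), a single constant basis function per region, and the contrast values are our own choices), used only to show that the sign choices for the right-hand sides of (49)-(51) make each term of dJ/dt in (44)-(45) non-positive:

```python
import numpy as np

def heaviside(phi):
    return (phi > 0).astype(float)          # H(phi): 1 outside D, 0 inside

def smeared_delta(phi, eps=1.5):
    """Numerical delta function concentrated near the zero level set of phi."""
    d = np.zeros_like(phi)
    m = np.abs(phi) <= eps
    d[m] = (1 + np.cos(np.pi * phi[m] / eps)) / (2 * eps)
    return d

n = 32
y, x = np.mgrid[:n, :n]
phi = np.sqrt((x - n / 2)**2 + (y - n / 2)**2) - 8.0   # level set function for D
rng = np.random.default_rng(1)
r = rng.normal(size=(n, n))     # stand-in for the backprojected residual T*(T mu - g)
a1 = np.ones((n, n))            # one interior basis function (N_i = 1)
b1 = np.ones((n, n))            # one exterior basis function (N_e = 1)
mu_i, mu_e = 2.0, 1.0

# Descent choices for the right-hand sides of the evolution system
# (for a signed-distance phi, delta_dD / |grad phi| reduces to delta(phi)):
f     = -(mu_e - mu_i) * r * smeared_delta(phi)     # shape update, active near dD
f_hat = -np.sum(r * a1 * (1 - heaviside(phi)))      # interior coefficient update
f_til = -np.sum(r * b1 * heaviside(phi))            # exterior coefficient update

# The three contributions to dJ/dt in (44)-(45) are then non-positive:
dJ_shape = np.sum((mu_e - mu_i) * r * smeared_delta(phi) * f)
dJ_int   = f_hat * np.sum(r * a1 * (1 - heaviside(phi)))
dJ_ext   = f_til * np.sum(r * b1 * heaviside(phi))
```

Note how all three right-hand sides reuse the single field r, mirroring the remark above that one Radon transform and one backprojection per step suffice for the whole system.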
shapes. Each of them can be written in the form of a cost functional which is
added to least-squares functional (2.3). Recalculating the gradient (or steepest de-
scent) directions for these extended cost functionals then yields additional terms
in the normal velocity component. These additional terms will be derived in the
following.
It is shown in [22, 23] that applying a flow by a smooth vector field v(x) yields
an infinitesimal response in this cost (52) that is given by
dJ_lenΓ(D, v) = ∫_Γ κ ⟨v, n⟩ dΓ,    (53)
and where n is the outward normal to the boundary Γ. The relationship (53) can
also be derived directly using a level set formulation. First, using (43), we write
(52) in the form
J_lenΓ(D(φ)) = ∫_Ω δ(φ) |∇φ(x)| dx.    (55)
Perturbing now φ → φ + ψ, a formal calculation (see, e.g., [14]) yields that the cost functional is perturbed by

⟨∂J_lenΓ/∂φ, ψ⟩ = ∫_Ω δ(φ) ψ(x) ∇·(∇φ/|∇φ|) dx.    (56)
∂J_lenΓ/∂φ = δ(φ) ∇·(∇φ/|∇φ|) = δ(φ) κ.    (57)
For both representations (53) and (57), minimizing the cost by a gradient method leads to curvature-driven flow equations, namely v = −κn.
It is shown in [22, 23] that applying a flow by a smooth vector field v(x) yields an infinitesimal response in this cost (58) that is given by

dJ_volD(D, v) = ∫_D div v dx = ∫_Γ ⟨v, n⟩ dΓ.    (59)
where µ_i and µ_e are given constant values. Then, this latter term can be written as

J_MS = ∫_Ω |∇H(φ)| dx.    (64)

Taking into account that ∇H(φ) = H′(φ)∇φ = δ(φ)|∇φ| n, we see that J_MS = J_lenΓ(D(φ)) as given in (55), such that we arrive again at the curvature-driven flow equation (57). For more details we refer to [25, 26].
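The curvature term δ(φ)κ in (57) is straightforward to evaluate on a grid with central differences. In the following sketch (our own test case, not from the chapter), the discrete formula is checked against the exact value κ = 1/R on a circle of radius R:

```python
import numpy as np

def curvature(phi, h=1.0, eps=1e-8):
    """kappa = div( grad(phi) / |grad(phi)| ), central differences, cf. (57)."""
    dy, dx = np.gradient(phi, h)                 # gradient components of phi
    norm = np.sqrt(dx**2 + dy**2) + eps          # eps avoids division by zero
    ny, nx = dy / norm, dx / norm                # unit normal field
    return np.gradient(nx, h, axis=1) + np.gradient(ny, h, axis=0)

n = 101
y, x = np.mgrid[:n, :n]
phi = np.sqrt((x - 50.0)**2 + (y - 50.0)**2) - 20.0  # circle of radius R = 20
kappa = curvature(phi)
on_interface = np.abs(phi) < 1.0
kappa_mean = kappa[on_interface].mean()              # should be close to 1/R = 0.05
```

In a regularized inversion, this quantity, weighted by δ(φ) and a regularization parameter, is simply added to the data-driven normal speed before the Hamilton-Jacobi update.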
In this section we describe the more general approach of level set techniques
for nonlinear inverse problems in medical imaging. As a good example for a
nonlinear inverse problem in medical imaging we will first describe the mathe-
matical model for Diffuse Optical Tomography (DOT), which in some sense can be
considered a generalization of x-ray CT to situations where scattering of photons
becomes dominant. It leads to a nonlinear inverse problem since the linearization
operation (6) does not yield simplified expressions anymore due to the scattering.
Instead, so-called 'linearized sensitivity functions' need to be considered now, which correspond to linearizations (often called Jacobians or, in the continuous setting, Fréchet derivatives) of the nonlinear forward problem. Physically, these
linearized sensitivity functions have a similar meaning as the line integrals in x-ray
tomography, indicating the information flow from events inside the domain toward
the detectors. In other words, they replace the line integrals in (6) by weighted
integrals over the whole domain with the weights indicating importance of a given
region to the measurements. We will describe them below. Furthermore, the
(1/c) ∂u/∂t + θ·∇u(x, θ, t) + (a(x) + b(x)) u(x, θ, t) − b(x) ∫_{S^{n−1}} η(θ·θ′) u(x, θ′, t) dθ′ = q(x, θ, t)    (66)

u(x, θ, 0) = 0 in Ω × S^{n−1},    (67)

u(x, θ, t) = 0 on Γ_−.    (68)
Here,

Γ_± := {(x, θ, t) ∈ ∂Ω × S^{n−1} × [0, T] : ±n(x)·θ > 0}.
b ≈ 100–200 cm⁻¹, i.e., b⁻¹ ≈ 0.005–0.01 cm [31, 32]. The scattering function η(θ·θ′) describes the probability for a particle entering a scattering process with the direction of propagation θ′ to leave this process with the direction θ. It is normalized to

∫_{S^{n−1}} η(θ·θ′) dθ′ = 1,    (69)
which expresses particle conservation in pure scattering events. The dot-product in the argument indicates that η depends only on the cosine of the scattering angle, cos ϑ = θ·θ′, an assumption typically made in DOT. Another assumption
typically made in DOT is that η is independent of the position x, although the
theory developed in the following can easily be generalized to the more general
case of position-dependent scattering functions. In our numerical experiments,
we will use a 2D-adapted version of the following Henyey-Greenstein scattering
function:
η(θ·θ′) = (1/4π) (1 − γ²)/(1 + γ² − 2γ cos ϑ)^{3/2},    (70)
with −1 < γ < 1. (See, for example, [33, 34] for possible choices.) The parameter
γ in (70) is the mean cosine of the scattering function. Values of γ close to one
indicate that the scattering is primarily forward directed, whereas values close to
zero indicate that scattering is almost isotropic. In our numerical experiments we
will choose γ to be 0.9, which is a typical value for DOT.
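The normalization (69) and the interpretation of γ as the mean cosine can be checked for the three-dimensional form of (70) by simple numerical quadrature over the sphere (an illustrative check of our own; the quadrature resolution is arbitrary):

```python
import numpy as np

def eta(cos_theta, gamma):
    """Henyey-Greenstein scattering function, eq. (70), 3D normalization 1/(4 pi)."""
    return (1 - gamma**2) / (4 * np.pi * (1 + gamma**2 - 2 * gamma * cos_theta)**1.5)

gamma = 0.9
theta = np.linspace(0.0, np.pi, 20001)     # scattering angle grid
dth = theta[1] - theta[0]
u = np.cos(theta)
w = 2 * np.pi * np.sin(theta)              # solid-angle weight on the unit sphere

norm = np.sum(eta(u, gamma) * w) * dth          # particle conservation (69): ~1
mean_cos = np.sum(u * eta(u, gamma) * w) * dth  # mean cosine: ~gamma = 0.9
```

For γ = 0.9 the function is sharply peaked near ϑ = 0, which is the quantitative expression of the strongly forward-directed scattering typical of tissue.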
The initial condition (67) indicates that there are no photons moving inside
Ω at the starting time of our experiment. The boundary condition (68) indicates
that during the experiment no photons enter the domain Ω from the outside. All
photons inside Ω originate from the source q, which, however, can be situated at
the boundary ∂Ω.
We consider the problem (66)–(68) for p different sources q_j, j = 1, . . . , p, positioned at ∂Ω. Possible sources in applications are delta-like pulses transmitted at time t = 0 at the position s_j ∈ ∂Ω into the direction θ_j, which can be described by the distributional expressions

q_j(x, θ, t) = δ_{s_j}(x) δ_{θ_j}(θ) δ(t),    (71)
where δsj (x) is a Dirac delta distribution concentrated on a small part of the
boundary ∂Ω indicating the source extension, and where n(sj ) · θj < 0 along this
part of the boundary. We assume that a given source qj gives rise to the physical
fields ũj (x, θ, t), which are solutions of (66)–(68). Our measurements consist of
the outgoing flux across the boundary ∂Ω, which, with (68), has the form

g_j(x, t) = ∫_{n(x)·θ>0} n(x)·θ ũ_j(x, θ, t) dθ on ∂Ω × [0, T].    (72)
The theory for the more general case of more than one unknown function then
follows the same line of reasoning, see, e.g., [31, 33, 35].
Let us denote the data corresponding to the parameter distribution a by A(a). In contrast to the Radon transform T considered in the previous section, the operator A is now a nonlinear operator, mapping parameter distributions a(x) (here the absorption coefficient) to the corresponding data A(a). Let us furthermore denote the physically measured data by g. As before, we can consider the difference between the two and define the (now likewise nonlinear) residual operator

R(a) = A(a) − g.

The goal will be to minimize this mismatch between calculated and measured data in some appropriate sense, as described in the following section.
where ⟨· , ·⟩_Z denotes the canonical inner product in data space Z. We assume that R(a) admits the expansion

R(a + δa) = R(a) + R′(a) δa + O(‖δa‖²).

The operator R′(a)* is the formal adjoint operator of R′(a) with respect to the spaces Z and P:

⟨R′(a) δa, ζ⟩_Z = ⟨δa, R′(a)* ζ⟩_P.    (77)

We call

gradJ(a) = R′(a)* R(a)    (78)

the gradient direction of J in a.
The remaining part of the theory is now very similar to the development
presented in the previous section for linear operators. We only mention here the
result analogous to (30):
∂J(a)/∂t |_{t=0} = ∫_{∂D} [R′(a)* R(a)] (a_i − a_e) v(x)·n(x) ds(x),    (79)
which describes the change in the cost due to a small deformation of the shape D by a vector field v(x), and the corresponding gradient (or steepest-descent) normal velocity field F_SD, which is

F_SD(x) = −(a_i − a_e) [R′(a)* R(a)](x).    (80)

Plugging this into (79) gives us a descent direction for the cost J.
The remainder of the theory, concerning for example the simultaneous reconstruction of shape and texture components, or the additional incorporation of geometric regularization schemes, is now completely analogous to the linear case, such that we will not discuss it here in particular. The main difference from the linear theory is the need to compute the adjoint linearized residual operator R′(a)* in each step of the iterative inversion. We will address this important topic in the following section.
R′(a)(δa)(x_r, t_r) = ∫_{S_+^{n−1}} n(x_r)·θ w(x_r, θ, t_r) dθ,    (81)
∂w/∂t + θ·∇w(x, θ, t) + (a(x) + b(x)) w(x, θ, t) − b(x) ∫_{S^{n−1}} η(θ·θ′) w(x, θ′, t) dθ′ = −δa(x) u(x, θ, t)    (82)

w(x, θ, 0) = 0 in Ω × S^{n−1},    w(x, θ, t) = 0 on Γ_−.    (83)
Notice that, for a given perturbation function δa(x_sc) (where the argument x_sc ∈ Ω denotes the scattering points), the value of R′(a)δa is a function in the variables x_r and t_r, where x_r is the receiver location and t_r the receiver time.
This explains the somewhat complicated notation in (81). The physical inter-
pretation of this result is that the perturbation δa creates a scattering source
Qδa (x, θ, t) = −δa(x)u(x, θ, t) inside the domain Ω. This gives rise to a distribu-
tion w(x, θ, t) (which can be positive or negative) of virtual ‘secondary particles’
propagating in the unperturbed medium to the receivers, where they are detected
as the (linearized) residuals in the data.
The adjoint linearized residual operator R′(a)* is formally defined by identity (77). It is a mapping from the data space Z into the parameter space P.
Notice the analogy to the adjoint Radon transform in linear x-ray tomography,
which is as well a backprojection operator from the data space into the parameter
space. We will see below that there is indeed a close analogy between the backpro-
jection operator in x-ray tomography and the adjoint linearized residual operator
in DOT.
An explicit expression for the action of the adjoint linearized residual operator
on an arbitrary vector ζ of the data space can be derived by using Green’s formula
for the linear transport equation in time domain. Next, we formulate this result,
and refer for a derivation to [36, 33].
Let z denote the solution of the following adjoint linear transport equation:

−∂z/∂t − θ·∇z(x, θ, t) + (a(x) + b(x)) z(x, θ, t) − b(x) ∫_{S^{n−1}} η(θ·θ′) z(x, θ′, t) dθ′ = 0 in Ω × S^{n−1} × [0, T],
and let u be the solution of the forward problem (66)–(68). Then we have

R′(a)*ζ(x) = −∫_{[0,T]} ∫_{S^{n−1}} u(x, θ, t) z(x, θ, t) dθ dt.    (86)
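The structure of (86) — the gradient as minus the pointwise product of a forward field and an adjoint field, obtained from one extra solve — appears in any PDE-constrained problem. It can be illustrated with a small matrix analogue (a made-up discrete system of our own, not the transport equation itself: a fixed operator L plus an absorption-like diagonal term), with the adjoint gradient checked against a finite difference:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
# Forward model A(a) = L + diag(a): fixed part plus an absorption-like term.
L = 2.0 * np.eye(n) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
M = rng.normal(size=(5, n))        # measurement operator: 5 "detectors"
q = np.zeros(n); q[0] = 1.0        # boundary-like source
g = rng.normal(size=5)             # synthetic "measured" data

def forward(a):                    # forward field u(a): solves A(a) u = q
    return np.linalg.solve(L + np.diag(a), q)

def cost(a):                       # J(a) = 0.5 ||M u(a) - g||^2
    r = M @ forward(a) - g
    return 0.5 * r @ r

def grad_adjoint(a):
    """Gradient of J via one forward and one adjoint solve; the result is the
    product of forward and adjoint fields, mirroring the structure of (86)."""
    u = forward(a)
    z = np.linalg.solve((L + np.diag(a)).T, M.T @ (M @ forward(a) - g))
    return -u * z

a0 = np.full(n, 0.5)
grad = grad_adjoint(a0)
i, eps = 7, 1e-6                   # finite-difference check of one component
ap = a0.copy(); ap[i] += eps
fd = (cost(ap) - cost(a0)) / eps
```

The point of the adjoint scheme is the cost: the full gradient requires only two solves, instead of one linearized solve per parameter component.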
Figure 4. Shape evolution for the first example. Top row: first guess, after 2 and 14
iterations; center row: after 20, 80, and 130 iterations; bottom row: after 250 and 500
iterations; bottom right: true reference model. See also animated movie DOTmovie1 on
the attached CD.
Figure 5. Shape evolution for the second example. Top row: first guess, after 10 and 40
iterations; bottom row: after 250 and 500 iterations; bottom right: true reference model.
See also animated movie DOTmovie2 on the attached CD.
Figure 6. Shape evolution for the third example. Top row: first guess, after 30 and 35
iterations; bottom row: after 250 and 500 iterations; bottom right: true reference model.
See also animated movie DOTmovie3 on the attached CD.
Figure 7. Shape evolution for the fourth example. Top row: first guess, after 4 and 10
iterations; bottom row: after 40 and 500 iterations; bottom right: true reference model. See
also animated movie DOTmovie4 on the attached CD.
Figure 8. Evolution of least-squares data misfit. Top left: first example (Fig. 4); top right:
second example (Fig. 5); bottom left: third example (Fig. 6); bottom right: fourth example
(Fig. 7).
4. ACKNOWLEDGMENTS
5. REFERENCES
1. Osher S, Sethian JA. 1988. Fronts propagating with curvature-dependent speed: algorithms based
on Hamilton-Jacobi formulations. J Comput Phys 79:12–49.
2. Osher S, Fedkiw R. 2003. Level set methods and dynamic implicit surfaces. New York: Springer.
3. Sethian JA. 1999. Level set methods and fast marching methods, 2nd ed. Cambridge: Cambridge
UP.
4. Santosa F. 1996. A level set approach for inverse problems involving obstacles. ESAIM Control
Optim Calculus Variations 1:17–33.
5. Litman A, Lesselier D, Santosa F. 1998. Reconstruction of a two-dimensional binary obstacle by controlled evolution of a level set. Inverse Probl 14:685–706.
6. Feng H, Karl WC, Castanon D. 2000. Tomographic reconstruction using curve evolution. Proc
IEEE Int Conf Computer Vision and Pattern Recognition 1:361–366.
7. Dorn O, Miller E, Rappaport C. 2000. A shape reconstruction method for electromagnetic to-
mography using adjoint fields and level sets. Inverse Probl 16:1119–1156.
8. Burger M. 2001. A level set method for inverse problems. Inverse Probl 17:1327–1355.
9. Ito K, Kunisch K, Li Z. 2001. Level set approach to an inverse interface problem. Inverse Probl
17:1225–1242.
10. Ramananjaona C, Lambert M, Lesselier D, Zolésio J-P. 2001. Shape reconstruction of buried
obstacles by controlled evolution of a level set: from a min–max formulation to numerical
experimentation. Inverse Probl 17: 1087–1111.
11. Ramananjaona C, Lambert M, Lesselier D. 2001. Shape inversion from TM and TE real data by
controlled evolution of level sets. Inverse Probl 17:1585–1595.
12. Burger M, Osher S. 2005. A survey on level set methods for inverse problems and optimal design.
Eur J Appl Math 16:263–301.
13. Dorn O, Lesselier D. 2006. Level set methods for inverse scattering. Inverse Probl 22:R67–R131.
14. Tai X-C, Chan TF. 2004. A survey on multiple level set methods with applications for identifying piecewise constant functions. Int J Num Anal Model 1:25–47.
15. Suri JS, Liu K, Singh S, Laxminarayan SN, Zeng X, Reden L. 2002. Shape recovery algorithms using level sets in 2D/3D medical imagery: a state-of-the-art review. IEEE Trans Inf Technol Biomed 6:8–28.
16. Case K, Zweifel P. 1967. Linear transport theory. New York: Addison Wesley.
17. Kak AC, Slaney M. 2001. Principles of computerized tomographic imaging. SIAM Classics in
Applied Mathematics, Vol. 33. Philadelphia: SIAM.
18. Natterer F. 1986. The mathematics of computerized tomography. Stuttgart: Teubner.
19. Natterer F, Wübbeling F. 2001. Mathematical methods in image reconstruction. Monographs on
Mathematical Modeling and Computation, Vol. 5. Philadelphia: SIAM.
20. Radon J. 1917. Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten. Ber Sächs Akad der Wiss Leipzig, Math-Phys Kl 69:262–267.
21. Feng H, Karl WC, Castanon DA. 2003. A curve evolution approach to object-based tomographic
reconstruction. IEEE Trans Image Process 12:44–57.
22. Delfour MC, Zolésio J-P. 2001. Shapes and geometries: analysis, differential calculus and opti-
mization. SIAM Advances in Design and Control. Philadelphia: SIAM.
23. Sokolowski J, Zolésio J-P. 1992. Introduction to shape optimization: shape sensitivity analysis.
Springer series in Computational Mathematics, Vol. 16. Berlin: Springer.
24. Mumford D, Shah J. 1989. Optimal approximation by piecewise smooth functions and associated
variational problems. Commun Pure Appl Math 42:577–685.
25. Chan TF, Vese LA. 2001. Active contours without edges. IEEE Trans Image Process 10:266–277.
26. Vese LA, Chan TF. 2002. A multiphase level set framework for image segmentation using the
Mumford-Shah model. Int J Comput Vision 50:271–293.
27. Chan TF, Tai X-C. 2003. Level set and total variation regularization for elliptic inverse problems
with discontinuous coefficients. J Comput Phys 193:40–66.
28. Lysaker M, Chan T, Tai X-C. 2004. Level set method for positron emission tomography. UCLA-
CAM Preprint 04-30.
29. Whitaker RT, Elangovan V. 2002. A direct approach to estimating surfaces in tomographic data.
Med Imaging Anal 6, 235–249.
30. Ye JC, Bresler Y, Moulin P. 2002. A self-referencing level set method for image reconstruction
from sparse Fourier samples. Int J Comput Vision 50:253–270.
31. Arridge SR. 1999. Optical tomography in medical imaging. Inverse Probl 15:R41–R93.
32. Okada E, Firbank M, Schweiger M, Arridge SR, Cope M, Delpy DT. 1997. Theoretical and
experimental investigation of near-infrared light propagation in a model of the adult head. Appl
Opt 36(1):21–31.
33. Dorn O. 1998. A transport-backtransport method for optical tomography. Inverse Probl 14:1107–
1130.
34. Heino J, Arridge S, Sikora J, Somersalo E. 2003. Anisotropic effects in highly scattering media.
Phys Rev E, 68(3):031908-1–031908-8.
35. Schweiger M, Arridge SR, Dorn O, Zacharopoulos A, Kolehmainen V. 2006. Reconstructing
absorption and diffusion shape profiles in optical tomography using a level set technique. Opt
Lett 31(4):471–473.
36. Dierkes T, Dorn O, Natterer F, Palamodov V, Sielschott H. 2002. Fréchet derivatives for some bilinear inverse problems. SIAM J Appl Math 62:2092–2113.
37. Dorn O. 2000. Scattering and absorption transport sensitivity functions for optical tomography.
Opt Express 7:492–506.
38. Dorn O, Miller E, Rappaport C. 2001. Shape reconstruction in 2D from limited-view multifre-
quency electromagnetic data. Radon transform and tomography. AMS series on Contemporary
Mathematics, Vol. 278, pp. 97–122.
39. Ferrayé R, Dauvignac J-Y, Pichot C. 2003. An inverse scattering method based on contour defor-
mations by means of a level set method using frequency hopping technique. IEEE Trans Antennas
Propag 51:1100–1113.
40. Ferrayé R, Dauvignac J-Y, Pichot C. 2003. Reconstruction of complex and multiple shape object contours using a level set method. J Electromagn Waves Appl 17:153–181.
41. Irishina N, Moscoso M, Dorn O. 2006. Detection of small tumors in microwave medical imaging
using level sets and MUSIC. Proceedings of the progress in electromagnetics research symposium,
Cambridge, MA, March 26–29, 2006. To appear. http://ceta.mit.edu/PIER/.
42. Litman A. 2005. Reconstruction by level sets of n-ary scattering obstacles. Inverse Probl 21:S131–S152.
43. Ramananjaona C, Lambert M, Lesselier D, Zolesio J-P. 2003. On novel developments of controlled
evolution of level sets in the field of inverse shape problems. Radio Sci 38:1–9.
44. Ascher UM, Van den Doel K. 2005. On level set regularization for highly ill-posed distributed parameter estimation problems. J Comput Phys. To appear. http://www.cs.ubc.ca/ kvdoel/publications/keesUri05.pdf
45. Chung ET, Chan TF, Tai XC. 2005. Electrical impedance tomography using level set representation
and total variational regularization. J Comput Phys 205:357–372.
46. Leitao A, Scherzer O. 2003. On the relation between constraint regularization, level sets and
shape optimization. Inverse Probl 19:L1–L11.
47. Soleimani M, Lionheart WRB, Dorn O. 2005. Level set reconstruction of conductivity and per-
mittivity from boundary electrical measurements using experimental data. Inverse Probl Sci Eng
14:193–210.
48. Soleimani M, Dorn O, Lionheart WRB. 2006. A narrowband level set method applied to EIT in
brain for cryosurgery monitoring. IEEE Trans Biomed Eng. To appear.
49. Calderero F, Ghodrati A, Brooks DH, Tadmor G, MacLeod R. 2005. A method to reconstruct
activation wavefronts without isotropy assumptions using a level set approach. In Lecture Notes in
Computer Science, Vol. 3504: Functional imaging and modeling of the heart: third international
workshop, FIMH 2005, Barcelona, Spain, June 2–4, 2005, pp. 195–204. Ed.AF Frangi, PI Radeva,
A Santos, M Hernandez. Berlin: Springer.
50. Lysaker OM, Nielsen BF. 2006. Toward a level set framework for infarction modeling: an inverse
problem. Int J Num Anal Model 3:377–394.
51. Bal G, Ren K. 2005. Reconstruction of singular surfaces by shape sensitivity analysis and level
set method. Preprint, Columbia University. http://www.columbia.edu/ gb2030/PAPERS/Sing
LevelSet.pdf.
52. Dorn O. 2004. Shape reconstruction in scattering media with voids using a transport model and
level sets. Can Appl Math Q 10:239–275.
53. Dorn O. 2006. Shape reconstruction for an inverse radiative transfer problem arising in medical
imaging. In Numerical methods for multidimensional radiative transfer problems. Springer series
Computational Science and Engineering. Ed. G Kanschat, E Meinköhn, R Rannacher, R Wehrse.
Springer: Berlin. To appear.
54. Ishimaru A. 1978. Wave propagation and scattering in random media. New York: Academic
Press.
55. Natterer F, Wübbeling F. 1995. A propagation-backpropagation method for ultrasound tomogra-
phy. Inverse Probl 11:1225–1232.
56. Osher S, Santosa F. 2001. Level set methods for optimisation problems involving geometry and
constraints I. Frequencies of a two-density inhomogeneous drum. J Comput Phys 171:272–288.
57. Osher S, Paragios N. 2003. Geometric level set methods in imaging, vision and graphics. Berlin:
Springer.
58. Sikora J, Zacharopoulos A, Douiri A, Schweiger M, Horesh L, Arridge S, Ripoll J. 2006. Diffuse
photon propagation in multilayered geometries. Phys Med Biol 51:497–516.
59. Zacharopoulos A, Arridge S, Dorn O, Kolehmainen V, Sikora J. 2006. 3D shape reconstruc-
tion in optical tomography using spherical harmonics and BEM. Proceedings of the progress in
electromagnetics research symposium, Cambridge, MA, March 26–29, 2006. To appear.
4

SHAPE AND TEXTURE-BASED DEFORMABLE MODELS
In this chapter we introduce concepts and algorithms of shape- and texture-based deformable
models — more specifically Active Shape Models (ASMs), Active Appearance Models
(AAMs), and Morphable Models — for facial image analysis. Such models, learned from
training examples, allow admissible deformations under statistical constraints on the shape
and/or texture of the pattern of interest. As such, the deformation stays in accordance with
the specific constraints on the pattern. Based on an analysis of problems with the standard ASM
and AAM, we further describe enhanced models and algorithms, namely Direct Appearance
Models (DAMs) and a Texture-Constrained ASM (TC-ASM), for improved fitting of shapes
and textures. A method is also described for evaluation of goodness of fit using an ASM.
Experimental results are provided to compare different methods.
1. INTRODUCTION
Many image-based systems require alignment between an object in the input
image and a target object. The alignment quality can have a great impact on system
Address all correspondence to: Stan Z. Li, Center for Biometrics and Security Research,
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of
Sciences, 95 Zhongguancun Donglu, Beijing 100080, China. Phone: +86-10-8262 6787;
Fax: +86-10-6265 9350. [email protected], [email protected]. Web: http://www.cbsr.ia.ac.cn/,
http://www.cbsr.ia.ac.cn/homapage/szli.
92 STAN Z. LI et al.
performance. For face analysis, in particular, both shapes and textures provide
important clues useful for characterizing faces. The task of face alignment is to
accurately locate facial features such as the eyes, nose, mouth, and outline, and
to normalize facial shape and texture. Accurate extraction and alignment of these
features offers advantages for many applications.
Among the most successful face alignment methods are deformable models,
which can represent variations in either the shape or the texture of the target objects.
As two typical deformable model types, the active shape models (ASMs) [1] and
active appearance models (AAMs) [2, 3] have been widely used as alignment
algorithms in medical image analysis and face analysis [4] for the past decade. The
standard ASM consists of two statistical models: (1) a global shape model, which is
derived from the landmarks in the object contour, and (2) a local appearance model,
which is derived from the profiles perpendicular to the object contour around each
landmark. The ASM uses local models to find the candidate shape and the global
model to constrain the searched shape. The AAM makes use of subspace analysis
techniques, PCA in particular, to model both shape variation and texture variation,
and the correlations between them. The integrated shape and texture is referred
to as appearance. In searching for a solution, it assumes linear relationships
between appearance variation and texture variation, and between position variation
and texture variation; and learns the two linear regression models from training
data. The two regression models reduce the minimization from a high-dimensional search space. This
strategy is also developed in the active blob model [5].
ASMs and AAMs can be expanded in several ways. The concept, originally
proposed for the standard frontal view, can be extended to multi-view faces, ei-
ther by using piecewise linear modeling [6] or nonlinear modeling [7]. Cootes
and Taylor show that imposing constraints such as fixing eye locations can im-
prove AAM search results [8]. Blanz and Vetter extended morphable models and
the AAM to model the relationship of 3D head geometry and facial appearance
[9]. Li et al. [10] present a method for learning 3D face shape modeling from
2D images based on a shape-and-pose-free texture model. In Duta et al. [11],
the shapes are automatically aligned using Procrustes analysis, and clustered to
obtain cluster prototypes and statistical information about intra-cluster shape vari-
ation. In Ginneken et al. [12], a K-nearest-neighbors classifier is used and a set of
features selected for each landmark to build local models. Baker and colleagues
[13] propose an efficient method called an “inverse compositional algorithm” for
alignment. Ahlberg [14] extends the AAM to a parametric method called an Ac-
tive Appearance algorithm to extract positions parameterized by 3D rotation, 2D
translation, scale, and six Action Units (controlling the mouth and the eyebrows).
In the direct appearance model (DAM) [15, 16], shape is modeled as a linear
function of texture. Using such an assumption, Yan et al. [17] propose a texture-
constrained ASM (TC-ASM), which has the advantage of an ASM in having good
localization accuracy and that of an AAM in having insensitivity to initialization.
To construct an effective evaluation function, a statistical learning approach was
SHAPE AND TEXTURE-BASED DEFORMABLE MODELS 93
proposed for face alignment by Huang et al. [18] using a nonlinear classification
function learned from a training set of positive and negative training examples.
The following sections first describe the classical ASM and AAM. We will then
briefly review the 3D Morphable Model as an important 3D deformable model.
After that, two improved face alignment algorithms — DAM and TC-ASM —
will be introduced, based on an analysis of the problems of the classical ASM
and AAM. An alignment quality evaluation mechanism is then addressed, before
the experimental results and conclusion that end this chapter.
For all the algorithms presented here, a training set of shape–texture pairs is
assumed to be available and denoted as Ω = {(S_0, T_0)}, where a shape S_0 =
((x_1, y_1), . . . , (x_K, y_K)) ∈ R^{2K} is a sequence of K points in the 2D image plane,
and a texture T_0 is the patch of pixel intensities enclosed by S_0. Let S̄ be the
mean shape of all the training shapes, as illustrated in Figure 1. All the shapes are
aligned or warped to the tangent space of the mean shape S̄; each texture T_0 is then
warped correspondingly to T ∈ R^L, where L is the number of pixels inside the
mean shape S̄. The warping may be done by pixel-value interpolation, e.g., using
a triangulation or thin-plate spline method.
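The shape eigenspace used throughout the chapter can be built with a few lines of linear algebra. The following Python sketch (our illustrative code, not the chapter's; the function names are hypothetical) computes the mean shape and the PCA modes from a matrix of aligned shapes:

```python
import numpy as np

def build_shape_model(shapes, retain=0.98):
    """PCA shape model from aligned training shapes (illustrative sketch).

    shapes: (N, 2K) array, one aligned shape (x1, y1, ..., xK, yK) per row.
    Returns (mean_shape, U, eigvals) so that any aligned shape is
    approximated as S ~= mean_shape + U @ s.
    """
    mean = shapes.mean(axis=0)
    X = shapes - mean
    # Principal modes via SVD of the centered data matrix.
    _, sing, Vt = np.linalg.svd(X, full_matrices=False)
    eigvals = sing ** 2 / (len(shapes) - 1)
    # Keep the fewest modes explaining `retain` of the total variance.
    frac = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(frac, retain)) + 1
    return mean, Vt[:k].T, eigvals[:k]

def project(shape, mean, U):
    """Shape parameters s = U^T (S - mean)."""
    return U.T @ (shape - mean)
```

With the chapter's 83 landmarks, `shapes` would be an N × 166 matrix, and `U` would retain the modes explaining 98% of the variance.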
Figure 1. Two face instances labeled with 83 landmarks and the mesh of the mean shape.
Reprinted with permission from SC Yan, C Liu, SZ Li, HJ Zhang, H Shum, QS Cheng.
2003. Face alignment using texture-constrained active shape models. Image Vision Comput
21(1):69–75. Copyright 2003 © Elsevier. See attached CD for color version.
There are two classical deformable models for 2D face analysis — the Active
Shape Model (ASM) and the Active Appearance Model (AAM). We first look
through them; the model for 3D face analysis will be addressed later.
The ASM seeks to match a set of model points to an image by searching along
profiles of each point under the constraint of a statistical shape model. The AAM
seeks to match both the position of the model points and a representation of the
texture to an image by updating the model parameters using the difference between
the current synthesized image and the target image.
There are three key differences between the two models [19]:
1. The ASM only uses models of the image texture in small regions about
each landmark point, whereas the AAM uses a model of the appearance of
the whole of the region (usually inside a convex hull around the points).
2. The ASM searches around the current position, typically along profiles nor-
mal to the boundary, whereas the AAM only samples the image enclosed
by the current position.
3. The ASM essentially seeks to minimize the distance between model points
and the corresponding points found in the image, whereas the AAM seeks
to minimize the difference between the synthesized model image and the
target image.
where g_j(x, y) is the profile of the jth landmark at (x, y), and ‖X‖²_A = X^T A^{-1} X
is the Mahalanobis distance measure with respect to a real symmetric matrix A.
After relocating all the landmarks using the local appearance models, we
obtain a new candidate shape S_lm^n. The solution in shape eigenspace is derived by
maximizing the likelihood:
where

$$\mathrm{Eng}(S_{lm}^{n}; s) = \lambda \left\| S_{lm}^{n} - \bar{S}_{lm}^{n} \right\|^{2} + \left\| s_{lm}^{n} - s \right\|_{\Lambda}^{2}. \tag{4}$$

In the above equation, s_lm^n = U^T(S_lm^n − S̄) is the projection of S_lm^n onto the shape
eigenspace, S̄_lm^n = S̄ + U s_lm^n is the reconstructed shape, and Λ is the diagonal
matrix of the largest eigenvalues of the training data {S_i}. The first term is the
squared Euclidean distance from S_lm^n to the shape eigenspace, and the second is
the squared Mahalanobis distance between s_lm^n and s; λ balances the two terms.
Using the local appearance models leads to fast convergence to the local image
evidence. However, since they are modeled based on the local features, and the
“best” candidate point is only evaluated in the local neighborhood, the solution of
the ASM is often suboptimal, dependent on the initialization.
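As a concrete reading of Eq. (4), the following sketch (illustrative Python; the argument names are ours) evaluates the two energy terms for a candidate shape:

```python
import numpy as np

def asm_energy(S_lm, s, mean, U, eigvals, lam=1.0):
    """Evaluate the energy of Eq. (4) for a candidate shape S_lm and shape
    parameters s. Illustrative names: `mean` is the mean shape, U holds the
    eigenvectors, `eigvals` the diagonal of Lambda, and `lam` the weight.
    """
    s_lm = U.T @ (S_lm - mean)       # projection into the shape eigenspace
    S_rec = mean + U @ s_lm          # reconstructed shape
    term1 = np.sum((S_lm - S_rec) ** 2)           # distance to the eigenspace
    term2 = np.sum((s_lm - s) ** 2 / eigvals)     # Mahalanobis term
    return lam * term1 + term2
```

A shape lying exactly in the eigenspace with s matching its projection has zero energy; any departure in either term increases it.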
$$T = \bar{T} + Vt, \tag{5}$$

$$t = V^{T} (T - \bar{T}). \tag{6}$$

By this, the L pixel values in the mean shape are represented as a point t in the
texture subspace S_t.
The appearance of each example is a concatenated vector:
$$A = \begin{pmatrix} \Lambda s \\ t \end{pmatrix}, \tag{7}$$

where Λ is a diagonal matrix of weights for the shape parameters, allowing for the
difference in units between the shape and texture variation, typically defined as
Λ = rI. Again, by applying PCA on the set {A}, one gets
A = Wa, (8)
a = W^T A. (9)
The search for an AAM solution is guided by the following difference between
the texture Tim in the image patch and the texture Ta reconstructed from the current
appearance parameters:
δT = T_im − T_a. (10)
More specifically, the search for a face in an image is guided by minimizing the
norm ‖δT‖.
norm δT . The AAM assumes that the appearance displacement δa and the posi-
tion (including coordinates (x, y), scale s, and rotation parameter θ) displacement
δp are linearly correlated to δT :
δa = A_a δT, (11)
δp = A_p δT. (12)
The prediction matrices Aa , Ap are to be learned from the training data by using
linear regression. In order to estimate Aa , a is displaced systematically to induce
(δa, δT ) pairs for each training image. Due to the large consumption of memory
required for the learning of Aa and Ap , learning has to be done with a small,
limited set of {δa, δT }.
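The regression step behind Eqs. (11) and (12) is an ordinary least-squares fit. A sketch (illustrative Python; `learn_prediction_matrix` is a name of ours, and in the chapter the training pairs come from systematic perturbations of the training images):

```python
import numpy as np

def learn_prediction_matrix(deltas, delta_T):
    """Least-squares estimate of A in  delta ~= A @ delta_T  (Eqs. (11)-(12)).

    deltas:  (N, d) parameter displacements (appearance or position);
    delta_T: (N, L) texture differences they induce.
    Returns A of shape (d, L). Illustrative sketch only.
    """
    # Solve delta_T @ A^T ~= deltas in the least-squares sense.
    A_T, *_ = np.linalg.lstsq(delta_T, deltas, rcond=None)
    return A_T.T
```

The memory issue discussed above is visible here: `delta_T` has one L-dimensional row per perturbation, so the training matrix grows with both the texture dimension and the number of perturbed samples.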
The 3D Morphable Model (3DMM) builds vector spaces in which any convex combination
of exemplar shapes and textures describes a realistic human face. Correspondence is the basic requirement for
constructing such a vector space. In [23], correspondences are established be-
tween all exemplar faces and a reference face by an optical flow algorithm. This
scheme brings a consistent labeling of vertices and corresponding albedos across
the whole set of exemplar faces. The shape of an exemplar face is then represented
by a shape vector S^ex = ((x_1, y_1, z_1), . . . , (x_K, y_K, z_K)) ∈ R^{3K} that contains
the x, y, z coordinates of K vertices. The texture of the face is represented by a
texture vector T^ex = ((R_1, G_1, B_1), . . . , (R_K, G_K, B_K)) ∈ R^{3K} that contains
the R, G, B texture values sampled at the same K vertices.
A new face can then be generated by a convex combination of the K exemplar
faces, with its shape and texture vectors S and T expressed as

$$S = \sum_{i=1}^{K} a_i S_i^{ex}, \qquad T = \sum_{i=1}^{K} b_i T_i^{ex}, \qquad \sum_{i=1}^{K} a_i = \sum_{i=1}^{K} b_i = 1. \tag{13}$$
Again, PCA is applied separately on the shape and texture space to reduce di-
mensionality. Now, instead of describing a new face as a convex combination of
exemplars, as in Eq. (13), we use shape and texture PCA models similar to
Eqs. (1) and (5):

$$S = \bar{S} + Us, \qquad T = \bar{T} + Vt. \tag{14}$$
Note that U and V are the matrices consisting of orthogonal modes of variations
in {S ex } and {T ex }. The 3DMM shape and texture coefficient vectors s and t
are low-dimensional coding of the identity of a face invariant to pose and illumi-
nation influence. Given an input 2D image under arbitrary pose and illumination
conditions or unrestrained 3D face data, the 3DMM can recover the vectors of s
and t by an analysis by synthesis, providing an alignment between input face and
exemplar faces in the database.
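Equation (13) amounts to a weighted average of corresponded exemplar vectors, which the following illustrative sketch makes explicit (Python; `morph_face` is our name, and dense correspondence between exemplars is taken as given):

```python
import numpy as np

def morph_face(shapes_ex, textures_ex, a, b):
    """New face as a convex combination of exemplar faces (Eq. (13)).

    shapes_ex, textures_ex: arrays with one exemplar vector per row;
    a, b: non-negative mixing weights, renormalized here to sum to 1.
    Sketch only: a real 3DMM first establishes dense correspondence,
    e.g., by optical flow against a reference face.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / a.sum()
    b = b / b.sum()
    return a @ shapes_ex, b @ textures_ex
```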
The ASM uses local appearance models to search along the profiles of candidate
points, which leads to fast convergence to the local image evidence. However, since
these models are based on local features, and the “best” candidate point is only
evaluated in a local neighborhood, the solution of the ASM is often suboptimal
and dependent on the initialization.
By analyzing the relationships between the shape, texture, and appearance
subspaces in the AAM, we will show the defects of the model. Thereby, we
suggest a property that an ideal appearance model should have, which motivates
us to propose improvements to the classical model.
First, let us look into the relationship between shape and texture from an
intuitive viewpoint. A texture (i.e., the patch of intensities) is enclosed by a shape
(before aligning to the mean shape); the same shape can enclose different textures
(i.e., configurations of pixel values). However, the reverse is not true: different
shapes cannot enclose the same texture. So the mapping from the texture space
to the shape space is many-to-one. The shape parameters should be determined
completely by texture parameters but not vice versa.
Let us now look further into the correlations or constraints between the linear
subspaces Ss , St and Sa in terms of their dimensionalities or ranks. Let us denote
the rank of space S by dim(S). We have the following analysis:
1. When dim(Sa )=dim(St )+dim(Ss ), the shape and texture parameters are
independent of each other, and there exist no mutual constraints between
the s and t parameters.
2. When dim(St )<dim(Sa )<dim(St )+ dim(Ss ), not all the shape parameters
are independent of the texture parameters. That is, one shape can corre-
spond to more than one texture configuration in it, which conforms to our
intuition.
3. One can also derive the relationship dim(St)<dim(Sa) from Eqs. (7) and
(8), writing

$$Wa = \begin{pmatrix} \Lambda s \\ t \end{pmatrix}, \tag{15}$$

when s contains some components that are independent of t.
4. However, in the AAM it is often the case where dim(Sa )<dim(St ) if the
dimensionalities of Sa and St are chosen to retain, say, 98% of the total
variations, which is reported by Cootes [2] and also observed by us. The
consequence is that some admissible texture configurations cannot be seen
in the appearance subspace because dim(Sa )<dim(St ), and therefore can-
not be reached by the AAM search. We consider this a flaw in the AAM’s
modeling of its appearance subspace.
From the above analysis we conclude that an ideal model should have
dim(Sa) = dim(St), and hence that s is completely linearly determined by t.
In other words, the shape should be linearly dependent on the texture, so that
dim(St ∪ Ss ) = dim(St ). The direct appearance model (DAM) is proposed mainly
for this purpose.
Another motivation of the DAM is memory consumption: the regression of Aa
with the AAM is very memory consuming. The AAM prediction needs to model
the linear relationship between appearance and the texture difference according to
Eq. (11). However, both δa and δT are high-dimensional vectors, and therefore the
storage size of training data generated for learning Eq. (11) increases very rapidly
as the dimensions increase. It is very difficult to train the AAM for Aa even with
a moderate number of images. Learning in a low-dimensional space will relieve
the burden.
At the optimal R, the variation of the cost must vanish:

$$\begin{aligned}
\delta C(R) &= \operatorname{trace}\{E([s-(R+\delta R)t][s-(R+\delta R)t]^{T})\} - \operatorname{trace}\{E([s-Rt][s-Rt]^{T})\} \\
&= \operatorname{trace}\{E[R t t^{T} \delta R^{T} + \delta R t t^{T} R^{T} - s t^{T} \delta R^{T} - \delta R t s^{T}]\} \\
&= \operatorname{trace}\{R E(t t^{T}) \delta R^{T} + \delta R E(t t^{T}) R^{T} - E(s t^{T}) \delta R^{T} - \delta R E(t s^{T})\} = 0 \tag{18}
\end{aligned}$$

for any δR → 0. Substituting δR by ε1_{i,j} for any (i, j), where ε → 0 and 1_{i,j}
is the matrix in which entry (i, j) is 1 and 0 elsewhere, we arrive at R E(tt^T) =
E(st^T), and hence obtain the optimal solution R = E(st^T)[E(tt^T)]^{-1}.
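The resulting closed-form mapping R = E(st^T)[E(tt^T)]^{-1} can be estimated directly from training parameter pairs, as in this illustrative sketch (Python; names are ours):

```python
import numpy as np

def learn_dam_mapping(S_params, T_params):
    """Estimate R = E(s t^T) [E(t t^T)]^{-1} from training parameter pairs.

    S_params: (N, ds) rows of shape parameters s;
    T_params: (N, dt) rows of texture parameters t.
    Returns R of shape (ds, dt). Illustrative names.
    """
    N = len(S_params)
    Est = S_params.T @ T_params / N      # sample estimate of E(s t^T)
    Ett = T_params.T @ T_params / N      # sample estimate of E(t t^T)
    return Est @ np.linalg.inv(Ett)
```

When the linearity assumption s = Rt + ε holds with small ε, as reported for the training sets below, this recovers the mapping essentially exactly.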
Instead of using δT directly as in the AAM search (cf. Eq. (12)), we use its
principal components, δT′, to predict the position displacement:

$$\delta p = R_{p}\, \delta T^{\prime}, \tag{22}$$

$$\delta T^{\prime} = H^{T} \delta T. \tag{23}$$
2. Get texture T_im from the current position, project it into the texture sub-
space S_t as t; reconstruct the texture T_rec, and compute the texture difference
δT_0 = T_im − T_rec and its energy E_0 = ‖δT_0‖²;

6. Compute the difference texture δT_0 using the new shape at the new position,
and its energy E_0 = ‖δT_0‖²;

9. Change κ to the next smaller number in {1.5, 0.5, 0.25, 0.125, . . .}; go to step 5;
Figure 2. Frontal, half-side, and full-side view faces and the labeled landmark points.
Reprinted with permission from SZ Li, SC Yan, HJ Zhang, QS Cheng. 2002. Multi-view
face alignment using direct appearance models. Proc 5th Int Conf Automatic Face Gesture
Recogn, pp. 309–314. Copyright 2002 © IEEE.
Figure 3. Texture and shape variations due to variations in the first three principal compo-
nents of the texture (the shapes change in accordance with s = Rt) for full-side (±1σ),
half-side (±2σ), and frontal (±3σ) views. Reprinted with permission from SZ Li, SC Yan,
HJ Zhang, QS Cheng. 2002. Multi-view face alignment using direct appearance models.
Proc 5th Int Conf Automatic Face Gesture Recogn, pp. 309–314. Copyright 2002 © IEEE.
A TC-ASM [17] imposes the linear relationship of the direct appearance model
(DAM) to improve the ASM search. The motivation is as follows. The ASM has
better accuracy in shape localization than the AAM when the initial shape is placed
close enough to the true shape, whereas the latter model incorporates information
about texture enclosed in the shape and hence yields lower texture reconstruction
error. However, the ASM makes use of constraints near the shape only, without
a global optimality criterion, and therefore the solution is sensitive to the initial
shape position. In the AAM, the solution-finding process is based on the linear
relationship between the variation of the position and the texture reconstruction
error. The reconstruction error, δT, is influenced very much by the illumination. Since δT
Figure 4. Initial alignment provided by a multi-view face detector. Reprinted with permis-
sion from SZ Li, SC Yan, HJ Zhang, QS Cheng. 2002. Multi-view face alignment using
direct appearance models. Proc 5th Int Conf Automatic Face Gesture Recogn, pp. 309–314.
Copyright 2002 © IEEE.
Figure 5. DAM aligned faces (from left to right) at the 0th, 5th, 10th, and 15th iterations, and
the original images for (top–bottom) frontal, half-side and full-side view faces. Reprinted
with permission from SZ Li, SC Yan, HJ Zhang, QS Cheng. 2002. Multi-view face
alignment using direct appearance models. Proc 5th Int Conf Automatic Face Gesture
Recogn, pp. 309–314. Copyright 2002 © IEEE.
Figure 6. Results of non-isometric (top of each of the three blocks) and isometric (bottom)
search for frontal (top block), half-side (middle block), and full-side (bottom block) view
faces. From left to right of each row are normal, and stretched faces. The number below
each result is the corresponding residual error. Reprinted with permission from SZ Li,
SC Yan, HJ Zhang, QS Cheng. 2002. Multi-view face alignment using direct appearance
models. Proc 5th Int Conf Automatic Face Gesture Recogn, pp. 309–314. Copyright
2002 © IEEE.
where Σt stands for the covariance matrix of the distribution, and st is linearly
determined by texture t. The linear mapping from t to st is:
st = Rt, (25)
Figure 7. Comparison of the manually labeled shape (middle row) and the shape (bottom
row) derived from the enclosed texture using the learned projection matrix: st = Rt. In
the top row are the original images. All the images are test data. Reprinted with permission
from SC Yan, C Liu, SZ Li, HJ Zhang, H Shum, QS Cheng. 2003. Face alignment using
texture-constrained active shape models. Image Vision Comput 21(1):69–75. Copyright
2003 © Elsevier.
Then

$$p(s \mid S_{lm}, s_{t}) \propto p(S_{lm} \mid s)\, p(s \mid s_{t}). \tag{30}$$

From Eqs. (4) and (27), after restoring the superscript of the iteration number, the
best shape obtained in step n is

$$s^{n} = (\Lambda^{-1} + \Sigma_{t}^{-1})^{-1} (\Lambda^{-1} s_{lm}^{n} + \Sigma_{t}^{-1} s_{t}^{n}). \tag{32}$$
This indicates that the best shape derived in each step is an interpolation between
the shape from the local appearance model and the texture-constrained shape. In
this sense, the TC-ASM could be regarded as a tradeoff between the ASM and
AAM methods.
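The interpolation of Eq. (32) is a precision-weighted average, which can be sketched as follows (illustrative Python; `lam_diag` and `sigma_t` stand for the diagonal of Λ and the covariance Σ_t, and the function name is ours):

```python
import numpy as np

def fuse_shapes(s_lm, s_t, lam_diag, sigma_t):
    """TC-ASM update of Eq. (32): a precision-weighted interpolation between
    the shape parameters from the local appearance model (s_lm) and the
    texture-constrained shape parameters (s_t).
    """
    lam_inv = np.diag(1.0 / np.asarray(lam_diag, dtype=float))
    sig_inv = np.linalg.inv(sigma_t)
    rhs = lam_inv @ s_lm + sig_inv @ s_t
    return np.linalg.solve(lam_inv + sig_inv, rhs)
```

When the two inputs agree, the update returns them unchanged, which is exactly the stopping condition discussed next.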
The stopping condition of the optimization is: if the shape from the local
appearance model and the texture-constrained shape are the same, i.e., the solution
generated by the ASM is verified by the AAM, the optimal solution must have been
reached. In practice, however, these two shapes will hardly turn out to be the
same. A threshold is introduced to evaluate the similarity, and sometimes the
convergence criterion in the ASM is used (if the above criterion has not been
satisfied for a long time). For higher efficiency and accuracy, a multi-resolution
pyramid method is adopted in the optimization process.
Figure 8. Four face instances of qualified (top) and unqualified (bottom) examples with
their warped images. Reprinted with permission from XS Huang, SZ Li, YS Wang. 2004.
Statistical learning of evaluation function for ASM/AAM image alignment. In Proceedings:
Biometric Authentication, ECCV 2004 International Workshop, BioAW 2004, Prague, Czech
Republic, May 15, 2004 (ECCV Workshop BioAW), pp. 45–56. Ed D Maltoni, AK Jain.
New York: Springer. Copyright 2004 © Springer.
In the real version of AdaBoost [25, 26], the weak classifiers can take a real
value, h_m(x) ∈ R, and have absorbed the coefficients needed in the discrete
version (h_m(x) ∈ {−1, +1} in the latter case). The class label for x is obtained
as H(x) = sign[H_M(x)], while the magnitude |H_M(x)| indicates the confidence.
Every training example is associated with a weight. During the learning process,
the weights are updated dynamically in such a way that more emphasis is placed
on hard examples that are erroneously classified previously. It has been noted in
recent studies [28, 29, 30] that the artificial operation of explicit re-weighting is
unnecessary and can be incorporated into a functional optimization procedure of
boosting.
An error occurs when H(x) ≠ y, or equivalently when yH_M(x) < 0. The “margin” of an
example (x, y) achieved by h(x) ∈ R on the training set is defined
as yh(x), which can be considered a measure of the confidence of h’s prediction.
An upper bound on the classification error achieved by H_M is given by the
following exponential loss function [31]:

$$J(H_{M}) = \sum_{i} e^{-y_{i} H_{M}(x_{i})} = \sum_{i} e^{-y_{i} \sum_{m=1}^{M} h_{m}(x_{i})}. \tag{34}$$
0. (Input)
   (1) Training examples {(x_1, y_1), . . . , (x_N, y_N)},
       where N = a + b; of which a examples have y_i = +1
       and b examples have y_i = −1;
   (2) The maximum number M_max of weak classifiers to be combined;
1. (Initialization)
   w_i^(0) = 1/(2a) for those examples with y_i = +1, or
   w_i^(0) = 1/(2b) for those examples with y_i = −1.
   M = 0;
2. (Forward Inclusion)
   while M < M_max
   (1) M ← M + 1;
   (2) Choose h_M according to Eq. (36);
   (3) Update w_i^(M) ← exp[−y_i H_M(x_i)], and normalize so that Σ_i w_i^(M) = 1;
3. (Output)
   H(x) = sign[Σ_{m=1}^{M} h_m(x)].
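To make the boxed procedure concrete, the following illustrative sketch implements the discrete variant with decision stumps as weak classifiers (Python; the exhaustive stump search merely stands in for the feature-based classifiers chosen by Eq. (36)):

```python
import numpy as np

def adaboost_train(X, y, n_rounds=10):
    """Discrete-AdaBoost sketch with decision stumps as weak classifiers.

    X: (N, d) features; y: (N,) labels in {-1, +1}.
    Returns a list of (feature, threshold, polarity, alpha) stumps.
    """
    a, b = np.sum(y == 1), np.sum(y == -1)
    # Step 1: asymmetric initialization, 1/(2a) for positives, 1/(2b) for negatives.
    w = np.where(y == 1, 1.0 / (2 * a), 1.0 / (2 * b))
    stumps = []
    for _ in range(n_rounds):
        best = None
        for j in range(X.shape[1]):                 # exhaustive stump search
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, pred)
        err, j, thr, pol, pred = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)       # confidence of this stump
        stumps.append((j, thr, pol, alpha))
        # Step 2.3: emphasize the examples this stump misclassifies.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
    return stumps

def adaboost_predict(stumps, X):
    """H(x) = sign of the alpha-weighted sum of stump votes."""
    score = np.zeros(X.shape[0])
    for j, thr, pol, alpha in stumps:
        score += alpha * np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
    return np.sign(score)
```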
where w^{(M−1)}(x, y) = exp(−y H_{M−1}(x)) is the weight for the labeled example
(x, y), and

$$P(y = +1 \mid x, w^{(M-1)}) = \frac{E\left( w(x, y) \cdot 1_{[y=+1]} \mid x \right)}{E\left( w(x, y) \mid x \right)}, \tag{37}$$

where E(·) stands for the mathematical expectation and 1_{[C]} is 1 if C is true and 0
otherwise. P(y = −1 | x, w^{(M−1)}) is defined similarly.
The AdaBoost algorithm based on the descriptions from [25, 26] is shown in
Figure 9. There, the re-weight formula in step 2.3 is equivalent to the multiplicative
rule in the original form of AdaBoost [32, 25]. In Section 6.3, we will present a
statistical model for stagewise approximation of P (y = +1|x, w(M −1) ).
where

$$L_{M}(x) = \frac{1}{2} \log \frac{p(x \mid y = +1, w)}{p(x \mid y = -1, w)}, \tag{39}$$

$$T = \frac{1}{2} \log \frac{P(y = +1)}{P(y = -1)}. \tag{40}$$
The log likelihood ratio (LLR), LM (x), is learned from the training examples of the
two classes. The threshold T is determined by the log ratio of prior probabilities.
In practice, T can be adjusted to balance between the detection and false alarm
rates (i.e., to choose a point on the ROC curve).
Learning optimal weak classifiers requires modeling the LLR of Eq. (39).
Estimating the likelihood for high-dimensional data x is a non-trivial task. In
this work, we make use of the stagewise characteristics of boosting, and derive
the likelihood p(x | y, w^{(M−1)}) based on an over-complete scalar feature set Z =
{z_1, . . . , z_K}. More specifically, we approximate p(x | y, w^{(M−1)}) by p(z_1, . . . ,
z_{M−1}, z | y, w^{(M−1)}), where z_m (m = 1, . . . , M − 1) are the features that have
already been selected from Z by the previous stages, and z is the feature to be
selected. The following describes the candidate feature set Z, and presents a
method for constructing weak classifiers based on these features.
Because a shape describes boundaries between regions, it makes sense to use
edge information (magnitude, orientation, or both) extracted from a grayscale
image. In this work, we use a simple Sobel filter for extracting the edge
information. Two filters are used: K_w for horizontal edges and K_h for vertical edges, as
follows:
1 0 −1 1 2 1
Kw (w, h) = 2 0 −2 and Kh (w, h) = 0 0 0 .
1 0 −1 −1 −2 −1
(41)
The convolution of the image with the two filter masks gives two edge strength
values:
Gw (w, h) = Kw ∗ I(w, h), (42)
Gh (w, h) = Kh ∗ I(w, h). (43)
The edge magnitude and direction are obtained as
Figure 10. The two types of simple Sobel-like filters defined on sub-windows. The rect-
angles are of size w × h and are at distances of (dw, dh) apart. Each feature takes a value
calculated by the weighted (±1, ±2) sum of the pixels in the rectangles. Reprinted with
permission from XS Huang, SZ Li, YS Wang. 2004. Statistical learning of evaluation func-
tion for ASM/AAM image alignment. In Proceedings: Biometric Authentication, ECCV
2004 International Workshop, BioAW 2004, Prague, Czech Republic, May 15, 2004 (ECCV
Workshop BioAW), pp. 45–56. Ed D Maltoni, AK Jain. New York: Springer. Copyright
2004 © Springer.
$$G(w, h) = \sqrt{G_{w}(w, h)^{2} + G_{h}(w, h)^{2}}, \tag{44}$$

$$\varphi(w, h) = \arctan\!\left( \frac{G_{h}(w, h)}{G_{w}(w, h)} \right). \tag{45}$$
The edge information based on the Sobel operator is sensitive to noise. To solve
this problem we use the sub-block of the image to convolve with the Sobel filter
(see Figure 10), which is similar to Haar-like feature calculation.
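The filtering of Eqs. (41)–(45) can be sketched in plain Python (illustrative code; we compute a cross-correlation, which for these kernels differs from true convolution only in sign):

```python
import numpy as np

# The two kernels of Eq. (41).
KW = np.array([[1, 0, -1],
               [2, 0, -2],
               [1, 0, -1]], dtype=float)
KH = np.array([[1, 2, 1],
               [0, 0, 0],
               [-1, -2, -1]], dtype=float)

def sobel_edge_maps(img):
    """Edge-strength maps of Eqs. (42)-(43) and the direction of Eq. (45).

    Plain-NumPy sketch over the 'valid' region of a 2D grayscale image.
    """
    H, W = img.shape
    Gw = np.zeros((H - 2, W - 2))
    Gh = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            patch = img[i:i + 3, j:j + 3]
            Gw[i, j] = np.sum(KW * patch)
            Gh[i, j] = np.sum(KH * patch)
    phi = np.arctan2(Gh, Gw)    # quadrant-safe form of Eq. (45)
    return Gw, Gh, phi
```

The sub-block variant of Figure 10 replaces the single-pixel taps by rectangle sums, which averages out noise in the same spirit as Haar-like features.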
Let

$$\Delta L_{M}(x) = L_{M}(x) - \sum_{m=1}^{M-1} \tilde{L}_{m}(x). \tag{49}$$
The best feature is the one whose corresponding L_k^{(M)}(x) best fits ΔL_M(x). It
can be found as the solution to the following minimization problem:

$$k^{*} = \arg\min_{k, \beta} \sum_{i=1}^{N} \left[ \Delta L_{M}(x_{i}) - \beta L_{k}^{(M)}(x_{i}) \right]^{2}. \tag{50}$$
This can be done in two steps as follows. First, find k^* for which

$$\left( L_{k}^{(M)}(x_{1}), L_{k}^{(M)}(x_{2}), \ldots, L_{k}^{(M)}(x_{N}) \right) \tag{51}$$

is most parallel to

$$\left( \Delta L_{M}(x_{1}), \Delta L_{M}(x_{2}), \ldots, \Delta L_{M}(x_{N}) \right). \tag{52}$$

Second, compute the scale

$$\beta^{*} = \frac{\sum_{i=1}^{N} \Delta L_{M}(x_{i})\, L_{k^{*}}(x_{i})}{\sum_{i=1}^{N} \left[ L_{k^{*}}(x_{i}) \right]^{2}}. \tag{53}$$
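The two-step selection of Eqs. (50)–(53) reduces to a cosine test followed by a scalar projection, as in this illustrative sketch (Python; names are ours):

```python
import numpy as np

def select_feature(delta_L, L_feats):
    """Pick the feature whose LLR response vector (Eq. (51)) is most
    parallel to the residual (Eq. (52)), then compute beta via Eq. (53).

    delta_L: (N,) residuals Delta L_M(x_i);
    L_feats: (K, N) array, one row of LLR responses per candidate feature.
    """
    # |cos| measures parallelism; the sign is absorbed into beta.
    norms = np.linalg.norm(L_feats, axis=1) * np.linalg.norm(delta_L)
    cos = np.abs(L_feats @ delta_L) / np.maximum(norms, 1e-12)
    k = int(np.argmax(cos))
    beta = (delta_L @ L_feats[k]) / np.sum(L_feats[k] ** 2)   # Eq. (53)
    return k, beta
```

This greedy, stagewise choice is what lets the scheme search a very large over-complete feature set without a joint optimization.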
7. EXPERIMENTAL RESULTS
7.1. DAM
7.1.1. Computation of Subspaces
A total of 80 images of size 128 × 128 are collected. Each image contains
a different face in an area of about 64 × 64 pixels. The image set is randomly
partitioned into a training set of 40 images and a test set of the other 40. Each
image is mirrored, and this doubles the total number of images in each set.
K = 72 face landmark points are labeled manually (see an example in
Figure 11). The shape subspace is k = 39 dimensional, which retains 98% of
the total shape variation. The mean shape contains a texture of L = 3186 pixels.
The texture subspace is 72 dimensional, as the result of retaining 98% of the total
texture variation. These are common to both the AAM and DAM.
For the AAM, an appearance subspace is constructed to combine both shape
and texture information. A concatenated shape and texture vector is 39 + 72
dimensional, where the weight parameter is calculated as r = 7.5 for Λ = rI in
Eq. (7). It is reduced to a 65-dimensional appearance subspace that retains 98%
of total variation of the concatenated features.
For the DAM, the linearity assumption made for the model, s = Rt + ε, of
Eq. (16) is well verified because all the elements in E(εεT ) calculated over the
training set are smaller than 10−5 .
Figure 11. A face image and the landmark points. Reprinted with permission from XW
Hou, SZ Li, HJ Zhang, QS Cheng. 2001. Direct appearance models. Proc IEEE Conf
Comput Vision Pattern Recogn 1:828–833. Copyright 2001 © IEEE.
The original texture difference δT, which is used in the AAM for predicting
position displacement, is 3186 dimensional; it is reduced to the 724-dimensional δT′,
which is used in the DAM for prediction, to retain 98% of the variation over the 1920
training examples.
The DAM requires much less memory during learning of the prediction matrix
R_p in Eq. (22) than the AAM does for learning A_a in Eq. (11). For the DAM, there are 80
training images, 4 parameters for the position (x, y, θ, scale), and 6 disturbances
for each parameter to generate training data for training Rp . So, the size of
training data for the DAM is 80 × 4 × 6 = 1920. For the AAM, there are 80
training images, 65 appearance parameters, and 4 disturbances for each parameter
to generate training data for training Aa . The size of the training data set for Aa
is 80 × 65 × 4 = 20800. Therefore, the size of the training data set for AAM’s
prediction matrices is 20800 + 1920 = 22720, which is 11.83 times that for the
DAM. On a PC, for example, the memory capacity for AAM training with 80
images would allow DAM training with 946 images.
Figure 12. Scenarios of DAM (top) and AAM (bottom) alignment. Reprinted with permis-
sion from XW Hou, SZ Li, HJ Zhang, QS Cheng. 2001. Direct appearance models. Proc
IEEE Conf Comput Vision Pattern Recogn 1:828–833. Copyright 2001 © IEEE.
Figure 13. The evolution of total δT for the DAM (top) and AAM (bottom) as a function
of iteration number for the training (left) and test (right) images. Reprinted with permission
from XW Hou, SZ Li, HJ Zhang, QS Cheng. 2001. Direct appearance models. Proc IEEE
Conf Comput Vision Pattern Recogn 1:828–833. Copyright 2001 © IEEE.
Some results of DAM learning and search have been presented in Figures
2–6. Figure 14 compares the convergence rate and accuracy of the DAM
and AAM (for the frontal view) in terms of the error in δT (cf. Eq. (10)) as the
algorithms iterate. The statistics are calculated from 80 images randomly selected
from the training set and 80 images from the test set. We can see that the DAM has
a faster convergence rate and a smaller error than the AAM. Figure 15 illustrates the
error of DAM for non-frontal faces. Figure 16 compares the alignment accuracy
of the DAM and AAM (for frontal faces) in terms of the percentage of images
whose texture reconstruction error δT is smaller than 0.2, where the statistics are
obtained using another test set including the 80 test images mentioned above and
an additional 20 other test images. It shows again that the DAM is more accurate
than the AAM.
The DAM search is fairly fast. It takes on average 39 ms per iteration for
frontal and half-side view faces, and 24 ms for full-side view faces in an image of
size 320 × 240 pixels. Every view model takes about 10 iterations to converge. If
3 view models are searched per face, as is done with image sequences from video,
the algorithm takes about 1 second to find the best face alignment.
SHAPE AND TEXTURE-BASED DEFORMABLE MODELS 121
Figure 14. Mean error (the curve) and standard deviation (the bars) in reconstructed texture
δT as a function of iteration number for DAM (left) and AAM (right) methods with the
training (top) and test (bottom) sets, for frontal face images. The horizontal dashed lines
in the lower part of the figures indicate average δT for the manually labeled alignment.
Reprinted with permission from SZ Li, SC Yan, HJ Zhang, QS Cheng. 2002. Multi-view
face alignment using direct appearance models. Proc 5th Int Conf Automatic Face Gesture
Recogn, pp. 309–314. Copyright 2002, © IEEE.
Figure 15. Mean error in δT and standard deviation of DAM alignment for half- (left)
and full- (right) side view face images from the test set. Note that the mean errors in the
calculated solutions are smaller than obtained using the manually labeled alignment after a
few iterations. Reprinted with permission from SZ Li, SC Yan, HJ Zhang, QS Cheng. 2002. Multi-view face alignment using direct appearance models. Proc 5th Int Conf Automatic Face Gesture Recogn, pp. 309–314. Copyright 2002, © IEEE.
122 STAN Z. LI et al.
Figure 16. Alignment accuracy of the DAM (dashed) and AAM (solid) in terms of local-
ization errors in the x (left) and y (right) directions. Reprinted with permission from SZ
Li, SC Yan, HJ Zhang, QS Cheng. 2002. Multi-view face alignment using direct appear-
ance models. Proc 5th Int Conf Automatic Face Gesture Recogn, pp. 309–314. Copyright 2002, © IEEE.
7.2. TC-ASM
A data set containing 700 face images with different illumination conditions and expressions was selected from the AR database [33] for our experiments. Each image is 512 × 512 pixels with 256 gray levels and contains a frontal-view face of about 200 × 200 pixels; 83 landmark points are manually labeled on each face. We randomly select 600 images for training and the remaining 100 for testing.
For comparison, the ASM and AAM are trained on the same data sets, in a three-level image pyramid (resolution halved from level to level), as with the TC-ASM. By means of PCA with 98% of the total variation retained, the dimension of the shape parameter vector in the ASM shape space is reduced to 88, and that of the texture parameter vector in the AAM texture space is reduced to 393. The concatenated vector of the shape and texture parameter vectors, with weighting parameter γ = 13.77, is reduced to 277 dimensions. Two types of experiments are presented: (1) comparison of the point-position accuracy, and (2) comparison of the texture reconstruction error. All experiments are performed in the three-level resolution image pyramid.
Figure 17. Accuracy of ASM, AAM, and TC-ASM. From upper to lower, left to right, are the results obtained with initial displacements of 10, 20, 30, and 40 pixels. Note that the value of the vertical coordinate is the percentage of examples that have a point-to-point distance smaller than the corresponding value of the horizontal coordinate. Reprinted with permission from SC Yan, C Liu, SZ Li, HJ Zhang, H Shum, QS Cheng. 2003. Face alignment using texture-constrained active shape models. Image Vision Comput 21(1):69–75. Copyright 2003, © Elsevier.
We compare the stability of the TC-ASM with the ASM in Figure 18. The value on the horizontal axis is the index number of the selected examples, whereas the value on the vertical axis is the average standard deviation of the results obtained from 10 different initializations that deviate from the ground truth by approximately 20 pixels. The results confirm that the TC-ASM is more stable with respect to initialization. An example is given in Figure 19.
Figure 18. Standard deviation in the results of each example for ASM (dotted) and TC-
ASM (solid) with the training set (left) and the test set (right). Reprinted with permission
from SC Yan, C Liu, SZ Li, HJ Zhang, H Shum, QS Cheng. 2003. Face alignment using
texture-constrained active shape models. Image Vision Comput 21(1):69–75. Copyright 2003, © Elsevier.
Figure 19. Stability of the ASM (middle column) and the TC-ASM (right column) in
shape localization. The different initialization conditions are shown in the left column.
Reprinted with permission from SC Yan, C Liu, SZ Li, HJ Zhang, H Shum, QS Cheng.
2003. Face alignment using texture-constrained active shape models. Image Vision Comput 21(1):69–75. Copyright 2003, © Elsevier.
Figure 20. Distribution of the texture reconstruction error with the ASM (dotted), the
AAM (square), and the TC-ASM (asterisk), with training data (left) and test data (right).
Reprinted with permission from SC Yan, C Liu, SZ Li, HJ Zhang, H Shum, QS Cheng.
2003. Face alignment using texture-constrained active shape models. Image Vision Comput 21(1):69–75. Copyright 2003, © Elsevier.
Figure 21. Sensitivities of the AAM (upper) and TC-ASM (lower) to an illumination condition not seen in the training data. From left to right are the results obtained at the 0th, 2nd, and 10th iterations. Results at different levels of the image pyramid are scaled back to the original scale. Reprinted with permission from SC Yan, C Liu, SZ Li, HJ Zhang, H Shum, QS Cheng. 2003. Face alignment using texture-constrained active shape models. Image Vision Comput 21(1):69–75. Copyright 2003, © Elsevier.
Figure 22. Scenarios of AAM (upper) and TC-ASM (lower) alignment, with texture reconstruction errors of 0.3405 and 0.1827, respectively. From left to right are the results obtained at the 0th, 5th, 10th, and 15th iterations, and the original image. Results at different levels of the image pyramid are scaled back to the original scale. Reprinted with permission from SC Yan, C Liu, SZ Li, HJ Zhang, H Shum, QS Cheng. 2003. Face alignment using texture-constrained active shape models. Image Vision Comput 21(1):69–75. Copyright 2003, © Elsevier.
Reprinted with permission from XS Huang, SZ Li, YS Wang. 2004. Statistical learning
of evaluation function for ASM/AAM image alignment. In Proceedings: Biometric
Authentication, ECCV 2004 International Workshop, BioAW 2004, Prague, Czech
Republic, May 15, 2004 (ECCV Workshop BioAW), pp. 45–56. Ed D Maltoni, AK
Jain. New York: Springer. Copyright 2004, © Springer.
Figure 23. ROC curve for the reconstruction error-based alignment evaluation for the
training set. Reprinted with permission from XS Huang, SZ Li, YS Wang. 2004. Statistical
learning of evaluation function for ASM/AAM image alignment. In Proceedings: Biometric Authentication, ECCV 2004 International Workshop, BioAW 2004, Prague, Czech Republic, May 15, 2004 (ECCV Workshop BioAW), pp. 45–56. Ed D Maltoni, AK Jain. New York: Springer. Copyright 2004, © Springer. See attached CD for color version.
Figure 25. Comparison between reconstruction error method and boost method. Reprinted
with permission from XS Huang, SZ Li, YS Wang. 2004. Statistical learning of evaluation
function for ASM/AAM image alignment. In Proceedings: Biometric Authentication,
ECCV 2004 International Workshop, BioAW 2004, Prague, Czech Republic, May 15, 2004
(ECCV Workshop BioAW), pp. 45–56. Ed D Maltoni, AK Jain. New York: Springer.
Copyright 2004, © Springer. See attached CD for color version.
8. CONCLUSION
9. ACKNOWLEDGMENTS
This work was supported by the following funding: National Science Founda-
tion of China Project #60518002, Chinese National 863 Program Projects
#2004AA1Z2290 and #2004AA119050.
10. NOTES
1. It is a deviation from the commonly used energy function with a squared Euclidean distance between Slm and the shape S ∈ R^{2K} derived from parameter s. It is more reasonable to take into account the prior distribution in the shape space.
11. REFERENCES
1. Cootes TF, Taylor CJ, Cooper DH, Graham J. 1995. Active shape models: their training and
application. Comput Vision Image Understand 61:38–59.
2. Cootes TF, Edwards GJ, Taylor CJ. 1998. Active appearance models. In Proceedings of the
European conference on computer vision, Vol. 2, pp. 484–498. Ed. H Burkhardt, B Neumann.
New York: Springer.
3. Edwards GJ, Cootes TF, Taylor CJ. 1998. Face recognition using active appearance models.
In Proceedings of the European conference on computer vision, Vol. 2, pp. 581–695. Ed. H
Burkhardt, B Neumann. New York: Springer.
4. Cootes TF, Taylor CJ. 2001. Statistical models of appearance for computer vi-
sion. Technical Report, Wolfson Image Analysis Unit, Manchester University,
www.isbe.man.ac.uk/simbim/refs.html.
5. Sclaroff S, Isidoro J. 1998. Active blobs. In Proc IEEE Int Conf Comput Vision, Bombay, India,
pp. 1146–1153.
6. Cootes TF, Walker KN, Taylor CJ. 2000. View-based active appearance models. In 4th Interna-
tional conference on automatic face and gesture recognition, Grenoble, France, pp. 227–232.
http://citeseer.ist.psu.edu/cootes00viewbased.html.
7. Psarrou A, Romdhani S, Gong S. 1999. Learning a single active face shape model across views. In
Proceedings of the IEEE international workshop on recognition, analysis, and tracking of faces
and gestures in real-time systems, Corfu, Greece, 26–27 September, pp. 31–38. Washington, DC:
IEEE Computer Society.
8. Cootes TF, Taylor CJ. 2001. Constrained active appearance models. Proc IEEE Int Conf Comput
Vision 1:748–754.
9. Blanz V, Vetter T. 1999. A morphable model for the synthesis of 3D faces. In Proc. Siggraph’99,
pp. 187–194. New York: ACM Press.
10. Li Y, Gong S, Liddell H. 2001. Constructing facial identity surfaces in a nonlinear discriminating space. Proc IEEE Comput Soc Conf: Computer Vision and Pattern Recognition 2:258–263.
11. Duta N, Jain AK, Dubuisson-Jolly M. 2001. Automatic construction of 2D shape models. IEEE
Trans Pattern Analy Machine Intell 23(5):433–446.
12. van Ginneken B, Frangi AF, Staal JJ, ter Haar Romeny BM, Viergever MA. 2001. A nonlinear
gray-level appearance model improves active shape model segmentation. In IEEE workshop on
mathematical models in biomedical image analysis, pp. 205–212. Ed. L Staib, A Rangarajan.
Washington, DC: IEEE Society Press.
13. Baker S, Matthews I. 2001. Equivalence and efficiency of image alignment algorithms. Proc IEEE
Conf Comput Vision Pattern Recognition 1:1090–1097.
14. Ahlberg J. 2001. Using the active appearance algorithm for face and facial feature tracking. In
IEEE ICCV workshop on recognition, analysis and tracking of faces and gestures in real-time
systems, Vancouver, Canada, July 13, 2001, pp. 68–72. Washington, DC: IEEE.
15. Hou XW, Li SZ, Zhang HJ, Cheng QS. 2001. Direct appearance models. Proc IEEE Conf Comput
Vision Pattern Recognition 1:828–833.
16. Li SZ, Yan SC, Zhang HJ, Cheng QS. 2002. Multi-view face alignment using direct appearance
models. Proc 5th Int Conf Automatic Face Gesture Recogn, Washington, DC, 20–21 May 2002,
pp. 309–314. Washington, DC: IEEE.
17. Yan SC, Liu C, Li SZ, Zhang HJ, Shum H, Cheng QS. 2003. Face alignment using texture-
constrained active shape models. Image Vision Comput 21(1):69–75.
18. Huang XS, Li SZ, Wang YS. 2004. Statistical learning of evaluation function for ASM/AAM image alignment. In Proceedings: Biometric Authentication, ECCV 2004 International Workshop,
BioAW 2004, Prague, Czech Republic, May 15, 2004 (ECCV Workshop BioAW), pp. 45–56. Ed
D Maltoni, AK Jain. New York: Springer.
19. Cootes TF, Edwards GJ, Taylor CJ. 1999. Comparing active shape models with active appearance
models. In 10th British Machine Vison Conference, Vol. 1, pp. 173–182. Ed. T Pridmore, D Elli-
man. Nottingham, UK: BMVA Press. http://citeseer.ist.psu.edu/article/cootes99comparing.html.
20. Basso C, Romdhani S, Blanz V, Vetter T. 2005. Morphable models of faces. In Handbook of face
recognition, pp. 217–245. Ed S Li, A Jain. New York: Springer.
21. Blanz V, Vetter T. 2003. Face recognition based on fitting a 3D morphable model. IEEE Trans
Pattern Anal Machine Intell 25(9):1063–1074.
22. Jones MJ, Poggio T. 1998. Multidimensional morphable models. Proc IEEE Int Conf Comput
Vision, pp. 683–688. http://citeseer.ist.psu.edu/jones98multidimensional.html.
23. Bergen JR, Hingorani R. 1990. Hierarchical motion-based frame rate conversion. Technical
Report, David Sarnoff Research Center, Princeton, NJ.
24. Li S, Zhang ZQ, Zhu L, Zhang HJ. 2002. Real-time multi-view face detection. In Proc 5th Int Conf
Automatic Face Gesture Recogn, Washington, DC, 20–21 May 2002, pp. 149–154. Washington,
DC: IEEE.
25. Schapire RE, Singer Y. 1999. Improved boosting algorithms using confidence-rated predictions.
Machine Learning 37(3):297–299.
26. Friedman J, Hastie T, Tibshirani R. 2000. Additive logistic regression: a statistical view of
boosting. Ann Stat 28(2):337–374.
27. Viola P, Jones M. 2001. Robust real-time object detection. Int J Comput Vision. To appear.
28. Friedman JH. 2001. Greedy function approximation: a gradient boosting machine. Ann Stat,
29(5):1189–1232.
29. Mason L, Baxter J, Bartlett PL, Frean M. 1999. Functional gradient techniques for combining
hypotheses. In Advances in large margin classifiers, pp. 221–247. Ed. AJ Smola, PL Bartlett, B
Schölkopf, D Schuurmans. Cambridge: MIT Press.
30. Zemel RS, Pitassi T. 2001. A gradient-based boosting algorithm for regression problems. In
Advances in neural information processing systems, Vol. 13. Ed. TK Leen, TG Dietterich, V
Tresp. Cambridge: MIT Press. http://citeseer.ist.psu.edu/zemel01gradientbased.html.
31. Schapire RE, Freund Y, Bartlett P, Lee WS. 1998. Boosting the margin: a new explanation for
the effectiveness of voting methods. Ann Stat 26(5):1651–1686.
32. Freund Y, Schapire RE. 1997. A decision-theoretic generalization of on-line learning and an
application to boosting. J Comput Syst Sci 55(1):119–139.
33. Martinez AM, Benavente R. 1998. The AR face database. Computer Vision Center Technical
Report No. 24. Barcelona, Spain.
5
Ricardo J. Ferrari
University of Calgary, Canada
University of São Paulo, Brazil
Rangaraj M. Rangayyan
University of Calgary, Canada
Annie F. Frère
University of Mogi das Cruzes, Brazil
University of São Paulo, Brazil
Rejane A. Borges
University of Mogi das Cruzes, São Paulo, Brazil
Address all correspondence to: Ricardo J. Ferrari, 2 Forest Laneway, Suite 1901, Toronto, Ontario, M2N 5X9, Canada. Phone: (416) 987-7528 (home). [email protected]. Reproduced (with modifications) with permission from RJ Ferrari, RM Rangayyan, JEL Desautels, AF Frère. 2004. Identification of the breast boundary in mammograms using active contour models. Med Biol Eng Comput 42(2):201–208. Copyright 2004, © MBEC.
134 RICARDO J. FERRARI et al.
1. INTRODUCTION
In the initial stage of our investigation [10], we used the traditional active
deformable contour model (or snake, [11]) for detection of the breast boundary.
The method, summarized in Figure 1, is composed of six main stages [10]:
Stage 1: The image contrast is enhanced by using a simple logarithmic operation [12]. This contrast-correction step applies the logarithmic operation to the original image I(x, y); G(x, y) denotes the transformed image. This dynamic-range compression operation, although applied to the whole image, significantly enhances the contrast of the regions near the breast boundary in mammograms, which are characterized by low density and poor definition of details [2, 3]. The rationale behind applying this procedure to the image is to obtain an approximate breast contour as close as possible to the true breast boundary. The effect of this procedure can be seen by comparing the original and enhanced images in Figures 2(a) and 2(b).
Stage 2: A binarization procedure using the Lloyd-Max algorithm is applied to the image [14]. The Lloyd-Max least-squares algorithm is an iterative and fast technique (convergence was reached in an average of three or four cycles in the present work) for the design of a quantizer with low distortion. It uses the intensity distribution (histogram) of the image to optimize, in the sense of a mean-squared-error criterion, the quantization procedure applied to the image, checking each possible N-level quantizer (N = 2 for binarization purposes) to determine the quantizer that provides the lowest distortion. The distortion measure is given by

    ε = Σ_{j=1}^{N} Σ_{x=aj}^{bj} (x − yj)² f(x),    (2)

where [aj, bj] is the jth quantization interval, yj is its reconstruction level, and f(x) is the normalized histogram.
Figure 1. Flowchart of the procedures for identification of the skin–air boundary of the
breast.
Figure 2. Results of each stage of Method 1 for identification of the breast boundary. (a)
Original image mdb042 from the Mini-MIAS database [13]. (b) Image after the logarith-
mic operation. (c)–(d) Binary image before and after applying the binary morphological
opening operator. (e) Control points N1 to N4 (automatically determined) used to limit the
breast boundary. (f) Normal lines computed from each pixel in the skin–air boundary. (g)
Boundary resulting after histogram-based analysis of the normal lines. (h) Final boundary.
DETECTION OF BREAST CONTOUR IN MAMMOGRAMS 139
Algorithm 1: Algorithm for the Lloyd-Max method [14] used for binarization of mammograms.

//
// Compute probability density function (pdf) as the
// normalized image-intensity histogram and determine the
// Min and Max gray-level values of the image
//
Pdf = image->computePdf();
Min = image->getMinGrayLevel();
Max = image->getMaxGrayLevel();
//
// Initialize quantization levels with Min and Max values
//
L[0] = Min;
L[1] = Max;
//
// Variable used for the distortion measure (MSE)
//
NewDistortion = 0;
do {
    OldDistortion = NewDistortion;
    //
    // Compute the threshold value y
    //
    y = round((L[0] + L[1]) / 2);
    //
    // Distortion of the current two-level quantizer: each gray level
    // contributes its squared distance to its quantization level,
    // weighted by its probability
    //
    NewDistortion = 0;
    for (x = Min; x <= Max; x++) {
        level = (x < y) ? L[0] : L[1];
        NewDistortion += sqr(x - level) * Pdf[x - Min];
    }
    //
    // Two-cycle loop for binarization: re-center each quantization
    // level on the centroid of its cell, [Min, y) and [y, Max]
    //
    for (i = 0; i < 2; i++) {
        lo = (i == 0) ? Min : y;
        hi = (i == 0) ? y : Max + 1;
        sum1 = sum2 = 0;
        for (x = lo; x < hi; x++) {
            sum1 += x * Pdf[x - Min];
            sum2 += Pdf[x - Min];
        }
        L[i] = round(sum1 / sum2);
    }
} while (fabs(NewDistortion - OldDistortion) > EPSILON);
Figure 3. (a) Profile of a sample normal line used to determine an approximate skin–air
boundary. The symbol “×” indicates the skin–air intersection determined in Stage 5. (b)
Histogram computed from (a).
where α and β are weighting parameters that control, respectively, the tension and rigidity of the snake; v′(s) and v″(s) denote the first and second derivatives of v(s) with respect to s, where v(s) indicates the continuous representation of the contour. The external energy function Eext[v(s)] is derived from the image I(x, y), and is defined in this work as

    Eext(x, y) = −|∇I(x, y)|²,    (5)

where ∇ is the gradient operator. In the present work, the values α = 0.001 and β = 0.09 were experimentally derived based upon the approximate boundary obtained in the previous stage, the quality of the external force derived from the original image, and the final contours obtained.
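On a discrete image, the external energy of Eq. (5) is just the negated squared gradient magnitude; a minimal sketch using central differences follows (the function name and row-major image layout are illustrative assumptions, not the chapter's implementation):

```cpp
#include <cassert>
#include <vector>

// External snake energy of Eq. (5): Eext(x, y) = -|grad I(x, y)|^2,
// with the gradient estimated by central differences. The image is a
// row-major W x H array of doubles; indices are clamped at the borders.
double externalEnergy(const std::vector<double>& img, int W, int H, int x, int y) {
    auto at = [&](int i, int j) {
        if (i < 0) i = 0;
        if (i >= W) i = W - 1;
        if (j < 0) j = 0;
        if (j >= H) j = H - 1;
        return img[j * W + i];
    };
    double gx = (at(x + 1, y) - at(x - 1, y)) / 2.0;
    double gy = (at(x, y + 1) - at(x, y - 1)) / 2.0;
    return -(gx * gx + gy * gy);  // strong edges give large negative energy
}
```

Since strong edges yield large negative values, energy minimization pulls the contour toward the skin–air boundary.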
Figure 2(h) illustrates the result of application of the active contour model to
the contour estimate shown in Figure 2(g). The global nature of the active contour
model has removed the local irregularities present in the contour near the “R–ML”
label in Figure 2(g).
Sixty-six images from the Mammographic Image Analysis Society (Mini-MIAS) database [13] were used to assess the performance of the method. The results were subjectively analyzed by an expert radiologist (JELD). In the opinion of the radiologist, the method accurately detected the breast boundary in 50 images, and reasonably well in 11 images. In five images the method failed completely (the result was considered not acceptable for CAD purposes) because of distortions and artifacts present near the breast boundary (see Figure 4). Limitations of the method exist mainly in Stages 5 and 6. Although Stage 5 helps in obtaining a good breast contour, it is time consuming and may impose a limitation on practical applications. The traditional snake model used in Stage 6 is not robust (the method is not locally adaptive) in the presence of noise and artifacts, and has a short range of edge capture.
In the improved method (Method 2) described in the following section, we
replace the traditional snake algorithm with an adaptive active deformable contour
model (AADCM) specially designed for the present application. The algorithm
includes a balloon force in an energy formulation that minimizes the influence of
the initial contour on the convergence of the algorithm. In this energy formulation,
the external energy is also designed to be locally adaptive. In formulating the
AADCM, we removed Stage 5 of Method 1; see Figure 1. A pseudo-code of the
snake algorithm used in Method 2 is presented in Algorithm 2.
    Etotal = Σ_{i=1}^{N} [α Einternal(vi) + β Eexternal(vi)],    (6)
// Compute the average distance among the snake nodes,
// used for the normalization of the continuity energy
averageDist = getAvgDistance();
// Compute the center of gravity used for the balloon energy
computeCG();
// Create a Snake object from the approximated contour
// represented by a linked list of type Point{x, y}
MySnake = Snake(ApproximatedContour);
// Loop over stages: at each stage the blur
// applied to the original image is reduced
do {
    // Counter variable for the number of adjusted snake nodes
    movedNodes = 0;
    // Count number of complete cycles through the snake nodes
    countIter = 0;
    // Create two auxiliary snake-node pointers to handle the
    // neighbors of the current node on the closed contour
    SnakeNode *nextNode = NULL, *prevNode = NULL;
    // Wrap around at the ends of the node list
    if (MySnake.current == start) {
        prevNode = (SnakeNode *) end;
        nextNode = (SnakeNode *) MySnake.current->next;
    } else {
        if (MySnake.current == end) {
            prevNode = (SnakeNode *) MySnake.current->prev;
            nextNode = (SnakeNode *) start;
        } else {
            prevNode = (SnakeNode *) MySnake.current->prev;
            nextNode = (SnakeNode *) MySnake.current->next;
        }
    }
    // Get coordinates and energy coefficients for
    // the current snake node
    curX = MySnake.current->data->GetXCoord();
    curY = MySnake.current->data->GetYCoord();
    alpha = MySnake.current->data->GetAlpha();
    beta = MySnake.current->data->GetBeta();
    gamma = MySnake.current->data->GetGamma();
    // Compute the energy terms for all possible positions
    // in the neighborhood
    for (row = 0; row < NEIGHBOR_HEIGHT; row++)
        for (col = 0; col < NEIGHBOR_WIDTH; col++) {
            //
            // Compute indexes of the neighborhood
            //
            nIndex = row * NEIGHBOR_WIDTH + col;
            nX = curX + (col - (NEIGHBOR_WIDTH - 1) / 2);
            nY = curY + (row - (NEIGHBOR_HEIGHT - 1) / 2);
            // Avoid neighborhood limits being out of range
            CheckAndCorrectLimits(nX, nY);
            // Get image gradient value at the position (nX, nY)
            Igrad[nIndex] = image->GetGradient(nX, nY);
            // Compute and store all energies at the position (nX, nY)
            // See Eqs. (8), (9), and (10)
            e_cont[nIndex] = Continuity(prevNode, nX, nY, nextNode);
            e_grad[nIndex] = Gradient(prevNode, nX, nY, nextNode);
            e_ball[nIndex] = Balloon(prevNode, nX, nY, nextNode);
        }
    // Normalize the energies to be in the range [0, 1]
    // See Eqs. (11), (12), and (13)
    normContinuity(e_cont, NEIGHBOR_WIDTH * NEIGHBOR_HEIGHT);
    normGradient(e_grad, NEIGHBOR_WIDTH * NEIGHBOR_HEIGHT);
    normBalloon(e_ball, Igrad, NEIGHBOR_WIDTH * NEIGHBOR_HEIGHT);
    // Start with an insanely high upper bound
    minEnergy = MAX_ENERGY;
    // Now find the minimum-energy location in the neighborhood
    for (int row = 0; row < NEIGHBOR_HEIGHT; row++)
        for (int col = 0; col < NEIGHBOR_WIDTH; col++) {
            //
            // Compute index and position in the neighborhood
            //
            nIndex = row * NEIGHBOR_WIDTH + col;
            nX = curX + (col - (NEIGHBOR_WIDTH - 1) / 2);
            nY = curY + (row - (NEIGHBOR_HEIGHT - 1) / 2);
            // Check neighborhood limits
            CheckAndCorrectLimits(nX, nY);
            // Compute energy value of a node at the (nX, nY) position
            energy = alpha * e_cont[nIndex] + beta * e_grad[nIndex]
                   + gamma * e_ball[nIndex];
            // Save position and energy value of the
            // point with minimum energy in the neighborhood
            if (energy < minEnergy) {
                minEnergy = energy;
                EminX = nX;
                EminY = nY;
            }
        }
        // Move to the next node of the snake
        MySnake.moveToNext();
    }
    // Update average distance among the snake nodes
    // and center of gravity of the snake
    averageDist = getAvgDistance();
    computeCG();
    // Selectively relax the curvature term for a particular
    // snake node if the node satisfies the following conditions:
    // - node must have higher curvature than its neighboring nodes
    // - node must have a curvature above the threshold value 0.25
    AllowCorners();
} while (stageCount <= no_stages);
Figure 5. (a) Approximate breast contour obtained from Stage 4 of Method 1, as described
in Section 2, for image mdb042. (b) Sampled breast contour used as the input to the
AADCM.
where α and β are weighting parameters that control the internal and external
energies, Einternal and Eexternal , respectively, at each point vi .
The internal energy is composed of two terms as

    Einternal(vi) = a Econtinuity(vi) + b Eballoon(vi).    (7)

The continuity component ensures a stable shape for the contour and tries to keep constant the distance between the points in the contour. The weighting parameters a and b were initially set to unity (a = b = 1) in this work, since the initial contours present smooth shapes and are close to the true boundary in most cases.
For each element (j, k) in a neighborhood of 7 × 7 pixels of vi, element ecjk(vi) of Econtinuity is computed as

    ecjk(vi) = (1 / l(vi)) |pjk(vi) − ρ(vi−1 + vi+1)|²,    (8)

where l(vi) = (1/N) Σ_{i=1}^{N} |vi+1 − vi|² is the normalization factor that makes the continuity energy independent of the size, location, and orientation of V; pjk(vi) is the point in the image at position (j, k) in the 7 × 7 neighborhood of vi; and ρ = 1/[2 cos(2π/N)] is a constant factor to keep the location of the minimum energy lying on the circle connecting vi−1 and vi+1; the centroid of the contour forms the reference point or origin for this factor, as illustrated in Figure 6. It should be noted that, in the present work, the methods described are applied to closed contours.
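The role of ρ can be checked numerically: on a regular N-gon taken relative to the contour centroid, ρ(vi−1 + vi+1) reproduces vi exactly, so the continuity energy vanishes at the current node position. A small sketch (names are illustrative, not the chapter's code):

```cpp
#include <cassert>
#include <cmath>
#include <complex>

using Pt = std::complex<double>;  // point (x, y) encoded as x + iy

// Continuity energy element of Eq. (8) for a candidate position p of node v_i,
// with all positions taken relative to the contour centroid. l is the
// normalization factor (mean squared node spacing over the contour).
double continuityEnergy(const Pt& p, const Pt& prev, const Pt& next, int N, double l) {
    const double PI = std::acos(-1.0);
    double rho = 1.0 / (2.0 * std::cos(2.0 * PI / N));  // minimum lies on the
    return std::norm(p - rho * (prev + next)) / l;      // circle through prev/next
}
```

Because the minimum sits on the circle through vi−1 and vi+1 rather than on their midpoint, this term resists shrinking the contour, unlike the classical midpoint formulation.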
The balloon force is used to force the expansion of the initial contour toward the breast boundary. In this work, the balloon force was made adaptive to the magnitude of the image gradient, causing the contour to expand faster in homogeneous regions and slower near the breast boundary. The balloon energy term, ebjk(vi), is defined as

    ebjk(vi) = ni · {vi − pjk(vi)},    (9)

where ni is the outward unit normal vector of V at point vi, and the symbol · indicates the dot product; ni is computed by rotating by 90° the tangent vector at the point vi,

    ti = (vi − vi−1)/|vi − vi−1| + (vi+1 − vi)/|vi+1 − vi|.
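The outward normal and the balloon term of Eq. (9) can be sketched as follows; the rotation direction assumes a counter-clockwise contour, and the struct and function names are illustrative:

```cpp
#include <cassert>
#include <cmath>

struct Vec { double x, y; };

static Vec unitVec(Vec v) {
    double n = std::sqrt(v.x * v.x + v.y * v.y);
    return Vec{v.x / n, v.y / n};
}

// Balloon energy element of Eq. (9): eb = n_i . (v_i - p), where n_i is the
// outward unit normal at v_i, obtained by rotating the tangent t_i
// (sum of the normalized adjacent segment directions) by 90 degrees.
double balloonEnergy(Vec prev, Vec cur, Vec next, Vec p) {
    Vec a = unitVec(Vec{cur.x - prev.x, cur.y - prev.y});
    Vec b = unitVec(Vec{next.x - cur.x, next.y - cur.y});
    Vec t = unitVec(Vec{a.x + b.x, a.y + b.y});  // tangent direction at v_i
    Vec n = Vec{t.y, -t.x};  // 90-degree rotation: outward for a CCW contour
    return n.x * (cur.x - p.x) + n.y * (cur.y - p.y);
}
```

Candidate positions outside the contour receive negative energy (favored by the minimization, hence expansion), while positions inside are penalized.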
The external energy is based upon the magnitude and direction of the image gradient and is intended to attract the contour to the breast boundary. It is defined as

    eejk(vi) = −ni · ∇I{pjk(vi)},    (10)

where ∇I{pjk(vi)} is the image gradient vector at (j, k) in the 7 × 7 neighborhood of vi. The direction of the image gradient is used to avoid attraction of the contour by edges that may be located near the true breast boundary, such as identification marks and small artifacts; see Figure 7(a) and (b). In this situation, the gradient direction at position (j, k) on an edge near the breast boundary and the direction of
Figure 6. Characteristics of the continuity energy component in the adaptive active de-
formable contour model. The contour pixel, vi , is moved to position v'i by application of
the continuity requirement. Adapted with permission from B Mackiewich, JEL Desautels,
RA Borges, AF Frère. 2004. Intracranial boundary detection and radio frequency correction
in magnetic resonance images. Med Biol Eng Comput 42(2):201–208. Copyright 2004, © The Institution of Engineering and Technology.
the unit normal of the contour will have opposite signs, which makes the energy functional take a large value at (j, k).
Figure 7. Application of the gradient direction information for avoiding attraction of the
boundary to objects near the true boundary. (a) Breast boundary detected automatically,
superimposed on the original image (mdb006) from the Mini-MIAS database. (b) Details
of the detected breast boundary close to the image identification marker (corresponding to
the boxed region in the original image).
Here, emin and emax indicate the minimum and maximum of the corresponding
energy component in the 7 × 7 neighborhood of vi . ∇Imax is the maximum
gradient magnitude in the entire image. The minimization procedure is performed
based on the energy components computed in a local region defined by the 7 × 7
neighborhoods of three adjacent contour pixels. At a given step, only one contour
pixel vi is modified.
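The normalization described above maps each energy component onto [0, 1] over the 7 × 7 neighborhood; a minimal sketch of the min–max form is shown below (illustrative only — per the chapter, the balloon normalization additionally involves ∇Imax, which is omitted here):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Min-max normalization of an energy component over a neighborhood,
// mapping each value onto [0, 1] as (e - emin) / (emax - emin).
void normalizeEnergy(std::vector<double>& e) {
    auto mm = std::minmax_element(e.begin(), e.end());
    double emin = *mm.first, emax = *mm.second;
    if (emax > emin) {
        for (double& v : e) v = (v - emin) / (emax - emin);
    } else {
        for (double& v : e) v = 0.0;  // constant component: no preference
    }
}
```

Normalizing each component to the same range keeps the weights α, β, and γ meaningful when the three energy terms are summed.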
In the present work, the greedy algorithm proposed by Williams and Shah [17] was used to perform minimization of the energy functional in Eq. (6). Although this algorithm has the drawback of not guaranteeing a global-minimum solution, it is faster than other methods proposed in the literature, such as dynamic programming, variational calculus, and finite elements. It also allows insertion of hard constraints, such as curvature evaluation, as discussed below.
Convergence of the AADCM is achieved in two stages by smoothing the
original image with two different Gaussian kernels defined with σx = σy = 3
pixels and σx = σy = 1.5 pixels. The first stage of smoothing introduces a
large amount of blur in the image, and helps the contour to expand faster since
low-contrast and small details will have been removed. In the second stage, the
amount of blurring introduced is smaller, and the active contour is more sensitive to
small variations; note that the second-stage filter is applied to the original image.
At each stage, the iterative process is stopped when the total energy of the contour
increases between consecutive iterations. This coarse-to-fine representation is
intended to give more stability to the contour.
In order to allow the deformable contour to adjust to corner regions, such as
the upper-right limit of the breast boundary, a constraint was inserted at the end of
each iteration to relax the continuity term defined in Eq. (8). The curvature value
C(vi ) at each point vi of the contour was computed as
C(vi) = [2 sin(θ/2)]² = ‖ ui/‖ui‖ − ui−1/‖ui−1‖ ‖², (14)
Figure 8. Illustration of the vectors and external angle used in Eq. (14) for calculation of
curvature.
to differentiate between corners and curved lines. Figures 9(b) and (c) illustrate
an example with and without the use of the curvature constraint to correct corner
effects.
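The curvature measure in Eq. (14) can be evaluated directly from the contour's difference vectors. A short sketch, assuming a closed contour with ui = vi+1 − vi:

```python
import numpy as np

def curvature(contour):
    """Curvature per Eq. (14): C(v_i) = ||u_i/|u_i| - u_{i-1}/|u_{i-1}|||^2,
    with u_i = v_{i+1} - v_i and wrap-around indexing (closed contour).
    C equals [2 sin(theta/2)]^2 for external angle theta: 0 on straight
    runs, 2 at a right-angle corner, up to 4 when the contour folds back.
    """
    v = np.asarray(contour, dtype=float)
    u = np.roll(v, -1, axis=0) - v                        # u_i = v_{i+1} - v_i
    u_hat = u / np.linalg.norm(u, axis=1, keepdims=True)  # unit tangent vectors
    diff = u_hat - np.roll(u_hat, 1, axis=0)              # u_i/|u_i| - u_{i-1}/|u_{i-1}|
    return np.sum(diff ** 2, axis=1)
```

Thresholding this value is what separates genuine corners, such as the upper-right limit of the breast boundary, from smoothly curved segments.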
The weighting parameters, α and β, in Eq. (6) were initialized to 0.2 and
1.0, respectively, for each contour element. This set of weights was selected
experimentally by using a group of 20 images randomly selected from the Mini-
MIAS database [13], not including any image in the test set used in the present
work to evaluate the results. A larger weight was given to the gradient energy
to favor contour deformation toward the breast boundary rather than smoothing
due to the internal force. Although these parameters were derived based upon
experiments, they have proven to be robust when applied to other images from the
Mini-MIAS database.
3.2. Database
Eighty-four images, randomly chosen from the Mini-MIAS database [13],
were used in this work. All images are medio-lateral oblique (MLO) views with
a 200-µm sampling interval and 8-bit gray-level quantization. For reduction of
processing time, all images were downsampled with a fixed sampling distance
so that the original images corresponding to a matrix size of 1024 × 1024 pixels
were transformed to 256 × 256 pixels. All results obtained with the downsampled
images were mapped to the original mammograms for subsequent analysis and
display.
Figure 9. Example of the constraint used in the active contour model to correct smoothing
effects at corners. (a) Original image; the box indicates the region of concern; (b,c) show the
details of the breast contour, with and without the use of the constraint for corner correction,
respectively.
154 RICARDO J. FERRARI et al.
The results obtained from the segmentation procedure were analyzed accord-
ing to the protocol described in Section 5. A total of 84 images were analyzed;
three examples of the results are illustrated in Figures 10, 11, and 12. It is worth
noting from the example in Figure 11 that the method performs well even when
the contour is interrupted by the image boundary; this is because the control points
of the active contour model are kept fixed when they are on the boundary of the
image. A few more examples of the segmented results are shown in Figure 13.
The FP and FN average percentages and the corresponding standard deviation
values obtained for the 84 images are 0.41±0.25% and 0.58±0.67%, respectively.
Thirty-three images presented both FP and FN percentages less than 0.5%; 38
images presented FP and FN percentages between 0.5 and 1%; the FP and FN
percentages were greater than 1% for 13 images.
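FP and FN pixel percentages of this kind can be computed from binary masks of the automatically segmented and hand-drawn breast regions. The sketch below assumes both percentages are taken relative to the hand-drawn reference region; the exact evaluation protocol is the one described in Section 5.

```python
import numpy as np

def fp_fn_percentages(auto_mask, manual_mask):
    """FP/FN pixel percentages between an automatically segmented and a
    hand-drawn breast region (illustrative sketch; normalization by the
    reference region is an assumption)."""
    auto = np.asarray(auto_mask, dtype=bool)
    manual = np.asarray(manual_mask, dtype=bool)
    ref = manual.sum()
    fp = np.logical_and(auto, ~manual).sum() / ref * 100.0   # false positives
    fn = np.logical_and(~auto, manual).sum() / ref * 100.0   # false negatives
    return fp, fn
```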
Figure 10. Results obtained for image mdb003. (a) Original image. (b) Hand-drawn
boundary, superimposed on the histogram-equalized image. (c) Breast boundary detected
automatically, superimposed on the original image.
Figure 11. Results obtained for image mdb008. (a) Original image. (b) Hand-drawn
boundary, superimposed on the histogram-equalized image. (c) Breast boundary detected
automatically, superimposed on the original image.
Figure 12. Results obtained for image mdb114. (a) Original image. (b) Hand-drawn
boundary, superimposed on the histogram-equalized image. (c) Breast boundary detected
automatically, superimposed on the original image.
The most common cause of FN pixels was related to the non-detection of the
nipple region [see the example in Figure 10(c)]. By removing Stage 5, used to
approximate the initial contour to the true breast boundary (which is used only in
Method 1 described in Section 2), we decreased the average time for processing
an image from 2.0 to 0.18 min. However, the number of images where the nipple
was not identified increased [see, e.g., Figures 10(b) and (c)].
The main cause of FP pixels was associated with smoothing of the limits of the
breast boundary. Figure 14 presents a case where both the FP and FN percentages
are greater than 1%. In the case of image mdb068, the automatically obtained
initial contour was attracted to a high-density region inside the breast, instead of
growing outward in the direction of the true breast boundary.
Due to the use of the gradient orientation in our active contour model, the
method has shown good results in cases where small artifacts are present near the
breast boundary (compare Figures 4 and 7). The parameters of our active contour
model, although selected experimentally, have proven to be robust, as indicated
by the results obtained.
The processing time to perform identification of the breast boundary in 256 ×
256 images is about 0.1 min on average, using an 850-MHz computer with 512
MB of memory. (Processing a 1024 × 1024 image took, on average, about 3 min.)
5. CONCLUSIONS
6. ACKNOWLEDGEMENTS
Figure 14. Image mdb068, presenting problems in segmentation of the breast boundary.
(a) The hand-drawn boundary, superimposed on the histogram-equalized image. (b) Auto-
matically detected boundary superimposed on the original image. The active contour has
been attracted to a high-density region in the breast. Such problems may be solved by
equalizing the image contrast before application of the method.
7. REFERENCES
1. Lou SL, Lin HD, Lin KP, Hoogstrate D. 2000. Automatic breast region extraction from digital
mammograms for PACS and telemammography applications. Comput Med Imaging Graphics
24:205–220.
2. Bick U, Giger ML, Schmidt RA, Nishikawa RM, Doi K. 1996. Density correction of peripheral
breast tissue on digital mammograms. RadioGraphics 16(6):1403–1411.
3. Byng JW, Critten JP, Yaffe MJ. 1997. Thickness-equalization processing for mammographic
images. Radiology 203(2):564–568.
4. Chandrasekhar R, Attikiouzel Y. 1997. A simple method for automatically locating the nipple on
mammograms. IEEE Trans Medical Imaging 16(5):483–494.
5. Lau TK, Bischof WF. 1991. Automated detection of breast tumors using the asymmetry approach.
Comput Biomed Res 24:273–295.
6. Miller P, Astley S. 1993. Automated detection of mammographic asymmetry using anatomical
features. Int J Pattern Recognit Artif Intell 7(6):1461–1476.
7. Méndez AJ, Tahoces PG, Lado MJ, Souto M, Correa JL, Vidal JJ. 1996. Automatic detection of
breast border and nipple in digital mammograms. Comput Methods Programs Biomed 49:253–
262.
8. Bick U, Giger ML, Schmidt RA, Nishikawa RM, Wolverton DE, Doi K. 1995. Automated seg-
mentation of digitized mammograms. Acad Radiol 2(1):1–9.
9. Masek M, Attikiouzel Y, deSilva CJS. 2000. Combining data from different algorithms to segment
the skin–air interface in mammograms. In Proceedings of the 22nd annual EMBS international
conference, Vol. 2, pp. 1195–1198. Washington, DC: IEEE.
10. Ferrari RJ, Rangayyan RM, Desautels JEL, Frère AF. 2000. Segmentation of mammograms:
identification of the skin–air boundary, pectoral muscle, and fibro-glandular disc. In Proceedings
of the 5th international workshop on digital mammography, pp. 573–579. Ed MJ Yaffe. Madison,
WI: Medical Physics Publishing.
11. Kass M, Witkin A, Terzopoulos D. 1988. Snakes: active contour models. Int J Comput Vision
1(4):321–331.
12. Gonzalez RC, Woods RE. 1992. Digital image processing. Reading, MA: Addison-Wesley.
13. Suckling J, Parker J, Dance DR, Astley S, Hutt I, Boggis CRM, Ricketts I, Stamatakis E, Cerneaz
N, Kok SL, Taylor P, Betal D, Savage J. 1994. The mammographic image analysis society digital
mammogram database. In Proceedings of the 2nd international workshop on digital mammogra-
phy, pp. 375–378. Ed AG Gale, SM Astley, DR Dance, AY Cairns. Excerpta Medica International
Congress Series, Vol. 1069. Amsterdam: Elsevier.
14. Lloyd S. 1982. Least squares quantization in PCM. IEEE Trans Inf Theory 28:129–137.
15. Mackiewich B. 1995. Intracranial boundary detection and radio frequency correction in mag-
netic resonance images. Master’s thesis, School of Computing Science, Simon Fraser University,
Burnaby, BC, Canada.
16. Lobregt S, Viergever MA. 1995. A discrete dynamic contour model. IEEE Trans Med Imaging
14(1):12–24.
17. Williams DJ, Shah M. 1992. A fast algorithm for active contours and curvature estimation. Comput
Vision Graphics Image Proces: Image Understand 55(1):14–26.
18. Mattis P, Kimball S. 2005. GIMP–GNU image manipulation program. http://www.gimp.org.
19. Ferrari RJ, Rangayyan RM, Desautels JEL, Frère AF. 2001. Analysis of asymmetry in mammo-
grams via directional filtering with Gabor wavelets. IEEE Trans Med Imaging 20(9):953–964.
20. Mackiewich B, Desautels JEL, Borges RA, Frère AF. 2004. Intracranial boundary detection and
radio frequency correction in magnetic resonance images. Med Biol Eng Comput 42(2):201–208.
6

STATISTICAL DEFORMABLE MODELS FOR CARDIAC SEGMENTATION

C. Tobon-Gomez, S. Ordas, and A.F. Frangi

Computational Imaging Laboratory, Universitat Pompeu Fabra, Barcelona, Spain
This chapter describes the use of statistical deformable models for cardiac segmentation
and functional analysis in Gated Single Photon Emission Computed Tomography (SPECT)
perfusion studies. By means of a statistical deformable model, automatic delineations of the
endo- and epicardial boundaries of the left ventricle (LV) are obtained, in all temporal phases
and image slices of the dynamic study. A priori spatio-temporal shape knowledge is captured
from a training set of high-resolution manual delineations made on cine Magnetic Resonance
(MR) studies. From the fitted shape, a truly 3D representation of the left ventricle, a series of
functional parameters can be assessed, including LV volume–time curves, ejection fraction,
and surface maps of myocardial perfusion, wall motion, thickness, and thickening. We
present encouraging results of its application on a patient database that includes rest/rest
studies with common cardiac pathologies, suggesting that statistical deformable models
may serve as a robust and accurate technique for routine use.
Address all correspondence to: Catalina Tobon-Gomez, Computational Imaging Laboratory, Univer-
sitat Pompeu Fabra, Pg Circumvallacio 8, Barcelona 08003, Spain. Phone: +34 93-542-1451, Fax:
+34 93-542-2517. [email protected].
164 CATALINA TOBON-GOMEZ et al.
1. INTRODUCTION
Over the course of the last two decades, myocardial perfusion with Single
Photon Emission Computed Tomography (SPECT) has emerged as an established
and well-validated method for assessing myocardial ischemia, viability, and function.
Such a technique has widespread use in clinical cardiology practice, with a
diagnostic accuracy and prognostic importance repeatedly confirmed in numerous
studies, allowing for evaluation of patients with known or suspected coronary artery
disease (CAD). These studies now represent an extensive database unequaled in the
field of cardiac imaging. Gated-SPECT imaging integrates traditional perfusion
information along with global left ventricular function. Therefore, identification of
the presence, extent, and severity of irreversible and reversible perfusion defects,
plus ventricular function, can be achieved within a single study. This enables, for
instance, effective stratification of patients into subgroups at low and high risk for
acute cardiac events and subsequent death. Entering the new century, the field
has matured to the point at which SPECT data are accepted as indispensable in
the management of the decision-making process for many patients. Moreover,
SPECT is also being incorporated as a cornerstone in the design of multicenter
clinical trials [1–8]. This circumstance will undoubtedly have a major impact on
the treatment of patients with acute coronary syndromes and chronic CAD, as well
as in the evaluation of patients with heart failure and those with diabetes.
EF = (EDV − ESV)/EDV × 100. (1)
According to recent studies, assessment of diastolic function (DFx) is also
achievable from gated-SPECT datasets [11]. Abnormalities in DFx have been
recently emphasized [12–17] as much earlier predictors of LV dysfunction. Nonetheless,
it remains unclear which DFx parameters are the most stable and
useful. Currently in gated-SPECT imaging, the two most exploited measurements
are Peak Filling Rate (PFR) and Time to Peak Filling Rate (TTPF). For the pur-
pose of calculating them, the filling rate vs. time curve is computed from the VTC
first derivative. Subsequently, PFR is estimated as the maximum value of this
Figure 2. Example of a gated-SPECT study. (a) Long-axis, (b) short-axis, and (c) 3D views at
end-diastole. (d) Long-axis, (e) short-axis, and (f) 3D views at end-systole. See attached
CD for color version.
curve, divided by the EDV in order to normalize its value (EDV/s). TTPF (in ms)
corresponds to the time elapsed between ESV and PFR [11].
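The functional indices above (EF from Eq. (1), PFR, and TTPF) can be derived from a volume-time curve in a few lines. This is an illustrative sketch: the frame spacing, the discrete derivative, and the search window for the peak filling rate are simplified assumptions, not the clinical implementation.

```python
import numpy as np

def systolic_diastolic_indices(volumes_ml, frame_ms):
    """EF, PFR, and TTPF from a volume-time curve (sketch).

    volumes_ml : LV volume per gated frame over one cardiac cycle
    frame_ms   : temporal spacing between frames, in milliseconds
    EF   = (EDV - ESV)/EDV x 100                       (Eq. (1))
    PFR  = maximum of the filling-rate curve dV/dt, normalized by EDV
    TTPF = time elapsed between ES and the PFR frame, in ms
    """
    v = np.asarray(volumes_ml, dtype=float)
    edv, esv = v.max(), v.min()
    ef = (edv - esv) / edv * 100.0
    i_es = int(np.argmin(v))
    dvdt = np.gradient(v, frame_ms / 1000.0)   # filling rate in ml/s
    i_pfr = i_es + int(np.argmax(dvdt[i_es:])) # peak filling occurs after ES
    pfr = dvdt[i_pfr] / edv                    # normalized to EDV/s
    ttpf = (i_pfr - i_es) * frame_ms
    return ef, pfr, ttpf
```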
Figure 3. Three kinds of photons are present during image construction: (1) fully con-
tributing photons, (2) scattered photons, and (3) completely absorbed photons.
6. Heart Motion: The constant “beating” of the heart has an additional blur-
ring effect on the already low-resolution data.
8. Partial Volume Effect: The partial volume effect means that the average
counts measured in a structure are proportional not only to the amount of
activity contained in that structure, but also to the size of the structure itself
[19]. This principle causes the myocardium to appear thinner at end-diastole
(ED), when the activity count is lower, and thicker at end-systole (ES),
due to a higher activity count.
STATISTICAL DEFORMABLE MODELS FOR CARDIAC SEGMENTATION 169
Figure 5. Some of the challenges inherent to gated-SPECT imaging: extracardiac
uptake. (a) Bowel uptake that does not affect activity counts. (b) Liver uptake that affects
normalization of activity counts. See attached CD for color version.
10. Axial Slices: Most algorithms for automatic segmentation operate on short-
axis reformatted images. The LV long-axis usually has to be determined
manually, which may be prone to inaccuracies and inter-operator variabil-
ity [18].
Combining all these challenges, the data sets can be viewed as low-resolution,
noisy, temporally integrated, and sparse. Therein lies the need for a robust segmen-
tation methodology, since an error of only one voxel along the chamber surface
may generate a huge difference in volume calculation and derived parameters.
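The sensitivity claim can be made concrete with a back-of-the-envelope model: treating the cavity as a sphere of radius r voxels, the volume scales with r³, so a uniform one-voxel surface error changes the volume by ((r+1)³ − r³)/r³, roughly 3/r for large r. The 5-voxel radius below is a hypothetical figure for illustration only.

```python
def volume_error_percent(r_voxels, dr=1.0):
    """Relative volume change (%) when a spherical cavity of radius
    r_voxels is segmented dr voxels too large everywhere; the 4/3*pi
    factor cancels, so only r^3 matters."""
    return ((r_voxels + dr) ** 3 - r_voxels ** 3) / r_voxels ** 3 * 100.0

# For a hypothetical cavity radius of 5 voxels, a uniform one-voxel
# overestimate inflates the computed volume by 72.8%.
```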
1.5.2. Complications
In order to cope with perfusion defects, the aforementioned algorithms use
a large number of parameters, rules, and criteria that are empirically determined.
In addition, as mentioned before, extracardiac activity complicates segmentation,
especially when the external uptake focus is close to the myocardium. We believe
that the main drawback of these approaches is that their geometrical models are
not appropriate for those circumstances. A priori information about the organ, and
reliable information from other parts of the image (the epicardium), should help
determine a valid LV shape.
2.1. Overview
An Active Shape Model (ASM) [24] is an example of a statistical deformation
model, and is composed of three parts: (1) a Point Distribution Model (PDM) that
models the shape variability of an object, (2) an appearance model that describes
gray-level patterns around the object’s boundaries, and (3) a matching algorithm
that drives the PDM deformation toward the target in an iterative manner. In the
following, we briefly describe each constitutive part of the algorithm, but thorough
descriptions are provided elsewhere [24, 28, 31].
x = x̂ + Φb, (2)
where
x̂ = (1/n) Σᵢ₌₁ⁿ xᵢ (3)
is the average landmark vector, b is the shape parameter vector of the model, and Φ
is a matrix whose columns are the principal components of the covariance matrix:
S = (1/(n − 1)) Σᵢ₌₁ⁿ (xᵢ − x̂)(xᵢ − x̂)ᵀ. (4)
b = Φᵀ(x − x̂). (5)
Figure 6. Principal modes of variation for the MRI (ED) PDM. The shapes are generated
by varying a single model parameter (bᵢ), with all others fixed at zero, in units of standard
deviations (SDᵢ = √λᵢ) from the mean shape.
The vector b defines the shape parameters of the ASM. By varying these
parameters we can generate different instances of the shape class under analysis
using Equation (2). Assuming that the cloud of landmark vectors follows a multi-
dimensional Gaussian distribution, the variance of the ith parameter, bi , across the
training set is given by λᵢ. By applying limits to the variation of bᵢ, for instance
|bᵢ| ≤ 3√λᵢ, it can be ensured that a generated shape is similar to the shapes
contained in the training class.
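Equations (2)-(5) amount to a principal component analysis of the aligned landmark vectors. A minimal NumPy sketch, assuming the training shapes are already aligned:

```python
import numpy as np

def build_pdm(X, n_modes):
    """Point Distribution Model via PCA, following Eqs. (2)-(5).

    X : (n, d) matrix whose rows are aligned landmark vectors x_i
    Returns the mean shape x_hat (Eq. (3)), the matrix Phi whose columns
    are the leading principal components of S (Eq. (4)), and the
    corresponding eigenvalues lambda_i.
    """
    x_hat = X.mean(axis=0)
    S = np.cov(X, rowvar=False)              # 1/(n-1) normalization, Eq. (4)
    lam, Phi = np.linalg.eigh(S)
    order = np.argsort(lam)[::-1][:n_modes]  # largest eigenvalues first
    return x_hat, Phi[:, order], lam[order]

def generate_shape(x_hat, Phi, lam, b):
    """Shape instance x = x_hat + Phi b (Eq. (2)); each b_i is clipped to
    |b_i| <= 3*sqrt(lambda_i) so the instance stays plausible."""
    b = np.clip(b, -3.0 * np.sqrt(lam), 3.0 * np.sqrt(lam))
    return x_hat + Phi @ b

def project_shape(x_hat, Phi, x):
    """Shape parameters b = Phi^T (x - x_hat) (Eq. (5))."""
    return Phi.T @ (x - x_hat)
```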
Obtaining the m 3D landmarks and their correspondence among training ex-
amples is a non-trivial task. Such a problem still represents an open topic for the
shape analysis community. Our methodology was inspired by the method proposed
by Frangi and coworkers [35]. The method can easily be set to build either
1- or 2-chamber models. In this work, a single-ventricle configuration with 2848
points (1071 endocardial and 1777 epicardial) was used. The constructed PDM
was trained from a population of 90 hearts with large intra- and inter-individual
variability, including both healthy and pathologic examples, in ED and ES temporal
phases. Thus, 180 examples were considered in total [29]. Figure 6 presents
the four principal modes of variation for the MRI (ED) PDM.
with gik the kth component in the ith profile. Denoting these normalized derivative
profiles as ĝ1, . . . , ĝs, the mean profile ḡ and the covariance matrix Σg are
computed for each landmark. The Mahalanobis distance, f(ĝ) = (ĝ − ḡ)ᵀ Σg⁻¹ (ĝ − ḡ),
is then used as the cost of matching a candidate profile ĝ to the model.
Gray-level structures vary considerably between patients as well as in one patient over time.
Therefore, a problem inherent to a linear gray-level model is that the underly-
ing assumption of a normal intensity distribution in practice does not hold. The
problem would persist if other local image features for boundary localization are
employed, but normalized gray-level profiles seem to be the most appropriate for
a linear approach [37].
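The gray-level appearance model and its Mahalanobis matching cost can be sketched as follows; the first-difference normalization used here is the usual ASM choice and an assumption about the exact formulation:

```python
import numpy as np

def normalized_derivative_profile(g):
    """First-difference gray-level profile, normalized by the sum of
    absolute differences (standard ASM normalization; an assumption
    about the exact formulation used here)."""
    dg = np.diff(np.asarray(g, dtype=float))
    return dg / (np.abs(dg).sum() or 1.0)

def train_profile_model(profiles):
    """Mean and covariance of the normalized derivative profiles
    sampled at one landmark across the training set."""
    G = np.array([normalized_derivative_profile(g) for g in profiles])
    return G.mean(axis=0), np.cov(G, rowvar=False)

def best_candidate(candidates, g_mean, cov):
    """Index of the candidate profile minimizing the Mahalanobis
    distance f(g) = (g - g_mean)^T cov^{-1} (g - g_mean)."""
    cov_inv = np.linalg.pinv(cov)   # pseudo-inverse guards singular covariances
    costs = []
    for g in candidates:
        diff = normalized_derivative_profile(g) - g_mean
        costs.append(diff @ cov_inv @ diff)
    return int(np.argmin(costs))
```

During matching, each landmark evaluates a set of candidate positions along its normal and moves to the one whose profile yields the smallest distance.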
Note that the actual weight for the source node (λ = w) is unity. To avoid
propagation of an update over the entire shape model surface, propagation is
restricted to a local neighborhood of the source node.
2.4.1. Initialization
The initialization of segmentation methods is frequently regarded as a given
precondition. In practice, however, initialization is usually performed manually
or by some heuristic preprocessing steps. Therefore, it is of great importance to
have a simple and effective initialization method at one’s disposal.
Although the 3D-ASM is relatively insensitive to the initial shape model in-
stance, a fully- or almost fully-automatic initialization is needed to produce con-
sistent results. We followed a very simple mechanism to scale and position the
mean shape of the model. The operator defines six epicardial points at the basal
level and a seventh one in the apex. The centroids of corresponding anatomical
regions in the mean shape are aligned to these points using a similarity transformation.
Figure 9 illustrates examples of the shape model initialization. Key benefits
Figure 9. Initialization points: (a) SA and (b) biplane views. See attached CD for color
version.
Figure 10. Convergence of the algorithm. Initialization (left), intermediate step (center),
and convergence (right). See attached CD for color version.
phases in the study, the fitted shape of the previous phase is used, as well as fewer
resolution levels and iterations.
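The similarity-transform alignment used for initialization (mapping the anatomical centroids of the mean shape onto the seven operator-defined points) can be implemented with the standard least-squares method of Umeyama; this is a generic sketch, not the authors' code:

```python
import numpy as np

def similarity_align(src, dst):
    """Least-squares similarity transform (scale s, rotation R,
    translation t) mapping src points onto dst points, Umeyama-style.
    src, dst : (k, 3) corresponding point sets (e.g., anatomical
    centroids of the mean shape and the operator-defined points).
    Returns s, R, t such that dst_i ~= s * R @ src_i + t.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    A, B = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(B.T @ A / len(src))
    sign = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0] * (src.shape[1] - 1) + [sign])  # force a proper rotation
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / ((A ** 2).sum() / len(src))
    t = mu_d - s * R @ mu_s
    return s, R, t
```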
3.3. Experiments
For this application task, the previously described PDM was applied without
further training (see Section 2.2). To match the intensity profiles particular to the
SPECT modality, the appearance model was retrained from manually delineated
endocardial and epicardial borders.
Description            Number
Healthy                    15
Ischemia   Minor            4
           Moderate         3
           Intensive        1
General
s: number of training sets from which the
shape model was built (180)
a: number of training sets from which the
appearance model was built (1)
Shape Model
n: number of landmark points (2848)
fv: part of variance to be explained by the
shape model (0.50)
t: number of modes in the shape model (4)
m: bounds on shape model parameters (3.0)
Appearance Model
k: number of points in profile on either side of the
landmark point, giving profiles of length 2k + 1 (5)
Matching Procedure
ns: number of new positions to evaluate at each side of
current landmark position; the total number of positions
is 2ns + 1 (7)
L: number of levels in the multi-resolution
strategy (2)
N: number of iterations per resolution level (20)
In order to retrieve the optimal settings, and hence increase the efficiency of the
process, several parameter combinations were evaluated. The final 3D-ASM settings
are summarized in Table 2.
Figure 11. Central surface calculation. For every point on the endocardial surface, the
shortest distance to the epicardial surface is assessed. The midpoints of all these segments
are collected and, using the same mesh connectivity as the endocardial surface, a new
mesh is built. (a) Original surface; (b–c) segment evaluation; and (d) generated central
surface. See attached CD for color version.
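The central-surface construction of Figure 11 can be sketched as follows. For simplicity, this version measures the distance to the nearest epicardial vertex rather than to the continuous surface:

```python
import numpy as np

def central_surface_points(endo_pts, epi_pts):
    """Midpoints between each endocardial vertex and its closest
    epicardial vertex (points may be 2D or 3D). The real construction
    measures distance to the epicardial surface rather than to its
    vertices; the midpoints reuse the endocardial mesh connectivity,
    so only vertex positions change.
    """
    # pairwise distance matrix, shape (n_endo, n_epi)
    d = np.linalg.norm(endo_pts[:, None, :] - epi_pts[None, :, :], axis=2)
    nearest = epi_pts[np.argmin(d, axis=1)]
    return 0.5 * (endo_pts + nearest)
```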
Figure 12. Left: wall motion, ED and ES, respectively. Right: wall thickening assessment,
ED and ES, respectively. See attached CD for color version.
Figure 13. Perfusion map corresponding to ES displayed in different views. See attached
CD for color version.
Any combination of the previous maps can easily be rendered in the 3D an-
imation tool. The minimum and maximum values for the scalar ranges of the
previously described tools can be calculated from a selectable population of nor-
mal studies or from rest/stress states of the same patient.
4. RESULTS
Quantitative, and a few visual, results will be presented in this section and
further discussed in Section 5. The total ranges of all parameter calculations are
summarized in Table 3. Mean ± SD values are shown in Table 4. Figures 15
and 16 display the results of the Bland-Altman analysis. According to these plots,
3DASM achieved lower SD values for EF, ESV, and PFR, while QGS computed
smaller SD values for EDV and TTPF.
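The Bland-Altman statistics behind Figures 15 and 16 are straightforward to compute; a sketch, assuming the conventional 95% limits of agreement (bias ± 1.96 SD of the differences):

```python
import numpy as np

def bland_altman(rest1, rest2):
    """Bland-Altman repeatability statistics for paired measurements
    (e.g., EF from the first and second rest study): per-case means and
    differences, the bias, and the 95% limits of agreement."""
    a = np.asarray(rest1, dtype=float)
    b = np.asarray(rest2, dtype=float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)                    # sample SD of the differences
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)
    return (a + b) / 2.0, diff, bias, loa
```

Plotting the per-case differences against the per-case means, with horizontal lines at the bias and limits of agreement, reproduces the layout of the figures.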
Regarding systolic function, two main types of discrepancies between QGS
and 3DASM calculations were detected during this work:
Figure 15. Bland-Altman plots of EF (a–b), EDV (c–d), and ESV (e–f) comparing the first
and second rest studies estimated with QGS (left) and 3DASM (right).
Figure 16. Bland-Altman Plots of PFR (a–b) and TTPF (c–d) comparing first and second
rest studies estimated with QGS (left) and 3DASM (right).
Figure 17. (a–c) Examples of end-systole segmentation results obtained with 3DASM
for the cases with the greatest difference in EF calculations. See attached CD for color version.
Figure 18. (a–c) Segmentation results obtained with 3DASM for the cases with the greatest
difference in EDV calculations. ED segmentation results are displayed on the left and ES
results on the right. See attached CD for color version.
Figure 19. Volume and filling-rate curves with different interpolation techniques. (a–b) The top
row shows restrictive interpolation results, and (c–d) the bottom row displays smoothed curves.
Both left plots (a, c) were obtained with data from the same patient, as were both right plots
(b, d). Notice the difference in time of ES and PFR, length of TTPF (arrow), amplitude, and shape.
5. DISCUSSION
During the initial development of a new method, visual inspection of the epi-
and endocardial boundaries produced by the algorithm is used to assess its cor-
rectness. For further improvement of the methodology, quantitative results against
a gold standard or ground truth are needed to evaluate its accuracy. Commonly
used gold standards for gated SPECT are cine MRI and digital phantoms. Unfor-
tunately, the use of a gold standard was not possible during the development of
this work, but will be incorporated into future experiments. Apart from a merely
intuitive visual inspection, the first approach to numeric evaluation included inter-
study repeatability analysis. This allows for evaluating the same subject under the
same clinical conditions. One limitation of using two separate datasets to evaluate
algorithm accuracy is that added errors could be present in the dataset itself, due
to artifacts (see Section 1.4).
Comparisons with the most widespread clinical analysis tool (QGS) were
made in order to have a general idea of the expected results for each patient. These
comparisons were performed taking into account previously published studies that
describe QGS tendencies, along with our own experience during clinical practice.
It should be noted that this study did not aim to compare the accuracy of 3DASM
and QGS. QGS computations were included only to highlight problems with ex-
isting methods as an incentive to develop new methodologies.
As for datasets with extensive signal drops due to perfusion defects, our method
provided a correct anatomical interpolation in the apex with a smooth connection
between non-infarcted portions of the wall, based on reliable information from
other parts of the image and prior statistical knowledge. There is a clear difference
in the LV shape recovered with the two algorithms.
We have identified determination of the mitral valve plane as the main cause
of LV volume estimation discrepancies. Our shape model was built from a training
set of MRI studies, thereby providing high-resolution anatomical constraints to LV
shape recovery. Since expert segmentations spanned from the base to the apex, the
deformation process itself made the shape deform and grow toward an acceptable
position of the basal plane and apex. As demonstrated in Section 1.4, these kinds
of anatomical constraints are not present in SPECT datasets. Therefore, the basal
plane defined with initialization points is used to provide an upper limit for cavity
volume calculation.
6. CONCLUSIONS
7. ACKNOWLEDGMENTS
8. REFERENCES
1. Schaefer W, Lipke C, Standke D, Kuhl HP, Nowak B, Kaiser HJ, Koch KC, Buell U. 2005.
Quantification of left ventricular volumes and ejection fraction from gated 99m Tc-MIBI SPECT:
MRI validation and comparison of the Emory Cardiac Tool Box with QGS and 4D-MSPECT.
J Nucl Med 46(8):1256–1263.
2. Lum DP, Coel MN. 2003. Comparison of automatic quantification software for the measurement
of ventricular volume and ejection fraction in gated myocardial perfusion SPECT. Nucl Med
Commun 24(4):259–266.
3. Nakajima K, Higuchi T, Taki J, Kawano M, Tonami N. 2001. Accuracy of ventricular volume and
ejection fraction measured by gated myocardial SPECT: comparison of 4 software programs.
J Nucl Med 42:1571–1578.
4. Faber TL, Vansant JP, Pettigrew RI, Galt JR, Blais M, Chatzimavroudis G, Cooke CD, Folks RD,
Waldrop SM, Gurtler-Krawczynska E, Wittry MD, Garcia E. 2001. Evaluation of left ventricular
endocardial volumes and ejection fractions computed from gated perfusion SPECT with magnetic
resonance imaging: comparison of two methods. J Nucl Med 40:645–651.
5. Vaduganathan P, He Z, Vick III GW, Mahmarian JJ, Verani MS. 1998. Evaluation of left ventricular
wall motion, volumes, and ejection fraction by gated myocardial tomography with technetium
99m-labeled tetrofosmin: a comparison with cine magnetic resonance imaging. J Nucl Cardiol
6:3–10.
6. Lipke CSA, Kuhl HP, Nowak B, Kaiser HJ, Reinartz P, Koch KC, Buell U, Schaefer WM. 2004.
Validation of 4D-MSPECT and QGS for quantification of left ventricular volumes and ejection
fraction from gated 99m Tc-MIBI SPET: comparison with cardiac magnetic resonance imaging.
Eur J Nucl Med Mol Imaging 31(4):482–490.
7. Tadamura E, Kudoh T, Motooka M, Inubushi M, Okada T, Kubo S, Hattori N, Matsuda T, Koshiji T,
Nishimura K, Komeda M, Konishi J. 1999. Use of technetium-99m sestamibi ECG-gated single-
photon emission tomography for the evaluation of left ventricular function following coronary
artery bypass graft: comparison with three-dimensional magnetic resonance imaging. Eur J Nucl
Med 26:705–712.
8. Thorley PJ, Plein S, Bloomer TN, Ridgway JP, Sivananthan UM. 2003. Comparison of 99mTc
tetrofosmin gated SPECT measurements of left ventricular volumes and ejection fraction with
MRI over a wide range of values. Nucl Med Commun 24:763–769.
9. Germano G, Kavanagh P, Su H, Mazzanti M, Kiat H. 1995. Automatic reorientation of 3-
dimensional transaxial myocardial perfusion SPECT images. J Nucl Med 36:1107–1114.
10. Faber T, Cooke C, Folks R, Vansant J, Nichols K, DePuey E, Pettigrew R, Garcia E. 1999. Left
ventricular function and perfusion from gated SPECT perfusion images: an integrated method.
J Nucl Med 40:650–659.
11. Akincioglu C, Berman DS, Nishina H, Kavanagh PB, Slomka PJ, Abidov A, Hayes S, Friedman
JD, Germano G. 2005. Assessment of diastolic function using 16-frame 99m Tc-Sestamibi gated
myocardial perfusion SPECT: normal values. J Nucl Med 46:1102–1108.
12. Boyer J, Thanigaraj S, Schechtman K, Perez J. 2004. Prevalence of ventricular diastolic dysfunc-
tion in asymtomatic, normotensive patients with diabetes mellitus. Am J Cardiol 93:870–875.
13. Yuda S, Fang Z. 2003. Association of severe coronary stenosis with subclinical left ventricular
dysfunction in the absence of infarction. J Am Soc Echocardiogr 16:1163–1170.
14. Yamada H, Goh PP, Sun JP, Odabashian J, Garcia MJ, Thomas JD, Klein AL. 2002. Prevalence
of left ventricular diastolic dysfunction by doppler echocardiography: clinical application of the
canadian consensus guidelines. J Am Soc Echocardiogr 15:1238–1244.
192 CATALINA TOBON-GOMEZ et al.
15. Bayata S, Susam I, Pinar A, Dinckal M, Postaci N, Yesil M. 2000. New doppler echocardio-
graphic applications for the evaluation of early alterations in left ventricular diastolic function
after coronary angioplasty. Eur J Echocardiogr 1:105–108.
16. Matsumura Y, Elliott P, Virdee M, Sorajja P, Doi Y, McKenna W. 2002. Left ventricular diastolic
function assessed using doppler tissue imaging in patients with hypertrophic cardiomyopathy:
relation to symptoms and exercise capacity. Heart 87(4):247–251.
17. Galderisi M, Cicala S, Caso P, De Simone L, D’Errico A, Petrocelli A, de Divitiis O. 2002.
Coronary flow reserve and myocardial diastolic dysfunction in arterial hypertension. Am J Cardiol
90:860–864.
18. Germano G. 2001. Technical aspects of myocardial SPECT imaging. J Nucl Med 42:1499–1507.
19. Germano G, Berman D. 1999. Quantitative gated perfusion SPECT. In Clinical gated cardiac
SPECT, pp. 115–146. Ed G Germano, D. Berman. Armonk, NY: Futura Publishing.
20. Cauvin J, Boire J, Bonny J, Zanca M, Veyre A. 1992. Automatic detection of the left ventricular
myocardium long axis and center in thallium-201 single photon emission computed tomography.
Eur J Nucl Med 19(21):1032–1037.
21. Germano G, Kiat H, Kavanagh B, Moriel M, Mazzanti M, Su H, Van Train KF, Berman D. 1995.
Automatic quantification of ejection fraction from gated myocardial perfusion SPECT. J Nucl
Med 36: 2138–2147.
22. Lomsky M, El-Ali H, Astrom K, Ljungberg M, Richter J, Johansson L, Edenbrandt L. 2005. A
new automated method for analysis of gated- SPECT images based on a three-dimensional heart
shaped model. Clin Physiol Funct Imaging 25(4):234–240.
23. Frangi AF, Niessen WJ, Viergever MA. 2000. Three-dimensional modeling for functional analysis
of cardiac images: a review. IEEE Trans Med Imaging 20(1):2–25.
24. Cootes TF, Taylor CJ, Cooper DH, Graham J. 1995. Active shape models — their training and
application. Comput Vision Image Understand 61(1):38–59.
25. Kelemen A, Szekely G, Guerig G. 1999. Elastic model-based segmentation of 3D neuroradiolog-
ical data sets. IEEE Trans Med Imaging 18:828–839.
26. McInerney T, Terzopoulos D. 1996. Deformable models in medical image analysis: a survey.
Med Image Anal 1(2):91–108.
27. Montagnat J, Delingette H. 2005. 4D deformable models with temporal constraints: application
to 4D cardiac image segmentation. Med Image Anal 9(1):87–100.
28. Cootes TF, Edwards GJ, Taylor CJ. 1998. Active appearance models. Proc Eur Conf Comput
Vision 2:484–498.
29. Ordas S, Boisrobert L, Bossa M, Laucelli M, Huguet M, Olmos S, Frangi AF. 2004. Grid-enabled
automatic construction of a two-chamber cardiac PDM from a large database of dynamic 3D
shapes. In Proceedings of the 2004 IEEE international symposium on biomedical imaging,
pp. 416–419. Washington, DC: IEEE.
30. Mitchell SC, Bosch JG, Lelieveldt BPF, van der Geest RJ, Reiber JHC, Sonka M. 2002. 3D
active appearance models: segmentation of cardiac MR and ultrasound images. IEEE Trans Med
Imaging 21(9):1167–1179.
31. Stegmann MB. 2004. Generative interpretation of medical images. Phd dissertation, Informatics
and Mathematical Modelling, Technical University of Denmark, Lyngby.
32. van Assen HC, Danilouchkine MG, Frangi AF, Ordas S, Westenberg JJM, Reiber JHC, Lelieveldt
BPF. 2005. SPASM: segmentation of sparse and arbitrarily oriented cardiac MRI data using a
3D-ASM. In Lecture notes in computer science, vol. 3504: 33–43. Ed AF Frangi, P Radeva, A
Santos, M Hernandez. New York: Springer.
STATISTICAL DEFORMABLE MODELS FOR CARDIAC SEGMENTATION 193
33. Bardinet E, Cohen LD, Ayache N. 1995. Superquadrics and free-form deformations: A global
model to fit and track 3D medical data. In Lecture notes in computer science, Vol. 905, pp.
319–326. Ed N Ayache. New York: Springer.
34. Montagnat J, Delingette H. Space and time shape constrained deformable surfaces for 4D medical
image segmentation. 2000. In Lecture notes in computer science, Vol. 1935, pp. 196–205. Ed SL
Delp, AM Digioia, B Jarmaz. New York: Springer.
35. Frangi AF, Rueckert D, Schnabel JA, Niessen WJ. 2002. Automatic construction of multiple-
object three-dimensional statistical shape models: application to cardiac modeling. IEEE Trans
Med Imaging 21(9):1151–1166.
36. American Heart Association. 1998. American Heart Association 1999 heart and stroke statistical
update. http://www.americanheart.org, Dallas, Texas.
37. Behiels G, Maes F, Vandermeulen D, Suetens P. 2002. Evaluation of image features and search
strategies for segmentation of bone structures in radiographs using active shape models. Med
Image Anal 6(1):47–62.
38. van Assen HC, Danilouchkine MG, Dirksen MS, Rieber JHC, Lelieveldt BPF. 2006. A 3D-ASM
driven by fuzzy inference: application to cardiac CT and MR. Med Image Anal. In press.
39. Bland JM, Altman DG. 1986. Statistical methods for assessing agreement between two methods
of clinical measurement. Lancet 8476:307–310.
7

LEVEL SET FORMULATION FOR DUAL SNAKE MODELS

Jasjit S. Suri
Biomedical Research Institute, Pocatello, Idaho, USA
Dual snake models are powerful techniques for boundary extraction and segmentation of
2D images. In these methods one contour contracts from outside the target and another
one expands from inside as a balanced technique with the ability to reject local minima.
Such approaches have been proposed in the context of parametric snakes and extended
for topologically adaptable snake models through the Dual-T-Snakes. In this chapter we
present an implicit formulation for dual snakes based on the level set approach. The level
set method consists of embedding the snake as the zero level set of a higher-dimensional
function and solving the corresponding equation of motion. The key idea of our work
is to view the inner/outer contours as a level set of a suitable embedding function. The
mathematical background of the method is explained, and its utility for segmentation of
cell images is discussed in the experimental results. Theoretical aspects are considered and
comparisons with parametric dual models are presented.
1. INTRODUCTION
Deformable models, which include the popular snake models [1] and deformable surfaces [2, 3], are well-known techniques for boundary extraction and
Address all correspondence to: Gilson A. Giraldi, Laboratório Nacional de Computação Científica, Av.
Getulio Vargas, 333, Quitandinha, Petropólis, Brazil, CEP: 25651-075. Phone: +55 24 2233-6088,
Fax: +55 24 2231-5595. [email protected].
2. BACKGROUND REVIEW
Dual active contour models have been applied to cell image segmentation
[17, 19] and feature extraction [18]. Although there are few examples of such
approaches [17, 19, 20], their capability to reject local minima deserves further
exploration for segmentation and boundary extraction purposes.
The basic idea of the dual active contour models (Dual ACM) is to reject local
minima by using two contours: one that contracts from outside the target and one
that expands from inside. Such a feature makes it possible to reduce sensitivity to
initialization by enabling a comparison between the energy of the two contours,
which is used to reject local minima.
This methodology was first proposed in [20]. To obtain the conventional
continuity and smoothness constraints, but remove the unwanted contraction force,
a known problem in traditional snake models [21, 10], a scale-invariant internal
energy function (shape model) was developed. In [20] a snake is considered as
a particle system, vi = (xi , yi ), i = 0, ..., N − 1, whose particles are linked by
internal constraints. The shape model is accomplished by the following internal
energy:
E_{int} = \frac{1}{2} \sum_{i=0}^{N-1} \left( \frac{e_i}{h} \right)^2,   (1)

where

e_i = \frac{1}{2} (v_{i-1} + v_{i+1}) - v_i + \frac{1}{2} \theta_i R (v_{i-1} - v_{i+1}),   (2)

h is the average space step, R is a 90° rotation matrix, and \theta_i is related to the
internal angle \varphi_i at the vertex v_i by

\theta_i = \cot\left( \frac{\varphi_i}{2} \right).   (3)
It is clear that E_{int} has a global minimum when e_i = 0, i = 0, 1, ..., N - 1. From
(2)–(3) it can be shown that this happens when

\varphi_i = \frac{(N-2)\pi}{N}, \quad i = 0, ..., N - 1,   (4)

which are the internal angles of a regular polygon with vertices given by the points
v_i [20]. The energy (1) can also be shown to be rotation, translation, and scale
invariant [20]. As usual in snake models [2, 6], the external energy is defined by

E_{ext}(v_i) = - \| \nabla I(v_i) \|^2,   (5)

and the total energy is

E = \sum_i \left[ \lambda E_{int}(v_i) + (1 - \lambda) E_{ext}(v_i) \right].   (6)

A driving force pushes the contour being processed toward the contour at rest:

F_{driving} = g(t) \, \frac{u_i - v_i^t}{\| u_i - v_i^t \|},   (7)

where v_i^t is the contour being processed at time t, u_i is the contour remaining at
rest, and g(t) is the strength of the force. The termination condition adopted in
[20] is the following one, based on the low-velocity criterion:

\max_i \| v_i^{t+1} - v_i^t \| < \delta.   (8)
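The discrete energies above translate directly into code. The following sketch is our illustration, not code from [20]; the regular-polygon target angles and a counterclockwise vertex ordering are assumptions we make for the example:

```python
import numpy as np

def internal_energy(v):
    """Scale-invariant internal energy of a closed contour, Eqs. (1)-(3).
    v: (N, 2) array of snaxel positions, assumed counterclockwise."""
    N = len(v)
    phi = np.pi * (N - 2) / N                 # interior angle of a regular N-gon
    theta = 1.0 / np.tan(phi / 2.0)           # theta_i = cot(phi_i / 2), Eq. (3)
    R = np.array([[0.0, -1.0], [1.0, 0.0]])   # 90-degree rotation matrix
    prev, nxt = np.roll(v, 1, axis=0), np.roll(v, -1, axis=0)
    e = 0.5 * (prev + nxt) - v + 0.5 * theta * (prev - nxt) @ R.T   # Eq. (2)
    h = np.mean(np.linalg.norm(nxt - v, axis=1))                    # average space step
    return 0.5 * np.sum((np.linalg.norm(e, axis=1) / h) ** 2)       # Eq. (1)

def driving_force(v_t, u, g):
    """Driving force of Eq. (7): unit vectors from the contour being
    processed (v_t) toward the contour at rest (u), scaled by g."""
    d = u - v_t
    return g * d / np.linalg.norm(d, axis=1, keepdims=True)
```

For a regular polygon every e_i vanishes and the internal energy is zero, which is the global minimum discussed above.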
In fact, the Viterbi algorithm was also used in [14], and this is sometimes called
a non-evolutionary dual model, in the sense that it is not based on curve evolution
[22]. Before describing Dual-T-Snakes, we shall discuss the T-Snakes method,
on which it is based.
3. T-SNAKES MODEL
Figure 1. Two snakes colliding, with the inside grid nodes and snake points (snaxels) marked.
Reprinted with permission from Giraldi GA, Strauss E, Oliveira AF. 2003. Dual-T-snakes
model for medical imaging segmentation. Pattern Recognit Lett 24(7):993–1003. Copyright © 2003, Elsevier.
Thus, from the data obtained with step (2), we can choose a set of N points,
{vi = (xi , yi ) , i = 0, ..., N − 1} to be connected to form a closed contour (T-
Snake). These points are called snaxels.
In [16] we proposed to evolve a T-Snake based on a tensile (smoothing) force
(Bi ), a normal (balloon-like) force (Fi ), and an external (image) force (fi ) [9].
These forces are given respectively by the following expressions:
B_i = b_i \left\{ v_i - \frac{1}{2} (v_{i-1} + v_{i+1}) \right\},   (9)

F_i = k_i (sign_i) \, n_i; \qquad f_i = \gamma_i \nabla P,   (10)
where n_i is the normal at the snaxel v_i and b_i, k_i, \gamma_i are force scale factors,
P = -\|\nabla I\|^2, sign_i = 1 if I(v_i) \geq T and sign_i = 0 otherwise (T is a threshold for
the image I). Region-based statistics can also be used [9].
Hence, we update the T-Snake position according to the following evolution
equation:
v_i^{(t + \Delta t)} = v_i^t + h_i \left( B_i^t + F_i^t + f_i^t \right),   (11)
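A minimal sketch of this update (our illustration; the normals, image forces, and sign values are assumed to be precomputed and passed in, and the {0, 1} sign convention of the text is kept):

```python
import numpy as np

def tsnake_step(v, normals, image_force, sign, b=0.1, k=0.5, h=1.0):
    """One explicit T-Snake update, Eqs. (9)-(11).
    v: (N, 2) snaxels; normals: (N, 2) unit normals n_i;
    image_force: (N, 2) samples of gamma_i * grad(P); sign: (N,) in {0, 1}."""
    prev, nxt = np.roll(v, 1, axis=0), np.roll(v, -1, axis=0)
    B = b * (v - 0.5 * (prev + nxt))        # tensile (smoothing) force, Eq. (9)
    F = k * sign[:, None] * normals         # normal (balloon-like) force, Eq. (10)
    return v + h * (B + F + image_force)    # evolution equation, Eq. (11)
```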
4. DUAL-T-SNAKES ALGORITHM
The key idea behind this method is to explore the T-Snakes framework to
propose a generalized dual active contour model: one T-Snake contracts and splits
from outside the targets, while the other ones expand from inside them.
To make the outer snake contract and the inner ones expand, we assign an
inward normal force to the first and an outward normal force to the others according
to expressions (10). Also, to make the T-Snakes evolution interdependent we use
the image energy and an affinity restriction.
We use two different definitions for image energy: one for the outer contour,
(Eouter ), and the other for the set of inner contours enclosed by it, (Einner ):
E_{outer} = - \frac{1}{N} \sum_{i=0}^{N-1} \| \nabla I(v_i) \|^2,   (12)

E_{inner} = - \frac{1}{m} \sum_{k=0}^{m-1} \frac{1}{N_k} \sum_{i=0}^{N_k - 1} \| \nabla I(v_i) \|^2,   (13)
where m is the number of inner curves, and N , Nk represent the number of snaxels
of the outer and inner (k) snakes, respectively. The energies are normalized in order
to be compared. Otherwise, the snake energy would be a decreasing function of
the number of snaxels and comparisons would not make sense.
Following the dual approach methodology [20], if Einner > Eouter , an inner
curve must be chosen. To accomplish this, we use an affinity operator, which
estimates the pixels of the image most likely to lie on the boundaries of the objects.
Based on this operator, we can assign to a snaxel the likelihood that it is close to a
boundary. That likelihood is thresholded to obtain an affinity function that assigns
to the snaxel a value of 0 or 1: 0 for snaxels most likely to lie away from the target
boundaries and 1 otherwise.
Then, the inner curve with the highest number of snaxels whose affinity value
is null is chosen. If E_{outer} > E_{inner}, the outer snake is evolved, provided
its affinity function has null entries.
In addition, the balance between the energy/affinity of the outer and inner
snakes allows avoiding local minima. For instance, if a T-Snake has been frozen,
we can increase the normal force at the snaxels where the affinity function is zero.
The self-intersections that may happen during evolution of a snake when increasing
the normal force are naturally resolved by the T-Snakes model. This is the way we
can use the added normal force to play the role of the driving force used by Gunn
and Nixon (avoiding the matching problem required in [20]).
To evaluate the similarity of two contours, we use the difference between the
characteristic function of the outer snake and the characteristic functions of the
inner ones (Characteristic Diff). For example, in the case of the CF triangulation in
Figure 1, we can stop the motion of all snaxels of an inner snake inside a triangle
σ if any of its vertices v ∈ σ has the following two properties: (a) all six
triangles adjacent to v have a vertex where Characteristic Diff = 0; and (b) one of
these triangles is crossed by the outer contour.
The freezing point is used to indicate that a T-Snake has found an equilibrium
position. In the following algorithm we call a Dual Snake a list of T-Snakes in which
the first one is an outer contour and the other ones are inner contours. The algorithm
is summarized as Algorithm 1 below.
This method can be efficient for images with artifacts and noise. In [17] it was
improved by using a multigrid approach and a region-growing method based on
the grid. Other improvements can be found in [16]. Once Dual-T-Snakes stops,
a global minimization method [15] or a simple greedy technique can be used to
find the object boundaries in order to complete the segmentation (see Section 7
below). Next, we develop the basic elements of the level set method in order to
present the implicit formulation for dual approaches (Section 6).
5. LEVEL SET
In this section we review some of the details of the level set formulation [8].
The main idea of this method is to represent a deformable surface (or curve) as a
level set, \{ x \in \mathbb{R}^3 \mid G(x) = 0 \}, of an embedding function

G : \mathbb{R}^3 \times \mathbb{R}^+ \to \mathbb{R},   (14)

such that the deformable surface (also called the front in this formulation) at t = 0
is given by a surface S:

S(t = 0) = \{ x \in \mathbb{R}^3 \mid G(x, t = 0) = 0 \}.   (15)
Algorithm 1 Dual-T-Snakes
Put all the dual snakes into a queue.
repeat
Pop out a dual snake from the queue;
Use the energies (equations (12) and (13)) and the affinity function to decide
the snake to be processed;
if all snaxels of that snake are frozen then
repeat
increase the normal force at those with affinity zero
until the snake energy starts decreasing.
Remove that added normal force;
repeat
Evolve the snake
until the temperature of all snaxels falls below the freezing point;
Analyze the Characteristic Diff of the current snake;
if the snake being processed is close to a snake of the other type (in-
ner/outer) then
remove the dual snake from the queue.
else
mount the resulting dual snake(s) and go to the beginning.
end if
end if
until the queue is empty
The next step is to find an Eulerian formulation for the front evolution. Following Sethian [8], let us suppose that the front evolves in the normal direction
with velocity F, which may be a function of the curvature, normal direction, etc.
We need an equation for the evolution of G(x, t), considering that the surface
S is the level set given by:

S(t) = \{ x \in \mathbb{R}^3 \mid G(x, t) = 0 \}.   (16)

Let us take a point, x(t), t \in \mathbb{R}^+, of the propagating front S. From its implicit
definition given above we have

G(x(t), t) = 0.   (17)

Now, we can use the chain rule to compute the time derivative of this expression:

G_t + F |\nabla G| = 0,   (18)
where F = dx/dt is called the speed function. An initial condition, G (x, t = 0),
is required. A straightforward (and expensive) technique to define this function is
to compute a signed-distance function as follows:

G(x, t = 0) = \pm d,   (19)

where d is the distance from x to the surface S(x, t = 0) and the sign indicates whether
the point is interior (−) or exterior (+) to the initial front.
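For a simple initial front the signed distance can be written in closed form. A sketch for a circular front, negative inside and positive outside as above (the circular geometry is our choice for illustration):

```python
import numpy as np

def signed_distance_circle(shape, center, radius):
    """Signed-distance initialization of Eq. (19) for a circular front:
    distance to the center minus the radius, on a (rows, cols) grid."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    return np.hypot(xx - center[0], yy - center[1]) - radius
```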
Finite-difference schemes, based on a uniform grid, can be used to solve Eq.
(18). The same entropy condition of T-Surfaces (once a grid node is burnt it stays
burnt) is incorporated in order to drive the model to the desired solution (in fact,
T-Surfaces was inspired by the level set model [24]).
In this higher-dimensional formulation, topological changes can be efficiently
implemented. Numerical schemes are stable, and the model is general in the sense
that the same formulation holds for 2D and 3D, as well as for merges and splits. In
addition, the surface geometry is easily computed. For example, the front normal
(\vec{n}) and curvature (K) are given, respectively, by:

\vec{n} = \frac{\nabla G(x, t)}{\| \nabla G(x, t) \|}, \qquad K = \nabla \cdot \left( \frac{\nabla G(x, t)}{\| \nabla G(x, t) \|} \right),   (20)

where the gradient and the divergence (\nabla \cdot) are computed with respect to x.
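On a discrete grid these quantities follow from finite differences of G. A sketch (our illustration, using central differences via `numpy.gradient`):

```python
import numpy as np

def front_geometry(G):
    """Normal field and curvature of the level sets of G, Eq. (20)."""
    Gy, Gx = np.gradient(G)                  # derivatives in (row, column) order
    mag = np.sqrt(Gx**2 + Gy**2) + 1e-12     # guard against division by zero
    nx, ny = Gx / mag, Gy / mag              # unit normal n = grad G / |grad G|
    K = np.gradient(nx, axis=1) + np.gradient(ny, axis=0)   # K = div(n)
    return nx, ny, K
```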
The update of the embedding function through expression (18) can be made
cheaper if the narrow-band technique is applied. The key idea of this method
comes from the observation that the front can be moved by updating the level set
function at a small set of points in the neighborhood of the zero set instead of
updating it at all the points on the domain (see [8, 25] for details).
To implement this scheme we need to pre-set a distance ∆d to define the
narrow band. The front can move inside the narrow band until it collides with the
narrow band frontiers. Then, the function G should be reinitialized by treating the
current zero set configuration as the initial one.
This method can also be made cheaper by observing that the grid points that
do not belong to the narrow band can be treated as sign holders [8], following the
same idea of the characteristic function used in Section 3. This observation will
be used in Section 6.2. We are now able to present the level set formulation for
the dual snake model.
6. DUAL-LEVEL-SET APPROACH
To clarify ideas, let us consider Figure 2a, which shows two contours bounding
the search space, and Figure 2b, which depicts a two-dimensional surface whose
zero level set is the union of the two contours just presented.
Figure 2. (a) Dual snakes bounding the search space. (b) Initial function where the zero
level set is the two contours presented.
If the surface evolves such that the two contours get closer, we can obtain the
same behavior of Dual-T-Snakes. That is the key idea of our proposal. In order
to accomplish this goal, we must define a suitable speed function and an efficient
numerical approach. For simplicity, we consider the one-dimensional version of
the problem depicted in Figure 3. In this case, the evolution equation can be written
as
G_t + F \frac{\partial G}{\partial x} = 0.   (21)
The main point is to design the speed function F such that G_t > 0. Therefore, if
we set the sign of F opposite to that of G_x, we attain this goal, since

G_t = - \frac{\partial G}{\partial x} F.   (22)
Hence, the desired behavior can be obtained by distribution of the sign of F shown
in Figure 3.
However, we should note that G_x = 0 at singular points. The values of
G therefore remain constant over these points because G_t becomes null. Thus,
we should be careful about surface evolution near the singular points because
anomalies may happen. One possible way to avoid this problem is to stop front
evolution before it gets close to such a point. Another possibility could be to change
the evolution equation in order to allow G_t \neq 0 over singular points. Such a proposal
implies that the isolines may not be preserved; that is, they also become a function of
time. Thus,
G (x (t) , t) = y (t) ; (23)
G_t + \frac{\partial G}{\partial x} \frac{dx}{dt} = \frac{dy}{dt}.   (24)
Therefore, we should provide a speed function in the y direction:
G_t + \left( \frac{\partial G}{\partial x}, -1 \right) \cdot \left( \frac{dx}{dt}, \frac{dy}{dt} \right) = 0.   (25)
Gt + F |(∇G, −1)| = 0,
G_t + \frac{\partial G}{\partial x} \frac{dx}{dt} + \frac{\partial G}{\partial y} \frac{dy}{dt} = \frac{dz}{dt},   (26)
and, therefore,
G_t + \left( \frac{\partial G}{\partial x}, \frac{\partial G}{\partial y}, -1 \right) \cdot \left( \frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt} \right) = 0.   (27)
One way to deal with these models is through viscous conservation laws. For
example, expression (24) becomes
G_t + \frac{\partial G}{\partial x} \frac{dx}{dt} = \varepsilon \frac{\partial^2 G}{\partial x^2},   (28)
G_t = \frac{1 + \alpha K}{1 + \| \nabla I \|^2} \, \| \nabla G \| + \varepsilon \nabla^2 G + \beta \nabla P \cdot \nabla G,   (34)

where P = - \| \nabla I \|^2, as in Eq. (10). The evolution of the fronts is interdependent
due to the embedding function. After initialization (see Section 6.2), all the fronts
evolve following expression (34). However, once the evolution is halted, we must
evaluate the similarity between the two contours and apply a driving velocity
instead of the driving force of Section 4. We next develop these elements of the
Dual-Level-Set method.
G_{i,j}^{n+1} = G_{i,j}^n - \Delta t \, H \left( \frac{D_{i,j}^{-x} G + D_{i,j}^{+x} G}{2}, \; \frac{D_{i,j}^{-y} G + D_{i,j}^{+y} G}{2} \right)   (36)

+ \frac{\Delta t \, \varepsilon}{2} \left( \frac{D_{i,j}^{+x} G - D_{i,j}^{-x} G}{2} + \frac{D_{i,j}^{+y} G - D_{i,j}^{-y} G}{2} \right),   (37)
where D_{i,j}^{-x} and D_{i,j}^{+x} are first-order finite-difference operators defined by:

D^{-x} G = \frac{G(x, y) - G(x - \Delta x, y)}{\Delta x},   (38)

D^{+x} G = \frac{G(x + \Delta x, y) - G(x, y)}{\Delta x},   (39)

with D_{i,j}^{-y} and D_{i,j}^{+y} defined analogously. This scheme can obviously be extended
to 3D. Besides, we constrain the change of the discrete field G_{i,j} as follows:

\left| G_{i,j}^{n+1} - G_{i,j}^n \right| < \delta,   (40)
where δ is a parameter (set at 1.0 in the tests below) for controlling the rate of change
of G. With such a procedure we get a more stable front evolution and the choice
of parameter values for the numerical scheme becomes easier. In Appendix 1 we
offer an additional discussion about numerical methods for differential equations
like expression (35).
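A compact sketch of one update step in this spirit (our illustration: unit grid spacing, periodic borders via `np.roll`, and a generic Hamiltonian H = speed · |∇G| are assumptions; the sign of the smoothing term follows Eq. (28)):

```python
import numpy as np

def dual_level_set_step(G, speed, dt=0.05, eps=2.0, delta=1.0):
    """One explicit update of the embedding function, Eqs. (36)-(40)."""
    Dmx = G - np.roll(G, 1, axis=1)          # D^{-x} G, Eq. (38)
    Dpx = np.roll(G, -1, axis=1) - G         # D^{+x} G, Eq. (39)
    Dmy = G - np.roll(G, 1, axis=0)
    Dpy = np.roll(G, -1, axis=0) - G
    grad = np.hypot(0.5 * (Dmx + Dpx), 0.5 * (Dmy + Dpy))
    H = speed * grad                         # advection term of Eq. (36)
    smooth = 0.5 * eps * (0.5 * (Dpx - Dmx) + 0.5 * (Dpy - Dmy))   # Eq. (37)
    update = np.clip(dt * (-H + smooth), -delta, delta)  # rate clamp, Eq. (40)
    return G + update
```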
6.2. Initialization
In shape recovery problems, discontinuities in image intensity are significant
and important. In fact, such discontinuities of an image I can be modeled as step
functions like in [28]:
s(x, y) = \sum_{i=1}^{n} a_i S_i(x, y),   (41)
Therefore, an image I is not a differentiable function in the usual sense. The usual
way to address this problem is to work with a coarser-scale version of the image
obtained by convolving this signal with a Gaussian kernel:
Figure 4. Support of a step function given by Eq. (41) with a1 = 1, a2 = −1, and
a3 = 1.
We follow the approach of Sigg and Peikert [30] to calculate the Euclidean distance function using a scan
process. Our scan process is executed by the GPU of the video card, during the
rasterization pipeline phase. The result of the rendering process is returned to the
application as a texture, where the cells in the narrow band are associated with the
pixels generated by each polygon rasterization.
For our dual approach, the narrow band is attractive not only for computational
aspects but also because it allows an efficient way to evaluate similarity between
two contours. In fact, instead of using the criterion of Section 4, we take the
procedure depicted in Figure 6. First, the intersection point is computed (Figure
6a); we then take a neighborhood of this point (Figure 6b) and stop updating the
function G at all the grid points inside it, or we can set the velocity function to zero
for these points. We say that those grid points are frozen.
Figure 6. (a) Narrow bands touching each other. (b) Neighborhood to define similarity
between fronts.
The number of non-frozen grid points inside a narrow band offers a criterion
that indicates that the Dual-Level-Set has found an equilibrium position: if this
number does not change for some iterations, we can be sure that the zero level set
remains unchanged, and we must stop the evolution.
employ a less sensitive model of the initial position of the fronts. To accomplish
this task we can add an extra term to Eq. (34), which has a general form:
the terms of which will be explained next. We must be careful when choosing the
grid points to apply this expression. As in the case of Dual-T-Snakes, the fronts
may be near the boundary somewhere, but far from the target at another point. We
should automatically realize this fact when the fronts stop moving. To do so, we
can use the affinity operator explained in Section 4. Based on this operator, we
can define an affinity function that assigns to a grid point inside the narrow band
a value of 0 or 1: 0 for the grid points most likely to lie away from the target
boundaries and 1 otherwise. As in Dual-T-Snakes, such an affinity operator can be
defined through fuzzy segmentation methods [31–33], image transforms [34, 35],
region statistics [27], etc. For example, we can turn on the Vp term wherever the
affinity function is 0 by using Suri’s proposal [27], given by
Fp = 1 − 2u (x, y) , (45)
where u is the fuzzy membership function that has values within the range [0, 1].
The Fadv vector field can be implemented through an extra force field. In
this work, we propose Gradient Vector Flow (GVF) to define this term. GVF is
a vector diffusion method that was introduced in [36] and can be defined through
the following diffusion reaction equation [37]:
\frac{\partial v}{\partial t} = \nabla \cdot (g \nabla v) + h (\nabla f - v),   (46)

v(x, 0) = \nabla f,
where f is a function of the image gradient (e.g., f = P = -\|\nabla I\|^2), and g(x), h(x)
are nonnegative functions defined on the image domain. The field obtained by
solving the above equation is a smooth version of the original one that tends to be
extended very far from the object boundaries. When used as an external force for
deformable models, it makes the methods less sensitive to initialization [36] and
improves their convergence to the object boundaries. However, it is limited in the
presence of noise and artifacts. Despite this problem, the result can be worthwhile
near the boundaries of interest. We can therefore use the usual external field until
the evolution stops. The application of GVF as an extra velocity can then push
the fronts toward the boundaries. In this work, we simply set Fadv = v, with v
given by the GVF solution (Eq. (46)). Another possibility for combining GVF and
level set models can be found in [38].
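A minimal GVF sketch is given below. It is our illustration: we take g = μ constant and h = |∇f|², the classical choices of Xu and Prince for Eq. (46), and periodic borders, which are assumptions not fixed by the text:

```python
import numpy as np

def lap(a):
    """Five-point Laplacian with periodic borders."""
    return (np.roll(a, 1, 0) + np.roll(a, -1, 0) +
            np.roll(a, 1, 1) + np.roll(a, -1, 1) - 4.0 * a)

def gvf(f, mu=0.2, dt=0.2, iters=200):
    """Gradient Vector Flow field of Eq. (46) with g = mu, h = |grad f|^2."""
    fy, fx = np.gradient(f)
    h = fx**2 + fy**2
    u, v = fx.copy(), fy.copy()              # initial condition v(x, 0) = grad f
    for _ in range(iters):
        u = u + dt * (mu * lap(u) - h * (u - fx))
        v = v + dt * (mu * lap(v) - h * (v - fy))
    return u, v
```

The diffusion term spreads the edge-map gradients into homogeneous regions, which is the property that makes the field useful as a driving velocity far from the boundaries.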
We can also implement the driving velocity based on global shape information.
In this case, there are training data that offer the prior shape information [39].
First, we must find the shape and pose parameters, represented by vectors α and β,
R = \frac{1}{n} M M^T.   (49)
We then apply singular-value decomposition (SVD) to represent the covariance
matrix R as

R = U \Sigma U^T,

where U is a matrix whose column vectors represent the set of orthogonal modes
of shape variation, arranged in decreasing order of the corresponding eigenvalues,
and \Sigma is a diagonal matrix of corresponding singular values. Let U_k be the matrix
composed of the first k columns of U (the k principal components). Thus, given
a field u, we can compute the shape parameters \alpha by [39]

\alpha = U_k^T (u - \mu).   (50)
Using the Gaussian distribution, the prior shape model can be computed as

P(\alpha) = \frac{1}{\sqrt{(2\pi)^k \, |\Sigma_k|}} \exp\left( -\frac{1}{2} \alpha^T \Sigma_k^{-1} \alpha \right),   (51)

where \Sigma_k contains the first k rows and columns of \Sigma.
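The pipeline of Eqs. (49)–(51) is a standard PCA construction. A sketch (our illustration; with a diagonal Σ_k the quadratic form reduces to a weighted sum, and the synthetic training fields in the test are hypothetical):

```python
import numpy as np

def shape_model(fields, k):
    """Build the PCA shape model of Eqs. (49)-(51).
    fields: (n, d) array of flattened training embedding functions."""
    mu = fields.mean(axis=0)
    M = (fields - mu).T                      # d x n matrix of centered samples
    R = (M @ M.T) / fields.shape[0]          # covariance matrix, Eq. (49)
    U, s, _ = np.linalg.svd(R)               # R = U Sigma U^T
    return mu, U[:, :k], s[:k]

def shape_params(u, mu, Uk):
    """Shape parameters of a field u, Eq. (50)."""
    return Uk.T @ (u - mu)

def shape_prior(alpha, sk):
    """Gaussian shape prior of Eq. (51) with diagonal Sigma_k."""
    k = len(alpha)
    norm = np.sqrt((2.0 * np.pi)**k * np.prod(sk))
    return np.exp(-0.5 * np.sum(alpha**2 / sk)) / norm
```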
This expression is used in [39] to compute the optimized G∗ , which is defined
as
G∗ (t) = arg max P (G∗ |G (t) , ∇I) , (52)
where G is the embedding function at time t and ∇I is the image gradient. With
G∗ we can define Vshape as
7. SEGMENTATION FRAMEWORK
Once the two fronts reach an equilibrium state, we can be sure that the
boundary of the object we are trying to identify is located inside the two half narrow
bands associated with the inside and outside fronts. Therefore, the Dual-Level-Set
method can be combined with a search-based technique (dynamic programming or a
greedy technique), resulting in a boundary extraction procedure composed of four
steps [15, 16, 41]: (a) the user initializes the inner/outer fronts; (b) computation
of the affinity operator and affinity function; (c) application of the Dual-Level-Set
algorithm; and (d) finding the final boundaries using a search-based approach.
The same methodology was applied for the Dual-T-Snakes method [17]. As
the boundary is enclosed by fronts obtained at the end of the Dual-Level-Set
method (see Figure 7), the Viterbi algorithm [13, 18] may be suitable. In this
algorithm the search space is constructed by discretizing each curve into N points
and establishing a match between them. Each pair of points is then connected by a
segment that is subdivided into M points. This process provides a discrete search
space with N × M points (Figure 7).
Figure 7. Search space obtained through a matching between inner and outer snakes.
If E(p) < E(v_i), then v_i ← p. The algorithm can then be summarized as shown
below.
The snake position is updated following this procedure for all snaxels. A
termination condition is achieved when snaxels no longer move (equilibrium po-
sition). Greedy algorithms have been used in the context of snake models in order
to improve computational efficiency [41, 42]. The proposed algorithm is simple to
implement and efficient in extracting the desired boundary when the dual model
stops.
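The dynamic-programming search over the N × M space can be sketched as follows. This is our illustration: the costs stand for the image energies of the candidate points, and the smoothness penalty proportional to the index jump between adjacent segments is a hypothetical choice to keep the extracted curve coherent (an open-curve variant is shown):

```python
import numpy as np

def viterbi_boundary(cost, smooth=1.0):
    """Minimal Viterbi search: cost[i, j] is the energy of candidate j on
    segment i; adjacent segments pay smooth * |j - j'|. Returns one
    candidate index per segment minimizing the total energy."""
    N, M = cost.shape
    D = np.zeros((N, M))                     # best cumulative energy
    back = np.zeros((N, M), dtype=int)       # backpointers
    D[0] = cost[0]
    idx = np.arange(M)
    for i in range(1, N):
        # trans[j, jp]: energy of reaching candidate j from previous jp
        trans = D[i - 1][None, :] + smooth * np.abs(idx[:, None] - idx[None, :])
        back[i] = np.argmin(trans, axis=1)
        D[i] = cost[i] + trans[idx, back[i]]
    path = np.empty(N, dtype=int)
    path[-1] = int(np.argmin(D[-1]))
    for i in range(N - 1, 0, -1):
        path[i - 1] = back[i, path[i]]
    return path
```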
8. DUAL-LEVEL-SET RESULTS
while (10 iterations in this case). Therefore, fronts obtain velocity and Figure 9b
shows their new positions 10 iterations later. The same happens in Figures 9c–d.
This example emphasizes an ability to avoid local minima with the Dual-Level-Set
method. The total number of iteration steps was 1618 in this example, and the
image resolution is 256 × 256 pixels.
The result using the Dual-Level-Set method is shown in Figure 9d. We can
see that the method achieved the desired target, which means that it was able to
obtain two contours close to the object boundary that bound the search space. An
important aspect to be considered is that the width of the search space remains
almost the same over its extension. This is interesting because it makes computation
of the search space with the Viterbi algorithm much easier. However, the greedy
model can also be efficient because we are very close to the target. This will be our
choice, which is presented next. Figure 10 shows the final result depicted by the
red curve. We can observe that the boundary extraction framework achieves the
desired goal. Initialization of the greedy snake model was performed following
the presentation in Section 7.
Figure 9. (a) Fronts stop moving at iteration 453 due to the low-velocity criterion. (b)
Front position at iteration 463, when the stopping term is turned on. (c) Configuration at
iteration 1347. (d) Final front positions.
the ribbon, and a search-based model, like the Viterbi or the greedy one, can
then be applied to yield the final result. In Figure 11b we depict the bandpass
version of the original image as well as the initial fronts for the dual method. In
this case, we apply the affinity operator based on the threshold that characterizes
the boundaries (T = 150): if p is a non-frozen grid point inside a narrow band that
satisfies I (p) > 150, then we set β = 0.0 for a while (10 iterations in this case).
In this case we also observed that the fronts stop moving, far from each other
in some places, but the dual approach was able to escape the local minimum
and extract the ribbon. Figure 12 shows another example using just a low-pass
Figure 10. Final result obtained with the greedy snake model. See attached CD for color
version.
Figure 11. (a) Image and initial fronts. (b) Bandpass-filtered image and initial fronts. (c)
Fronts stop moving at iteration 534 due to the low-velocity criterion. (d) Dual-Level-Set
result obtained after 641 iteration steps. See attached CD for color version.
LEVEL SET FORMULATION FOR DUAL SNAKE MODELS 219
filter to smooth the same image. In this case, we observed an increase in the
number of iterations compared with the previous case. This happens because
the bandpass filter can remove some textures inside the cell, whereas the Gaussian
filter cannot. The Dual-Level-Set parameters are α = 0.1, ε = 2.0, β = 0.1,
T = 150, ∆t = 0.05.
Figure 12. (a) Position of the fronts at iteration 200. (b) Iteration 400. (c) Fronts at
iteration 600. (d) Dual-Level-Set result obtained after 926 iteration steps. Reprinted
with permission from Giraldi GA, Strauss E, Oliveira AF. 2003. Dual-T-snakes model for
medical imaging segmentation. Pattern Recognit Lett 24(7):993–1003. Copyright 2003,
© Elsevier.
Figure 13a shows initialization for the greedy model, and Figure 13b depicts
the final result.
Figure 13. (a) Initialization of the greedy snake model. (b) Final result. Adapted with
permission from Giraldi GA, Strauss E, Oliveira AF. 2003. Dual-T-snakes model for medical
imaging segmentation. Pattern Recognit Lett 24(7):993–1003. Copyright 2003, © Elsevier.
Figure 14. (a) Original image with a cell nucleolus. (b) Initial fronts. Adapted with
permission from Giraldi GA, Strauss E, Oliveira AF. 2003. Dual-T-snakes model for medical
imaging segmentation. Pattern Recognit Lett 24(7):993–1003. Copyright 2003, © Elsevier.
Figure 15. (a) Outer and inner fronts just before the time point at which they undergo
topological changes. (b) Fronts right after the topological changes. Adapted with permission
from Giraldi GA, Strauss E, Oliveira AF. 2003. Dual-T-snakes model for medical imaging
segmentation. Pattern Recognit Lett 24(7):993–1003. Copyright 2003, © Elsevier.
Figure 16 shows three evolution steps of the model as well as the dual result,
shown in Figure 16d. In Figure 16b the fronts stop evolving, and the driving
velocity is then applied. However, unlike the previous examples, we adopt the
Figure 16. (a) Position of the fronts after 500 iterations. (b) Fronts stop moving due
to artifacts. (c) Result after turning on the driving velocity for 20 iterations. (d) Dual-
Level-Set result. Adapted with permission from Giraldi GA, Strauss E, Oliveira AF. 2003.
Dual-T-snakes model for medical imaging segmentation. Pattern Recognit Lett 24(7):993–
1003. Copyright 2003, © Elsevier.
following methodology. We first take a radial vector field, depicted in Figure 17,
whose center coincides with that of the inner front in Figure 14b. We use this field
to define the driving velocity given by

V_drive = |∇P · n|,   (61)

where n denotes the front normal.
The key idea behind this field is to exploit the user initialization to define a vector
field that can be used to perturb the fronts when they stop far from the
target. The main point we want to highlight is that, even with such an ad-hoc
procedure, the Dual-Level-Set method can escape a local minimum and reach
the target. Figure 16c shows the result obtained when the driving velocity given
by expression (61) is turned on for 20 iterations. We observe that the fronts are
less smooth now, but they are closer to undergoing a topological change and then
passing over the artifact, as desired. Certainly, a more efficient driving force
could be used to avoid the loss of smoothness.
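The radial driving velocity of expression (61) can be sketched on a grid as follows. This is an illustrative implementation, not the authors' code: the radial field plays the role of ∇P, and the front normal n is estimated from the gradient of a given embedding function G:

```python
import numpy as np

def radial_driving_velocity(G, center):
    """V_drive = |grad(P) . n|, where grad(P) is a radial unit field
    emanating from `center` and n = grad(G)/|grad(G)| approximates the
    front normal of the embedding function G (illustrative sketch)."""
    yy, xx = np.indices(G.shape, dtype=float)
    rad = np.stack([yy - center[0], xx - center[1]])
    rad /= np.linalg.norm(rad, axis=0) + 1e-12      # radial unit vectors
    gy, gx = np.gradient(G)
    norm = np.sqrt(gx ** 2 + gy ** 2) + 1e-12
    n = np.stack([gy / norm, gx / norm])            # estimated front normal
    return np.abs((rad * n).sum(axis=0))            # |grad(P) . n|
```

When G is a signed distance to circles centered at `center`, the normal is radial and the driving velocity is close to 1 everywhere away from the center, as expected.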
Figure 18a shows initialization for the greedy model, and Figure 18b depicts
the extracted boundary. We can see that the method was able to capture the details
of the cell boundary.
9. DISCUSSION
Figure 18. (a) Initialization of the greedy snake model. (b) Extracted boundary. Adapted
with permission from Giraldi GA, Strauss E, Oliveira AF. 2003. Dual-T-snakes model for
medical imaging segmentation. Pattern Recognit Lett 24(7):993–1003. Copyright 2003, © Elsevier.
The key point is that there is a coupling, a thickness constraint, between the WM-GM
and GM-CSF surfaces. In fact, the cortical mantle has nearly constant thickness
[43, 44]. Zeng et al. proposed a level set method that begins with two embedded
surfaces in the form of concentric spheres. The inner and outer surfaces are then
evolved, each driven by its own image-derived information, while maintaining the
coupling between them through a thickness constraint. The model evolution is based
on the following system:
∂G_in/∂t + F_in |∇G_in| = 0,   (62)
∂G_out/∂t + F_out |∇G_out| = 0,   (63)
where G_in and G_out are the embedding functions of the inner and outer surfaces,
respectively. In this model, coupling between the two surfaces is obtained through
the speed terms F_in and F_out, which are based on the distance between the two
surfaces and on the propagation forces. Note that there are two embedding functions,
whereas in the Dual-Level-Set method there is only one. In addition, we can implement
coupled constrained approaches, based on the dual method, by incorporating the
constant-thickness constraint into the termination condition.
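The thickness coupling can be illustrated schematically: the image-derived speed is attenuated whenever the local inter-surface distance leaves an admissible range. The gating function below is hypothetical (Zeng et al.'s actual speed design is more elaborate); it only shows the mechanism:

```python
def coupled_speed(f_image, thickness, t_min=2.0, t_max=5.0, margin=0.5):
    """Gate an image-derived propagation speed by a thickness constraint:
    full speed when the local inter-surface distance lies well inside
    [t_min, t_max], zero outside, with a linear taper of width `margin`
    near the bounds (illustrative only)."""
    if thickness < t_min or thickness > t_max:
        return 0.0
    taper = min(1.0,
                (thickness - t_min) / margin,
                (t_max - thickness) / margin)
    return f_image * taper
```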
The Dual-Level-Set method is also a topologically adaptable approach, just
like its counterpart, the Dual-T-Snakes model, discussed in Section 4. Therefore,
we must provide some comparison between these methods. The characteristic
function plays a similar role to that of the embedding function in the sense that
11. ACKNOWLEDGMENTS
We would like to acknowledge the Brazilian agency for scientific development
(CNPq) and PCI-LNCC for financial support of this work.
12. APPENDIX
THEORETICAL PERSPECTIVES
In this section we present some theoretical elements not covered in the tra-
ditional level set literature. Our goal is to point out problems inherent in the
numerical analysis and a possible direction to be explored for level set theory.
We therefore start with numerical considerations, in the context of weak solutions,
for a simplified version of our problem. Then two mathematical tools, generalized
solutions and relaxation models for front propagation problems, are discussed
within the context of interest.
Expression (35) is a second-order partial differential equation (in space) that
depends on the parameter ε. If we add boundary conditions, we obtain a parabolic
problem that must be analyzed with some care because complex behaviors, from
the numerical viewpoint, may arise. Such an analysis is absent from the standard
level set literature [25, 27], possibly because such difficulties seem to be reduced
when using the narrow-band approach.
A simple example of the static problem, F(x, G, G_x, G_xx) = 0, is useful
for understanding the kind of complex behaviors that may occur. We shall thus
consider the following singular diffusion problem in one dimension, defined on the
unit interval:

σ²G − ε²G_xx = 0,   (64)
G(0) = 0,   (65)
G(1) = 1.   (66)
The exact solution to this problem can be written as

G(x) = [exp(−σx/ε) − exp(σx/ε)] / [exp(−σ/ε) − exp(σ/ε)],   (67)
which is well behaved for ε > 0. Numerically, however, things are not so simple.
Finite-difference schemes are particular cases of weak solutions, which can
be stated simply as follows. Consider the space H¹(0, 1), composed of
functions with a square-integrable first derivative, and the following spaces
(see [46] and references therein):
It can be shown that the solutions of Eq. (70) and Eq. (64), with conditions (65)–(66),
are the same. However, formulation (70) allows the development of approximation
schemes that are well known in the field of finite-element methods [47]. Thus, let us
consider a partition of the unit interval, 0 = x_0 < x_1 < ... < x_nel = 1, where nel is the
total number of elements Ω_e = (x_{e−1}, x_e), e = 1, ..., nel. We also have a mesh
parameter, h = max |Ω_e|, e = 1, ..., nel. Then consider the set of all polynomials
of degree not greater than k, and denote its restriction to Ω_e by P_k(Ω_e). Let
α = σ²h² / (6ε²).   (74)
To see this, consider a uniform mesh, and let G_I denote the nodal value of the
approximate solution at an arbitrary interior node I. Employing piecewise-linear
shape functions, i.e., k = 1, the Ith equation derived from expression (73) has
solutions of the form

G_I = c r^I,   (76)

where r is a root of the associated characteristic equation, with

r_1 = [(1 + 2α) + √((1 + 2α)² − (1 − α)²)] / (1 − α),   (78)
r_2 = 1 / r_1.   (79)
We can note that both roots are positive if α < 1 and both are negative for α > 1.
Therefore, by (76), for α > 1 the difference solution will oscillate from one node to
the next, which demonstrates a spurious behavior of the approximate solution as
compared to the smooth exact solution in expression (67).
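The behavior of the roots r_1 and r_2 can be checked numerically. A small sketch reproducing the argument above, using the characteristic equation (1 − α)r² − 2(1 + 2α)r + (1 − α) = 0 implied by (78)–(79):

```python
import math

def roots(alpha):
    """Roots of (1 - a) r^2 - 2 (1 + 2a) r + (1 - a) = 0, which govern
    the nodal solution G_I = c r^I of the piecewise-linear FEM
    discretization of the singular diffusion problem (64)-(66)."""
    disc = math.sqrt((1 + 2 * alpha) ** 2 - (1 - alpha) ** 2)
    r1 = (1 + 2 * alpha + disc) / (1 - alpha)
    return r1, 1.0 / r1

# alpha < 1: both roots positive (smooth nodal profile);
# alpha > 1: both roots negative (node-to-node oscillation).
```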
The main point is to check the effect of the parameter α for level set methods in
general, and for the Dual-Level-Set method in particular, even with the narrow-band
approach. In addition, if we take the embedding function G as the image field itself,
as in [5], we obtain an approach that combines color anisotropic diffusion and shock
filtering, leading to a possible image flow for segmentation/simplification (see also
[48]). In this case the narrow-band technique is not useful, and so it is not safe to
apply the method without the above discussion.
Another possibility to be investigated would be to work with generalized
solutions of the level set problem. First, we must observe that level set Eq. (34)
has the following general form:

G_t + F(x, G, D_x G, D_x² G) = 0,   (80)

where D_x G and D_x² G represent the gradient and Hessian of G, respectively. For the
dual model we must have

F(x, G, D_x G, D_x² G) ≤ 0.   (81)
D_x² G(x̂) ≤ X. Thus, if condition (82) holds, we have

F(x̂, G(x̂), p, X) ≤ F(x̂, G(x̂), D_x G(x̂), D_x² G(x̂)) ≤ 0;

and, therefore,

F(x̂, G(x̂), p, X) ≤ 0,   (85)
whenever (84) holds. We must observe that this expression does not depend on
the derivatives of G. Thus, it points toward a definition of generalized solutions
for Eq. (81). In fact, roughly, we can say that G is a viscosity solution of Eq. (80)
if, for all smooth test functions ϕ:

1) If G − ϕ has a local maximum at a point (x, t), then
ϕ_t + F(x, G, D_x ϕ, D_x² ϕ) ≤ 0;

2) If G − ϕ has a local minimum at a point (x, t), then
ϕ_t + F(x, G, D_x ϕ, D_x² ϕ) ≥ 0.
Although this theory is outside the scope of this chapter, we must emphasize that
viscosity solutions theory is intimately connected with numerical analysis. It
provides mathematical tools for convergence analysis and indicates how to build
discretization methods or schemes for classical and more general boundary
conditions (see [49] and the references therein).
Another interesting point is the application of relaxation models to front
propagation problems. For instance, let us consider a Cauchy problem for the
following simplified version of the Hamilton-Jacobi equation in ℝⁿ:

G_t + H(∇G) = 0.   (86)

Setting p = ∇G and differentiating yields

p_t + ∇H(p) = 0.   (88)

The corresponding relaxation system is

p_t + ∇w = 0,   (91)
w_t + a ∇·p = −(1/ε)(w − H(p)),   (92)
G_t + w = 0,   (93)
where w is an auxiliary function and ε > 0 is the relaxation time. Also, the
constant a must satisfy the stability condition

a > |∇_p H(p)|².   (94)
By a Chapman-Enskog expansion it can be shown that, if the O(ε²) terms are
ignored, expressions (91)–(93) yield

G_t + H(∇G) = ε ( a ∆G − ∇_p H(p)ᵀ D²G ∇_p H(p) ).   (95)

For H(∇G) = |∇G| this becomes

G_t + |∇G| = ε Tr[ (I − (∇G ⊗ ∇G)/|∇G|²) D²G ],   (96)
where we choose a = 1. This equation is the level set formulation of the propa-
gation of a front S(t) with normal velocity

F = −1 − εk,   (97)

where the curvature is

k = ∇·n = −(1/|∇G|) Tr[ (I − (∇G ⊗ ∇G)/|∇G|²) D²G ].   (98)
In [50], analytical comparisons between models (96) and (31) are presented, as
well as numerical schemes for solving system (91)–(93). Compared to the second-
order, singular, and degenerate level set Eq. (96), relaxation system (91)–(93) is a
first-order semilinear hyperbolic system without singularity. Moreover, it retains
the advantages of the level set equation, such as the ability to handle complicated
geometries and changes in topology, and it captures the curvature dependency
automatically, without directly solving the singular curvature term numerically.
Since the system is semilinear, it can be solved numerically in a strikingly simple
way, avoiding Riemann solvers.
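As a rough illustration of that simplicity, a 1D analogue of system (91)–(93) with H(p) = |p| can be advanced with standard Lax-Friedrichs differencing and a pointwise treatment of the stiff source term. This is a sketch under many simplifications, not the scheme of [50]:

```python
import numpy as np

def relax_step(G, p, w, dx, dt, eps=0.01, a=1.0):
    """One explicit step of the 1D relaxation system
        p_t + w_x = 0,
        w_t + a p_x = -(w - |p|)/eps,
        G_t + w = 0,
    on a periodic grid, using Lax-Friedrichs fluxes and an implicit
    pointwise relaxation toward the equilibrium w = |p|."""
    def lf(u, flux):
        # Lax-Friedrichs update for u_t + flux_x = 0
        um, up = np.roll(u, 1), np.roll(u, -1)
        fm, fp = np.roll(flux, 1), np.roll(flux, -1)
        return 0.5 * (um + up) - dt / (2 * dx) * (fp - fm)
    p_new = lf(p, w)
    w_new = lf(w, a * p)
    # stiff source handled implicitly, point by point (no Riemann solver)
    w_new = (w_new + dt / eps * np.abs(p_new)) / (1 + dt / eps)
    return G - dt * w_new, p_new, w_new
```

Because the system is semilinear, each substep is just differencing plus a scalar relaxation; no characteristic decomposition is needed.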
13. REFERENCES
1. Kass M, Witkin A, Terzopoulos D. 1988. Snakes: active contour models. Int J Comput Vision
1(4):321–331.
2. McInerney T, Terzopoulos D. 1996. Deformable models in medical image analysis: a survey.
Med Image Anal 1(2): 91–108.
3. Cohen LD, Cohen I. 1993. Finite-element methods for active contour models and balloons for
2D and 3D images. IEEE Trans Pattern Anal Machine Intell 15(11): 1131–1147.
4. Niessen WJ, ter Haar Romery BM, Viergever MA. 1998. Geodesic deformable models for medical
image analysis. IEEE Trans Med Imaging 17(4):634–641.
5. Sapiro G. 1997. Color snakes. Comput Vision Image Understand 68(2):247–253.
6. Black A, Yuille A, eds. 1993. Active vision. Cambridge: MIT Press.
7. Caselles V, Kimmel R, Sapiro G. 1997. Geodesic active contours. Int J Comput Vision 22(1):61–
79.
8. Malladi R, Sethian JA, Vemuri BC. 1995. Shape modeling with front propagation: a level set
approach. IEEE Trans Pattern Anal Machine Intell 17(2):158–175.
9. McInerney T, Terzopoulos D. 1999. Topology adaptive deformable surfaces for medical image
volume segmentation. IEEE Trans Med Imaging 18(10):840–850.
10. Giraldi GA, Oliveira AF. 1999. Convexity analysis of snake models based on Hamiltonian formu-
lation. Technical report, Universidade Federal do Rio de Janeiro, Departamento de Engenharia de
Sistemas e Computação, http://www.arxiv.org/abs/cs.CV/0504031.
11. Leymarie F, Levine MD. 1993. Tracking deformable objects in the plane using an active contour
model. IEEE Trans Pattern Anal Machine Intell 15(6):617–634.
12. Amini AA, Weymouth TE, Jain RC. 1990. Using dynamic programming for solving variational
problems in vision. IEEE Trans Pattern Anal Machine Intell 12(9):855–867.
13. Giraldi GA, Vasconcelos N, Strauss E, Oliveira AF. 2001. Dual and topologically adaptable snakes
and initialization of deformable models. Technical report, National Laboratory for Scientific
Computation, Petropolis, Brazil, http://www.lncc.br/proj-pesq/relpesq-01.htm.
14. Bamford P, Lovell B. 1997. A two-stage scene segmentation scheme for the automatic collection
of cervical cell images. In Proceedings of TENCON ’97: IEEE region 10 annual conference on
speech and image technologies for computing and telecommunications, pp. 683–686. Washington,
DC: IEEE.
15. Giraldi GA, Strauss E, Oliveira AF. 2000. A boundary extraction method based on Dual-T-snakes
and dynamic programming. In Proceedings of the IEEE conference on computer vision and
pattern recognition (CVPR ’2000), Vol. 1, pp. 44–49. Washington, DC: IEEE.
16. Giraldi GA, Strauss E, Oliveira AF. 2001. Improving the dual-t-snakes model. In Proceed-
ings of the international symposium on computer graphics, image processing and vision (SIB-
GRAPI’2001), Florianópolis, Brazil, October 15–18, pp. 346–353.
17. Giraldi GA, Strauss E, Oliveira AF. 2003. Dual-T-snakes model for medical imaging segmentation.
Pattern Recognit Lett 24(7):993–1003.
18. Gunn SR. 1996. Dual active contour models for image feature extraction. PhD dissertation,
University of Southampton.
19. Bamford P, Lovell B. 1995. Robust cell nucleus segmentation using a Viterbi search based active
contour. In Proceedings of the fifth international conference on computer vision (ICCV '95),
Cambridge, MA, USA, pp. 840–845.
20. Gunn SR, Nixon MS. 1997. A robust snake implementation: a dual active contour. IEEE Trans
Pattern Anal Machine Intell 19(1):63–68.
21. Xu G, Segawa E, Tsuji S. 1994. Robust active contours with insensitive parameters. Pattern
Recognit 27(7):879–884.
22. Gunn SR, Nixon MS. 1996. Snake head boundary extraction using global and local energy
minimisation. Proc IEEE Int Conf Pattern Recognit 2:581–585.
23. McInerney T, Terzopoulos D. 2000. T-snakes: topology adaptive snakes. Med Image Anal 4(2):73–
91.
24. McInerney TJ. 1997. Topologically adaptable deformable models for medical image analysis.
PhD dissertation. University of Toronto.
25. Sethian JA. 1996. Level set methods: evolving interfaces in geometry, fluid mechanics, computer
vision and materials sciences. Cambridge: Cambridge UP.
26. Osher S, Sethian JA. 1988. Fronts propagating with curvature-dependent speed: algorithms based
on Hamilton-Jacobi formulations. J Comput Phys 79:12–49.
27. Suri JS, Liu K, Singh S, Laxminarayan S, Zeng X, Reden L. 2002. Shape recovery algorithms
using level sets in 2d/3d medical imagery: a state-of-the-art review. IEEE Trans Inf Technol
Biomed 6(1):8–28.
28. You Y-L, Xu W, Tannenbaum A, Kaveh M. 1996. Behavioral analysis of anisotropic diffusion in
image processing. IEEE Trans Image Process 5(11):1539–1553.
29. Mauch S. 2000. A fast algorithm for computing the closest point and distance
function. Unpublished technical report. California Institute of Technology, September.
http://www.acm.caltech.edu/ seanm/projects/cpt/cpt.pdf.
30. Sigg C, Peikert R. 2005. Optimized bounding polyhedra for GPU-based distance transform. In
Scientific visualization: the visual extraction of knowledge from data, pp. 65–78. Ed GP Bonneau,
T Ertl, GM Nielson. New York: Springer.
31. Udupa J, Samarasekera S. 1996. Fuzzy connectedness and object definition: theory, algorithms
and applications in image segmentation. Graphical Models Image Process 58(3):246–261.
32. Bezdek JC, Hall LO. 1993. Review of MR image segmentation techniques using pattern recog-
nition. Med Phys 20(4):1033–1048.
33. Xu C, Pham D, Rettmann M, Yu D, Prince J. 1999. Reconstruction of the human cerebral cortex
from magnetic resonance images. IEEE Trans Med Imaging 18(6):467–480.
34. Pohle R, Behlau T, Toennies KD. 2003. Segmentation of 3D medical image data sets with a
combination of region based initial segmentation and active surfaces. In Progress in biomedical
optics and imaging. Proc SPIE Med Imaging 5203:1225–1231.
35. Falcão AX, da Cunha BS, Lotufo RA. 2001. Design of connected operators using the image
foresting transform. Proc SPIE Med Imaging 4322:468–479.
36. Xu C, Prince J. 1998. Snakes, shapes, and gradient vector flow. IEEE Trans Image Process
7(3):359–369.
37. Xu C, Prince JL. 2000. Global optimality of gradient vector flow. In Proceedings of the 34th annual
conference on information sciences and systems (CISS’00). iacl.ece.jhu.edu/pubs/p125c.ps.gz.
38. Hang X, Greenberg NL, Thomas JD. 2002. A geometric deformable model for echocardiographic
image segmentation. Comput Cardiol 29:77–80.
39. Leventon ME, Eric W, Grimson L, Faugeras OD. 2000. Statistical shape influence in geodesic
active contours. Proc IEEE Int Conf Pattern Recognit 1:316–323.
40. Jain AK. 1989. Fundamentals of digital image processing. Englewood Cliffs, NJ: Prentice-Hall.
41. Poon CS, Braun M. 1997. Image segmentation by a deformable contour model incorporating
region analysis. Phys Med Biol 42:1833–1841.
42. Williams DJ, Shah M. 1992. A fast algorithm for active contours and curvature estimation. Comput
Vision Image Understand 55(1):14–26.
43. Zeng X, Staib LH, Schultz RT, Duncan JS. 1999. Segmentation and measurement of the cortex
from 3d MR images. IEEE Trans Med Imaging 18:100–111.
44. Zeng X, Staib LH, Schultz RT, Duncan JS. 1999. Segmentation and measurement of the cortex
from 3d MR images using coupled-surfaces propagation. IEEE Trans Med Imaging 18(10):927–
937.
45. Cates JE, Lefohn AE, Whitaker RT. 2004. GIST: an interactive, GPU-based level set segmenta-
tion tool for 3D medical images. http://www.cs.utah.edu/research/techreports/2004/pdf/UUCS-
04-007.pdf.
46. Franca LP, Madureira AL, Valentin F. 2003. Towards multiscale functions: Enriching finite el-
ement spaces with local but not bubble-like functions. Technical Report no. 26/2003, National
Laboratory for Scientific Computing, Petropolis, Brazil.
47. Hughes TJR. 1987. The finite element method: linear static and dynamic finite element analysis.
Englewood Cliffs: Prentice-Hall.
48. Sapiro G. 1995. Color snakes. Technical report, Hewlett-Packard Laboratories
([email protected]).
49. Crandall MG, Ishii H, Lions P-L. 1992. User’s guide to viscosity solutions of second-order partial
differential equations. Bull Am Math Soc 27:1–67.
50. Jin S, Katsoulakis MA, Xin Z. 1999. Relaxation schemes for curvature-dependent front propa-
gation. Commun Pure Appl Math 52(12):1557–1615.
8
A wide range of computer vision applications — such as distance field computation, shape
from shading, shape representation, skeletonization, and optimal path planning — require
an accurate solution of a particular Hamilton-Jacobi (HJ) equation, known as the eikonal
equation. Although the fast marching method (FMM) is the most stable and consistent
method among existing techniques for solving such an equation, it suffers from a large
numerical error along diagonal directions, and its computational complexity is not optimal.
In this chapter, we propose an improved version of the FMM that is both highly accurate
and computationally efficient for Cartesian domains. The new method, called the multi-
stencils fast marching (MSFM) method, computes a solution at each grid point by
solving the eikonal equation along several stencils and then picking the solution that satisfies
the fast marching causality relationship. The stencils are centered at each grid point x and
cover all of its nearest neighbors. In a 2D space, two stencils cover the eight neighbors of
x, while in a 3D space six stencils cover its 26 neighbors. For those stencils that do not
coincide with the natural coordinate system, the eikonal equation is derived using directional
derivatives and then solved using a higher-order finite-difference scheme.
1. INTRODUCTION
Address all correspondence to: Dr. Aly A. Farag, Professor of Electrical and Computer Engineering,
University of Louisville, CVIP Lab, Room 412, Lutz Hall, 2301 South 3rd Street, Louisville, KY
40208, USA. Phone: (502) 852-7510, Fax: (502) 852-1580. [email protected].
236 M. SABRY HASSOUNA and ALY A. FARAG
applications, such as computing distance fields from one or more source points,
shape from shading, shape offsetting, optimal path planning, and skeletonization.
Several methods have been proposed to solve the eikonal equation [1–6]. The
most stable and consistent among these techniques is the fast marching method
(FMM), which is applicable to both Cartesian domains [5, 7] and triangulated
surfaces [8, 9]. The FMM combines entropy-satisfying upwind schemes and a fast
sorting technique to find the solution in a one-pass algorithm. Unfortunately, the
FMM still has two limitations: (1) at each grid point x, the method employs a
4-point stencil that exploits only the information of the four neighbors adjacent to
x, ignoring the information provided by diagonal points; as a consequence, the
FMM suffers from a large numerical error along diagonal directions. (2) The
computational complexity of the method is not optimal because it stores the
solutions in a narrow band implemented as a sorted heap data structure. The
complexity of maintaining the heap is O(log n), where n is the total number of
grid points; therefore, the total complexity of the method is O(n log n).
A few methods have been proposed to improve the FMM so as to make it either
computationally efficient [10, 11] or more accurate [7, 12]. In [7], the higher-
accuracy fast marching method was introduced to improve the accuracy of the
FMM by approximating the gradient with a second-order finite-difference scheme
whenever the arrival times of the neighbor points are available, reverting to a
first-order approximation otherwise.
Danielsson and Lin [12] improved the accuracy of the FMM by introducing the
shifted-grid fast marching method, a modified version of the FMM. The main idea
of this algorithm is to sample the cost function at half-grid positions, so the
cost depends on the marching direction. The update strategy for computing the
arrival time at a neighbor of a known point under the new scheme is derived
from optimal control theory, in a fashion similar to Tsitsiklis [13]; therefore,
the solution cannot make use of any higher-order finite-difference scheme. They
proposed two solution models. The first uses 4-connected neighbors and gives the
same result as the FMM, while the second employs 8-connected neighbors and
gives better results than the FMM. In both models the idea is the same: the
neighborhood around x is divided into either 4 quadrants or 8 octants, the
known points of each quadrant/octant compute an arrival time at x, and x is then
assigned the minimum value over all quadrants/octants.
The group marching method (GMM) [10] is a modified version of the FMM
that advances a group of grid points simultaneously rather than sorting the solution
in a narrow band. The GMM reduces the computational complexity of the FMM
to O(n) while maintaining the same accuracy. The method works as follows: a
group of points G is extracted from the narrow band such that their travel times
do not alter each other during the update procedure. The neighbors of G join
the narrow band after computing their travel times, while the points of G are
tagged as known.
ACCURATE TRACKING OF MONOTONICALLY ADVANCING FRONTS 237
Consider a closed interface Γ (e.g., a boundary) that separates one region from
another; Γ can be a curve in 2D or a surface in 3D. Assume that Γ moves in its
normal direction with a known speed F(x) that does not change sign, so that the
front advances monotonically. The equation of motion of the front is given by

|∇T(x)| F(x) = 1,   T(Γ_0) = 0,   (1)

where T(x) is the arrival time of Γ as it crosses each point x. If the speed depends
only on the position x, the equation reduces to a nonlinear first-order partial
differential equation, known in geometrical optics as the eikonal equation.
The FMM [5] solves this equation in a one-pass algorithm as follows. For clarity,
let us consider the 2D version of the FMM. The numerical approximation of ∇T
that selects the physically correct vanishing-viscosity weak solution is given by
Godunov [14] as

max(D_ij^{-x}T, −D_ij^{+x}T, 0)² + max(D_ij^{-y}T, −D_ij^{+y}T, 0)² = 1/F_ij².   (2)
If the gradient ∇T is approximated by the first-order finite-difference scheme, Eq.
(2) can be rewritten as
[max( (T_{i,j} − min(T_{i−1,j}, T_{i+1,j})) / ∆x_1, 0 )]²
+ [max( (T_{i,j} − min(T_{i,j−1}, T_{i,j+1})) / ∆x_2, 0 )]² = 1/F_ij²,   (3)

where ∆x_1 and ∆x_2 are the grid spacings in the x and y directions, respectively.
Let T_1 = min(T_{i−1,j}, T_{i+1,j}) and T_2 = min(T_{i,j−1}, T_{i,j+1}); then

[max( (T_{i,j} − T_1)/∆x_1, 0 )]² + [max( (T_{i,j} − T_2)/∆x_2, 0 )]² = 1/F_ij².   (4)
If T_{i,j} > max(T_1, T_2), then Eq. (4) reduces to a second-order equation of the
form

Σ_{k=1}^{2} ( (T_{i,j} − T_k)/∆x_k )² = 1/F_ij²;   (5)
otherwise,

T_{i,j} = min_k ( T_k + ∆x_k/F_ij ),  k = 1, 2.   (6)
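In code, the update of Eqs. (4)–(6) amounts to solving a quadratic when both neighbor values constrain the solution, and falling back to the one-sided formula (6) otherwise. A sketch (T1, T2, the spacings, and the local speed F are supplied by the caller):

```python
import math

def fmm_update(T1, T2, dx1, dx2, F):
    """Solve Eq. (4) for T_ij given the smaller neighbor values T1, T2.
    Returns the causal (largest) root of the quadratic (5), or the
    one-sided update (6) when the quadratic has no valid solution."""
    one_sided = min(T1 + dx1 / F, T2 + dx2 / F)
    a = 1.0 / dx1**2 + 1.0 / dx2**2
    b = -2.0 * (T1 / dx1**2 + T2 / dx2**2)
    c = T1**2 / dx1**2 + T2**2 / dx2**2 - 1.0 / F**2
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return one_sided
    T = (-b + math.sqrt(disc)) / (2.0 * a)
    # the two-sided root is valid only if it exceeds both neighbors,
    # per the condition T_ij > max(T1, T2) preceding Eq. (5)
    return T if T >= max(T1, T2) else one_sided
```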
Fij
The idea behind the FMM is to introduce an order in the selection of grid points
while computing their arrival times, in a way similar to Dijkstra's shortest-path
algorithm [15]. This order is based on a causality criterion: the arrival time
T at any point depends only on the adjacent neighbors that have smaller values.
During the evolution of the front, each grid point x is assigned one of three possible
tags: (1) known, if T(x) will not be changed later; (2) narrow band, if T(x) may
be changed later; and (3) far, if T(x) has not yet been computed. The FMM
algorithm can be summarized as follows. Initially, all boundary points are tagged
as known. Then their nearest neighbors are tagged as narrow band after computing
their arrival times by solving Eq. (4). Among all narrow-band points, the one with
the minimum arrival time is tagged as known, the arrival times of its not-yet-known
neighbors are (re)computed, and the process repeats until all points are known.
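The complete algorithm couples this local update to a min-heap over the narrow band. A compact sketch of the standard 2D FMM, assuming unit grid spacing and a positive speed map F (the local solver is restated so the sketch is self-contained):

```python
import heapq
import numpy as np

def solve_quadratic(t1, t2, f):
    """Local Godunov update of |grad T| = 1/f for unit spacing (h = 1)."""
    if np.isinf(t1) and np.isinf(t2):
        return np.inf
    if np.isinf(t1) or np.isinf(t2) or abs(t1 - t2) >= 1.0 / f:
        return min(t1, t2) + 1.0 / f          # one-sided update
    s = t1 + t2
    return 0.5 * (s + np.sqrt(s * s - 2.0 * (t1 * t1 + t2 * t2 - 1.0 / f**2)))

def fast_marching(F, seeds):
    """Standard 2D FMM: T = 0 at the seed points, arrival times elsewhere.
    `F` is a 2D array of positive speeds; `seeds` is a list of (i, j)."""
    T = np.full(F.shape, np.inf)
    known = np.zeros(F.shape, dtype=bool)
    heap = []
    for s in seeds:
        T[s] = 0.0
        heapq.heappush(heap, (0.0, s))
    while heap:
        t, (i, j) = heapq.heappop(heap)
        if known[i, j]:
            continue                           # stale heap entry
        known[i, j] = True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < F.shape[0] and 0 <= nj < F.shape[1] and not known[ni, nj]:
                t1 = min(T[ni - 1, nj] if ni > 0 else np.inf,
                         T[ni + 1, nj] if ni < F.shape[0] - 1 else np.inf)
                t2 = min(T[ni, nj - 1] if nj > 0 else np.inf,
                         T[ni, nj + 1] if nj < F.shape[1] - 1 else np.inf)
                t_new = solve_quadratic(t1, t2, F[ni, nj])
                if t_new < T[ni, nj]:
                    T[ni, nj] = t_new
                    heapq.heappush(heap, (t_new, (ni, nj)))
    return T
```

Run with F ≡ 1 this produces a distance field; the diagonal error discussed above is already visible at moderate distances from the seed.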
Figure 1. (a) The coordinate system is rotated by an angle θ to intersect the grid at lattice
points. (b) The stencil Sj is centered at x and intersects the grid at lattice points. See
attached CD for color version.
3. METHODS
All related methods except that of [12] ignore the information provided by
diagonal points and hence suffer from a large error along diagonal directions. One
can make use of diagonal information either by (1) using only one stencil that
always coincides with the natural coordinate system and then rotating the coordinate
system several times so that the stencil intersects the grid at diagonal points, as
shown in Figure 1a, or (2) using several stencils at x that cover all the diagonal
information and then approximating the gradient using directional derivatives. In
both cases, the eikonal equation is solved along each stencil, and the solution at
x that satisfies the causality relationship is selected. Although both methods give
the same results, the first is limited to isotropic grid spacing, so the second
method is more general.
D_{r1} = ∇T(x_i) · r_1 = r_11 T_x + r_12 T_y,   (7)
D_{r2} = ∇T(x_i) · r_2 = r_21 T_x + r_22 T_y.   (8)
D_r = R ∇T(x_i),   (10)
∇T(x_i) = R⁻¹ D_r,   (11)
(∇T(x_i))ᵀ = (R⁻¹ D_r)ᵀ = D_rᵀ R⁻ᵀ.   (12)
Since

‖∇T(x_i)‖² = (∇T(x_i))ᵀ ∇T(x_i),   (13)

then

D_{r1}² − 2 D_{r1} D_{r2} cos(φ) + D_{r2}² = sin²(φ) / F²(x_i).   (17)
The first-order approximation of the directional derivative D_{rj} that obeys the
viscosity solution, given the arrival time of a known neighbor point x_j, is

D_{rj} = max( (T(x_i) − T(x_j)) / ‖x_i − x_j‖, 0 ),  j = 1, 2.   (18)
Also, the second-order approximation of the directional derivative D_{rj} that obeys
the viscosity solution, given the arrival times of the known neighbor points x_j and
x_{j−1}, where T(x_{j−1}) ≤ T(x_j) [7], is

D_{rj} = max( (3T(x_i) − 4T(x_j) + T(x_{j−1})) / (2‖x_i − x_j‖), 0 )
       = max( 3(T(x_i) − T(v_j)) / (2‖x_i − x_j‖), 0 ),   (19)
Figure 2. The proposed stencils for a 2D Cartesian domain. See attached CD for color
version.
where

T(v_j) = (4T(x_j) − T(x_{j−1})) / 3.   (20)
In 2D we use two stencils, S_1 and S_2. The nearest neighbors are covered by S_1,
while S_2 covers the diagonal ones. To simplify the discussion, let us assume isotropic
grid spacing with ∆x = ∆y = h; then φ = 90°. The S_1 stencil is aligned with the
natural coordinate system, as shown in Figure 2a, while the S_2 stencil is aligned
with the diagonal neighbor points, as shown in Figure 2b. Since φ = 90°,
(R Rᵀ)⁻¹ = I, and hence
D_rᵀ I D_r = D_{r1}² + D_{r2}² = 1/F²(x_i),   (22)
or

T(x_i) > max( T(v_1), T(v_2) ).   (24)
Stencil    First-order f(∆x, ∆y)     Second-order f(∆x, ∆y)
S_1        min(∆x, ∆y)               2 min(∆x, ∆y)
S_2        √(∆x² + ∆y²)              2 √(∆x² + ∆y²)
Substituting Eq. (29) into (27), we get the following upwind condition:
The values of f (∆x, ∆y) and τj for different stencils are given in Table 2.
Figure 3. The proposed stencils for a 3D Cartesian domain. See attached CD for color
version.
where a_j = 1, b_j = −2τ_j, and c_j = τ_j². The values of g(h) and τ_j are given in
Table 3; again, they depend on the stencil shape and on the order of approximation
of the directional derivatives.
4. NUMERICAL EXPERIMENTS
Since the analytical solution is hard to find, at least for complex speed models, we
instead start from a continuous and differentiable function T_i(x), as given by
T_1(x) = √((x − x_0)² + (y − y_0)²) − 1,   (38)
T_2(x) = (x − x_0)²/100 + (y − y_0)²/20,
T_3(x) = exp(x²/10) (y/20) + y²/50,
T_4(x) = 1 − cos((x − x_0)/20) cos((y − y_0)/20),
T_5(x) = √((x − x_0)² + (y − y_0)² + (z − z_0)²) − 1,
T_6(x) = (x − x_0)²/100 + (y − y_0)²/20 + (z − z_0)²/20.
The corresponding speed functions are derived as the reciprocal of ∇T (x). The
approximated arrival time, T́ (x), by each method is computed and then compared
with T (x) using different error metrics. The iso-contours of the analytical solution
of the arrival times for 2D domains are shown in Figure 4. The first speed function
is given by F1 (x), which corresponds to a moving front from a source point (x0 , y0 )
with a unit speed. In this case, the computed arrival time corresponds to a distance
field, which is of interest in many computer vision and graphics applications such as
shape representation, shape-based segmentation, skeletonization, and registration.
The test is performed three times from one or more source points: one source point
at the middle of the grid (high-curvature solution), one source point at the corner
of the grid (smooth solution), and two source points within the grid (solution with
shock points). To measure the error between the computed arrival time, T̂(x), and
the analytical solution, Tanalytic (x), we employ the following error metrics:
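Assuming the usual definitions (L1 as mean absolute error, L2 as root-mean-square error, and L∞ as maximum absolute error; the normalization is our assumption), the three metrics can be computed as:

```python
import math

def error_metrics(T_approx, T_exact):
    """L1 (mean absolute), L2 (root-mean-square), and L-infinity
    (maximum absolute) errors between an approximated and an
    analytical arrival-time field, given as flat lists of samples."""
    diffs = [abs(a - e) for a, e in zip(T_approx, T_exact)]
    n = len(diffs)
    l1 = sum(diffs) / n
    l2 = math.sqrt(sum(d * d for d in diffs) / n)
    linf = max(diffs)
    return l1, l2, linf
```

Note that L1 ≤ L2 ≤ L∞ always holds with these normalizations, which is a useful sanity check when reading the error tables.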
Figure 4. Iso-contours of the analytical solution of T (x) for T1 (a–c) at different source
points. (d) T2 , (e) T3 , and (f) T4 .
Figure 5. Iso-contours of the error curves when applying (a,c,e) FMM2 and (b,d,f) the
proposed method (MSFM2 ). See attached CD for color version.
Figure 6. Iso-contours of the exact solution (solid), FMM2 (dashed dot), and MSFM2
(dashed) for a wave propagating from (a) one source point and (b) two source points. See
attached CD for color version.
ACCURATE TRACKING OF MONOTONICALLY ADVANCING FRONTS 251
second one propagates from two different source points (51,35) and (51,67). In
both cases MSFM2 is clearly more accurate than FMM2, and the iso-contours
computed by the proposed method nearly coincide with the exact analytical
solution.
In Table 6, we list the numerical errors by two complex speed models (F2 and
F3 ) that are functions of the spatial coordinates.
In [12], the authors have tested their method using different test functions (F1
and F4 ) on a grid of size 56 × 56 points with ∆x = ∆y = 1. In this experiment,
we compare the proposed methods (MSFM1 and MSFM2 ) against their approach
under the same experimental conditions. The numerical errors are listed in Table 7.
5. DISCUSSION
The numerical error results of Tables 4–9 show that the proposed methods
(MSFM1 and MSFM2 ) give the most accurate results among all related tech-
niques, especially when using the second-order approximation of the directional
derivatives. Since MSFM1 is more accurate than the original FMM1 , the former
method is used to initialize the points near the boundary when using second-order
derivatives.
The worst-case complexity of the proposed method is still O(n log n); how-
ever, the computational time is higher. The computational complexity can be re-
duced to O(n) by implementing the narrow band as an untidy priority queue [11].
In 2D space, the computational time is nearly the same as the FMM. However,
in 3D space, the computational time is approximately 2–3 times higher than the
FMM. In 3D, each voxel x has 6 voxels that share a face (F-connected), 12 voxels
that share an edge (E-connected), and 8 voxels that share a vertex (V-connected).
According to the proposed stencils, S1 covers the F-connected neighbors; Sj,
j ∈ [1, 4], cover both the F-connected and E-connected neighbors; and Sj,
j ∈ [1, 6], cover the entire 26-connected neighborhood. To strike a balance
between high accuracy and minimal
computational time, our numerical experiments have shown that we can restrict
the proposed method to 18 neighbors without apparent loss in accuracy.
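The 6/12/8 split above follows directly from how many offset coordinates are nonzero; a quick enumeration confirms it (illustrative code, not part of the chapter):

```python
def classify_neighbors():
    """Group the 26 neighbors of a 3D voxel by the number of nonzero
    offset coordinates: 1 -> face (F), 2 -> edge (E), 3 -> vertex (V)."""
    classes = {"F": [], "E": [], "V": []}
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                k = sum(1 for d in (dx, dy, dz) if d != 0)
                if k == 1:
                    classes["F"].append((dx, dy, dz))
                elif k == 2:
                    classes["E"].append((dx, dy, dz))
                elif k == 3:
                    classes["V"].append((dx, dy, dz))
    return classes
```

Restricting the method to 18 neighbors therefore means keeping the F-connected and E-connected classes and dropping the 8 vertex neighbors.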
[Tables: computation times and L1, L2, and L∞ errors per method for
T1(x) = √((x − x0)² + (y − y0)²) − 1, T4(x) = 1 − cos((x − x0)/20) cos((y − y0)/20),
and T5(x) = √((x − x0)² + (y − y0)² + (z − z0)²) − 1.]
6. CONCLUSION
7. APPENDIX
This appendix gives the pseudocode for two-dimensional image space. For three-
dimensional space, the algorithm is modified in several places to take the extra
stencils into account.
In general, the FMM involves two NarrowBand operations: searching the
NarrowBand for the grid voxel v of minimum arrival time T , and finding the grid
neighbors of v. It has been suggested in [5] to implement the NarrowBand using
a minimum-heap data structure, which is a binary tree with the property that the
value at any given node is less than or equal to the values at its children. Therefore,
the root of the tree always contains the smallest element of the NarrowBand.
Also, whenever a new point is inserted into the heap, the data structure takes care
of resorting the tree to maintain the root as the smallest element, which can be
retrieved in time O(1). We have implemented the minimum-heap data structure
using the Multimap data structure that is part of the standard template library
(STL) [16].
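The same narrow band can be sketched with Python's standard-library binary heap; here an Update() is handled lazily, by pushing a fresh entry and skipping stale ones on extraction, rather than by resorting in place as a multimap does. Class and method names below are ours, not the chapter's.

```python
import heapq

class NarrowBand:
    """Minimum-heap narrow band with lazy updates: extract_min() always
    returns the voxel with the smallest current arrival time; entries
    superseded by a later insert/update are skipped when popped."""
    def __init__(self):
        self._heap = []       # (arrival_time, voxel_id) pairs
        self._current = {}    # voxel_id -> latest arrival time

    def insert(self, t, vid):
        self._current[vid] = t
        heapq.heappush(self._heap, (t, vid))

    update = insert           # an update simply supersedes the old entry

    def extract_min(self):
        while self._heap:
            t, vid = heapq.heappop(self._heap)
            if self._current.get(vid) == t:   # skip stale entries
                del self._current[vid]
                return t, vid
        raise IndexError("narrow band is empty")
```

The minimum is retrieved in O(log n) amortized time; the lazy scheme trades a little extra memory for not having to locate and re-key entries inside the heap.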
Figure 7. The data structure used to achieve an efficient implementation of the MSFM method.
Algorithm 2 GetNeighborDistances(Voxel v)
Neighbors[8][2] = {{-1,0}, {1,0}, {0,-1}, {0,1}, {-1,-1}, {1,1}, {-1,1}, {1,-1}}
v.state = KNOWN
for i = 1 : 8 do
x = v.x + Neighbors[i][1]
y = v.y + Neighbors[i][2]
Voxel vn = GetVoxel(x,y)
if ((vn .state == NARROWBAND) ∨ (vn .state == FAR)) then
t = ∞ {reset per neighbor so one voxel's time does not carry over to the next}
t = min(t, SolveEikonal(vn , S1 ))
t = min(t, SolveEikonal(vn , S2 ))
if (t == ∞) then
continue
end if
if (vn .state == FAR) then
vn .state = NARROWBAND
NarrowBand.Insert(t, vn .id)
else
NarrowBand.Update(t, vn .id)
end if
end if
end for
Algorithm 4 GetCoefficients(Stencil S, T1 , T2 , a, b, c)
if (T2 ≠ ∞) then
Tv = (4T1 − T2 ) / 3
if (S == S1 ) then
ratio = 9 / (4×h2 )
else
ratio = 9 / (8×h2 )
end if
a = a + ratio
b = b - 2× ratio × Tv
c = c + ratio×Tv2
else
if (S == S1 ) then
ratio = 1 / h2
else
ratio = 1 /(2×h2 )
end if
a = a + ratio
b = b - 2× ratio × T1
c = c + ratio×T12
end if
Algorithm 5 SolveQuadratic(v, T1 , T2 )
F = ComputeSpeed(v)
c = c - 1/F 2
if ((a == 0) ∧ (b==0)) then
return ∞
end if
disc = b2 -4×a×c
if (disc < 0) then
return ∞
else
t1 = (-b + sqrt(disc))/(2×a)
t2 = (-b - sqrt(disc))/(2×a)
return max(t1 , t2 )
end if
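Algorithms 4 and 5 combine into a single first-order solve along one stencil. The function below is a sketch with our own names, using the isotropic S1 ratio 1/h² and the S2 ratio 1/(2h²) from Algorithm 4:

```python
import math

INF = float("inf")

def solve_eikonal(T1a, T1b, h, F, diagonal=False):
    """First-order solve of the discretized eikonal equation along one
    stencil: accumulate the quadratic coefficients from the KNOWN upwind
    neighbor times T1a, T1b (INF if unavailable), subtract 1/F^2, and
    return the larger root, as in Algorithm 5."""
    ratio = 1.0 / (2.0 * h * h) if diagonal else 1.0 / (h * h)
    a = b = c = 0.0
    for T1 in (T1a, T1b):
        if T1 == INF:
            continue
        a += ratio
        b -= 2.0 * ratio * T1
        c += ratio * T1 * T1
    c -= 1.0 / (F * F)                      # Algorithm 5: c = c - 1/F^2
    if a == 0.0 and b == 0.0:
        return INF
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return INF
    return (-b + math.sqrt(disc)) / (2.0 * a)   # larger root
```

With one known neighbor at T = 0, h = 1, and F = 1 the solve returns 1, the expected unit-distance update; with both neighbors at 0 it returns 1/√2, the diagonal-corrected value.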
8. REFERENCES
1. Cerveny V. 1985. Ray synthetic seismograms for complex two- and three-dimensional structures.
J Geophys 58:2–26.
2. Vidale J. 1990. Finite-difference calculation of travel times in three dimensions. J Geophys
55:521–526.
3. van Trier J, Symes W. 1991. Upwind finite-difference calculation of travel times. J Geophys
56:812–821.
4. Podvin P, Lecomte I. 1991. Finite-difference computation of travel times in very contrasted
velocity models: a massively parallel approach and its associated tools. Geophys J Int 105:271–
284.
5. Adalsteinsson D, Sethian J. 1995. A fast level set method for propagating interfaces. J Comput
Phys 118:269–277.
6. Kim S. 1999. ENO-DNO-PS: a stable, second-order accuracy eikonal solver. Soc Explor Geophys
69:1747–1750.
7. Sethian J. 1999. Level set methods and fast marching methods, 2nd ed. Cambridge: Cambridge
UP.
8. Kimmel R, Sethian J. 1998. Fast marching methods on triangulated domains. PNAS 95(11):8341–
8435.
9. Sethian JA, Vladimirsky A. 2000. Fast methods for the eikonal and related Hamilton-Jacobi
equations on unstructured meshes. PNAS 97(11):5699–5703.
10. Kim S. 2001. An O(N) level set method for eikonal equations. SIAM J Sci Comput 22(6):2178–
2193.
11. Yatziv L, Bartesaghi A, Sapiro G. 2006. A fast O(N) implementation of the fast marching algorithm.
J Comput Phys 212:393–399.
12. Danielsson P-EE, Lin Q. 2003. A modified fast marching method. In Proceedings of the 13th
Scandinavian conference, SCIA 2003. Lecture notes in computer science, Vol. 2749, pp. 1154–
1161. Berlin: Springer.
13. Tsitsiklis J. 1995. Efficient algorithms for globally optimal trajectories. IEEE Trans Auto Control
40(9):1528–1538.
14. Godunov S. 1959. Finite-difference method for numerical computation of discontinuous solutions
of the equations of fluid dynamics. Mat Sbornik 47:271–306. [Trans. from the Russian by I
Bohachevsky.]
15. Dijkstra EW. 1959. A note on two problems in connexion with graphs. Numer Math 1:269–271.
16. http://www.sgi.com/tech/stl/.
9

TOWARD CONSISTENTLY BEHAVING DEFORMABLE MODELS

Rongxin Li and Sébastien Ourselin
1. INTRODUCTION
Deformable models are a powerful approach to image analysis and are used
particularly widely to segment pathological and normal anatomical structures from
medical images. They include parametric deformable models (PDMs) [1, 2],
Address all correspondence to: Rongxin Li, ICT Centre, Building E6B, Macquarie University Campus,
North Ryde, NSW 2113, Australia. Phone: (612) 9325 3127, Fax: (612) 9325 3200. [email protected].
260 RONGXIN LI and SÉBASTIEN OURSELIN
geometric deformable models (GDMs) [3, 4], and statistical point distribution
models (e.g., [5, 6]). PDMs are characterized by their parameterized representa-
tion of contours or surfaces in a Lagrangian framework. In comparison, GDMs are
based on the theory of front evolution and represented implicitly as level sets
embedded in higher-dimensional functions. Numerical computation of the evolution
of the GDM is carried out in the Eulerian approach; that is, the computation takes
place on a fixed Cartesian grid and no parameterization of the model is performed.
Deformable models derived from the framework of functional minimization
have become widely used. The functional to be minimized is often referred to as the
“energy,” which varies with the model’s geometry and either regional consistency,
image gradients, or both. Although some models such as the generic GDM were
developed independently of energy minimization, those derived from an energy
minimization framework have been the most fruitful in image segmentation. Since
it is generally not possible to perform global optimization for energy functionals
(except in restricted circumstances such as when the functional is inherently one
dimensional [7]), a local minimum is usually the only achievable outcome with
these energy-based models. That is, in order to make such optimization com-
putationally tractable, algorithms applicable to multivariate energy minimization
find the nearest local minimum in a functional space composed of all permissible
hypersurfaces. This local minimum depends on the initial position of the model
in the space of all permissible models. This is called initialization dependency.
Furthermore, the energy minimum and the model geometry that corresponds
to that minimum are dependent on the parameters. This is referred to as parameter
dependency. For example, in the case of GDMs, the strength of the unidirectional
propagation force and the number of iterations can play a vital role in the final
outcome. Too few iterations or a too-weak propagation force can give rise to a
partially segmented object, while too many iterations combined with a compar-
atively strong propagation term may result in so-called “leakage” (i.e., invasion
of the zero level set into the background) through the relatively weak boundary
gradients of the object, particularly if these weak parts are also close to the initial
zero level set. This is the underlying reason that it is often necessary to monitor
the progress of segmentation using a GDM. A similar problem exists with a PDM
that incorporates an inflation or deflation force (also called a balloon force).
The initialization and parameter dependencies are an obstacle to robust au-
tomatic or near-automatic segmentation as image-specific, individualized initial-
ization and parameter tuning are necessary to achieve the desired segmentation.
This sensitivity to initialization, although varying in degree for different types of
models, has in general been a significant limitation under many circumstances.
We take segmentation-based anatomical-model construction as an example.
Anatomical models derived from medical images or image databases are needed
for some medical research (e.g., [8]). For instance, our own current research is on
accurate estimation of the amount of radiation energy deposited in various tissues
within the body of a pediatric patient during a radiological procedure. Accurate
TOWARD CONSISTENTLY BEHAVING DEFORMABLE MODELS 261
age-specific models of children for radiation dosimetry purposes are not presently
available. It is hoped that this work will lead to a scientific basis for determining
an optimal dose for children. Due to the generally large size of the databases, a
high degree of automation is often needed in this process. The requirement for
image-specific initialization and parameter tuning would prevent such automation
from being achieved. In fact, for a gradient-rich image it is usually non-trivial
to find a set of parameters that sufficiently constrain the model yet are able to
avoid over-segmentation. Further, segmentation failure can also result from poor
initialization. This can occur, for example, when the initial implicit model lies on
both sides of the object boundary.
Despite the limitations, however, the deformable models approach remains
a powerful one for image segmentation. In this chapter, we present a generic
marker-based approach aimed at reducing these limitations in order to achieve a higher
degree of automation for deformable model-based segmentation, especially in
scenarios where the markers can be automatically placed, using a morphological
or a knowledge-based method. This approach is presented after a brief review of
the literature in Section 2, and can be summarized as follows.
The approach described here consists of the following steps given two auto-
matically or manually placed markers. First, a gradient magnitude image is ob-
tained after the original image is smoothed. This image is then modified through
a Swamping Transform before geodesic topographic distances (GTDs) from the
markers are computed. The background theory and computational considerations
in this step are presented in Section 3. The second step is partitioning of the image
based on the GTD transforms via thresholding and computation of distance
gradient vectors. This is given in Section 5. Finally, the distance gradient vectors
along with the partitioning information are used within parametric and geometric
deformable models, as described in Section 6.
After the approach is described, the results of its validation experiments are
discussed in Sections 7 and 8. An application to the determination of tissue char-
acteristics in early childhood is presented in Section 9. Finally, some discussions
and conclusions are presented.
2. BACKGROUND
the model, albeit to a lesser extent [7]. What is attainable, in general, is a local
minimum of the energy functional via a variational framework. As the energy
functional is usually non-convex, the local minimum achieved is likely to be dif-
ferent from the global minimum. This is the cause of parameter and initialization
dependency.
First-generation deformable models [1] were particularly sensitive to initial-
ization due to a nearby local energy minimum that corresponds to a totally collapsed
version of the model. Early attempts to improve this include a balloon model [9],
which applies either a constant or a gradient-adaptive force in the direction of the
contour or surface normal. Based on similar principles, a curvature-independent
propagation term,
g(|∇Z(x)|)|∇u|,
has been incorporated in the geodesic active contour (GAC) level set models1 ,
where Z(x) is the image intensity, g is a monotonically decreasing function with
g : R+ → (0, 1] and g(r) → 0 as r → ∞, and u is the level set function. The GAC
essentially requires that the deformable model be completely inside or outside the
structure of interest in order to ensure success [10], as is the case with the balloon
model. Furthermore, as observed by numerous researchers (e.g., [11]), leakage
through weak object boundaries may happen due to factors such as initialization
and inappropriate weighting and stopping parameters.
Another widely employed approach is to modify the external force in de-
formable models. The gradient vector flow (GVF) method [13, 12, 10] is perhaps
the most prominent example, which uses spatial diffusion of the gradient of an
edge-strength map derived from the original image to supply an external force.
A GVF is a vector field v(x, y) = [u(x, y), v(x, y)] that minimizes the energy
functional
E = ∫∫ [µ(ux² + uy² + vx² + vy²) + |∇f|² |v − ∇f|²] dx dy,
where f is an edge map of the original image and µ is a blending parameter. This
technique enables gradient forces to be extended from the boundary of the object
and has an improved ability in dealing with concavities over using forces based
on distances from edges [9, 12]. A major drawback of this approach, however, is
that it does not discriminate between target and irrelevant edges. Such irrelevant
gradients are abundant in a typical medical image. Consequently, the “effective
range” of the object gradients for attracting the deformable model is only extended
to the nearest non-object gradients. If the model is initialized beyond this range,
an undesirable outcome attributable to the non-object gradients is likely to result.
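For intuition, a one-dimensional analogue of the GVF diffusion can be written in a few lines; µ, the iteration count, and the discretization below are illustrative choices of ours, not values from the text.

```python
def gvf_1d(f, mu=0.2, iters=200):
    """Gradient descent on a 1D analogue of the GVF energy,
    E = integral of mu * u_x^2 + f'(x)^2 (u - f'(x))^2 dx:
    diffuse the derivative of the edge map f into a force field u."""
    n = len(f)
    fx = [0.0] * n
    for i in range(1, n - 1):                 # central-difference f'
        fx[i] = (f[i + 1] - f[i - 1]) / 2.0
    u = fx[:]                                 # start from the raw gradient
    for _ in range(iters):
        nxt = u[:]
        for i in range(1, n - 1):
            lap = u[i - 1] - 2.0 * u[i] + u[i + 1]
            # diffusion term plus data term anchoring u to f' where f' is large
            nxt[i] = u[i] + mu * lap - (fx[i] ** 2) * (u[i] - fx[i])
        u = nxt
    return u
```

For an isolated edge-strength bump, the resulting field is nonzero well away from the bump and points toward it from both sides, which is exactly the extended capture range the text describes.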
A third approach is hybrid segmentation, where independent analysis such as
thresholding, seeded region growing, a watershed transform, or a Markov Ran-
dom Field or Gibbs Prior model, is employed to provide an initial segmentation,
followed by a refinement stage using a deformable model (e.g., [14–16]). In this
approach, the deformable model may only be a secondary procedure used merely
to refine the primary segmentation in the hope to gain more robustness to noise
or other degradation. The individual components in this hybrid approach may be
repeated to form an iterative process [15].
Through simultaneous statistical modeling of background regions, a region
competition approach that originated with Zhu and Yuille [17] offers a mechanism
for PDMs to overcome the leakage problem, which was briefly described in
Section 1 above. This mechanism, however, is not always suitable. The difficulties
in the statistical modeling and in controlling the number of resultant regions are
two issues in many situations. On the other hand, some subsequent efforts have
successfully extended the mechanism for use with other frameworks such as the
GDM. In [11], the authors employ posterior probability maps of the tumor and the
background derived from fitting Gaussian and Gamma distributions to the differ-
ence image obtained from a pair of pre- and post-contrast T1-weighted images.
The difference between those probabilities is used to modulate the propagation of
the GDM, so that the zero level set is more likely to expand within the area of the
tumor and contract in the background. The main disadvantages with this method
are twofold. One is that the probability maps need to be derived using ad-hoc
methods, making the application to other problems non-trivial. The other is that
the usefulness of the resultant speed term is critically dependent on the accuracy
of the probability estimation.
Various ad-hoc techniques have also been used in different deformable mod-
els. For example, problem-dependent search methods have been used in a
preprocessing stage [18]. A robust generic approach to relaxing initialization and
parameter dependencies, however, has yet to emerge.
Despite its limitations, the deformable models approach provides a power-
ful means to use a priori knowledge for image segmentation. Therefore, many
other approaches have attempted to incorporate some mechanisms of deformable
models. For example, it has been found that the smoothness-imposing property of
PDMs can be used to improve segmentation results using a watershed transform
[19], although the watershed framework can be restrictive as to the kind of con-
straints or knowledge that can be used and the manner in which it is used. As another
example, in the classic algorithm proposed by Tek and Kimia [20], regions start to
grow under image forces from a number of seeds, and boundaries emerge as the
growth and merging of regions are inhibited by the presence of edges.
The approach to be presented in this chapter is within the framework of de-
formable models. Its purpose is to reduce sensitivity to initialization and to para-
meters, in order to achieve more robust automatic segmentation.
We start by examining a geometric image partitioning algorithm, namely,
Skeleton by Influence Zones (SKIZ, alternatively known as a generalized Voronoi
diagram).
Consider a set of markers {Si ⊂ D : i ∈ I} in the image domain D, each with an
influence zone Zi. The SKIZ is the set of points that belong to no influence zone,

D \ ∪i∈I Zi,
A relationship exists between the SKIZ partitioning and the watershed trans-
form if the GTD in a gradient image is used as the basis in defining δ(x, Si )
[22, 21, 19]. Suppose that a gradient magnitude image f , obtained after smoothing
the original image Z(x) via anisotropic diffusion or using an infinite impulse
response (IIR) filter that approximates a convolution with the derivative of the
Gaussian kernel, is a C 2 real function.
The GTD d(x, Si) between x ∈ D and Si is defined as

d(x, Si) = inf{d(x, y) : y ∈ Si},

where

d(x, y) = inf_{γ∈Γ(x,y)} ∫γ |∇f(s)| ds,  (2)

with Γ(x, y) denoting the set of paths between x and y.
Suppose that the markers {Si : i ∈ I} are equal to the set of all the local minima
of f. A SKIZ partition with respect to this distance definition then satisfies, ∀i ∈ I,
This has been used in the metric-based definition of the watershed transform
[22, 23], and is the basis of partial differential equation (PDE) models of the
watershed [19, 24], which have been exploited to incorporate smoothness into
watershed segmentation. As the watershed theory has established that a watershed
coincides with the greatest gradient magnitude on a GTD-based geodesic path
between two markers, so does the corresponding part of the SKIZ [19, 22].
It is critical to note that in 2D or 3D, spatially isolated disjoint gradients cannot
be on a GTD-based geodesic path; therefore, the gradients need to have spatially
consistently large magnitudes in order to be part of the SKIZ.
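On a discrete grid, the infimum over paths in Eq. (2) reduces to a shortest-path problem. A minimal one-dimensional Dijkstra sketch (our own discretization, accumulating |∇f| sample by sample; in 2D or 3D the same loop runs over grid neighbors):

```python
import heapq

def gtd_1d(grad_mag, source):
    """Dijkstra approximation of the geodesic topographic distance:
    the cheapest cumulative |grad f| along a path from the source
    sample to every other sample."""
    n = len(grad_mag)
    dist = [float("inf")] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue                          # stale entry
        for j in (i - 1, i + 1):
            if 0 <= j < n and d + grad_mag[j] < dist[j]:
                dist[j] = d + grad_mag[j]
                heapq.heappush(heap, (dist[j], j))
    return dist
```

The sketch makes the isolation property concrete: a single large gradient sample raises the GTD only for points whose every path crosses it, while flat regions contribute nothing.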
It follows from the above that in order to take advantage of the significant
edges corresponding to the classical definition of watersheds, the markers from
which the GTDs are calculated to derive the SKIZ must be at local minima in the
image and there must not be other minima. Homotopy modification via a swamping
transform [22, 25, 26] is usually necessary to achieve this. This transform produces
an "uphill" landscape such that when moving away from a marker the gray levels
either rise up or keep constant until the SKIZ is reached (i.e., the border of the
marker’s influence zone). Thus, local minima that are marked will be kept but
those that do not contain markers will be filled.
Defining

f1(x) = −1 if x ∈ ∪i Si, and f1(x) = f(x) otherwise,

and

b(x) = −1 if x ∈ ∪i Si, and b(x) = Mb otherwise,
where the constant Mb = sup{f (x) : x ∈ D}, the swamping transform, which
results in a new image f2 (x), is accomplished by a reconstruction-by-erosion of
f1 (x) from b(x) [27]:
f2 (x) = Rf1 (x) [b(x)].
Traditionally the swamping transform is accomplished by the iterative geodesic
erosion (until stability) of b with f1 as the reference [28, 29], or
f2(x) = ε∞f1(x)[b(x)].
The computation of the GTD transform and the swamping transform are key
elements of the proposed approach. Unfortunately, they can also be the most
computationally costly components. It is therefore critical to investigate efficient,
yet accurate, algorithms for those transforms.
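A transparent (though inefficient) one-dimensional sketch of the reconstruction by erosion: geodesically erode the marker b, never letting it fall below the reference f1, until stability. The function name is ours.

```python
def reconstruct_by_erosion_1d(b, f1):
    """Iterative geodesic erosion of marker b above reference f1:
    each pass replaces b[i] by the neighborhood minimum, clamped
    from below by f1[i]. Unmarked local minima of f1 are filled,
    while marked minima (where b is already low) are preserved."""
    b = list(b)
    n = len(b)
    while True:
        prev = b[:]
        for i in range(n):
            nb_min = min(prev[max(i - 1, 0)], prev[i], prev[min(i + 1, n - 1)])
            b[i] = max(nb_min, f1[i])
        if b == prev:
            return b
```

In the swamping setting, b is −1 at markers and Mb elsewhere, so the result rises monotonically away from each marker, which is exactly the "uphill" landscape described above.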
between the target marker and the background marker. For a higher-dimensional
space such as 3D, there exists a stronger possibility that this may happen if the
object of segmentation does not always contrast well with neighboring organs.
In such a case, not all of the object’s boundary coincides with SKIZ. In order to
rectify this, it may be desirable to selectively use a range of f2 (x), using a mapping
such as
f̄2(x) = Lo if f2(x) ≤ Li;
f̄2(x) = Lo + ((f2(x) − Li)/(Hi − Li))(Ho − Lo) if Li < f2(x) < Hi;  (6)
f̄2(x) = Ho if f2(x) ≥ Hi,
where Li , Hi , Lo , and Ho are constant parameters, with the range between Li and
Hi designating the desired band for f2 (x).
Alternatively, the speed function for the FMM can be directly manipulated
using a combination of a band selection and a linear approximation to ensure a
significant speed reduction at the object’s boundary. For example,
q̄(x) = Hq if |∇f2(x)| ≤ Lf;
q̄(x) = Hq − ((|∇f2(x)| − Lf)/(Hf − Lf))(Hq − Lq) if Lf < |∇f2(x)| < Hf;  (7)
q̄(x) = Lq if |∇f2(x)| ≥ Hf
may be used, where Lf , Hf , Lq , and Hq are also constant parameters. If this
formulation is used, Hf needs to be adjusted to suit different tasks and different
sources of data, while the other three can remain application independent. The
parameter Hq should be large enough to ensure a near-zero GTD on any topo-
graphically flat path of a plausible length between any two points in the image.
Conversely, Lq should be small. In our experiments, we have consistently used
0.001 for Lq and have chosen Lf = 0.
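The band-selection speed mapping of Eq. (7) can be sketched as a small helper; parameter names follow the text, and the numeric values in the usage assertions are illustrative only.

```python
def band_map(g, Lf, Hf, Lq, Hq):
    """Piecewise-linear speed mapping: full speed Hq where the gradient
    magnitude g is at most Lf, minimum speed Lq at or above Hf, and a
    linear ramp in between, so the front slows sharply at the object
    boundary."""
    if g <= Lf:
        return Hq
    if g >= Hf:
        return Lq
    return Hq - (g - Lf) / (Hf - Lf) * (Hq - Lq)
```

Only Hf typically needs retuning per task and data source; the other three parameters can stay fixed, as noted above.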
This equation is designed to ensure that the background marker with the least GTD
to the object marker overrides any other markers. Eq. (8) can be implemented
very efficiently via the FMM. As
it can be rewritten as
in which the second term on the right-hand side lends itself naturally to multiple-
front wave propagation via the FMM, where only one round of propagation is
necessary.
A binary image can be obtained by thresholding M :
B0(x) = 1 if M(x) ≤ 0, and B0(x) = 0 otherwise.  (10)
Due to degradation or deficiencies in the boundary gradients that are often present,
and a lack of model constraints (e.g., smoothness, shape constraints) to overcome
these deficiencies, B0 is unlikely to be an optimal segmentation of the target in
most situations. However, it can provide valuable information for deformable
models. Different possibilities exist in how this information can be exploited by
a deformable model. A transform of this information into forces or speeds would
allow full integration into a deformable model framework. Advantages of doing
so include that isolated boundary gradients near the SKIZ, while playing no role in
defining B0, can additionally be taken into account.
In order to achieve this, we compute a distance map D0 on image B0 , i.e.,
D0 (x) = Æ(B0 ),
adjusted to suit each specific segmentation task. In addition, it should not be used
where the gradient composition interior to the target may be significant and cannot
be estimated a priori.
Figure 1. Left: The external force (solid arrows) may be deficient in the normal direc-
tion of the deformable model (vertical straight line) near significant concavities. Right:
sgn[M (x)]N(x) (empty arrows) is a pressure force that is automatically adaptive to the
need for inflation or deflation. See attached CD for color version.
into a pressure force will make the force automatically adaptive to the need for
either expansion or contraction, in accordance with whether the node of the model
is inside or outside the segmentation object. Thus, this force can compensate for
the insufficient normal component of ∇D0 and ∇D1 near significant concavities,
as illustrated in Figure 1.
It is worth noting, however, that it is often desirable to ignore narrow concav-
ities, as their presence is frequently due to noise or artifacts, rather than a genuine
structural feature inherent in the object. This especially tends to be true if the
concavity is shallow.
F has the potential to replace the gradient image with vectors that are globally
consistent, in contrast to the short-ranged and inconsistent information in the
gradient image.2 A vector field such as Eq. (12) can be easily integrated into a
PDM, as demonstrated in existing works with the GVF model [10, 12].
In a similar fashion to the GVF model, F can help drive the deformation of a
PDM v with a surface parameterization s:
∂v/∂t = α ∂²v/∂s² − β ∂⁴v/∂s⁴ + γ(F(x) · N(x))N(x),  (13)
where N is the surface normal, and α, β, and γ are the model parameters. In our
experiments, α was set to 0.
Further, as discussed above, a pressure force N(x) or sgn[M (x)]N(x) can be
used in addition when it is desirable to model narrow concavities.
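As an illustration of Eq. (13) with α = 0 (as in our experiments), one explicit Euler step for a closed polygonal contour might look as follows; the discretization, parameter values, and names are illustrative choices of ours.

```python
def pdm_step(xs, ys, fx, fy, beta=0.1, gamma=1.0, dt=0.1):
    """One explicit step of dv/dt = -beta d4v/ds4 + gamma (F.N) N for a
    closed polygonal contour (alpha = 0). fx, fy sample the external
    force F at each node."""
    n = len(xs)
    out_x, out_y = xs[:], ys[:]
    for i in range(n):
        # fourth derivative along the closed contour (central stencil)
        d4x = xs[(i - 2) % n] - 4 * xs[(i - 1) % n] + 6 * xs[i] \
              - 4 * xs[(i + 1) % n] + xs[(i + 2) % n]
        d4y = ys[(i - 2) % n] - 4 * ys[(i - 1) % n] + 6 * ys[i] \
              - 4 * ys[(i + 1) % n] + ys[(i + 2) % n]
        # unit normal: tangent by central difference, rotated 90 degrees
        tx = xs[(i + 1) % n] - xs[(i - 1) % n]
        ty = ys[(i + 1) % n] - ys[(i - 1) % n]
        norm = (tx * tx + ty * ty) ** 0.5 or 1.0
        nx, ny = ty / norm, -tx / norm
        fdotn = fx[i] * nx + fy[i] * ny       # (F . N)
        out_x[i] = xs[i] + dt * (-beta * d4x + gamma * fdotn * nx)
        out_y[i] = ys[i] + dt * (-beta * d4y + gamma * fdotn * ny)
    return out_x, out_y
```

Projecting F onto the normal before applying it is what lets the same update serve as either inflation or deflation depending on which side of the object a node sits, consistent with the sgn[M(x)]N(x) discussion above.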
∂u/∂t = α h(f) |∇u| div(∇u/|∇u|) + β sgn[M(x)] |∇u| + γ P · ∇u,  (14)
P = −F,
with F defined in Eq.(12). The advantage of this, as in the PDM case above, is the
possible incorporation of forces based on spatially disjoint gradients. As the GTD
is a geodesic distance, disjoint gradients generally play no role in the distance
calculation except for the spatial locations that those gradients occupy.
A simpler alternative is
P = ∇D0 + ∇D1 .
This alternative form is similar in effect to curvature-independent directional prop-
agation, but is more appropriate if narrow concavities should be ignored, as dis-
cussed in Section 6.1.
Initialization of u can be performed by combining D0 and D1 .
7. QUALITATIVE EVALUATIONS
Figure 2. Experiments with a PDM on a CT image. Top: the image, the initial model
(large white circle), and the two markers (black dots) that are used to identify the target
and the non-target background, respectively. Bottom: the globally consistent external force
field (white arrows) and the resultant segmentation (black contour).
Figure 3. Brain tumor segmentation using the GAC level set model on a T1-weighted MR
image. Top left: cropped original brain MR image; top right: GTD transform from the
object marker; bottom left: GTD transform from the background marker; bottom right:
a segmentation of the brain tumor displayed as the overlay on the original image. See
attached CD for color version.
Figure 4. Example of brain tumor segmentation using the GAC level set model. Top left:
axial slice view of an original T1-weighted MR image of the brain; top right: coronal
(above) and sagittal (below) views of the original image; middle left: the segmentation
superimposed on the axial slice view; middle right: the segmentation superimposed on
coronal (above) and sagittal (below) views of the original image; bottom: a 3D view of the
segmented tumor. See attached CD for color version.
Figure 5. Example of lung and liver segmentation employing the GAC level set model on
CT images, with the results displayed using ITK-SNAP software [40]. See attached CD
for color version.
8. QUANTITATIVE VALIDATION
8.1. Accuracy
For quantitative assessment of the accuracy performance of the approach pre-
sented above, we used MR and CT databases to study the segmentations of the
brain, the liver, and the femur. As in the qualitative tests, within each group of
experiments the same set of parameters was always applied to all the images used
in the group.
Figure 6. Example of brain segmentation using the GAC level set model. Top: an original
T1-weighted MR image of the brain. Middle: a slice of the segmented brain (white overlay).
Bottom: a 3D view of the segmented brain. See attached CD for color version.
The target of segmentation was the distal section of the femur, composed
mainly of the femoral condyles (lateral and medial). Manual segmentation by
an independent expert, used as the ground truth, is available for all the slices.
The resultant sensitivities and specificities are tabulated in Table 5 along with the
associated statistical analysis.
8.2. Robustness
8.2.1. Experiments with the Parametric Model
The aim of this validation study was to test the robustness of the approach to
placement of the initial seeds. The same CT image as described in Section 7.1
was used, and the target remained the liver. The ground truth to be compared against
was a manual segmentation, shown in Figure 7. Two seeds were randomly placed
within the areas identified by a target mask and a background mask, respectively.
These seeds were generated as follows. A foreground mask and a background
mask were obtained, respectively, from the manual segmentation.
Denoting the binary image resulting from the manual segmentation as Z_b, and
using a circular structuring element A of 5 pixels in radius, we use

F = Z_b ⊖ A,
B = Z̄_b ⊖ A,

where ⊖ denotes morphological erosion, F is the foreground mask, B is the
background mask, and Z̄_b is the complement of Z_b.
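The seed-generation step above can be sketched as follows. This is a minimal illustration assuming NumPy/SciPy; the function and variable names are hypothetical, and the erosion-based mask construction follows the text's description (a 5-pixel circular structuring element applied to the manual segmentation and its complement).

```python
import numpy as np
from scipy import ndimage

def disk(radius):
    """Binary disk-shaped structuring element A of the given radius."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x * x + y * y <= radius * radius

def seed_masks(manual_seg, radius=5):
    """Erode the manual segmentation Z_b and its complement to obtain the
    foreground mask F and background mask B used for random seed placement."""
    a = disk(radius)
    f = ndimage.binary_erosion(manual_seg, structure=a)
    b = ndimage.binary_erosion(~manual_seg, structure=a)
    return f, b

def random_seed(mask, rng):
    """Pick one pixel uniformly at random inside a binary mask."""
    coords = np.argwhere(mask)
    return tuple(coords[rng.integers(len(coords))])
```

Eroding both masks keeps randomly placed seeds safely away from the object boundary, which is the point of the construction described in the text.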
Figure 7. Top: manual segmentation used as the ground truth. Bottom: background mask
in which a background identifying marker is randomly placed.
It has been estimated that among children under 15 years of age who undergo
a CT examination, almost 1 in 1000 will ultimately die from a cancer attributable
to their x-ray exposure [42]. CT scanning is a high-radiation-dose procedure, and
its use in diagnosis is increasing. Accurate estimation of the amount of radiation
[Figure: failure rate (y-axis, 0–0.16) as a function of background seed placement (x-axis, seeds 1–10).]
µ = ρν, (16)

where ρ is the physical density of the material being imaged, and ν is called the mass
attenuation coefficient. ν depends on the chemical composition of the medium,
and can be computed (as a weighted sum) from the mass attenuation coefficients
of the constituent elements. As is well established in the literature, ν is essentially
constant across populations and age groups, so it is possible to use published
elemental compositions of tissues to calculate ν [44].
On the other hand, when it is possible to calibrate the CT scanner using materials
of known attenuation, µ can be obtained empirically via

µ = µ_H2O (1 + t/K), (17)
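Equations (16) and (17) can be combined in a short sketch. Note the hedge: the text does not define t or K, so treating t as a CT number in Hounsfield units with K = 1000 is an illustrative assumption (the standard Hounsfield convention), not something the chapter states.

```python
def mu_from_mass_attenuation(rho, nu):
    """Eq. (16): linear attenuation coefficient mu = rho * nu, where rho is
    the physical density and nu the mass attenuation coefficient."""
    return rho * nu

def mu_from_ct_number(t, mu_water, K=1000.0):
    """Eq. (17): mu = mu_H2O * (1 + t/K).  Assumption: t is the CT number
    (Hounsfield units) and K = 1000; the text itself does not fix these."""
    return mu_water * (1.0 + t / K)
```

With this convention, t = 0 recovers the attenuation of water, as expected from Eq. (17).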
10.3. Conclusion
We have presented a generic approach to reducing the behavioral dependences
on the initialization and parameters for deformable models. Based on topographic
distance transforms from two markers, this novel approach uses an object-versus-
background identification in deformable models. In this approach, GTDs are
computed on the gradient magnitude image in order to locate appropriate gra-
dients given the two identifying markers. This information is integrated into a
deformable model as forces or speed terms to influence its evolution. This work
takes advantage of existing theoretical research into the watershed transform, yet
it is outside the watershed framework and preserves fully the advantages of de-
formable models. The implementation is based on efficient numerical methods.
Our experiments reveal that, even when a relatively high level of accuracy was
required, the requirement for initialization was minimal, and that the depen-
dence on parameters was limited in that the same parameters were applicable to
an entire set of images.
The approach described in this chapter has been applied to determination of
tissue characteristics in early childhood using a CT database of neonates, and
will be used for further pediatric dosimetry studies. We believe that by relaxing
dependences on the initialization and parameters this approach will enable a degree
of automation needed for segmentation-based construction of organ models from
large image databases using deformable models, particularly when seeds can be
placed automatically based on, for example, a priori knowledge regarding anatomy
and the intensity differentiation between object and background.
11. ACKNOWLEDGMENTS
The authors are grateful to Dr. Donald McLean and Mr. Luke Barclay of
Westmead Hospital, Sydney, Australia, and the University of Sydney, Australia,
for their contribution and support; and to Dr. Pierrick Bourgeat and Dr. Hans
Frimmel for proofreading the manuscript and providing many invaluable sugges-
tions.
The authors thank Drs. Simon Warfield, Michael Kaus, Ron Kikinis, Peter
Black, and Ferenc Jolesz of the Department of Neurosurgery and the Surgical
Planning Laboratory, Department of Radiology of the Harvard Medical School at
Brigham and Women’s Hospital, Boston, Massachusetts, USA, for sharing the
SPL and NSG Brain Tumor Segmentation Database.
This work benefited from the use of the Insight Segmentation and Registration
Toolkit (ITK), open-source software developed as an initiative of the US National
Library of Medicine, and available at www.itk.org.
The authors are grateful to Professor Luc Soler of Digestive Cancer Research
Institute, IRCAD, Strasbourg, France, for sharing the database of livers used in
this publication.
The authors wish to thank Andrea J.U. Mewes, Simon K. Warfield, and
Johannes Pauser of the Computational Radiology Laboratory, Harvard Medical
School, Department of Radiology, Brigham and Women’s Hospital and Children’s
Hospital, Boston, Massachusetts, USA, for sharing the original MR scans pre-
sented with the femoral bone segmentation in this chapter, and interactively seg-
menting the scans. Permission from the Westmead Hospital, Sydney, Australia,
to use the data for this publication is gratefully acknowledged.
12. NOTES
1. In this chapter the term GAC refers to both the original 2D version and the subsequently extended
version into 3D (which is also called minimal surface models [4]).
2. For the part of the model that lies inside the target, a consistently expanding flow from the viewpoint
of the model may promote more robustness than using Eq. (12). Therefore, we also implemented
an alternative that takes into account the direction of the normal vector when any portion of the model
is interior to the target boundary. However, it has not been our first choice of methods because of
its inseparability from the model.
3. An exception is case 9, for which a total of 15 markers were placed inside and around the tumor
in order to overcome the strong gradients present in and around the tumor. Essentially, it is a case
where the tumor and normal tissues are very difficult to segment together using the same
parameters applied to the other cases. Nonetheless, for the sake of completeness we present the
result for this case together with the other cases.
13. REFERENCES
1. Kass M, Witkin A, Terzopoulos D. 1987. Snakes: active contour models. Int J Comput Vision
1:321–331.
2. McInerney T, Terzopoulos D. 1995. Topologically adaptable snakes. In Proceedings of the fifth
international conference on computer vision, pp. 840–845. Washington, DC: IEEE Computer
Society.
3. Malladi R, Sethian JA, Vemuri BC. 1995. Shape modeling with front propagation: a level set
approach. IEEE Trans Pattern Anal Machine Intell 17(2):158–175.
4. Caselles V, Kimmel R, Sapiro G. 1997. Geodesic active contours. Int J Comput Vision 22(1):61–
79.
5. Cootes T, Hill A, Taylor C, Haslam J. 1994. The use of active shape models for locating structures
in medical images. Image Vision Comput 12(6):355–366.
6. Cootes T, Taylor C. 2001. Statistical models of appearance for medical image analysis and com-
puter vision. In Proc SPIE Med Imaging 4322:236–248.
7. Amini AA, Weymouth TE, Jain RC. 1990. Using dynamic programming for solving variational
problems in vision. IEEE Trans Pattern Anal Machine Intell 12:855–867.
8. Nipper JC, Williams JL, Bolch WE. 2002. Creation of two tomographic voxel models of paediatric
patients in the first year of life. Phys Med Biol 47:3143–3164.
9. Cohen L, Cohen I. 1993. Finite-element methods for active contour models and balloons for 2D
and 3D images. IEEE Trans Pattern Anal Machine Intell 15(11):1131–1147.
10. Paragios N, Mellina-Gottardo O, Ramesh V. 2004. Gradient vector flow fast geometric active
contours. IEEE Trans Pattern Anal Machine Intell 26(3):402–417.
11. Ho S, Bullitt E, Gerig G. 2002. Level set evolution with region competition: automatic 3D
segmentation of brain tumors. In R. Kasturi, D. Laurendeau, and C. Suen, editors, Proceedings of
the 16th international conference on pattern recognition, pp. 532–535. Washington, DC: IEEE
Computer Society.
12. Xu C, Prince JL. 1998. Snakes, shapes, and gradient vector flow. IEEE Trans Image Process
7(3):359–369.
13. Xu C, Prince JL. 2000. Gradient vector flow deformable models. In Handbook of Medical Imaging,
pp. 159–169. Ed I Bankman. New York: Academic Press.
14. Chen T, Metaxas D. 2002. Integration of Gibbs prior models and deformable models for 3D
medical image segmentation. In Proceedings of the 16th international conference on pattern
recognition, Vol. 1, pp. 719–722. Washington, DC: IEEE Computer Society.
15. Metaxas D, Chen T. 2004. A hybrid 3D segmentation framework. IEEE Int Symp Biomed Imaging
1:13–16.
16. Park J, Keller J. 2001. Snakes on the watershed. IEEE Trans Pattern Anal Machine Intell
23(10):1201–1205.
17. Zhu SC, Yuille AL. 1996. Region competition: unifying snakes, region growing, and Bayes/MDL
for multiband image segmentation. IEEE Trans Pattern Anal Machine Intell 18(9):884–900.
18. Thodberg HH, Rosholm A. 2003. Application of the active shape model in a commercial medical
device for bone densitometry. Image Vision Comput 21:1155–1161.
19. Nguyen HT, Worring MI, van den Boomgaard R. 2003. Watersnakes: energy-driven watershed
segmentation. IEEE Trans Pattern Anal Machine Intell 25(3):330–342.
20. Tek H, Kimia B. 1997. Volumetric segmentation of images by three-dimensional bubbles. Comput
Vision Image Understand 65(2):246–258.
21. Najman L, Schmitt M. 1994. Watershed of a continuous function. Signal Process 38:99–112.
22. Meyer F. 1994. Topographic distance and watershed lines. Signal Process 38:113–125.
23. Roerdink JBTM, Meijster A. 2001. The watershed transform: definitions, algorithms and paral-
lelization strategies. Fundam Inform 41(1–2):187–228.
24. Maragos P, Butt MA. 1998. Advances in differential morphology: image segmentation via eikonal
PDE and curve evolution and reconstruction via constrained dilation flow. In Mathematical mor-
phology and its applications to image and signal processing, pp. 167–174. Ed HJAM Heijmans,
JBTM Roerdink. Amsterdam: Kluwer Academic.
25. Meyer F. 2001. An overview of morphological segmentation. Int J Pattern Recognit Artif Intell
15(7):1089–1118.
26. Bueno G, Mussea O, Heitza F, Armspach JP. 2001. Three-dimensional segmentation of anatomical
structures in MR images on large data bases. Magn Reson Imaging 19(1):73–88.
27. Meyer F, Maragos P. 1999. Multiscale morphological segmentations based on watershed, flood-
ing, and eikonal PDE. Scale-space theories in computer vision, pp. 351–362. Ed M Nielsen,
P Johansen, OF Olsen, J Weickert. Berlin: Springer.
28. Beucher S. 2001. Geodesic reconstruction, saddle zones and hierarchical segmentation. Image
Anal Stereol 20:137–141.
29. Soille P. 2003. Morphological image analysis: principles and applications, 2nd ed. Berlin:
Springer.
30. Vincent L. 1993. Morphological grayscale reconstruction in image analysis: applications and
efficient algorithms. IEEE Trans Image Process 2(2):176–201.
31. Maragos P, Butt MA, Pessoa LFC. 1998. Two frontiers in morphological image analysis: dif-
ferential evolution models and hybrid morphological/linear neural networks. In Proceedings
of the international symposium on computer graphics, image processing and vision, Vol. 11,
pp. 10–17. http://cvsp.cs.ntua.gr/publications/confr/MaragosButtPesoa_DifMorfMRLNN_SIBGRAPI1998.pdf.
32. Adalsteinsson D, Sethian JA. 1995. A fast level set method for propagating interfaces. J Comput
Phys 118:269–277.
33. Sethian JA. 1996. A fast marching level set method for monotonically advancing fronts. Proc
Natl Acad Sci USA 93(4):1591–1595.
34. Sethian JA. 1999. Level set methods and fast marching methods: evolving interfaces in geometry,
fluid mechanics, computer vision, and materials science. Cambridge: Cambridge UP.
35. Vincent L, Soille P. 1991. Watersheds in digital spaces: an efficient algorithm based on immersion
simulation. IEEE Trans Pattern Anal Machine Intell 13(6):583–598.
36. Robinson K, Whelan PF. 2004. Efficient morphological reconstruction: a downhill filter. Pattern
Recognit Lett 25(15):1759–1767.
37. Marr D, Hildreth E. 1980. Theory of edge detection. Proc Roy Soc London B207:187–217.
38. Lorigo LM, Faugeras OD, Grimson WEL, Keriven R, Kikinis R, Westin C-F. 1999. Co-dimension
2 geodesic active contours for MRA segmentation. In Proceedings of the international
conference on information processing in medical imaging, pp. 126–133. Washington, DC: IEEE
Computer Society.
39. Warfield SK, Kaus M, Jolesz FA, Kikinis R. 2000. Adaptive, template moderated, spatially varying
statistical classification. Med Image Anal 4(1):43–55.
40. Yushkevich PA, Piven J, Cody H, Ho S, Gee JC, Gerig G. 2005. User-guided level set segmentation
of anatomical structures with ITK-SNAP. Insight J. To appear.
41. Kaus M, Warfield SK, Nabavi A, Jolesz FA, Black PM, Kikinis R. 2001. Automated segmentation
of MRI of brain tumors. Radiology 218(2):586–591.
42. Brenner DJ, Elliston CD, Hall EJ. 2001. Estimated risks of radiation-induced fatal cancer from
pediatric CT. Am J Roentgenol 176:289–296.
43. Curry TS, Dowdey JE, Murry RC. 1990. Christensen’s physics of diagnostic radiology. Philadel-
phia: Lea & Febiger.
44. McLean D, Barclay L, Li R, Ourselin S. 2006. Estimation of paediatric tissue characteristics from
CT image analysis. In Proceedings of the 6th international topical meeting on industrial radiation
and radioisotope measurement applications. Lecture notes in computer science, Vol. 3708. Ed J
Blanc-Talon, W Philips, DC Popescu, P Scheunders. Berlin: Springer. To appear.
45. Zhao H, Chan T, Merriman B, Osher S. 1996. A variational level set approach to multiphase
motion. J Comput Phys 127(1):179–195.
46. Paragios N, Deriche R. 2000. Coupled geodesic active regions for image segmentation: a level set
approach. In Proceedings of the sixth European conference on computer vision, Part 2. Lecture
notes in computer science, Vol. 1843, pp. 224–240. Berlin: Springer.
47. Vese LA, Chan TF. 2002. A multiphase level set framework for image segmentation using the
Mumford and Shah model. Int J Comput Vision 50(3):271–293.
48. Yezzi AJ, Tsai A, Willsky A. 2002. A fully global approach to image segmentation via coupled
curve evolution equations. J Vis Commun Image Represent 13:195–216.
49. Li S, Fevens T, Krzyzak A, Jin C, Li S. 2005. Toward automatic computer aided dental x-ray
analysis using level set method. In Proceedings of the international conference on medical image
computing and computer assisted intervention, pp. 670–678. Berlin: Springer.
10

DEFORMABLE MODELS FOR THE DETECTION OF ACUTE RENAL REJECTION

Ayman El-Baz et al.
Acute rejection is the most common reason for graft failure after kidney transplantation,
and early detection is crucial to survival of function in the transplanted kidney. In this study
we introduce a new framework for automatic classification of normal and acute rejection
transplants from Dynamic Contrast Enhanced Magnetic Resonance Images (DCE-MRI).
The proposed framework consists of three main steps. The first isolates the kidney from the
surrounding anatomical structures by evolving a deformable model based on two density
functions; the first function describes the distribution of the gray level inside and outside
the kidney region and the second describes the prior shape of the kidney. In the second step,
nonrigid registration algorithms are employed to account for the motion of the kidney due
to the patient’s breathing. In the third step, the perfusion curves that show transportation
of the contrast agent into the tissue are obtained from the segmented cortex of the whole
image sequence of the patient. In the final step, we collect four features from these curves
and use Bayesian classifiers to distinguish between acute rejection and normal transplants.
Applications of the proposed approach yield promising results that would, in the near future,
replace the use of current technologies such as nuclear imaging and ultrasonography, which
are not specific enough to determine the type of kidney dysfunction.
Address all correspondence to: Dr. Aly A. Farag, Professor of Electrical and Computer Engineering,
University of Louisville, CVIP Lab, Room 412, Lutz Hall, 2301 South 3rd Street, Louisville, KY
40208, USA. Phone: (502) 852-7510, Fax: (502) 852-1580. [email protected].
293
294 AYMAN EL-BAZ et al.
1. INTRODUCTION
In the United States, more than 12000 renal transplantations are performed
each year [1], but the transplanted kidneys face a number of surgical and medical
complications that cause a decrease in their functionality. Although such a decrease
in functionality can be reversed by proper treatment strategies and drug therapies
[2], the currently used techniques are not specific enough to diagnose the possible
diseases. For example, measurement of creatinine levels can be affected by diet and
medications [3]. Clearances of inulin and DTPA require multiple blood and urine
tests, and they provide information on both kidneys together, but not unilateral
information [4]. On the other hand, imaging tests are favorable since they provide
information on each kidney separately. However, the most frequently used imaging
technique, scintigraphy (also called nuclear imaging), preferred for its good func-
tional information, does not provide good spatial resolution. Without good spatial
resolution, precise anatomical detail cannot be obtained, so the diseases that affect
the different parts of the kidney (such as the cortex and medulla) cannot be diag-
nosed accurately [5]. Moreover, scintigraphy exposes the patients to a small dose
of radioactivity [6]. Another traditional imaging modality, Computed Tomography
(CT), uses nephrotoxic contrast agents and exposes patients to radiation despite
its superior functional and anatomical information [7]. Ultrasonography has been
found to show normal findings despite severe renal dysfunction [8], and several
studies on color Doppler sonography and power Doppler sonography (e.g., [9–
11]) have not been able to yield significant information to evaluate renal function.
Therefore, despite its high costs and morbidity rates, biopsy remains the gold
standard for diagnosis after renal transplantation. Unfortunately, the downside of
biopsy is that patients are subjected to such risks as bleeding and infection; more-
over, the relatively small needle biopsies may lead to over- or underestimation
of the extent of inflammation in the entire graft [12]. Hence, a noninvasive and
repeatable technique is not only useful but needed to ensure the survival of
transplanted kidneys. For this reason, a fairly new imaging technique, Dynamic
Contrast Enhanced Resonance Imaging (DCE-MRI), has gained considerable at-
tention due to its ability to yield superior anatomical and functional information.
With DCE-MRI it has been possible to distinguish the different structures of the
kidney (such as the cortex and medulla) in a noninvasive way, and, combined with
function information, image analysis with DCE-MRI can help in detecting diseases
that affect the different parts of renal transplants. Therefore, the CVIP Lab at the
University of Louisville and the Urology and Nephrology Center at the University
of Mansoura have begun an ongoing collaboration to detect acute rejection from
normal functioning of transplants. This study focuses on the DCE-MRI findings
of this collaboration.
DEFORMABLE MODELS FOR THE DETECTION OF ACUTE RENAL REJECTION 295
rotated and translated to get the best correlation, giving the registration parameters
of the image. The rotation in this scheme was limited to ±4 degrees in 1-degree
steps. In cases when the patient inadvertently moved or breathed, detection
of kidney contour was severely impeded; therefore, a 50% vote rule was defined,
that is, at least 50% of the kidney needed to be detected for registration to work.
A similar procedure with some extensions was used by Yim et al. in [22].
The handicaps of this image analysis scheme can be listed as follows: (i) the
need for manual selection of a contour for each study; (ii) the problems that can be
faced in case of a large movement by the patient, and (iii) the great computational
expense of the Hough transform (the algorithm was implemented in parallel in [25],
and one patient took about an hour to evaluate). Moreover, this registration method
is highly dependent on the edge detection filter. Although the strength of edge
detection was increased by using opposed-phase gradient-echo imaging, which puts
a dark line between the water-based kidney and the perirenal fatty tissue, the
algorithm still worked better in smaller areas, since increasing the field-of-view (FOV)
parameter caused it to match more partial contours. Following this
same procedure, healthy volunteers and hydronephrosis patients were compared
in [26], and DCE-MRI was shown to be a reliable method for kidney analysis.
Noting the lack of special protocols and the consequent problems with edge
detection in the registration process, the second image analysis study came from
Giele et al. [28] in 2001, where three movement correction methods were compared
based on image matching, phase difference movement detection (PDMD), and
cross-correlation. In all these methods, a mask is generated from the best image
manually, and its similarity to a given image is calculated. Consequently, the
(i, j) values that give the highest similarity become the translation parameters
for the given image. Among these methods, the PDMD method demonstrated
the best performance, but only with a 68% accuracy compared to the results of a
radiologist. More importantly, in all three registration algorithms only translational
motion was handled and rotational motion was not mentioned, the existence of
which has been discussed in a number of studies (see [26, 39, 40]).
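The purely translational matching used by all three correction methods can be sketched as an exhaustive search for the (i, j) shift maximizing a similarity score. This is a simplified illustration using normalized cross-correlation as the similarity measure; the actual mask-based measures of Giele et al. are not reproduced here, and all names are hypothetical.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equal-sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom else 0.0

def best_shift(mask, image, max_shift=5):
    """Try integer shifts (i, j) of `image` and return the one whose overlap
    with `mask` scores highest -- translation only, as in the three movement
    correction methods compared in the text."""
    best, best_ij = -np.inf, (0, 0)
    for i in range(-max_shift, max_shift + 1):
        for j in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(image, i, axis=0), j, axis=1)
            s = ncc(mask, shifted)
            if s > best:
                best, best_ij = s, (i, j)
    return best_ij
```

Because only (i, j) translations are searched, a rotated kidney cannot be matched exactly, which is precisely the limitation the text raises about these methods.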
For segmentation of the kidneys, Priester et al. [41] subtracted the average of
pre-contrast images from the average of early-enhancement images, and thresh-
olded the subtraction image to obtain a black-and-white mask kidney image. Fol-
lowing this step, objects smaller than a certain size were removed and the remaining
kidney object was closed with erosion operations and manual interactions to obtain
a kidney contour.
This approach was furthered by Giele et al. [5] by applying an erosion filter
to the mask image to obtain a contour via a second subtraction stage. The possible
gaps in this contour were closed by a hull function to get the boundary of the kid-
ney, then via repeated erosions applied to this contour, several rings were obtained,
which formed the basics of the segmentation of the cortex from the medulla struc-
tures. Of course, in such a segmentation the medulla structures were intermixed
298 AYMAN EL-BAZ et al.
with the cortex structures, so a correlation study had to be applied to better classify
the cortical and medullary pixels.
Also in 2001, Boykov et al. [42] presented the use of graph cuts using Markov
models. In this algorithm, each voxel is described as a vector of intensity values
over time, and initially, several seed points are put on the objects and on the
background to give user-defined constraints as well as an expert sample of intensity
curves. These expert samples of intensity curves are used to compute a two-
dimensional histogram that would be used further as a data penalty function in
minimizing the energy function in the Markov model. Although the results looked
promising, this algorithm was tested only on one kidney volume, and manual
interaction was still required.
Following these studies, computerized image analysis schemes for the regis-
tration and segmentation of kidneys were introduced by Sun et al. [12, 21, 23, 24,
40, 43] in a series of studies performed on rats and human subjects. The study
on humans ([21, 40]) made use of a multistep registration approach. Initially, the
edges were aligned using an image gradient-based similarity measure consider-
ing only translational motion. Once roughly aligned, a high-contrast image was
subtracted from a pre-contrast image to obtain a kidney contour, which was then
propagated over the other frames searching for the rigid registration parameters
(rotation and translation). For segmentation of the cortex and medulla, the level
sets approach of Chan et al. [44] was used.
In most of these previous efforts, healthy transplants were used in the image
analysis, so the edge detection algorithms were applicable. However, in the case
of acute rejection patients, the uptake of contrast agent was decreased, so edge
detection algorithms generally failed in giving connected contours. Therefore, in
our approach, we avoid edge detection schemes; instead, we combine the use of
gray-level and prior shape information.
Both active contours and level sets tend to fail in the case of noise, poor image
resolution, diffused boundaries, or occluded shapes if they do not take advantage of
the a priori models. Four of the popular segmentation algorithms that make use of
only gray-level information are shown in Figure 1 implemented with ITK Version
2.0; namely, thresholding level sets [45], fast marching level sets [46], and geodesic
active contours [47]. However, especially in the area of medical imaging, organs
have well-constrained forms within a family of shapes [48]. Thus, additional
constraints based on the shape of the objects have been greatly needed aside from
the gray-level information of these objects.
Therefore, to allow shape-driven segmentation, Leventon et al. [49] used a
shape prior whose variance was obtained thorough Principal Component Analy-
sis (PCA), and used this shape prior to evolving the level sets to the maximum a
DEFORMABLE MODELS FOR THE DETECTION OF ACUTE RENAL REJECTION 299
(a)
(b) (c)
(d) (e)
Figure 1. (a) Kidney to be segmented from the surrounding organs. The results of some
popular segmentation algorithms that depend only on gray-level and gradient information,
(b) connected thresholding, (c) fast marching level sets, (d) geodesic active contours, and
(e) thresholding level sets segmentation.
300 AYMAN EL-BAZ et al.
the study is concluded in Section 9 with speculation about future work and further
discussion.
3. Renogram generation.
4. Feature extraction.
5. KIDNEY SEGMENTATION
Figure 2. Example of a DCE-MRI series. For each patient, 150 images are taken from one
cross-section with 4-second intervals. Eight of one subject (numbers 1, 4, 5, 6, 10, 15, 21,
27) are shown here to give an idea of the protocol.
DEFORMABLE MODELS FOR THE DETECTION OF ACUTE RENAL REJECTION 303
Figure 3. Block diagram of the proposed image analysis to create a CAD system for renal
transplantation. See attached CD for color version.
where ξint (φ(τ )) and ξext (φ(τ )) denote the internal and external forces, respec-
tively, that control the pointwise model movements. The total energy is the sum of
two terms: the internal energy keeping the deformable model as a single unit and
the external one attracting the model to the region boundary. The internal force
is typically defined as ξint (φ(τ ) = ς|φ (τ )|2 + κ|φ (τ )|2 , where weights ς and κ
control the curve’s tension and rigidity, respectively, and φ (τ ) and φ (τ ) are the
first and second derivatives of φ(τ ) with respect to τ .
Typical external forces designed in [57] to lead an active contour toward step
edges in a grayscale image Y are:
ξext φ(τ ) = −|∇Y(φ(τ ))|2 or
2 , (2)
−∇ G(φ(τ )) ∗ Y(φ(τ ))
where G(. . .) is a 2D Gaussian kernel and ∇ denotes the gradient operator. But both
these and other traditional external forces (e.g., based on lines, edges, or the GVF)
fail to make the contour closely approach an intricate boundary with concavities.
304 AYMAN EL-BAZ et al.
Moreover, due to high computational complexity, the deformable models with most
of such external energies are slow compared to other segmentation techniques.
As a solution to these problems, we modify the external energy component
of this energy formulation, and we formulate an energy function using the density
estimations of two distributions: the signed distance map from shape models and
the gray-level distribution [58]. The external energy component of our deformable
models is formulated as:
−pg (q|k)ps (d|k) if k = k ∗
ξext φ(τ ) = .
pg (q|k)ps (d|k) if k = k ∗
where RV is the region lying inside the kidney shape and S((i, j), Vm ) is
the minimum Euclidean distance between image location (i, j) and curve
Vm . The results of this step are shown in Figure 4e,f.
6. Compute the empirical density of the aligned signed distance maps (Figure
4e) as shown in Figure 4g.
7. Calculate the average signed distance map of the kidney at location (i, j)
as:
M
1
d(i, j) = dm (i, j). (4)
M m=1
Then threshold the sign distance map at zero level to obtain an average
shape.
Figure 5a shows the average signed distance map for a kidney object. We get
the average shape of the kidney by thresholding the signed distance map shown
in Figure 5a at zero level, the result of which is shown in Figure 5b. In the same
way, we calculate the average empirical densities of the empirical densities shown
in Figure 4g, with the result shown in Figure 6. With this approach, all the shape
variability is gathered into one density function. Compared to the other approaches
such as that of [60], we do not need to conduct a principal component analysis,
which is difficult for a big database that contains big images (size 600 × 600).
Figure 7 evaluates the quality of MI-based affine alignment, with the region
maps images being pixelwise averages of all the training maps images, m =
1, . . . , M , before and after mutual alignment of training set M . Similar shapes
are significantly overlapped after the alignment, that is, we decrease the variability
between shapes.
(a)
(b)
(c)
(d)
(e)
200 200 200 200
0 0 0 0
−200 −200 −200 −200
−400 −400 −400 −400
600 600 600 600
400 600 400 600 400 600 400 600
400 400 400 400
(f) 200
11
200 200
11
200 200
11
200 200
11
200
Figure 4. Steps of shape reconstruction. Samples of the database (a), manual segmentation
results (b), affine mutual information registration (c), binarization (d), signed distance maps
(e), level sets functions (f), and the empirical densities of signed distance maps (g). See
attached CD for color version.
DEFORMABLE MODELS FOR THE DETECTION OF ACUTE RENAL REJECTION 307
(a) (b)
Figure 5. Average signed distance map for the kidney object (a), and the average shape of
the kidney after thresholding the average signed distance map from the zero level (b). See
attached CD for color version.
Figure 6. Average empirical signed distance map density representing the shape. It is
calculated as the average empirical density of empirical densities shown in Figure 4g.
The positive distances indicate the kidney area, while the negative distances indicate the
background.
In the following we describe this model to estimate the marginal density for
the gray-level distribution pg (q) in each region. The same approach is used to
estimate the density of the signed distances ps (d) for the object and background.
308 AYMAN EL-BAZ et al.
(a) (b)
Figure 7. Comparison of the shape overlaps in the training data sets before (a) and after
(b) alignment.
Let Q = {0, . . . , Q} denote sets of gray levels q in the given image. Here,
Q + 1 is the number of gray levels in the given image. Let S = {(i, j) : 1 ≤ i ≤
I, 1 ≤ j ≤ J} be a finite arithmetic grid supporting gray level images Y : S →
Q. The discrete Gaussian (DG) is defined as the discrete probability distribution
Ψθ = (ψ(q|θ) : q ∈ Q) on Q such that ψ(q|θ) = Φθ (q + 0.5) − Φθ (q − 0.5)
for q = 1, . . . , Q − 2, ψ(0|θ) = Φθ (0.5), ψ(Q − 1|θ) = 1 − Φθ (Q − 1.5) where
Φθ (q) is the cumulative Gaussian (normal) probability function with a shorthand
notation θ = (µ, σ 2 ) for its mean, µ, and variance, σ 2 .
In contrast to a conventional mixture of Gaussians and/or other simple dis-
tributions, one per region, we closely approximate the empirical gray-level dis-
tribution for the given image with an LCDG having Cp positive and Cn negative
components:
Cp Cn
pg:w,Θ (q) = wp,r ψ(q|θp,r ) − wn,l ψ(q|θn,l ), (5)
r=1 l=1
To estimate the parameters of the model in Eq. (5), we modify the conventional EM algorithm to take into account both positive and negative discrete Gaussian components. The details of the algorithm are described below.
p_{w,Θ}(q) = Σ_{r=1}^{C_p} w_{p,r} ψ(q|θ_{p,r}) − Σ_{l=1}^{C_n} w_{n,l} ψ(q|θ_{n,l}).   (7)

In line with Eq. (7), the weights w are restricted so that the LCDG integrates to unity over Q, as shown in Eq. (6):

Σ_{r=1}^{C_p} w_{p,r} − Σ_{l=1}^{C_n} w_{n,l} = 1.   (6)
We also assume here that the numbers Cp and Cn of the components of each
type are known after the initialization in Section 5.4 and do not change during the
EM process. The initialization also provides the starting parameter values w[0]
and Θ[0] .
The probability densities form a proper subset of the set of LCDGs due to the additional restriction pw,Θ(q) ≥ 0, which holds automatically only for mixtures with no negative components. As mentioned earlier, this special feature is ignored because our goal is only to closely approximate the empirical data within the limited range [0, Q]. The approximating function of Eq. (7) is assumed strictly positive only at the points q = 0, 1, . . . , Q.
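Evaluating an LCDG and checking this positivity restriction at q = 0, 1, . . . , Q can be sketched as follows (hypothetical helper names; the chapter itself provides no code):

```python
import math

def _dg(q, mu, sigma2, Q):
    """Discrete Gaussian psi(q|theta) on 0..Q, tails folded into end bins."""
    cdf = lambda x: 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0 * sigma2)))
    lo = 0.0 if q == 0 else cdf(q - 0.5)
    hi = 1.0 if q == Q else cdf(q + 0.5)
    return hi - lo

def lcdg(q, w_pos, theta_pos, w_neg, theta_neg, Q=255):
    """Linear combination of discrete Gaussians, Eqs. (5) and (7):
    positive mixture minus negative mixture at gray level q."""
    p = sum(w * _dg(q, mu, s2, Q) for w, (mu, s2) in zip(w_pos, theta_pos))
    n = sum(w * _dg(q, mu, s2, Q) for w, (mu, s2) in zip(w_neg, theta_neg))
    return p - n

def is_valid_density(w_pos, theta_pos, w_neg, theta_neg, Q=255):
    """The extra restriction p(q) >= 0 must be checked explicitly once
    negative components are present."""
    return all(lcdg(q, w_pos, theta_pos, w_neg, theta_neg, Q) >= 0.0
               for q in range(Q + 1))
```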
The log-likelihood of the empirical data under the assumed independent sig-
nals is as follows:
L(w, Θ) = (1/|S|) Σ_{(i,j)∈S} log p_{w,Θ}(Y_{i,j}) = (1/|S|) Σ_{q∈Q} |S| f(q) log p_{w,Θ}(q) = Σ_{q∈Q} f(q) log p_{w,Θ}(q).   (8)
The LCDG that provides a local maximum of the log-likelihood in Eq. (8) can
be found using the iterative block relaxation process extending the conventional
scheme in [61] that was proposed initially in [62].
Let

p^{[m]}_{w,Θ}(q) = Σ_{r=1}^{C_p} w^{[m]}_{p,r} ψ(q|θ^{[m]}_{p,r}) − Σ_{l=1}^{C_n} w^{[m]}_{n,l} ψ(q|θ^{[m]}_{n,l})

denote the LCDG at iteration m. The conditional weights

π^{[m]}_p(r|q) = w^{[m]}_{p,r} ψ(q|θ^{[m]}_{p,r}) / p^{[m]}_{w,Θ}(q);   π^{[m]}_n(l|q) = w^{[m]}_{n,l} ψ(q|θ^{[m]}_{n,l}) / p^{[m]}_{w,Θ}(q)   (9)

specify the relative contributions of each DG to p^{[m]}_{w,Θ}(q), so that for every q = 0, . . . , Q

Σ_{r=1}^{C_p} π^{[m]}_p(r|q) − Σ_{l=1}^{C_n} π^{[m]}_n(l|q) = 1.   (10)
Multiplying the right-hand side of Eq. (8) by the left-hand side of Eq. (10), which is valid since the latter equals unity, we can rewrite the log-likelihood of Eq. (8) in the equivalent form:
L(w^{[m]}, Θ^{[m]}) = Σ_{q=0}^{Q} f(q) [ Σ_{r=1}^{C_p} π^{[m]}_p(r|q) log p^{[m]}_{w,Θ}(q) ] − Σ_{q=0}^{Q} f(q) [ Σ_{l=1}^{C_n} π^{[m]}_n(l|q) log p^{[m]}_{w,Θ}(q) ].   (11)
The next equivalent form, which is more convenient for specifying the block relaxation process, is obtained by replacing log p^{[m]}_{w,Θ}(q) in the first and second brackets with the equal terms log w^{[m]}_{p,r} + log ψ(q|θ^{[m]}_{p,r}) − log π^{[m]}_p(r|q) and log w^{[m]}_{n,l} + log ψ(q|θ^{[m]}_{n,l}) − log π^{[m]}_n(l|q), respectively, which follow directly from Eq. (9).
The block relaxation converges to a local maximum of the likelihood func-
tion in Eq. (11) by repeating iteratively the following two steps of conditional
maximization (comprising the E-step and the M-step, respectively [61]):
1. Find the conditional weights of Eq. (9) by maximizing the log-likelihood
L(w, Θ) under the fixed parameters w[m] , Θ[m] from the previous iteration
m, and
2. Find the parameters w[m+1], Θ[m+1] by maximizing L(w, Θ) under the fixed conditional weights of Eq. (9),

until the changes of the log-likelihood and all the model parameters become small.
The first step, performing the conditional Lagrange maximization of the log-likelihood of Eq. (11) under the Q + 1 restrictions of Eq. (10), results just in the conditional weights π^{[m+1]}_p(r|q) and π^{[m+1]}_n(l|q) of Eq. (9) for all r = 1, . . . , C_p; l = 1, . . . , C_n and q = 0, . . . , Q. The second step then yields the mixing weights:

w^{[m+1]}_{p,r} = Σ_{q∈Q} f(q) π^{[m+1]}_p(r|q);   w^{[m+1]}_{n,l} = Σ_{q∈Q} f(q) π^{[m+1]}_n(l|q).
The expected parameters Θ^{[m+1]} of each Gaussian have conventional forms that follow from the unconditional maximization of the log-likelihood of Eq. (11):

µ^{[m+1]}_{c,r} = (1 / w^{[m+1]}_{c,r}) Σ_{q∈Q} q f(q) π^{[m+1]}_c(r|q),

(σ^{[m+1]}_{c,r})² = (1 / w^{[m+1]}_{c,r}) Σ_{q∈Q} (q − µ^{[m+1]}_{c,r})² f(q) π^{[m+1]}_c(r|q),

where c stands for p or n.
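One iteration of this modified EM scheme can be sketched in plain Python (an illustration under stated assumptions: densities are handled as length-(Q+1) lists, and the guards against vanishing p(q) and negative rounding are our numerical additions, not part of the chapter's derivation):

```python
import math

def em_step(f, w_p, th_p, w_n, th_n):
    """One modified-EM iteration for an LCDG.
    f: empirical relative frequencies f(q), q = 0..Q.
    E-step: conditional weights pi(r|q) = w * psi(q|theta) / p(q), Eq. (9).
    M-step: re-estimate the weights and each DG's mean and variance."""
    Q = len(f) - 1
    def dg(q, mu, s2):  # discrete Gaussian, tails folded into the end bins
        cdf = lambda x: 0.5 * (1 + math.erf((x - mu) / math.sqrt(2 * s2)))
        return max((1.0 if q == Q else cdf(q + 0.5))
                   - (0.0 if q == 0 else cdf(q - 0.5)), 0.0)
    psi_p = [[dg(q, mu, s2) for q in range(Q + 1)] for mu, s2 in th_p]
    psi_n = [[dg(q, mu, s2) for q in range(Q + 1)] for mu, s2 in th_n]
    # current model p(q), guarded away from zero before the divisions below
    p = [max(sum(w * ps[q] for w, ps in zip(w_p, psi_p))
             - sum(w * ps[q] for w, ps in zip(w_n, psi_n)), 1e-300)
         for q in range(Q + 1)]
    out = []
    for ws, psis in ((w_p, psi_p), (w_n, psi_n)):
        new_w, new_th = [], []
        for w, ps in zip(ws, psis):
            pi = [w * ps[q] / p[q] for q in range(Q + 1)]          # E-step
            nw = sum(f[q] * pi[q] for q in range(Q + 1))           # weight
            mu = sum(q * f[q] * pi[q] for q in range(Q + 1)) / nw  # mean
            s2 = sum((q - mu) ** 2 * f[q] * pi[q]
                     for q in range(Q + 1)) / nw                   # variance
            new_w.append(nw)
            new_th.append((mu, max(s2, 1e-6)))
        out.append((new_w, new_th))
    return out[0] + out[1]  # (w_p', th_p', w_n', th_n')
```

With a single positive component the conditional weights reduce to one, and the M-step recovers the empirical mean and variance directly.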
3. Compute the scaling factor for the deviations: Scale = ∫_{−∞}^{∞} δ_p(q) dq ≡ ∫_{−∞}^{∞} δ_n(q) dq.
4. If the factor Scale is less than an accuracy threshold, terminate and return
the model PC = P2 .
5. Otherwise, consider the scaled-up absolute deviations (1/Scale)∆_p and (1/Scale)∆_n
as two new “empirical densities” and use iteratively the conventional EM
algorithm to find sizes Cp and Cn of the Gaussian mixtures, Pp and Pn ,
respectively, approximating the best scaled-up deviations.
(a) The size of each mixture corresponds to the minimum of the integral absolute error between the scaled-up absolute deviation ∆p (or ∆n) and its model Pp (or Pn). The number of components is increased by one at a time as long as the error keeps decreasing.
(b) Due to multiple local maxima, such a search may be repeated several
times with different initial parameter values in order to select the best
approximation.
6. Scale down the subordinate models Pp and Pn (i.e., scale down the weights
of their components) and add the scaled model Pp to and subtract the scaled
model Pn from the dominant model P2 in order to form the desired model
PC of the size C = 2 + Cp + Cn .
We use the Levy distance [63], ρ(F, P), between the estimated model P and the empirical distribution F to evaluate the approximation quality. The distance is defined as the minimum positive value α such that the two-sided inequalities P(q − α) − α ≤ F(q) ≤ P(q + α) + α hold for all q ∈ Q, where F and P denote the corresponding cumulative distribution functions. It has been proven [63] that the model P weakly converges to F when ρ(F, P) → 0.
Our experiments show that the modified EM algorithm typically decreases an ini-
tially large Levy distance between the empirical distribution and its estimated
model to a relatively small value.
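The Levy distance between two discrete densities can be approximated by scanning candidate values of α against their step-function CDFs (a sketch; the grid resolution is our assumption):

```python
def levy_distance(f, p, steps=2000):
    """Approximate Levy distance between two densities on q = 0..Q:
    the smallest alpha with P(q - alpha) - alpha <= F(q) <= P(q + alpha) + alpha
    for all q, found by scanning a uniform grid of alpha values."""
    Q = len(f) - 1
    F, P, cf, cp = [], [], 0.0, 0.0
    for q in range(Q + 1):          # cumulative distribution functions
        cf += f[q]; cp += p[q]
        F.append(cf); P.append(cp)
    def P_at(x):                    # step CDF extended to the reals
        if x < 0:
            return 0.0
        if x > Q:
            return 1.0
        return P[int(x)]
    for k in range(steps + 1):
        alpha = k * (Q + 1) / steps
        if all(P_at(q - alpha) - alpha <= F[q] <= P_at(q + alpha) + alpha
               for q in range(Q + 1)):
            return alpha
    return float(Q + 1)
```

Identical densities give a distance of zero; shifting a density's mass increases the distance accordingly.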
e(t) = ∫_{−∞}^{t} p(q|2) dq + ∫_{t}^{∞} p(q|1) dq.   (14)
Figure 8. Estimated two dominant modes that present the kidney area and its background
(a); the scaled-up absolute deviation of the approximation and its LCDG model (b); ap-
proximation error for the scaled absolute deviation as a function of number of subordinate
Gaussians (c); density estimation of the scaled-up absolute deviation (d); final LCDG model
(e); and log-likelihood changes at the EM iterations (f).
Figure 9. Final two-class LCDG approximation of the mixed density (a); and final LCDG
models for each class (b).
Figure 9 presents the final estimated density using the LCDG model for both
mixed density and the marginal density for each class. In Figure 9a the Levy
distance between the empirical density and the estimated density is 0.008, which
indicates a good match between the empirical distribution and the estimated one.
It is important to note that this shape density is calculated only once; then it
is used as it is during kidney segmentation of all other given abdomen images.
1. Register the given image to one of the aligned images in the database using 2D affine registration with mutual information as a similarity measure. This step makes the algorithm invariant to scaling, rotation, and translation.
5. Initialize the control points φ(τ ) for the deformable model, and for each control point φ(τ ) on the current deformable model, calculate signed distances indicating exterior (−) or interior (+) positions of each of the eight nearest neighbors w.r.t. the contour as shown in Figure 10.

Figure 10. Greedy propagation of the deformable model (a). The deformable model in (a) is initialized in the given kidney to be segmented (b). See attached CD for color version.
6. Check the label k for each control point:
7. If the iteration adds new control points, use the bicubic interpolation of the
whole contour and then smooth all its control points with a lowpass filter.
8. Repeat steps 6 and 7 until no positional changes in the control points occur.
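Steps 5–8 describe a greedy contour evolution. A minimal sketch, assuming each control point simply moves to whichever of its eight neighbors (or stays put) minimizes a caller-supplied energy combining the gray-level and shape terms (the chapter's exact energy and the interpolation/smoothing of step 7 are not reproduced here):

```python
def greedy_step(contour, energy):
    """One greedy iteration: each control point (x, y) moves to the
    8-neighbor (or current position) with the lowest energy."""
    moves = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 0),
             (0, 1), (1, -1), (1, 0), (1, 1)]
    new_contour, changed = [], False
    for (x, y) in contour:
        best = min(((x + dx, y + dy) for dx, dy in moves), key=energy)
        if best != (x, y):
            changed = True
        new_contour.append(best)
    return new_contour, changed

def evolve(contour, energy, max_iter=500):
    """Repeat greedy steps until no control point moves (step 8)."""
    for _ in range(max_iter):
        contour, changed = greedy_step(contour, energy)
        if not changed:
            break
    return contour
```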
The first step of our approach is to estimate the density from the given image
for both the kidney object and its background. Unlike the shape density estimation,
the gray-level density estimation step is repeated for every image to be segmented, so that our approach adapts to all data.
Figure 11 shows the initial approximation of the bimodal empirical distribution
of Q = 256 gray levels over a typical DCE-MRI (Dynamic Contrast-Enhanced
Magnetic Resonance Imaging) slice of human abdomen. The dominant modes
represent the brighter kidney area and its darker background, respectively. After
the additive and subtractive parts of the absolute deviation are approximated with
the DG mixtures, the initial mixed LCDG model consists of the 2 dominant, 4
additive, and 4 subtractive DGs, that is, Cp = 6 and Cn = 4. The LCDG models
of each class are obtained with t = 78, ensuring the best class separation.
Figure 12 presents the final LCDG model obtained by refining the above
initial one using the modified EM algorithm introduced in Section 5.3. The first
37 iterations of the algorithm increase the log-likelihood of Eq. (11) from −6.90
to −4.49. Also, the Levy distance between the empirical density and the estimated
density is 0.007, which indicates a good match between the empirical distribution
and the estimated one.
The second step of the proposed approach is the alignment step. Figure
13a demonstrates one of our aligned databases. Figure 13b shows one kidney
image that we need to segment. Figure 13c shows the result of the registration of
(a) and (b) using MI. Figure 14 shows the final steps of the proposed approach.
The resulting segmentation in Figure 14c has an error of 0.23% with respect to
radiologist segmentation (ground truth). Figure 15 shows another example of
kidney segmentation using the proposed approach. To highlight the accuracy
of the proposed approach, we compared the proposed approach with the more
conventional geometric deformable model presented in [60], where the level set-
based shape prior gives an error of 4.9% (Figure 15g). Similar results for four
other kidney images in Figure 16 suggest that the latter approach fails to detect
sizeable fractions of the goal objects.
In the above examples, the abdominal images were from different patients to
show the applicability of our approach, and also to demonstrate the difficulty of
the problem. From here on, we will be using the images of only one patient to
illustrate the steps of our approach. The segmentation results of one patient are
given in Figure 17, a part of whose sequence was given in Figure 2.
Figure 11. Initial LCDG model of the bimodal empirical gray-level distribution: the DCE-MRI
slice (a), its empirical gray-level distribution approximated with the dominant mixture of
the DGs (b), the scaled-up absolute deviation of the approximation and its LCDG model
(c), and the LCDG model of each class (d) for the best separating threshold t = 78. See
attached CD for color version.
Figure 12. Final 2-class LCDG model (a), log-likelihood changes at the EM iterations (b),
ten components of the final LCDG (c), the final LCDG model of each class for the best
separating threshold t = 85 (d). See attached CD for color version.
Figure 13. Kidney from the aligned database (a); a kidney to segment (b); alignment of (a)
and (b) using affine mutual information registration (c).
Figure 14. Initialization of deformable model (a); final segmentation of the aligned image
(b); final segmentation (c) after multiplying the aligned image by inverse transformation
(error = 0.23%); and radiologist segmentation (d). See attached CD for color version.
Figure 15. Chosen training kidney prototype (a); an image to be segmented (b); its align-
ment to the prototype (c); the contour of the training prototype superimposed on the aligned
test image (d); the segmentation result (e); the same result (f) after its inverse affine trans-
form to the initial image (b) (total error 0.63% in comparison to ground truth (h); the final
boundary and the ground truth are in red and green, respectively), and the segmentation (g)
with the algorithm in [60] (the total error 4.9% in comparison to ground truth (h)). See
attached CD for color version.
Figure 16. Segmentation of four other kidney DCE-MR images: the left column — our
final boundaries and the ground truth (in red and green, respectively); the right column
— the segmentation with the algorithm in [60] vs. the ground truth (in red and green,
respectively). See attached CD for color version.
Figure 17. Segmentation results of a DCE-MRI series, a part of which was shown in Figure 2.
Figure 18. Distance map of two kidneys (a, b) and the samples of isocontours (c, d). See
attached CD for color version.
Figure 19. Model constraints (a), and model evolution (b). See attached CD for color
version.
7. CORTEX SEGMENTATION
te Strake et al. [64] have shown that the most important signal characteristics
come from the cortex of the kidney during acute rejection. Therefore, the final
step of our approach is to segment the cortex from the segmented kidney. To
achieve this task, we use the same approach but based now only on intensity. At
this step, since all the kidneys are aligned together, we select seed points from
the medulla regions, and evolve our deformable model based only on intensity.
After we extract the medullary regions, the rest is cortex, which is used as a mask
and propagated over the whole sequence to plot the average cortex intensity. In
Figure 21 we show the cortex segmentation results. In Figure 21a we manually
initialize several deformable models inside the medullary regions, and we allow our deformable model to evolve in these regions using gray-level information, as shown in Figure 21b. The cortex mask is applied to the rest of the sequence as shown in Figure 21c–h.
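The masking and renogram step can be sketched as follows (hypothetical function and argument names): the cortex is the kidney minus the extracted medullary regions, and the fixed mask is applied to every aligned frame to produce the average-intensity curve.

```python
def cortex_renogram(kidney_mask, medulla_mask, frames):
    """Average cortex intensity per frame: cortex = kidney minus medulla,
    with the same mask propagated over the whole aligned sequence."""
    H, W = len(kidney_mask), len(kidney_mask[0])
    cortex = [(y, x) for y in range(H) for x in range(W)
              if kidney_mask[y][x] and not medulla_mask[y][x]]
    return [sum(img[y][x] for y, x in cortex) / len(cortex) for img in frames]
```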
8. RESULTS
Figure 21. Segmentation of the cortex from kidney images. Several medullary seeds
are initialized (a), and the deformable model grows from these seed points (b). After the
medulla is extracted from the kidney, the cortex is propagated over the whole sequence of
images, as shown in (c)-(h). See attached CD for color version.
Figure 22. Cortex intensity vs. scan number from 4 subjects. There are 4 seconds between each scan. Subjects 1 and 2 have acute rejection, subject 3 is normal, and subject 4 has chronic glomerulopathy proved by biopsy. In these cortical renograms the normal patient
shows the expected abrupt increase in intensity along with a fast decrease, followed by a
constant valley and a slow decrease. On the other hand, these abrupt patterns are not seen
in acute rejection patients; there is no definite peak, and the time to reach peak intensity is
delayed. Subject 4 shows that DCE-MRI is also powerful in distinguishing other diseases.
See attached CD for color version.
9. CONCLUSION
In this chapter we presented a framework for the detection of acute renal rejec-
tion from Dynamic Contrast Enhanced Magnetic Resonance Images that includes
segmentation of kidneys from abdomen images, nonrigid registration, and Bayes
classification. For segmentation of kidneys from the abdomen images, we intro-
duced a new deformable model that evolves with both the gray-level information
of a given abdomen image, and the shape information obtained from a database
of manually segmented kidneys. The energy function of this deformable model
is a combination of (i) the gray-level density and (ii) the prior shape information
as a 1D density function. For these density estimations we introduced a modified
EM algorithm that closely approximates the densities. Following segmentation,
we introduced a nonrigid registration algorithm that deforms the kidney object on
isocontours instead of a square lattice, which provides more degrees of freedom to
obtain accurate deformation. After nonrigid registration, the kidney is segmented
into cortex and medulla, and the average gray-level value of the cortex for the
whole sequence of a patient is plotted. The features extracted from these signal
10. REFERENCES
1. 2000 annual report of the U.S. scientific registry of transplant recipients and the organ procurement and transplantation network: transplant data 1990–1999. 2001. Richmond, VA: United Network for Organ Sharing.
2. Sharma RK, Gupta RK, Poptani H, Pandey CM, Gujral RB, Bhandari M. 1995. The magnetic
resonance renogram in renal transplant evaluation using dynamic contrast-enhanced MR imaging.
Radiology 59:1405–1409.
3. Kasiske BL, Keane WF. 1996. Laboratory assessment of renal disease: clearance, urinalysis, and
renal biopsy. In The kidney, 5th ed., pp. 1137–1173. Ed BM Brenner, FC Rector. Philadelphia:
Saunders.
4. Bennett HF, Li D. 1997. MR imaging of renal function. Magn Reson Imaging Clin North Am
5(1):107–126.
5. Giele ELW. 2002. Computer methods for semi-automatic MR renogram determination. PhD
dissertation. Department of Electrical Engineering, University of Technology, Eindhoven.
6. Taylor A, Nally JV. 1995. Clinical applications of renal scintigraphy. Am J Roentgenol 164:31–41.
7. Katzberg RW, Buonocore MH, Ivanovic M, Pellot-Barakat C, Ryan RM, Whang K, Brock JM,
Jones CD. 2001. Functional, dynamic and anatomic MR urography: feasibility and preliminary
findings. Acad Radiol 8:1083–1099.
8. Tublin ME, Bude RO, Platt JF. 2003. The resistive index in renal Doppler sonography: where do
we stand? Am J Roentgenol 180(4):885–892.
9. Huang J, Chow L, Sommer FG, Li KCP. 2001. Power Doppler imaging and resistance index
measurement in the evaluation of acute renal transplant rejection. J Clin Ultrasound 29:483–490.
10. Turetschek K, Nasel C, Wunderbaldinger P, Diem K, Hittmair K, Mostbeck GH. 1996. Power
Doppler versus color Doppler imaging in renal allograft evaluation. J Ultrasound Med 15(7):517–
522.
11. Trillaud H, Merville P, Tran Le Linh P, Palussiere J, Potaux L, Grenier N. 1998. Color Doppler
sonography in early renal transplantation follow-up: resistive index measurements versus power
Doppler sonography. Am J Roentgenol 171(6):1611–1615.
12. Yang D, Ye Q, Williams M, Sun Y, Hu TCC, Williams DS, Moura JMF, Ho C. 2001. USPIO
enhanced dynamic MRI: evaluation of normal and transplanted rat kidneys. Magn Reson Med
46:1152–1163.
13. Chan L. 1999. Transplant rejection and its treatment. In Atlas of diseases of the kidney, Vol. 5,
chap. 9. Series Ed RW Schrier. Philadelphia: Current Medicine Inc.
14. Szolar DH, Preidler K, Ebner F, Kammerhuber F, Horn S, Ratschek M, Ranner G, Petritsch P,
Horina JH. 1997. Functional magnetic resonance imaging of the human renal allografts during
the post-transplant period: preliminary observations. Magn Reson Imaging 15(7):727–735.
15. Lorraine KS, Racusen C. 1999. Acute tubular necrosis in an allograft. Atlas of diseases of the
kidney, Vol. 1, chap. 10. Series Ed RW Schrier. Philadelphia: Current Medicine Inc.
16. Krestin GP, Friedmann G, Steinbrich W. 1988. Gd-DTPA enhanced fast dynamic MRI of the
kidneys and adrenals. Diagn Imaging Int 4:40–44.
17. Krestin GP, Friedmann G, Steinbrich W. 1988. Quantitative evaluation of renal function with
rapid dynamic gadolinium-DTPA enhanced MRI. In Proceedings of the international society for
magnetic resonance in medicine, Book of Abstracts. Los Angeles: MRSTS.
18. Frank JA, Choyke PL, Girton M. 1989. Gadolinium-DTPA enhanced dynamic MR imaging in
the evaluation of cisplatinum nephrotoxicity. J Comput Assist Tomogr 13:448–459.
19. Knesplova L, Krestin GP. 1998. Magnetic resonance in the assessment of renal function. Eur
Radiol 8:201–211.
20. Choyke PL, Frank JA, Girton ME, Inscoe SW, Carvlin MJ, Black JL, Austin HA, Dwyer AJ.
1989. Dynamic Gd–DTPA-enhanced MR imaging of the kidney: experimental results. Radiology
170:713–720.
21. Sun Y, Jolly M, Moura JMF. 2004. Integrated registration of dynamic renal perfusion MR im-
ages. In Proceedings of the IEEE international conference on image processing, pp. 1923–1926.
Washington, DC: IEEE.
22. Yim PJ, Marcos HB, McAuliffe M, McGarry D, Heaton I, Choyke PL. 2001. Registration of
time-series contrast enhanced magnetic resonance images for renography. In Proceedings of the
14th IEEE symposium on computer-based medical systems, pp. 516–520. Washington, DC: IEEE.
23. Sun Y, Moura JMF, Ho C. 2004. Subpixel registration in renal perfusion MR image sequence. In
Proceedings of an IEEE international symposium on biomedical imaging, pp. 700–703, Wash-
ington, DC: IEEE.
24. Sun Y, Moura JMF, Yang D, Ye Q, Ho C. 2002. Kidney segmentation in MRI sequences using
temporal dynamics. In Proceedings of an IEEE international symposium on biomedical imaging,
pp. 98–101. Washington, DC: IEEE.
25. Gerig G, Kikinis R, Kuoni W, van Schulthess GK, Kubler O. 1992. Semiautomated ROI analysis in
dynamic MRI studies, part I: image analysis tools for automatic correction of organ displacements.
IEEE Trans Image Process 11(2):221–232.
26. von Schulthess GK, Kuoni W, Gerig G, Duewell S, Krestin G. 1991. Semiautomated ROI analysis
in dynamic MRI studies, part II: application to renal function examination, first experiences.
J Comput Assist Tomogr 2:733–741.
27. Liang Z, Lauterbur PC. 1994. An efficient method for dynamic magnetic resonance imaging.
IEEE Trans Med Imaging 13(4):677–686.
28. Giele ELW, de Priester JA, Blom JA, den Boer JA, van Engelshoven JMA, Hasman A, Geerlings
M. 2001. Movement correction of the kidney in dynamic MRI scans using FFT phase difference
movement detection. J Magn Reson Imaging 14(6):741–749.
29. Vosshenrich R, Kallerhoff M, Grone HJ, Fischer U, Funke M, Kopka L, Siebert G, Ringert RH,
Grabbe E. 1996. Detection of renal ischemic lesions using Gd-DTPA enhanced turbo flash MRI:
experimental and clinical results. J Comput Assist Tomogr 20(2):236–243.
30. Munechika H, Sullivan DC, Hedlund LW, Beam CA, Sostman HD, Herfkens RJ, Pelc NJ. 1991.
Evaluation of acute renal failure with magnetic resonance imaging using gradient-echo and Gd-
DTPA. Invest Radiol 26(1):22–27.
31. Carvlin MJ, Arger PH, Kundel HL, Axel L, Dougherty L, Kassab EA, Moore B. 1987. Acute
tubular necrosis: use of gadolinium-DTPA and fast MR imaging to evaluate renal function in the
rabbit. J Comput Assist Tomogr 11(3):488–495.
32. Dalla-Palma L, Panzetta G, Pozzi-Mucelli RS, Galli G, Cova M, Meduri S. 2000. Dynamic
magnetic resonance imaging in the assessment of chronic medical nephropathies with impaired
renal function. Eur Radiol 10(2):280–286.
33. Kikinis R, von Schulthess GK, Jager P, Durr R, Bino M, Kuoni W, Kubler O. 1987. Normal
and hydronephrotic kidney: evaluation of renal function with contrast-enhanced MR imaging.
Radiology 165(3):837–842.
34. Semelka RC, Hricak H, Tomei E, Floth A, Stoller M. 1990. Obstructive nephropathy: evaluation
with dynamic Gd-DTPA-enhanced MR imaging. Radiology 175:797–803.
35. Beckmann N, Joergensen J, Bruttel K, Rudin M, Schuurman HJ. 1996. Magnetic resonance
imaging for the evaluation of rejection of a kidney allograft in the rat. Transpl Int 9(3):175–183.
36. Preidler KW, Szolar D, Schreyer H, Ebner F, Kern R, Holzer H, Horina JH. 1996. Differentiation
of delayed kidney graft function with gadolinium-DTPA-enhanced magnetic resonance imaging
and Doppler ultrasound. Invest Radiol 31(6):364–371.
37. El-Diasty T, Mansour O, Farouk A. 2003. Diuretic contrast-enhanced MRU versus IVU for depiction of the non-dilated urinary tract. Abdom Imaging 28:135–145.
38. Laurent D, Poirier K, Wasvary J, Rudin M. 2002. Effect of essential hypertension on kidney
function as measured in rat by dynamic MRI. Magn Reson Med 47(1):127–131.
39. Krestin GP. 1994. Magnetic resonance imaging of the kidneys: current status. Magn Reson Q
10:2–21.
40. Sun Y. 2004. Registration and segmentation in perfusion MRI: kidneys and hearts. PhD disserta-
tion, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh.
41. de Priester JA, Kessels AG, Giele EL, den Boer JA, Christiaans MHL, Hasman A, van Engelshoven
JMA. 2001. MR renography by semiautomated image analysis: performance in renal transplant
recipients. J Magn Reson Imaging 14(2):134–140.
42. Boykov Y, Lee VS, Rusinek H, Bansal R. 2001. Segmentation of dynamic N–D data sets via graph
cuts using Markov models. In Proceedings of the 4th international conference on medical image
computing and computer-assisted intervention (MICCAI). Lecture Notes in Computer Science,
Vol. 2208, pp. 1058–1066. Utrecht: Springer.
43. Sun Y, Yang D, Ye Q, Williams M, Moura JMF, Boada F, Liang Z, Ho C. 2003. Improving
spatiotemporal resolution of USPIO-enhanced dynamic imaging of rat kidneys. Magn Reson
Imaging 21:593–598.
44. Chan TF, Vese LA. 2001. Active contours without edges. IEEE Trans Image Process 10(2):266–
277.
45. Ibanez L, Schroeder W, Ng L, Cates J, and the Insight Software Consortium. 2005. The ITK
software guide. Clifton Park, NY: Kitware Inc.
46. Sethian JA. 1996. Level set methods and fast marching methods. Cambridge: Cambridge UP.
47. Caselles V, Kimmel R, Sapiro G. 1997. Geodesic active contours. Int J Comput Vision 22(1):61–
79.
48. Rousson M, Paragios N. 2002. Shape priors for level set representations. In Proceedings of the
7th European conference on computer vision, part II (ECCV’02). Lecture Notes in Computer
Science, Vol. 2751, pp. 78–92. Berlin: Springer.
49. Leventon M, Grimson WL, Faugeras O. 2000. Statistical shape influence in geodesic active
contours. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.
1316–1324. Washington, DC: IEEE Computer Society.
50. Chen Y, Thiruvenkadam S, Tagare H, Huang F, Wilson D. 2001. On the incorporation of shape
priors into geometric active contours. In IEEE workshop on variational and level set methods,
pp. 145–152, Washington, DC: IEEE.
51. Tsai A, Yezzi AJ, Wells WM, Tempany C, Tucker D, Fan A, Eric W, Grimson L, Willsky AS.
2001. Model-based curve evolution technique for image segmentation. In Proceedings of the
IEEE conference on computer vision and pattern recognition, pp. 463–468, Washington, DC:
IEEE Computer Society.
52. Paragios N. 2003. A level set approach for shape-driven segmentation and tracking of the left
ventricle. IEEE Trans Med Imaging 22:773–776.
53. Litvin A, Karl WC. 2003. Level set-based segmentation using data driven shape prior on feature
histograms. In IEEE workshop on statistical signal processing, pp. 166–169. Washington, DC:
IEEE.
54. Tsai A, Wells W, Warfield SK, Willsky AS. 2004. Level set methods in an EM framework for shape
classification and estimation. In Proceedings of the international conference on medical image
computing and computer-assisted intervention (MICCAI). Lecture Notes in Computer Science,
Vol. 2211, pp. 1–9. Utrecht: Springer.
55. Yang J, Duncan J. 2004. 3D image segmentation of deformable objects with joint shape-intensity
prior models using level sets. Med Image Anal 8:285–294.
56. Yuksel SE, El-Baz A, Shi H, Farag AA, El-Ghar MEA, Eldiasty TA, Ghoneim MA. 2005. Auto-
matic detection of renal rejection after kidney transplantation. In Proceedings of the conference
on computer assisted radiology and surgery (CARS), pp. 773–778. Berlin: Springer.
57. Kass M, Witkin A, Terzopoulos D. 1987. Snakes: active contour models. Int J Comput Vision 1:321–331.
58. El-Baz A, Yuksel SE, Shi H, Farag AA, El-Ghar MA, Eldiasty T, Ghoneim MA. 2005. 2D and 3D
shape-based segmentation using deformable models. In Proceedings of the international confer-
ence on medical image computing and computer-assisted intervention (MICCAI). Lecture Notes
in Computer Science, Vol. 2212, pp. 821–829. Utrecht: Springer.
59. Viola P, Wells WM. 1995. Alignment by maximization of mutual information. In Proceedings of
the 5th international conference on computer vision, pp. 16–23. Washington, DC: IEEE Computer
Society.
60. Tsai A, Yezzi A, Wells W, Tempany C, Tucker D, Fan A, Grimson E, Willsky A. 2003. A shape-
based approach to curve evolution for segmentation of medical imagery. IEEE Trans Med Imaging
22(2):137–154.
61. Webb A. 2002. Statistical pattern recognition, 2nd ed. Chichester: J. Wiley & Sons.
62. Schlesinger MI. 1968. A connection between supervised and unsupervised learning in pattern
recognition. Kibernetika 2:81–88.
63. Lamperti JW. 1996. Probability. New York: J. Wiley & Sons.
64. te Strake L, Kool LJS, Paul LC, Tegzess AM, Weening JJ, Hermans J, Doornbos J, Bluemm RG,
Bloem JL. 1988. Magnetic resonance imaging of renal transplants: its value in the differentiation
of acute rejection and cyclosporin A nephrotoxicity. Clin Radiol 39(3):220–228.
65. Farag AA, El-Baz A, Gimel'farb G. 2004. Density estimation using modified expectation-maximization for a linear combination of Gaussians. In Proceedings of the IEEE international conference on image processing, Vol. 3, pp. 1871–1874. Washington, DC: IEEE Computer Society.
11
Medical imaging continues to permeate the practice of medicine, but automated yet accurate
segmentation and labeling of anatomical structures continues to be a major obstacle to
computerized medical image analysis. Deformable models, with their roots in estimation
theory, optimization, and physics-based dynamical systems, represent a powerful approach
to the general problem of medical image segmentation. This chapter presents an introduction
to deformable models, beginning with the classical Active Contour Models (ACMs), or
snakes, and focusing on explicit, physics-based methods. Snakes are useful for segmenting
amorphous shapes when little or no prior knowledge about shape and motion is available.
Many extensions of snakes incorporate such additional knowledge. An example presented
in this chapter is the use of optical flow forces to incorporate knowledge of shape dynamics
and guide the snake deformations to track the leading edge of an injected contrast agent in
an echocardiographic image sequence. Active Shape Models (ASMs), or smart snakes, are a powerful method for incorporating statistical models of shape variability into the segmentation process. ASMs and ACMs offer different advantages, and, as such, a method combining both is presented. Statistical knowledge about shape dynamics is useful for segmenting and
both is presented. Statistical knowledge about shape dynamics is useful for segmenting and
tracking objects with distinctive motion patterns (such as a beating heart). An extension of
the ASM to model knowledge of spatiotemporal constraints is presented.
336 GHASSAN HAMARNEH and CHRIS MCINTOSH
noise and boundary gaps. Ideas related to snakes date back to the early 1970s
[3, 4]. In short, an ACM is an energy-minimizing contour with smoothness con-
straints, deformed according to image data. They integrate boundary elements
into a single, inherently connected, smooth, mathematically well-defined struc-
ture, which can be implemented on the continuum achieving sub-pixel accuracy.
ACMs were originally designed to be semiautomatic tools supporting intuitive
interaction mechanisms for guiding the segmentation process. In active contour
models, a contour is initialized on the image and left to deform in a way that, first,
moves it toward features of interest in the image and, second, maintains a certain
degree of smoothness in the contour. Consequently, an energy term is associated
with the contour and is designed to be inversely proportional to both the contour’s
smoothness and its fit to the image data, in order to segment the desired image
features. Deformation of the contour in the image will change its energy; thus, one
can imagine an energy surface on top of which the contour moves (in a way that
resembles the slithering of a snake, and hence the name) while seeking valleys of
low energy [5].
α(v(s)) = ∫_{0}^{1} ( w_1(s) |∂v(s)/∂s|² + w_2(s) |∂²v(s)/∂s²|² ) ds,   (2)

β(v(s)) = ∫_{0}^{1} w_3(s) P(v(s)) ds.   (3)
Weighting functions w1 and w2 control the tension and flexibility of the contour,
respectively, and w3 controls the influence of the image data. wi can depend on
s but are typically set to different constants.

DEFORMABLE MODELS FOR MEDICAL IMAGE ANALYSIS 337

For the contour to be attracted to image features, function P (v (s)) is designed such that it has minima where the
features have maxima. For example, for the contour to be attracted to high-intensity
changes (high gradients), we can choose
search [10], an initial step to optimize for rigid parameters, distance transform-
based potential functions, and Gradient Vector Flow Fields (see Section 1.6.5). In
the next chapter, deformable organisms are introduced as an attempt to provide a
simple model of the human expert’s cognitive abilities in order to guide the model’s
deformations.
Figure 1. Sample frames (progressing left to right, top to bottom) showing incorrect
progress of a deformable model (snake) for segmenting the corpus callosum (CC) in a
midsagittal brain magnetic resonance image (MRI), due to the wrong choice of parameters.
See attached CD for color version.
Figure 2. Sample frames (progressing left to right, top to bottom) showing incorrect
progress of a snake segmenting the CC in an MR image. Leaking of the snake occurs
because of the weak edge strength (lower left) and incorrect parameter setting. See
attached CD for color version.
µ_i v̈_i(t) + γ_i v̇_i(t) − w1 v″_i(t) + w2 v″″_i(t) + w3 ∇P(v_i(t)) = 0,    (7)

where {v_i(t) = (x_i(t), y_i(t))}, i = 1, 2, ..., N, are the nodes of the snake polygon, t is
used as the discrete time variable, and i is the snake node index. v̇ and v̈ are the first
and second derivatives of v with respect to t; v″ and v″″ are the second and fourth
derivatives of v with respect to i. Setting the mass density to zero¹ (µ_i = µ = 0)
and the damping density to a constant (γ_i = γ), we rewrite Eq. (7) for simulating
the deformations of the discrete snake as
γ v̇_i(t) + w1 F_i^tensile(t) + w2 F_i^flexural(t) + w3 F_i^external(t) = 0.    (8)
F_i^tensile(t) is a tensile force (resisting stretching) acting on node i at time t and is
given by

F_i^tensile(t) = 2v_i(t) − v_{i−1}(t) − v_{i+1}(t).    (9)
F_i^flexural(t) is a flexural force (resisting bending) and is given by

F_i^flexural(t) = 2F_i^tensile(t) − F_{i−1}^tensile(t) − F_{i+1}^tensile(t).    (10)
F_i^external(t) is an external (image-derived) force. It is derived in a way that causes
the snake node to move toward regions of higher intensity gradient in the image
and is given by

F_i^external(t) = ∇P(x_i(t), y_i(t)),    (11)

where P(x_i(t), y_i(t)) is given in (4).
The equation used for updating the position of any snake node i can be
obtained from (8) by using the finite-difference derivative approximation v̇_i =
(v_i(t + ∆t) − v_i(t))/∆t, where ∆t is a finite time step, yielding

v_i(t + ∆t) = v_i(t) − (∆t/γ) (w1 F_i^tensile(t) + w2 F_i^flexural(t) + w3 F_i^external(t)).    (12)
1.6. Extensions
1.6.1. Inflation Force
In addition to the above forces, an inflation force, F_i^inflation(t), can be utilized
to allow for initializing the snake farther away from the target boundary.
1 In static shape recovery problems not involving time-varying data, the mass density is often set to
zero, resulting in simplified equations of motion and a snake that comes to rest as soon as the internal
forces balance the external forces [11].
F_i^inflation(t) is given by

F_i^inflation(t) = F(I_s(x_i, y_i)) n_i(t),    (13)

where I_s is a smoothed version of I, n_i(t) is the unit vector in the direction normal
to the contour at node i, and the binary function

F(I(x, y)) = +1 if I(x, y) ≥ T,  −1 otherwise,    (14)

links the inflation force to the image data, where T is an image intensity threshold.
More elaborate rules can be used (e.g., using region-based image intensity statistics
[12]). Consequently, Eq. (8) becomes
γ v̇_i(t) + w1 F_i^tensile(t) + w2 F_i^flexural(t) + w3 F_i^external(t) = q F_i^inflation(t),    (15)
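A minimal sketch of the inflation force of Eqs. (13)-(14), assuming NumPy and a caller-supplied intensity function (a stand-in for sampling the smoothed image I_s); the polygon normals are computed from central-difference tangents:

```python
import numpy as np

def inflation_force(v, image_value, T):
    """Inflation force of Eqs. (13)-(14) (sketch): push each node outward
    along its normal while the smoothed intensity under it is >= T,
    pull it back otherwise.  `image_value` maps (N, 2) positions to intensities."""
    # Unit normals of a closed polygon from the tangents (central differences).
    tang = np.roll(v, -1, axis=0) - np.roll(v, 1, axis=0)
    n = np.stack([tang[:, 1], -tang[:, 0]], axis=1)
    n /= np.linalg.norm(n, axis=1, keepdims=True)
    sign = np.where(image_value(v) >= T, 1.0, -1.0)   # the binary function F
    return sign[:, None] * n
```

For a counter-clockwise contour lying inside a bright region, the force points outward at every node, inflating the snake until it crosses the threshold boundary.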
Figure 3. Adaptive resampling of a polygonal snake. (Left) Adding a node if the distance
between consecutive nodes is large or curvature is high. (Right) Removing a node if the
distance between nodes is small or the curvature is small (appropriate distance and curvature
thresholds must be chosen). See attached CD for color version.
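The distance-based part of the resampling rule in Figure 3 can be sketched as follows (NumPy assumed; the curvature tests mentioned in the caption are omitted, and the thresholds d_min, d_max are illustrative):

```python
import numpy as np

def resample(v, d_min=2.0, d_max=10.0):
    """Adaptive resampling of a polygonal snake (distance criterion only):
    insert a midpoint where consecutive nodes are far apart, and drop a
    node that sits too close to its predecessor."""
    out = []
    N = len(v)
    for i in range(N):
        a, b = v[i], v[(i + 1) % N]
        if out and np.linalg.norm(a - out[-1]) < d_min:
            continue                      # node too close: remove it
        out.append(a)
        if np.linalg.norm(b - a) > d_max:
            out.append((a + b) / 2.0)     # gap too large: add a midpoint
    return np.array(out)
```

On a square with 20-unit edges this inserts one midpoint per edge, doubling the node count.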
Figure 4. Sample frames (progressing left to right, top to bottom) showing progress of
correct snake segmentation of the corpus callosum. The segmentation utilizes adaptive
inflation and subdivision and an appropriate choice of weights. See attached CD for color
version.
the user point that attracts the snake to that location. Alternatively, a force resulting
from a virtual spring connecting the mouse-click position with the closest snake
contour node can be applied. See examples of user-assisted snake segmentation
using the first method in Figures 5 and 6.
This minimization diffuses the image gradient such that it smoothly declines in
lower-valued regions and directly approximates the gradient in high-magnitude
areas, reflected by the first and second terms of (18), respectively (Figure 7).
The relative weighting of these two terms is governed by the scalar µ. f(x, y) =
−E_ext(x, y), where E_ext(x, y) is an image edge map (for example, obtained using
a Sobel or Canny filter). u_x, u_y, v_x, and v_y are the derivatives of the components of
F_gvf, u(x, y) and v(x, y), with respect to x and y. F_gvf is then used in place of
the external force, F_external, in Eq. (12), to obtain
v_i(t + ∆t) = v_i(t) − (∆t/γ) (w1 F_i^tensile(t) + w2 F_i^flexural(t) + w3 F_i^gvf(v_i(t))).    (19)
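The GVF minimization described above is typically computed by gradient-descent iterations on the field components u and v. The NumPy sketch below (periodic borders and the values of µ, the step size, and the iteration count are illustrative choices) shows the idea:

```python
import numpy as np

def gvf(f, mu=0.2, iters=200, dt=0.2):
    """Gradient Vector Flow (sketch): diffuse the gradient of an edge map f
    so it extends into homogeneous regions while staying close to grad f
    where the gradient magnitude is large."""
    fy, fx = np.gradient(f)
    mag2 = fx**2 + fy**2
    u, v = fx.copy(), fy.copy()
    for _ in range(iters):
        lap_u = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                 np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
        lap_v = (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
                 np.roll(v, 1, 1) + np.roll(v, -1, 1) - 4 * v)
        # Smoothness (diffusion) term vs. data-fidelity term.
        u += dt * (mu * lap_u - mag2 * (u - fx))
        v += dt * (mu * lap_v - mag2 * (v - fy))
    return u, v
```

Far from any edge the raw gradient is zero, but the diffused field is not, which is exactly the "less sparse" behavior noted for Figure 7.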
Then, the gradient magnitude and the gradient direction are taken as the square
root of the largest eigenvalue, √λ_max, of the matrix DᵀD, and its corresponding
Figure 6. Segmentation of an oral lesion example using the green band of a digital color
image. (a) Initial snake nodes. (b) Final segmentation result (snake nodes shown as white
dots and forced points as white circles). Reprinted with permission from [61]. Copyright
© 2000 IEEE.
Figure 7. Gradient Vector Flow Field: (a) original image depicting a corpus callosum in a
midsagittal brain MRI, (b) gradient vector field, (c) GVF field. Note how in (c) the field is
less sparse than in (b) and smoothly extends outward into homogeneous regions.
eigenvector, and λ_max(x_i(t), y_i(t)) is calculated at the location of snake node
(x_i(t), y_i(t)) at iteration time t. For more details the reader is referred to [14].
F_i = F_i^Hooke + F_i^viscous + F_i^user + F_i^external.    (22)
F_i^viscous = −k_v v_i.    (24)
F_i^external(t) ∝ ∇‖∇[G_σ ∗ I(x_i)]‖,    (25)

where I(x_i) is the intensity of a pixel at the location of node n_i (see Eq. (4)).
Image forces that attract the model to an image boundary are calculated only for
boundary mesh nodes (similarly, image forces that attract medial model nodes to
medial features can also be applied). Following the calculation of the node forces
we compute the new acceleration, velocity, and position of each node given the
old velocity and position values, as follows (explicit Euler solution with time step
∆t):
a_i = F_i / m_i,
v_i = v_i^old + a_i ∆t,    (26)
x_i = x_i^old + v_i ∆t.
E(u) = ½ uᵀK u + P(u),    (27)
where K is called the stiffness matrix and P(u) is the discrete version of external
potential P (v(s)). The contour parameters that minimize the energy function can
now be obtained by solving the set of algebraic equations
Ku = −∇P. (28)
p(u|I) = p(I|u) p(u) / p(I),    (30)
where p(u) is the prior probability density of the model shape parameters: a
mechanism for probabilistic regularization. p(I|u) is the probability of producing
an image I given the parameters u: an imaging (sensor) model. p(u) and p(I|u)
can be written (in the form of Gibbs distribution) as
where k1 and k2 are normalizing constants and A(u) is the discrete version of
the internal energy α(v) and P(u) is the discrete version of external potential
P (v(s)).
applied to the problem of tracking the leading edge of an injected contrast agent
in an echocardiographic image sequence and is shown to improve the tracking
performance. A clinical motivation and previous work on echocardiography and
video densitometry are initially presented.
Figure 8. Sample frames from a digitized image sequence. In frames #1, #59, #124, and
#422 the contrast agent has not reached, has just reached, has totally filled, and has washed
out from the RV, respectively. Reprinted with permission from [18]. Copyright © 2000 IEEE.
deforming force, in order to speed up tracking and influence the snake nodes to
match corresponding contrast front regions between frames.
Other authors have investigated similar approaches. In [20] a method for
segmenting and tracking cardiac structures in ultrasound image sequences was
presented. In integrating the contour’s equation of motion, the method sets the
initial velocities of the contour vertices to OF estimates, and sets their positions
relative to the final position from the preceding frame. Peterfreund [21] used
Kalman filter-based active contours that calculate OF along the contour as system
Figure 9. Successive frames of contrast agent entering the right ventricle of the heart.
Reprinted with permission from [17]. Copyright © 2000 IEEE.
measurements to detect and reject measurements that may belong to other objects.
Akgul et al. presented an application of tracking 2D contours of tongue surfaces
from digital ultrasound image sequences [22]. The method makes use of OF to
reduce the computational complexity involved when searching for optimal snake
node locations in a dynamic programming setting. This is done by considering
only a subset of pixels in a search window. The subset is chosen on the basis of
the first OF constraint, namely that the intensity of an object’s point in a dynamic
image does not change with time.
Using Taylor series expansion and neglecting higher-order terms gives the first OF
constraint equation:
Ix u + Iy v + It = 0, (34)
where u = dx/dt and v = dy/dt are the desired velocity field components, I_x
and I_y are the spatial image derivatives, and I_t is the temporal image derivative.
Equation (33) by itself is insufficient to calculate (u, v); hence a second constraint,
the velocity field smoothness constraint, is introduced. The velocity field can
now be calculated as that which best satisfies both constraints by minimizing the
following square error function:
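This joint minimization can be carried out with Horn-Schunck-style iterations. The sketch below is an assumption for illustration (the chapter's actual scheme, algorithm (36), is not reproduced in this excerpt) and uses NumPy with periodic image borders:

```python
import numpy as np

def horn_schunck(I1, I2, alpha=1.0, iters=100):
    """Optical flow from the two constraints in the text: brightness
    constancy (Ix*u + Iy*v + It = 0) plus velocity-field smoothness,
    solved by Horn-Schunck style fixed-point iterations (a sketch)."""
    Iy, Ix = np.gradient(I1)
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    k = np.array([[0, 0.25, 0], [0.25, 0, 0.25], [0, 0.25, 0]])
    for _ in range(iters):
        # Local neighborhood averages of the current flow estimate.
        ub = sum(np.roll(np.roll(u, dy, 0), dx, 1) * k[dy + 1, dx + 1]
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1))
        vb = sum(np.roll(np.roll(v, dy, 0), dx, 1) * k[dy + 1, dx + 1]
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1))
        common = (Ix * ub + Iy * vb + It) / (alpha**2 + Ix**2 + Iy**2)
        u = ub - Ix * common
        v = vb - Iy * common
    return u, v
```

On a sinusoidal pattern translated one pixel to the right, the recovered horizontal velocity is close to 1 and the vertical velocity stays near zero.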
Figure 10. Optical flow (velocity) field shown on two consecutive frames.
the contrast front has moved from one frame to the next, and the second is to detect
this front as a smooth and connected boundary. We use the optical flow to address
the first task and snakes to address the second. To combine the two techniques we
include an additional force term F_i^flow(t) proportional to the calculated velocity
field at the current snake node position v_i(t) = (x_i(t), y_i(t)), yielding
F_i^flow(t) ∝ (u(x_i(t − 1), y_i(t − 1)), v(x_i(t − 1), y_i(t − 1))),    (38)

and u(x_i, y_i), v(x_i, y_i) are obtained using algorithm (36) in Section 1.7.4.
1.8. Results
We tracked the leading edge of a contrast agent filling the RV in real ultrasonic
image sequences. Images were first smoothed using nonlinear diffusion filtering
[24]. The front of the contrast agent was tracked in eight sequences during the RV
filling process, five from the ARVD group and three from the control group. In
each sequence, the front was tracked, on average, in about eight images. In the
example depicted in Figure 11, the snake without OF forces needed 19, 20, 23,
22, 22, and 16 iterations (Figure 11a), whereas the snake with OF forces needed
5, 8, 10, 10, 6 and 10 iterations (Figure 11b). We also show an example of both
snakes, with and without OF forces, deforming toward the leading edge of the
contrast agent in a single frame (Figure 12). The OF snake (with larger nodes in
the figure) progresses faster toward the edge and locates it in only 10 iterations
compared to 23 iterations needed for the snake without OF forces. Histograms
of the number of iterations needed for the contour to find the edge for all tested
frames are also calculated (Figure 13). The mean number of iterations needed with
and without the use of information about the OF was 6.3 and 12.3, respectively.
Figure 14 shows the snake contour tracking the leading edge of the contrast front
and providing clinically relevant RV hemodynamics measurements.
Figure 11. Results of tracking a real sequence. Upper frames: without using optical flow
forces, obtained after 19, 20, 23, 22, 22, and 16 iterations. Lower frames: with optical flow
forces, obtained after 5, 8, 10, 10, 6, and 10 iterations. Reprinted with permission from [17].
Copyright © 2000 IEEE.
2.1. Introduction
Active shape models (ASMs), or smart snakes, are deformable shape modeling
techniques for segmentation of objects in images. ASMs ensure that the deformed
shape is consistent with a statistical model of shape variability calculated beforehand.
Figure 12. The snake with optical flow forces (large nodes) progresses faster toward the
contrast front compared to the snake without optical flow forces. The snake nodes are shown
after 1 (left-most), 2, 6, 10, 15, and 23 (right-most) iterations. Reprinted with permission
from [17]. Copyright © 2000 IEEE.
Figure 13. Histogram of the total number of iterations needed for the contour to latch onto
the contrast front (a) with optical flow forces and (b) without optical flow forces. Reprinted
with permission from [18]. Copyright © 2000 IEEE.
Figure 14. The result of tracking the leading edge in one of the sequences from the ARVD
group. Each contour represents the contrast agent front at different times, indicated beside
each contour. The contrast front enters the RV at t = 0, and the RV is totally filled at t = 2960
ms. Note how the contrast front in the initial phase of filling (t = 0 to 480 ms) moves faster
than in the final phase (t = 480 to 2960 ms). This is indicative of the inhomogeneous operation
of the RV, identified via contrast front tracking. Reprinted with permission from [18].
Copyright © 2000 IEEE.
PCA, we can approximate each shape by the sum of a mean shape and a linear
combination of, say t, principal components, as follows:
x = x̄ + Pb, (39)
where
x is the vector of landmark coordinates,
x̄ is the mean shape,
P is the matrix of principal components,
b is a vector of weighting parameters, also called shape parameters, and
x and x̄ are each of length 2L. P is 2L × t and b is a vector of length t.
Equation (39) can be used to generate new shapes by choosing different values
for b. Constraints on these weighting parameters are used to ensure that only
allowable shapes are produced. The allowable shapes are those shapes belonging
to a region called the Allowable Shape Domain (ASD), which is inferred from the
region that the aligned training set occupies. Aside from generating new similar
shapes, this model is used to examine the plausibility of other shapes by checking
whether or not they lie in the ASD. The above model of the distribution of points
(or variation of shapes) is referred to as a Point Distribution Model (PDM).
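A PDM fits in a few lines of NumPy. The synthetic rectangle shapes below (varying only in width) are an illustrative stand-in for aligned training landmarks:

```python
import numpy as np

def train_pdm(shapes, t):
    """PCA on aligned landmark vectors (Eq. (39)): returns the mean shape,
    the matrix P of the t strongest principal components, and their variances."""
    xbar = shapes.mean(axis=0)
    cov = np.cov(shapes - xbar, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:t]        # strongest modes first
    return xbar, vecs[:, order], vals[order]

def generate(xbar, P, lam, b):
    """New shape x = xbar + P b; b is clamped to +/- 3 sqrt(lambda) so the
    result stays inside the Allowable Shape Domain."""
    b = np.clip(b, -3.0 * np.sqrt(lam), 3.0 * np.sqrt(lam))
    return xbar + P @ b

# Toy training set: rectangles (L = 4 landmarks, so vectors of length 2L = 8)
# whose width is the only mode of variation.
shapes = np.array([[-w, -1, w, -1, w, 1, -w, 1]
                   for w in np.linspace(1.0, 2.0, 20)])
xbar, P, lam = train_pdm(shapes, t=1)
```

Because the training shapes vary along a single linear mode, one principal component reconstructs every example exactly, and clamping b keeps generated shapes plausible.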
where yij is the normalized derivative profile for the jth landmark in the ith image,
and ȳj is the mean normalized derivative profile for the jth landmark.
where ȳj is the mean normalized derivative profile for the jth landmark, and
h(d) is a subprofile along the search profile having a length equal to that of ȳj
and centered around the point on the search profile that is offset by d from the
landmark.
Finding d that minimizes the above function for all the landmarks, we arrive
at L new locations for the landmark positions. We may choose to control the
suggested movement by, for example, changing a small suggested movement to
no movement or large suggested movements to only half what is suggested.
The next step is to update the pose and shape parameters in order to move as
closely as possible to the new proposed positions with the restriction that the new
produced shape is an allowable shape. A practical, though not optimal, solution
is to first find only the pose parameters that move the shape estimate as close as
possible to the new proposed positions. Then there will remain residual adjust-
ments that can only be satisfied by deforming the shape and hence changing the
shape parameters. Finding the pose parameters is done by aligning the current
estimate to the new proposed shape. Alignment details are presented in the section
for the spatio-temporal case. The remaining landmark position modifications are
in general 2L-dimensional, whereas the shape variations obtained from the model
are only t-dimensional. A least-squares solution can be used to solve the follow-
ing equation for the changes in shape parameters db (with orthonormal columns
of P we have PᵀP = I):
shape becomes more similar to the mean shape, thus compacting the PDM model,
while another term increases when the landmarks move away from a given bound-
ary. More recently this line of work has been formalized through the concept of
Minimum Description Length.
Multi-Resolution Image Search An important issue that considerably affects the
image search in the ASM is the choice of the length of the search profile. In choosing
the length of the search profile we are faced with two contradictory requirements:
on one hand, the search profile should be long enough to contain within it the
target point (the point that we need the landmark point to move to); on the other,
we require the search profile to be short for two reasons: first, to reduce the
computations required, and second, if the search profile is long and the target point
is close to the current position of the landmark, then the landmark is more likely to
be attracted to a distant noisy point and miss the target. In [27] a multi-resolution
approach is suggested where at first the search looks for faraway points and makes
large jumps, and as the search homes in on a target structure, it should restrict the
search only to close points.
In order to achieve such multi-resolution search, we generate a pyramid of
images. This pyramid of images contains the same image but with different reso-
lutions. At the base of the pyramid, Level 0, we have the original image. On the
next level we have a lower-resolution version of the image with half the number
of pixels along each dimension, and so on, until we reach the highest level of the
pyramid. Low-pass filtering (smoothing), for example using Gaussian filtering,
should always precede image sub-sampling.
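The pyramid construction can be sketched as below (NumPy assumed; a 5-tap binomial kernel approximates the Gaussian, and periodic borders are used for brevity):

```python
import numpy as np

def gaussian_pyramid(img, levels):
    """Image pyramid for the multi-resolution ASM search: smooth with a
    small binomial (Gaussian-like) kernel, then subsample by 2 per level."""
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    pyr = [img]
    for _ in range(levels):
        cur = pyr[-1]
        # Separable smoothing along each axis, then take every second pixel.
        for axis in (0, 1):
            cur = sum(w * np.roll(cur, s, axis=axis)
                      for w, s in zip(k, (-2, -1, 0, 1, 2)))
        pyr.append(cur[::2, ::2])
    return pyr
```

Level 0 is the original image; each higher level halves the number of pixels along each dimension, matching the description in the text.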
The search will begin at the top level of the pyramid (i.e., image with the
lowest resolution). The search is then initiated one level below using the search
output of the previous level, and so on, until the lowest level of the pyramid (the
original image) is reached. In order to be able to perform such a search in each
level, we should be equipped with information about the gray-level profiles in each
of these levels. This demands that during the training stage we obtain the mean
normalized derivative profile for each landmark at all levels of the pyramid.
The criterion utilized in order to change the level of search within the pyramid
is as follows. Move to a lower level when a certain percentage of landmarks do not
change considerably, for example, when 95% of the landmarks move only within
the central 50% of the search profile. A maximum number of iterations can also
be devised to prevent getting stuck at a higher level.
Active Appearance Models (AAMs) ASMs are statistical models of the shape
variations of an object that are used along with additional information about the
gray level values to segment new objects. AAMs are an extension of ASMs that
combine information about the shape and intensity of an object into one model
describing the appearance [28]. The model parameters are changed to locate new
instances of the object in new images. The changes are found by measuring a
residual error between the appearance of the model and the data. The relationship
between the changes in the model parameters and the residual error is found in the
training stage. More details follow.
A shape model and a gray-level model are produced in similar fashion to
ASMs then combined to obtain an appearance model. The shape model is given
by
x = x̄ + Ps bs . (43)
The gray level model is obtained by first warping the examples to the mean shape;
then the gray-level values are sampled from the shape-normalized image. In order
to reduce the effects of global lighting variation, the sampled values are normalized.
The gray level model is written as
g = ḡ + Pg bg . (44)
The complete appearance parameters are the shape and gray-level parameters,
b_s and b_g, combined with the four pose parameters, which perform scaling by s,
rotation by θ, and translation by t_x and t_y.
In the AAM search process, the task is to minimize the difference between
a new image and one synthesized by the appearance model. The relationship
between the difference and the model parameters is approximated linearly. The
coefficients of this linear transformation are obtained in a training stage by varying
the model parameters and observing the change in the image intensities.
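The residual-driven search can be illustrated with a toy linear appearance model. Everything below, including the generator G, is a hypothetical stand-in for the chapter's warped-texture synthesis; only the training-then-search structure mirrors the text:

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.standard_normal((50, 3))          # toy appearance generator g(c) = G c
c_true = np.array([1.0, -2.0, 0.5])
image = G @ c_true                        # "observed" image samples

# Training stage: perturb the parameters and record the residual image
# differences, then fit a linear map R such that dc ~= R r.
dC = rng.standard_normal((3, 200)) * 0.1
residuals = G @ dC                        # residual caused by each perturbation
R = dC @ np.linalg.pinv(residuals)        # least-squares regression

# Search stage: start from a wrong estimate and iterate c <- c - R r.
c = np.zeros(3)
for _ in range(20):
    r = G @ c - image                     # model-minus-image residual
    c = c - R @ r
```

Because the toy model is exactly linear, the learned map recovers the true parameters; in a real AAM the relationship is only approximately linear, so several iterations are needed.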
Active Blobs (ABs) [29] constitute an approach similar to AAMs. This method
uses the image difference to drive tracking by learning the relationship between im-
age error and parameter offset in an off-line processing stage. The main difference
is that ABs are derived from a single example with low-energy mesh deformations
(derived using a Finite-Element Method), whereas AAMs use a training set of
examples.
in the search space. Solutions are obtained by iteratively generating new gener-
ations of chromosomes by selective breeding based on the relative values of the
objective function for different members of the population. The set of shape and
pose parameters is encoded as a string of genes to form a chromosome. Each
gene can take one of several values called alleles. It has been shown that long
chromosomes with few alleles are preferable to shorter chromosomes with many
alleles. Consequently, chromosomes usually contain parameters encoded as bit
strings. Mutation and crossover are applied to chromosomes to produce children.
Crossover is when the two parent chromosomes are cut at a certain position and
the opposing sections combined to produce two children. Mutation is done by
selecting a gene and changing its value. In the case of binary representation, the
bit is complemented. In the iterative search process, the fitter an individual, i.e.,
the higher the value of the objective function obtained from the parameters of the
chromosome of the individual, the more this individual will be used to produce
children. Moreover, certain regions in the search space will be allocated more
trials due to the existence of some evidence that these regions are fitter.
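The ingredients just described (bit-string chromosomes, fitness-proportional selection, single-point crossover, and bit-flip mutation) fit in a short NumPy sketch; the bit-matching objective is an illustrative stand-in for a shape/pose fitting score:

```python
import numpy as np

rng = np.random.default_rng(3)
BITS, POP = 24, 40
target = rng.integers(0, 2, BITS)

def fitness(pop):
    # Higher for chromosomes matching more target bits; +1 avoids zero weights.
    return (pop == target).sum(axis=1) + 1.0

pop = rng.integers(0, 2, (POP, BITS))
for _ in range(60):
    f = fitness(pop)
    # Selective breeding: fitter individuals are chosen more often.
    parents = pop[rng.choice(POP, size=(POP, 2), p=f / f.sum())]
    cut = rng.integers(1, BITS, POP)           # single-point crossover
    idx = np.arange(BITS)[None, :]
    children = np.where(idx < cut[:, None], parents[:, 0], parents[:, 1])
    # Mutation: complement a bit with small probability.
    flip = rng.random((POP, BITS)) < 0.01
    pop = np.where(flip, 1 - children, children)
best = pop[np.argmax(fitness(pop))]
```

After a few dozen generations the fittest chromosome matches most of the target pattern, far above the random expectation of half the bits.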
and the shape parameters updated. Following the above assumptions, the standard
discrete Kalman filter is used to update the state estimates and covariance.
x = x̂ + Φu,    (45)

where x̂ is a vector of the original points of the shape, Φ is a matrix of eigenvectors
representing the vibrational modes, and u is a set of parameters.
Matrix Φ can be found by solving the following generalized eigensystem:

KΦ = MΦΩ²,    (46)
where K_xx, K_yx = K_xyᵀ, and K_yy are L × L with off-diagonal elements given
by

k_xx,ij = d_x,ij² / d_ij²,
k_yy,ij = d_y,ij² / d_ij²,    (48)
k_xy,ij = d_x,ij d_y,ij / d_ij²,

and diagonal elements k_yy,ii = k_xx,ii = 1.0 and k_xy,ii = 0, where

d_x,ij = x_i − x_j,
d_y,ij = y_i − y_j,    (49)
d_ij² = d_x,ij² + d_y,ij²

(x and y are taken from the vector x).
One suggestion for combining the FEM and PDM is by using an equation of the
form
x = x̂ + Φu + Pb, (50)
where x̂ is a mean shape, P is a matrix representing the statistical variations,
and b is a vector of parameters. Since the statistical modes of variation and the
vibrational modes are not independent, this approach is unsatisfactory due to the
redundancy in the model.
An alternative approach for combining the models is as follows. If only one
shape example exists, then we are left with only the FEM model. Given two shape
examples, we calculate the vibrational modes for each shape, and use them to
generate more shapes by randomly selecting the parameters u. We then train a
PDM on this new set of examples. If more than two shapes exist, then we need to
reduce the effect of the vibrational modes and increase the effect of the statistical
modes of variation because, as mentioned earlier, the vibrational modes may not
represent real shape variations of a certain class of objects.
This idea can be realized by first choosing the distribution of u to have a
zero mean and a diagonal covariance matrix S_u = αΛ, with (S_u)_ii = αω_i⁻² (this
gives more variation in the low-frequency vibrational modes, and less variation in
the high-frequency vibrational modes), and then calculating the eigenvectors and
eigenvalues of the “combined-model” covariance matrix of the generated example
shapes. Now, the derivation of this covariance matrix is presented.
Starting with m original example shapes x_i, 1 ≤ i ≤ m, with mean

x̄ = (1/m) Σᵢ₌₁ᵐ x_i.    (51)
One example shape x_i can be used to produce other shapes from its vibrational
modes as

x_i^g = x_i + Φ_i u,    (53)

where x_i^g is a shape generated from the vibrational modes of x_i. The covariance
matrix of x_i^g is given by
This gives

S_c = S + (α/m) Σᵢ₌₁ᵐ Φ_i Λ Φ_iᵀ.    (55)
Now we find the eigenvectors and eigenvalues of Sc and proceed as with the normal
PDM.
If we have only one example, then from (51) we get x̄ = x1 , and from (52)
we get S = 0, and thus we have only the effects of the FEM model in (55). If
the number of samples is very large, then we wish to ignore the effects of the
vibrational modes. This can be done by setting α to zero, and then Sc = S, and
thus we obtain the PDM for the original example shapes. In the intermediate
cases, we wish to have a large value for α when there are few original examples
and smaller values as more examples become available. This may be achieved by
setting α ∝ 1/m.
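Equation (55) with the α ∝ 1/m weighting can be sketched as follows (NumPy assumed; the two-point shapes and their vibration modes are purely illustrative):

```python
import numpy as np

def combined_covariance(shapes, Phis, Lambdas):
    """Combined FEM + statistical covariance (sketch of Eq. (55)):
    Sc = S + (alpha/m) * sum_i Phi_i Lambda_i Phi_i^T, with alpha = 1/m so
    the vibrational modes fade as more real examples become available.
    shapes is (m, n); Phis/Lambdas hold each example's vibration modes."""
    m = len(shapes)
    d = shapes - shapes.mean(axis=0)
    S = d.T @ d / m                       # statistical covariance
    alpha = 1.0 / m                       # the suggested weighting
    return S + (alpha / m) * sum(Phi @ np.diag(lam) @ Phi.T
                                 for Phi, lam in zip(Phis, Lambdas))
```

With a single example the statistical term vanishes and only the FEM modes remain; with many identical examples the FEM contribution shrinks as 1/m², as intended.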
In the training stage of the 2D PDM as well as in 3D, we need to provide the
coordinates of the landmarks for each shape or surface in the training set. The
problem is solved in 2D by providing a human expert with a utility to “point and
click” landmarks on a 2D image. In 3D it is not trivial to build a utility that allows
the operator to navigate in a 3D space of intensity values and locate the landmarks.
In the currently discussed work, the segmentation is generated by hand in a slice-
by-slice fashion and the aligning procedure allowed for rotation only in the slice
plane, in addition to scaling and translation. In the search stage, the strongest edge
or a representative profile is sought along a search profile normal to the 3D surface.
Classification Using ASMs The idea of using an ASM for classification can be
explained by summarizing the work done in [34] on recognizing human faces. The
shape model and the local gray level profile model are augmented with a model
for the appearance of a shape-free face obtained by deforming each face to have
the same shape as the mean face (using thin plate splines as described in [35]). In
the training stage a number of training faces of the same and different individuals
were outlined. In the application stage, an ASM is used to segment a face, and
then the shape, shape-free, and local gray-level parameters are extracted and used
as features for classification.
3.1. Introduction
Ultrasound echocardiography is a valuable noninvasive and relatively inex-
pensive tool for clinical diagnosis and analysis of heart functions including ven-
tricular wall motion. An important step toward this analysis is segmentation of
endocardial boundaries of the left ventricle (LV) [20, 36–40]. Although segment-
ing anatomical objects in high-SNR images can be done with simple techniques,
problems do arise when the images are corrupted with noise and the object itself
is not clearly or completely visible in the image. This is clearly the case in heart
images obtained by ultrasonography, which are characterized by weak echoes,
echo dropouts, and high levels of speckle noise. These image artifacts often result
in detecting erroneous object boundaries or failing to detect true ones. Snakes [2]
and their variants [11, 41–43] overcome parts of these limitations by considering
the boundary as a single, inherently connected, and smooth structure, and also
by supporting intuitive, interactive mechanisms for guiding the segmentation. In
our application of locating the human LV boundary in echocardiography, human
guidance is often needed to guarantee acceptable results. A potential remedy is to
present the snake with a priori information about the typical shape of the LV. Sta-
tistical knowledge about shape variation can be obtained using PDMs, which are
central to the ASMs segmentation technique [33]. PDMs, which are obtained by
performing PCA on landmark coordinates labeled on many example images, have
been applied to the analysis of echocardiograms [37]. However, this procedure is
problematic since manual labeling of corresponding landmark points is required.
In our application it is tedious to obtain a training data set delineated by experts
with point correspondence, let alone the fact that defining a sufficient number of
landmarks on the LV boundary is a challenging task in and of itself.
The method described in this section is similar to both ASMs, but without the
landmark identification and correspondence requirement, and ACMs, but enforced
with a priori information about shape variation. We adopt an approach similar to
PDMs for capturing the main modes of ventricular shape variation. However,
in our method, rather than representing the object boundaries by spatially cor-
responding landmarks, we employ a frequency-based boundary representation,
namely Discrete Cosine Transform (DCT) coefficients. These new shape descrip-
tors eliminate the need for spatial point correspondence. PCA, which is central
to ASMs, is applied to this set of shape descriptors. An average object shape is
extracted along with a set of significant shape variational modes. Armed with this
model of shape variation, we find the boundaries in unknown images by placing
an initial ACM and allowing it to deform only according to the examined shape
variations. A similar approach using Fourier descriptors and applied to locating
the corpus callosum in 2D MRI was reported in [60]. Results of segmenting the
human LV in real echocardiographic data using the discussed methodology are
also presented.
3.2. Methods
3.2.1. Overview
This section presents a general overview of the method (see Figure 15). We
used snakes as the underlying segmentation technique. In order to arm the snake
model with a priori information about the typical shape variations of the LV that
may be encountered during the segmentation stage, a training set of images is pro-
vided. This set is manually delineated by medical experts without the requirement
of complete landmark correspondence between different images. The entire set
of manually traced contours is then studied to model the typical ventricular shape
variations. This is done by first applying a re-parametrization of the contours,
which gives a set of DCT coefficients replacing the spatial coordinates. We then
apply PCA to find the strongest modes of shape variation. The result is an average
ventricular shape, represented by a set of average DCT coefficients, in addition
to the principal components, along with the fraction of variation each component
explains. To segment a new image of the LV, we initialize a snake and, unlike
classical snakes, do not allow it to freely deform according to internal and external
energy terms, but instead we constrain its deformations in such a way that the
resulting contour is similar to the training set. To attain the constrained deforma-
tions we obtain the vector of DCT coefficients for the active contour coordinates,
project it onto an allowable snake space defined by the main principal compo-
nents, and then perform an Inverse DCT (IDCT), which converts the constrained
DCT coefficients back to spatial coordinates. This is repeated until convergence,
which is reached when the majority of snake nodes do not change their locations
significantly. The shape models generated are normalized with respect to the simi-
larity transformation parameters: rotation angle, scaling factor, and two translation
parameters.
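The DCT–project–IDCT constraint step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function and variable names are assumptions, and SciPy's orthonormal type-II DCT pair is used as the transform.

```python
import numpy as np
from scipy.fft import dct, idct  # orthonormal type-II DCT and its inverse

def constrain_contour(xs, ys, mean_coeffs, P):
    """Project a snake contour onto the allowable shape space.

    xs, ys      : (N,) spatial coordinates of the snake nodes
    mean_coeffs : (2N,) average DCT coefficient vector from training
    P           : (2N, t) matrix whose columns are the main principal
                  components of the training DCT coefficients
    """
    N = len(xs)
    # Re-parametrize the contour in the frequency domain.
    X = np.concatenate([dct(xs, norm='ortho'), dct(ys, norm='ortho')])
    # Keep only the component of (X - mean) lying in the span of the
    # main modes of shape variation.
    b = P.T @ (X - mean_coeffs)
    X_constrained = mean_coeffs + P @ b
    # IDCT converts the constrained coefficients back to coordinates.
    return idct(X_constrained[:N], norm='ortho'), idct(X_constrained[N:], norm='ortho')
```

Repeating this step after each snake deformation yields contours that always remain similar to the training set, which is the essence of the constrained-snake loop.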
Figure 15. Flowchart depicting the main steps involved in the use of a statistically con-
strained snake for image segmentation. Reprinted with permission from [62]. Copyright
© 2000, IEEE.
366 GHASSAN HAMARNEH and CHRIS MCINTOSH
x_i = \sum_{k=1}^{N} w(k)\, X(k) \cos\!\left[\frac{\pi (2i-1)(k-1)}{2N}\right], \qquad i = 1, \ldots, N, \quad (57)

where

w(k) = \begin{cases} \dfrac{1}{\sqrt{N}}, & k = 1, \\[4pt] \sqrt{\dfrac{2}{N}}, & 2 \le k \le N, \end{cases} \quad (58)

and X(k) are the DCT coefficients. Similar equations are used for the y_i coordinates
and the Y(k) DCT coefficients. The DCT was favored as the new frequency
domain shape parametrization because it produces real coefficients, has excellent
energy compaction properties, and the correspondence between the coefficients
(when transforming contours with no point correspondence) is readily available.
The latter property stems from the fact that corresponding DCT coefficients capture
specific spatial contour frequencies.
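The energy-compaction property can be illustrated with a short sketch that truncates the DCT coefficients of a coordinate sequence and reconstructs it, as in Figure 18. The helper name is illustrative; SciPy's orthonormal DCT pair corresponds to Eqs. (57)–(58).

```python
import numpy as np
from scipy.fft import dct, idct

def truncated_reconstruction(coords, M):
    """Keep only the first M DCT coefficients of a coordinate sequence
    (the x- or y-coordinates of a contour) and reconstruct it."""
    X = dct(coords, norm='ortho')   # analysis: spatial -> frequency
    X[M:] = 0.0                     # drop high-frequency coefficients
    return idct(X, norm='ortho')    # synthesis, as in Eq. (57)
```

For a smooth contour most of the signal energy lies in the low-order coefficients, so a small M already reproduces the boundary closely.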
of varying length with no point correspondence). DCT coefficients (X) are then
obtained followed by PCA. Presented with a new image, a snake contour is first
initialized by specifying the starting and endpoints of the contour, and then allowed
to deform by applying forces that minimize traditional energy terms. In order to
guarantee a snake contour resembling an acceptable shape (similar to those in the
training set), we constrain the resulting deformed contour, {vi (t) , i = 1, . . . , N },
by projecting vector X (consisting of M DCT coefficients) onto the subspace of
principal components (the allowable shape space) according to
Figure 18. Ultrasound image with manual tracing (continuous) and the contour after IDCT
of truncated DCT coefficients (dots). Reprinted with permission from [62]. Copyright
© 2000, IEEE.
Figure 19. Mean contour and the first and second variation modes (weighted by ±1 std).
Reprinted with permission from [62]. Copyright © 2000, IEEE.
space, and then performed the IDCT. It was visually obvious how the constrained
contour resembles a much more plausible boundary of the LV than the noisy one
(Figure 20).
Figure 20. (a) Manual tracing. (b) Noisy version of (a). (c) IDCT of truncated DCT
coefficients of (b). (d) Projection of (c) on the allowable shape space (note the similarity to
(a)). Reprinted with permission from [62]. Copyright © 2000, IEEE.
Figure 21. Snake contours (dashed) and constrained contours (continuous) with increasing
number of iterations (left to right, top to bottom). Reprinted with permission from [62].
Copyright © 2000, IEEE.
Figure 22. Progress ((a)–(d)) of a snake overlain on an ultrasound image of the left ventricle
(dashed), and the result of DCT–Truncation–Projection–IDCT (continuous). Reprinted with
permission from [62]. Copyright © 2000, IEEE.
3.4. Summary
We presented a method for constraining the deformations of an active contour
according to training examples and applied it to segmenting the human left ventricle
in echocardiographic (ultrasound) images. To capture the typical shape variations
of the training set, principal component analysis was performed on frequency-
domain shape descriptors in order to avoid the drawbacks associated with labeling
corresponding landmarks (only the start and endpoints of the contours correspond).
The method utilizes the strength of ACMs in producing smooth and connected
boundaries along with the strength of ASMs in producing shapes similar to those
in a training set. More plausible LV shapes resulted when employing the new
method compared to classical snakes.
4.1. Introduction
Much work has been done on tracking rigid objects in 2D sequences. In many
image analysis applications, however, there is a need for modeling and locating
non-rigid time-varying object shapes. One approach for dealing with such objects
is the use of deformable models. Deformable models [59] such as snakes [2] and
its variants [16, 42–45], have attracted considerable attention and are widely used
for segmenting non-rigid objects in 2D and 3D (volume) images. However, there
are several well-known problems associated with snakes. They were designed as
interactive models and therefore rely on a user to overcome initialization sensi-
tivity. They were also designed as general models showing no preference among
a set of equally smooth shapes. This generality can cause unacceptable results
when snakes are used to segment objects with shape abnormalities arising from
occlusion, closely located but irrelevant structures, or noise. Thus, techniques
that incorporate a priori knowledge of object shape were introduced [46, 47]. In
ASMs [46] the statistical variation of shapes is modeled beforehand in accordance
with a training set of known examples. In order to attack the problem of tracking
non-rigid time-varying objects, deformable models were extended to dynamic
deformable models [16, 29, 48–51]. These describe the shape changes (over time)
in a single model that evolves through time to reach a state of equilibrium where
internal forces, representing constraints on shape smoothness, balance the exter-
nal image forces and the contour comes to rest. Deformable models have been
constructed by applying a probabilistic framework, leading to techniques such as
“Kalman snakes” [52]. Motion tracking using deformable models has been used
for tracking non-rigid structures such as blood cells [48], and much attention has
been given to the human heart and tracking of the left ventricle in both 2D and 3D
[16, 49, 50, 53]. Beyond tracking rigid objects, previous work has focused
on arbitrary non-rigid motion but has given little attention to tracking objects
that move in specific motion patterns, without incorporating statistical prior
knowledge in both 2D and time [54].
We present a method for locating spatiotemporal (ST) shapes in image sequences. We extend
ASMs [46] to include knowledge of temporal shape variations and present a new
ST shape modeling and segmentation technique. The method is well suited to
model and segment objects with specific motion patterns, as in echocardiography.
4.2. Method
In order to model a certain class of ST shapes, a representative training set
of known shapes is collected. The set should be large enough to include most of
the shape variations we need to model. Next, all the ST shapes in the training
set are parametrized. A data dimensionality reduction stage is then performed by
capturing only the main modes of ST shape variations. In addition to constructing
the ST shape model, the training stage also includes the modeling of gray-level
information. The task is then to locate an ST shape given a new unknown image
sequence. An average ST shape is first initialized, “optimal” deformations are
then proposed, and the deformations are constrained to agree with the training
data. The proposed changes minimize a cost function that takes into account
both the temporal shape smoothness constraints and the gray-level appearance
constraints. The search for the optimum proposed change is done using dynamic
programming. The following sections present the various steps involved in detail.
ST Shape Alignment. Next, the ST shapes are aligned in order to allow compar-
ing equivalent points from different ST shapes. This is done by rotating, scaling,
and translating the shape in each frame of the ST shape by an amount that is fixed
within one ST shape. A weighted least-squares approach is used for aligning two
sequences and an iterative algorithm is used to align all the ST shapes. Given two
ST shapes,
S_1 = [x_{111}, y_{111}, \ldots, x_{11L}, y_{11L}, x_{121}, y_{121}, \ldots, x_{12L}, y_{12L}, \ldots,
       x_{1F1}, y_{1F1}, \ldots, x_{1FL}, y_{1FL}]

and

S_2 = [x_{211}, y_{211}, \ldots, x_{21L}, y_{21L}, x_{221}, y_{221}, \ldots, x_{22L}, y_{22L}, \ldots,
       x_{2F1}, y_{2F1}, \ldots, x_{2FL}, y_{2FL}],
we need to find rotation angle θ, scaling factor s, and the value of translation
(tx , ty ) that will align S2 to S1 . To align S2 to S1 , S2 is mapped to
The elements of W reflect our trust in each coordinate and are chosen to be
proportional to the “stability” of the different landmarks over the training set [46].
To rotate, scale, and translate a single coordinate (x_{2kl}, y_{2kl}), we use

\begin{bmatrix} \hat{x}_{2kl} \\ \hat{y}_{2kl} \end{bmatrix}
= \begin{bmatrix} a_x & -a_y \\ a_y & a_x \end{bmatrix}
  \begin{bmatrix} x_{2kl} \\ y_{2kl} \end{bmatrix}
+ \begin{bmatrix} t_x \\ t_y \end{bmatrix},
\quad \text{where } a_x = s\cos\theta \text{ and } a_y = s\sin\theta.

\hat{S}_2 can now be rewritten as \hat{S}_2 = Az, where z = [a_x, a_y, t_x, t_y]^T and

A^T = \begin{bmatrix}
x_{211} & y_{211} & \cdots & x_{21L} & y_{21L} & x_{221} & y_{221} & \cdots & x_{2FL} & y_{2FL} \\
-y_{211} & x_{211} & \cdots & -y_{21L} & x_{21L} & -y_{221} & x_{221} & \cdots & -y_{2FL} & x_{2FL} \\
1 & 0 & \cdots & 1 & 0 & 1 & 0 & \cdots & 1 & 0 \\
0 & 1 & \cdots & 0 & 1 & 0 & 1 & \cdots & 0 & 1
\end{bmatrix}.
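The weighted least-squares solution for the alignment parameters can be sketched as follows. This is a minimal illustration of fitting Ŝ₂ = Az to S₁; the function name and the diagonal form of the weight matrix W are assumptions.

```python
import numpy as np

def align_parameters(S1, S2, w):
    """Weighted least-squares similarity alignment of ST shape S2 to S1.

    S1, S2 : (2FL,) interleaved coordinate vectors [x, y, x, y, ...]
    w      : (2FL,) weights expressing our trust in each coordinate
    Returns z = [ax, ay, tx, ty], with ax = s*cos(theta), ay = s*sin(theta).
    """
    x, y = S2[0::2], S2[1::2]
    A = np.zeros((len(S2), 4))
    A[0::2, 0], A[0::2, 1], A[0::2, 2] = x, -y, 1.0   # rows for x-coordinates
    A[1::2, 0], A[1::2, 1], A[1::2, 3] = y, x, 1.0    # rows for y-coordinates
    W = np.diag(w)
    # Minimize (S1 - A z)^T W (S1 - A z) via the normal equations.
    return np.linalg.solve(A.T @ W @ A, A.T @ W @ S1)
```

Note that a single solve recovers rotation, scale, and translation jointly, which is why the (a_x, a_y) parametrization of the similarity transform is convenient here.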
S = S̄ + Pt b, (61)
where b = [b_1, b_2, \ldots, b_t]^T, P_t = [p_1, p_2, \ldots, p_t], and the constraints on b
become b_{l,\min} \le b_l \le b_{l,\max}, where 1 \le l \le t.
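Equation (61), together with the limits on the shape parameters, can be sketched as follows. The names are illustrative, and the choice of per-mode limits (often ±3 standard deviations of each mode) is a common convention rather than something prescribed here.

```python
import numpy as np

def generate_shape(S_mean, Pt, b, b_min, b_max):
    """Instantiate an ST shape from the model of Eq. (61),
    clamping each shape parameter b_l to [b_min[l], b_max[l]]."""
    b = np.clip(b, b_min, b_max)   # enforce b_l,min <= b_l <= b_l,max
    return S_mean + Pt @ b         # S = S_mean + Pt b
```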
where y_{ijk} is the representative profile for the kth landmark in the jth shape of
the ith ST shape. Using gray-level information, temporal and shape constraints,
the model is guided to a better estimate of the dynamic object hidden in the new
frame sequence.
where t = [t_x, t_y, t_x, t_y, \ldots, t_x, t_y]^T is of length 2FL. M(s, \theta)[S] + t scales,
rotates, and translates S by s, θ, and t, respectively. Both S̄ and Pt are obtained
from the training stage. A typical initialization would set b0 to zero and set s0 ,
θ0 , and t0 to values that place the initial sequence in the vicinity of the target.
Proposing a new sequence. For each landmark, say the kth landmark in
the jth frame, we define a search profile h_{jk} = [h_{jk1}, h_{jk2}, \ldots, h_{jkH}] that is
differentiated and normalized as done with the training profiles. This gives H^F
possibilities for the proposed positions of the kth landmark in the F frames (see
Figure 23).
Since locating the new positions (1 out of H^F possible) is computationally
demanding, we formulate the problem as a multistage decision process and use
dynamic programming [41] to find the optimum proposed landmark positions by
minimizing a cost function. The cost function comprises two terms: one due to
large temporal landmark position changes, and another reflecting the mismatch
between the gray-level values surrounding the current landmarks and those ex-
pected values found in the gray-level training stage. In the following paragraphs,
we detail our implementation of dynamic programming.
We calculate a gray-level mismatch value M_k(j, l) for each point along each
search profile in all the frames according to
the landmark in the last frame, frame F. Then we use m_F to find the proposed land-
mark position in the second-to-last frame, frame F-1, as m_{F-1} = P_k(F, m_F). Its
coordinates will be (c_{F-1,kx}(m_{F-1}), c_{F-1,ky}(m_{F-1})). In general, the proposed
coordinates of the kth landmark of the jth frame will be given by

m_j = P_k(j + 1, m_{j+1}). \quad (69)
Tracking back to the first frame, we acquire the coordinates of the proposed posi-
tions of the kth landmark in all frames. Similarly, we obtain the proposed positions
for all the landmarks (1 \le k \le L), which define the ST shape changes d\hat{S}^0_{\text{proposed}}.
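The multistage search for a single landmark, a forward accumulation of costs with back-pointers playing the role of P_k, followed by the backtracking of Eq. (69), might look as follows. The names and the squared-displacement form of the temporal penalty are illustrative assumptions, not the authors' exact cost function.

```python
import numpy as np

def propose_landmark_positions(mismatch, coords, alpha=1.0):
    """Dynamic-programming search for one landmark over F frames.

    mismatch : (F, H) gray-level mismatch M_k(j, l) for each of H candidate
               positions along the search profile in each frame
    coords   : (F, H, 2) spatial coordinates of the candidate positions
    alpha    : weight of the temporal-smoothness term (illustrative)
    Returns the index of the proposed candidate in each frame.
    """
    F, H = mismatch.shape
    C = np.zeros((F, H))              # accumulated cost per candidate
    P = np.zeros((F, H), dtype=int)   # back-pointers, i.e., P_k(j, l)
    C[0] = mismatch[0]
    for j in range(1, F):
        # Temporal penalty: squared displacement between consecutive frames,
        # indexed as d[current candidate, previous candidate].
        d = np.sum((coords[j][:, None, :] - coords[j - 1][None, :, :]) ** 2, axis=2)
        total = C[j - 1][None, :] + mismatch[j][:, None] + alpha * d
        P[j] = np.argmin(total, axis=1)
        C[j] = np.min(total, axis=1)
    # Backtrack from the best position in the last frame (Eq. (69)).
    m = np.zeros(F, dtype=int)
    m[F - 1] = int(np.argmin(C[F - 1]))
    for j in range(F - 2, -1, -1):
        m[j] = P[j + 1][m[j + 1]]     # m_j = P_k(j + 1, m_{j+1})
    return m
```

This reduces the search from H^F combinations to O(F H^2) operations, which is what makes the per-landmark optimization tractable.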
Limiting the proposed sequence. Since the proposed ST shape (\hat{S}^0 + d\hat{S}^0_{\text{proposed}})
will generally not conform to our model of reduced dimensionality
and will not lie in the ASTSD, it cannot be accepted as an ST shape estimate.
Therefore, we need to find an acceptable ST shape that is closest to the proposed
one. This is done by first finding the pose parameters (s^1, \theta^1, and t^1) that will
align \bar{S} to \hat{S}^0 + d\hat{S}^0_{\text{proposed}} by mapping \bar{S} to M(s^1, \theta^1)[\bar{S}] + t^1, and then
finding the extra ST shape modifications dS^1 that, when combined with the pose
parameters, will map exactly to \hat{S}^0 + d\hat{S}^0_{\text{proposed}}. The latter is done by solving
the following equation for dS^1:

M(s^1, \theta^1)[\bar{S} + dS^1] + t^1 = \hat{S}^0 + d\hat{S}^0_{\text{proposed}} \Rightarrow \quad (70)

dS^1 = M(s^1, \theta^1)^{-1}[\hat{S}^0 + d\hat{S}^0_{\text{proposed}} - t^1] - \bar{S}, \quad (71)

where M(s^1, \theta^1)^{-1} = M((s^1)^{-1}, -\theta^1). In order to find the new shape
parameters, b^1, we need to solve dS^1 = P_t b^1, which, in general, has no exact solution.
4.3. Results
We present results of locating the spatiotemporal shape of the left ventricle in
real echocardiographic image sequences. The training data set consisted of 6 frame
sequences, each sequence including 21 frames, and each frame of size 255 × 254
pixels (i.e., the size of ΦV = 6 × 21 × 255 × 254). The number of (x, y) landmark
coordinates in each frame was 25 (size of ΦS = 6 × 21 × 25 × 2). Three ST shape
parameters were used to explain 94.2% of the total ST shape variations. The gray-
level search was conducted on a profile of length 60 pixels, and the training profile
was of length 26 pixels. Figure 24 illustrates how statistical spatiotemporal prior
knowledge is used to constrain the proposed segmentation and produce the final
left-ventricular segmentation. Figure 25 shows additional segmentation results.
We applied this method to segmenting astrocyte cells in a 3D fluorescence
image, where the spatial z-axis replaces time. The training data set consisted of 8
volumes (out of 9, leave-one-out validation), and each included 11 image slices,
each image being of size 128×128 pixels (i.e., the size of ΦV = 8×11×128×128).
The number of (x, y) landmark coordinates in each slice was 40 (size of ΦS =
8 × 11 × 40 × 2). Seven shape parameters were used to explain 99.5% of the
total shape variations. The gray-level search was conducted on a profile of length
40 pixels, and the training profile was of length 12 pixels. Figure 26 illustrates
example segmentation results.
4.4. Summary
Motivated by the fact that many image analysis applications require robust
methods for representing, locating, and analyzing non-rigid time-varying shapes,
we presented an extension of a 2D ASM to 2D+time. This method models the
gray-level information and the spatiotemporal variations of a time-varying object
in a training set. The model was then used for locating similar moving objects in
a new image sequence. The segmentation technique was based on deforming a
spatiotemporal shape to better fit the image sequence data only in ways consistent
with the training set. The proposed deformations were calculated by minimizing
an energy function using dynamic programming. The energy function included
terms reflecting temporal smoothness and gray-level information constraints.
Figure 24. Left-ventricular segmentation result from two echocardiographic image se-
quences. Ultrasound frames are shown with the ST shape overlain (a,c) before and (b,d) af-
ter projection onto the ASTSD (frames progress from left to right, top to bottom). Reprinted
with permission from [63]. Copyright © 2004, Elsevier.
Figure 26. Segmenting a 3D astrocyte cell (spatial z-axis replaces time): (a) initial shape
model and (b) segmentation result overlain in white on a fluorescence 3D image. Reprinted
with permission from [63]. Copyright © 2004, Elsevier.
5. REFERENCES
1. Terzopoulos D. 1987. On matching deformable models to images. Technical Report 60, Schlum-
berger Palo Alto Research, 1986. Reprinted in Topical Meeting on Machine Vision, Technical
Digest Series, Vol. 12, pp. 160–167.
2. Kass M, Witkin A, Terzopoulos D. 1987. Snakes: active contour models. Int J Comput Vision
1(4):321–331.
3. Widrow B. 1973. The rubber mask technique, part I. Pattern Recognit 5(3):175–211.
4. Fischler M, Elschlager R. 1973. The representation and matching of pictorial structures. IEEE
Trans Comput 22(1):67–92.
27. Cootes T, Taylor C, Lanitis A. 1994. Active shape models: evaluation of a multi-resolution
method for improving image search. In Proceedings of the 5th British machine vision conference
(BMVC’94), pp. 327–336. Surrey, UK: BMVA Press.
28. Cootes T, Edwards G, Taylor C. 1998. Active appearance models. In Proceedings of the fifth
European conference on computer vision (ECCV 1998), Volume 2. Lecture notes in computer
science, Vol. 1407, pp. 484–498. Ed H Burkhardt, B Neumann. Berlin: Springer.
29. Sclaroff S, Isidoro J. 1998. Active blobs. In Proceedings of the sixth international conference on
computer vision (ICCV), pp. 1146–1153. Washington, DC: IEEE Computer Society.
30. Hill A, Thornham A, Taylor C. 1993. Model-based interpretation of 3D medical images. In Pro-
ceedings of the fourth British machine vision conference (BMVC’93), pp. 339–348. Surrey, UK:
BMVA Press.
31. Lanitis A, Taylor A, Cootes T. 1994. Automatic tracking, coding and reconstruction of human
faces using flexible appearance models. Electron Lett 30(19):1578–1579.
32. Baumberg A, Hogg D. 1994. An efficient method for contour tracking using active shape models.
In Proceedings of the 1994 IEEE workshop on motion of non-rigid and articulated objects,
pp. 194–199. Washington, DC: IEEE Computer Society.
33. Cootes T, Taylor C. 1995. Combining point distribution models with shape models based on finite
element analysis. Image Vision Comput 13(5):403–409.
34. Lanitis A, Taylor C, Cootes T. 1994. Recognising human faces using shape and grey-level informa-
tion. In Proceedings of the third international conference on automation, robotics and computer
vision, pp. 1153–1157. Washington, DC: IEEE Computer Society.
35. Bookstein F. 1989. Principal warps: thin-plate splines and the decomposition of deformations.
IEEE Trans Pattern Anal Machine Intell 11(6):567–585.
36. Hunter IA, Soraghan JJ, Christie J, Durrani JS. 1993. Detection of echocardiographic left ventricle
boundaries using neural networks. IEEE Proc Comput Cardiol 20:201–204.
37. Parker A, Hill A, Taylor C, Cootes T, Jin X, Gibson D. 1994. Application of point distribution
models to the automated analysis of echocardiograms. IEEE Proc Comput Cardiol 21:25–28.
38. Taine M, Herment A, Diebold B, Peronneau P. 1994. Segmentation of cardiac and vascular
ultrasound images with extension to border kinetics. Proceedings of the 8th IEEE symposium on
ultrasonics, Vol. 3, pp. 1773–1776. Washington, DC: IEEE Computer Society.
39. Papadopoulos I, Strintzis MG. 1995. Bayesian contour estimation of the left ventricle in ultrasound
images of the heart. Proceedings of the IEEE conference on engineering in medicine and biology,
Vol. 1, pp. 591–592. Washington, DC: IEEE Computer Society.
40. Malassiotis S, Strintzis MG. 1999. Tracking the left ventricle in echocardiographic images by
learning heart dynamics. IEEE Trans Med Imaging 18(3):282–290.
41. Amini A, Weymouth T, Jain R. 1990. Using dynamic programming for solving variational prob-
lems in vision. IEEE Trans Pattern Anal Machine Intell 12(9):855–867.
42. Cohen L. 1991. On active contour models and balloons. Comput Vision Graphics Image Process:
Image Understand 53(2):211–218.
43. Grzeszczuk R, Levin D. 1997. Brownian strings: segmenting images with stochastically de-
formable contours. IEEE Trans Pattern Anal Machine Intell 19(10):1100–1114.
44. Herlin I, Nguyen C, Graffigne C. 1992. A deformable region model using stochastic processes
applied to echocardiographic images. In Proceedings of the IEEE computer society conference
on computer vision and pattern recognition (CVPR 1992), pp. 534–539. Washington, DC: IEEE
Computer Society.
45. Lobregt S, Viergever M. 1995. A discrete dynamic contour model. IEEE Trans Med Imaging
14(1):12–24.
46. Cootes T, Taylor C, Cooper D, Graham J. 1995. Active shape models: their training and applica-
tion. Comput Vision Image Understand 61(1):38–59.
47. Staib L, Duncan J. 1992. Boundary finding with parametrically deformable models. IEEE Trans
Pattern Anal Machine Intell 14(11):1061–1075.
48. Leymarie F, Levine M. 1993. Tracking deformable objects in the plane using an active contour
model. IEEE Trans Pattern Anal Machine Intell 15(6):617–634.
49. Niessen W, Duncan J, Viergever M, Romeny B. 1995. Spatiotemporal analysis of left-ventricular
motion. Proc SPIE 2434:250–261.
50. Singh A, Von Kurowski L, Chiu M. 1993. Cardiac MR image segmentation using deformable
models. In SPIE proceedings on biomedical image processing and biomedical visualization, Vol.
1905, pp. 8–28. Bellingham, WA: SPIE.
51. Stark K, Fuchs S. 1996. A method for tracking the pose of known 3D objects based on an active
contour model. Proceedings of the international conference on pattern recognition (ICPR’96),
pp. 905–909. Washington, DC: IEEE Computer Society.
52. Terzopoulos D, Szeliski R. 1992. Tracking with Kalman snakes. In Active vision, pp. 3–20. Ed A
Blake, A Yuille. Cambridge: MIT Press.
53. Lelieveldt B, Mitchell S, Bosch J, van der Geest R, Sonka M, Reiber J. 2001. Time-continuous
segmentation of cardiac image sequences using active appearance motion models. Proceedings
of the 17th international conference on information processing in medical imaging (ICIPMI’01).
Lecture Notes in computer science, Vol. 2082, pp. 446–452. Berlin: Springer.
54. Black M, Yacoob Y. 1997. Recognizing facial expressions in image sequences using local
parametrized models of image motion. Int J Comput Vision 25(1):23–48.
55. Bonciu C, Léger C, Thiel J. 1998. A Fourier-Shannon approach to closed contour modeling.
Bioimaging 6:111–125.
56. Cootes T, Taylor C. 1997. A mixture model for representing shape variation. In Proceedings of the
eighth British machine vision conference (BMVC’97), pp. 110–119. Surrey, UK: BMVA Press.
57. Hill A, Taylor CJ. 1992. Model-based image interpretation using genetic algorithms. Image Vision
Comput 10(5):295–300.
58. Metaxas D, Terzopoulos D. 1991. Constrained deformable superquadrics and nonrigid motion
tracking. In Proceedings of the IEEE computer society conference on computer vision and pattern
recognition (CVPR 1991), pp. 337–343. Washington, DC: IEEE Computer Society.
59. Singh A, Goldgof D, Terzopoulos D. 1998. Deformable models in medical image analysis. Wash-
ington, DC: IEEE Computer Society.
60. Székely G, Kelemen A, Brechbühler C, Gerig G. 1996. Segmentation of 3D objects from MRI
volume data using constrained elastic deformations of flexible Fourier surface models. Med Image
Anal 1(1):19–34.
61. Hamarneh G, Chodorowski A, Gustavsson T. 2000. Active contour models: application to oral
lesion detection in color images. Proceedings of the IEEE international conference on systems,
man, and cybernetics, Vol. 4, pp. 2458–2463. Washington, DC: IEEE Computer Society.
62. Hamarneh G, Gustavsson T. 2000. Statistically constrained snake deformations. In Proceedings
of the IEEE international conference on systems, man, and cybernetics, Vol. 3, pp. 1610–1615.
Washington, DC: IEEE Computer Society.
63. Hamarneh G, Gustavsson T. 2004. Deformable spatiotemporal shape models: extending ASM to
2D + time. J Image Vision Comput 22(6):461–470.
12. DEFORMABLE ORGANISMS FOR MEDICAL IMAGE ANALYSIS
In medical image analysis strategies based on deformable models, controlling the de-
formations of models is a desirable goal in order to produce proper segmentations. In
Chapter 11—“Physically and Statistically Based Deformable Models for Medical Image
Analysis”—a number of extensions were demonstrated to achieve this, including user in-
teraction, global-to-local deformations, shape statistics, setting low-level parameters, and
incorporating new forces or energy terms. However, incorporating expert knowledge to
automatically guide deformations cannot be easily and elegantly achieved using the clas-
sical deformable model low-level energy-based fitting mechanisms. In this chapter we
review Deformable Organisms, a decision-making framework for medical image analysis
that complements bottom–up, data-driven deformable models with top–down, knowledge-
driven model-fitting strategies in a layered fashion inspired by artificial life modeling con-
cepts. Intuitive and controlled geometrically and physically based deformations are carried
out through behaviors. Sensory input from image data and contextual knowledge about
the analysis problem govern these different behaviors. Different deformable organisms for
segmentation and labeling of various anatomical structures from medical images are also
presented in this chapter.
Address all correspondence to: Dr. Ghassan Hamarneh, School of Computing Science, Simon Fraser
University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada. Phone: +1.604.291.3007, Fax:
+1.604.291.3045, [email protected].
are continuously being acquired. This is creating an increasing demand for medical
image analysis (MIA) tools that are not only robust and highly automated, but also
intuitive for the user and flexible to adapt to different applications. Medical image
segmentation in particular remains one of the key tasks indispensable to a wide
array of subsequent quantification and visualization goals in medicine, including
computer-aided diagnosis and statistical shape analysis applications. However, the
automatic segmentation and labeling of anatomical structures in medical images
is a persistent problem that continues to defy solution. Several classifications of
segmentation techniques exist, including edge, pixel, and region-based techniques,
clustering, graph theoretic, and model correlation approaches [1–5]. However, no
one method can yet handle the most general case with sufficient accuracy.
It is important to note that a substantial amount of knowledge is often available
about anatomical structures of interest — shape, position, orientation, symmetry,
relationship to neighboring structures, landmarks, etc. — as well as about the
associated image intensity characteristics. However, medical image analysis re-
searchers have struggled to develop segmentation techniques that can take full
advantage of such knowledge.
The development of general-purpose automatic segmentation algorithms will
require not only powerful bottom–up, data-driven processes, but also equally
powerful top–down, knowledge-driven processes within a robust decision-making
framework that operates across multiple levels of abstraction. A flexible framework
is needed that can operate at the appropriate level of abstraction and is capable of
incorporating and applying all available knowledge effectively (i.e., at the correct
time and location during image analysis). A critical element in any such viable
highly automated solution is the decision-making framework itself. Top–down, hi-
erarchically organized models that shift their focus from structures associated with
stable image features to those associated with less stable features are promising ex-
amples [6, 7]. Although knowledge-based [8–14] and agent-based segmentation
techniques [15–19] have been proposed in the past, their use of high-level con-
textual knowledge remains largely ineffective because it is intertwined much too
closely with the low-level optimization-based mechanisms. We revisit ideas for
incorporating knowledge that were explored in earlier systems and develop a new
framework that focuses on top–down reasoning strategies that may best leverage
the powerful bottom–up feature detection and integration abilities of deformable
models and other modern model-based medical image analysis techniques.
Deformable models, one of the most actively researched model-based seg-
mentation techniques [20], feature a potent bottom–up component founded in
estimation theory, optimization, and physics-based dynamical systems, but their
top–down processes have traditionally relied on interactive initialization and guid-
ance by knowledgeable users (see Section 1.4, Snakes Drawbacks, in Chapter
11). Since their introduction by Terzopoulos et al. [83, 84], deformable models
for medical image segmentation have gained increasing popularity. In addition
tors rather than low-level image features. Different subroutines are invoked based
on high-level model-fitting decisions by integrating image features, prior contex-
tual anatomical knowledge, and a pre-stored segmentation plan. Furthermore,
by combining a layered architecture with a set of standard subroutines, powerful
and flexible “custom-tailored” models can be rapidly constructed, thus providing
general-purpose tools for automated medical image segmentation and labeling.
To realize the ideas and achieve the goals mentioned above in Section 1,
we introduced a new paradigm for automatic medical image analysis that adopts
concepts from the emerging field of artificial life (AL). In particular, we developed
deformable organisms, autonomous agents whose objective is the segmentation
and analysis of anatomical structures in medical images [38]. The AL modeling-
based approach provides us with the required flexibility to adhere to an active,
explicit search strategy that takes advantage of contextual and prior knowledge of
anatomy. The organisms are aware of the progress of the segmentation process
and of each other, allowing them to effectively and selectively apply knowledge
of the target objects throughout their development.
Viewed in the context of the AL modeling hierarchy (Figure 2a), current au-
tomatic deformable model-based approaches to medical image analysis utilize ge-
ometric and physical modeling layers only (Figure 2b). In interactive deformable
models, such as snakes, the human operator is relied upon to provide suitable be-
havioral and cognitive level support (Figure 2c). At the physical level, deformable
models interpret image data by simulating dynamics or minimizing energy terms,
but the models themselves do not monitor or control this optimization process
except in a most primitive way.
To overcome the aforementioned deficiencies while retaining the core strengths
of the deformable model approach, we add high-level controller layers (a “brain”)
on top of the geometric and physical layers to produce an autonomous deformable
organism (Figure 2d). The planned activation of these lower deformation layers
allows us to control the fitting or optimization procedure. The layered architec-
ture approach allows the deformable organism to make deformation decisions at
the correct level of abstraction utilizing prior knowledge, memorized information,
sensed image features, and inter-organism interaction.
Specifically, a deformable organism is structured as a “muscle”-actuated “body”
whose “behaviors” are controlled by a “brain” (Figure 3) that makes decisions
based on perceived image data and extraneous knowledge. The brain is the organ-
ism’s cognitive layer, which activates behavior routines (e.g., for a CC organism:
find-splenium, find-genu, find-upper-boundary, etc. (Figure 1)) according to a
plan or schedule (Figure 3).
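The plan-driven control of the cognitive layer can be sketched schematically as follows. This is not the authors' implementation; the class, the routine names mentioned above (e.g., find-splenium), and the simple sequential scheduler are illustrative assumptions.

```python
from typing import Callable, Dict, List

class DeformableOrganismBrain:
    """Illustrative cognitive layer: activates behavior routines
    according to a pre-stored plan or schedule."""

    def __init__(self, plan: List[Callable[[], bool]]):
        self.plan = plan               # ordered behavior routines
        self.memory: Dict[str, object] = {}  # memorized information

    def run(self) -> bool:
        # Execute the behavior routines in their scheduled order;
        # each routine returns True when its goal is achieved.
        for behavior in self.plan:
            if not behavior():
                return False           # a failed behavior halts the plan
        return True
```

In practice each routine would combine muscle actuation with perceptual attention, and the plan could branch on sensed image features rather than run strictly in sequence.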
Figure 2. AL, Deformable Models, and Deformable Organisms. (a) AL modeling pyramid.
Adapted with permission from [39]. Copyright © 1999, ACM. (b) Automatic deformable
models (incorporating geometry and physics layers only). (c) Deformable models guided
by an expert human operator. (d) Intelligent deformable models (deformable organisms)
provide a model of the cognitive abilities of human operators (by including higher cognitive
layers).
Figure 3. A Deformable Organism. The brain issues "muscle" actuation and perceptual attention commands. The organism deforms and senses image features, whose characteristics are conveyed to the brain. The brain makes decisions based on sensory input, memorized information and prior knowledge, and a pre-stored plan, which may involve interaction with other organisms. Reprinted with permission from [85]. Copyright © 2002 Elsevier.
DEFORMABLE ORGANISMS FOR MEDICAL IMAGE ANALYSIS 393
Carrying out a behavior routine requires image information in order for the
proper shape deformation to take place toward achieving the goal of the current
behavior. The deformable organism perception system is responsible for gathering
image information and comprises a set of sensors that are adaptively tuned to
specific image features (edge strength, texture, color, etc.) in a task-specific way.
Hence, the organism can disregard sensory information superfluous to its current
behavioral needs.
The organism carries out a sequence of active, explicit searches for stable
anatomical features, beginning with the most stable anatomical feature and then
proceeding to the next best feature. This allows the organism to be “self-aware”
(i.e., it knows where it and its parts are and what it is seeking at every stage) and is
therefore able to perform these searches intelligently and effectively by utilizing a
conflux of contextual knowledge, perceived sensory data, an internal mental state,
memorized knowledge, and a cognitive plan. For example, it need not be satisfied
with the nearest matching feature, but can look further within a region to find the
best match, thereby avoiding globally suboptimal solutions. The plan (or plans)
can be generated with the aid of a human expert, since the behavior routines are
defined using familiar anatomical terminology.
An organism may “interact” with other organisms to determine optimal initial
conditions. Once stable features are found and labeled, an organism can selectively
use prior knowledge or information from the neighbor organisms to determine the
object boundary in regions known to offer little or no feature information. Inter-
action among organisms may be as simple as collision detection and avoidance, or
one or several organisms supplying intelligent initial conditions to another, or the
use of inter-organism statistical shape/image appearance constraint information.
Furthermore, by carrying out explicit searches for features, correct correspon-
dences between the organism and the data are more readily assured. If a feature
cannot be found, an organism may “flag” this situation (Figure 4b). If multiple
plans exist, another plan can be selected and/or the search for the missing feature
postponed until further information is available (e.g., from a neighboring organ-
ism). Alternatively, the organism can retrace its steps and return to a known state
and then inform the user of the failure. A human expert can intervene and put
the organism back on course by manually identifying the feature. This strategy
is possible because of the sequential and spatially localized nature of the model
fitting process.
Customized behavioral routines and explicit feature searches require powerful,
flexible and intuitive model deformation control. The behavior routines activate
“motor” (i.e., deformation) controller routines or growth controller routines, en-
abling the organism to fulfill its goal of object segmentation. An organism may
begin in an “embryonic” state with a simple proto-shape, and then undergo con-
trolled growth as it develops into an “adult,” proceeding from one stable object
feature to the next. Alternatively, an organism may begin in a fully developed
state and undergo controlled deformations as it carries out its model-fitting plan. The type of organism to use, or whether to use some sort of hybrid organism, depends on the image and shape characteristics of the target anatomical structure (different examples are presented in Section 4).
Deformation controllers are parametrized procedures dedicated to carrying
out a complex deformation function, such as successively bending a portion of
the organism over some range of angle or stretching part of the organism forward
some distance. They translate natural control parameters such as <bend angle,
location, scale> or <stretch length, location, scale> into detailed deformations.
To summarize, the AL layered architecture provides the needed framework to
complement the geometrical and physical layers of classical deformable organisms
(the model shape, topology, and deformations) with behavioral and cognitive layers
(the high-level controllers that activate the routines) utilizing a perception system
(the source of the image data) and contextual-knowledge (knowledge about the
anatomy’s shape, appearance, neighborhood relationships, etc.).
Figure 5. Example deformable organisms for medical image analysis. (a) Geometrically
based and (b) physically based deformable CC organisms. (c) 2D and (d) 3D vessel crawler.
Progress of segmentation is shown from left to right. See attached CD for color version.
limited either by the type of objects they can model, or by the type and intuitiveness
of the deformations they can carry out. They are also typically not defined in terms
of the object but rather the object is unnaturally defined (or deformed) in terms
of the representation or deformation mechanism. Other 3D shape representations
with similar drawbacks include spherical harmonics, FEM, NURBS, and wavelet-
based representations [52–55].
Shape models founded upon the use of the medial-axis transform [56] are
emerging as a powerful alternative to the earlier boundary-based and volume-
based techniques [57–68]. Medial representations provide both a local and global
description of shape. Deformations defined in terms of a medial axis are natural
and intuitive and can be limited to a particular scale and location along the axis,
while inherently handling smoothness and continuity constraints.
Statistical models of shape variability have been used for medical image inter-
pretation [25, 27, 31, 32, 69]. These typically rely on principal component analysis
(PCA) and hence are only capable of capturing global shape variation modes. Sta-
tistical analysis of medial-based shape representation has been the focus of much
recent research [70–74].
In the following subsections we present the details of a variety of shape rep-
resentation and controlled deformation techniques that are crucial to the operation
of deformable organisms and modeling their lower geometrical and physical lay-
ers. These include 2D medial profiles (Section 3.3.1), 3D shape medial patches
(3.3.2), 2D spring-mass systems with physics-based deformations (3.3.3), and their
3D extension (3.3.4).
4. Compute the locations x_m^l and x_m^r of the boundary points l and r on either side of the mth medial node (Figure 6) as
Figure 6. Diagram of shape representation. Reprinted with permission from [85]. Copyright © 2002 Elsevier.
Figure 7. Example medial shape profiles: (a) length profile L(m), (b) orientation profile O(m), (c) left thickness profile T^l(m), and (d) right thickness profile T^r(m). Reprinted with permission from [85]. Copyright © 2002 Elsevier.
x_m^l = x_m + T^l(m) [ s_x cos(θ + O(m) + π/2),  s_y sin(θ + O(m) + π/2) ]^T,   (1)

and, similarly,

x_m^r = x_m + T^r(m) [ s_x cos(θ + O(m) − π/2),  s_y sin(θ + O(m) − π/2) ]^T.   (2)
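The boundary reconstruction of Eqs. (1) and (2) can be sketched numerically as follows. This is a minimal illustration, assuming the profiles are sampled as NumPy arrays and that θ is a single base orientation for the whole axis; the function name and array conventions are ours, not the chapter's:

```python
import numpy as np

def boundary_points(x, theta, O, Tl, Tr, sx=1.0, sy=1.0):
    """Reconstruct left/right boundary points from a medial description
    (cf. Eqs. (1) and (2)).

    x      : (M, 2) array of medial node positions x_m
    theta  : base orientation of the medial axis (radians)
    O      : (M,) orientation profile O(m)
    Tl, Tr : (M,) left/right thickness profiles T^l(m), T^r(m)
    sx, sy : global scale factors
    """
    # Left boundary: offset each medial node by T^l(m) along the +90 degree normal.
    left = x + Tl[:, None] * np.column_stack([
        sx * np.cos(theta + O + np.pi / 2),
        sy * np.sin(theta + O + np.pi / 2)])
    # Right boundary: same construction with the -90 degree normal and T^r(m).
    right = x + Tr[:, None] * np.column_stack([
        sx * np.cos(theta + O - np.pi / 2),
        sy * np.sin(theta + O - np.pi / 2)])
    return left, right
```

For a straight horizontal axis (theta = 0, O(m) = 0) with unit thickness, the left boundary sits one unit above each medial node and the right boundary one unit below it.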
Figure 9. Examples of operator types (left to right): triangular, Gaussian, flat, bell, and cusp [75]. Reprinted with permission from [63]. Copyright © 2004 World Scientific.
Figure 10. Introducing a bulge on the right boundary by applying a deformation operator on the right thickness profile: (a) T^r(m) before and (c) after applying the operator. (b) Reconstructed shape before and (d) after the operator. Reprinted with permission from [85]. Copyright © 2002 Elsevier.
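The bulge of Figure 10 can be produced by adding a scaled operator to the thickness profile. A minimal sketch, assuming a Gaussian operator k(m) and treating the profile as a NumPy array (apply_operator and its parameters are illustrative names):

```python
import numpy as np

def apply_operator(profile, loc, scale, alpha):
    """Add a Gaussian deformation operator of amplitude alpha, centered at
    medial node loc with spatial extent scale, to a shape profile.
    Applied to the right thickness profile T^r(m), a positive alpha
    produces a bulge on the right boundary; a negative alpha, a squash."""
    m = np.arange(len(profile))
    bump = np.exp(-0.5 * ((m - loc) / scale) ** 2)  # Gaussian operator k(m)
    return profile + alpha * bump
```

Reconstructing the boundary from the modified profile (Eqs. (1) and (2)) then yields the bulged shape; nodes far from loc are essentially unaffected.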
p_d(m) = p̄_d(m) + Σ_l Σ_s Σ_t α_dlst k_dlst(m),   (5)
p_d(m) = p̄_d(m) + M_dls w_dls,   (6)

where p, d, p̄, p_d(m), l, and s are defined in (4); M_dls are variation modes (columns of M) for specific d, l, and s; and w_dls are weights of the variation modes, typically set such that the variation stays within three standard deviations.
For any shape profile type, multiple variation modes can be activated by setting the corresponding weighting factors to nonzero values. Each variation mode acts at a certain location and scale; hence we obtain

p_d(m) = p̄_d(m) + Σ_l Σ_s M_dls w_dls.   (7)
In summary, varying the weights of one or more of the variation modes alters the
length, orientation, or thickness profiles and generates statistically feasible stretch,
bend, or bulge deformations at specific locations and scales upon reconstruction.
Examples of statistics-based deformations are shown in Figure 11e–h.
Figure 12. Resulting medial shape profiles after applying the fitting schedule: (a) length profile L(m), (b) orientation profile O(m), (c) left thickness profile T^l(m), and (d) right thickness profile T^r(m). Reprinted with permission from [63]. Copyright © 2004 World Scientific.
Figure 13. Close-up of the initial and final stages of the handcrafted fitting schedule. Reprinted with permission from [63]. Copyright © 2004 World Scientific.
Table 1. Handcrafted fitting schedule (each step gives the deformation, its location and scale, the variation mode or operator type, and the amplitude):

1. Translation by (74, 24)
2. Rotation counterclockwise by 10°
3. Scaling by 1.2
4. Bend, location 1, scale 8, mode 2, w = 0.5
5. Bend, location 20, scale 8, mode 2, w = −0.8
6. Bend, location 22, scale 6, mode 2, w = −0.75
7. Bend, location 24, scale 4, mode 1, w = 2.2
8. Bend, location 1, scale 4, mode 2, w = 1
9. Stretch, location 6, scale 4, mode 1, w = −1.5
10. Stretch, location 26, scale 1, mode 1, w = 2
11. Left-bulge, location 15, scale 7, mode 1, w = 3
12. Left-bulge, location 18, scale 3, mode 1, w = 2
13. Left-bulge, location 6, scale 12, mode 1, w = 3
14. Left-bulge, location 5, scale 3, mode 1, w = 3
15. Right-squash, location 9, scale 3, mode 1, w = −1
16. Right-bulge, location 13, scale 2, mode 1, w = 0.5
17. Left-bulge, location 21, scale 3, Gaussian, α = 0.3
18. Left-bulge, location 21, scale 7, Gaussian, α = 0.1
19. Right-squash, location 24, scale 2, Gaussian, α = −0.5
20. Right-bulge, location 4, scale 2, Bell, α = 1.7
21. Right-bulge, location 6, scale 3, Gaussian, α = 0.4
22. Right-squash, location 1, scale 3, Gaussian, α = −2.2
23. Right-squash, location 25, scale 1, Gaussian, α = −0.8
Figure 14. Progress of the handcrafted fitting schedule (fitting steps listed in Table 1). Reprinted with permission from [63]. Copyright © 2004 World Scientific.
Figure 15. Deformations on a synthetic (slab) object. (a) Different operator types (left
to right): Gaussian, rectangular, pyramidal, spherical. (b) Different deformation types:
stretching, longitudinal bend, latitudinal bend. Different (c) locations, (d) amplitudes, and
(e) extent of a bulge operator. (f) Combining stretching, longitudinal and latitudinal bend,
and bulging deformations.
Because the effects of the deformation parameters are easily understood even by users not familiar with the details of the shape representation, intuitive and accurate production of the desired deformations is possible. Results on real brain caudate nucleus and ventricle structures are presented in Section 4.5.
Figure 16. Examples of different synthetic spring-mass structures. Reprinted with permis-
sion from [80]. See attached CD for color version.
Figure 17. (a) Midsagittal MR brain image with the corpus callosum (CC) outlined in white. (b) CC mesh model showing medial and boundary masses. Reprinted with permission from [80]. Copyright © 2005 SPIE.
Figure 18. Examples of deformations via user interaction ("mouse" forces). Reprinted with permission from [62]. Copyright © 2003 SPIE. See attached CD for color version.
Figure 19. Examples of physics-based deformations of the CC organism using (a) user-applied and (b) rotational external forces. Operator-based (c) bending, (d) bulging, and (e) stretching deformations. (f) Statistics-based spring actuation. Reprinted with permission from [62]. Copyright © 2003 SPIE.
In order for deformable organisms to explore the image space, fit to specified
regions, and take new shapes, they must be able to undertake a sequence of defor-
mations. Some of these deformations take place independent of the topological
design of the organism (general deformations), while others are designed specif-
ically for medial-axis based worm-type organisms (medial-based deformations).
Furthermore, in specific applications the general deformations can be modified to
apply to specific anatomical regions of the model. By automatically fixing a group
of nodes in place, the organism can perform deformations on specific regions of
its geometrical model without affecting others. We use the terms regional rota-
tion/translation and boundary expansion to refer to these types of deformations.
Figure 20. Definition of variables for (a) radial bulge, (b) directional bulge, and (c) localized
scaling.
each spring as in Eq. (9), where θ ∈ [0, π/2] is now defined as the angle between s_ij and D (Figure 20b). The resulting effect in this case is that springs closer to C and with directions closer to the stretch deformation direction are affected more (Figure 21).
A localized scaling deformation is independent of direction and requires only
specification of a deformation region and amplitude (Figure 20c). The rest length
update equation then becomes
r_ij = [ (1 − d/R)(K − 1) + 1 ] r_ij^old.   (10)
while the rest lengths on the other side are decreased according to

r_ij = [ (1 − d²/R²)(1 − 2θ/π)(1/K − 1) + 1 ] r_ij^old.   (12)
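A sketch of the rest-length update of Eq. (10), assuming each spring's distance d from the deformation center C has already been computed (the function and parameter names are ours):

```python
import numpy as np

def localized_scale(rest_lengths, dists, R, K):
    """Rest-length update for a localized scaling deformation (cf. Eq. (10)).

    rest_lengths : current rest lengths r_ij of the springs
    dists        : distance d of each spring from the deformation center C
    R            : radius of the deformation region
    K            : scaling amplitude (K > 1 expands, K < 1 shrinks)
    """
    d = np.minimum(np.asarray(dists, dtype=float), R)  # no effect beyond R
    factor = (1.0 - d / R) * (K - 1.0) + 1.0  # K at the center, 1 at d >= R
    return factor * np.asarray(rest_lengths, dtype=float)
```

The update interpolates linearly from the full amplitude K at the center to no change at the edge of the region, so the deformation blends smoothly into the unaffected part of the mesh.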
Figure 22. Definition of variables for deformation operators: bending (left) and tapering (middle). Reprinted with permission from [62]. Copyright © 2003 SPIE.
Figure 23. External forces for performing a (a) rotation (light gray circle marks center of mass) and a (b) translation. Reprinted with permission from [62]. Copyright © 2003 SPIE. See attached CD for color version.
Figure 24. Similarity transformation via external forces. (a) Rotating a model of the CC. (b) Scaling and rotating a synthetic model. Reprinted with permission from [62]. Copyright © 2005 SPIE.
Figure 25. Examples of localized deformations: (a) initial synthetic object, (b) bulge, (c) bend, (d) bend at another location, (e) tapering, (f) tapering followed by a bulge, and (g) tapering followed by a bulge and a bend. CC model (h) before and (j) after a localized bend. (i,k) Close-up versions of (h,j). Reprinted with permission from [62]. Copyright © 2003 SPIE.
Figure 26. Spring types used for statistics-based deformations. Reprinted with permission from [62]. Copyright © 2003 SPIE. See attached CD for color version.
The set of rest lengths for the stretch springs (Figure 26) in a single example model is collected in a vector r^S, i.e.,

r^S = [ r_1^S, r_2^S, ..., r_{N_S}^S ],

and similarly for the bending and left and right thickness springs (Figure 26):

r^B = [ r_1^B, r_2^B, ..., r_{N_B}^B ],
r^{TL} = [ r_1^{TL}, r_2^{TL}, ..., r_{N_T}^{TL} ],   (17)
r^{TR} = [ r_1^{TR}, r_2^{TR}, ..., r_{N_T}^{TR} ].

This gives

r^S = r̄^S + M_S w_S,
r^B = r̄^B + M_B w_B,
r^{TR} = r̄^{TR} + M_{TR} w_{TR},   (18)
r^{TL} = r̄^{TL} + M_{TL} w_{TL},
where def is the deformation type, being either S (for stretch), B (for bend), T L (for
left thickness), or T R (for right thickness). The location and scale, determined by
the choice of loc and scl, respectively, determine which springs are to be included
in the analysis according to
r_{def,loc,scl} = [ r_def^loc, r_def^{loc+1}, ..., r_def^{loc+scl−1} ].   (20)
For example, for the bending deformation at location "five" with scale "three" ((def, loc, scl) = (B, 5, 3)) we have

r_{B,5,3} = [ r_B^5, r_B^6, r_B^7 ].   (21)

The regional mean is computed over the training set as

r̄_{def,loc,scl} = (1/N) Σ_{j=1}^{N} r(j)_{def,loc,scl},   (22)

where r(j)_{def,loc,scl} is r_{def,loc,scl} obtained from the jth training example, and N is the number of training examples. The columns of M_{def,loc,scl} are the eigenvectors,
m_{def,loc,scl}, of the covariance matrix C_{def,loc,scl}. That is,

C_{def,loc,scl} m_{def,loc,scl} = λ_{def,loc,scl} m_{def,loc,scl},   (23)

where

C_{def,loc,scl} = (1/(N−1)) Σ_{j=1}^{N} ( r(j)_{def,loc,scl} − r̄_{def,loc,scl} )( r(j)_{def,loc,scl} − r̄_{def,loc,scl} )^T.   (24)
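The regional statistics of Eqs. (20)-(24) amount to ordinary PCA on sub-vectors of the training rest lengths. A compact sketch, under the assumption that one deformation type's training vectors are stacked row-wise in a matrix (names are illustrative):

```python
import numpy as np

def regional_pca(data, loc, scl):
    """Regional PCA on spring rest-length vectors (cf. Eqs. (20)-(24)).

    data : (N, L) array; row j holds the rest lengths of training example j
           for a single deformation type (S, B, TL, or TR)
    loc  : index of the first spring in the region (0-based here)
    scl  : number of consecutive springs in the region
    Returns the regional mean, eigenvalues (largest first), and variation
    modes (columns are eigenvectors of the regional covariance matrix)."""
    sub = data[:, loc:loc + scl]                       # Eq. (20): select region
    r_bar = sub.mean(axis=0)                           # Eq. (22): regional mean
    centered = sub - r_bar
    C = centered.T @ centered / (data.shape[0] - 1)    # Eq. (24): covariance
    evals, modes = np.linalg.eigh(C)                   # Eq. (23): eigenvectors
    order = np.argsort(evals)[::-1]                    # strongest mode first
    return r_bar, evals[order], modes[:, order]
```

New regional rest lengths are then generated as r_bar + modes @ w for a weight vector w, as in Eq. (18), with the weights bounded by a few standard deviations (square roots of the eigenvalues).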
Figure 27. Sample corpus callosum mesh model deformations (1st PC for all deformation types over the entire CC) derived from the hierarchical regional PCA. Reprinted with permission from [62]. Copyright © 2003 SPIE.

Figure 28. Sample CC mesh model deformations (2nd PC for all deformation types over the entire CC) derived from the hierarchical regional PCA. Reprinted with permission from [62]. Copyright © 2003 SPIE.
Figure 31. Topology of the vessel crawler, showing masses (left), radial springs (middle), and stability springs (right) across sequential layers of the organism. See attached CD for color version.
stretching of the vessel crawler and links to boundary nodes to control thickness.
As deformable organisms are typically modeled after their target structures, we
apply this tubular topology to the task of vascular segmentation and analysis in
volumetric medical images [86]. We provide results in Section 4.6.
Figure 32. Off-board sensors (arc of white nodes in (a) and (b)) measure image intensity
(along the arc). This results in an intensity profile exhibiting three distinct peaks when an
overlapping vessel is ahead (c) and only two peaks in the case of a bifurcation (d).
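The peak-counting test suggested by Figure 32 can be sketched as a simple local-maximum count over the sensor's intensity profile; a real profile would need smoothing and a noise threshold first (count_peaks is our illustrative name):

```python
import numpy as np

def count_peaks(profile, min_height=0.0):
    """Count strict local maxima in a 1D sensor intensity profile.
    Per Figure 32, three peaks along the sensor arc suggest an overlapping
    vessel ahead, while two peaks suggest a bifurcation."""
    p = np.asarray(profile, dtype=float)
    interior = p[1:-1]
    # A peak is an interior sample above both neighbors and the threshold.
    is_peak = (interior > p[:-2]) & (interior > p[2:]) & (interior > min_height)
    return int(np.count_nonzero(is_peak))
```

The organism's cognitive layer would then branch on the returned count to choose between the bifurcation and overlap behaviors.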
Figure 33. A vessel crawler (left, in gray) utilizing an off-board hemispherical sensor
(shown as an arc in the left-hand image). The sensor (shown in 3D on the right) collects
vesselness measures, guiding it as it crawls along the vessel and detects branching points.
See attached CD for color version.
Model Initialization: To begin its search for the CC, the deformable
organism must be initialized to an appropriate location in the image using
robust anatomical features. The organism’s first step is to locate the top of
the head using a modified Hough transform [81] to search for the largest
ellipse (skull). Then, in response to the output of the skull sensor, a CC
template is built consisting of a rectangular window connecting two squares
to approximate the main body, genu, and splenium, respectively (see Figure
1). Maximizing average image intensity and minimizing intensity variance
over the shape model yields several candidate locations for the three CC
parts. Finally, the set of candidate parts that exhibits the strongest edge connectivity and the maximal distance between the parts is selected.
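The candidate scoring described above (high mean intensity, low intensity variance over the template window) can be sketched as an exhaustive window scan. This is a simplification: lam is an assumed trade-off weight, and the actual system scores a three-part CC template after Hough-based skull detection rather than a single window:

```python
import numpy as np

def template_score(image, top_left, shape, lam=1.0):
    """Score one candidate placement of a template part: high mean intensity
    and low intensity variance inside the window score well; lam is an
    assumed trade-off weight between the two terms."""
    r, c = top_left
    h, w = shape
    window = image[r:r + h, c:c + w]
    return window.mean() - lam * window.var()

def best_placement(image, shape, lam=1.0):
    """Exhaustively scan the image for the best-scoring window position."""
    h, w = shape
    rows, cols = image.shape
    candidates = [((r, c), template_score(image, (r, c), shape, lam))
                  for r in range(rows - h + 1)
                  for c in range(cols - w + 1)]
    return max(candidates, key=lambda item: item[1])[0]
```

Keeping the top few candidates instead of the single best one mirrors the chapter's strategy of deferring the final decision to the edge-connectivity and inter-part-distance test.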
Growing: Once initialized at a seed point, the vessel crawler must grow
outward along the vasculature by sequentially adding new layers centered
around the final sensor position using the 3D hemispherical and Hessian-
based sensors (described in Section 3.4). As it grows, each new layer
must be created and subsequently connected to the current endmost layer
(Figure 31). The newest layer is aligned to the provided direction vector,
and then connected via a consistent clockwise ordering to prevent mesh
twisting. Once connected, the model can be fit to the image data.
Fitting: The organism fits itself to the vessel boundary using 3D im-
age gradient driven deformations simulated by the physics layer (Section
3.3.4). Connections to the previous layer provide smoothness, while stiffer
circumferential springs provide local stability to noise, and flexible radial
springs allow deformation to the vessel boundary (Figure 31).
reached its goal. However, at this stage the medial axis is not in the middle of the CC
organism (Figure 34.27) so it is re-parametrized until the medial nodes are halfway
between the boundary nodes (Figure 34.28–30). Finally, the upper and lower
boundaries, which were reset in the previous step, are relocated (Figure 34.31–36)
to obtain the final segmentation result (Figure 34.36). Other CC segmentation
(Figure 35), validation results (Figure 36), and a demonstration of the organism’s
self-awareness (Figure 38) are presented.
Figure 35. Segmentation results. Reprinted with permission from [38]. Copyright © 2002 Springer.
Figure 36. Segmentation results (top), also shown (in black) over manually segmented (gray) corpora callosa (bottom). Reprinted with permission from [85]. Copyright © 2002 Elsevier.
Figure 37. Segmentation result (a) before and (b) after detecting and repairing the fornix dip. (c) Note the weak gradient magnitude where the fornix overlaps the CC. Reprinted with permission from [85]. Copyright © 2002 Elsevier.
Figure 39. The lateral ventricle, caudate nucleus, and putamen shown in transversal brain
MRI slice.
Figure 40. Deformable lateral ventricles (1–16), caudate nuclei (CN) (8–16), and putamina (11–16) organisms progressing through a sequence of behaviors to locate the corresponding structures in an MR brain image. Reprinted with permission from [85]. Copyright © 2002 Elsevier. (Results continued on next page.)
The CN organism segments the CN by stretching to locate its upper and lower limits
(Figure 40.9) and thickening to latch onto its inner and outer boundaries (Figure
40.10). The CN organism passes information about the location of its lowest
point (in the image) to the putamen organism, which is initialized accordingly
(Figure 40.11). The putamen organism moves toward the putamen in the brain
image (Figure 40.12) and then rotates and bends to latch onto the nearer putamen
boundary (Figure 40.13). It then stretches and grows along the boundary until
reaching the upper- and lower-most ends of the putamen (Figure 40.14), which
identifies the medial axis of the putamen (Figure 40.15). Since the edges of the
putamen boundary near the gray matter are usually weak, the organism activates
an explicit search for an arc (parametrized only by one parameter controlling its
curvature) that best fits the weak, sparse edge data in that region (Figure 40.16).
then show an example of automatically fitting the 3D medial patch shape model
to the binary image of a caudate nucleus from a 3D brain MRI (Figure 45). An
initial medial sheet estimate is positioned at the plane spanned by the two main
principal components of the locations of object points in the binary image. For
each point in the medial sheet, rays are cast in both directions perpendicular to the
medial sheet (along the third eigenvector) until they encounter an object boundary.
The two boundary locations above and below the medial sheet are recorded. The
nodes of the medial sheet are repositioned to be halfway between the top and
Figure 42. Progress of segmentation through its primary phases: (a) global model alignment, (b) model part alignment through (c) expansion and (d) contraction, (e) medial-axis alignment, (f) fitting to boundary, (g) detecting and (h) repairing fornix dip. Reprinted with permission from [80]. Copyright © 2005 SPIE. See attached CD for color version.
Figure 43. (a) Automatic labeling of important anatomical regions of the CC. (b) Before and (c) after intuitive manual intervention to improve the segmentation (red corresponds to areas of erroneous segmentation). Reprinted with permission from [80]. Copyright © 2005 SPIE. See attached CD for color version.
Figure 44. Fitting a 3D shape model to a lateral ventricle: (a) initial 3D model, (b)
uniform shrinking along the x-axis, (c) bending deformations, (d) medial sheet bent into
approximate bisecting position of target structure, (e) final 3D shape reconstructed from the
medial sheets, and (f) overlay of 3D model on target structure.
bottom boundaries. Rays are now cast along vectors normal to the medial sheet
until they encounter the boundaries. The procedure is iterated until the change in
the locations of the medial nodes becomes negligible.
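The iterative re-centering loop can be sketched in 2D, with rays cast straight up and down per image column instead of along the evolving sheet normals (a simplification of the 3D procedure described above; names are ours):

```python
import numpy as np

def recenter_medial(binary, medial, max_iter=10, tol=0.5):
    """Iteratively re-center medial node heights between the object's upper
    and lower boundaries. Rays are cast straight up/down per column (a
    simplification; the 3D procedure casts along the sheet normals).

    binary : (rows, cols) boolean image, True inside the object
    medial : (cols,) initial medial heights (row coordinates)
    """
    rows = np.arange(binary.shape[0])
    medial = np.asarray(medial, dtype=float).copy()
    for _ in range(max_iter):
        new = medial.copy()
        for c in range(binary.shape[1]):
            inside = rows[binary[:, c]]
            if inside.size:  # ray hit the object: move to the midpoint
                new[c] = 0.5 * (inside.min() + inside.max())
        done = np.max(np.abs(new - medial)) < tol  # change negligible: stop
        medial = new
        if done:
            break
    return medial
```

The thickness values at convergence (half the top-to-bottom extent per node) then give the medial patch representation of Figure 45.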
5. SUMMARY
Figure 45. Caudate nucleus (CN) represented using a medial patch. Top to bottom: initial
rectangular planar medial sheet, planar sheet cropped to match CN projection, curved medial
sheet placed equidistant from the upper and lower CN boundaries. Thickness values are
associated with each node in the medial sheet yielding a 3D surface of the CN.
Figure 46. Maximum intensity projection rendering of an MRA showing the vessel crawler
in orange. See attached CD for color version.
Figure 47. Three example representations of a vascular system: directed acyclic graph
(left), a plot of the vessels with color corresponding to radial thickness (top), and a tree
representing the structure of a segmented vessel (bottom). See attached CD for color
version.
The cognitive layer is the organism’s brain, which is mainly responsible for
decision-making based on sensory input, prior anatomical knowledge, a pre-stored
segmentation plan, and interaction with other organisms. The cognitive layer
controls the organism’s perceptual capabilities by dynamically assigning on-board
and off-board sensors. It also controls the sequence of behaviors that the organisms
should adopt. The different behaviors in turn activate bodily deformations carried
out through the lower geometrical and physical layers of the organism.
The need for top–down control of the shape deformations implies a require-
ment to develop shape representation techniques that respond to intuitive and con-
trolled deformation commands (e.g., “move forward,” “bend left”). To this end,
we presented examples of pure geometrical and physically based shape represen-
tations that provide the desired deformation control capabilities. Specifically, we
described the use of medial profiles/patches and 2D/3D deformable spring mass
mesh models as examples of the geometric and physical layers of the deformable
organism framework.
We presented examples of deformable organisms designed for different medical image analysis problems (e.g., segmentation of brain structures, analysis of vasculature) and highlighted their main behavioral routines, decision-making strategies, and sensory capabilities.
We believe the layered architecture of artificial life is a promising paradigm for
medical image analysis, capable of incorporating state-of-the-art low-level image
processing algorithms, high-level anatomical expert knowledge, and advanced
planning and optimization strategies in a single modular design.
6. NOTES
1. See [28] as an example of previous work on, and motivation for, segmenting the corpus callosum.
2. Earlier mention of the use of AL in conjunction with segmentation appeared in [35, 36]. However,
as these methods rely on local, not global, decision-making, they strongly resemble traditional
region-growing methods. For a survey of other applications of AL, see [37].
7. REFERENCES
1. Yoo TS. 2004. Insight into images: principles and practice for segmentation. Wellesley, MA: AK
Peters Ltd.
2. Robb RA. 2000. Biomedical imaging. New York: Wiley-Liss.
3. Dhawan A. 2003. Medical image analysis. Wiley-IEEE Press.
4. Bankman I. 2000. Handbook of medical imaging: processing and analysis. New York: Academic
Press.
5. Sonka M, Fitzpatrick J. 2000. Handbook of medical imaging. Bellingham, WA: SPIE.
6. McInerney T, Kikinis R. 1998. An object-based volumetric deformable atlas for the improved
localization of neuroanatomy in MR images. In Proceedings of the first international conference
on medical image computing and computer-assisted intervention (MICCAI’98). Lecture Notes
in Computer Science, Vol. 1496, pp. 861–869. Berlin: Springer.
7. Shen D, Davatzikos C. 2000. An adaptive-focus deformable model using statistical and geometric
information. IEEE Trans Pattern Anal Machine Intell 22(8):906–913.
8. Tsotsos J, Mylopoulos J, Covvey H, Zucker S. 1980. A framework for visual motion understand-
ing. IEEE Trans Pattern Anal Machine Intell 2(6):563–573.
9. Draper BA, Hanson AR, Riseman EM. 1993. Learning blackboard-based scheduling algorithms
for computer vision. Int J Pattern Recognit Artif Intell 7(2):309–328.
10. Draper BA, Hanson AR, Riseman EM. 1996. Knowledge-directed vision: control, learning, and
integration. Proc IEEE Signals Symbols 84(11):1625–1637.
11. Crevier D, Lepage R. 1997. Knowledge-based image understanding systems: a survey. Comput
Vision Image Understand 67(2):161–185.
12. Strat T, Fischler M. 1991. Context-based vision: recognizing objects using information from both
2D and 3D imagery. IEEE Trans Pattern Anal Machine Intell 13(10):1050–1065.
13. Poli R. 1996. Genetic programming for image analysis. In Proceedings of the first international
conference on genetic programming, pp. 363–368. Ed JR Koza, DE Goldberg, DB Fogel, RL Riolo. Cambridge: MIT Press.
14. Martin MC. 2002. Genetic programming for real world robot vision. In Proceedings
of the international conference on intelligent robots and systems (IEEE/RSJ), pp. 67–72.
Washington, DC: IEEE. Available at http://martincmartin.com/papers/GeneticProgramming
ForRealWorldRobotVisionIROS2002Martin.pdf
15. Liu J, Tang Y. 1999. Adaptive image segmentation with distributed behavior-based agents. IEEE
Trans Pattern Anal Machine Intell 21(6):544–550.
16. Germond L, Dojat M, Taylor C, Garbay C. 2000. A cooperative framework for segmentation of
MRI brain scans. Artif Intell Med 20(1):77–93.
17. Boucher A, Doisy A, Ronot X, Garbay C. 1998. A society of goal-oriented agents for the analysis
of living cells. Artif Intell Med 14:183–199.
18. Rodin V, Harrouet F, Ballet P, Tisseau J. 1998. oRis: multiagents approach for image processing.
In Proceedings of the SPIE conference on parallel and distributed methods for image processing
II, Vol. 3452, 57–68. Ed S Hongchi, PC Coffield. Bellingham, WA: SPIE.
19. Bazen AM, van Otterlo M. 2001. A reinforcement learning agent for minutiae extraction
from fingerprints. In Proceedings of the Belgium–Netherlands Artificial Intelligence Conference
(BNAIC’01), pp. 329–336. Ed B. Kröose, M de Rijke, G Schreiber, M van Someren. Washington,
DC: IEEE Computer Society.
20. McInerney T, Terzopoulos D. 1996. Deformable models in medical image analysis: a survey.
Med Image Anal 1(2):91–108.
21. Montagnat J, Delingette H, Ayache N. 2001. A review of deformable surfaces: topology. Image
Vision Comput 19(14):1023–1040.
22. Osher S, Paragios N. 2003. Geometric level set methods in imaging vision and graphics. Berlin:
Springer.
23. Caselles V, Kimmel R, Sapiro G. 1997. Geodesic active contours. Int J Comput Vision 22(1):61–
79.
24. Sethian JA. 1996. Level set methods: evolving interfaces in geometry, computer vision and
material sciences. Cambridge: Cambridge UP.
25. Cootes TF, Cooper D, Taylor CJ, Graham J. 1995. Active shape models: their training and
application. Comput Vision Image Understand 61(1):38–59.
26. Cootes T, Beeston C, Edwards G, Taylor C. 1999. A unified framework for atlas matching us-
ing active appearance models. In Proceedings of the 17th international conference on image
processing in medical imaging (IPMI’99), pp. 322–333. Berlin: Springer.
27. Szekely G, Kelemen A, Brechbuehler Ch, Gerig G. 1996. Segmentation of 3D objects from MRI
volume data using constrained elastic deformations of flexible Fourier surface models. Med Image
Anal 1(1):19–34.
28. Lundervold A, Duta N, Taxt T, Jain A. 1999. Model-guided segmentation of corpus callosum in
MR images. In Proceedings of the IEEE computer society conference on computer vision and
pattern recognition, Vol. 1, pp. 231–237. Washington, DC: IEEE Computer Society.
29. Cohen LD. 1991. On active contour models and balloons. Comput Vision Graphics Image Process:
Image understand 53(2):211–218.
30. Xu C, Prince J. 1998. Snakes, shapes, and gradient vector flow. IEEE Trans Image Process
7(3):359–369.
31. Cootes TF, Edwards GJ, Taylor CJ. 2001. Active appearance models. IEEE Trans Pattern Anal
Machine Intell 23(1):681–685.
32. Leventon M, Grimson W, Faugeras O. 2000. Statistical shape influence in geodesic active con-
tours. In Proceedings of the IEEE computer society conference on computer vision and pattern
recognition, Vol. 1. pp. 316–323. Washington, DC: IEEE Computer Society.
33. Warfield SK, Kaus M, Jolesz FA, Kikinis R. 2000.Adaptive, template moderated, spatially varying
statistical classification. Med Image Anal 4(1):43–55.
34. Hamarneh G, Gustavsson T. 2000. Statistically constrained snake deformations. In Proceedings
of the international conference on systems, man, and cybernetics, Vol. 3, pp. 1610–1615. Wash-
ington, DC: IEEE Computer Society.
35. Choi C, Wirth M, Jennings A. 1997. Segmentation: artificial life and watershed transform ap-
proach. In Sixth international conference on image processing and its applications, Vol. 1, pp.
371–375. Washington, DC: IEEE Computer Society.
36. Kagawa H, Kinouchi M, Hagiwara M. 1999. Image segmentation by artificial life approach using
autonomous agents. In Proceedings of the international joint conference on neural networks, Vol.
6, pp. 4413–4418. Washington, DC: IEEE Computer Society.
37. Kim K-J, Cho S-B. 2006. A comprehensive overview of the applications of artificial life. Artif
Life 12(1):153–182.
38. Hamarneh G, McInerney T, Terzopoulos D. 2001. Deformable organisms for automatic medical
image analysis. In Proceedings of the fourth international conference on medical image computing
and computer-assisted intervention (MICCAI’01). Lecture Notes in Computer Science, Vol. 2208,
pp. 66–75. Berlin: Springer.
39. Terzopoulos D. 1999. Artificial life for computer graphics. Commun ACM 42(8):32–42
DEFORMABLE ORGANISMS FOR MEDICAL IMAGE ANALYSIS 441
40. Bookstein F. 1997. Morphometric tools for landmark data: geometry and biology. Cambridge:
Cambridge UP.
41. Costa L, Cesar Jr R. 2000. Shape analysis and classification: theory and practice. Boca Raton,
FL: CRC Press.
42. Dryden I, Mardia K. 1998. Statistical shape analysis. New York: John Wiley and Sons.
43. Lachaud J, Montanvert A. 1999. Deformable meshes with automated topology changes for coarse-
to-fine three-dimensional surface extraction. Med Image Anal 3(1):1–21.
44. Mandal C, Vemuri B, Qin H. 1998. A new dynamic FEM-based subdivision surface model for
shape recovery and tracking in medical images. In Proceedings of the first international conference
on medical image computing and computer-assisted intervention (MICCAI’98). Lecture Notes
in Computer Science, Vol. 1496, pp. 753–760. Berlin: Springer.
45. Miller J, Breen D, Lorensen W, O’Bara R, Wozny MJ. 1991. Geometrically deformed models:
a method for extracting closed geometric models from volume data. In Proceedings of the 18th
annual conference on computer graphics and interactive techniques (SIGGRAPH’91), Vol. 25,
pp. 217–226. New York: ACM Press.
46. Montagnat J, Delingette H. 1997.Volumetric medical image segmentation using shape constrained
deformable models. In Proceedings of the second international conference on computer vision,
virtual reality and robotics in medicine (CVRMed-MRCAS’97). Lecture Notes in Computer Sci-
ence, Vol. 1205, pp. 13–22. Berlin: Springer.
47. Barr A. 1984. Global and local deformations of solid primitives. In Proceedings of the 11th
annual conference on computer graphics and interactive techniques (SIGGRAPH’84), Vol. 18,
pp. 21–30. New York: ACM Press.
48. Coquillart S. 1990. Extended free form deformations: a sculpting tool for 3d geometric modeling.
In Proceedings of the 17th annual conference on computer graphics and interactive techniques
(SIGGRAPH’90), Vol. 24, pp. 187–196. New York: ACM Press.
49. Sederberg TW, Parry SR. 1986. Free-form deformation of solid geometric models. In Proceedings
of the 13th annual conference on computer graphics and interactive techniques (SIGGRAPH’86),
Vol. 4, pp. 151–160. New York: ACM Press.
50. Singh K, Fiume E. 1998. Wires: a geometric deformation technique. In Proceedings of the 25th
annual conference on computer graphics and interactive techniques (SIGGRAPH’98), Vol. 99,
No. 1, pp. 405–414. New York: ACM Press.
51. Terzopoulos D, Metaxas D. 1991. Dynamic 3D models with local and global deformations:
deformable superquadrics. IEEE Trans Pattern Anal Machine Intell 13(7):703–714.
52. Burel G, Henocq H. 1995. Three-dimensional invariants and their application to object recogni-
tion. Signal Process 45(1):1–22.
53. Davatzikos C, Tao X, Shen D. 2003. Hierarchical active shape models: using the wavelet trans-
form. IEEE Trans Med Imaging 22(3)414–423.
54. McInerney T, Terzopoulos D. 1995. A dynamic finite-element surface model for segmentation
and tracking in multidimensional medical images with application to cardiac 4D image analysis.
Comput Med Imaging Graphics 19(1):69–83.
55. Mortenson M. 1997. Geometric modeling. New York: Wiley.
56. Blum H. 1973. Biological shape and visual science. Theor Biol 38:205–287.
57. Attali D, Montanvert A. 1997. Computing and simplifying 2D and 3D continuous skeletons.
Comput Vision Image Understand 67(3):261–273.
58. Borgefors G, Nystrom I, Baja GSD. 1999. Computing skeletons in three dimensions. Pattern
Recognit 32:1225–1236.
59. Bouix S, Dimitrov P, Phillips C, Siddiqi K. 2000. Physics-based skeletons. In Proceedings of the
vision interface conference, pp. 23–30.
60. Dimitrov P, Damon J, Siddiqi K. 2003. Flux invariants for shape. In Proceedings of the IEEE
computer society conference on computer vision and pattern recognition (CVPR’03), pp. 835–
841. Washington, DC: IEEE Computer Society.
442 GHASSAN HAMARNEH and CHRIS MCINTOSH
61. Fritsch D, Pizer S, Yu L, Johnson V, Chaney E. 1997. Segmentation of medical image objects
using deformable shape loci. In Proceedings of the 15th international conference on information
processing in medical imaging (IPMI’97). Lecture notes i n computer science, Vol. 1230,
pp. 127–140. Berlin: Springer.
62. Hamarneh G, McInerney T. 2003. Physics-based shape deformations for medical image analysis.
In Proceedings of the 2nd SPIE conference on image processing: algorithms and systems (SPIE-
IST’03), Vol. 5014, pp. 354–362. Ed ER Dougherty, JT Astola, KO Egiazarian.
63. Hamarneh G, Abu-Gharbieh R, McInerney T. 2004. Medial profiles for modeling deformation
and statistical analysis of shape and their use in medical image segmentation. Int J Shape Model
10(2)187–209.
64. Leymarie F, Levine MD. 1992. Simulating the grassfire transform using an active contour model.
IEEE Trans Pattern Anal Machine Intell 14(1):56–75.
65. Pizer S, Fritsch D. 1999. Segmentation, registration, and measurement of shape variation via
image object shape. IEEE Trans Med Imaging 18(10):851–865.
66. Pizer S, Gerig G, Joshi S, Aylward SR. 2003. Multiscale medial shape-based analysis of image
objects. Proc IEEE 91(10):1670–1679.
67. Sebastian T, Klein P, Kimia B. 2001. Recognition of shapes by editing shock graphs. In Proceed-
ings of the international conference on computer vision (ICCV’01), pp. 755–762. Washington,
DC: IEEE Computer Society.
68. Siddiqi K, Bouix S, Tannenbaum A, Zucker S. 2002. Hamilton-Jacobi skeletons. Int J Comput
Vision 48(3):215–231.
69. Duta N, Sonka M, Jain A. 1999. Learning shape models from examples using automatic shape
clustering and Procrustean analysis. In Proceedings of the 17th international conference on in-
formation processing in medical imaging (IPMI’99). Lecture Notes in Computer Science, Vol.
1613, pp. 370–375.
70. Yushkevich P, Joshi S, Pizer SM, Csernansky J, Wang L. 2003. Feature selection for shape-
based classification of biological objects. In Proceedings of the 17th international conference on
information processing in medical imaging (IPMI’03). Lecture Notes in Computer Science, Vol.
2732, pp. 114–125.
71. Styner M, Gerig G, Lieberman J, Jones D, Weinberger D. 2003. Statistical shape analysis of
neuroanatomical structures based on medial models. Med Image Anal 7(3):207–220.
72. Lu C, Pizer S, Joshi S. 2003. A Markov random field approach to multiscale shape analysis.
In Proceedings of the conference on scale space methods in computer vision. Lecture Notes
in Computer Science, Vol. 2695, pp. 416–431. Available at http://midag.cs.unc.edu/pubs/papers/
ScaleSpace03 Lu shape.pdf.
73. Grenander U. 1963. Probabilities on algebraic structures. New York: Wiley.
74. Fletcher P, Conglin L, Pizer S, Joshi S. 2004. Principal geodesic analysis for the study of nonlinear
statistics of shape. IEEE Trans Med Imaging 23(8):995–1005.
75. Bill JR, Lodha SK. 1995. Sculpting polygonal models using virtual tools. In Proceedings of the
conference on the graphics interface, pp. 272–279. New York: ACM Press.
76. O’Donnell T, Boult T, Fang X, Gupta A. 1994. The extruded generalized cylinder: a deformable
model for object recovery. In Proceedings of the IEEE computer society conference on com-
puter vision and pattern recognition (CVPR’94), pp. 174–181. Washington, DC: IEEE Computer
Society.
77. Perona P, Malik J. 1990. Scale-space and edge detection using anisotropic diffusion. IEEE Trans
Pattern Anal Machine Intell 12(7)629–639.
78. Weickert J. 1998. Anisotropic diffusion in image processing. ECMI Series. Stuttgart: Teubner.
79. Frangi AF, Niessen WJ, Vincken KL, Viergever MA. 1998. Multiscale vessel enhancement fil-
tering. In Proceedings of the first international conference on medical image computing and
computer-assisted intervention (MICCAI’98). Lecture Notes in Computer Science, Vol. 1496,
pp. 130–137. Berlin: Springer.
DEFORMABLE ORGANISMS FOR MEDICAL IMAGE ANALYSIS 443
80. Hamarneh G, McIntosh C. 2005. Physics-based deformable organisms for medical image analysis.
In Proceedings of the SPIE conference on medical imaging: image processing, Vol. 5747, pp.
326–335. Bellingham, WA: SPIE.
81. Kimme C, Ballard DH, Sklansky J. 1975. Finding circles by an array of accumulators. Commun
Assoc Comput Machinery 18:120–122.
82. Bullitt E, Gerig G, Aylward SR, Joshi SC, Smith K, Ewend M, Lin W. 2003. Vascular attributes
and malignant brain tumors. In Proceedings of the sixth international conference on medical
image computing and computer-assisted intervention (MICCAI’03). Lecture Notes in Computer
Science, Vol. 2878, pp. 671–679. Berlin: Springer.
83. Terzopoulos D. 1986. On matching deformable models to images. Technical Report 60, Schlum-
berger Palo Alto Research. Reprinted in Topical Meeting MachineVision, Technical Digest Series
12:160–167 (1987).
84. Kass M, Witkin A, Terzopoulos D. 1987. Snakes: active contour models. Int J Comput Vision
1(4):321–331.
85. McInerney T, Hamarneh G, Shenton M, Terzopoulos D. Deformable organisms for automatic
medical image analysis. J Med Image Anal 6(3):251–266.
86. C. McIntosh and G. Hamarneh, Vessel Crawlers: 3D Physically-based Deformable Organisms
for Segmentation and Analysis of Tubular Structures in Medical Images. IEEE Conference on
Computer Vision and Pattern Recognition, 2006, pp. 1084–1091.
13

PDE-BASED THREE-DIMENSIONAL PATH PLANNING

M. Sabry Hassouna and Aly A. Farag
Computer Vision and Image Processing Laboratory,
University of Louisville, Louisville, Kentucky, USA

Robert Falk
Department of Medical Imaging, Jewish Hospital
Louisville, Kentucky, USA
Three-dimensional medial curves (MC) are an essential component of any virtual en-
doscopy (VE) system, because they serve as flight paths for a virtual camera to navigate
the human organ and to examine its internal views. In this chapter, we propose a novel
framework for inferring stable continuous flight paths for tubular structures using partial
differential equations (PDEs). The method works in two passes. In the first pass, the over-
all topology of the organ is analyzed and its important topological nodes identified. In
the second pass, the organ’s flight paths are computed by tracking them starting from each
identified topological node. The proposed framework is robust, fully automatic, computa-
tionally efficient, and computes medial curves that are centered, connected, thin, and less
sensitive to boundary noise. We have extensively validated the robustness of the proposed
method both quantitatively and qualitatively against several synthetic 3D phantoms and
clinical datasets.
Address all correspondence to: Dr. Aly A. Farag, Professor of Electrical and Computer Engineering,
University of Louisville, CVIP Lab, Room 412, Lutz Hall, 2301 South 3rd Street, Louisville, KY
40208, USA. Phone: (502) 852-7510, Fax: (502) 852-1580. [email protected].
446 M. SABRY HASSOUNA et al.
1. INTRODUCTION
2. PREVIOUS WORK
field and the Dijkstra algorithm [24] to locate an initial MC, which is then refined iteratively by discarding its redundant voxels. The penalty factor is specified heuristically by the user for each dataset, which prevents full automation of the algorithm. In addition, the method requires modification to handle complex tree structures.
Paik et al. [25] suggested a different approach for extracting MC, where a
geodesic path is found on the surface of a tubular structure and is later centered by
applying an iterative thinning procedure. The algorithm is sensitive to boundary
noise, difficult to apply to complex tubular structures, and requires user interaction.
Siddiqi et al. [26] extracted MC in two steps. Initially, medial surfaces were
computed by combining a thinning algorithm with the average outward flux mea-
sure to distinguish medial voxels from others. Then they applied an ordered thin-
ning method [27] based on the Euclidean distance field to extract MC. They
presented very good results for vascular trees. The method requires a postprocess-
ing step to clean unwanted branches caused by thinning as well as to convert the
computed skeleton into a graph for diagnoses and navigation.
Telea and Vilanova [28] resampled the volumetric object along the three directions x, y, and z, yielding three separate stacks of cross-sections of the same object. They then applied a level set-based centerline extraction method to compute the medial points of each 2D cross-section in each volume independently. Finally, the resulting sets of medial points were merged (intersected) to yield the final skeleton. The algorithm involves a number of heuristics with no theoretical
justification. Also, the convergence of their method is not guaranteed because the
actual topology of a 3D object cannot be interpreted from its 2D cross-sections in
different directions.
The work by Deschamps et al. [29] is the most closely related work to the one
presented in this chapter. The user selects the endpoints of a branch of interest,
and then a fast marching front propagates from the starting point with a speed
determined by a scalar potential that depends upon location in the medium until
it reaches the endpoint. Minimal paths are extracted using gradient descent meth-
ods. They presented very nice results for single tube structures such as the colon.
However, for tree structures their method generates trajectories near branching
nodes.
in the object is assumed to start a new MC. The resulting skeleton is usually
disconnected and requires postprocessing for reconnection. Although generalizing
the method to handle non-polygonal data is possible, computation of the potential
function becomes very expensive because the potential value at an internal voxel
of the object is a function of all the voxels of the boundary. Also, for a noisy
boundary the method will generate a large number of unwanted branches since no
method is presented to identify the starting points of each MC.
Cornea et al. [31] extended [30] to handle non-polygonal data: they computed the critical points and low-divergence points of the force field, as well as the high-curvature boundary points, to identify the starting points of MC. Again, computation of the potential field is still very expensive; the method takes half an hour to compute the potential of a 200³ object. Also, identification of critical field points may over- or underestimate their true number, and hence misidentify the exact topological nodes of the organ.
Ma et al. [32] used radial basis functions (RBFs) to build a new potential field and then applied a gradient descent algorithm to locate local extremes of the RBF (branching and end nodes). Constructing an RBF that preserves the shape properties of an arbitrary 3D model remains an open problem.
Wu et al. [33] regarded the faces of a 3D model as charged planes. Negatively charged seed points are initialized at the model vertices and then pushed toward local minimum positions by the electrostatic force. The local minimum positions are connected to complete the skeleton. This method is computationally very expensive and requires user interaction to remove undesired connections.
Existing 3D flight path extraction techniques suffer from at least one of the following limitations: (1) MC are extracted from medial surfaces by pruning, so the accuracy of the former depends on the extraction of the latter; (2) manual interaction is required to select the starting point of each medial curve; (3) the computation is expensive; (4) heuristics dedicated to a specific application are required to handle branching nodes, leading to a lack of generality; and (5) sensitivity to boundary noise.
Consider the minimum-cost path problem: find the path C(s) : [0, ∞) → Rⁿ that minimizes the cumulative travel cost from a starting point A to some destination B in Rⁿ. If the cost U is only a function of the location x in the given domain, then the cost function is called isotropic, and the minimum cumulative cost at x
Figure 1. The dark line is a medial curve MC, while light gray lines are wavefronts. Lattice
voxels are represented by solid dots.
is defined as

T(x) = min_{C_{Ax}} ∫₀ᴸ U(C(s)) ds,   (1)

where C_{Ax} is the set of all paths linking A to x. The path length is L, and the starting and ending points are C(0) = A and C(L) = x, respectively. The path that attains the minimum integral is the minimum-cost path. In geometrical optics, it has been proved that the solution of Eq. (1) satisfies the eikonal equation

|∇T(x)| = U(x),   (2)

where T(x) is the minimum arrival time of a wave as it crosses a point x, and the speed is given by the reciprocal of the cost function:

F(x) = 1/U(x).   (3)
In this work we derive a new speed function such that the minimum cost path
between two medial points is a medial curve.
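For intuition, the minimum cumulative cost T(x) of Eq. (1) can be approximated on a discrete grid by a Dijkstra-style expansion that accumulates the cost U along grid steps. This is only an illustrative sketch (the function names, the 4-connected moves, and the trapezoidal cost accumulation are our assumptions); the chapter itself solves the continuous eikonal equation with fast marching, which avoids the metrication error of pure graph search.

```python
import heapq

def min_cost_field(U, start):
    """Approximate T(x): minimum cumulative cost from `start` to every
    cell of a 2D cost grid U, via Dijkstra over 4-connected moves."""
    rows, cols = len(U), len(U[0])
    T = [[float("inf")] * cols for _ in range(rows)]
    T[start[0]][start[1]] = 0.0
    heap = [(0.0, start)]
    while heap:
        t, (r, c) = heapq.heappop(heap)
        if t > T[r][c]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # cost of a unit step: average of U at the two cells
                nt = t + 0.5 * (U[r][c] + U[nr][nc])
                if nt < T[nr][nc]:
                    T[nr][nc] = nt
                    heapq.heappush(heap, (nt, (nr, nc)))
    return T
```

With a uniform cost U ≡ 1, T reduces to the Manhattan distance from the start cell, e.g., T = 4 at the far corner of a 3 × 3 grid.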
λ(A) = h (7)
λ(Bi ) = h − δ, (8)
Let

r = √(∆²x + ∆²y)/min(∆x, ∆y);   (14)

then

r g(h − δ) − g(h) < 0.   (15)
Applying a Taylor series expansion to g(h − δ) and substituting in Eq. (15),

r [g(h) − δ dg(h)/dh + ε(δ)] − g(h) < 0,   (16)

where

ε(δ) = Σ_{k=2}^{∞} (−1)^k (δ^k/k!) (d^k g/dh^k).   (17)
Assume for now that ε(δ) is small enough that it can be ignored without affecting inequality (16); later we will find the condition that justifies this assumption. Then

r [g(h) − δ dg(h)/dh] − g(h) < 0.   (18)
By rearranging Eq. (18),

dg(h)/g(h) > ((r − 1)/(rδ)) dh.   (19)
Let

α1 = (r − 1)/(rδ) = (1/δ) (1 − min(∆x, ∆y)/√(∆²x + ∆²y)) > 0.   (20)
By integrating both sides of Eq. (19) for a given x, we get

∫_{λmin}^{λ(x)} dg(h)/g(h) > ∫_{λmin}^{λ(x)} α1 dh,   (21)

ln g(λ(x)) − ln g(λmin) > α1 (λ(x) − λmin),   (22)

ln F(x) > α1 λ(x) − (α1 λmin − ln Fmin).   (23)

Let

ζ = α1 λmin − ln Fmin.   (24)

Then,

F(x) > exp(α1 λ(x) − ζ) = exp(−ζ) exp(α1 λ(x)).   (25)
There are an infinite number of speed functions that satisfy Eq. (25), among which we pick one that allows us to neglect the effect of ε(δ) without altering inequality (16).
PDE-BASED THREE DIMENSIONAL PATH PLANNING 453
which requires,

By restricting the minimum value of the speed to unity, the equation always holds. Therefore, Eq. (26) reduces to the proposed speed model

F(x) = exp(α λ(x)),   (29)

where α = α1 + α2.
Now, let us find the condition that satisfies ε(δ) → 0. By substituting the proposed speed function, Eq. (29), into Eq. (16),

r [exp(αh) − δα exp(αh) + (δ²α²/2!) exp(αh) − · · ·] − exp(αh) < 0,   (30)

r exp(αh) [1 − δα + δ²α²/2! − · · ·] − exp(αh) < 0.   (31)
Since

exp(−δα) = 1 − δα + δ²α²/2! − · · · ,   (32)

then

r exp(αh) exp(−δα) − exp(αh) < 0,   (33)

r exp(−δα) < 1,   (34)

α > (1/δ) ln(√(∆²x + ∆²y)/min(∆x, ∆y)),   (35)

which is the necessary condition to satisfy Eq. (18). Under the proposed speed model, Eq. (29), all medial voxels move faster than non-medial ones. Since the cost function U(x) is the reciprocal of the speed, the voxels of a MC are those with minimal cost value. Therefore, a MC is a minimal-cost path.
For a 3D lattice, derivation of the speed is straightforward. The value of α is given by

α > (1/δ) ln(√(∆²x + ∆²y + ∆²z)/min(∆x, ∆y, ∆z)).   (36)
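A small helper (hypothetical name; a sketch of Eq. (36), with δ defaulting to the minimum voxel spacing as in Eq. (49)) computes the critical value of α for a given lattice:

```python
import math

def critical_alpha(dx, dy, dz=None, delta=None):
    """Critical alpha of Eq. (36): alpha must exceed
    (1/delta) * ln(||spacing|| / min(spacing)).
    For a 2D lattice, omit dz."""
    spacing = [dx, dy] + ([dz] if dz is not None else [])
    if delta is None:
        delta = min(spacing)  # Eq. (49): delta = minimum voxel spacing
    norm = math.sqrt(sum(s * s for s in spacing))
    return math.log(norm / min(spacing)) / delta
```

For isotropic unit voxels this gives ln √3 ≈ 0.549 in 3D (the value quoted in the validation section) and ln √2 ≈ 0.347 in 2D; any α above the critical value, e.g., the α = 0.7 fixed in the experiments, keeps r·exp(−αδ) < 1 and hence preserves inequality (15).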
Figure 2. A medial curve MC intersects the propagating fronts at those voxels of maximum
positive curvatures. See attached CD for color version.
C(s + h) = C(s) − h ∇T/|∇T|.   (39)
This formula is usually applied in the following way. We choose a step size h,
and construct the sequence s0 , s0 + h, s0 + 2h, etc. Let Cn be the numerical
estimate of the exact solution C(sn ), and then the solution can be computed by
the following recursive scheme:
Cn+1 = Cn − h ∇T/|∇T|,  given C(0) = B.   (40)
The Euler method is fast but less accurate. Its total accumulated error is on the
order of O(h).
f(Cn) = −∇T/|∇T|,   (41)

C̄n+1 = Cn + h f(Cn),

Cn+1 = Cn + (h/2) (f(Cn) + f(C̄n+1)).
The Heun method is slower but more accurate than the Euler method. The total
accumulated error is on the order of O(h2 ).
Cn+1 = Cn + (h/6)(k1 + 2k2 + 2k3 + k4),   (42)
where

k1 = f(Cn),   (43)
k2 = f(Cn + (h/2) k1),
k3 = f(Cn + (h/2) k2),
k4 = f(Cn + h k3).
The total accumulated error is on the order of O(h⁴). For the Heun and Runge-Kutta methods, the gradient is evaluated at non-lattice voxels, which requires bilinear/trilinear interpolation in 2D/3D spaces.
Although the fourth-order Runge-Kutta method is the most accurate of these schemes, the gain in accuracy is sometimes lost to interpolation at non-lattice voxels. Therefore, we use the second-order Runge-Kutta (Heun) method because it strikes a balance between accuracy and efficiency.
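The descent schemes of Eqs. (39)-(41) can be sketched on a toy analytic arrival-time field T(x, y) = x² + y² (an assumption for illustration only; in the actual framework ∇T comes from the fast marching solution via bilinear/trilinear interpolation):

```python
import math

def grad_T(p):
    """Gradient of the toy arrival-time field T(x, y) = x^2 + y^2,
    whose minimum (the 'source') is the origin."""
    x, y = p
    return (2.0 * x, 2.0 * y)

def f(p):
    """Descent direction: -grad T / |grad T| (Eq. 41)."""
    gx, gy = grad_T(p)
    n = math.hypot(gx, gy)
    return (-gx / n, -gy / n)

def euler_step(p, h):
    dx, dy = f(p)
    return (p[0] + h * dx, p[1] + h * dy)

def heun_step(p, h):
    # predictor (Euler step), then trapezoidal corrector
    d1 = f(p)
    pred = (p[0] + h * d1[0], p[1] + h * d1[1])
    d2 = f(pred)
    return (p[0] + 0.5 * h * (d1[0] + d2[0]),
            p[1] + 0.5 * h * (d1[1] + d2[1]))

def backtrack(step, p, h, n_steps):
    """Follow -grad T from p toward the source for n_steps steps."""
    path = [p]
    for _ in range(n_steps):
        p = step(p, h)
        path.append(p)
    return path
```

On this radial field every unit-direction step shrinks the distance to the source by exactly h, so Euler and Heun coincide; on a real interpolated T field, Heun's predictor-corrector averaging is what buys the O(h²) accuracy.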
This medialness assigns each object voxel x its minimum distance from the boundary Γ:

D(x) = min_{y∈Γ} ∥x − y∥,   (46)

which can be approximated discretely using the chamfer (3,4,5) metric [19], or continuously using the fast marching methods [20], which are more accurate than the discrete approximation. If a wave propagates from the object's boundary with unit speed, the arrival-time solution of the eikonal equation is directly proportional to D(x). In this chapter, D(x) is computed by solving Eq. (47) using the proposed multistencil fast marching method (see Chapter 9):

|∇D(x)| = 1.   (47)
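A brute-force sketch of this medialness (our own illustrative code, not the chapter's multistencil fast marching solver, which is far more efficient) simply takes, for each cell, the minimum Euclidean distance to a set of boundary cells Γ:

```python
import math

def distance_field(boundary, shape):
    """Brute-force D(x): minimum Euclidean distance from each 2D grid
    cell to the boundary set Γ. O(N * |Γ|); illustrative only."""
    rows, cols = shape
    D = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Eq. (46): D(x) = min over boundary points of ||x - y||
            D[r][c] = min(math.hypot(r - br, c - bc) for br, bc in boundary)
    return D
```

For a 5 × 5 grid whose border cells form Γ, the center cell gets D = 2, its neighbors D = 1, exactly the ridge structure the medialness exploits.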
Similarly, in 3D
δ = min(∆x, ∆y, ∆z). (49)
This medialness is suitable for extracting centerlines of arbitrary 2D shapes as well
as MC of 3D tubular objects, whose cross-section is nearly or perfectly circular.
Let D̂1(x) be the discretized version of the floating-point distance field D1(x):

D̂1(x) = round(D1(x)).   (52)
D̂1 converts the object with voxels as its basic elements into an object with clusters as its new basic elements (a cluster graph). Each cluster consists of connected voxels with the same D̂1 value; therefore, there can be more than one cluster with the same D̂1 value if they are not adjacent. Two voxels are said to be connected if they share a face, an edge, or a vertex (26-connected). Two clusters c1 and c2 are adjacent if a voxel in c1 shares a face with a voxel in c2 (6-connected).
In the cluster graph, each cluster is represented by a node and adjacent clusters by links. The root of the graph is the cluster containing PS, with zero cluster value, followed by clusters of increasing D̂1 value. The cluster graph contains three types of clusters: extreme clusters (Xcluster), which exist at the tails of the graph; branching clusters (Bcluster), which have at least two adjacent clusters with the same D̂1 value greater than that of the Bcluster; and merging clusters (Mcluster), which have at least two adjacent clusters (successors) with the same D̂1 value lower than that of the Mcluster. Merging clusters exist only if the object contains holes (loops).
The medial voxel of a cluster is computed by searching the cluster for the
voxel with maximum D(x). Extreme and merging nodes are the medial voxels of
the associated clusters. Figure 3a shows the cluster graph of a tree structure with
one loop, where X, M , and S represent extreme, merging, and successor clusters,
respectively.
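The clustering step can be sketched as a connected-components pass over the rounded distance field. The 2D, 8-connected version below stands in for the chapter's 3D, 26-connected case (the function names and the 2D simplification are our assumptions):

```python
from collections import deque

def build_clusters(D1_hat):
    """Group grid cells into clusters: connected components of cells
    sharing the same rounded distance value. 8-connectivity in this
    2D sketch stands in for 26-connectivity in 3D."""
    rows, cols = len(D1_hat), len(D1_hat[0])
    label = [[-1] * cols for _ in range(rows)]
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if label[r][c] != -1:
                continue
            v = D1_hat[r][c]
            comp, q = [], deque([(r, c)])
            label[r][c] = len(clusters)
            while q:  # BFS over equal-valued neighbors
                cr, cc = q.popleft()
                comp.append((cr, cc))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and label[nr][nc] == -1
                                and D1_hat[nr][nc] == v):
                            label[nr][nc] = len(clusters)
                            q.append((nr, nc))
            clusters.append((v, comp))
    return clusters
```

Adjacency links between clusters (face-sharing, 6-connected in 3D) would then be collected by scanning each component's border, from which the extreme, branching, and merging clusters are identified.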
T = Σ_{i=1}^{N−1} ∆ti,   (53)

where ∆ti and d(·, ·) are the travel time and Euclidean distance between two neighboring medial voxels, respectively:

∆ti = d(xi−1, xi)/F(xi).   (54)
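Equations (53)-(54) amount to the short accumulation below (the names are ours; `speed` plays the role of F, evaluated at xi as in Eq. (54)):

```python
import math

def travel_time(path, speed):
    """Total travel time T along a discrete medial curve (Eq. 53):
    sum of Euclidean inter-voxel distances divided by the local
    speed F(x_i) (Eq. 54)."""
    return sum(math.dist(a, b) / speed(b) for a, b in zip(path, path[1:]))
```

With the proposed model F(x) = exp(βλ(x)), fast (medial) voxels contribute very little travel time, which is why β is bounded by τ in Eqs. (55)-(57) so that individual ∆ti do not vanish numerically.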
Figure 3. (a) Cluster Graph. (b) Medial curves around a loop. See attached CD for color
version.
Figure 4. (a) The MC of an arbitrary shape consists of N medial voxels. (b) Cluster graph
of the same shape. See attached CD for color version.
By restricting the value of ∆ti to be greater than a certain value τ, where 0 < τ < 1, then

τ ≤ ∆ti,   (55)

τ ≤ d(xi−1, xi)/exp(βλ(xi)),   (56)

β ≤ (1/λ(xi)) ln(d(xi−1, xi)/τ).   (57)
The worst-case scenario for the right-hand side of Eq. (57) occurs when λ(x) = λmax and

d(xi−1, xi) = min(∆x, ∆y).   (58)

Let βc be the critical value of β; then

βc = (1/λmax) ln(min(∆x, ∆y)/τ).   (59)
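Equation (59) and its 3D counterpart, Eq. (62), can be sketched as follows (hypothetical helper name):

```python
import math

def critical_beta(lambda_max, tau, dx, dy, dz=None):
    """Critical beta of Eq. (59) (2D) / Eq. (62) (3D): the largest beta
    for which the worst-case inter-voxel travel time still meets tau."""
    spacing = [dx, dy] + ([dz] if dz is not None else [])
    return math.log(min(spacing) / tau) / lambda_max
```

At β = βc the worst-case step time, min(∆x, ∆y, ∆z)/exp(βc λmax), equals exactly τ; any smaller β keeps every ∆ti above τ.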
4.5.3. MC of Loops
Our framework extracts one MC for each tunnel or cavity the object may have.
Each loop in the object is associated with one merging cluster M of the cluster
graph. For illustration, let M have only two successors S1 and S2 , as shown in
Figure 3b. In order to extract the MC of this loop, three steps are required. In the first step, we compute the medial voxel s1 of S1 and consider the entire set of voxels of both M and S2 as part of the object's background (constructing holes) such that there is a unique MC from s1 to PS; we then propagate a fast wave from PS until s1 is reached and extract the MC between them. In the second step, we extract the MC between s2 and PS in a similar fashion, except that we consider the entire set of voxels of both M and S1 as part of the object's background and those of S2 as part of the object's foreground. In the third step, we propagate a fast wave from s1 until s2 is reached and then extract the MC between them. The same concept generalizes to a merging cluster with any number of successors.
PS ; (3) propagate a moderate speed wave from PS and solve for the new distance
field D1 (x); (4) discretize D1 (x) to obtain D̂1 (x) and then construct the cluster
graph; (5) identify the extreme and merging nodes; (6) construct a new distance
field from PS by propagating a fast speed wave and solve for the new distance
field D2(x); (7) if the object contains loops, extract their MC; and, finally, (8) extract those MC that originate from extreme nodes and end either at PS or on a previously extracted path. The pseudocode of the proposed framework is presented in Algorithm 1.
Figure 5. 3D synthetic shapes of different complexity: (a) spiral, (b) simple tree (in-plane),
and (c) complex tree.
voxel, which leads to disconnected MC. Since δ = min(∆x, ∆y, ∆z), for an isotropic voxel size (∆x = ∆y = ∆z = 1.0) the critical value is αc = ln √3 ≈ 0.549.
In this experiment, we study the accuracy of the analytical estimation of α by
manually changing the value of α from 0 to 1 in steps of 0.1. In each step, we
compute the L1 error between the computed MC and the ground truth for several
3D synthetic shapes. We then plot the L1 error versus α and determine the range
of α (Rα ) that corresponds to the steady-state range of the minimum L1 error, as
shown in Figure 6. Finally, we check if the estimated αc is within that range. In
Table 1 we show the computed L1 error under different values of α of various
synthetic shapes: spiral (Figure 5a), simple tree (Figure 5b), and complex tree
(Figure 5c). It is clear from Figure 6 that the estimated αc always falls within Rα. Also, there is a wide stable range for α, namely αc ≤ α ≤ 1.0. In our experiments we automated the proposed framework by fixing α = 0.7.
βc = (1/λmax) ln(min(∆x, ∆y, ∆z)/τ).   (62)
Table 1. Computed L1 error of various synthetic shapes under different values of α

α                  0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
L1 (Spiral)        0.451  0.335  0.328  0.327  0.326  0.324  0.322  0.322  0.317  0.317
L1 (Simple Tree)   0.388  0.247  0.186  0.176  0.179  0.175  0.173  0.172  0.175  0.174
L1 (Complex Tree)  0.535  0.405  0.381  0.371  0.365  0.361  0.361  0.359  0.357  0.359
Figure 6. Computed L1 error of various synthetic shapes under different values of α. See
attached CD for color version.
Figure 7. Cross-sections in the cluster graph of a 3D synthetic shape (double donuts) under
different values of τ .
Figure 8. Cross-sections in the cluster graph of a 3D synthetic shape (Tree) under different
values of τ .
Figure 9. Each extreme node is identified not from its corresponding extreme cluster Ai
but from its successor neighbor Ai−1 . See attached CD for color version.
Figure 10. Spiral synthetic shape: (a) smooth surface (0% noise level), (b) rough surface
(100% noise level), (c) MC of smooth surface, (d) MC of rough surface. See attached
CD for color version.
Figure 11. Three-connected-donuts synthetic shape: (a) smooth surface (0% noise level), (b) rough surface (100% noise level), (c) MC of smooth surface. See attached CD for color version.
For noise-free synthetic shapes, the L1 and L∞ errors never exceeded 0.37 and 1.73 mm (about one voxel), respectively. In the presence of severe noise levels, the L1 and L∞ errors never exceeded 0.47 and 2.23 mm (about two voxels), respectively, which is quite acceptable for flight paths; therefore, the proposed method has low sensitivity to boundary noise.
It is worth noting that under some noise levels some error measures decrease rather than increase, because the noise profile becomes symmetrically distributed around the object, and hence the centeredness of the computed MC is not altered.
6. RESULTS
We have also validated the proposed method qualitatively against several clinical datasets, as shown in Figure 12. Notice the complexity of the clinical datasets and the accuracy of the computed MC, especially around loops and near branching and merging nodes.
We have implemented the proposed method in C++ on a single 400-MHz SGI InfiniteReality supercomputer. The volume sizes and running times in seconds of the tested datasets are listed in Table 4.
Figure 12. Computed MC of clinical datasets. See attached CD for color version.
7. CONCLUSION

In this chapter we have presented a robust, fully automatic, and fast method for computing flight paths of tubular structures for virtual endoscopy applications.
The computed flight paths enjoy several advantageous features, such as being
centered, connected, thin, and less sensitive to noise. Unlike previous methods,
our technique can handle complex anatomical structures with an arbitrary number
of loops, can extract only part of the skeleton given the starting and ending voxels
of the medial curve to be computed, and, finally, does not require voxels to be of
isotropic size because it takes voxel data spacing into account. The robustness of
the proposed method is demonstrated by correctly extracting all the MC of the
tested clinical datasets as well as successful validation against synthetic phantoms
of different complexity. In the future we intend to implement the proposed method on the graphics processing unit (GPU) of a computer graphics card to reduce computational time.
8. REFERENCES
1. Baert AL, Sartor K. 2001. Virtual endoscopy and related 3D techniques. Berlin: Springer.
2. Buthiau D, Khayat D. 2003. Virtual endoscopy. Berlin: Springer.
3. Zhou Y, Toga AW. 1999. Efficient skeletonization of volumetric objects. IEEE Trans Visualiz Comput Graphics 5(3):196–209.
4. Bitter I, Kaufman AE, Sato M. 2001. Penalized-distance volumetric skeleton algorithm. IEEE
Trans Visualiz Comput Graphics 7(3):195–206.
5. Ma CM, Sonka M. 1996. A fully parallel 3d thinning algorithm and its applications. Comput
Vision Image Understand 64:420–433.
6. Svensson S, Nyström I, Sanniti di Baja G. 2002. Curve skeletonization of surface-like objects in
3d images guided by voxel classification. Pattern Recognit Lett, 23(12):1419–1426.
7. Deschamps T. 2001. Curve and shape extraction with minimal path and level-sets techniques:
applications to 3D medical imaging. PhD dissertation, Université Paris, IX Dauphine.
8. Bouix S, Siddiqi K, Tannenbaum A. 2003. Flux driven fly throughs. In Proceedings of the IEEE
computer society conference on computer vision and pattern recognition, pp. 449–454. Washing-
ton, DC: IEEE Computer Society.
9. Hassouna MS, Farag AA. 2005. PDE-based three-dimensional path planning for virtual en-
doscopy. In: Information processing in medical imaging: 19th international conference, IPMI
2005, pp. 529–540. Lecture Notes in Computer Science, Vol. 3565. Berlin: Springer.
10. Hassouna MS, Farag AA, Falk R. 2005. Differential fly-throughs (DFT): a general framework
for computing flight paths. In Medical image computing and computer-assisted intervention:
MICCAI 2005: 8th international conference, pp. 26–29. Berlin: Springer.
11. Ma CM. 1995. A fully parallel thinning algorithm for generating medial faces. Pattern Recognit
Lett 16:83–87.
12. Tsao YF, Fu KS. 1981. A parallel thinning algorithm for 3d pictures. Comput Graphics Image
Process 17:315–331.
474 M. SABRY HASSOUNA et al.
13. Saha PK, Majumder DD. 1997. Topology and shape preserving parallel thinning for 3d digital
images: a new approach. In Proceedings of the 9th international conference on image analysis
and processing, Vol. 1, pp. 575–581. Lecture Notes in Computer Science, Vol. 1310. Berlin:
Springer.
14. Palagyi K, Kuba A. 1997. A parallel 12-subiteration 3d thinning algorithm to extract medial lines.
In Proceedings of the 7th international conference on computer analysis of images and patterns,
pp. 400–407. Lecture Notes in Computer Science, Vol. 1296. Berlin: Springer.
15. Lohou C, Bertrand G. 2004. A 3d 12-subiteration thinning algorithm based on p-simple points.
Discr Appl Math 139(1–3):171–195.
16. Manzanera A, Bernard TM, Prêteux F, Longuet B. 1999. A unified mathematical framework for
a compact and fully parallel n-d skeletonization procedure. Proc SPIE, 3811:57–68.
17. Palagyi K, Kuba A. 1999. Directional 3d thinning using 8 subiterations. In Proceedings of the 8th
international conference on discrete geometry for computer imagery (DCGI ’99). Lecture Notes
in Computer Science, Vol. 1568, pp. 325–336. Berlin: Springer.
18. Gong W, Bertrand G. 1990. A simple parallel 3d thinning algorithm. In Proceedings of 10th
International Conference on Pattern Recognition, 1990, pp. 188–190. Washington, DC: IEEE.
19. Borgefors G. 1986. Distance transformations in digital images. Comput Vision Graphics Image
Process 34:344–371.
20. Adalsteinsson D, Sethian J. 1995. A fast level set method for propagating interfaces. J Comput
Phys 118:269–277.
21. Gagvani N, Silver D. 1999. Parameter-controlled volume thinning. Graphical Models Image
Process 61(3):149–164.
22. Bitter I, Sato M, Bender M, McDonnell KT, Kaufman A, Wan M. 2000. Ceasar: a smooth, accurate
and robust centerline extraction algorithm. In Proceedings of the Visualization ’00 conference,
pp. 45–52. Washington, DC: IEEE Computer Society.
23. Sato M, Bitter I, Bender MA, Kaufman AE, Nakajima M. 2000. Teasar: Tree-structure-extraction
algorithm for accurate and robust skeletons. In Proceedings of the 8th Pacific conference on
computer graphics and applications (PG ’00), p. 281. Washington, DC: IEEE Computer Society.
24. Dijkstra EW. 1959. A note on two problems in connexion with graphs. Num Math 1:269–271.
25. Paik DS, Beaulieu CF, Jeffrey RB, Rubin GD, Napel S. 1998. Automated path planning for virtual
endoscopy. Med Phys 25(5):629–637.
26. Dimitrov P, Damon JN, Siddiqi K. 2003. Flux invariants for shape. In Proceedings of 2003 IEEE
computer society conference on computer vision and pattern recognition (CVPR 2003), pp. 835–
841. Washington, DC: IEEE Computer Society.
27. Pudney C. 1998. Distance-ordered homotopic thinning: a skeletonization algorithm for 3d digital
images. Comput Vision Image Understand 72(3):404–413.
28. Telea A, Vilanova A. 2003. A robust level-set algorithm for centerline extraction. In Proceed-
ings of the symposium on visualization (VisSym 2003), pp. 185–194. Aire-la-Ville, Switzerland:
Eurographics Association.
29. Deschamps T, Cohen LD. 2001. Fast extraction of minimal paths in 3d images and applications
to virtual endoscopy. Med Image Anal 5(4): 281–299.
30. Chuang J-H, Tsai C-H, Ko M-C. 2000. Skeletonization of three-dimensional object using gener-
alized potential field. IEEE Trans Pattern Anal Machine Intell 22(11):1241–1251.
31. Yuan X, Balasubramanian R, Cornea ND, Silver D. 2005. Computing hierarchical curve-skeletons
of 3d objects. The Visual Computer. 21(11):945–955.
PDE-BASED THREE DIMENSIONAL PATH PLANNING 475
32. Ma W-C, Wu F-C, Ouhyoung M. 2003. Skeleton extraction of 3d objects with radial basis func-
tions. In Proceedings of the 2003 international conference on shape modeling and applications
(SMI 2003), pp. 207–215, 295. Washington, DC: IEEE Computer Society.
33. Wu F-C, Ma W-C, Liou P-C, Laing R-H, Ouhyoung M. 2003. Skeleton extraction of 3d objects
with visible repulsive force. In Proceedings of the Computer Graphics Workshop 2003
(Hua-Lien, Taiwan). Available online: http://www.lems.brown.edu/vision/people/leymarie/Refs/
CompGraphics/Shape/Skel.html.
34. Bellman R, Kalaba R. 1965. Dynamic programming and modern control theory. London: London
Mathematical Society Monographs.
35. Press WH, Teukolsky SA, Vetterling WT, Flannery BP. 1992. Numerical recipes in C: the art of
scientific computing. New York: Cambridge UP.
36. Yatziv L, Bartesaghi A, Sapiro G. 2006. A fast o(n) implementation of the fast marching algorithm.
J Comput Phys 212(2):393–399.
14
Jasjit S. Suri
Biomedical Research Institute, Idaho State
University, Pocatello, Idaho, USA
Ruey-Feng Chang
Department of Computer Science and Information
Engineering, National Chung Cheng University
Chiayi, Taiwan
The task of Object Recognition is one of the most important problems in computational
vision systems. Generally, this problem is addressed in two phases: detecting the regions
of interest and tracking them across a frame sequence. For applications such as medical
imaging or general object tracking, current methodologies are generally machine dependent
or highly sensitive to speckle noise. In a sequence of ultrasound images some data are
often missing if the approach does not use lesion-tracking methods. Also, the segmentation
process is highly sensitive to noise and changes in illumination. In this chapter we propose
a four-step method to handle the first phase of the problem. The idea is to track the
region of interest using the information found in
Address all correspondence to: Paulo Sérgio Rodrigues, Laboratório Nacional de Computação
Científica, Av. Getúlio Vargas, 333, Quitandinha, Petrópolis, Brazil, CEP: 25651-075. Phone: +55 24
2233-6088, Fax: +55 24 2231-5595. [email protected].
478 PAULO S. RODRIGUES et al.
the previous slice to search for the ROI in the current one. In each image (frame) we
accomplish segmentation with Tsallis entropy, which is a new kind of entropy for non-
extensive systems. Then, employing the Hausdorff distance we match candidate regions
against the ROI in the previous image. In the final step, we use the ROI curve to compute a
narrow band that is an input for an embedded function in a level set formulation, smoothing
the final shape. We have tested our method with three classes of images: a general indoor
office, the Columbia database, and low-SNR ultrasound images of breast lesions, including
benign and malignant tumors, and have compared our proposed method with the Optical
Flow Approach.
1. INTRODUCTION
Although 3D scanners have been available on the market for some time, more than 90%
of ultrasound exams are accomplished using traditional 2D scanners. These de-
vices generally offer a low signal-to-noise ratio (SNR) as well as low contrast and
resolution, while Digital Image Processing promises to enhance the quality and
usefulness of images. On the other hand, the sheer volume of exams and the time
spent on image collection and evaluation require intense attention from an
experienced medical team. Many investigators have found that more than 60%
of masses referred for breast biopsy on the basis of mammographic findings are
actually benign [1–3]. Although biopsy is still necessary and the definitive exam,
60% is a high rate of false positives.
One way to help reduce the rate of false positives is through automatic breast
lesion detection based on features such as tumor shape, texture, and color. Most
benign tumors have an elliptical boundary and uniform texture, while most ma-
lignant tumors have an irregular or spread-out boundary and texture. The feature
detection aspects of Computational Vision with a sequence of ultrasound images
are highly advantageous, mainly with 2D ultrasound images of low resolution and
low SNR.
From the point of view of Computational Vision, automatic diagnosis of breast
lesions can be divided into two phases: (1) detection of the region of interest (ROI),
and (2) classification of the ROI based on medical classification schemes as a
benign or malignant breast lesion. Since phase 2 is highly dependent on phase 1,
the algorithms for shape, texture, and color detection should have high robustness.
The first phase is a step that demands much research time. The approaches found in
the literature are highly machine dependent and noise sensitive. This is generally
due to the fact that detection of an ROI is accomplished on an independent slice,
and the information from one frame to the next is not taken into account, so that
some features are missing. One way to reduce this dependence and take advantage
of the data from the previous slice is employment of a tracking algorithm.
One typical and important application that requires specific solutions is track-
ing a lesion area in a breast ultrasound image sequence (see, e.g., Figure 1), where
images have a low signal-to-noise ratio (SNR) and a high time step between consecutive
frames, and there is no need for real-time computation.
Figure 1. Example of two ultrasound images of a breast lesion for which our proposed
methodology was developed.
Such an application fits some of the objectives and features described above.
Our objectives include tracking objects with highly deformable shape that during
sequence evolution suffer transformations in translation, rotation, scale, split, or
merge. In addition, there is no need to manage occlusion or background shift
or more than one object in the scene. Other conditions include the following:
no need for real-time computation, a low SNR, and a high time step between
consecutive images, which produces difficulties when traditional techniques of
object correspondence are used. On the other hand, the low SNR presents problems
in terms of parameter setup that demand a robust segmentation algorithm.
One natural approach to handling these problems is to set up a ground
truth region in the first frame and then solve a correspondence problem between
consecutive images. This setup may be achieved through manual segmentation
or by using an automatic approach. In a simple way, the general problem can be
stated as a problem of correspondence between two regions. Generally, however,
a segmentation process is required before achieving correspondence. As the ob-
jectives and features become more and more complex, the segmentation algorithm
needs to be increasingly robust and the whole process becomes more and more
dependent on initial definition of parameters (e.g., spatial filters and morphological
operations).
In this context, segmentation methods that employ entropy applied to the ob-
ject tracking problem have been broadly investigated. Traditional Shannon entropy
is based on the achievement of a threshold between background and foreground
for the purpose of maximization of image information. However, these methods
are strongly dependent on image contrast and noise. In 1988, Constantino Tsallis
[4, 5] presented the concept of non-extensive entropy for systems with long-range
microscopic interactions, long memory, and fractal behavior. Such entropy generalizes
the well-known Boltzmann-Gibbs and Shannon entropy for extensive systems. The
main advantage of this new theory is that it is based on a unique parameter, called q,
which may be handled according to system non-extensiveness. The authors of [6]
presented a first proposal for image segmentation using non-extensive entropy. In
our work, we follow this proposal for region tracking in a breast lesion ultrasound
image sequence. The task at hand is to establish a correspondence between regions
of interest (objects in scenes).
In our proposed methodology we used the general idea of making a match
between the region achieved in frame i (taken as a ground truth) and a candidate re-
gion achieved in frame i+1. We employ Tsallis entropy to find a candidate region.
We do so because it generates good results for images with a low SNR, because
of its simplicity of implementation, and because we have only one parameter to
manage (q), which permits writing efficient algorithms. Each candidate region is
smoothed, embedding the region boundary in a level set formulation with a narrow
band. This has dramatically lowered computational load, as processing is carried
OBJECT TRACKING IN IMAGE SEQUENCE 481
out only in the boundary area, called the “narrow band”. In addition, this yields
more accurate and smoother results. Matching is carried out through the use of
the Hausdorff distance, which is employed due to its simplicity of implementation
and the possibility of comparison between curves of different length.
The Hausdorff distance is a well-known method of matching two shapes not
having the same number of points [7, 8]. However, it has high sensitivity to
speckle noise and such other image artifacts as spurious and small regions. The
main contribution of our present chapter is a methodology for tracking a general
region of interest that combines the Hausdorff distance, a level set formulation,
and Tsallis non-extensive entropy in order to reduce Hausdorff noise sensitivity as
well as avoiding the increased parameter dependence that generally occurs with
other methodologies.
The chapter is organized as follows. In Section 2 we briefly discuss the
principal works related to our approach. We give some background on the main
approaches we have used in Section 3. The proposed Hausdorff-Tsallis level set
algorithm is described in Section 4, while our experimental results are reported
in the following section. Section 6 presents a discussion of our results, and we
present our conclusions in Section 7.
2. RELATED WORK
Kim and Hwang [9] proposed an approach that compares consecutive frames to extract moving objects. In [10], Luo and collaborators extended the
work of Kim and Hwang [9] by attaching a Dynamic Bayesian Network (DBN)
for interpretation of the object of interest. When the object of interest is extracted
from the background, it is split into four quadrants (labeled I, II, III, and IV), and
several attributes are extracted under the proposed DBN. This method was applied
to a sports video database and achieved up to 86% accuracy.
Gao and collaborators [11] investigated detection of human activity through
tracking of moving regions. The tracking process is achieved by segmenting domi-
nant moving regions inside the ROI, from which spatial features are obtained. These
features are used to describe the related activities. An algorithm, called Weighted
Sequential Projection (WSR), is then applied to achieve consistent movements,
and to place them on a list of candidate regions. The idea underlying WSR is
to compute the correlation of two vectors for consecutive regions of consecutive
frames. The consistent regions are found based on the comparison between these
two vectors.
The process of identifying human activities is achieved in three steps: first,
an individual person is found by attaching segmented regions frame to frame;
second, the face is found using the algorithm derived by Schneiderman [12]; finally,
the hands are found through analysis of movements. Movements of the hands in
relation to the head are analyzed to detect, for example, a human dining.
The paper presented by Lim and Kriegman [13] offers a tracking process for
people walking around. The process uses two types of models for the tracking
algorithm: one off-line, called the shape model, and one on-line, called the ap-
pearance model. The off-line model is based on learning of several generic human
poses; the on-line model is based on the individual visual features that are being
tracked. The combination of these two models composes the final tracking algo-
rithm. In addition, during the off-line step, the background is learned through the
Mahalanobis distance. Since this is a statistical distance, it can be updated on-line.
The general idea combines both the shape and the appearance models as
follows: at the on-line step the shape model is used to capture moving people. The
appearance model is then used to fine tune the candidate regions.
The recent paper by Yeasin and collaborators [14] presents a framework for
tracking several objects in a frame sequence whose principal contribution is a
strategy for reducing the problem of the conflict of trajectories, which happens
when two individual objects cross each other. In this case, a simple algorithm has
no way of distinguishing both objects. The strategy of Yeasin and collaborators [14]
is to use the Multiple Hypotheses Tracking (MHT) algorithm combined with a Path
Coherent Function (PCF) to handle the conflict of trajectories. Since MHT suffers
from real-time implementation due to the exponential growth in the number of
hypotheses, the PCF is implemented to correct real-time object position, reducing
the number of hypotheses. This results is enhancement of the MHT algorithm. An
efficient implementation of the MTH algorithm can also be found in the work of
Cox and Hingorani [15].
3. BACKGROUND
We now present a review of the three main theories underlying our proposed
approach for region tracking of breast lesions in an ultrasound image sequence:
the Hausdorff distance, the level set theory, and Tsallis non-extensive entropy.
He(A, B) = 1 − H(A, B)/η. (3)
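As a concrete illustration, the symmetric Hausdorff distance H(A, B) between two sampled curves can be computed in a few lines of NumPy. This is a generic sketch of the standard definition, not the authors' implementation; the normalization constant η of Eq. (3) is application dependent and is left out here.

```python
import numpy as np

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two point sets A and B,
    given as (N, 2) and (M, 2) arrays; N and M may differ, which is
    why the measure suits curves of different length."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # pairwise distances
    h_ab = D.min(axis=1).max()  # directed distance h(A, B)
    h_ba = D.min(axis=0).max()  # directed distance h(B, A)
    return max(h_ab, h_ba)
```

Dividing the result by a bound η on the largest possible distance, as in Eq. (3), then yields the normalized measure He.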
described as
S=− pi ln(pi ). (4)
i
Generally speaking, systems that have BGS-type statistics are called extensive
systems. Such systems have an additive property, defined as follows. Let X and Y
be two random variables, with probability density functions P = (p1 , . . . , pn ) and
Q = (q1 , . . . , qm ), respectively. If X and Y are independent, under the context
of Probability Theory, the entropy of the composed distribution will verify the
so-called additivity rule:

S(X + Y) = S(X) + S(Y). (5)
However, the experimental results (Section 5) show that it is better to consider our
systems non-extensive:
Sq({pi qj}) = Sq({pi}) + Sq({qj}) + (1 − q) · Sq({pi}) · Sq({qj}). (6)
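The pseudo-additivity rule of Eq. (6) can be checked numerically. The sketch below assumes the standard Tsallis form Sq(p) = (1 − Σ_i pi^q)/(q − 1), whose formal definition falls outside this excerpt, and verifies the rule for two independent distributions:

```python
import numpy as np

def tsallis_entropy(p, q):
    """Tsallis entropy S_q of a discrete distribution p, for q != 1
    (assumed standard form; it recovers Shannon entropy as q -> 1)."""
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

# Two independent distributions and their joint distribution {p_i * r_j}
p = np.array([0.3, 0.7])
r = np.array([0.5, 0.5])
q = 0.8
joint = np.outer(p, r).ravel()

lhs = tsallis_entropy(joint, q)
rhs = (tsallis_entropy(p, q) + tsallis_entropy(r, q)
       + (1 - q) * tsallis_entropy(p, q) * tsallis_entropy(r, q))
# lhs and rhs agree, illustrating the pseudo-additivity rule of Eq. (6)
```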
maximizing Shannon entropy (Eq. (4)). In this chapter we use Tsallis entropy
to generate candidate regions in each image from the sequence. The manner in
which we use this entropy is the one proposed by Tsallis [4, 5] and applied by
Albuquerque et al. [6] on mammographic images.
If we consider two random variables, X and Y , as stated above, and letting
P be the probability density function of the background and Q the probability
density function of the foreground (ROI), with constraints Σ_{i=1}^{t} (pi/N) = 1 and
Σ_{i=t+1}^{L} (pi/M) = 1, and the assumption that t is a threshold, M is the total amount
of background pixels, N is the total amount of foreground pixels, and the discrete
levels are equally spaced between 1 and L in gray-scale level, then, considering the
systems as non-extensive according to Eq. (6), the entropy of the composed system is

Sq(X + Y) = Sq(X) + Sq(Y) + (1 − q) · Sq(X) · Sq(Y), (8)

where

Sq(X) = [1 − Σ_{i=1}^{t} (pi/N)^q] / (q − 1) (9)

and

Sq(Y) = [1 − Σ_{i=t+1}^{L} (pi/M)^q] / (q − 1). (10)
We maximize the information measured between the two classes (ROI and
background). When Sq (X + Y ) is maximized, luminance level t is considered to
be the optimum threshold value. This can be achieved with a comparatively cheap
computational effort:
topt = argmax[SqX(t) + SqY(t) + (1 − q) · SqX(t) · SqY(t)]. (11)
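The exhaustive search of Eq. (11) is cheap because the histogram has at most L bins. The sketch below normalizes each class histogram by its own mass, one common reading of the constraints above; it is illustrative code, not the authors' implementation, and assumes q != 1.

```python
import numpy as np

def tsallis_threshold(image, q=0.8):
    """Exhaustive search for the threshold t maximizing the non-extensive
    entropy of the background/foreground split, Eqs. (8)-(11)."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_s = 1, -np.inf
    for t in range(1, 256):
        pa, pb = p[:t].sum(), p[t:].sum()
        if pa == 0.0 or pb == 0.0:      # skip splits with an empty class
            continue
        s_x = (1.0 - ((p[:t] / pa) ** q).sum()) / (q - 1.0)   # Eq. (9)
        s_y = (1.0 - ((p[t:] / pb) ** q).sum()) / (q - 1.0)   # Eq. (10)
        s = s_x + s_y + (1.0 - q) * s_x * s_y                 # Eq. (8)
        if s > best_s:
            best_s, best_t = s, t
    return best_t
```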
Therefore, it is interesting to investigate the images under the light of the non-
extensive entropy. Although it will be difficult to discuss the non-extensive features
(long-range and long-memory interactions and fractal behavior) for images in
general, we can quantitatively justify the use of q-entropy for image segmentation,
as the q parameter introduces a flexibility in the entropic formulation, and this
suggests an investigation of segmentation under q variation. Albuquerque et al. [6]
have studied q for mammographic images and achieved good results for 0 ≤ q ≤ 1,
which suggests a non-extensiveness for systems composed of such images.
In our work, we suggest the use of q-entropy to designate candidate regions
of interest. The general idea is to generate some perturbation around the q value
(used to generate the ROI in slice i) to achieve an ROI in slice i+1. Then a level set
formulation is used to smooth the boundary and matching is accomplished through
the Hausdorff distance. A more detailed algorithm will be set forth in Section 4.
The next step is to find an Eulerian formulation for the front evolution. Following
Sethian [42], let us suppose that the front evolves in the normal direction
with velocity F, which may be a function of the curvature, normal direction, etc.
We need an equation for the evolution of G(x, t), considering that surface S
is the level set given by

S(t) = {x ∈ R² | G(x, t) = 0}. (14)
A point x(t) on the moving front must satisfy

G(x(t), t) = 0. (15)
We can now use the Chain Rule to compute the time derivative of this expres-
sion:
Gt + F |∇G| = 0, (16)
where F is called the speed function. An initial condition G(x, t = 0) is
required. A straightforward (and expensive) technique to define this function is to
compute a signed-distance function as follows:

G(x, t = 0) = ±d, (17)
where d is the distance from x to the surface S (x, t = 0) and the signal indicates
if the point is interior (–) or exterior (+) to the initial front.
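For a simple front such as a circle, this signed-distance initialization can be written directly; a brute-force sketch (the function name is our own illustrative choice):

```python
import numpy as np

def signed_distance_circle(shape, center, radius):
    """Initialize G(x, t = 0) = +/- d for a circular front: negative at
    interior points, positive at exterior ones, zero on the front."""
    ii, jj = np.indices(shape, dtype=float)
    d = np.sqrt((ii - center[0]) ** 2 + (jj - center[1]) ** 2)
    return d - radius  # signed distance to the circle of the given radius
```

For arbitrary fronts this brute-force evaluation is exactly the expensive step the text warns about, which motivates the narrow-band alternative.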
Finite-difference schemes based on a uniform grid can be used to solve Eq.
(16). The same entropy condition of T-Surfaces (once a grid node is burnt, it stays
burnt) is incorporated in order to drive the model to the desired solution (in fact,
T-Surfaces was inspired by the level sets model [43]).
In this higher-dimensional formulation, topological changes can be efficiently
implemented. Numerical schemes are stable and the model is general in the sense
that the same formulation holds for 2D and 3D, as well as for merges and splits.
Besides, the surface geometry is easily computed. For example, the front normal
and curvature are given by

n = ∇G(x, t)/|∇G(x, t)|,   K = ∇ · (∇G(x, t)/|∇G(x, t)|), (18)

respectively, where the gradient and the divergence (∇·) are computed with respect to x.
Initialization of the model through Eq. (17) is computationally expensive and
not efficient if we have more than one front to initialize [44].
The narrow-band technique is much more appropriate for this case. The key
idea of this technique comes from the observation that the front can be moved by
updating the level set function at a small set of points in the neighborhood of the
zero set instead of updating it at all points on the domain (see [42, 45] for details).
To implement this scheme we need to preset a distance ∆d to define the narrow
band. The front can move inside the narrow band until it collides with the narrow-
band frontiers. Then function G should be reinitialized by treating the current zero
set configuration as the initial one.
This method can also be made cheaper by observing that the grid points that do
not belong to the narrow band can be treated as sign holders [42]; in other words,
they simply record whether they lie interior (−) or exterior (+) to the front.
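In code, the narrow-band bookkeeping reduces to a mask over |G| ≤ Δd plus stored signs for the outside points; a minimal sketch, with names of our own choosing:

```python
import numpy as np

def narrow_band(G, delta_d):
    """Return the narrow-band mask |G| <= delta_d and the signs held by
    grid points outside the band. Only masked points need updating while
    the front moves; when it hits the band frontier, G is reinitialized."""
    band = np.abs(G) <= delta_d
    signs = np.sign(G)  # sign holders: -1 interior, +1 exterior
    return band, signs
```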
To clarify ideas, let us consider Figure 2a, which shows the level set bounding
the search space, and Figure 2b, which pictures a bidimensional surface where the
zero level set is comprised of the contours just presented.
Figure 2. (a) Level set bounding the search space. (b) Initial function with zero level set
as the contour presented.
The surface evolves such that the contour gets closer to the edge. In order
to accomplish this goal, we must define a suitable speed function and an efficient
numerical approach. For simplicity, we consider the one-dimensional version of
the problem depicted in Figure 2. In this case, the evolution equation can be written
as

Gt + F ∂G/∂x = 0. (19)
The main point is to design speed function F so as to obtain the desired result,
which can be accomplished if Gt > 0. For instance, if we set the sign of F
opposite to the one of Gx , we get Gt > 0:
Gt = −(∂G/∂x) F. (20)
Hence, the desired behavior can be obtained by the distribution of the sign of F
shown in Figure 3.
Since our application focus is shape recovery in image I, we must choose a
suitable speed function F as well as a convenient stopping term S to be added to
the right-hand side of Eq. (19). Among the possibilities [46], the following ones
have been found to be suitable in our case:
F = (1 + αK)/(1 + |∇I|²), (21)

S = β∇P · ∇G, (22)

where β is a scale parameter. Therefore, we are going to deal with the following
level sets model:

Gt = [(1 + αK)/(1 + |∇I|²)] |∇G| + β∇P · ∇G, (23)
where P = −|∇I|². After initialization, the front evolves following Eq. (23).
Next, we develop the numerical elements for the level set implemented herein.
Equation (23) can be written as a Hamilton-Jacobi equation,

Gt + H(Gx, Gy) = 0, (24)

which can be solved on a uniform grid by the explicit scheme

G^{n+1}_{i,j} = G^n_{i,j} + Δt · F |∇G^n_{i,j}|, (25)

where

∇G^n_{i,j} = [(G^n_{i+1,j} − G^n_{i−1,j})/(2Δx), (G^n_{i,j+1} − G^n_{i,j−1})/(2Δy)], (26)

and

|∇G^n_{i,j}| = sqrt[((G^n_{i+1,j} − G^n_{i−1,j})/(2Δx))² + ((G^n_{i,j+1} − G^n_{i,j−1})/(2Δy))²]. (27)

In Eqs. (25)-(27), n stands for the time level, and Δt is the time step
defined by the user.
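One explicit update of Eqs. (25)-(27) takes a few lines of NumPy. The sketch below uses the central differences exactly as written above (with periodic handling at the grid border for brevity); practical level set codes typically switch to upwind differencing for stability.

```python
import numpy as np

def level_set_step(G, F, dt, dx=1.0, dy=1.0):
    """Advance G by one time step of Eq. (25), using the
    central-difference gradient magnitude of Eqs. (26)-(27)."""
    Gx = (np.roll(G, -1, axis=0) - np.roll(G, 1, axis=0)) / (2.0 * dx)
    Gy = (np.roll(G, -1, axis=1) - np.roll(G, 1, axis=1)) / (2.0 * dy)
    grad = np.sqrt(Gx ** 2 + Gy ** 2)  # |grad G|, Eq. (27)
    return G + dt * F * grad           # Eq. (25); F may be a scalar or an array
```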
3.3.2. Initialization
Discontinuities in image intensity are significant and important when it comes
to shape recovery problems. In fact, such discontinuities of image I can be modeled
as step functions, like

s(x, y) = Σ_{i=1}^{n} a_i S_i(x, y), (28)
Therefore, an image I is not a differentiable function in the usual sense. The usual
way to address this problem is to work with a coarser-scale version of the image
obtained by convolving this signal with a Gaussian kernel, Iσ = Gσ ∗ I, where Gσ
is a Gaussian of standard deviation σ.
4. THE PROPOSED HTLS ALGORITHM
In this section we present our proposed framework, which combines the Hausdorff
distance, non-extensive entropy, and a level set formulation for tracking regions
representing a breast lesion in a frame sequence of ultrasound images.
The general idea for tracking adopted for this work, which is well known, is
to use a region achieved in image i of a sequence to search for a corresponding region
in image i + 1. We then propose a framework combining the Hausdorff Distance,
Tsallis entropy, and a level set formulation to match region i and a candidate region.
As stated in Section 3.2, Tsallis entropy is a new trend in image segmentation
that is simple and can be well adapted to real-time applications. In turn, the
Hausdorff distance is a well-known similarity measure between two curves with
such advantages as simplicity of implementation and low computational time (see
Section 3).
In our notation, a frame has index i and specific segmentation index j. Assum-
ing that we have several segmentations on each image, Ii,j is the jth segmentation
of the ith frame. By using set notation,

Is = {Ii,1, Ii,2, . . . , Ii,m} (31)

are the m segmented outputs of frame Ii. Each Ii,j may contain several segmented
regions — among foreground and background ones — called Target Regions
(TRs). Then, ri,k ∈ Ii,j is the kth TR of segmented image Ii,j . For the sake of
explanation, we consider that segmented frame Ii,j has only one TR of interest
(the foreground), called ri ; the remaining ones belong to the background. Finally,
a model for tracking a region is denoted by MR.
Note that MR must be initialized in the first image; there are several approaches
to doing so, which is generally a step prior to the tracking task. We assume that
model region MR was already defined in the first frame; some investigators
[9, 11, 15, 17, 18] offer techniques for such initialization.
Our proposed HTLS tracking algorithm is the following: let MRi−1 be the
model region achieved in image i − 1. For image Ii we compute Is given in Eq.
(31) with the Tsallis entropy according to Eq. (11), where each Ii,j is computed
with a different value of q. Each Ii,j has a candidate region rj . We extract rj from
Ii,j by applying mathematical morphology according to [49].
The region given by the morphological operators is the input for the level set
framework discussed in Section 3.3. This framework, with the added advantage
of the narrow band, has the net effect of smoothing the contour region, moving
each contour pixel independently toward the real edge.
Finally, each rj is matched against MRi−1 . MRi is that region that minimizes
Eq. (2).
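The loop just described can be sketched abstractly. The three stages are injected as callables, since their concrete forms (Tsallis segmentation, level set smoothing, Hausdorff matching) are covered in Sections 3.2 and 3.3; the function name and parameter defaults here are our own illustrative choices, not the authors' code.

```python
def htls_track(frames, mr0, q0, segment, smooth, match, dq=0.05, m=3):
    """Sketch of the HTLS tracking loop: for each new frame, segment with
    m values of q perturbed around the one used for the previous frame,
    smooth each candidate region, and keep the candidate minimizing the
    match cost against the model region MR from the previous frame."""
    regions, q = [mr0], q0
    for frame in frames[1:]:
        qs = [q + k * dq for k in range(-(m // 2), m // 2 + 1)]
        candidates = [(smooth(segment(frame, qi)), qi) for qi in qs]
        best, q = min(candidates, key=lambda c: match(regions[-1], c[0]))
        regions.append(best)
    return regions
```

With `segment` as Tsallis thresholding plus morphology, `smooth` as the narrow-band level set, and `match` as the Hausdorff distance, this reproduces the pipeline of Figure 5.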
To keep m (the total number of segmentations of each image Ii) small, we choose
a set of values of q around the one used in the previous image. Experimentally, we have
seen that m ≈ 3 is sufficient to achieve good results; the proposed method then
does not generate computational overhead due to applying segmentation m times
to each image. On the other hand, this approach allows the proposed method
to be robust to noise and changes in illumination conditions, as Tsallis entropy
for segmentation slightly adapts to new conditions from one image to another
throughout the sequence. This is the key idea of our work: the parameter setup
for the first slice is automatically updated along the entire image sequence. There
is then no need for user intervention as context, noise, or illumination change. A
schematic view of the proposed HTLS algorithm can be seen in Figure 5.
5. EXPERIMENTAL RESULTS
The proposed method was experimentally tested over three classes of image
sequences and compared to the Optical Flow Approach. The first class includes
real office scenes, where we can see a telephone among other objects on a table
against a white wall. The camera moves, generating a completely moving scenario.
This is the same as moving the object across a moving background. An example
of this sequence is shown in Figure 6.
In this class of sequence images we can note a complex background as the
camera moves. There are several objects in the scene: the telephone (ROI) has
Figure 5. Schematic view of the proposed HTLS algorithm. The initial user-defined ROI
is not shown.
Figure 6. Sequence of images from an office table with slight camera movement, changes
in illumination, and a heterogeneous background. See attached CD for color version.
basically the same color as the wall, and the entire background moves together,
which generates difficulties for tracking.
A second class of images is from the Columbia database [50], which contains
7200 color images of isolated objects (100 objects, each imaged at pose increments
of 5 degrees, i.e., 72 shots per object). This database is suitable for evaluating the
performance of our system for object tracking. As we have several views of each
object with zoom, this is equivalent to having a unique object moving over a
homogeneous background, which simplifies segmentation. An example of some
object sequences from this database can be seen in Figure 7.
Figure 7. Four sequences of images from the Columbia database. Each row is an object
sequence. This database provides several views of each object with zoom, which is suitable
for evaluating the performance of our proposed algorithm.
The third and last class of images we used in our experiments includes ultra-
sound images. There were 140 ultrasound images with low SNR that also were
individually affected by speckle noise. We used 80 images of benign tumors and
60 of malignant tumors, divided into 7 sequences. Each sequence corresponded to
a different case and had 20 images. We carried out our experiment only on those
images where a tumor appeared. In the first slice of each sequence, we manually
traced the ROI. The main reason we chose this kind of image was to show the
performance of our algorithm under conditions of speckle noise. Although this
sequence of images was taken from real clinical exams of cancerous breast lesions
(which included several malignant and benign images), the purpose of this work is
not to classify the achieved region of interest as malignant or benign. The images
were thus not classified by any expert, such as a breast lesion tumor radiologist.
Rather, we assumed that the ROI in each image is the darker region around the
center of each image, without regard to whether it is of the correct shape for a
breast tumor. The advantage of testing our proposed algorithm with this class of
images is that we could trace the darker central region (ROI) throughout a low-
SNR context; in addition, the lesions have marked topological changes over the
evolution that are suitable for testing our proposed methodology. An example of
a sequence of images from this class is shown in Figure 8.
For each of the above classes we carried out experiments employing our
proposed algorithm with and without the level set formulation and compared its
performance against the Optical Flow Approach (OFA). We chose the optical flow
approach because it has been a well-known and often-used method for tracking
moving objects since it was proposed [51]. Since then, several variations have been
proposed for a wide range of applications. We match the automatically generated
ROI against manually segmented ground-truth images and measure the curve
distances with the Polyline Distance Measure (PDM), which has become a standard
approach for matching two curves since it was proposed for use with medical
images [52, 53]. In the following, we give a brief explanation of the OFA
and PDM approaches.
Figure 8. Example of a sequence of breast ultrasound images from the third class.
The brightness of every point of a moving or static object does not change
in time.
Let some object in the image, or some point of an object, move; after time dt
the object displacement is (dx, dy). Since the brightness is preserved,

I(x + dx, y + dy, t + dt) = I(x, y, t),   (33)

and expanding I in a Taylor series gives the following:

(∂I/∂x) dx + (∂I/∂y) dy + (∂I/∂t) dt + ... = 0.   (34)
Dividing Eq. (34) by dt and defining

dx/dt = u,   dy/dt = v   (35)

gives

(∂I/∂x) u + (∂I/∂y) v + (∂I/∂t) = 0,   (36)
usually called the optical flow constraint equation, where u and v are components
of the optical flow field in x and y coordinates, respectively. Since Eq. (36) has
more than one solution, more than one constraint is required.
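The classical way to add the missing constraint, used by the Lucas-Kanade method cited above [51], is to assume the flow is constant over a small window around the pixel and solve the stacked constraint equations by least squares. A minimal sketch (our own illustration; function and parameter names are ours):

```python
import numpy as np

def lucas_kanade_point(I1, I2, x, y, win=5):
    """Estimate (u, v) at pixel (x, y) by solving the constraint
    Ix*u + Iy*v + It = 0 over a win x win window in the least-squares
    sense (constant-flow assumption within the window)."""
    h = win // 2
    # finite-difference gradients: np.gradient returns (d/dy, d/dx)
    Iy_g, Ix_g = np.gradient(I1.astype(float))
    It_g = I2.astype(float) - I1.astype(float)

    # stack one constraint equation per window pixel
    Ix = Ix_g[y - h:y + h + 1, x - h:x + h + 1].ravel()
    Iy = Iy_g[y - h:y + h + 1, x - h:x + h + 1].ravel()
    It = It_g[y - h:y + h + 1, x - h:x + h + 1].ravel()

    A = np.stack([Ix, Iy], axis=1)        # N x 2 system matrix
    b = -It
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

For a pure one-pixel translation of a gradient image, the recovered (u, v) matches the true displacement.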
free parameter for the equation of the line joining points B and C. Then, the line
interval BC between B = (x1, y1) and C = (x2, y2) is given as

(x, y) = (x1, y1) + λ (x2 − x1, y2 − y1),   (37)

where (x, y) is the coordinate of a point on the line and λ ∈ [0, 1].
Now, let μ be the parameter of the distance orthogonal to the line interval
BC, so that the line segment between (x0, y0) and (x, y) is perpendicular to the
line interval BC. Therefore, we can express (x0, y0) similarly to Eq. (37):

(x0, y0) = (x, y) + μ (−(y2 − y1), x2 − x1)
         = (x1, y1) + λ (x2 − x1, y2 − y1) + μ (−(y2 − y1), x2 − x1).   (38)
Solving the above equations using the related determinants, the unknown parameters
λ and μ are given as

λ = [(x2 − x1)(x0 − x1) + (y2 − y1)(y0 − y1)] / [(x2 − x1)² + (y2 − y1)²]   (39)

and

μ = [(y2 − y1)(x1 − x0) + (x2 − x1)(y0 − y1)] / [(x2 − x1)² + (y2 − y1)²].   (40)
Let the two distance measures, d1 and d2, between the point A = (x0, y0) and the
endpoints B and C be defined as Euclidean distances:

d1 = √[(x0 − x1)² + (y0 − y1)²],
d2 = √[(x0 − x2)² + (y0 − y2)²].   (41)

The distance from A to the interval BC is then

d(A, BC) = |μ| √[(x2 − x1)² + (y2 − y1)²]  if 0 ≤ λ ≤ 1,  and
d(A, BC) = min{d1, d2}  otherwise.   (42)
A quantitative error measure between the ideal boundary and the computer-
estimated boundary could then be defined using the polyline distance described in
Eq. (42). The measure is defined as the average polyline distance over all boundary
points of the estimated and ground-truth breast boundaries. We denote this
measure as d_poly^Error.
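A sketch of the whole computation (our own illustration of the PDM idea from Eqs. (37)-(42), not the reference implementation of [52, 53]; function names are ours):

```python
import numpy as np

def point_segment_distance(p0, p1, p2):
    """Distance from a point p0 = (x0, y0) to the segment p1-p2,
    following Eqs. (37)-(42): perpendicular distance when the
    projection parameter lambda falls inside [0, 1], otherwise the
    distance to the nearer endpoint (min of d1, d2)."""
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    d = p2 - p1
    denom = float(np.dot(d, d))          # (x2-x1)^2 + (y2-y1)^2
    lam = float(np.dot(p0 - p1, d)) / denom
    if 0.0 <= lam <= 1.0:
        # |mu| * |BC| is the perpendicular distance to the line
        r = p0 - p1
        return abs(d[0] * r[1] - d[1] * r[0]) / np.sqrt(denom)
    d1 = np.linalg.norm(p0 - p1)
    d2 = np.linalg.norm(p0 - p2)
    return min(d1, d2)

def polyline_distance(curve_a, curve_b):
    """Symmetric PDM-style error: average, over the vertices of both
    curves, of each vertex's distance to the other polyline."""
    def one_side(points, poly):
        segments = list(zip(poly[:-1], poly[1:]))
        return [min(point_segment_distance(p, a, b) for a, b in segments)
                for p in points]
    d_ab = one_side(curve_a, curve_b)
    d_ba = one_side(curve_b, curve_a)
    return (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))
```

For two parallel boundaries one pixel apart, the measure evaluates to 1, as expected.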
5.3. Experiments
Initially, we will demonstrate a performance comparison between the proposed
algorithm, with and without the level set framework, and optical flow for the
image sequence of the first class. We use manually traced ground truth, such as
that shown in Figure 9.
(GT for frame 22) (GT for frame 25) (GT for frame 27)
Figure 9. Three images (top row) from the first class and their corresponding manually
traced ground truth (bottom row). GT = ground truth. See attached CD for color version.
For each of the images of the given sequence, the GT was manually traced for
comparison with the automatic results. The similarity between the GT and an
automatically generated curve was measured using Eq. (45). Performance, evaluated
over a sequence of 30 frames (around 1 second), is shown in Figure 10, where
the normalized error (Nerr) is taken as Nerr = 1 − (1/η) Err, Err is calculated
by Eq. (45), and η = √(M² + N²)/2 is the normalization factor taken
Figure 10. Performance evaluation of the HTLS algorithm as a function of frames for the
first class sequence. “o” represents the HTLS performance; “+” represents HT algorithm
performance; “–” represents OFA algorithm performance. See attached CD for color
version.
as the highest possible error value, which depends on the corresponding image
dimensions M and N. In Figure 10, note the similar performance of our method
with and without the level set formulation. There is only a slight advantage for
the HTLS algorithm. In this class of images, with a moving object background,
the HTLS algorithm outperforms the optical flow approach. This is due to the
fact that OFA does not work well when the background moves. However, the
HTLS algorithm also has difficulties in tracking the ROI. This is because Tsallis
segmentation can only deal with two regions in a scene — the background and the
foreground — and in the class 1 images we have a wall with a similar gray-scale
level to that of the ROI.
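For reference, the normalization used for Figure 10, Nerr = 1 − (1/η)Err with η = √(M² + N²)/2, can be written directly (a small sketch; the function name is ours):

```python
import math

def normalized_performance(err, M, N):
    """Nerr = 1 - err/eta, where eta = sqrt(M^2 + N^2)/2 is the largest
    possible boundary error for an M x N image."""
    eta = math.sqrt(M ** 2 + N ** 2) / 2.0
    return 1.0 - err / eta
```

A perfect match (err = 0) thus scores 1, and the worst possible error scores 0.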
Another important observation is that the HTLS algorithm's performance
decreases with time. This is due to error propagation. A solution to this
disadvantage is to periodically restart the system at different points. The optical
flow approach, on the other hand, is unstable. We can note the same behavior in
Figure 11, which shows class 2 images. However, we can see that all the algorithms
demonstrate good performance since the images are synthetic with a homogeneous
background.
The performance evaluation for the class 2 sequence is shown in Figure 11,
presented in the same way as Figure 10.
Figure 11. Performance of the HTLS algorithm as a function of frames for the class 2 se-
quence. “o” represents HTLS performance; “+” represents HT performance; “–” represents
OFA performance. See attached CD for color version.
Figure 12 shows the performance curves for the class 3 images of benign tumor
ultrasound images, and Figure 13 shows the same for malignant tumor ultrasound
images.
The best performance of the HTLS algorithm over the optical flow approach
can be seen when there are low-SNR images in the sequence. The OFA does
not work well in this case. On the other hand, the HTLS algorithm (with and
without the level set framework) can show good performance with both benign
and malignant tumor images.
Before presenting examples of specific results with ultrasound images, we will
show the usefulness of Tsallis entropy in Figure 14, where we can see an image
Figure 12. Performance of the HTLS algorithm for class 3 images (benign tumor). “o”
represents HTLS performance; “+” represents HT performance; “–” represents OFA per-
formance. See attached CD for color version.
from the sequence of a benign tumor and several segmentations for 0 < q < 1
(superextensive systems). Figure 14a is the original image, and the lesion is the
darker area traced manually as a white boundary around the image center. Below
each segmented image is the q value employed. The same happens with Figure
15. The q values were varied from a value near 0 to a value close to 1; in this
case, for values greater than 1 we did not achieve satisfactory results. Even so, in
these two figures the better results are those for q values near the ends of this
interval. It is important to find an ideal q value, which may yield as crisp a
boundary as possible. In the case of Figure 14, when q approaches 1, the ROIs
seem to become well defined. The inverse seems to happen in Figure 15, where
better segmentation of the ROI seems to occur when q = 0.00001. However, in
both figures it is not possible to point
out the exact tumor boundary. An algorithm to define a boundary must take into
account not only the boundary traced in the first slice but also an adaptive q value.
To reinforce this idea, if the same q value is used for all images of a sequence,
the results may not be satisfactory, as is clearly shown in Figure 16.
Figure 13. Performance of the HTLS algorithm with class 3 images (malignant tumor).
“o” represents HTLS performance; “+” represents HT performance; “–” represents OFA
performance. See attached CD for color version.
In this figure, the first column shows three original images (F-122, F-130,
and F-140); the second column shows the respective segmentation for the same
q = 0.9 (superextensive system) and the third column shows segmentation for
q = 8 (subextensive system). We can note that, for frame 122, q = 0.9 generates a
better segmentation than does q = 8 (where it is not possible to find any region). If
we set the same q = 0.9 value for the remaining frames, when frame 130 arrives,
we can see that q = 8 is more adequate. The inverse occurs again when frame 140
arrives, where q = 0.9 is better for segmentation than q = 8. These conditions
occur due to the noise distribution, and therefore depend on image acquisition as
well as on the selected target region.
Using non-extensive entropy on mammographic images may have advantages
due to the flexibility of finding a good segmentation as well as the simplicity in
managing only the q parameter, which is not possible with the other entropic
theories (such as BGS) or those that depend on setting up several parameters,
mainly when tracking a sequence of images. The disadvantage is that reaching
an ideal value of q may not be an easy task. q is a value that, given a value of t,
maximizes Eq. (8). One way to handle this problem is as follows. Given a t value
that maximizes Eq. (8), we can have:
∂S(X + Y )
= 0. (47)
∂q
Therefore, an ideal q value may be found with an iterative procedure, such as the
Newton-Raphson method:

q^(n+1) = q^n − [∂S(X + Y)/∂q] / [∂²S(X + Y)/∂q²],   (48)
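The update rule of Eq. (48) can be sketched with numerically estimated derivatives (our own illustration; S below is a hypothetical stand-in for S(X + Y) as a function of q, and the step size h is our assumption):

```python
def newton_raphson(S, q0, h=1e-4, tol=1e-8, max_iter=50):
    """Find a stationary point of S(q) via the Eq. (48) iteration,
    using central finite differences for the derivatives."""
    q = q0
    for _ in range(max_iter):
        d1 = (S(q + h) - S(q - h)) / (2 * h)            # dS/dq
        d2 = (S(q + h) - 2 * S(q) + S(q - h)) / (h * h)  # d2S/dq2
        if d2 == 0:
            break
        step = d1 / d2
        q -= step
        if abs(step) < tol:
            break
    return q
```

For a concave objective such as S(q) = −(q − 0.7)², the iteration converges to the maximizer q = 0.7.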
Figure 17. Eight images from a frame sequence of a benign tumor. The left column includes
the original images; the right column presents their corresponding achieved boundary with
their respective best q values (indicated below each image).
Figure 18. Eight images from a frame sequence of a malignant tumor. The left column
includes the original images; on the right are their corresponding achieved boundary, with
their respective best q values shown below.
shape and the q values achieved with the proposed method. In the following section
we discuss these results.
6. DISCUSSION
8. ACKNOWLEDGMENTS
The authors are grateful for the support of CNPq (Conselho Nacional de
Desenvolvimento Científico e Tecnológico), a Brazilian agency for scientific
financing. We also thank FAPERJ, the Rio de Janeiro State Agency for Scientific
and Technological Development, for its financial support.
9. REFERENCES
1. Chen CM, Chou YH, Han KC, Hung GS, Tiu CM, Chiou HJ, Chiou SY. 2003. Breast lesion
on sonograms: computer-aided diagnosis with neural setting-independent features and artificial
neural networks. J Radiol 226:504–514.
2. Zheng Y, Greenleaf JF, Gisvold JJ. 1997. Reduction of breast biopsies with a modified self-
organizing map. IEEE Trans Neural Networks 8:1386–1396.
3. Brown ML, Houn F, Sickles EA, Kessler LG. 1995. Screening mammography in community prac-
tice: positive predictive value of abnormal findings and yield of follow-up diagnostic procedures.
AJR Am J Roentgenol 165:1373–1377.
24. Drukker K, Giger ML, Horsch K, Kupinski MA,Vyborny CJ, Mendelson EB. 2002. Computerized
lesion detection on breast ultrasound. Med Phys 29(7):1438–1446.
25. Horsch K, Giger ML, Venta LA, Vyborny CJ. 2002. Computerized diagnosis of breast lesions on
ultrasound. Med Phys 29(2):157–164.
26. Chen DR, Kuo WJ, Chang RF, Moon WK, Lee CC. 2002. Use of the bootstrap technique with small
training sets for computer-aided diagnosis in breast ultrasound. Ultrasound Med Biol 28(7):897–
902.
27. Chen DR, Chang RF, Huang YL. 1999. Computer-aided diagnosis applied to solid breast nodules
by using neural networks. Radiology 213(2):407–412.
28. Kuo WJ, Chang RF, Moon WK, Lee CC, Chen DR. 2002. Computer-aided diagnosis of breast
tumors with different us systems. Acad Radiol 9(7):793–799.
29. Huttenlocher DP, Noh JJ, Rucklidge WJ. 1992. Tracking non-rigid objects in complex scenes.
Technical Report 1320, Department of Computer Science, Cornell University.
30. Huttenlocher DP, Klanderman GA, Rucklidge WJ. 1993. Comparing images using the Hausdorff
distance. IEEE Trans Pattern Anal Machine Intell 15(9):850–863.
31. Chalana V, Kim Y. 1997. A methodology for evaluation of boundary detection algorithms on
medical images. IEEE Trans Med Imaging 16(5):642–652.
32. Bamford P. 2003. Automating cell segmentation evaluation with annotated examples. In Proceed-
ings of the APRS workshop on digital image computing, (WDCI 2003), pp. 21–25. Available at
http://www.aprs.org.au/wdic2003/CDROM/21.pdf.
33. Shannon C, Weaver W. 1948. The mathematical theory of communication. Urbana: U Illinois P.
34. Yamano T. 2001. Information theory based in nonadditive information content. Entropy 3:280–
292.
35. Kapur JN, Sahoo PK, Wong AKC. 1985. A new method for gray-level picture thresholding using
the entropy of the histogram. Comput Graphics Image Process 29:273–285.
36. Abutaleb AS. 1989. A new method for gray-level picture thresholding using the entropy of the
histogram. Comput Graphics Image Process 47:22–32.
37. Li CH, Lee CK. 1993. Minimum cross-entropy thresholding. Pattern Recognit 26:617–625.
38. Sahoo PK, Soltani S, Wong AKC. 1988. A survey of thresholding techniques. Comput Vis Graph-
ics Image Process 41:233–260.
39. Pun T. 1981. Entropic thresholding: a new approach. Comput Graphics Image Process 16:210–
239.
40. Osher SJ, Sethian JA. 1988. Fronts propagation with curvature dependent speed: Algorithm based
on Hamilton-Jacobi formulations. J Comput Phys 79:12–49.
41. Sethian JA. 1999. Level set methods and fast marching methods: evolving interfaces in compu-
tational geometry, fluid mechanics, computer vision, and materials science, 2nd ed. New York:
Cambridge UP.
42. Malladi R, Sethian JA, Vemuri BC. 1995. Shape modeling with front propagation: a level set
approach. IEEE Trans Pattern Anal Machine Intell 17(2):158–175.
43. McInerney TJ. 1997. Topologically adaptable deformable models for medical image analysis.
PhD dissertation, Department of Computer Science, University of Toronto.
44. ter Haar Romeny BM, Niessen WJ, Viergever MA. 1998. Geodesic deformable models for medical
image analysis. IEEE Trans Med Imaging 17(4):634–641.
45. Sethian JA. 1996. Level set methods: evolving interfaces in geometry, fluid mechanics, computer
vision and materials sciences. New York: Cambridge UP.
46. Suri JS, Liu K, Singh S, Laxminarayan S, Zeng X, Reden L. 2002. Shape recovery algorithms
using level sets in 2-d/3-d medical imagery: a state-of-the-art review. IEEE Trans Inf Technol
Biomed 6(1):8–28.
47. Mauch S. 2000. A fast algorithm for computing the closest point and distance function. Technical
Report, CalTech.
48. Sigg C, Peikert R. 2005. Optimized bounding polyhedra for GPU-based distance transform. In
Proceedings of Dagstuhl seminar 023231 on scientific visualization: extracting information and
knowledge from scientific data sets. Berlin: Springer. Available at
http://graphics.ethz.ch/peikert/papers/dagstuhl03.pdf.
49. Giraldi GA, Rodrigues PS, Marturelli LS, Silva RLS. 2005. Improving the initialization, conver-
gence and memory utilization for deformable models. In Segmentation models, Vol. 1. Part A of
Handbook of Image Analysis, Part A, 2nd ed., pp. 359–414. Berlin: Springer.
50. Columbia object image library (COIL-100). Department of Computer Science, Columbia Uni-
versity. http://www.cs.columbia.edu/CAVE/research/softlib/coil-100.html.
51. Lucas B, Kanade T. 1981. An iterative image registration technique with an application to stereo
vision. In Proceedings of the DARPA Image Understanding Workshop, pp. 121–130. Washington,
DC: US Government Printing Office.
52. Suri J. 1998. Error and shape measurement tools for cardiac projection images: a closer look.
Proceedings of an international conference in applications of patterns recognition, pp. 125–134.
53. Suri J, Haralick RM, Sheehan FH. 1996. Greedy algorithm for error reduction in automatically
produced boundaries from low contrast ventriculogram. Proceedings of the international confer-
ence on pattern recognition (ICPR’96), Vol. 4, pp. 361–365. Washington, DC: IEEE Computer
Society.
54. Suri J, Haralick RM, Sheehan FH. 2000. Greedy algorithm for error reduction in automatically
produced boundaries from low contrast ventriculogram. Int J Pattern Anal Appl 3(1):39–60.
55. Suri J, Setarehdan SK, Singh S. 2002. Advanced algorithmic approaches to medical image seg-
mentation: state-of-the-art applications in cardiology, neurology, mammography and pathology.
Berlin: Springer.
56. Giraldi GA, Oliveira AAF. 1999. Dual-snake model in the framework of simplicial domain decom-
position. Technical poster at the international symposium on computer graphics, image processing
and vision (SIBGRAPI’99), pp. 103–106. Washington, DC: IEEE Computer Society.
57. Giraldi GA, Gonvalvez LMG, Oliveira AAF. 2000. Dual topologically adaptable snakes. In Pro-
ceedings of the fifth joint conference on information sciences (JCIS 2000), third international
conference on computer vision, pattern recognition, and image processing, Vol. 2, pp. 103–106.
Washington, DC: IEEE Computer Society.
58. Giraldi GA, Strauss E, Oliveira AAF. 2000. A boundary extraction method based on dual-t-snakes
and dynamic programming. In Proceedings of the IEEE computer society conference on computer
vision and pattern recognition, (CVPR 2000), p. 1044. Washington, DC: IEEE Computer Society.
59. Giraldi GA, Gonzalves LMG, Oliveira AAF. 2000. Automating cell segmentation evaluation with
annotated examples. In Proceedings of the APRS workshop on digital image computing (WDCI
2000), Vol. 2, pp. 103–107. Washington, DC: IEEE Computer Society.
60. Giraldi GA, Schaefer L, Rodrigues PS. 2005. Gradient vector flow models for boundary extraction
in 2d images. In Proceedings of the 8th international conference on computer graphics and imaging
(IASTED CGIM 2005), p. 2000. Washington, DC: IEEE Computer Society.
61. Giraldi GA, Strauss E, Oliveira AAF. 2000. Boundary extraction approach based on multi-
resolution methods and the t-snakes framework. In Proceedings of the 13th Brazilian symposium
on computer graphics and image processing (SIBGRAPI 2000), pp. 82–89. Washington, DC:
IEEE Computer Society.
15

DEFORMABLE MODEL-BASED IMAGE REGISTRATION

Jundong Liu
Ohio University, Athens, Ohio, USA
In this chapter we introduce the concept of deformable image registration and point out that
interpolation effects, non-rigid body registration, and joint segmentation and registration
frameworks are among the challenges that remain for this area.
To address the interpolation effects challenge, we choose the Partial Volume Interpo-
lator (PV) used in multimodality registration as an example, and quantitatively analyze the
generation mechanism of the interpolation artifacts. We conclude that the combination of
linear interpolation kernel and translation-only motion leads to generation of the artifact
pattern. As a remedy we propose to use nonuniform interpolation functions in estimating
the joint histogram. The cubic B-spline and Gaussian interpolators are compared, and we
demonstrate improvements via experiments on misalignments between CT/MR brain scans.
A segmentation-guided non-rigid registration framework is proposed to address the
second and third challenges. Our approach integrates the available prior shape information
as an extra force to lead to a noise-tolerant registration procedure, and it differs from
other methods in that we use a unified segmentation + registration energy minimization
formulation, and the optimization is carried out under the level-set framework. We show
the improvement accomplished with our model by comparing the results with those of the
Demons algorithm. Exploring other similarity metrics under the same framework to handle
more complicated inputs will be the focus of our future work.
1. INTRODUCTION
Address all correspondence to: Jundong Liu, Professor of Electrical Engineering and Computer Sci-
ence, Stocker 321A, Ohio University, Athens, OH 45701. Phone: 740-593-1603, Fax: 740-593-0007.
[email protected].
cal parameters as the main research areas has seen a tremendous amount of growth
over the past decade. The work described in this chapter is concerned with the
problem of automatically aligning 3D medical images.
Image registration is one of the most widely encountered problems in a variety
of fields, including but not limited to medical image analysis, remote sensing,
satellite imaging, optical imaging, etc. One possible definition of this problem is:
determine the coordinate transformation, or mapping, relating different views of
the same or similar objects. These views may arise from:
Figure 1. Images from different modalities. From left to right: CT, MR, PET, SPECT.
Available online at http://www.isi.uu.nl/Research/Registration/.
DEFORMABLE MODEL-BASED IMAGE REGISTRATION 519
is sensitive to bone density and MR to tissue density, whereas PET and SPECT
depict the physiology.
Figure 2 (from the homepage of the Image Science Institute, Department of
Medical Imaging; available online at http://www.isi.uu.nl/Research/Registration/
registration-frame.html) also shows how multimodal data can be used to provide
a better understanding of the physiology of the human brain aided by the presence
of precise anatomical information. It shows the right and left hemispheres of a
brain, segmented from an MR image. Information from a SPECT scan is overlaid
on the cortex. Color encoding is used to indicate the amount of cerebral blood
perfusion: from light gray for low perfusion, via yellow, to red for high perfusion.
As we can see, the picture shows an area with increased blood perfusion in the
right hemisphere, and this is indicative of pathology in the right hemisphere.
However, multimodality images are usually acquired with different devices,
and at different times, so there will inevitably be some motion between them. This
makes accurate geometrical registration a prerequisite for effective fusion of the
information from multimodality images.
The registration problem is to find the optimal spatial and intensity transfor-
mation so that the images are matched well. Finding the parameters of the optimal
geometric coordinate transformation is generally the key to any registration prob-
lem, while the intensity transformation is not always the task of interest.
The transformations can be classified into global and local transformations.
A global transformation is given by a single equation that maps the entire image.
Local transformations map the image differently depending on spatial location,
and are thus more difficult to express succinctly. The most common global trans-
formations are rigid, affine, and projective transformations.
A transformation is called rigid if the distance between points in the image
being transformed is preserved. A rigid transformation can be expressed as
u(x, y) = (cos(φ) x − sin(φ) y + dx ) − x,
(2)
v(x, y) = (sin(φ) x + cos(φ) y + dy ) − y,
where u(x, y) and v(x, y) denote the displacement at point (x, y) along the X and
Y directions; φ is the rotation angle and (dx , dy ) the translation vector.
A transformation is called affine when any straight line in the first image is
mapped onto a straight line in the second image with parallelism being preserved.
In 2D the affine transformation can be expressed as
u(x, y) = (a11 x + a12 y + dx ) − x,
(3)
v(x, y) = (a21 x + a22 y + dy ) − y,
where

( a11   a12 )
( a21   a22 )   (4)

denotes an arbitrary real-valued matrix. The scaling transformation, which has a
transformation matrix of

( s1   0  )
( 0    s2 ),

and the shearing transformation, which has a matrix of

( 1   s3 )
( 0   1  ),

are two examples of affine transformations, where s1, s2, and s3 are positive
real numbers.
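The displacement fields of Eqs. (2) and (3) can be written out directly (a minimal sketch; function names are ours):

```python
import math

def rigid_displacement(x, y, phi, dx, dy):
    """Eq. (2): displacement (u, v) of point (x, y) under rotation by
    phi and translation by (dx, dy)."""
    u = math.cos(phi) * x - math.sin(phi) * y + dx - x
    v = math.sin(phi) * x + math.cos(phi) * y + dy - y
    return u, v

def affine_displacement(x, y, a11, a12, a21, a22, dx, dy):
    """Eq. (3): displacement under an arbitrary real 2x2 matrix plus a
    translation; rigid, scaling, and shearing are special cases."""
    u = a11 * x + a12 * y + dx - x
    v = a21 * x + a22 * y + dy - y
    return u, v
```

Setting (a11, a12, a21, a22) = (cos φ, −sin φ, sin φ, cos φ) in the affine form recovers the rigid case, which is why the rigid family is a strict subset of the affine family.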
A more interesting case, in general, is that of a planar surface in motion
viewed through a pinhole camera. This motion can be described as a 2D projective
transformation of the plane
u(x, y) = (m0 x + m1 y + m2) / (m6 x + m7 y + 1) − x,
v(x, y) = (m3 x + m4 y + m5) / (m6 x + m7 y + 1) − y,   (5)
different devices, e.g., CT/MR, or the same devices but at different times, usually a
rigid transformation is adequate to explain the variation between them. When two
images depicting the same scene are taken from the same viewing angle but from
different positions, i.e., the camera is zoomed in/out or rotated around its optical
axis, an affine transformation is required to match these two images. In this
chapter we will mainly be dealing with these types of global transformations.
When a global transformation does not adequately explain the relationship
of a pair of input images, a local transformation may be necessary. To register
an image pair taken at different times with some portion of the body that has
experienced growth, or to register two images from different patients, falls into
this local transformation registration category. A motion (vector) field is usually
used to describe the change/displacement in local transformation problem.
and then the resulting transformation is applied to the atlas to obtain seg-
mentation for the input image. The prior information embedded in the atlas
is totally neglected during the registration step. Registration-assisted seg-
mentation and segmentation-guided registration are directions worth pur-
suing.
MI(A, B) = Σ_{a,b} pAB(a, b) log [ pAB(a, b) / (pA(a) · pB(b)) ],   (6)
where pA (a), pB (b), and pAB (a, b) are the marginal probability distributions and
joint probability distribution, respectively. The relationship between MI and en-
tropy is
M I(A, B) = H(A) + H(B) − H(A, B), (7)
with H(A) and H(B) being the entropy of A and B, and H(A, B) their joint
entropy:
H(A) = − Σ_a pA(a) log pA(a),
H(A, B) = − Σ_{a,b} pAB(a, b) log pAB(a, b).   (8)
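Eqs. (6)-(8) translate directly into a joint-histogram estimate (a minimal sketch, not the registration code discussed in this chapter; the bin count and function name are our assumptions):

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """MI(A, B) = H(A) + H(B) - H(A, B), Eq. (7), estimated from the
    joint intensity histogram of two equally sized, aligned images."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = hist / hist.sum()          # joint distribution p_AB(a, b)
    p_a = p_ab.sum(axis=1)            # marginal p_A(a)
    p_b = p_ab.sum(axis=0)            # marginal p_B(b)

    def entropy(p):
        p = p[p > 0]                  # 0 * log 0 is taken as 0
        return -np.sum(p * np.log(p))

    return entropy(p_a) + entropy(p_b) - entropy(p_ab.ravel())
```

A constant image carries no information about the other (MI = 0), while an image compared with itself attains MI = H(A), which is what makes MI maximization a sensible alignment criterion.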
Figure 3. PV interpolation.
Figure 4. The mutual information value of a pair of multimodal images. Row 1 contains a
pair of MR/CT images (available online at http://www.isi.uu.nl/Research/Registration/).
Rows 2 and 3 show the mutual information (MI), marginal entropies (H(A) and H(T(B))),
and joint entropy (H(A, T(B))) values as functions of translation t (up to ±7 pixels).
Overall, the combined effects of the moving-in and moving-out sets lead to a
change in his(a, b). So we have

A2 − A1 = M2 − M1.   (10)

At translation t,

his(a, b, t) = M1 + A2 t − A1 t = M1 + (M2 − M1) t = t M2 + (1 − t) M1.   (11)

Writing f(t) = his(a, b, t), this means

f(0) = M1,   f(1) = M2,   f(t) = t f(1) + (1 − t) f(0).   (12)
The above inequality indicates that each component of H(A, T(B)) is a convex
function within [0, 1]. Since the summation of convex functions is still a convex
function, H(A, T(B)) = − Σ_{a,b} g(his(a, b, t)), as a negative combination of a
certain number of convex functions, is a concave function in [0, 1]. Correspondingly,
the MI responses are a convex function in the same interval. This property can
easily be extended to any interval [n, n + 1], where n is an integer. That is why
the responses of H(A, T(B)) as a function of translation t (Figure 4) bear a
concave-shaped artifact within each integer interval.
If we take a closer look at the above analysis, we can find that the heart of the
artifact generation mechanism lies in the following facts:
The artifact effect for pure rotations would be less severe than that of pure
translations. This is because the moving-in and moving-out grids, under
the pure rotation motion scenario, do not contribute to the change in the
histogram at uniform rate. Figure 5a shows the H(A, T (B)) values as
a function of rotations (up to ±15◦ ). As is evident, the responses for
rotations are much smoother than the translation counterpart.
In the past years, several remedies for reducing artifact effects have been pro-
posed [12, 13] that either rely on resampling one of the input images into different
grid sizes, or applying higher-order functions as the interpolation kernel. Although
a number of impressive results have been reported, we believe the analysis given in
previous sections can provide a deeper insight into which kernel should be chosen
and how it will work.
When a bilinear function is used as the PV interpolation kernel, all the rel-
evant grids contribute to the change in a histogram bin value at a synchronized
pace. As we mentioned earlier, “to break the synchronization” is the key to reduc-
ing/removing interpolation artifacts. For a new interpolation kernel to be chosen
to avoid this translation-caused synchronization, two desired properties have to be
satisfied:
e.g., 16 points (filter width = 4) are used to update the histogram bin
values, and due to the consistency of image intensities, synchronization is
still likely to occur.
The Gaussian function used here also has a support of 4, and its standard
deviation σ is set at 1. The kernel function is given as

h_Gaussian(x) = [1 / (√(2π) σ)] exp(−x² / (2σ²)).   (15)
The superlinear function is an interpolator built only for the purpose of veri-
fying the second claim: a uniform function should not be used as the kernel. This
filter has a support width of 4, which is broader than the bilinear kernel; however,
it is still a uniform function:

h_superlinear(x) = 1/2 − |x|/4 for 0 ≤ |x| ≤ 2, and 0 elsewhere.   (16)
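Both kernels can be coded directly (our sketch of Eqs. (15)-(16); the cubic B-spline kernel compared in the text is omitted, and the function names are ours):

```python
import math

def h_gaussian(x, sigma=1.0):
    """Eq. (15): Gaussian interpolation kernel, sigma = 1, support 4."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)

def h_superlinear(x):
    """Eq. (16): uniform 'superlinear' kernel with support width 4."""
    ax = abs(x)
    return 0.5 - ax / 4.0 if ax <= 2.0 else 0.0
```

Note that the superlinear kernel decays at the same constant rate everywhere inside its support, which is exactly the uniformity the text identifies as the cause of the remaining artifacts.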
Figure 6. Three interpolation kernels. Left: cubic B-spline function. Middle: Gaussian
function. Right: superlinear function.
dislocated, which implies that the ultimate registration estimate can still be very
accurate even though both the cubic B-spline and the Gaussian are "blurring" filters.
The right-bottom subfigure shows the registration response for the superlinear
interpolator. Compared with the narrower-support bilinear function, the superlinear
interpolator brings more points into the interpolation procedure and smooths it; thus,
the artifact pattern becomes less severe. However, as the superlinear interpolator
is still a uniform kernel, the resulting response pattern is far from artifact-free.
Figure 7. Mutual information responses with respect to translations along the x-axis. Left-
top: values for bilinear interpolator. Right-top: for cubic B-spline function. Left-bottom:
for Gaussian function. Right-bottom: for superlinear function.
accurate. Note that Powell’s method was used as the optimization scheme in these
experiments.
Figure 8 depicts an example of the registration results. Figure 8a is the ref-
erence CT image. Figures 8b,c show the floating MR images, before registration
and after registration, respectively. An edge map of the reference CT image is
superimposed on the transformed floating image. As is evident, the registration is
visually quite accurate.
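The similarity measure optimized in these experiments (here by Powell's method) can be estimated from a joint intensity histogram. A minimal sketch follows, using plain binning rather than the PV interpolation discussed above; the function name and bin count are illustrative:

```python
import numpy as np

# Minimal sketch: mutual information between two images, estimated
# from a joint intensity histogram. In the chapter's experiments the
# histogram is filled via PV interpolation during subpixel
# transformations; here plain binning is used for clarity.

def mutual_information(img1, img2, bins=32):
    hist, _, _ = np.histogram2d(img1.ravel(), img2.ravel(), bins=bins)
    pxy = hist / hist.sum()                  # joint probability p(a, b)
    px = pxy.sum(axis=1, keepdims=True)      # marginal p(a)
    py = pxy.sum(axis=0, keepdims=True)      # marginal p(b)
    nz = pxy > 0                             # skip empty bins: 0*log(0) = 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```

An image registered to itself maximizes this value; misalignment disperses the joint histogram and lowers it, which is what the optimizer exploits.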
In this section we address the second and third challenges pointed out in Sec-
tion 1.2: how to integrate segmentation and registration into a unified procedure
so that the prior information embedded in both processes can be better utilized.
Figure 8. Example of the registration results. The left-most is the reference CT im-
age; the middle is the floating MR image prior to registration, and the right-most is the
MRI after registration. Images were obtained from the homepage of the Image
Sciences Institute at the University Medical Center Utrecht, and are available online at
http://www.isi.uu.nl/Research/Registration/registration-frame.html.
532 JUNDONG LIU
Registration and segmentation are the two most fundamental problems in the
field of medical image analysis. Traditionally, they were treated as separate prob-
lems, each having numerous solutions proposed in the literature. In recent years,
the notion of integrating segmentation and registration into a unified procedure
has gained great popularity, partly because many practical problems, e.g.,
atlas-based segmentation, subsume both segmentation and registration
components.
Yezzi et al. [17] pointed out the interdependence present in many segmentation
and registration solutions, and proposed a novel geometric, variational framework
that minimizes an overall energy functional involving both pre- and post-image
regions and the registration parameters. Geometric parameters and contour
positions were updated simultaneously in each iteration, and segmentations
were obtained from the final contour and its transformed counterpart. While this
model and its variants [18, 19] are enlightening and pave a promising way toward
unifying registration and segmentation, their applicability range is either limited
to relatively simple deformation types (rigid/affine) [17], or to relatively simple
input images [18, 19].
Vemuri et al. [20] proposed a segmentation + registration model to solve
the atlas-based image segmentation problem where the target image is segmented
through registration of the atlas to the target. A novel variational formulation was
presented that places the segmentation and registration processes under a unified
variational framework. Optimization is achieved by solving a coupled set of nonlinear
PDEs.
Another segmentation + registration model, proposed by Wyatt and Noble [21],
seeks the best possible segmentation and registration from the maximum a posteriori
point of view. Improvements in accuracy and robustness for both registration and
segmentation have been shown, and potential applications identified. This model
is primarily designed for combining segmentation and rigid registration. While a
non-rigid version was also implemented, its motion field estimation was based
on block matching with 7 × 7 blocks, which is not dense enough for most non-rigid
registration applications.
against input image noise. We present several 2D examples on synthetic and real
data in the implementation results section.
where Ω is the image domain and V(X) denotes the deformation field. λ1, λ2,
and λ3 are three constant parameters that weight the importance of each term in
the optimization energy. The $\int_\Omega [I_1(X) - I_2(X+V(X))]^2\,dX$ term provides the
main force for matching the two images, while the terms $\int_\Omega [I_2(X+V(X)) - C_1]^2\,dX$
and $\int_\Omega [I_2(X+V(X)) - C_2]^2\,dX$ allow the prior segmentation to exert its influence,
aiming to enforce the homogeneity constraints. $\int_\Omega \|\nabla V(X)\|^2\,dX$ is a diffusion
term that smooths the deformation field.
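A minimal discrete sketch of the energy described above follows, using a sharp Heaviside and finite differences; `I2w` stands for the warped image I2(X + V(X)), whose computation is omitted, and all names are illustrative:

```python
import numpy as np

# Minimal discrete sketch of the segmentation-guided registration
# energy: a data term, two homogeneity terms gated by the Heaviside
# of the level set phi, and a diffusion term on the deformation
# field V (shape H x W x 2). I2w is the precomputed warped image.

def energy(I1, I2w, phi, V, C1, C2, lam1, lam2, lam3):
    H = (phi >= 0).astype(float)                   # sharp Heaviside H(phi)
    data = np.sum((I1 - I2w) ** 2)                 # main matching force
    homog = lam1 * np.sum(H * (I2w - C1) ** 2) \
          + lam2 * np.sum((1 - H) * (I2w - C2) ** 2)
    grads = np.gradient(V[..., 0]) + np.gradient(V[..., 1])  # list concat
    smooth = lam3 * sum(np.sum(g ** 2) for g in grads)       # diffusion term
    return data + homog + smooth
```

Gradient descent on this sum (with respect to V) yields the update direction given in Eq. (20).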
$$
\frac{\partial E}{\partial V}
= 2\bigl(I_1(X) - I_2(X+V)\bigr)\bigl(-\nabla I_2(X+V)\bigr)
+ 2\lambda_1 \bigl(I_2(X+V) - C_1\bigr)\,\nabla I_2(X+V)\,H(\phi(X))
+ 2\lambda_2 \bigl(I_2(X+V) - C_2\bigr)\,\nabla I_2(X+V)\,\bigl(1 - H(\phi(X))\bigr)
+ \lambda_3 \nabla^2 V,
\tag{20}
$$
where

$$
C_1 = \frac{\int_\Omega I_2(X+V)\,H(\phi(X+V))\,dX}{\int_\Omega H(\phi(X+V))\,dX},
\tag{21}
$$

$$
C_2 = \frac{\int_\Omega I_2(X+V)\,\bigl(1 - H(\phi(X+V))\bigr)\,dX}{\int_\Omega \bigl(1 - H(\phi(X+V))\bigr)\,dX}.
\tag{22}
$$
The level set function used here is φ(X, 0) = D(X), where D(X) is the
signed distance from each grid point to the zero level set C. This procedure is standard,
and we refer the reader to [23] for details.
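Equations (21) and (22) reduce to Heaviside-weighted means of the warped image inside and outside the zero level set. A minimal sketch, assuming a sharp Heaviside (names are illustrative):

```python
import numpy as np

# Sketch of Eqs. (21)-(22): C1 and C2 are the mean intensities of the
# warped image I2(X + V) inside and outside the zero level set of phi,
# computed with a sharp Heaviside. phi_w stands for phi(X + V).

def region_means(I2w, phi_w, eps=1e-12):
    H = (phi_w >= 0).astype(float)            # H(phi(X + V))
    C1 = np.sum(I2w * H) / (np.sum(H) + eps)
    C2 = np.sum(I2w * (1.0 - H)) / (np.sum(1.0 - H) + eps)
    return C1, C2
```

In practice these two constants are recomputed at each iteration as the deformation field evolves.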
Figure 9. Registration results for image set 1. First row: (a) the fixed image, (b) the
moving image. Second row: registration result of (c) using the Demons algorithm, and (d)
using our segmentation-guided registration model. The edge map from the fixed image is
superimposed. See attached CD for color version.
of noise existing in the images. However, the registration result generated by
our model is quite accurate, which indicates that the integrated segmentation
information is very helpful in pulling the moving image toward the correct match.
We designed and carried out a similar experiment on a pair of MRI brain slices.
The images were obtained from the Davis-Mills Magnetic Resonance Imaging and
Spectroscopy Center (MRISC) at the University of Kentucky. The two slices have
substantial disparity in the shape of the ventricles, which is the region of interest.
Figure 10. Registration results for image set 2. The images were obtained from the Davis-
Mills Magnetic Resonance Imaging and Spectroscopy Center (MRISC) at the University of
Kentucky. First row: (a) fixed image, (b) moving image. Second row: registration result of
(c) using the Demons algorithm, and (d) using our segmentation-guided registration model.
See attached CD for color version.
Figure 10 shows the images and results. Figures 10a,b show the fixed and moving
images, respectively. Figures 10c,d depict the results with the Demons algorithm
(10c) and our segmentation-guided registration model (10d). As can be seen, the
former model fails to transform the ventricle area into the desired position, while
the latter accurately achieves the registration goal.
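The Demons baseline compared above follows Thirion's diffusion analogy [26]. A minimal sketch of one force computation, assuming the classic normalized-gradient force (function names are illustrative; in practice the accumulated field is also smoothed with a Gaussian at each iteration):

```python
import numpy as np

# One Thirion "demons" force step:
# u = (m - f) * grad(f) / (|grad(f)|^2 + (m - f)^2),
# where f is the fixed image and m the currently warped moving image.

def demons_force(fixed, moving_warped, eps=1e-12):
    gy, gx = np.gradient(fixed)                    # image gradient of f
    diff = moving_warped - fixed                   # intensity mismatch m - f
    denom = gx ** 2 + gy ** 2 + diff ** 2 + eps    # stabilized denominator
    return np.stack([diff * gy / denom, diff * gx / denom], axis=-1)
```

Because this force depends only on local intensity differences, it is easily misled by noise and large structural disparity, which is consistent with the failures seen in Figures 9c and 10c.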
What would be the best (in terms of accuracy) interpolation scheme for
Mutual-Information-based image registration?
What are the artifact patterns for other popular similarity measures, such
as the Correlation Ratio (CR) and Local Correlation (LC)?
5. REFERENCES
1. Maintz JBA, van den Elsen PA, Viergever MA. 1996. Comparison of edge-based and ridge-based
registration of CT and MR brain images. Med Image Anal 1(2):151–161.
2. Haacke E, Liang Z-P. 2000. Challenges of imaging structure and function with MRI. IEEE Eng
Med Biol Mag 19(5):55–62.
3. Barron JL, Fleet DJ, Beauchemin SS. 1994. Performance of optical flow techniques. Int J Comput
Vision 1(12):43–77.
4. Cover TM, Thomas JA. 1991. Elements of information theory. New York: John Wiley and Sons.
5. Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P, Marchal G. 1995. Automated mul-
timodality image registration using information theory. In Proceedings of the 14th international
conference on information processing in medical imaging (IPMI), pp. 263–274. Ed YJC Bizais.
Washington, DC: IEEE Computer Society.
6. Viola PA, Wells WM. 1995. Alignment by maximization of mutual information. In Proceedings
of the fifth international conference on computer vision (ICCV’95), pp. 16–23. Washington, DC:
IEEE Computer Society.
7. Wells WM, Viola P, Atsumi H. 1997. Multi-modal volume registration by maximization of mutual
information. Med Image Anal 1(1):35–51.
8. Studholme C, Hill D, Hawkes D. 1999. An overlap invariant entropy measure of 3D medical
image alignment. Pattern Recognit 32:71–86.
9. Ji X, Pan H, Liang ZP. 1999. A region-based mutual information method for image registration. In
Proceedings of the 7th international meeting of the international society for magnetic resonance
in medicine, Vol. 3, p. 2193. Washington, DC: IEEE Computer Society.
10. Maintz JB, Viergever MA. 1998. Survey of medical image registration. Med Image Anal 2:1–36.
11. Pluim J, Maintz J, Viergever M. 2000. Interpolation artefacts in mutual information-based image
registration. Comput Vision Image Understand 77:211–232.
12. Tsao J. 2003. Interpolation artifacts in multimodality image registration based on maximization
of mutual information. IEEE Trans Med Imaging 22(7):845–864.
13. Chen H, Varshney PK. 2003. Mutual information-based CT-MR brain image registration using
generalized partial volume joint histogram estimation. IEEE Trans Med Imaging 22(9):1111–
1119.
14. Liu J, Wang Y, Liu J. 2006. A unified framework for segmentation-assisted image registration. In
Proceedings of the 7th Asian conference on computer vision. Lecture notes in computer science,
Vol. 3852, pp. 405–414. Berlin: Springer.
15. Liu J, Wei M, Liu J. 2004. Artifact reduction in mutual-information-based CT-MR image
registration. In Proceedings of the SPIE, Vol. 5370, pp. 1176–1186. Medical imaging 2004:
physiology, function, and structure from medical images. Ed AA Amini, A Manduca. Bellingham,
WA: International Society for Optical Engineering.
16. Lehmann TM, Gonner C, Spitzer K. 1999. Survey: interpolation methods in medical image
processing. IEEE Trans Med Imaging 18(11):1049–1075.
17. Yezzi A, Zollei L, Kapur T. 2003. A variational framework for joint segmentation and registration.
Med Image Anal 7(2):171–185.
18. Unal G, Slabaugh G, Yezzi A, Tyan J. 2004. Joint segmentation and non-rigid registration without
shape priors. Siemens Technical Report SCR-04-TR-7495.
19. Unal G, Slabaugh G. 2005. Coupled PDEs for non-rigid registration and segmentation. In Pro-
ceedings of the IEEE computer society conference on computer vision and pattern recognition
(CVPR’05), Vol. 1, pp. 168–175. Washington, DC: IEEE Computer Society.
20. Vemuri B, Chen Y, Wang Z. 2002. Registration-assisted image smoothing and segmentation.
In Proceedings of the 7th European conference on computer vision, Part IV. Lecture notes in
computer science, Vol. 2353, pp. 546–559. Berlin: Springer.
21. Wyatt PP, Noble JA. 2003. MAP/MRF joint segmentation and registration of medical images.
Med Image Anal 7:539–552.
22. Chan T, Vese L. 2001. An active contour model without edges. IEEE Trans Image Process
10(2):266–277.
23. Sussman M, Smereka P, Osher S. 1994. A level set approach for computing solutions to incom-
pressible two-phase flow. J Comput Phys 114:146–159.
24. Wang Y, Staib LH. 2000. Physical model-based non-rigid registration incorporating statistical
shape information. Med Image Anal 4(1):7–20.
25. Simulated brain database. Available at http://www.bic.mni.mcgill.ca/brainweb/.
26. Thirion JP. 1998. Image matching as a diffusion process: an analogy with Maxwell's demons.
Med Image Anal 2(3):243–260.
27. Fischer B, Modersitzki J. Image registration using variational methods. Paper presented
at Visual Analysis, Image-Based Computing and Applications Workshop. Available at
http://64.233.161.104/search?q=cache:EyP6NVf6DhsJ:www.icm.edu.pl/visual/moderzitski.ps+
28. Leventon M, Grimson WEL. 1999. Multi-modal volume registration using joint intensity dis-
tributions. In Lecture notes in computer science, Vol. 1496, pp. 1057–1066. Berlin: Springer.
Available at http://www.ai.mit.edu/people/leventon/Research/9810-MICCAI-Reg/.
29. Likar B, Pernus F. 2001. A hierarchical approach to elastic registration based on mutual
information. Image Vision Comput 19:33–44.
30. Vemuri BC, Huang S, Sahni S, Leonard CM, Mohr C, Gilmore R, Fitzsimmons J. 1998. An
efficient motion estimator with application to medical imaging. Med Image Anal 2(1):79–98.
31. Woods R, Maziotta J, Cherry S. 1993. MRI-PET registration with automated algorithm. J Comput
Assist Tomogr 17:536–546.
32. Zhu S, Yuille A. 1996. Region competition: unifying snakes, region growing, and Bayes/MDL
for multiband image segmentation. IEEE Trans Pattern Anal Machine Intell 18(9):884–900.
INDEX
  and level set representation, 67–68
  penalizing area of, 75
Shape based deformable models, 91–129
Shape-based inversion, 71–73
Shape-based segmentation, 298–300
Shape deformation
  controlling, 397–420
  and level set techniques, 68–70
  2D physics-based, 409–418
  using medial-based operators, 401–403
Shape density distribution, 363
Shape model, 197, 358, 398, 482
  construction of, 304–306
  in Dual-T-Snakes, 211–212
  initialization of, 176–178
  new instance of, 176
Shape positioning in three-dimensional active shape model (ASM), 175
Shape probability density function (PDF), 363
Shape profile by hierarchical regional PCA, 403–405
Shape reconstruction from medial profiles, 399–401
Shape recovery, 207
Shape representation
  medial profiles for, 399–400
  using medial patches, 405–409
Shape variations
  modeling, 352–354
  in point distribution model, 173
  in statistically constrained snakes, 364–365
Shaped-based reconstruction in computerized tomography, 67
Shearing transformation, 520
Shifted Grid Fast Marching method (SGFM), 236, 252
Signal concentration, 27
  threshold for quorum-sensing, 7, 24
Signal drops in Gated Single Positron Emission Computer Tomography (Gated-SPECT), 167
Signal-to-noise ratio, 479–480
Signaling molecule in P. aeruginosa, 6–7, 24
Signals in bacterial biofilms, 2, 24
Signed distance, 304
Signed-distance function, 84, 490
Signed distance map, 304–305, 307, 313
Similarity criteria, 522
Similarity transformations in deformable organisms, 414–415
Simple distance transform algorithm, 37–38, 53–57
SimpleList distance transform algorithm, 38–39, 49–50, 53–57
Simplex Meshes, 346
Single photon emission computed tomography, 518–519
  and statistical deformable models, 163–190
Singular points, 204
Singular-value decomposition (SVD), 212
Skeleton by Influence Zones (SKIZ), 263–265
Skeletonization of images, 35
Skin-air boundary information, 134, 136, 140–141
Smart snakes, 351–363
Smooth surface and boundary noise, 467–469
Smoothing force, 199
Smoothing in breast imaging, 151–153, 160
Snake model, 195–196, 303
  for breast contour determination, 135, 141
Snakes, 335–336
  adaptive inflation, 340
  adaptive subdivision scheme, 340–341
  classical, 336–337
  color images, 342–344
  discretization and numerical simulation of, 339
  drawbacks of, 337–338
  inflation force, 339–340
  and optical flow field, 346–350
  statistically constrained, 363–372
  and user interaction, 341–342
Snaxels, 199–201, 214
Sobel filter, 114–115
Sorted heap, 236–237
Source image, 533
Source points in fast marching method, 246, 248
Spatial coordinate transformation, 519–520
Spatial image derivatives, 349
Spatiotemporal shape segmentation algorithm, 376–380
Spatiotemporal shapes, 372–383
  alignment, 374
  estimating and reiterating, 379–380
  and gray-level training, 376
  model representation, 375–376
  statistical variation, 373–376
  variation modes, 375
Speckle noise, 498
SPECT. See Single photon emission computed tomography
Speed function, 18–21, 203–206, 246, 251–252, 489–491
  derivation of, 450–453
  in geodesic topographic distance transform, 267
  in kidney image analysis, 325
Speed wave, 461
Spiral shape and medial curve, 472