2014 - Methods For Nuclei Detection, Segmentation, and Classification in Digital Histopathology A Review-Current Status and Future Poten
7, 2014
Abstract: Digital pathology represents one of the major evolutions in modern medicine. Pathological examinations constitute the
gold standard in many medical protocols, and also play a critical
and legal role in the diagnosis process. In the conventional cancer diagnosis, pathologists analyze biopsies to make diagnostic and
prognostic assessments, mainly based on the cell morphology and
architecture distribution. Recently, computerized methods have
been rapidly evolving in the area of digital pathology, with growing
applications related to nuclei detection, segmentation, and classification. In cancer research, these approaches have played, and
will continue to play a key (often bottleneck) role in minimizing
human intervention, consolidating pertinent second opinions, and
providing traceable clinical information. Pathological studies have
been conducted for numerous cancer detection and grading applications, including brain, breast, cervix, lung, and prostate cancer
grading. Our study presents, discusses, and extracts the major
trends from an exhaustive overview of various nuclei detection,
segmentation, feature computation, and classification techniques
used in histopathology imagery, specifically in hematoxylin-eosin
and immunohistochemical staining protocols. This study also enables us to measure the challenges that remain, in order to reach
robust analysis of whole slide images, essential high content imaging with diagnostic biomarkers and prognosis support in digital
pathology.
Index Terms: Digital pathology, histopathology, microscopic
analysis, nuclei classification, nuclei detection, nuclei segmentation, nuclei separation.
TABLE I
DESCRIPTION OF NOTATION
Fig. 1.
IRSHAD et al.: METHODS FOR NUCLEI DETECTION, SEGMENTATION, AND CLASSIFICATION IN DIGITAL HISTOPATHOLOGY
Fig. 2. Different types of nuclei. (a) LN. (b) EN. (c) EN (Cancer). (d) EN
(Mitosis).
40 magnifications. The output of the digital scanners is multilayered images, stored in a format that enables fast zooming
and panning.
For illumination, a uniform light spectrum is used to highlight the tissue slide. The microscope setup, sample thickness,
appearance, and staining may cause uneven illumination. In addition, most camera technologies have low response to short
wavelength (blue) illumination and have a high sensitivity at
long wavelength (red to infrared) regions. To reduce these differences in illumination, most slide scanners provide standard
packages to normalize and correct spectral and spatial illumination variations. To address the problem of color nonstandardness, Monaco et al. [32] presented a robust Bayesian color segmentation algorithm that dynamically estimates the probability
density functions describing the color and spatial properties of
salient objects.
III. CHALLENGES IN NUCLEI SEGMENTATION
AND CLASSIFICATION
Among the different types of nuclei, two types are usually the
object of particular interest: lymphocyte and epithelial nuclei.
Nuclei may look very different according to a number of factors such as nuclei type, malignancy of the disease, and nuclei
life cycle. A lymphocyte is a type of white blood cell that has a
major role in the immune system. Lymphocyte nuclei (LN) are
inflammatory nuclei having regular shape and smaller size than
epithelial nuclei (EN) [see Fig. 2(a)]. Nonpathological EN have
nearly uniform chromatin distribution with smooth boundary
[see Fig. 2(b)]. In high-grade cancer tissue, EN are larger in
size, may have heterogeneous chromatin distribution, irregular
boundaries, referred to as nuclear pleomorphism, and clearly
visible nucleoli as compared to normal EN [see Fig. 2(c)]. The
variation in nuclei shape, size, and texture over the nuclear life
cycle, as seen in mitotic nuclei (MN), is another source of complexity [see
Fig. 2(d)].
Automated nuclei segmentation is now a well-studied topic
for which a large number of methods have been described in
the literature and new methodologies continue to be investigated. Detection, segmentation, and classification of nuclei in
routinely stained histopathological images pose a difficult computer vision problem due to high variability in images caused by
a number of factors including differences in slide preparation
(dye concentration, evenness of the cut, presence of foreign
artifacts or damage to the tissue sample, etc.) and image acquisition (artifacts introduced by the compression of the image,
presence of digital noise, specific features of the slide scanner,
etc.). Furthermore, nuclei are often organized in overlapping
The basic effect of the erosion (dilation) operator on an image is to shrink (enlarge) the boundaries of foreground pixels.
Two other major operations in morphology are opening and
closing. Opening is an erosion of an image followed by a dilation; it eliminates small objects and sharpens peaks in the object.
Opening is mathematically defined as
$$U_f \circ S = [U_f \ominus S] \oplus S. \tag{4}$$
The predicates $Pr$ that are often used relate to gray level (average intensity
and variance), color, texture, and shape.
4) Watershed: Watershed is a segmentation method that usually starts from specific pixels called markers and gradually
floods the regions surrounding the markers, called catchment
basins, by treating pixel values as a local topography. Catchment
basins are separated topographically from adjacent catchment
basins by maximum altitude lines called watershed lines. The method classifies every point of a topographic surface as either
belonging to the catchment basin associated with one of the
local minima or to the watershed line. Details about watershed
can be found in [42]. The basic mathematical definition contains the
lower slope $LS(i)$, which is the maximum slope connecting pixel $i$
in the image $I$ to its neighbors of lower altitude:
$$LS(i) = \max_{j \in N(i)} \frac{I(i) - I(j)}{d(i, j)} \tag{9}$$
where $N(i)$ is the set of neighbors of pixel $i$ and $d(i, j)$ is the Euclidean
distance between pixels $i$ and $j$. In case of $i = j$, the lower
slope is forced to be zero. The cost of moving from pixel $i$ to $j$
is defined as
$$\mathrm{cost}(i, j) = \begin{cases} LS(i)\, d(i, j), & \text{if } I(i) > I(j) \\ LS(j)\, d(i, j), & \text{if } I(i) < I(j) \\ \dfrac{LS(i) + LS(j)}{2}\, d(i, j), & \text{if } I(i) = I(j). \end{cases} \tag{10}$$
The topographical distance between the two pixels $i$ and $j$ is
expressed as
$$T(i, j) = \min_{(i_0, \ldots, i_t) \in \Pi} \sum_{k=0}^{t-1} \mathrm{cost}(i_k, i_{k+1}) \tag{11}$$
where $\Pi$ is the set of all paths from $i$ to $j$. The watershed transformation is usually computed on the gradient image instead of
the intensity image.
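The lower slope (9) and the step cost (10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is a small list of lists, the neighborhood N(i) is taken to be 8-connected, the cost uses the usual three-case form from the topographic-distance literature, and the function names `neighbors`, `lower_slope`, and `cost` are hypothetical.

```python
import math

def neighbors(shape, i):
    """Yield the 8-connected neighbors of pixel i = (row, col)."""
    rows, cols = shape
    r, c = i
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0) and 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield (r + dr, c + dc)

def lower_slope(image, i):
    """Equation (9): steepest descent from i to a lower neighbor (0 if none)."""
    shape = (len(image), len(image[0]))
    best = 0.0
    for j in neighbors(shape, i):
        if image[j[0]][j[1]] < image[i[0]][i[1]]:
            slope = (image[i[0]][i[1]] - image[j[0]][j[1]]) / math.dist(i, j)
            best = max(best, slope)
    return best

def cost(image, i, j):
    """Equation (10), assumed three-case form: cost of stepping from i to j."""
    d = math.dist(i, j)
    vi, vj = image[i[0]][i[1]], image[j[0]][j[1]]
    if vi > vj:
        return lower_slope(image, i) * d
    if vi < vj:
        return lower_slope(image, j) * d
    return 0.5 * (lower_slope(image, i) + lower_slope(image, j)) * d
```

Summing `cost` along a path and minimizing over paths would give the topographical distance (11); a shortest-path search such as Dijkstra's algorithm is one plausible way to do that.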
5) Active Contour Models and Level sets: Active contour
models (ACMs) or deformable models, widely used in image
segmentation, are deformable splines that can be used to depict
the contour of objects in an image using gradient information by
seeking to minimize an energy function [43]. In case of nuclei
segmentation, the contour points that yield the minimum energy
level form the boundary of nuclei. The energy function is often
defined to penalize discontinuity in the curve shape and gray-level discontinuity along the contour [12]. The general ACM is
defined using the energy function $E$ over the contour points $c$ as
$$E = \int \left( \alpha E_{\mathrm{Int}}(c) + \beta E_{\mathrm{Img}}(c) + \gamma E_{\mathrm{Ext}}(c) \right) dc \tag{12}$$
3) Region Growing: Region growing [41] is an image segmentation method consisting of two steps. The first step is the
selection of seed points, and the second step is a classification
of neighboring pixels to determine whether those pixels should
be added to the region or not by minimizing a cost function.
Let $Pr(I_i)$ be a logical predicate that measures the similarity
of a region $I_i$. The segmentation results in a partition of $I$ into
regions $(I_1, I_2, \ldots, I_n)$, so that the following conditions hold:
1) $Pr(I_i) = \text{TRUE}$ for all $i = 1, 2, \ldots, n$;
2) $Pr(I_i \cup I_j) = \text{FALSE}$, $\forall I_i, I_j$ $(i \neq j)$ adjacent regions.
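The two steps of region growing can be illustrated with a short Python sketch. It is a simplified example, not a method from the cited works: the similarity predicate is assumed to be "the pixel value is within `tol` of the current region mean", the neighborhood is 4-connected, and the name `region_grow` is hypothetical.

```python
from collections import deque

def region_grow(image, seed, tol):
    """Grow a 4-connected region from seed; a pixel joins when its value is
    within tol of the running region mean (the assumed similarity predicate)."""
    rows, cols = len(image), len(image[0])
    region = {seed}
    total = float(image[seed[0]][seed[1]])
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                mean = total / len(region)
                if abs(image[nr][nc] - mean) <= tol:
                    region.add((nr, nc))
                    total += image[nr][nc]
                    frontier.append((nr, nc))
    return region
```

For a dark 2 x 2 block embedded in a bright background, calling `region_grow` with a seed inside the block and a small tolerance returns exactly the four block pixels.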
where $E_{\mathrm{Int}}$ controls the shape and length of the contour (often
called internal energy), $E_{\mathrm{Img}}$ influences adjustment of local
parts of the contour to the image values regardless of the contour
geometry (referred to as image energy), and $E_{\mathrm{Ext}}$ is the user-defined force or prior knowledge of the object used to control the contour
(referred to as external energy). $\alpha$, $\beta$, and $\gamma$ are empirically derived
constants.
The black top-hat transform is defined as the difference between image $I$ and its closing as
$$T_b(I) = U_f - [U_f \bullet S]. \tag{7}$$
There are two main forms of ACMs. An explicit parametric representation of the contour, called snakes, is robust to
image noise and boundary gaps as it constrains the extracted
boundaries to be smooth. However, when contours split or merge, snakes have limited topological adaptability. Alternatively, the implicit ACM, called level sets,
is specifically designed to handle topological changes, but it
is not robust to boundary gaps and has other deficiencies as
well [44]. The basic idea is to determine level curves from a
potential function.
6) K-means Clustering: The K-means clustering [45] is an
iterative method used to partition an image into K clusters. The
basic algorithm is as follows.
1) Pick K cluster centers, either randomly or based on some
heuristic.
2) Assign cluster label to each pixel in the image that minimizes the distance between the pixel and the cluster center.
3) Recompute the cluster centers by averaging all the pixels
in the cluster.
4) Repeat steps 2) and 3) until convergence is attained or no
pixel changes its cluster.
The distance is typically based on the pixel value, texture,
and location, or a weighted combination of these factors. The
robustness of the method depends mainly on the initialization of the clusters.
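The four steps above can be sketched for scalar pixel intensities as follows. This is a minimal illustration that assumes the distance is the absolute intensity difference and that the initial centers are supplied by the caller; the name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Cluster scalar pixel values; centers is the initial guess (step 1)."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each value to the nearest center.
        new_labels = [min(range(len(centers)), key=lambda k: abs(v - centers[k]))
                      for v in values]
        if new_labels == labels:      # Step 4: no pixel changed its cluster.
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its members.
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:               # keep the old center if a cluster is empty
                centers[k] = sum(members) / len(members)
    return centers, labels
```

Because the result depends on the starting centers, running the sketch with several initializations and keeping the partition with the lowest within-cluster distance is a common precaution.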
7) Probabilistic Models: Probabilistic models can be
viewed as an extension of K-means clustering. Gaussian mixture
models (GMMs) are a popular parametric probabilistic model
represented as weighted sum of Gaussian cluster densities. The
image is modeled according to the probability distribution
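$$P(I(i)) = \sum_{k=1}^{K} w_k\, N(I(i) \mid \mu_k, \sigma_k^2) \tag{13}$$
The $w_k$ are positive real values such that $\sum_{k=1}^{K} w_k = 1$.
The parameters of the GMM are estimated from training data
using a computation method such as expectation maximization
(EM) [46], which iteratively finds the maximum likelihood estimate. EM
is based on the following four steps.
1) Initialization: The parameters $\mu_k^{(0)}$, $\sigma_k^{2\,(0)}$, and $w_k^{(0)}$ are
randomly initialized for each cluster $C_k$.
2) Expectation: For each pixel $I(i)$ and cluster $C_k$, the conditional probability $P(C_k \mid I(i))$ is computed as
$$P(C_k \mid I(i))^{(t)} = \frac{w_k^{(t)}\, N(I(i) \mid \mu_k^{(t)}, \sigma_k^{2\,(t)})}{\sum_{j=1}^{K} w_j^{(t)}\, N(I(i) \mid \mu_j^{(t)}, \sigma_j^{2\,(t)})}. \tag{14}$$
3) Maximization: The parameters are then re-estimated as
$$\mu_k^{(t+1)} = \frac{\sum_{i=1}^{U} P(C_k \mid I(i))^{(t)}\, I(i)}{\sum_{i=1}^{U} P(C_k \mid I(i))^{(t)}} \tag{15}$$
$$\sigma_k^{2\,(t+1)} = \frac{\sum_{i=1}^{U} P(C_k \mid I(i))^{(t)}\, (I(i) - \mu_k^{(t+1)})^2}{\sum_{i=1}^{U} P(C_k \mid I(i))^{(t)}} \tag{16}$$
$$w_k^{(t+1)} = \frac{\sum_{i=1}^{U} P(C_k \mid I(i))^{(t)}}{U}. \tag{17}$$
4) Termination: Steps 2) and 3) are repeated until the parameters converge.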
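The expectation and maximization updates for a one-dimensional GMM can be sketched as below. The sketch assumes scalar pixel intensities, caller-supplied initial parameters instead of random ones, a fixed iteration count in place of a convergence test, and a small variance floor `min_var`; the names `normal_pdf` and `em_gmm_1d` are hypothetical.

```python
import math

def normal_pdf(x, mu, var):
    """Gaussian density N(x | mu, var)."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(values, mus, variances, weights, n_iter=50, min_var=1e-6):
    mus, variances, weights = list(mus), list(variances), list(weights)
    K, U = len(mus), len(values)
    for _ in range(n_iter):
        # Expectation: posterior P(C_k | I(i)) for every pixel.
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, mus[k], variances[k]) for k in range(K)]
            s = sum(p) or 1e-300      # guard against all-zero densities
            resp.append([pk / s for pk in p])
        # Maximization: re-estimate mean, variance, and weight of each cluster.
        for k in range(K):
            nk = sum(r[k] for r in resp)
            if nk == 0.0:
                continue              # leave an empty cluster unchanged
            mus[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(min_var, var)
            weights[k] = nk / U
    return mus, variances, weights
```

On two well-separated groups of intensities, the estimated means settle near the group averages and the weights near the group proportions.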
$$\mathrm{Ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\sum_{u \in V_A,\, t \in V} w(u, t)} + \frac{\mathrm{cut}(A, B)}{\sum_{v \in V_B,\, t \in V} w(v, t)}. \tag{19}$$
The Ncut value is not small for a cut that isolates a few points, because the cut value is then a large percentage
of the total connection from that small set to all other nodes. The basic
procedure used to find the minimum Ncut is explained in [48].
These image processing methods are extensively used in recently proposed frameworks for preprocessing, nuclei detection,
segmentation, separation, and classification. Based on these image processing methods, we compiled a list of existing frameworks for nuclei detection, segmentation, separation, and classification in histopathology, as shown in Table II. In the following
sections, we discuss how different image processing methods
have been used.
B. Preprocessing
Preprocessing can be performed to compensate for adverse
conditions such as the presence of batch effects. Batch effect
refers to unevenness in illumination, color, or other image parameters recurring across multiple images. Noise reduction and
artifacts elimination can also be performed prior to detection and
segmentation. Additionally, region of interest (ROI) detection
can also be performed in order to reduce the processing time.
1) Illumination Normalization: The illumination can be corrected either by using white shading correction or by estimating
TABLE II
SUMMARY OF STATE-OF-THE-ART NUCLEI DETECTION AND SEGMENTATION FRAMEWORKS IN HISTOPATHOLOGY
the illumination pattern from a series of images. In white shading correction, a blank (empty) image is captured and used to
correct images pixel by pixel [73]. A common equation is
$$\text{Transmittance} = \frac{\text{Specimen value} - \text{Background value}}{\text{White Reference value} - \text{Background value}}. \tag{20}$$
A downside of this method is that a blank image must be
acquired for each lens magnification whenever the microscope
illumination settings are altered.
An alternative normalization method is based upon the intrinsic properties of the image, which are revealed through Gaussian
smoothing [74]. Another possible way is to estimate the background
by exploiting the images of the specimen directly, even in the
presence of the object [75], [76]. Can et al. [77] introduced a
method to correct nonuniform illumination variation by modeling the observed image $I(i)$ as the product of the excitation pattern,
$E(i)$, and the emission pattern, $M(i)$, as
$$I(i) = E(i)\, M(i). \tag{21}$$
$$\frac{1}{J - K + 1} \sum_{j=K}^{J} I_j(i) \tag{22}$$
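The white shading correction in (20) is a per-pixel ratio and can be sketched as follows. The example assumes three equally sized images stored as lists of lists (specimen, blank white reference, and dark background) and returns 0 where the white and background values coincide; the name `white_shading_correction` is hypothetical.

```python
def white_shading_correction(specimen, white, background):
    """Apply the transmittance ratio pixel by pixel to equally sized 2-D lists."""
    corrected = []
    for s_row, w_row, b_row in zip(specimen, white, background):
        row = []
        for s, w, b in zip(s_row, w_row, b_row):
            denom = w - b
            # Guard against pixels where white and background coincide.
            row.append((s - b) / denom if denom != 0 else 0.0)
        corrected.append(row)
    return corrected
```

Two pixels with different raw values but the same fraction of the locally available light, for example 110 under a white level of 210 and 60 under a white level of 110 with a background of 10, both map to a transmittance of 0.5.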
Different color models can be used. Most detection and segmentation methods [9], [10], [17], [24], [25], [50], [64] use the
RGB color model, although the RGB model is not a perceptually uniform color model. Other more perceptual color models
such as HSV, Lab, and Luv are sometimes used [11], [18], [19],
[27], [51], [70], [72], [86]-[89].
3) Noise Reduction and Image Smoothing: Thresholding is
used for noise reduction and usually follows filtering and background correction in order to minimize random noise and artifacts [22], [90]. Pixels that lie outside the threshold values, which are
often determined from the intensity histogram, are considered to be
noisy. Alternatively, applying the threshold function to a group
of pixels instead of an individual pixel eliminates a noisy region.
While such techniques succeed in eliminating small spots
of noise, they fail to eliminate large artifacts [91].
Alternatively, morphological operations can also be used for
noise reduction. Noise and artifacts are eliminated using morphological operations like closings and openings [59]. Morphological gray-scale reconstruction methods are used to eliminate
noise while preserving the nuclei shape [24], [54], [55], [70].
While thresholding and filtering reduce noise according to pixel
intensities, morphology reduces noise based on the shape characteristics of the input image, as characterized by a structuring
element. Morphology cannot distinguish between nuclei areas and artifacts that have a nuclear-like shape but different intensity values.
Thresholding (prior or subsequent to applying the morphological operations) removes such artifacts.
Adaptive filters [92], Gamma correction [17], and histogram
equalization [52] have been used to increase the contrast between foreground (nuclei) and background regions. Anisotropic
diffusion is used to smooth nuclei information without degrading nuclei edges [52], [86]. Gaussian filtering is also used to
smooth nuclei regions [18], [26], [61].
4) ROI Detection: In some frameworks, noise reduction and
ROI detection are performed simultaneously. For example, for
tissue level feature computation, the preprocessing step selects
the ROI by excluding regions with little content and noise [91].
For nuclei level feature computation, noise reduction is followed by ROI detection to determine the nuclei region [70],
[86].
Thresholding is popular for ROI detection. Sertel et al. [52]
introduced the nuclei and cytological components as ROI for
grading of follicular lymphoma (FL). Red blood cells (RBCs)
and background regions show uniform patterns as compared to
other nuclei in FL tissue; thus, thresholding is performed in RGB
color model for elimination of RBCs and background. Similarly,
Dalle et al. [17] selected neoplasm ROI for nuclei pleomorphism
in breast cancer images by using Otsu thresholding along with
morphological operations.
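Otsu thresholding, mentioned above, picks the gray level that maximizes the between-class variance of the histogram. The sketch below is a generic textbook version under stated assumptions (integer gray levels in the range 0 to `levels - 1`, input given as a flat list of pixel values); it is not the implementation used in [17], and the name `otsu_threshold` is hypothetical.

```python
def otsu_threshold(values, levels=256):
    """Return the gray level t that maximizes between-class variance;
    pixels with value <= t form one class, the rest the other."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0          # number of pixels at or below the candidate threshold
    sum0 = 0        # sum of their gray levels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t
```

The resulting binary mask would typically be cleaned with morphological opening or closing before it is used as an ROI.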
Clustering is another method that is commonly used for ROI
detection. Cataldo et al. [25] performed automated separation of
cancer from noncancerous regions (stroma, blood vessels) using unsupervised clustering. Then, cancerous and noncancerous
regions are refined using morphological operations. Dundar
et al. [19] proposed a framework for classification of intraductal breast lesions as benign or malignant using the cellular component. The intraductal breast lesions contain four
3) Identify the local maxima of RN (i) and impose a minimum region size to filter out irrelevant minima.
This methodology improves the accuracy of seed locations.
The main disadvantage of this methodology is its sensitivity
to even minor peaks in the distance map, which results in oversegmentation
and the detection of tiny regions as nuclei.
The radial symmetry transform (RST) is also used for seed
detection. Loy and Zelinsky [93] proposed a fast gradient-based
interest operator for the detection of seed points having high radial
symmetry. Although this approach is inspired by the results of
the generalized symmetry transform, it determines the symmetrical contribution of each pixel around it, rather than considering
the contribution of a local neighborhood to a central pixel. Veta
et al. [59] also employed RST for seed detection.
Recently, several other approaches have been proposed to
detect the seed points. Qi et al. [64] proposed a novel and
fast algorithm for seed detection by utilizing single-path voting with the shifted Gaussian kernel. The shifted Gaussian
kernel is specifically designed by amplifying the voting at
the center of the targeted object and resulted in low occurrence of false seeds in overlapping regions. First, a cone shape
$(r_{\min}, r_{\max}, \Delta)$ with its vertex at $(x, y)$ is used to define the
voting area $A(x, y; r_{\min}, r_{\max}, \Delta)$, where $r_{\min}$ is the minimum
radius, $r_{\max}$ is the maximum radius, and $\Delta$ is the aperture angle
of the cone. The voting direction $\alpha(x, y)$ is computed using the
negative gradient direction $-(\cos(\theta(x, y)), \sin(\theta(x, y)))$, where
$\theta$ is the angle of the gradient direction with respect to the x-axis.
The voting image $V(x, y; r_{\min}, r_{\max}, \Delta)$ is generated using the
shifted Gaussian kernel with its means $\mu_x$, $\mu_y$ and standard deviation $\sigma$ located at the center $(x, y)$ of the voting area $A$ and
oriented in the voting direction $\alpha$ using the single-path approach as
$$V(x, y; r_{\min}, r_{\max}, \Delta) = \sum_{(u, v) \in A} \lVert \nabla I(x, y) \rVert\, N(x, y, \mu_x, \mu_y, \sigma) \tag{24}$$
where $\lVert \nabla I(x, y) \rVert$ is the magnitude of the gradient image and
$N(x, y, \mu_x, \mu_y, \sigma)$ is a 2-D shifted Gaussian kernel defined as
$$N(x, y, \mu_x, \mu_y, \sigma) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{(x - \mu_x)^2 + (y - \mu_y)^2}{2\sigma^2} \right) \tag{25}$$
where $\mu_x = x + \frac{\cos\theta\,(r_{\max} + r_{\min})}{2}$ and $\mu_y = y - \frac{\sin\theta\,(r_{\max} + r_{\min})}{2}$. Later, the seed points are determined by executing mean
shift on the sum of voting images. The authors compared their
results with the iterative voting method in [94].
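The shifted Gaussian kernel of (25) can be evaluated with the short Python sketch below. It assumes that the kernel mean lies halfway between the minimum and maximum radius along the voting direction, as read from the definitions of the two means above, and that the angle `theta` is given in radians; the name `shifted_gaussian` and its argument order are hypothetical.

```python
import math

def shifted_gaussian(px, py, x, y, theta, r_min, r_max, sigma):
    """Value at (px, py) of the 2-D shifted Gaussian voted from pixel (x, y)."""
    half = 0.5 * (r_max + r_min)
    mu_x = x + math.cos(theta) * half   # assumed mean shift along x
    mu_y = y - math.sin(theta) * half   # assumed mean shift along y
    d2 = (px - mu_x) ** 2 + (py - mu_y) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
```

With `theta = 0`, `r_min = 2`, and `r_max = 6`, the kernel peaks four pixels away from the voting pixel, which is how votes end up concentrated near the expected nucleus center rather than on the boundary.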
Counting nuclei by type is highly important for grading purposes. However, manual counting of nuclei is tedious and subject to considerable inter- and intrareader variations. Fuchs and
Buhmann [95] reported 42% disagreement between five pathologists on the classification of nuclei as normal or atypical. They
also reported an intrapathologist error of 21.2%. This shows the
high potential added value of automatic counting tools.
MN count provides clues to estimate the proliferation and the
aggressiveness of the tumor [62]. Anari et al. [88] proposed the
fuzzy c-means (FCM) clustering method along with the ultraerosion operation in the Lab color model for detection of MN in
IHC images of meningioma. They reported detection accuracy
$$J(\Theta) = \sum_{k=1}^{c} \sum_{i=1}^{U} v_{ki}^{m}\, \lVert I(i) - C_k \rVert^2 \tag{26}$$
$$v_{ki} = \frac{1}{\sum_{j=1}^{c} \left( \frac{\lVert I(i) - C_k \rVert}{\lVert I(i) - C_j \rVert} \right)^{\frac{2}{m-1}}} \tag{27}$$
$$C_k = \frac{\sum_{i=1}^{U} v_{ki}^{m}\, I(i)}{\sum_{i=1}^{U} v_{ki}^{m}}. \tag{28}$$
Recently, Roullier et al. [62] proposed a graph-based multiresolution framework for MN detection in breast cancer IHC
images. This approach consists of unsupervised clustering at
low resolution followed by refinements at a higher resolution.
At the multiresolution level, mitotic regions are initially segmented
by using the following discrete label regularization function:
$$\min_{f \in H(V)} \left\{ R_w(f) + \frac{\lambda}{2} \lVert f - f^0 \rVert_2^2 \right\} \tag{29}$$
where the first term $R_w(f)$ is the regularizer, defined as the
discrete Dirichlet form of the function $f \in H(V)$: $R_w(f) = \frac{1}{2} \sum_{u \in V} \sum_{v \sim u} w(u, v) (f(v) - f(u))^2$, and $H(V)$ is the
Hilbert space of real-valued functions defined on the vertices $V$
of a graph. The second term is a fitting term. $\lambda \geq 0$ is a fidelity
parameter called the Lagrange multiplier, which specifies the
tradeoff between the two competing terms. The Gauss-Jacobi
method is used to approximate the solution of the minimization in
(29) by the following iterative algorithm:
$$\begin{cases} f^{(0)} = f^0 \\ f^{(t+1)}(u) = \dfrac{\lambda f^0(u) + \sum_{v \sim u} w(u, v)\, f^{(t)}(v)}{\lambda + \sum_{v \sim u} w(u, v)}, & \forall u \in V \end{cases} \tag{30}$$
where $f^{(t)}$ is the function at iteration step $t$. More details on these
definitions can be found in [62]. This discrete regularization is
adapted for labeling the mitotic regions at higher resolution. The
authors reported more than 70% TPR and 80% TNR.
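The Gauss-Jacobi iteration of (30) can be sketched on a small weighted graph as follows. The sketch assumes that each vertex update is the weighted average of the initial value (weight equal to the fidelity parameter) and the neighbors' previous values (weights equal to the edge weights), that the graph is stored as nested dictionaries, and that a fixed number of sweeps is run; the name `gauss_jacobi` is hypothetical and this is not the implementation of [62].

```python
def gauss_jacobi(f0, weights, lam, n_iter=100):
    """f0: dict vertex -> initial value; weights: dict vertex -> {neighbor: w}."""
    f = dict(f0)
    for _ in range(n_iter):
        new_f = {}
        for u in f0:
            nbrs = weights.get(u, {})
            num = lam * f0[u] + sum(w * f[v] for v, w in nbrs.items())
            den = lam + sum(nbrs.values())
            new_f[u] = num / den if den > 0 else f0[u]
        f = new_f        # Jacobi: all vertices are updated from the previous iterate
    return f
```

On a three-vertex path with unit weights, an initial labeling of (0, 1, 0), and a fidelity parameter of 1, the iteration settles at (0.25, 0.5, 0.25): the central label is pulled toward its neighbors while the fitting term keeps it above them.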
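The use of EM for GMM was recently proposed by Khan
et al. [86] for the detection of MN in breast cancer histopathological images. In this framework, the pixel intensities of mitotic and
nonmitotic regions are modeled by a Gamma-Gaussian mixture
model as
$$f(I(i); \theta) = \rho_1\, \Gamma(I(i); \alpha, \beta) + \rho_2\, N(I(i); \mu, \sigma). \tag{31}$$
The log-likelihood of the $U$ pixels is
$$\ell(\theta) = \sum_{i=1}^{U} \log f(I(i); \theta) \tag{32}$$
and the complete-data log-likelihood is
$$\ell_c(\theta) = \sum_{i=1}^{U} \sum_{k=1}^{2} w_{ik} \log \rho_k + \sum_{i=1}^{U} w_{i1} \log \Gamma(I(i); \alpha, \beta) + \sum_{i=1}^{U} w_{i2} \log N(I(i); \mu, \sigma) \tag{33}$$
$$\hat{\theta} = \arg\max_{\theta}\, \ell(\theta) \tag{34}$$
where $w_{ik}$, $k = 1, 2$, are indicator variables showing the component membership of each pixel $I(i)$ in the mixture model
(31). This method reported 51% F-score during ICPR 2012
Contest [96].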
Ciresan et al. [97] used deep max-pooling convolutional neural networks (CNNs) to detect MN and achieved the highest F-score
(78%) during the ICPR 2012 contest [96]. A training dataset consisting of patch images centered on ground truth mitosis is used
to train a CNN. The trained CNN is then used to compute a map
of probabilities of mitosis over the whole image. Their approach
proved to be very efficient and to have a much lower number of
false positives (FPs) as compared to the other contestants.
Grading of lymphocytic infiltration based on detection of
large number of LN in IHC HER2+ breast cancer histopathology
was reported by Basavanhally et al. [18]. In this framework, LN
are automatically detected by a region growing method which
uses contrast measures to find optimal boundary. High detection
sensitivity has been reported for this framework, resulting in a
large number of nuclei other than lymphocytes being detected.
In order to reduce the number of FPs, maximum a posteriori (MAP) estimation based on size and luminance information is applied
to temporarily label candidates as either LN or CN. Later,
Markov random field (MRF) theory with spatial proximity is
used to finalize the labels. This framework has been
evaluated on 41 HER2+ WSI and reported 90.41% detection
accuracy as compared to 94.59% manual detection accuracy.
D. Nuclei Segmentation
Nuclei features such as size, texture, shape, and other morphological appearance are important indicators for grading and
prognosis of cancer. Consequently, classification and grading
of cancer is highly dependent on the quality of segmentation
of nuclei. The choice of the nuclei segmentation method is
correlated with the feature computation method. For instance,
some feature computation methods require the exact boundary points of nuclei to compute the nuclei morphology. In this
case, high magnification images are required to utilize the exact
$$\delta(u, v) = \min_{P_G(u, v)} \sum_{i=1}^{m-1} w(u_i, u_{i+1}) \tag{35}$$
where $w(u_i, u_{i+1})$ is a weight function between two pixels and
$P_G(u, v)$ is a set of paths connecting two vertices. Given a set
of $K$ seeds $S = (s_i \in V)$, where $i = 1, 2, \ldots, K$, the energy
$\delta_S : V \to \mathbb{R}$ induced by the metric $\delta$ for all the seeds of $S$ can be
expressed as
$$\delta_S(u) = \min_{s_i \in S} \delta(s_i, u), \quad \forall u \in V. \tag{36}$$
(37)
j N (i)
I(i) I(j)
exp
2I2
(38)
is refined later using a second Gcuts-based method with combination of alpha expansion and graph coloring to reduce computational complexity. The authors reported 86% accuracy on
25 histopathological images containing 7400 nuclei. The framework often causes oversegmentation when chromatin is highly
textured and the shape of nuclei is extremely elongated. In case
of highly clustered nuclei with weak borders between nuclei,
undersegmentation may occur.
For nuclei segmentation in glioblastoma histopathology images, Chang et al. [66] proposed a multireference Gcuts framework for solving the problem of technical and biological variations by incorporating geodesic constraints. During labeling,
a unique label $L(v)$ is assigned to each vertex $v \in V$ and the
image cutout is performed by minimizing the energy
$$E = \sum_{v \in V} \left( E_{gf}(L(v)) + E_{lf}(L(v)) \right) + \sum_{(v, u) \in E} E_{\mathrm{smoothness}}(L(v), L(u)) \tag{40}$$
where $E_{gf}$ and $E_{lf}$ are the global and local data fitness
terms applying the fitness cost for assigning $L(v)$ to $v$, and
$E_{\mathrm{smoothness}}(L(v), L(u))$ is the prior energy, denoting the cost
when the labels of adjacent vertices $v$ and $u$ are $L(v)$ and $L(u)$,
respectively. The authors reported 85% TPR and 75% PPV on
the TCGA dataset [101] of 440 WSI.
Vink et al. introduced a deterministic approach using machine
learning technique to segment EN, LN, and fibroblast nuclei in
IHC breast cancer images [69]. Initially, the authors report that
one detector cannot cover the whole range of nuclei as diversity
in appearance is too large to be covered by a single detector.
They formulate two detectors (pixel-based and line-based) using modified AdaBoost. The first detector focuses on the inner
structure of nuclei and second detector covers the line structure
at the border of nuclei. The outputs of these two detectors are
merged using an ACM to refine the border of the detected nuclei. The authors report 95% accuracy with computational cost
of one second per field of view image.
These nuclei segmentation frameworks have reported good
segmentation accuracy on LN, MC, and EN having regular
shape, homogeneous chromatin distribution, smooth boundaries, and individual existence. However, these frameworks have
poor segmentation accuracy for CN especially when CN are
clustered and overlapping. Furthermore, they are intolerant to
chromatin variations, which are very common in CN.
E. Nuclei Separation
A second generation of nuclei segmentation frameworks tackles the challenges of heterogeneity, overlapping, and clustered
nuclei by using machine learning algorithms together with classical segmentation methods. In addition, statistical and shape
models are used to separate overlapping and clustered nuclei.
As compared with nuclei segmentation methods, these methods are more tolerant to variations in shape of nuclei, partial
occlusion, and differences of the staining.
The watershed transform is employed to address the problem of overlapping nuclei by defining a group of basins in the
image domain, where ridges in-between basins are borders that
isolate nuclei from each other [9], [19], [25], [54], [60]. Wahlby
et al. [26] addressed the problem of clustered nuclei and proposed a methodology that combined the intensity and gradient
information along with shape parameters for improved segmentation. Morphological filtering is used for finding nuclei seeds.
Then, seeded watershed segmentation is applied on the gradient
magnitude image to create the region borders. Later, the result
of the initial segmentation is refined with gradient magnitude
along the boundary separating neighboring objects, resulting
in the removal of poorly contrasted objects. In the final step, distance transform and shape-based cluster separation methodologies are applied, keeping only the separation lines that pass
through deep valleys in the distance map. The authors reported
90% accuracy for overlapping nuclei. Cloppet and Boucher [99]
presented a scheme for segmentation of overlapping nuclei in
immunofluorescence images by providing a specific set of markers to the watershed algorithm. They defined the markers as the split
between overlapping structures and achieved 77.59% accuracy for overlapping nuclei and 95.83% overall accuracy.
In [102], a similar approach is used for segmentation of clustered
and overlapping nuclei in tissue microarray (TMA) and WSI
of colorectal cancers. First, combined global and local thresholding is used to select foreground regions. Then, morphological
filtering is applied to detect seed points. Region growing from
seed points produces initial segmented nuclei. Finally, clustered
nuclei are separated using watershed and ellipse approximation.
The authors claimed 80.3% accuracy.
The main problem with most ACMs is their sensitivity to
initialization. To solve this initialization problem, Fatakdawala
et al. [57] proposed EM-driven Geodesic ACM with overlap
resolution for segmentation of LN in breast cancer histopathology and reported 86% TPR and 64% PPV. EM-based ACM
initialization allows the model to focus on relevant objects of
interest. The magnetostatic active contour [103] model is used as
a force F guiding contour toward boundary. Based on contours
enclosing multiple objects, high concavity points are detected
on the contours and used in the construction of an edge-path
graph. Then, a scheme based on high concavity points and size
heuristic is used to resolve overlapping nuclei. The degree of
concavity/convexity is proportional to the angle $\theta(c_w)$ between
contour points. It is computed as follows:
$$\theta(c_w) = \arccos\left( \frac{(c_w - c_{w-1}) \cdot (c_{w+1} - c_w)}{\lVert c_w - c_{w-1} \rVert\, \lVert c_{w+1} - c_w \rVert} \right) \tag{41}$$
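The angle in (41) is the angle between the two contour segments that meet at a contour point. A minimal Python sketch, assuming contour points are (x, y) tuples and using the hypothetical name `contour_angle`, is:

```python
import math

def contour_angle(prev_pt, pt, next_pt):
    """Angle between the incoming and outgoing contour segments at pt."""
    ax, ay = pt[0] - prev_pt[0], pt[1] - prev_pt[1]
    bx, by = next_pt[0] - pt[0], next_pt[1] - pt[1]
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    if norm == 0.0:
        raise ValueError("consecutive contour points must be distinct")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, (ax * bx + ay * by) / norm))
    return math.acos(cos_angle)
```

The angle alone does not say whether the turn is concave or convex. One common way to decide, which is an assumption here and not something the text specifies, is to look at the sign of the 2-D cross product of the two segments relative to the contour orientation.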
E(, , IF , IB ) =
K
=2
k =1
(k (I) (I))2 |k |(k )dI
Shape + boundary energy
(F H 1 2 )dI +
(B H 1 2 )dI
r
+
Region energy
K
=2
H 1 2 dI +
(k k )2 dI
+
k =1
Mutual occlusion energy
(42)
K
k =1
K 1
k =1
|I k | di + B
2
K
k =1
|I b |2 di
K
K
k j
(43)
k =1 j =1,j = k
probability map image and hematoxylin-stained image, produced after color deconvolution [84].
In general, model-based approaches segment nuclei using
a priori shape information, which may introduce a bias favoring the
segmentation of nuclei with certain characteristics. To address
this problem, Wienert et al. [68] proposed a novel contour-based minimum model for nuclei segmentation using minimal
a priori information. This minimum model-based segmentation
framework consists of six internal processing steps. First, all
possible closed contours are computed regardless of shape and
size. Second, all initially generated contours are ranked using
gradient fit. Third, nonoverlapping segmentation is performed
with ranked labeling in a 2-D map. Fourth, segmentation is
improved using contour optimization. Fifth, clustered nuclei are
separated using concavity point detection (41). Last, segmented
regions are classified as nuclei or background using stain-related
information. This framework avoids a segmentation bias
with respect to shape features. The authors achieved
86% TPR and 91% PPV on a dataset of 7931 nuclei.
RST is an iterative algorithm attributing votes to pixels inside
a region [93]. After the final iteration, the maxima are used as
markers for a nuclei segmentation algorithm such as watershed.
Each boundary point contributes votes to a region defined
by oriented cone-shaped kernels as
$$A(x, y; r_{\min}, r_{\max}, \Delta) = \left\{ (x + r\cos\phi,\ y + r\sin\phi) \;\middle|\; r_{\min} \le r \le r_{\max},\ \theta(x, y) - \frac{\Delta}{2} \le \phi \le \theta(x, y) + \frac{\Delta}{2} \right\} \tag{44}$$
TABLE III
SUMMARY OF NUCLEI FEATURES USED IN HISTOPATHOLOGY
investigated to separate overlapping and clustered/touching nuclei.
These methods give good results for nuclei that are slightly
touching or overlapping each other, but they are not suitable for
specimens containing large numbers of nuclei with extensive
overlapping and touching. These methods also suffer from
dependencies that induce instability. For instance, the computation
of curvature is highly dependent on the concavity point detection
algorithm, region growing tends to rely on the shape and size of
nuclei, marker-controlled watershed needs true nuclei markers,
and ellipse-fitting techniques are unable to accommodate the
shape of most nuclei. Most of these methods also require prior
knowledge. Although a few methods, such as clustering, GMM and EM,
and a new image modality [67], are able to deal with heterogeneity,
accurate segmentation of touching or overlapping nuclei is still
an open research area.
To the best of our knowledge, only a few supervised machine-learning
techniques, such as Bayesian [18], [55], SVM [67], and AdaBoost [69],
have been used for nuclei segmentation. The basic philosophy of the
machine-learning approach is that a human provides examples of the
desired segmentation and leaves the optimization and parameter-tuning
tasks to the learning algorithm. The two main avenues to be explored
for supervised machine-learning algorithms are the use of more
domain-specific features and the limitation of overfitting.
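The learning-from-examples idea can be sketched with a minimal Gaussian Bayes pixel classifier. A user supplies labeled feature values (here a single hypothetical stain-intensity feature per pixel), the class means, variances, and priors are estimated from those examples, and each pixel is then assigned the class with the higher posterior. The feature choice, label names, and sample values are assumptions for illustration and do not reproduce the methods of [18], [55], [67], or [69].

```python
import math

def fit_gaussian_bayes(samples):
    """Estimate (mean, variance, prior) per class from labeled feature values.

    samples maps a class label to a list of scalar features, e.g. a stain
    intensity per annotated pixel (a hypothetical feature choice).
    """
    total = sum(len(values) for values in samples.values())
    model = {}
    for label, values in samples.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        model[label] = (mean, max(var, 1e-6), len(values) / total)  # floor the variance
    return model

def classify(model, value):
    """Label with the highest log-posterior for one feature value."""
    def log_posterior(params):
        mean, var, prior = params
        return (math.log(prior) - 0.5 * math.log(2 * math.pi * var)
                - (value - mean) ** 2 / (2 * var))
    return max(model, key=lambda label: log_posterior(model[label]))

def segment(model, image, foreground="nucleus"):
    """Binary mask: True where a pixel is classified as the foreground label."""
    return [[classify(model, px) == foreground for px in row] for row in image]

# Made-up annotated examples supplied by a user.
training = {
    "nucleus": [0.80, 0.90, 0.85, 0.75],
    "background": [0.10, 0.20, 0.15, 0.25],
}
model = fit_gaussian_bayes(training)
mask = segment(model, [[0.10, 0.82], [0.90, 0.20]])
```

With so few training values the estimated variances are fragile, which is a small-scale instance of the overfitting concern raised above; richer, domain-specific features would replace the single intensity value in practice.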
REFERENCES
[1] W. W. Ma and A. A. Adjei, "Novel agents on the horizon for cancer therapy," CA: A Cancer J. Clinic., vol. 59, no. 2, pp. 111–137, 2009.
[2] M. May, "A better lens on disease: Computerized pathology slides may help doctors make faster and more accurate diagnoses," Sci. Amer., vol. 302, pp. 74–77, 2010.
[3] R. Rubin and D. S. Strayer, in Rubin's Pathology: Clinicopathologic Foundations of Medicine, E. M. Rubin, F. Gorstein, R. Schwarting, and D. S. Strayer, Eds., 4th ed. Philadelphia, PA, USA: Lippincott Williams & Wilkins, Apr. 2004.
[4] X. Zhou and S. Wong, "Informatics challenges of high-throughput microscopy," IEEE Signal Process. Mag., vol. 23, no. 3, pp. 63–72, May 2006.
[5] M. N. Gurcan, L. E. Boucheron, A. Can, A. Madabhushi, N. M. Rajpoot, and B. Yener, "Histopathological image analysis: A review," IEEE Rev. Biomed. Eng., vol. 2, pp. 147–171, Dec. 2009.
[6] P. H. Bartels, D. Thompson, M. Bibbo, and J. E. Weber, "Bayesian belief networks in quantitative histopathology," Anal. Quant. Cytol. Histol., vol. 14, no. 6, pp. 459–473, 1992.
[7] J. P. Thiran and B. Macq, "Morphological feature extraction for the classification of digital images of cancerous tissues," IEEE Trans. Biomed. Eng., vol. 43, no. 10, pp. 1011–1020, Oct. 1996.
[8] T. Mouroutis, S. J. Roberts, and A. A. Bharath, "Robust cell nuclei segmentation using statistical modeling," Bioimaging, vol. 6, pp. 79–91, 1998.
[9] M. N. Gurcan, T. Pan, H. Shimada, and J. Saltz, "Image analysis for neuroblastoma classification: Segmentation of cell nuclei," in Proc. IEEE 28th Annu. Int. Conf. Eng. Med. Biol. Soc., New York, NY, USA, 31 Aug.–3 Sep. 2006, pp. 4844–4847.
[10] O. S. Al-Kadi, "Texture measures combination for improved meningioma classification of histopathological images," Pattern Recog., vol. 43, no. 6, pp. 2043–2053, 2010.
[11] J. Kong, L. Cooper, T. Kurc, D. Brat, and J. Saltz, "Towards building computerized image analysis framework for nucleus discrimination in microscopy images of diffuse glioma," in Proc. IEEE 33rd Annu. Int. Conf. Eng. Med. Biol. Soc., Boston, MA, USA, Aug. 30–Sep. 3, 2011, pp. 6605–6608.
[12] W. N. Street, W. H. Wolberg, and O. L. Mangasarian, "Nuclear feature extraction for breast tumor diagnosis," in Proc. Int. Symp. Electron. Imag.: Sci. Technol., San Jose, CA, USA, 1–4 Feb. 1993, vol. 1905, pp. 861–870.
[13] J. Gil, H. Wu, and B. Y. Wang, "Image analysis and morphometry in the diagnosis of breast cancer," Microsc. Res. Tech., vol. 59, no. 2, pp. 109–118, 2002.
[14] C. Gunduz, B. Yener, and S. H. Gultekin, "The cell graphs of cancer," Bioinformatics, vol. 20, pp. 145–151, 2004.
[15] S. Petushi, F. U. Garcia, M. M. Haber, C. Katsinis, and A. Tozeren, "Large-scale computations on histology images reveal grade-differentiating parameters for breast cancer," BMC Med. Imag., vol. 6, pp. 14–24, 2006.
[16] A. E. Tutac, D. Racoceanu, T. Putti, W. Xiong, W.-K. Leow, and V. Cretu, "Knowledge-guided semantic indexing of breast cancer histopathology images," in Proc. Int. Conf. Biomed. Eng. Informat., Sanya, Hainan, China, 2008, vol. 2, pp. 107–112.
[17] J.-R. Dalle, H. Li, C.-H. Huang, W. K. Leow, D. Racoceanu, and T. C. Putti, "Nuclear pleomorphism scoring by selective cell nuclei detection," in Proc. IEEE Workshop Appl. Comput. Vis., 2009, 6 pp.
[18] A. N. Basavanhally, S. Ganesan, S. Agner, J. P. Monaco, M. D. Feldman, J. E. Tomaszewski, G. Bhanot, and A. Madabhushi, "Computerized image-based detection and grading of lymphocytic infiltration in HER2+ breast cancer histopathology," IEEE Trans. Biomed. Eng., vol. 57, no. 3, pp. 642–653, Mar. 2010.
[19] M. Dundar, S. Badve, G. Bilgin, V. C. Raykar, R. K. Jain, O. Sertel, and M. N. Gurcan, "Computerized classification of intraductal breast lesions using histopathological images," IEEE Trans. Biomed. Eng., vol. 58, no. 7, pp. 1977–1984, Jul. 2011.
[20] C.-H. Huang, A. Veillard, L. Roux, N. Lomenie, and D. Racoceanu, "Time-efficient sparse analysis of histopathological whole slide images," Comput. Med. Imag. Graph., vol. 35, pp. 579–591, Nov. 2011.
[21] N. Lomenie and D. Racoceanu, "Point set morphological filtering and semantic spatial configuration modeling: Applications to microscopic image and bio-structure analysis," Pattern Recog., vol. 45, no. 8, pp. 2894–2911, 2012.
[22] S. J. Keenan, J. Diamond, W. Glenn McCluggage, H. Bharucha, D. Thompson, P. H. Bartels, and P. W. Hamilton, "An automated machine vision system for the histological grading of cervical intraepithelial neoplasia (CIN)," J. Pathol., vol. 192, no. 3, pp. 351–362, 2000.
[23] X. He and Q. Liao, "A novel shape prior based segmentation of touching or overlapping ellipse-like nuclei," in Proc. SPIE, San Diego, CA, USA, vol. 6914, 2008, 8 pp.
[24] P. W. Huang and Y. H. Lai, "Effective segmentation and classification for HCC biopsy images," Pattern Recog., vol. 43, no. 4, pp. 1550–1563, 2010.
[25] S. D. Cataldo, E. Ficarra, A. Acquaviva, and E. Macii, "Automated segmentation of tissue images for computerized IHC analysis," Comput. Methods Progr. Biomed., vol. 100, no. 1, pp. 1–15, 2010.
[26] C. Wahlby, I. M. Sintorn, F. Erlandsson, G. Borgefors, and E. Bengtsson, "Combining intensity, edge and shape information for 2D and 3D segmentation of cell nuclei in tissue sections," J. Microsc., vol. 215, no. 1, pp. 67–76, 2004.
[27] K. Nguyen, A. K. Jain, and B. Sabata, "Prostate cancer detection: Fusion of cytological and textural features," J. Pathol. Inform., vol. 2, no. 2, pp. 7–27, Dec. 2011.
[28] S. Ali, R. Veltri, J. Epstein, C. Christudass, and A. Madabhushi, "Adaptive energy selective active contour with shape priors for nuclear segmentation and Gleason grading of prostate cancer," in Proc. 14th Int. Conf. Med. Image Comput. Comput.-Assist. Interv., Toronto, ON, Canada, 18–22 Sep. 2011, pp. 661–669.
[29] C. Demir and B. Yener, "Automated cancer diagnosis based on histopathological images: A systematic survey," Dept. Comput. Sci., Rensselaer Polytechnic Inst., Troy, NY, USA, Tech. Rep. TR-05-09, 2005.
[30] L. He, L. R. Long, S. Antani, and G. Thoma, "Computer assisted diagnosis in histopathology," Sequ. Genome Anal.: Methods Appl., 2010, pp. 271–287.
[31] H. Fox, "Is H&E morphology coming to an end?" J. Clin. Pathol., vol. 53, pp. 38–40, 2000.
[32] J. Monaco, J. Hipp, D. Lucas, S. Smith, U. Balis, and A. Madabhushi, "Image segmentation with implicit color standardization using spatially constrained expectation maximization: Detection of nuclei," in Proc. 15th Int. Conf. Med. Image Comput. Comput.-Assist. Interv., 2012, pp. 365–372.
[33] G. Alexe, G. S. Dalgin, D. Scanfeld, P. Tamayo, J. P. Mesirov, C. DeLisi, L. Harris, N. Barnard, M. Martel, A. J. Levine, S. Ganesan, and G. Bhanot, "High expression of lymphocyte-associated genes in node-negative HER2+ breast cancers correlates with lower recurrence rates," Cancer Res., vol. 67, no. 22, pp. 10 669–10 676, 2007.
[58] Y. Al-Kofahi, W. Lassoued, W. Lee, and B. Roysam, "Improved automatic detection and segmentation of cell nuclei in histopathology images," IEEE Trans. Biomed. Eng., vol. 57, no. 4, pp. 841–852, Apr. 2010.
[59] M. Veta, A. Huisman, M. A. Viergever, P. J. van Diest, and J. P. W. Pluim, "Marker-controlled watershed segmentation of nuclei in H&E stained breast cancer biopsy images," in Proc. 8th IEEE Int. Symp. Biomed. Imag.: Nano Macro, Chicago, IL, USA, Apr. 2011, pp. 618–621.
[60] H. Kong, M. Gurcan, and K. Belkacem-Boussaid, "Partitioning histopathological images: An integrated framework for supervised color-texture segmentation and cell splitting," IEEE Trans. Med. Imag., vol. 30, no. 9, pp. 1661–1677, Sep. 2011.
[61] A. Mouelhi, M. Sayadi, and F. Fnaiech, "Automatic segmentation of clustered breast cancer cells using watershed and concave vertex graph," in Proc. Int. Conf. Commun., Comput. Control Appl., Hammamet, Tunisia, Mar. 2011, pp. 1–6.
[62] V. Roullier, O. Lezoray, V. T. Ta, and A. Elmoataz, "Multi-resolution graph based analysis of histopathological whole slide images: Application to mitotic cell extraction and visualization," Comput. Med. Imag. Graph., vol. 35, pp. 603–615, 2011.
[63] S. Ali and A. Madabhushi, "An integrated region-, boundary-, shape-based active contour for multiple object overlap resolution in histological imagery," IEEE Trans. Med. Imag., vol. 31, no. 7, pp. 1448–1460, Jul. 2012.
[64] X. Qi, F. Xing, D. J. Foran, and L. Yang, "Robust segmentation of overlapping cells in histopathology specimens using parallel seed detection and repulsive level set," IEEE Trans. Biomed. Eng., vol. 59, no. 3, pp. 754–765, Mar. 2012.
[65] M. Kulikova, A. Veillard, L. Roux, and D. Racoceanu, "Nuclei extraction from histopathological images using a marked point process approach," presented at the Proc. SPIE Med. Imag., San Diego, CA, USA, 2012.
[66] H. Chang, L. A. Loss, and B. Parvin, "Nuclear segmentation in H&E sections via multi-reference graph cut (MRGC)," in Proc. 9th IEEE Int. Symp. Biomed. Imag.: Nano Macro, Barcelona, Spain, 2012, pp. 614–617.
[67] A. Veillard, M. Kulikova, and D. Racoceanu, "Cell nuclei extraction from breast cancer histopathology images using color, texture, scale and shape information," in Proc. 11th Eur. Congr. Telepathol. 5th Int. Congr. Virt. Microsc., Venice, Italy, 6–9 Jun. 2012, 3 pp.
[68] S. Wienert, D. Heim, K. Saeger, A. Stenzinger, M. Beil, P. Hufnagl, M. Dietel, C. Denkert, and F. Klauschen, "Detection and segmentation of cell nuclei in virtual microscopy images: A minimum-model approach," Sci. Rep., vol. 2, 7 pp., 2012.
[69] J. Vink, M. V. Leeuwen, C. V. Deurzen, and G. Haan, "Efficient nucleus detector in histopathology images," J. Microsc., vol. 249, no. 2, pp. 124–135, 2013.
[70] A. M. Khan, H. El-Daly, E. Simmons, and N. M. Rajpoot, "A hybrid magnitude-phase approach to unsupervised segmentation of tumor areas in breast cancer histology images," J. Pathol. Inform., vol. 4, 7 pp., Mar. 2013.
[71] (2012). MITOS, ICPR 2012 Contest, IPAL UMI CNRS Lab Std., [Online]. Available: http://ipal.cnrs.fr/ICPR2012/?q=node/5
[72] C. D. Malon and E. Cosatto, "Classification of mitotic figures with convolutional neural networks and seeded blob features," J. Pathol. Inform., vol. 4, pp. 9–13, May 2013.
[73] G. D. Marty, "Blank-field correction for achieving a uniform white background in bright field digital photomicrographs," BioTechniques, vol. 42, no. 6, pp. 714–720, 2007.
[74] F. W. Leong, M. Brady, and J. O. McGee, "Correction of uneven illumination (vignetting) in digital microscopy images," J. Clin. Pathol., vol. 56, no. 8, pp. 619–621, 2003.
[75] A. Gherardi, A. Bevilacqua, and F. Piccinini, "Illumination field estimation through background detection in optical microscopy," in Proc. IEEE Symp. Comput. Intell. Bioinform. Comput. Biol., Paris, France, 2011, pp. 1–6.
[76] F. Piccinini, E. Lucarelli, A. Gherardi, and A. Bevilacqua, "Multi-image based method to correct vignetting effect in light microscopy images," J. Microsc., vol. 248, no. 1, pp. 6–22, 2012.
[77] A. Can, M. Bello, H. E. Cline, T. Xiaodong, F. Ginty, A. Sood, M. Gerdes, and M. Montalto, "Multi-modal imaging of histological tissue sections," in Proc. IEEE 5th Int. Symp. Biomed. Imag.: Nano Macro, Sun Valley, Idaho, USA, 2008, pp. 288–291.
[78] L. E. Grenier, B. V. Funt, P. H. Orth, and D. M. McIntosh, "Transillumination method apparatus for the diagnosis of breast tumors and other breast lesions by normalization of an electronic image of the breast," U.S. Patent 5 079 698, Jan. 7, 1992.
Daniel Racoceanu (M'08) received the M.Eng. degree from the Politehnica University, Timisoara, Romania, in 1992, and the M.Sc., Ph.D., and Dr. habil.
degrees from the University of Besancon, Besancon,
France, in 1993, 1997, and 2006, respectively.
He is currently with the University Pierre and
Marie Curie, Paris, France. He is a Senior Research
Fellow at the French National Center for Scientific
Research, Singapore, and a Professor (adj.) at the National University of Singapore, Singapore. He is the
Director of the French–Singaporean Joint Lab Image and Pervasive Access Lab UMI, French National Center for Scientific
Research. His research interests include symbolic and connectionist cognitive
exploration methods for high-content biomedical images. He is involved in
the ANR-A*STAR program and in grants funded by the French Ministry of Industry
(see webpages: http://www.comp.nus.edu.sg/danielr/).