a multi-step approach based on a preliminary detection, regions of interest (ROI) selection, object contour segmentation, contour descriptor calculation, object matching, and recognition.
In the next section, the object detection algorithm is described in detail. Then contour descriptor evaluation, object matching and recognition are discussed. We present some experimental results for the proposed approach obtained on natural video sequences.
1. Object detection
Preliminary detection. The algorithm that is used to detect objects at the preliminary step should satisfy two requirements. It should be computationally efficient and work well in cloudy and noisy environments. It can be assumed that objects have higher contrast than the underlying background. Neighboring pixels usually have similar brightness values, and the background has low spatial frequencies in the Fourier domain. In that case, objects can, under some assumptions, be described as blobs. Spatial filters are typically used for blob detection to increase SNR and to get better results.
At first, the background must be estimated. For this purpose, an observed image l(i, j) passes through the spatial filter with the large mask h2. Simultaneously, l(i, j) is smoothed with the mask h1 of smaller size to average the object brightness. To improve performance, box filters are used. They use masks h1 and h2, which have dimensions (2q1+1)×(2q1+1) and (2q2+1)×(2q2+1) respectively, q1 < q2:
$$h_1(m, n) = \frac{1}{(2q_1+1)^2}, \quad m, n = \overline{-q_1, q_1};$$
$$h_2(m, n) = \begin{cases} 0, & m, n = \overline{-q_1, q_1}; \\ \dfrac{1}{(2q_2+1)^2 - (2q_1+1)^2}, & \text{otherwise.} \end{cases} \tag{1}$$
[...] cause false detections. Next processing steps are performed to achieve scale invariance, reject false detections, and refine object shape.
ROI selection. To achieve invariance to scale transform, a Gaussian image pyramid is created, and the blob detection algorithm described above is performed. Binary images are formed at each scale of the pyramid. Filter mask sizes are fixed, but the coefficient k slightly increases with image detail degradation. For each pyramid level, a binary image is formed and a list of segments is created. Segment analysis at different scales is the part of the algorithm that allows selection of regions of interest.
The analysis starts from the coarse image resolution and goes to more detailed levels. Simple morphological operations are involved to reduce segment fragmentation at low resolutions. Bounding boxes for each segment are expanded by some value depending on the initial size. Then intersections between bounding boxes are searched for at different scales. Intersected regions must be counted and excluded from the list. As a rule, large objects in the image are more fragmented at detailed scale levels. This property is used to specify the location of large objects. An example of the binary mask of the test object at different levels of the pyramid is shown in Fig. 2.
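The two box filters of Eq. (1) used at the preliminary detection step can be illustrated with a short sketch. It is not the authors' implementation: the function names, the use of a summed-area table, and the choice to leave border pixels undefined are assumptions made here for illustration. The mask h1 averages the inner (2q1+1)×(2q1+1) window, and h2 averages the surrounding ring of the (2q2+1)×(2q2+1) window.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for i in range(h):
        row_sum = 0.0
        for j in range(w):
            row_sum += img[i][j]
            sat[i + 1][j + 1] = sat[i][j + 1] + row_sum
    return sat


def window_sum(sat, i, j, q):
    """Sum over the (2q+1)x(2q+1) window centred at (i, j); it must fit in the image."""
    top, left, bottom, right = i - q, j - q, i + q + 1, j + q + 1
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def box_filter_responses(img, q1, q2):
    """Apply masks h1 and h2 of Eq. (1).

    Returns (obj, bg): obj is the h1 response (object brightness estimate),
    bg is the h2 response (background estimate). Pixels where the larger
    mask does not fit are left as None (a simplification assumed here).
    """
    assert 0 <= q1 < q2
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    n1 = (2 * q1 + 1) ** 2           # number of pixels under h1
    n2 = (2 * q2 + 1) ** 2 - n1      # number of pixels in the h2 ring
    obj = [[None] * w for _ in range(h)]
    bg = [[None] * w for _ in range(h)]
    for i in range(q2, h - q2):
        for j in range(q2, w - q2):
            inner = window_sum(sat, i, j, q1)
            outer = window_sum(sat, i, j, q2)
            obj[i][j] = inner / n1
            bg[i][j] = (outer - inner) / n2
    return obj, bg
```

With the summed-area table, each response costs a constant number of operations regardless of q1 and q2, which is one common reason to prefer box filters when efficiency matters. How the two responses are combined and thresholded is not shown in this sketch.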
Contour segmentation. At this step, a more complicated segmentation procedure is performed in each ROI to estimate the object contour. We choose the active contour model as a powerful and flexible approach that can be used to precisely segment the object boundary. More importantly, this approach guarantees that the contour will be closed and will not contain gaps.
The model is based on the Mumford–Shah functional minimization problem [9]. Let us assume for simplicity that the images are continuous. The general form of the Mumford–Shah energy functional for the sensed image l(x, y) can be written as
$$E_{MS}(r, C) = \iint_{(x,y) \in ROI} \left( l(x,y) - r(x,y) \right)^2 dx\,dy + \mu \iint_{(x,y) \in ROI \setminus C} \left| \nabla r(x,y) \right|^2 dx\,dy + \nu \cdot \mathrm{length}(C), \tag{4}$$
where µ and ν are positive constants, r(x, y) – segmented image, C – object boundary curve. It becomes a difficult problem to find C since r(x, y) is also an unknown function in 2D coordinate space. The expression can be simplified if r(x, y) is a piecewise constant function that takes the value r1 inside C and r0 outside C. In that case, the energy functional (4) is reformulated as follows:
… (x, y) ∈ inside(C) …
Expression (5) describes the Chan-Vese active contour model [9, 10], where the first term is the energy that corresponds to the expansion force and the second is the energy that tends to compress the contour. The problem is to find the boundary of the object at which equilibrium is reached between the two forces. The unknown curve C is replaced by the level set function ϕ(x, y), considering that ϕ(x, y) > 0 if the point (x, y) is inside C, ϕ(x, y) < 0 if (x, y) is outside C, and ϕ(x, y) = 0 if (x, y) is on C. Finally, the minimization problem is solved by taking the Euler–Lagrange equations and updating the level set function ϕ(x, y) by the gradient descent method:
$$\frac{\partial \phi(x,y)}{\partial t} = \delta(\phi) \left( \nu K(\phi) - (l(x,y) - r_0)^2 + (l(x,y) - r_1)^2 \right), \tag{6}$$
where r1 and r0 are the average brightness values of the object and the background respectively, δ(ϕ) – an approximation of the Dirac delta function, K(ϕ) – the curvature of the curve. In the transition from continuous (x, y) to discrete (i, j) coordinate values, equation (6) is transformed to
$$\phi_{n+1}(i,j) = \phi_n(i,j) + \delta(\phi) \left( \nu K_n(\phi) - (l(i,j) - r_0(n))^2 + (l(i,j) - r_1(n))^2 \right). \tag{7}$$
At each n-th iteration, ϕ(x, y) is reinitialized to be the signed distance function to its zero level set. This procedure prevents the level set function from becoming too flat. This effect is caused by δ(ϕ) due to ϕ(x, y) smoothing. The iterative search stops when the number of points where the level set function is close to zero ceases to vary noticeably [9].
This method can deal with the detection of objects whose boundaries are dimmed or not necessarily defined by gradient. It does not require image filtering and can efficiently process noisy images. Therefore, the true boundaries are preserved and can be accurately detected. Additionally, it can automatically detect interior contours with the choice of Dirac function approximation [10].
However, the Chan-Vese model also has some drawbacks: unsuccessful segmentation of images with significant intensity inhomogeneity, sensitivity to the initial contour placement, and a time-consuming iterative solving procedure. In this work, images are segmented only in areas determined by ROIs, and are centered on objects in most cases. The influence of image inhomogeneity on segmentation results is noticeable for large-scale objects but can be significantly reduced by image downsampling in the Gaussian pyramid. Thus, the main drawbacks of the approach can be overcome.
The next subsection provides a description of the object recognition step of the algorithm.
[...]creasing of the amount of information describing the object contour. Also, the contour descriptor allows increasing the speed of the contour matching [11].
The proposed descriptor can be calculated using the object binary image b or a contour C. In the first case, after the image binarization we can extract the external image contour. Points of the contour are translated into a polar coordinate frame with the frame center at the object centroid. The obtained vector of polar coordinates is discretized and subjected to the median filter.
The resulting descriptor units can be calculated using the following equation:
$$D(i) = F_{med}\left\{ \max d\left( P_{center}, P\left( \frac{2\pi i}{N_D}, \frac{\pi}{N_D} \right) \right) \right\}, \tag{8}$$
where i = 1, …, N_D – the number of the current descriptor unit; N_D – the total number of the descriptor units; d(P1, P2) – Euclidean distance between P1 and P2; Pcenter – the position of the object centroid; P(α, ∆α) – any object or object contour point situated in the sector of the circle that is limited by the α ± ∆α angles (the circle is centered at Pcenter); Fmed{…} – the symbolic notation for the median filtering operation.
As the object contour is a closed curve, it generates a series of descriptors that are shifted relative to one another depending on the starting angle. The descriptor with the maximal D(1) unit is used as the object descriptor.
Steps of calculating contour descriptors are illustrated in Fig. 3.
3. Object matching
Small targets are matched by minimizing relative differences in average brightness, size, and position for object candidates found in a new frame. Contour coordinates are very valuable for tracking and recognition purposes for larger objects. However, information about contour coordinates is excessive and the values themselves are not invariant to geometrical transformations. Therefore, more relevant contour descriptors are used.
Object matching is performed by minimizing the cri-
terion function:
$$F_{crit}(j) = \min_{m \in M} \sum_{i=1}^{N_D} \left( D_{ob}(i - m) - D_j(i) \right)^2, \quad M = \left\{ m : m = \overline{1, N_D} \right\}, \quad j = \overline{1, N}, \tag{9}$$
$$s_0(j) = \arg\min_{s \in S} \sum_{i=1}^{N_D} \left( D_0(i - s) - D_j(i) \right)^2, \quad S = \left\{ s : s = \overline{1, N_D} \right\}. \tag{11}$$
Hence the value of angle γ is calculated by the formula:
$$\gamma = \frac{2\pi s_0}{N_D}. \tag{12}$$
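The shift search of Eq. (11) and the angle of Eq. (12) can be sketched as follows. The function names are hypothetical, and the index i − s is taken modulo N_D, which is an assumption based on the descriptor being computed from a closed contour. The minimum cost returned alongside the shift has the form of the criterion in Eq. (9).

```python
import math


def best_circular_shift(d_ref, d_obs):
    """Search all circular shifts s = 1..N_D of d_ref against d_obs.

    Returns (min_cost, s0), where the cost is the sum of squared
    differences between d_ref[(i - s) mod N_D] and d_obs[i].
    """
    n = len(d_ref)
    if n != len(d_obs):
        raise ValueError("descriptors must have the same number of units")
    best_cost, best_shift = None, None
    for s in range(1, n + 1):
        cost = sum((d_ref[(i - s) % n] - d_obs[i]) ** 2 for i in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_shift = cost, s
    return best_cost, best_shift


def rotation_angle(s0, n_units):
    """Angle gamma of Eq. (12) for the optimal shift s0."""
    return 2.0 * math.pi * s0 / n_units
```

For example, with `d_ref = [3, 2, 1, 1]` and `d_obs = [1, 3, 2, 1]` the search returns a zero cost at shift 1, which corresponds to an angle of π/2 for four descriptor units. The exhaustive search costs on the order of N_D² operations per descriptor pair.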
As a result of calculating the criterion function (11) for every training descriptor, we get a vector of values of the criterion function
$$M = \left( f_{crit}(j) \mid j = \overline{1, N_g} \right). \tag{13}$$
[...] of pipelined processing. Therefore, we suggest that the algorithms are suitable for Xilinx Virtex 5 or higher FPGA-based vision systems [12].
6. Experimental research
The first goal of the research is to determine the ability of the algorithm to localize objects at a distance of several kilometers. The video database contained 12 grayscale video sequences with 7 different types of aircraft, three types of UAVs and two helicopters. Objects were observed in cloudy environments and in clear sky conditions. These sequences were obtained from a single TV or IR camera with a wide field of view. The size of objects varied from about 3×3 pixels to 200×200 and even higher. Confident detection of objects in observed images affects the quality of the algorithm. The true positive ratio Pt and false negative ratio Pf are measured for fixed detection algorithm parameters. Reference object position and size are determined in each video frame by visual inspection. Additionally, the standard deviations of the object coordinate σc and size σs measurement errors are estimated.
To get more relevant results, σs is divided by the reference size and expressed in percent. The results are summarized in Table 1. The algorithm is less reliable in detecting helicopters because of rotary wings that are not always distinguishable. In some cases the shape of the object varies very rapidly due to changes of the angle of view, which also causes object misses.
Table 1. Object detection results grouped by type
Type of aerial vehicles    Pd    Pf    σs, pixel    σs, %
In the work [16] the authors propose a mixed approach. They use three types of indicators and a neural network. The resulting true positive ratio is between 82 % and 94 %. In the work [17] a Markov random field based classifier is used. The resulting true positive ratio is between 88 % and 95 %. In the work [18] the authors propose recognition of military airplanes using wavelet-based descriptors. The resulting true positive ratio is about 96 %.
The experiments were carried out on the same natural image sequences that were used for the object detection algorithm examination. The minimal aerial object area was 500 pixels. The maximal aerial object area was less than 15 % of the image area.
The reference object base includes 17 objects. The objects were defined by 3D models. Sets of reference images were rendered for every model. The factor 3 geosphere point distribution was used (92 points). The light source was situated in front of the object. Examples of the object recognition are presented in Fig. 5.
Authors’ information
Vadim Sergeevich Muraviev (b. 1981) received an engineer qualification in Automation and Control Systems and a Candidate of Technical Sciences degree from Ryazan State Radio Engineering University (RSREU) in 2003 and 2010, respectively. Currently he works as an associate professor in the Automation and Information Technologies in Control department (AITC) of RSREU. His research interests are image processing, pattern recognition, methods of classification and optimization. E-mail: [email protected] .
Sergey Aleksandrovich Smirnov (b. 1986) received an engineer qualification in Automation and Control Systems and a Candidate of Technical Sciences degree from RSREU in 2008 and 2015, respectively. Currently he works as a leading researcher in the AITC department of RSREU. His research interests are currently focused on computer optics, image processing, pattern recognition, and control systems. E-mail: [email protected] .
Valery Viktorovich Strotov (b. 1980) received a B.S., an engineer qualification in Automation and Control Systems, and a Candidate of Technical Sciences degree from RSREU in 2001, 2002 and 2009, respectively. He has been a researcher (since 2002) and an associate professor (since 2009) in the Automation and Information Technologies in Control department of RSREU. His technical interests include image processing (image registration and stabilization, object detection, tracking and recognition) in industrial and on-board vision systems. E-mail: [email protected] .
Received May 24, 2017. The final version – August 14, 2017.