
Article in Pattern Recognition Letters · February 2018


DOI: 10.1016/j.patrec.2018.02.020


All content following this page was uploaded by François Rameau on 25 February 2018.


Pattern Recognition Letters


journal homepage: www.elsevier.com

Efficient adaptive non-maximal suppression algorithms for homogeneous spatial keypoint distribution

Oleksandr Bailo, Francois Rameau∗∗, Kyungdon Joo, Jinsun Park, Oleksandr Bogdan, In So Kweon
Department of Electrical Engineering, KAIST, Daejeon, 34141, Republic of Korea

ABSTRACT

Keypoint detection usually results in a large number of keypoints which are mostly clustered, redundant, and noisy. These keypoints often require special processing like Adaptive Non-Maximal Suppression (ANMS) to retain the most relevant ones. In this paper, we present three new efficient ANMS approaches which ensure a fast and homogeneous repartition of the keypoints in the image. For this purpose, a square approximation of the search range to suppress irrelevant points is proposed to reduce the computational complexity of the ANMS. To further speed up the proposed approaches, we also introduce a novel strategy to initialize the search range based on image dimension which leads to a faster convergence. An exhaustive survey and comparisons with already existing methods are provided to highlight the effectiveness and scalability of our methods and the initialization strategy.
© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

Keypoint detection is often the first step for various tasks such as SLAM [14], panorama stitching [4], camera calibration [3], and visual tracking [12, 5]. Therefore, this stage potentially affects the robustness, stability, and accuracy of the aforementioned applications. In the past decade, we have witnessed significant advances in keypoint detectors leading to major improvements in terms of accuracy, speed, and repeatability. But while the detection of keypoints has been intensively studied, ensuring their homogeneous spatial distribution has attracted a rather low level of attention. It is well known that spatial point distribution is crucial to avoiding problematic cases like degenerated configurations (for structure from motion or SLAM) or redundant information (i.e. cluster of points) as depicted in Fig. 1. Moreover, a homogeneous and unclustered point distribution might speed up most computer vision pipelines since a lower number of keypoints is needed to cover the whole image.

Fig. 1: Keypoint detection: (a) TopM NMS, (b) bucketing, (c) proposed ANMS. The bottom right subimage represents the coverage and clusteredness of keypoints computed using a Gaussian kernel. The red color in the subimage stands for a dense cluster of points, while the blue color represents an uncovered area.

One of the most effective solutions to ensure well-distributed keypoint detection is to apply an Adaptive Non-Maximal Suppression (ANMS) algorithm on the keypoints extracted by a detector. However, despite all the advantages offered by such approaches, these methods have been rarely used in practice due to their high computational complexity. To overcome this limitation we propose three novel approaches called Range Tree ANMS (RT ANMS), K-d Tree ANMS (K-dT ANMS), and Suppression via Square Covering (SSC). The developed algorithms aim to efficiently select the strongest and well-distributed keypoints across the image. We achieve such performance using a square search range approximation which is initialized in an optimal and intuitive manner (see Fig. 2).

An abundant number of experiments are used to demonstrate the relevance of our ANMS algorithms in terms of speed, spatial distribution, and memory efficiency. Furthermore, we experimentally highlight that ANMS is a beneficial step for SLAM, which drastically improves the accuracy of the motion estimation while using a restricted number of keypoints.

∗∗ Corresponding author: Tel.: +8210-3355-7120; E-mail addresses: [email protected] (F. Rameau)
[Figure 2 panels: input image; keypoint detection; preprocessing and initialization; suppression range search by iteration (range size 250p: 9 detected points; 125p: 34; 63p: 126; 94p: 58; 70p: 105); image with output keypoints.]

Fig. 2: Algorithm’s workflow: (a) keypoint detection in the original image (depicted in blue), (b) sorting keypoints by strength and initialization of the search range, (c) conceptual representation of our ANMS algorithm where every column represents the search range guess (orange boxes) through a binary search process iterated until the queried number of points is reached (100 in this example), while every row depicts the iterations through input points, (d) final result where the red dots represent the selected keypoints.

To sum up, the contributions of this paper are the following:
• Three novel and efficient ANMS algorithms
• A new and optimal initialization of the search range
• An extensive series of experiments against state-of-the-art
• Efficient and optimized ANMS codes are made available at https://github.com/BAILOOL/ANMS-Codes.

This paper is organized as follows. In Section 2, we provide an extensive literature review of existing approaches. The notations as well as proposed methods are introduced in Section 3. Finally, a large number of experiments is provided in Section 4 followed by a brief conclusion (Section 5).

2. Related work

In this section, we report existing methods that have been developed to improve the spatial distribution of keypoints. These approaches can be divided into three categories: bucketing approaches, Non-Maximal Suppression (NMS), and ANMS.

2.1. Bucketing approach

Currently, the bucketing-based point detection approach [10] is the most common technique used to ensure good repartition of the keypoints. This approach is relatively simple: the source image is partitioned into a grid and keypoints are detected in each grid cell. The bucketing-based approach is efficient for detecting keypoints all over the image, however, it is unable to avoid the presence of redundant information (i.e. clusters of keypoints).

2.2. Non-maximal suppression

NMS (also referred to as TopM) is often used to remove a large number of keypoints which are mostly redundant or noisy responses of the keypoint detectors. The most common approach for NMS [15] consists of suppressing the weakest keypoints using an empirically determined threshold. Thereafter, the clusteredness is often reduced by suppressing the keypoints which do not belong to a local maximum in a particular radius. NMS is a straightforward and fast way to reject unnecessary corners, but, in many real case situations, this approach leads to a very limited spatial dissemination of the keypoints (see Fig. 1(b)).

It should be noted that certain works have recently attempted to improve the NMS stage by introducing a novel adaptive cornerness score calculation taking into consideration the local contrast around the keypoints [16]. Thus, these approaches tend to improve the spatial distribution as well as the robustness against illumination variations. However, they suffer from the point clustering effect inherent to NMS approaches.

2.3. Adaptive non-maximal suppression

ANMS methods have been developed to tackle the aforementioned drawbacks. These techniques enforce better keypoint spatial distribution by jointly taking into account the cornerness strength and the spatial localization of the keypoints. The very first ANMS approach was proposed by Brown et al. [4]. The authors initially introduced this concept to robustify the image matching for panorama stitching. In that work, the keypoints are suppressed based on their corner strength and the location of the closest strong keypoint. Unfortunately, the original implementation of this ANMS has a quadratic complexity which is not suitable for real-time applications such as SLAM.

To overcome this problem, multiple attempts to reduce the computational time of ANMS have been investigated. For instance, Cheng et al. [7] proposed an algorithm using a 2-dimensional k-d tree for space-partitioning of high-dimensional data. Using this data structure, the keypoints are separated into rectangular image regions. Then, from each cell, the strongest features are selected as the output sample set. This algorithm was extended by Behrens et al. [1] using a general tree data structure. While these methods perform faster than the traditional ANMS [4], they do not necessarily output homogeneously distributed points.

More recently, Gauglitz et al. [8] have proposed two complementary approaches that reportedly perform in a subquadratic run time. In the first approach, the authors have chosen to use an approximate nearest neighbor algorithm [6] which relies on a randomized search tree [17]. The second algorithm named Suppression via Disk Covering (SDC) aims to further boost the performance of the ANMS. The algorithm simulates an approximate radius-nearest neighbor query by superimposing a grid
onto the keypoints and approximating the Euclidean distance between keypoints by the distance between the centers of the cells into which they fall.

Table 1: TDS time and storage analysis.

Operation | K-d Tree | Range Tree
Insert (time) | $O(\log n)$ | $O(\log^{d} n)$
Query (time) | $O(n^{1-1/d} + \mathrm{card}(P_w))$ | $O(\log^{d} n + \mathrm{card}(P_w))$
Delete (time) | $O(\log n)$ | $O(\log^{d} n)$
Storage | $O(n)$ | $O(n \log^{d-1} n)$

Our proposed approaches tackle the limitations of previous works while maintaining favorable efficiency and scalability.
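For reference, the quadratic baseline that the later works try to avoid can be sketched in a few lines of Python. This is only an illustration of the idea attributed to Brown et al. [4] in Section 2.3, not code evaluated in this paper. The function name `brown_anms`, the `(x, y, strength)` tuple layout, and the robustness factor `c_robust` (0.9 is assumed here as the usual setting of the original method) are assumptions made for this sketch.

```python
import math


def brown_anms(keypoints, m, c_robust=0.9):
    """Quadratic ANMS sketch.

    keypoints: list of (x, y, strength) tuples (hypothetical layout).
    Returns the indices of the m keypoints with the largest suppression radius.
    """
    radii = []
    for i, (xi, yi, si) in enumerate(keypoints):
        r_min = math.inf  # no sufficiently stronger neighbour: infinite radius
        for j, (xj, yj, sj) in enumerate(keypoints):
            # Only points that are significantly stronger can suppress point i.
            if j != i and si < c_robust * sj:
                r_min = min(r_min, math.hypot(xi - xj, yi - yj))
        radii.append((r_min, i))
    # Keep the points whose nearest stronger neighbour is farthest away.
    radii.sort(key=lambda item: item[0], reverse=True)
    return [i for _, i in radii[:m]]
```

The nested loop over all keypoint pairs is what produces the quadratic cost mentioned above; the sort and the final slice add only O(n log n) and O(m).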

3. Methodology

In this section, we state the problem and propose several efficient algorithms which ensure a homogeneous repartition of keypoints in the image. Specifically, we cover ANMS based on Tree Data Structure (TDS) (which includes K-dT and RT ANMS) followed by Suppression via Square Covering (SSC). Lastly, we provide a derivation of the initialization of the search range to further speed up the proposed algorithms.

3.1. Problem statement

Most of the recent ANMS approaches share a common pipeline. The set of two-dimensional ($d = 2$) input keypoints $P_{in} = \{p_{in}^{i}\}_{i=1}^{n}$ of size $n = \mathrm{card}(P_{in})$ (where $\mathrm{card}(\cdot)$ stands for the cardinality operator) is extracted by the detector and sorted according to the cornerness score of the points. Further, the keypoints in $P_{in}$ are iteratively processed to compute a smaller and better-distributed set of output keypoints $P_{out} = \{p_{out}^{i}\}_{i=1}^{m}$ of size $m = \mathrm{card}(P_{out})$, where $m$ is defined by the user. The output set of points ensures a good spatial coverage all over the image while avoiding clustering. This homogeneous point distribution is enforced by a spatial consistency check in an adaptive search range of size $w$ ($w$ is the radius of a circle or half the side of a square depending upon what approach is used) defining the suppression neighborhood around a candidate point $p_{in}$. The radius $w$ is adjusted until the number of retrieved points is close to $m$ according to a certain threshold $m \pm t$, where $t$ represents a user-defined tolerance threshold.

3.2. ANMS based on Tree Data Structure

Using a data structure is a common way to approach the ANMS problem [8]. However, previous attempts have resulted in relatively inefficient implementations (Section 2). In addition, as observed in [1], after the ANMS step, there are still regions in the image containing a high level of clusteredness. In this section, we propose an efficient algorithm which relies on more suitable data structures and maintains good spatial keypoint distribution. K-dimensional Tree [13] (K-dT) and Range Tree [2] (RT) have been used for this purpose.

First, K-dT is a binary search tree where the data in each node is a K-dimensional point in space. Using this data structure allows space partitioning to organize points in a K-dimensional space. This partitioning can be used to efficiently retrieve the set of points $P_w$ which falls into a defined range around a particular point. On the other hand, RT is an alternative to K-dT. RT is a binary search tree where the data in each node contains an associated structure that is a $(d-1)$-dimensional RT. Compared to K-dTs, RTs offer faster query times in exchange for worse storage complexity (see Table 1). While these two data structures are essentially different, from the high-level perspective, the algorithm is generic and appropriate for any data structure that is capable of retrieving the set of points within the defined range. Therefore, we describe both proposed algorithms (i.e., K-dT ANMS and RT ANMS) within this subsection.

The TDS is built on keypoints $P_{in}$ sorted in decreasing order of strength (i.e., cornerness score). This TDS is used in our algorithm as a way to efficiently obtain the nearest neighbors of a particular keypoint given a search range. This search range is determined by the binary search that tries to guess the most appropriate search range $w$ to satisfy the queried number of keypoints. For every $w$ guess, the nearest neighbors of each keypoint (processed in a decreasing order of strength) are suppressed in a way that they will not be considered in further iterations under the selected $w$. For this purpose, the index list $Idx_s$ is used to keep track of the uncovered keypoints. The binary search terminates when the number of retrieved keypoints is close to the number of queried keypoints $m$ according to a tolerance threshold $m \pm t$. The outline is provided in Algorithm 1.

The proposed algorithm has similarities to the algorithm presented in [8] where the authors have chosen to use an approximate nearest neighbor algorithm which relies on a randomized search tree. However, that algorithm [8] is not optimally efficient since it performs both query and delete operations for each candidate keypoint in $P_{in}$ per radius guess. Furthermore, it requires dynamically adding/removing keypoints to the tree which drastically slows performance. In contrast, our algorithms achieve comparable results with a single query operation per search range guess, which makes them more efficient and scalable.

Algorithm 1: ANMS based on TDS
Input: keypoints P_in extracted by the detector
Output: spatially distributed keypoints P_out
sort P_in by strength
build TDS on sorted P_in
initialize binary search boundaries (Sec. 3.4)
while binary search for search range w do
    P_out = ∅
    initialize Idx_s with all as selected
    for p_i ∈ P_in do
        if p_i ∈ Idx_s then
            P_out = P_out ∪ p_i
            P_w = TDS.query(p_i, w)
            Idx_s = Idx_s \ P_w
    if |card(P_out) − m| ≤ t then return P_out

3.3. Suppression via square covering

We have compared both K-dT ANMS and RT ANMS and observed similar performance in terms of keypoint repeatability and clusteredness (see Section 4.4). It is worth mentioning
that while in the case of K-dT the range of search is defined by the radius around the candidate point, RT uses a square approximation of the search range. This square approximation can potentially boost the speed performance of the ANMS.
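The loop of Algorithm 1 and the difference between the two range shapes can be illustrated with a small Python sketch. It is a simplification under stated assumptions: a linear scan stands in for `TDS.query`, so it shows the control flow but not the query-time advantage of a K-d tree or range tree. The names `in_disk`, `in_square`, and `anms_range_query`, as well as the explicit `w_low` and `w_high` arguments, are hypothetical.

```python
import math


def in_disk(p, q, w):
    """Radius-style range (K-dT ANMS): Euclidean distance of at most w."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= w


def in_square(p, q, w):
    """Square approximation (RT ANMS): q lies in the square of half-side w around p."""
    return abs(p[0] - q[0]) <= w and abs(p[1] - q[1]) <= w


def anms_range_query(points, m, tol, w_low, w_high, in_range=in_square):
    """points: (x, y) tuples sorted by decreasing strength.

    Returns (selected indices, w) for the last search range guess tried.
    """
    selected, w = [], w_high
    while w_low <= w_high:
        w = (w_low + w_high) / 2.0
        alive = [True] * len(points)       # plays the role of Idx_s
        selected = []
        for i, p in enumerate(points):
            if not alive[i]:
                continue                   # suppressed by a stronger keypoint
            selected.append(i)
            # Stand-in for P_w = TDS.query(p_i, w); weaker points come later.
            for j in range(i + 1, len(points)):
                if alive[j] and in_range(p, points[j], w):
                    alive[j] = False
        if abs(len(selected) - m) <= tol:
            break
        if len(selected) < m:
            w_high = w - 1.0               # too few points kept: shrink the range
        else:
            w_low = w + 1.0                # too many points kept: enlarge the range
    return selected, w
```

The square test needs only two absolute differences and comparisons per candidate, whereas the disk test needs a distance computation, which is the kind of saving the square approximation aims at.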
One of the key approximations which makes SDC [8] efficient is a radius-nearest neighbor query, by superimposing a grid $G_w$ onto the keypoints and approximating the Euclidean distance between keypoints by the distance between the centers of the cells into which they fall. While this approximation performs well, it still requires computing the Euclidean distance between a large number of keypoints. Therefore, it is a crucial concern since the number of computations increases as the number of keypoints grows.

To tackle the aforementioned problem, we propose to apply the square approximation for the SDC [8] algorithm. In particular, once the grid $G_w$ is set, we try to cover the cells which lie within $2w$ (determined by binary search) regardless of where exactly the points are located inside this square range of coverage. This drastically boosts the performance of the algorithm since covering of the cells is simply performed by traversing through a square search range without the need for Euclidean distance computation. The pseudo-code is in Algorithm 2.

Algorithm 2: Suppression via Square Covering (SSC)
Input: keypoints P_in extracted by the detector
Output: spatially distributed keypoints P_out
sort P_in by strength
initialize binary search boundaries (Sec. 3.4)
while binary search for suppression side w do
    set resolution of grid G_w = w/2
    uncover cells of G_w
    P_out = ∅
    for p_i ∈ P_in do
        if cell G_w[p_i] is not covered then
            P_out = P_out ∪ p_i
            cover cells of G_w around p_i with square of side 2w
    if |card(P_out) − m| ≤ t then return P_out

3.4. Initialization of search range

Similar to our proposed algorithms, the SDC [8] uses binary search to guess the appropriate search range. In this previous work, the upper bound $a_h$ of the binary search is set to the image width $W_I$, while the lower one $a_l$ is set to 1. This often results in unnecessary iterations and decreases the convergence speed. To tackle this problem, we propose a novel and elegant way to precompute the bounds for the binary search which drastically decreases the number of iterations until convergence and, in turn, improves the speed of the algorithm.

Fig. 3: Graphical representation of the optimal point distribution. Bounding boxes of different colors represent the search range around the candidate points.

Our problem statement is the following: we want to homogeneously distribute the $m$ queried number of points on the image without any clusters. To do so, we try to cover the image with squares of side $2a_h$ with a minimal distance between the square centers $a_h + 1$ (see Fig. 3). Given $2a_h$ and $W_I$, we can calculate the maximum number of squares that perfectly fit in a row of the image. We define this row as a set of squares placed at the same height in the image where the first and the last square in a row are perfectly aligned with image borders. If there are $q$ points (i.e., square centers) inside each row, then there are $q - 1$ distances in each row between these points. In addition, the left and right extreme points are located at a distance $a_h$ from the left and right borders of the image. Thus, we can express the image width $W_I$ in terms of $a_h$ and $q$:

$$W_I = 2a_h + (a_h + 1)(q - 1), \tag{1}$$

hence, from Equation (1) the number of points in each row is:

$$q = \frac{W_I - a_h + 1}{a_h + 1}. \tag{2}$$

Similarly, the maximum number of square centers $l$ possibly fitting within the image height $H_I$ is:

$$H_I = 2a_h + (a_h + 1)(l - 1). \tag{3}$$

The queried number of points $m$ is equal to the product of $q$ and $l$. By substituting $l = m/q$ into Equation (3) and substituting $q$ from Equation (2), we obtain the following equation:

$$(m - 1)a_h^2 + a_h(W_I + 2m + H_I) + m + W_I - H_I W_I = 0. \tag{4}$$

Solving this equation for $a_h$ yields two solutions, one of which is always negative, while the other one gives us the final estimation of the square side:

$$a_h = -\frac{H_I + W_I + 2m - \sqrt{\Delta}}{2(m - 1)}, \tag{5}$$

where the discriminant of the quadratic Equation (4) is:

$$\Delta = 4W_I + 4m + 4H_I m + H_I^2 + W_I^2 - 2W_I H_I + 4W_I H_I m. \tag{6}$$

It is worth mentioning that, while the solution tries to allocate as many points as possible, it does not guarantee that the number of points on the image will be exactly equal to $m$. This happens for several reasons. First of all, the fraction $m/q$ might not produce an integer value for $l$. Secondly, during the code implementation, we round the obtained value for $a_h$ since the minimum unit of an image is 1 pixel.

The lower bound $a_l$ of the binary search can be determined by looking closely at the worst possible point distribution. This happens when all $n$ input points are located in a single square on the image with no space between them. Given such a distribution, we want to retrieve at least $m$ queried points by filling this space with the smallest possible squares of side $2a_l$. This can be mathematically expressed as $m(2a_l)^2 = n$. Therefore, the equation for the lower bound of the binary search is the following:

$$a_l = \frac{1}{2}\sqrt{\frac{n}{m}}. \tag{7}$$
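The two ingredients of this section, the bounds of Eqs. (5)-(7) and the grid covering of Algorithm 2, can be put together in a short Python sketch. It is a minimal illustration rather than the released implementation. The names `search_bounds` and `ssc`, the rounding of the bounds to whole pixels, the fixed reach of two cells on each side (a square of side 2w on a grid of resolution w/2), and the choice to return the last guess when the tolerance is never met are assumptions made here.

```python
import math


def search_bounds(width, height, n, m):
    """Return (a_l, a_h) following Eqs. (5)-(7) as printed; requires m > 1."""
    delta = (4 * width + 4 * m + 4 * height * m + height ** 2 + width ** 2
             - 2 * width * height + 4 * width * height * m)            # Eq. (6)
    a_h = -(height + width + 2 * m - math.sqrt(delta)) / (2 * (m - 1))  # Eq. (5)
    a_l = 0.5 * math.sqrt(n / m)                                       # Eq. (7)
    return a_l, a_h


def ssc(points, m, tol, width, height):
    """Suppression via square covering, sketched.

    points: (x, y) tuples inside the image, sorted by decreasing strength.
    Returns (selected indices, w) for the last range guess that was tried.
    """
    a_l, a_h = search_bounds(width, height, len(points), m)
    low = max(1, math.floor(a_l))        # bounds rounded to whole pixels
    high = max(low, math.ceil(a_h))
    selected, w = [], low
    while low <= high:
        w = (low + high) // 2
        cell = w / 2.0                   # resolution of the grid G_w
        n_cols = int(width / cell) + 1
        n_rows = int(height / cell) + 1
        covered = [[False] * n_cols for _ in range(n_rows)]
        selected = []
        for idx, (x, y) in enumerate(points):
            row, col = int(y / cell), int(x / cell)
            if covered[row][col]:
                continue                 # cell already covered by a stronger point
            selected.append(idx)
            # Cover the square of side 2w around the point: two cells each way.
            for r in range(max(0, row - 2), min(n_rows - 1, row + 2) + 1):
                for c in range(max(0, col - 2), min(n_cols - 1, col + 2) + 1):
                    covered[r][c] = True
        if abs(len(selected) - m) <= tol:
            break
        if len(selected) < m:
            high = w - 1                 # too few points kept: shrink w
        else:
            low = w + 1                  # too many points kept: enlarge w
    return selected, w
```

As a worked instance of Eq. (5), a 1280 × 720 image with m = 800 gives an upper bound of roughly 32 pixels, far below the image width of 1280, so the binary search starts from a much narrower interval.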
4. Results

4.1. Time and storage complexity

The detailed time complexity analysis is provided in Table 2. All of the presented algorithms (listed in the ‘Method’ column) rely on preprocessing (i.e., sorting by strength) the input keypoints. For this purpose, we utilize a sorting algorithm with an average performance of $O(n \log n)$. Additionally, K-dT and RT ANMSs rely on TDS which has to be populated with the input keypoints. This is performed by inserting (see Table 1 for complexity) $n$ keypoints one by one into a data structure resulting in overall $O(n \log n)$ and $O(n \log^{d} n)$ complexity, respectively. The query time for each algorithm to select appropriate keypoints is stated in the ‘Query’ column in Table 2. Specifically, the TopM algorithm simply retrieves $m$ keypoints from an already sorted list in $O(m)$. The traditional ANMS [4] (designated ‘Brown’ in our experiments) algorithm requires the computation of the minimum distance between every keypoint which requires $O(n^2)$ followed by sorting in $O(n \log n)$ and keypoint retrieval in $O(m)$. Since the rest of the algorithms (SDC, K-dT ANMS, RT ANMS, and SSC) rely on a binary search algorithm to find the appropriate search range in $O(\log w_{ini})$, the total query time complexity can be obtained by multiplying the number of search range guesses with the complexity of keypoint selection per every guess. The total time complexity listed in ‘Total (approximated)’ gives us the following insight into the algorithm’s performance. Obviously, the TopM approach clearly outperforms all other methods in terms of speed due to its simplicity. Furthermore, SDC, K-dT ANMS, RT ANMS, and SSC are certainly asymptotically faster than traditional ‘Brown’ [4].

The storage complexity evaluation is shown in Table 2. Methods that do not rely on any data structure (e.g., TopM, ‘Brown’ [4], SDC, SSC) at most occupy memory necessary to incorporate a number of input and output keypoints resulting in $O(n + m)$ complexity. These methods surely demonstrate better storage complexity compared to K-dT and RT ANMSs which additionally require memory for storing TDS (see Table 1).

Overall, due to the sophisticated time complexity estimation, it is challenging to highlight a clear winner among the fastest ANMS algorithms (SDC, K-dT ANMS, RT ANMS, and SSC). In order to provide a qualitative evaluation of the algorithms, we have performed an extensive evaluation of all methods.

4.2. Synthetic and real experiments

First of all, to fairly assess the speed performance of the different algorithms, a large series of synthetic experiments has been performed. For this purpose, a set of randomly distributed 2D points is generated on a synthetic image of resolution 1280 × 720p. Further, a random cornerness score is individually assigned to every point to simulate the behavior of keypoint detection in a natural image. The number of 2D points is in range [800, 11000] with a step of 100. Every test is repeated 1000 times to ensure an unbiased estimation of the algorithms’ speed. The queried number of points is fixed to 800, while the search range $w$ is initialized to image width. We intentionally did not use our initialization technique (Section 3.4) for this test to provide a fair comparison with SDC [8]. It should be noted that Brown [4] has been removed from these experiments for the sake of clarity (i.e., scale inconsistency) since this method is significantly slower than the proposed approaches.

[Plots: mean and standard deviation of the computational time (ms) versus number of points, for RT ANMS, SSC, SDC, K-dT ANMS, and TopM.]

Fig. 4: Comparison of methods on synthetic data: (a) mean processing time, (b) standard deviation.

The mean computational time and the standard deviation against the number of points per iteration are available in Fig. 4(a) and Fig. 4(b) respectively. Through this experiment, it is noticeable that the TopM algorithm drastically outperforms more sophisticated approaches (but provides a very unsatisfying point distribution in practice, see Sec. 4.4). Other algorithms show interesting characteristics. SSC is indisputably more efficient than any existing algorithms both in stability (i.e., the standard deviation remains very low) and speed. On the other hand, SDC demonstrates satisfying results but suffers from a lack of stability for a low number of input keypoints. Indeed, this approach is more efficient than RT ANMS when the number of input points is large, however, this tendency is reversed when the detected keypoints do not exceed 5000. Moreover, SDC is less scalable than our SSC approach. Finally, despite its efficiency for a small number of points, K-dT ANMS loses its advantage for more than 2000 points.

While the assessment with synthetic data is a good evaluation showing clear tendencies, the distribution of keypoints in real images can be different than the ones obtained synthetically. Therefore, we propose an extensive series of evaluations using real images. For this purpose, we select 1000 images from KITTI [9] (Sequence 00), and detect keypoints using FAST [15] with a threshold th = 5. Such a relatively low threshold results in a large number of detected keypoints (i.e., >10000 keypoints per image). Subsequently, the keypoints are sorted by their strength. Further, we iteratively select a fixed number of the strongest keypoints starting from 100 until reaching 10000. The step of such selection is 100, which results in 100 tests per image. The ANMS algorithms return a fixed percentage of the input number of keypoints. For instance, for 1000 input keypoints and a queried percentage of 10%, the number of queried keypoints is 100. We have applied 10 different ratios in range [10%, 100%] with a step of 10%. Several representative results from these evaluations are provided in Figure 5.

This extensive evaluation demonstrates that SSC clearly outperforms all other methods in terms of speed. Overall, different conclusions from those obtained with the synthetic experiments can be drawn. Indeed, SDC remains efficient for a relatively small number of queried keypoints (Fig. 5(a)) but tends to be less effective when a large number of input and output keypoints are processed. This can be explained by the substantial number
of Euclidean distance comparisons to be computed for a dense set of input keypoints (Fig. 5(b) and (c)). In this respect, our RT ANMS scales more efficiently even when many output points are requested (see Fig. 5(b) and (c)). Finally, our K-dT ANMS becomes inefficient for a large number of points due to the relatively slow query time of this data structure (see Fig. 5(c)).

Table 2: Time and storage complexity.

Method | Preprocess (time) | Build (time) | Query (time) | Total time (approximated) | Storage complexity
TopM | $O(n \log n)$ | - | $O(m)$ | $O(n \log n)$ | $O(n + m)$
Brown | $O(n \log n)$ | - | $O(n^2 + n \log n + m)$ | $O(n^2)$ | $O(n + m)$
SDC | $O(n \log n)$ | - | $O(\log w_{ini} \cdot (n + m/\varepsilon_r))$ | $O(n \log n + \log w_{ini} \cdot (n + m/\varepsilon_r))$ | $O(n + m)$
K-dT ANMS | $O(n \log n)$ | $O(n \log n)$ | $O(\log w_{ini} \cdot (n + n^{1-1/d} + \sum \mathrm{card}(P_w)))$ | $O(n \log n + \log w_{ini} \cdot (n + \sum \mathrm{card}(P_w)))$ | $O(n + \mathrm{card}(P_w) + m)$
RT ANMS | $O(n \log n)$ | $O(n \log^{d} n)$ | $O(\log w_{ini} \cdot (n + \log^{d} n + \sum \mathrm{card}(P_w)))$ | $O(n \log^{d} n + \log w_{ini} \cdot (n + \sum \mathrm{card}(P_w)))$ | $O(n \log^{d-1} n + \mathrm{card}(P_w) + m)$
SSC | $O(n \log n)$ | - | $O(\log w_{ini} \cdot (n + 4m))$ | $O(n \log n + \log w_{ini} \cdot (n + 4m))$ | $O(n + m)$

[Plots: processing time [ms] versus number of input points for SDC, K-dT ANMS, RT ANMS, and SSC; panels (a) 10 percent, (b) 40 percent, (c) 70 percent.]

Fig. 5: Mean processing time vs. number of input keypoints for 1000 images. Subfigures (a)-(c) show linear scale of the y axis.

4.3. Effect of proposed initialization

In this experiment, we evaluate the impact of our initialization on the speed of the different methods. For this purpose, the same real-image experimental setup described in Section 4.2 is used. However, in this case, we employ our initialization approach (Section 3.4). Two criteria have been utilized to determine the advantages offered by this technique. The first is the number of iterations needed to reach the number of queried points (see Fig. 6). The second one is the overall speed-up provided to each method (see Table 3).

[Bar chart: number of iterations for SDC, K-dT ANMS, RT ANMS, and SSC, without and with initialization.]

Fig. 6: Number of iterations until convergence (with and without initialization).

Table 3: Speedup provided by initialization (average over 1000 images).

Method | Without Initialization (ms) | With Initialization (ms) | Speedup
SDC | 7.4 | 3.1 | 2.4x
K-dT ANMS | 17.5 | 6.8 | 2.6x
RT ANMS | 8.9 | 7.3 | 1.2x
SSC | 2.0 | 1.4 | 1.4x

For every single method, the number of necessary iterations has been reduced by a factor of three, leading to a significant speed-up. It is noticeable that certain approaches are more affected by this initialization. For instance, this is the case for the K-dT ANMS approach which has been sped up by a factor of 2.6×. However, some other algorithms such as our RT ANMS have been only moderately improved by our bounds calculation. This can be explained by the nature of the RT structure itself. In fact, with a closer initialization (i.e., a smaller search range), the total number of expensive queries increases.

4.4. Clusteredness

The main advantage of using an ANMS strategy is the unclustered and well-distributed set of keypoints resulting from this process. Indeed, this feature allows us to avoid redundant information typically occurring with commonly used approaches such as bucketing keypoint detection and standard NMS. To evaluate the clusteredness we have reproduced the experiment suggested in [16], where the authors propose an appropriate metric to evaluate this criterion. For this evaluation, the image is divided into a regular grid of 10 × 10 cells to compute the number of points lying in every single cell. The standard deviation of the number of corners per cell is utilized as the clusteredness metric since it is representative of the homogeneity of the spatial distribution.

To provide a statistically valid evaluation of the clusteredness for every single approach, 2000 randomly selected images from the KITTI dataset [9] are used. In this experiment, th = 12 and the number of queried keypoints $m$ varies between 100 and 700. The obtained results are visible in Fig. 8. We can clearly notice that all the ANMS approaches provide similar outputs in terms of spatial distribution which can also be observed in Fig. 7. As to the bucketing approach (grid size is 7 × 5), it produces better spatial distribution than TopM, but cannot meet the performance of the ANMS strategies. This can be explained by the
Fig. 7: Keypoint detection: (a) K-dT ANMS, (b) RT ANMS, (c) SSC. The red dots represent selected keypoints. In this experiment, th = 12 and m = 100.
fact that the bucketing approach is designed to ensure good spatial distribution, but does not solve the problem of point clusters.

Fig. 8: Mean and standard deviation of the clusteredness over 1000 images.
[Plot: clusteredness versus number of points (100 to 700) for RT ANMS, SSC, SDC, Brown, K-dT ANMS, TopM, and Bucketing.]

4.5. Application to SLAM
SLAM is one of the applications where the spatial distribution of the keypoints on the image is crucial. Therefore, we have included our ANMS solutions in a stereo-SLAM algorithm which is conceptually close to S-PTAM [14]. Specifically, the keypoints are detected on both stereo images using FAST (th = 12) and filtered by our ANMS algorithms to reach 750 points. These stereo points are matched together using a line search strategy and triangulated to initialize the 3D map. Motion tracking is performed using a RANSAC P3P [11] algorithm. Finally, the mapping is achieved by refining the structure and the motion together via a local bundle adjustment scheme. For this evaluation, we have utilized all the training sequences from the KITTI dataset [9] where an accurate ground truth is provided. The mean translation (in percentage) and rotation (in degrees) errors per sequence are computed with the metrics recommended by [9].
The results of the entire experiment are available in Fig. 9. Regarding the translational error, a clear tendency is noticeable. For instance, the TopM algorithm performs particularly poorly compared with the other approaches. On the other hand, the bucketing approach tends to perform better than the TopM approach but never provides better results than the ANMS methods. All the ANMS approaches provide comparable results. The same tendency is observed for the rotation estimation. However, for rotation, the detection of well-distributed keypoints is less crucial.
The error discrepancy between the different sequences may be justified by the various contexts in which the sequences have been acquired. For instance, in Seq00 the large rotational error can be explained by the high quantity of turns in the sequence, while Seq04 (which exhibits a very low rotational error) mostly consists of a short and straight line. Moreover, Seq01 is interesting to analyze because it is probably the most challenging: the vehicle is going at high speed through a relatively empty scene (low texture). These factors make the keypoints particularly difficult to track. Under these conditions, ANMS algorithms show even more significant improvements. Another observation is the improved robustness to moving objects. Our approaches also show very promising results in sequences containing one or more moving objects (for instance in Seq04). With ANMS, only a few keypoints are detected on moving objects, while the majority belong to the rigid background; therefore, the outliers are more efficiently removed by a robust estimation step (RANSAC in our SLAM). Finally, in Fig. 9, the slight error discrepancy between ANMS methods is mostly due to the inherent randomness of the point tracking strategy, noise, and numerical error typical of real image experiments. Nevertheless, we can certainly conclude that all the ANMS approaches compared in this paper significantly improve the SLAM algorithm in a very similar manner.
In Fig. 10, we propose a qualitative comparison of our SSC algorithm against the bucketing strategy. For this evaluation, we use the New College dataset [18] (see Fig. 1) consisting of 50000 stereo image pairs, covering 2.5 km with a hand-held stereo camera (multiple loops and challenging scenarios). Through this experiment, it is clear that our ANMS approach significantly reduces drift over the sequence compared to the bucketing approach. This drift is particularly obvious in the side view (see Fig. 10(b)). Note that the TopM algorithm is not depicted in this figure for the sake of clarity (very large drift).

4.6. Discussion on proposed methods
Certainly, ANMS approaches are beneficial under specific contexts and conditions. They are appropriate for pose estimation (SLAM, panorama stitching, etc.), self-calibration, and Structure from Motion (SfM). Similarly, Schauwecker et al. [16] have demonstrated that a good dissemination of the points in the images results in better sparse stereo matching. However, ANMS is not limited to these topics and can be appropriate for many real-time approaches. For example, it might be the case for Bag-of-Words place recognition, where well-distributed points can lead to a stronger description of the image. While the authors have originally developed SDC [8] for planar tracking purposes, we believe that ANMS might be counter-productive for visual tracking under certain conditions (e.g. small target, cluttered scene). Other techniques requiring a dense cluster of points on a salient part of the image (e.g. point-based obstacle detection) would probably not be improved by ANMS.
In this paper, we have proposed three ANMS techniques named K-dT ANMS, RT ANMS, and SSC to homogeneously distribute keypoints on the image. While ANMS methods provide visually and statistically (analyzed by Z-test) similar outputs in terms of spatial distribution, SSC demonstrates the best time and scalability performance. Therefore, this algorithm is
advisable when an application requires real-time performance even when the number of input points is relatively high. This would include, for example, real-time SLAM or visual odometry. On the other hand, since K-dT and RT ANMSs are based on TDS to store the input points, they can be used for situations when keypoints need to be reused. A good example is large-scale SfM, where many re-projections onto the images have to be performed to aggregate new images. Thus, these approaches can be accelerated by using the same structure for keypoint detection and for point matching. Compared to K-dT ANMS, RT ANMS offers faster query time but requires more storage memory. Therefore, a user should consider this tradeoff when choosing among the proposed TDS-based ANMS methods.

Fig. 9: Experimental results of different methods on SLAM: (a) mean translational error, (b) rotational error.
[Bar charts per KITTI sequence (Seq00 to Seq10) for Brown, RT-ANMS, SSC, SDC, K-dT ANMS, TopM, and Bucketing; y-axes: translational error (%) and rotational error (°).]

Fig. 10: Trajectories computed on the New College dataset using our SSC approach and the bucketing strategy: (a) top view, (b) side view.
[Axes: X (m), Y (m), Z (m); curves: SSC and Bucketing.]

5. Conclusion
In this paper, we have presented three novel ANMS techniques (code is provided) to homogeneously distribute detected keypoints in the image. Through an extensive series of experiments, we have highlighted the effectiveness and scalability of our approaches. Furthermore, we have demonstrated the positive impact of our ANMS strategies on visual SLAM. The presented results show that ANMS is a beneficial step for improving SLAM performance. Another major contribution of this paper is the binary search boundary initialization, which drastically reduces the number of iterations needed to retain the queried number of points. The proposed initialization is designed to be suitable for any ANMS relying on binary search.
The current ANMS approaches are designed to handle conventional images, but may perform poorly on non-uniform spatial resolution induced by distortion (e.g. fisheye lens, catadioptric system, etc.). Naturally, the extension of this work will focus on this problem by proposing an ANMS applicable to the unified spherical model.

Acknowledgment
This research was supported by the Shared Sensing for Cooperative Cars Project funded by Bosch (China) Investment Ltd. The second author was supported by the Korea Research Fellowship (KRF) Program through the NRF funded by the Ministry of Science, ICT and Future Planning (2015H1D3A1066564).

References
[1] A. Behrens and H. Röllinger. Analysis of feature point distributions for fast image mosaicking algorithms. Acta Polytechnica, 50(4), 2010.
[2] R. Berinde. Efficient implementations of range trees. 2007.
[3] Y. Bok, H. Ha, and I. Kweon. Automated checkerboard detection and indexing using circular boundaries. Pattern Recognition Letters, 71:66–72, 2016.
[4] M. Brown, R. Szeliski, and S. Winder. Multi-image matching using multi-scale oriented patches. In CVPR, 2005.
[5] S. Buoncompagni, D. Maio, D. Maltoni, and S. Papi. Saliency-based keypoint selection for fast object detection and matching. Pattern Recognition Letters, 62:32–40, 2015.
[6] T. Chan. A minimalist's implementation of an approximate nearest neighbor algorithm in fixed dimensions. See https://goo.gl/cvDjAs, 2006.
[7] Z. Cheng, D. Devarajan, and R. Radke. Determining vision graphs for distributed camera networks using feature digests. EURASIP Journal on Applied Signal Processing, 2007(1):220–220, 2007.
[8] S. Gauglitz, L. Foschini, M. Turk, and T. Höllerer. Efficiently selecting spatially distributed keypoints for visual tracking. In ICIP, 2011.
[9] A. Geiger, P. Lenz, and R. Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In CVPR, 2012.
[10] B. Kitt, A. Geiger, and H. Lategahn. Visual odometry based on stereo image sequences with RANSAC-based outlier rejection scheme. In IV, 2010.
[11] L. Kneip, D. Scaramuzza, and R. Siegwart. A novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation. In CVPR, 2011.
[12] Q. Miao, G. Wang, C. Shi, X. Lin, and Z. Ruan. A new framework for on-line object tracking based on SURF. Pattern Recognition Letters, 32(13):1564–1571, 2011.
[13] M. Muja and D. Lowe. Scalable nearest neighbor algorithms for high dimensional data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11):2227–2240, 2014.
[14] T. Pire, T. Fischer, J. Civera, P. De Cristóforis, and J. Berlles. Stereo parallel tracking and mapping for robot localization. In IROS, 2015.
[15] E. Rosten and T. Drummond. Machine learning for high-speed corner detection. In ECCV, 2006.
[16] K. Schauwecker, R. Klette, and A. Zell. A new feature detector and stereo matching method for accurate high-performance sparse stereo matching. In IROS, 2012.
[17] R. Seidel and C. Aragon. Randomized search trees. Algorithmica, 16(4):464–497, 1996.
[18] M. Smith, I. Baldwin, W. Churchill, R. Paul, and P. Newman. The New College vision and laser data set. The International Journal of Robotics Research, 28(5):595–599, May 2009.
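
Supplementary sketch. The clusteredness metric used in the experiments (a regular 10 × 10 grid over the image, then the standard deviation of the number of keypoints per cell) can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions, not the code used for the reported results: the function name, the argument names, and the optional `normalize` flag are hypothetical, and the paper does not state how the values plotted in Fig. 8 are scaled.

```python
from statistics import pstdev


def clusteredness(points, width, height, grid=10, normalize=False):
    """Standard deviation of the number of keypoints per grid cell.

    points    : iterable of (x, y) pixel coordinates inside the image
    grid      : number of cells per side (10 in the experiment described)
    normalize : assumption, not taken from the paper: divide the counts by
                the total number of points before taking the deviation
    """
    counts = [0] * (grid * grid)
    n = 0
    for x, y in points:
        # Map the pixel coordinate to a cell index, clamping the far border.
        col = min(int(x * grid / width), grid - 1)
        row = min(int(y * grid / height), grid - 1)
        counts[row * grid + col] += 1
        n += 1
    if normalize and n > 0:
        counts = [c / n for c in counts]
    return pstdev(counts)


# One point in the centre of every cell: perfectly homogeneous.
uniform = [(10 * c + 5, 10 * r + 5) for r in range(10) for c in range(10)]
# The same number of points packed into a single cell: strongly clustered.
clustered = [(5, 5)] * 100

print(clusteredness(uniform, 100, 100))    # 0.0
print(clusteredness(clustered, 100, 100))  # about 9.95
```

A lower value indicates a more homogeneous spatial distribution, which is the behaviour the ANMS methods are compared on in Fig. 8.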
