PRLpaper Preprint2018
All content following this page was uploaded by François Rameau on 25 February 2018.
Oleksandr Bailo, Francois Rameau∗∗, Kyungdon Joo, Jinsun Park, Oleksandr Bogdan, In So Kweon
a Department of Electrical Engineering, KAIST, Daejeon, 34141, Republic of Korea
ABSTRACT
Keypoint detection usually results in a large number of keypoints which are mostly clustered, re-
dundant, and noisy. These keypoints often require special processing like Adaptive Non-Maximal
Suppression (ANMS) to retain the most relevant ones. In this paper, we present three new efficient
ANMS approaches which ensure a fast and homogeneous repartition of the keypoints in the image.
For this purpose, a square approximation of the search range to suppress irrelevant points is proposed
to reduce the computational complexity of the ANMS. To further speed up the proposed approaches,
we also introduce a novel strategy to initialize the search range based on image dimension which leads
to a faster convergence. An exhaustive survey and comparisons with already existing methods are
provided to highlight the effectiveness and scalability of our methods and the initialization strategy.
© 2018 Elsevier Ltd. All rights reserved.
1. Introduction
[Figure 2 panels: input image; keypoint detection; preprocessing and initialization; iteration over search-range guesses of 250p, 125p, 63p, 94p, and 70p, which yield 9, 34, 126, 58, and 105 detected points respectively.]
Fig. 2: Algorithm’s workflow: (a) keypoint detection in the original image (depicted in blue), (b) sorting keypoints by strength and initialization of the search
range, (c) conceptual representation of our ANMS algorithm where: every column represents the search range guess (orange boxes) through a binary search process
iterated until queried number of points is reached (100 in this example); while every row depicts the iterations through input points, (d) final result where the red
dots represent the selected keypoints.
To sum up, the contributions of this paper are the following:
• Three novel and efficient ANMS algorithms
• A new and optimal initialization of the search range
• An extensive series of experiments against state-of-the-art
• Efficient and optimized ANMS codes are made available at https://github.com/BAILOOL/ANMS-Codes.
This paper is organized as follows. In Section 2, we provide an extensive literature review of existing approaches. The notations as well as the proposed methods are introduced in Section 3. Finally, a large number of experiments is provided in Section 4, followed by a brief conclusion (Section 5).
2. Related work
In this section, we report existing methods that have been developed to improve the spatial distribution of keypoints. These approaches can be divided into three categories: bucketing approaches, Non-Maximal Suppression (NMS), and ANMS.
2.1. Bucketing approach
Currently, the bucketing-based point detection approach [10] is the most common technique used to ensure good repartition of the keypoints. This approach is relatively simple: the source image is partitioned into a grid and keypoints are detected in each grid cell. The bucketing-based approach is efficient for detecting keypoints all over the image; however, it is unable to avoid the presence of redundant information (i.e., clusters of keypoints).
2.2. Non-maximal suppression
NMS (also referred to as TopM) is often used to remove a large number of keypoints which are mostly redundant or noisy responses of the keypoint detectors. The most common approach for NMS [15] consists of suppressing the weakest keypoints using an empirically determined threshold. Thereafter, the clusteredness is often reduced by suppressing the keypoints which do not belong to a local maximum in a particular radius. NMS is a straightforward and fast way to reject unnecessary corners, but, in many real-world situations, this approach leads to a very limited spatial dissemination of the keypoints (see Fig. 1(b)).
It should be noted that certain works have recently attempted to improve the NMS stage by introducing a novel adaptive cornerness score calculation taking into consideration the local contrast around the keypoints [16]. Thus, these approaches tend to improve the spatial distribution as well as the robustness against illumination variations. However, they suffer from the point clustering effect inherent to NMS approaches.
2.3. Adaptive non-maximal suppression
ANMS methods have been developed to tackle the aforementioned drawbacks. These techniques enforce better keypoint spatial distribution by jointly taking into account the cornerness strength and the spatial localization of the keypoints. The very first ANMS approach was proposed by Brown et al. [4]. The authors initially introduced this concept to robustify the image matching for panorama stitching. In that work, the keypoints are suppressed based on their corner strength and the location of the closest strong keypoint. Unfortunately, the original implementation of this ANMS has a quadratic complexity which is not suitable for real-time applications such as SLAM.
To overcome this problem, multiple attempts to reduce the computational time of ANMS have been investigated. For instance, Cheng et al. [7] proposed an algorithm using a 2-dimensional k-d tree for space-partitioning of high-dimensional data. Using this data structure, the keypoints are separated into rectangular image regions. Then, from each cell, the strongest features are selected as the output sample set. This algorithm was extended by Behrens et al. [1] using a general tree data structure. While these methods perform faster than the traditional ANMS [4], they do not necessarily output homogeneously distributed points.
More recently, Gauglitz et al. [8] have proposed two complementary approaches that reportedly perform in a subquadratic run time. In the first approach, the authors have chosen to use an approximate nearest neighbor algorithm [6] which relies on a randomized search tree [17]. The second algorithm named Suppression via Disk Covering (SDC) aims to further boost the performance of the ANMS. The algorithm simulates an approximate radius-nearest neighbor query by superimposing a grid
onto the keypoints and approximating the Euclidean distance between keypoints by the distance between the centers of the cells into which they fall.
Our proposed approaches tackle the limitations of previous works while maintaining favorable efficiency and scalability.
Table 1: TDS time and storage analysis.
          K-d Tree                                        Range Tree
          Time                                  Storage   Time                                 Storage
Insert    $O(\log n)$                           $O(n)$    $O(\log^d n)$                        $O(n \log^{d-1} n)$
Query     $O(n^{1-1/d} + \mathrm{card}(P_w))$   $O(n)$    $O(\log^d n + \mathrm{card}(P_w))$   $O(n \log^{d-1} n)$
Delete    $O(\log n)$                           $O(n)$    $O(\log^d n)$                        $O(n \log^{d-1} n)$
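To make the Query row of Table 1 more concrete, the following is a minimal sketch of a 2-D k-d tree with a square range query. It is an illustration under stated assumptions, not the released implementation: the names `KdNode`, `build_kd_tree`, and `query_square` are hypothetical, the tree is built in one balanced pass (rather than by inserting keypoints one by one), and the query region is an axis-aligned square of half-side `w`.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


class KdNode:
    """One node of a 2-D k-d tree; it stores the index of its point."""

    def __init__(self, index: int, axis: int,
                 left: Optional["KdNode"], right: Optional["KdNode"]):
        self.index = index
        self.axis = axis  # 0 splits on x, 1 splits on y
        self.left = left
        self.right = right


def build_kd_tree(points: List[Point], indices: Optional[List[int]] = None,
                  depth: int = 0) -> Optional[KdNode]:
    """Build a balanced tree by splitting on the median, alternating axes."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % 2
    ordered = sorted(indices, key=lambda i: points[i][axis])
    mid = len(ordered) // 2
    return KdNode(ordered[mid], axis,
                  build_kd_tree(points, ordered[:mid], depth + 1),
                  build_kd_tree(points, ordered[mid + 1:], depth + 1))


def query_square(node: Optional[KdNode], points: List[Point], center: Point,
                 w: float, found: Optional[List[int]] = None) -> List[int]:
    """Return indices of points inside the square of half-side w around center."""
    if found is None:
        found = []
    if node is None:
        return found
    px, py = points[node.index]
    if abs(px - center[0]) <= w and abs(py - center[1]) <= w:
        found.append(node.index)
    split = points[node.index][node.axis]
    # Descend only into the half-spaces that the query square overlaps.
    if center[node.axis] - w <= split:
        query_square(node.left, points, center, w, found)
    if center[node.axis] + w >= split:
        query_square(node.right, points, center, w, found)
    return found
```

Pruning whole subtrees whose half-space does not overlap the query square is what keeps the query cost below a linear scan for small ranges; the reported set plays the role of $P_w$.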
3. Methodology
In this section, we describe a problem statement and propose several efficient algorithms which ensure a homogeneous repartition of keypoints in the image. Specifically, we cover ANMS based on Tree Data Structure (TDS) (includes K-dT and RT ANMSs) followed by Suppression via Square Covering (SSC). Lastly, we provide a derivation of the initialization of the search range to further speed up the proposed algorithms.
3.1. Problem statement
Most of the recent ANMS approaches share a common pipeline. The set of two-dimensional ($d = 2$) input keypoints $P_{in} = \{p^i_{in}\}_{i=1}^{n}$ of size $n = \mathrm{card}(P_{in})$ (where $\mathrm{card}(\cdot)$ stands for the cardinality operator) is extracted by the detector and sorted according to the cornerness score of the points. Further, the keypoints in $P_{in}$ are iteratively processed to compute a smaller and better-distributed set of output keypoints $P_{out} = \{p^i_{out}\}_{i=1}^{m}$ of size $m = \mathrm{card}(P_{out})$, where $m$ is defined by the user. The output set of points ensures a good spatial coverage all over the image while avoiding clustering. This homogeneous point distribution is enforced by a spatial consistency check in an adaptive search range of size $w$ ($w$ is the radius of a circle or half the side of a square depending upon what approach is used) defining the suppression neighborhood around a candidate point $p_{in}$. The radius $w$ is adjusted until the number of retrieved points is close to $m$ according to a certain threshold $m \pm t$, where $t$ represents a user-defined tolerance threshold.
3.2. ANMS based on Tree Data Structure
Using a data structure is a common way to approach the ANMS problem [8]. However, previous attempts have resulted in relatively inefficient implementations (Section 2). In addition, as observed in [1], after the ANMS step, there are still regions in the image containing a high level of clusteredness. In this section, we propose an efficient algorithm which relies on more suitable data structures and maintains good spatial keypoint distribution. K-dimensional Tree [13] (K-dT) and Range Tree [2] (RT) have been used for this purpose.
First, K-dT is a binary search tree where the data in each node is a K-dimensional point in space. Using this data structure allows space partitioning to organize points in a K-dimensional space. This partitioning can be used to efficiently retrieve the set of points $P_w$ which falls into a defined range around a particular point. On the other hand, RT is an alternative to K-dT. RT is a binary search tree where the data in each node contains an associated structure that is a $(d-1)$-dimensional RT. Compared to K-dTs, RTs offer faster query times in exchange for worse storage complexity (see Table 1). While these two data structures are essentially different, from the high-level perspective, the algorithm is generic and appropriate for any data structure that is capable of retrieving the set of points within the defined range. Therefore, we describe both proposed algorithms (i.e., K-dT ANMS and RT ANMS) within this subsection.
The TDS is built on keypoints $P_{in}$ sorted in decreasing order of strength (i.e., cornerness score). This TDS is used in our algorithm as a way to efficiently obtain the nearest neighbors of a particular keypoint given a search range. This search range is determined by the binary search that tries to guess the most appropriate search range $w$ to satisfy the queried number of keypoints. For every $w$ guess, the nearest neighbors of each keypoint (processed in a decreasing order of strength) are suppressed in a way that they will not be considered in further iterations under the selected $w$. For this purpose, the index list $Idx_s$ is used to keep track of the uncovered keypoints. The binary search terminates when the number of retrieved keypoints is close to the number of queried keypoints $m$ according to a tolerance threshold $m \pm t$. The outline is provided in Algorithm 1.
The proposed algorithm has similarities to the algorithm presented in [8] where the authors have chosen to use an approximate nearest neighbor algorithm which relies on a randomized search tree. However, that algorithm [8] is not optimally efficient since it performs both query and delete operations for each candidate keypoint in $P_{in}$ per radius guess. Furthermore, it requires dynamically adding/removing keypoints to the tree which drastically slows performance. In contrast, our algorithms achieve comparable results with a single query operation per search range guess, which makes them more efficient and scalable.
Algorithm 1: ANMS based on TDS
Input: keypoints $P_{in}$ extracted by the detector
Output: spatially distributed keypoints $P_{out}$
sort $P_{in}$ by strength
build TDS on sorted $P_{in}$
initialize binary search boundaries (Sec. 3.4)
while binary search for search range $w$ do
    $P_{out} = \emptyset$
    initialize $Idx_s$ with all as selected
    for $p_i \in P_{in}$ do
        if $p_i \in Idx_s$ then
            $P_{out} = P_{out} \cup p_i$
            $P_w$ = TDS.query($p_i$, $w$)
            $Idx_s = Idx_s \setminus P_w$
    if $|\mathrm{card}(P_{out}) - m| \le t$ then return $P_{out}$
3.3. Suppression via square covering
We have compared both K-dT ANMS and RT ANMS and observed similar performance in terms of keypoint repeatability and clusteredness (see Section 4.4). It is worth mentioning
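The outline of Algorithm 1 (Section 3.2) can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' code: the TDS is replaced by a brute-force square query (`query_square`, a hypothetical stand-in that any k-d tree or range tree could substitute), the binary search boundaries `w_low` and `w_high` are supplied by the caller instead of being derived as in Section 3.4, and the boundary update rule (enlarge the range when too many points survive, shrink it otherwise) is an assumption, since Algorithm 1 leaves it implicit.

```python
from typing import List, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, strength)


def query_square(pts: List[Keypoint], i: int, w: float) -> List[int]:
    """Stand-in for TDS.query: indices within the square of half-side w around pts[i]."""
    cx, cy = pts[i][0], pts[i][1]
    return [j for j, (x, y, _) in enumerate(pts)
            if abs(x - cx) <= w and abs(y - cy) <= w]


def select_for_range(pts: List[Keypoint], w: float) -> List[int]:
    """One pass of the inner loop: keep uncovered points, suppress their neighbors."""
    uncovered = [True] * len(pts)  # plays the role of the index list Idx_s
    selected = []
    for i in range(len(pts)):  # pts is sorted by decreasing strength
        if uncovered[i]:
            selected.append(i)
            for j in query_square(pts, i, w):
                uncovered[j] = False
    return selected


def anms_tds(keypoints: List[Keypoint], m: int, t: int,
             w_low: float, w_high: float,
             max_iter: int = 50) -> Tuple[List[Keypoint], float]:
    """Binary search on w until |card(P_out) - m| <= t (or max_iter is reached)."""
    pts = sorted(keypoints, key=lambda k: k[2], reverse=True)
    selected: List[int] = []
    w = w_high
    for _ in range(max_iter):
        w = (w_low + w_high) / 2.0
        selected = select_for_range(pts, w)
        if abs(len(selected) - m) <= t:
            break
        if len(selected) > m:
            w_low = w   # too many points kept: the range must grow
        else:
            w_high = w  # too few points kept: the range must shrink
    return [pts[i] for i in selected], w
```

For three collinear keypoints at x = 0, 1, and 10 with decreasing strength, `anms_tds(kps, m=2, t=0, w_low=0.0, w_high=20.0)` first tries w = 10 (only the strongest point survives) and then w = 5, which keeps the points at x = 0 and x = 10. Note that only one query per selected keypoint is issued for each guess of w, and nothing is deleted from the structure.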
4. Results
rely on the preprocessing (i.e., sorting by strength) of the input keypoints. For this purpose, we utilize a sorting algorithm with an average performance of $O(n \log n)$. Additionally, K-dT and RT ANMSs rely on TDS which has to be populated with the input keypoints. This is performed by inserting (see Table 1 for complexity) $n$ keypoints one by one into a data structure, resulting in overall $O(n \log n)$ and $O(n \log^d n)$ complexity, respectively. The query time for each algorithm to select appropriate keypoints is stated in the ‘Query’ column in Table 2. Specifically, the TopM algorithm simply retrieves $m$ keypoints from an already sorted list in $O(m)$. The traditional ANMS [4] (designated ‘Brown’ in our experiments) algorithm requires the computation of the minimum distance between every keypoint, which requires $O(n^2)$, followed by sorting in $O(n \log n)$ and keypoint retrieval in $O(m)$. Since the rest of the algorithms (SDC, K-dT ANMS, RT ANMS, and SSC) rely on a binary search algorithm to find the appropriate search range in $O(\log w_{ini})$, the total query time complexity can be obtained by multiplying the number of search range guesses with the complexity of keypoint selection per guess. The total time complexity listed in ‘Total (approximated)’ gives us the following insight into the algorithms’ performance. The TopM approach clearly outperforms all other methods in terms of speed due to its simplicity. Furthermore, SDC, K-dT ANMS, RT ANMS, and SSC are asymptotically faster than traditional ‘Brown’ [4].
The storage complexity evaluation is shown in Table 2. Methods that do not rely on any data structure (e.g., TopM, ‘Brown’ [4], SDC, SSC) at most occupy the memory necessary to hold the input and output keypoints, resulting in $O(n+m)$ complexity. These methods demonstrate better storage complexity compared to K-dT and RT ANMSs, which additionally require memory for storing TDS (see Table 1).
Overall, due to the sophisticated time complexity estimation, it is challenging to highlight a clear winner among the fastest ANMS algorithms (SDC, K-dT ANMS, RT ANMS, and SSC). In order to provide a qualitative evaluation of the algorithms, we have performed an extensive evaluation of all methods.
4.2. Synthetic and real experiments
First of all, to fairly assess the speed performance of the different algorithms, a large series of synthetic experiments has been performed. For this purpose, a set of randomly distributed 2D points is generated on a synthetic image of resolution 1280 × 720p. Further, a random cornerness score is individually assigned to every point to simulate the behavior of keypoint detection in a natural image. The number of 2D points is in range [800, 11000] with a step of 100. Every test is repeated 1000 times to ensure an unbiased estimation of the algorithms’ speed. The queried number of points is fixed to 800, while the search range $w$ is initialized to image width. We intentionally did not use our initialization technique (Section 3.4) for this test to provide a fair comparison with SDC [8]. It should be noted that Brown [4] has been removed from these experiments for the sake of clarity (i.e., scale inconsistency) since this method is significantly slower than the proposed approaches.
[Figure 4 plots: (a) mean processing time and (b) standard deviation versus number of points (1000 to 11000).]
Fig. 4: Comparison of methods on synthetic data: (a) mean processing time, (b) standard deviation.
The mean computational time and the standard deviation against the number of points per iteration are available in Fig. 4(a) and Fig. 4(b) respectively. Through this experiment, it is noticeable that the TopM algorithm drastically outperforms more sophisticated approaches (but provides a very unsatisfying point distribution in practice, see Sec. 4.4). Other algorithms show interesting characteristics. SSC is more efficient than any existing algorithm both in stability (i.e., the standard deviation remains very low) and speed. On the other hand, SDC demonstrates satisfying results but suffers from a lack of stability for a low number of input keypoints. Indeed, this approach is more efficient than RT ANMS when the number of input points is large; however, this tendency is reversed when the detected keypoints do not exceed 5000. Moreover, SDC is less scalable than our SSC approach. Finally, despite its efficiency for a small number of points, K-dT ANMS loses its advantage for more than 2000 points.
While the assessment with synthetic data is a good evaluation showing clear tendencies, the distribution of keypoints in real images can be different than the ones obtained synthetically. Therefore, we propose an extensive series of evaluations using real images. For this purpose, we select 1000 images from KITTI [9] (Sequence 00), and detect keypoints using FAST [15] with a threshold th = 5. Such a relatively low threshold results in a large number of detected keypoints (i.e., >10000 keypoints per image). Subsequently, the keypoints are sorted by their strength. Further, we iteratively select a fixed number of the strongest keypoints starting from 100 until reaching 10000. The step of such selection is 100, which results in 100 tests per image. The ANMS algorithms return a fixed percentage of the input number of keypoints. For instance, for 1000 input keypoints and a queried percentage of 10%, the number of queried keypoints is 100. We have applied 10 different ratios in range [10%, 100%] with a step of 10%. Several representative results from these evaluations are provided in Figure 5.
This extensive evaluation demonstrates that SSC clearly outperforms all other methods in terms of speed. Overall, different conclusions from those obtained with the synthetic experiments can be drawn. Indeed, SDC remains efficient for a relatively small number of queried keypoints (Fig. 5(a)) but tends to be less effective when a large number of input and output keypoints are processed. This can be explained by the substantial number
Table 2: Time and storage complexity.
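As an illustration of why the traditional ANMS [4] is quadratic while TopM is not, the two can be sketched as follows. This is a simplified sketch under assumptions: the function names `anms_brown` and `top_m` are hypothetical, keypoints are `(x, y, strength)` tuples, and a neighbor counts as stronger whenever its score is strictly larger (any robustness factor used in the original method is omitted here).

```python
import math
from typing import List, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, strength)


def anms_brown(keypoints: List[Keypoint], m: int) -> List[Keypoint]:
    """Keep the m keypoints with the largest suppression radius."""
    radii = []
    for xi, yi, si in keypoints:
        # Suppression radius: distance to the closest strictly stronger keypoint.
        r = math.inf
        for xj, yj, sj in keypoints:  # the nested loop is the O(n^2) part
            if sj > si:
                r = min(r, math.hypot(xi - xj, yi - yj))
        radii.append(r)
    # Sorting by radius is O(n log n); taking the first m entries is O(m).
    order = sorted(range(len(keypoints)), key=lambda i: radii[i], reverse=True)
    return [keypoints[i] for i in order[:m]]


def top_m(keypoints: List[Keypoint], m: int) -> List[Keypoint]:
    """TopM: the m strongest keypoints, ignoring their spatial layout."""
    return sorted(keypoints, key=lambda k: k[2], reverse=True)[:m]
```

With three strong keypoints clustered at x = 0, 1, 2 and one weak keypoint at x = 100, `top_m(..., 2)` returns two points from the cluster, whereas `anms_brown(..., 2)` returns the strongest point and the isolated one.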
[Figure 5, panels (a)-(c); x axis: number of input points, 0 to 10000.]
Fig. 5: Mean processing time vs. number of input keypoints for 1000 images. Subfigures (a)-(c) show linear scale of the y axis.
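The real-image protocol of Section 4.2 can be enumerated with a short sketch. The variable and function names below are hypothetical; only the ranges and step sizes come from the text.

```python
# Input sizes: the strongest 100, 200, ..., 10000 keypoints of each image.
input_sizes = list(range(100, 10001, 100))

# Queried ratios: 10%, 20%, ..., 100% of the input size.
percentages = list(range(10, 101, 10))


def queried_keypoints(n_input: int, percent: int) -> int:
    """Number of keypoints requested from ANMS for a given input size and ratio."""
    return n_input * percent // 100


# Every (input size, ratio, queried count) combination evaluated per image.
configs = [(n, p, queried_keypoints(n, p))
           for n in input_sizes for p in percentages]
```

This reproduces the figures quoted in the text: 100 input sizes per image, 10 ratios, and 100 queried keypoints for 1000 input keypoints at 10%.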
Fig. 9: Experimental results of different methods on SLAM: (a) mean translational error, (b) rotational error.
[Figure 10 plots: trajectories for SSC and Bucketing; axes X (m), Y (m), Z (m).]
Fig. 10: Trajectories computed on the New College dataset using our SSC approach and the bucketing strategy: (a) top view, (b) side view.
advisable for use in case an application requires real-time performance even when the number of input points is relatively high. This would include, for example, real-time SLAM or visual odometry. On the other hand, since K-dT and RT ANMSs are based on TDS to store the input points, they can be used for situations when keypoints need to be reused. A good example is a large-scale SfM where many re-projections onto the images have to be performed to aggregate new images. Thus, these approaches can be accelerated by using the same structure for keypoint detection and for point matching. Compared to K-dT ANMS, RT ANMS offers faster query time but requires more storage memory. Therefore, a user should consider this tradeoff when choosing among the proposed ANMS based on TDS.
5. Conclusion
In this paper, we have presented three novel ANMS techniques (codes are provided) to homogeneously distribute detected keypoints in the image. Through an extensive series of experiments, we have highlighted the effectiveness and scalability of our approaches. Furthermore, we have demonstrated the positive impact of our ANMS strategies on visual SLAM. The presented results show that ANMS is a beneficial step for improving SLAM performance. Another major contribution of this paper is the binary search boundaries initialization which drastically reduces the number of iterations needed to retain the queried number of points. The proposed initialization is designed to be suitable for any ANMS relying on binary search.
The current ANMS approaches are designed to handle conventional images, but may perform poorly on non-uniform spatial resolution induced by distortion (e.g., fisheye lens, catadioptric system, etc.). Naturally, the extension of this work will focus on this problem by proposing an ANMS applicable to the unified spherical model.
Acknowledgment
This research was supported by the Shared Sensing for Cooperative Cars Project funded by Bosch (China) Investment Ltd. The second author was supported by Korea Research Fellowship (KRF) Program through the NRF funded by the Ministry of Science, ICT and Future Planning (2015H1D3A1066564).
References
[1] A. Behrens and H. Röllinger. Analysis of feature point distributions for fast image mosaicking algorithms. Acta Polytechnica, 50(4), 2010.
[2] R. Berinde. Efficient implementations of range trees. 2007.
[3] Y. Bok, H. Ha, and I. Kweon. Automated checkerboard detection and indexing using circular boundaries. Pattern Recognition Letters, 71:66–72, 2016.
[4] M. Brown, R. Szeliski, and S. Winder. Multi-image matching using multi-scale oriented patches. In CVPR, 2005.
[5] S. Buoncompagni, D. Maio, D. Maltoni, and S. Papi. Saliency-based keypoint selection for fast object detection and matching. Pattern Recognition Letters, 62:32–40, 2015.
[6] T. Chan. A minimalist's implementation of an approximate nearest neighbor algorithm in fixed dimensions. See https://goo.gl/cvDjAs, 2006.
[7] Z. Cheng, D. Devarajan, and R. Radke. Determining vision graphs for distributed camera networks using feature digests. EURASIP Journal on Applied Signal Processing, 2007(1):220–220, 2007.
[8] S. Gauglitz, L. Foschini, M. Turk, and T. Höllerer. Efficiently selecting spatially distributed keypoints for visual tracking. In ICIP, 2011.
[9] A. Geiger, P. Lenz, and R. Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In CVPR, 2012.
[10] B. Kitt, A. Geiger, and H. Lategahn. Visual odometry based on stereo image sequences with RANSAC-based outlier rejection scheme. In IV, 2010.
[11] L. Kneip, D. Scaramuzza, and R. Siegwart. A novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation. In CVPR, 2011.
[12] Q. Miao, G. Wang, C. Shi, X. Lin, and Z. Ruan. A new framework for on-line object tracking based on SURF. Pattern Recognition Letters, 32(13):1564–1571, 2011.
[13] M. Muja and D. Lowe. Scalable nearest neighbor algorithms for high dimensional data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11):2227–2240, 2014.
[14] T. Pire, T. Fischer, J. Civera, P. De Cristóforis, and J. Berlles. Stereo parallel tracking and mapping for robot localization. In IROS, 2015.
[15] E. Rosten and T. Drummond. Machine learning for high-speed corner detection. In ECCV, 2006.
[16] K. Schauwecker, R. Klette, and A. Zell. A new feature detector and stereo matching method for accurate high-performance sparse stereo matching. In IROS, 2012.
[17] R. Seidel and C. Aragon. Randomized search trees. Algorithmica, 16(4):464–497, 1996.
[18] M. Smith, I. Baldwin, W. Churchill, R. Paul, and P. Newman. The New College vision and laser data set. The International Journal of Robotics Research, 28(5):595–599, May 2009.