Deep Learning Methods for Animal Counting in Camera Trap Images
Yizhen Wang, Yang Zhang, Yuan Feng, and Yi Shang
Dept. of Electrical Engineering and Computer Science
University of Missouri
Columbia, Missouri, United States
{ywf3m, yangzhang, yfzc8, shangy}@missouri.edu
2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI) | 979-8-3503-9744-4/22/$31.00 ©2022 IEEE | DOI: 10.1109/ICTAI56018.2022.00143
Abstract-Camera traps are widely used to monitor the biodiversity and population density of animal species. Camera trap images are usually taken in bursts, and the animal counting problem for a sequence of camera trap images is also an important part of evaluating animal population density. In this paper, two new animal counting methods based on Microsoft MegaDetector V4 are proposed. FilterDetector uses different filters with bounding box ensemble algorithms to achieve more accurate bounding box detection. DLEDetector is an ensemble method that uses two base deep learning models to correct and enhance the detection results of MegaDetector. Our experimental results on the iWildCam 2022 competition test dataset show that both methods outperformed the best method in iWildCam 2021 and the baseline method based on MegaDetector V4 in the iWildCam 2022 competition by 9.09% and 6.44%, respectively, and ranked first and third in the competition.

Keywords-camera traps, animal counting, bounding box ensemble, machine learning, deep learning

I. INTRODUCTION
Camera traps are widely used by biologists and ethologists to monitor biodiversity and population density of animal species [1] [2]. The cameras can automatically collect large quantities of images. Camera traps are placed in an area of interest with a motion trigger and, when motion is detected, the camera takes a sequence (burst) of photos. After a large number of images are collected, identifying animal species and counts manually is very time-consuming and labor-intensive. Therefore, researchers are working on developing machine learning methods to automate the process of animal detection, species classification, and localization of animals in camera trap images [3] [4].

iWildCam is an annual competition dedicated to this field, launched in 2018 as part of the Fine-Grained Visual Classification (FGVC) workshop at the Conference on Computer Vision and Pattern Recognition (CVPR) [5]. The target questions for iWildCam vary each year. iWildCam 2021 focused on animal counting and classification. It turned out that accurately counting and classifying animals in image bursts simultaneously is difficult. There are many animal image issues that make the problem challenging, such as poor illumination, motion blur, occlusion, and camouflage. Therefore, iWildCam 2022 re-focused on the counting problem: counting individual animals across sequences.

In this paper, two new methods, FilterDetector and DLEDetector, are proposed to improve animal counting based on the detection results of Microsoft MegaDetector V4. The camera trap images were divided into two groups based on the density of animals in the images. FilterDetector applies an efficient filtering method to the detection results of MegaDetector and combines it with Non-Maximum Suppression (NMS) [6] to achieve high-precision bounding box detection. The NMS algorithm used in FilterDetector utilizes both the Intersection over Union (IoU) and the confidence score of the bounding box to better remove false positives. DLEDetector is an ensemble method that uses two deep learning models to correct and enhance the detection results of MegaDetector. The Weighted Box Fusion (WBF) [7] algorithm is used to fuse the detection results from the two detectors. Furthermore, a binary classification model is trained to distinguish between animals and backgrounds to eliminate false positives. Our experimental results on the iWildCam 2022 competition show that both FilterDetector and DLEDetector outperformed the best method in iWildCam 2021 and the baseline. They ranked 1st and 3rd in the iWildCam 2022 competition. Overall, the contributions of this work include:
1. An animal density analysis method that can effectively improve camera trap image animal counting performance.
2. FilterDetector, a new NMS-based method that generates improved bounding box detections for camera trap images based on the detection results of MegaDetector V4.
3. DLEDetector, a new deep learning based ensemble method that uses two base deep learning models to correct and enhance the detection results of MegaDetector V4.

The rest of the paper is organized as follows. Section II provides a review of related work. Section III details the problem formulation. Section IV describes the methods we propose. Section V presents our experimental results. Finally, Section VI draws conclusions.

II. RELATED WORK
Deep learning techniques have been used to process camera trap imagery in recent years. Although the first mentioned deep learning method used for ecological identification was in 2000
III. PROBLEM FORMULATION
A. Animal Counting and Classification for a Sequence of Images (iWildCam 2021)
The goal of iWildCam 2021 is to categorize animals into species and count the number of individuals of each species across image bursts, i.e., sequences of images captured in quick succession. The evaluation metric is the Mean Columnwise Root Mean Squared Error (MCRMSE), as shown below,

$$\mathrm{MCRMSE} = \frac{1}{m} \sum_{j=1}^{m} \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(x_{ij} - y_{ij}\right)^{2}}$$

where $m$ is the number of animal species, $n$ is the number of sequences, each column $j$ represents a species, each row $i$ represents a sequence, $x_{ij}$ is the predicted count for that species in that sequence, and $y_{ij}$ is the ground truth count.

Figure 1. Distribution of image number in each sequence of iWildCam 2022 training dataset (a), test dataset (b).

The iWildCam training dataset does not contain the ground truth for animal detection and animal counting of the training images. It only provides MegaDetector V4's detection results as a reference and the DeepMAC [20] segmentation results for each bounding box detected by MegaDetector.
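As a small worked illustration of the MCRMSE metric from Section III-A, the sketch below computes it for count matrices stored as nested lists, with one row per sequence and one column per species. The function name `mcrmse` and the list-of-lists layout are assumptions made for this example, not part of the competition tooling.

```python
import math


def mcrmse(predicted, actual):
    """Mean Columnwise Root Mean Squared Error.

    predicted, actual: lists of rows; row i is a sequence and
    column j is a species (hypothetical layout for this sketch).
    """
    n = len(actual)       # number of sequences (rows)
    m = len(actual[0])    # number of species (columns)
    total = 0.0
    for j in range(m):
        # Squared error of species j summed over all sequences.
        sq_err = sum((predicted[i][j] - actual[i][j]) ** 2 for i in range(n))
        total += math.sqrt(sq_err / n)   # RMSE of column j
    return total / m                      # mean over the columns


if __name__ == "__main__":
    pred = [[1, 0], [2, 2]]
    truth = [[1, 1], [0, 2]]
    print(mcrmse(pred, truth))
```

Each species contributes its own RMSE, so a large error on a rare species weighs as much as the same error on a common one.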
Authorized licensed use limited to: Politecnico di Milano. Downloaded on May 18,2024 at 09:34:49 UTC from IEEE Xplore. Restrictions apply.
Figure 2. Sample images from the datasets. (a) to (d) are images with fewer color elements. Images on the left side contain a single animal. Images on the right side contain multiple animals.

IV. METHODS
A. Density Analysis
Since there is no ground truth, we analyzed the dataset based on the results of MegaDetector. We divided the images into two groups according to the number of animals detected by MegaDetector. For an image, if the number of detected animals is 8 or more, we consider the image to be a high-density image. Otherwise, it is treated as a low-density image. Figure 3 shows the distribution of animals detected by MegaDetector V4 in each image. In both the training dataset and the test dataset, MegaDetector did not detect animals in more than 40% of the images, and in most of the images where animals were detected, MegaDetector detected only one animal. In the training dataset, 5,127 images with more than 10 animals were detected, accounting for 2.5%, while in the test dataset this number was 368, accounting for 0.6%.

FilterDetector applies different filtering principles to images with low or high density of animals. Algorithm 1 provides the whole process.

Algorithm 1 Proposal-Density-Based Filtering Analysis
Input: P = {Collection of Proposals predicted by MegaDetector for one image in confidence-decreasing order}
       C = {Collection of Confidence of Proposals in D.}
Output: F = {Collection of Proposals after being filtered by Proposal-Density-Based filtering}
...
Else:
    for i = 1, 2, ..., n do
        n_dup(P_i) = 0
        for d in P/{P_i}:
            if IoU(P_i, d) > 0.5:    # Intersection over Union
                n_dup(P_i) += 1
        end for
    end for
    if C_i > θ(n_dup(P_i)):    # Adaptive confidence threshold
        Insert P_i into F

1) High-density images filtering
For high-density images, to count more animals, we set the confidence threshold to 0.0. To remove duplicates, we set a strict IoU threshold of 0.2 for the NMS method. The IoU is the Intersection over Union, which can be calculated with the formula below.

$$\mathrm{IoU}(P_i, P_j) = \frac{\text{area of intersection of } P_i \text{ and } P_j}{\text{area of union of } P_i \text{ and } P_j}$$

Figure 4 shows that this setting of parameters helps us to detect more animals than the default settings (confidence threshold = 0.95), and most of the predictions are correct. This method improves the public score to 0.253.

Figure 4. Result comparison of Proposal-Density-Based filtering (b) and filtering by confidence 0.95 (a) on high-density images.
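The high-density path can be illustrated with a short sketch: an IoU function, a greedy NMS that uses the thresholds stated in the text (IoU 0.2, confidence 0.0), and the density rule of 8 or more detections. The corner box format (x1, y1, x2, y2), the function names, and the greedy NMS variant are assumptions for illustration; the text does not specify which NMS implementation was used.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thr=0.2, conf_thr=0.0):
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_thr),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


def is_high_density(num_detections, min_count=8):
    """Density rule from Section IV-A: 8 or more detections is high density."""
    return num_detections >= min_count


if __name__ == "__main__":
    boxes = [(0, 0, 2, 2), (0, 0, 2, 1.9), (5, 5, 6, 6)]
    scores = [0.9, 0.8, 0.3]
    print(nms(boxes, scores))
```

With a confidence threshold of 0.0 every proposal enters the ranking, so the low IoU threshold is what removes duplicates.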
2) Low-density images filtering
The filtering method on low-density images is more complicated. It is not a good idea to rely on IoU threshold to
In this evaluation metric, $P$ is the collection of all proposals from MegaDetector and $P_i$ is the $i$th proposal of $P$. $\alpha$ is the coefficient of confidence reduction and $n_{dup}$ is the number of duplications of $P_i$. If $P_i$ is selected as a final prediction, all of the other duplicated proposals are removed from $P$. The results show that when $\alpha = 0.2$ we get the best public score, which is 0.247. Figure 5 is an example of our adaptive filtering score method.
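The duplicate-counting idea can be sketched as follows: count how many other proposals overlap each proposal with IoU above 0.5, accept a proposal when its confidence exceeds a threshold that depends on that count, and discard the duplicates of every accepted proposal. The exact threshold function is not reproduced in this text, so `theta` is a caller-supplied function here, and the linear form used in the example (0.95 minus 0.2 per duplicate) is purely hypothetical. The box format and all function names are also assumptions.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2); format is an assumption."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def adaptive_filter(proposals, confidences, theta, dup_iou=0.5):
    """Return indices of kept proposals.

    proposals must be sorted by decreasing confidence.
    theta maps a duplicate count n_dup to a confidence threshold
    (hypothetical interface; the original formula is not shown here).
    """
    n = len(proposals)
    # Duplicates of proposal i: every other proposal with IoU above dup_iou.
    dups = [
        [j for j in range(n) if j != i and box_iou(proposals[i], proposals[j]) > dup_iou]
        for i in range(n)
    ]
    removed, kept = set(), []
    for i in range(n):
        if i in removed:
            continue
        if confidences[i] > theta(len(dups[i])):
            kept.append(i)
            removed.update(dups[i])   # drop duplicates of a selected proposal
    return kept


if __name__ == "__main__":
    # Hypothetical threshold: more duplicates lower the required confidence.
    def example_theta(n_dup):
        return max(0.0, 0.95 - 0.2 * n_dup)

    boxes = [(0, 0, 10, 10), (0, 0, 10, 9), (20, 20, 30, 30)]
    confs = [0.9, 0.6, 0.5]
    print(adaptive_filter(boxes, confs, example_theta))
```

Passing the threshold as a function keeps the sketch independent of the specific formula, so a different rule can be substituted without touching the filtering loop.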
Figure 5. Result comparison of Proposal-Density-Based filtering (b) and filtering by confidence 0.95 (a) on low-density images.
Figure 6. Successful case of detection result comparison between MegaDetector (a) and Mask R-CNN (b).
Figure 7. Failed case of detection result comparison between MegaDetector (a) and Mask R-CNN (b).
C. Deep Learning Based Ensemble Method - DLEDetector
Our second method, DLEDetector, tries to strengthen the detection results through the combination of multiple neural networks. For high-density images, we implemented and trained two support models to assist the detection of MegaDetector.

1) Detection model for high density images: In order to improve the performance of the detection on high-density images, we trained a detection model based on Mask R-CNN using Detectron2 [13] for high-density images.
First, we take the high-density images from the entire training dataset as the training set for the detection model. For an image, let the number of bboxes detected by MegaDetector with confidence score larger than 0.95 be $n$, and the pixel coverage of these bboxes be $R_{covered}$. If the number of detected bboxes $n$ is not less than 5 and $R_{covered}$ is not less than 0.1, we add the image to the training dataset. The size of the final training dataset is 2596.
During training, we use the bboxes detected by MegaDetector with a confidence score not less than 0.5 as the labels. A Mask R-CNN model is trained with Detectron2, using the pretrained weights of Mask R-CNN R50-FPN as a starting point, with learning rate 0.00025 and maximum number of iterations 50000. Figure 6 shows a successful case in which our detector detected an occluded bird, and Figure 7 is a failed case, where some scattered birds were not successfully detected. Later, after applying the WBF algorithm to fuse the detection results from the two detectors, the problem is solved.

2) Binary Classification Model: In order to eliminate false positives generated by the detector, we train a binary classification model to distinguish between animals and backgrounds. By going through the test dataset with the classification model used for animal classification in iWildCam 2021, we estimate that 69 of the 204 categories appeared in the test dataset. We then pick out images that contain these 69 species of animals from the training dataset and crop the animals based on the detected bboxes from MegaDetector with confidence score not less than 0.5. Next, we perform data cleaning on the cropped images to remove images that are too small, almost pure black, or severely imbalanced in aspect ratio. After cleaning, 188962 cropped images are marked as animal and used for binary classification model training together with 200000 background images. We trained a ResNet50 with learning rate 0.0001 and epoch number 50.

3) Bounding Box Ensemble with WBF: In this step, we apply the WBF algorithm to fuse the detection results of MegaDetector and our model. The IoU threshold used by the WBF algorithm is 0.5, the confidence threshold is 0.5, and the fusion weight of the two models is 1:1.

4) Sequential Analysis: In the sequential analysis step, for low-density images, if the time interval between two adjacent images is larger than 20 s and the number of animals detected is different, we assume that the animals in the two images are from different groups, and later we add up the animals in each group as the final counting result. For high-density images, we consider the animals in the images to always be from the same group. Algorithm 2 shows the pseudocode of our sequential analysis algorithm.

Algorithm 2 Sequential Analysis
D: {D_i: Animals detected for ith image in the sequence.}
T: {T_i: Taken time (s) of ith image in the sequence.}
A ← 0
M ← D_1
for i = 2, ..., n do
    if T_i - T_{i-1} > 20 and D_i ≠ D_{i-1}:
        A ← A + M
        M ← D_i
    else:
        if D_i > M:
            M ← D_i
end for
if A = 0:
    return M
else:
    return A
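The sequential analysis step can be sketched in a few lines of Python. The sketch follows the prose description: a new group starts when the gap to the previous image exceeds 20 s and the detected count changes, each group contributes its maximum count, and the group maxima are summed. Adding the last group's maximum to the total is one interpretation of "add up the animals in each group"; the final step of the pseudocode listing as printed returns A alone when A is nonzero. The function name, argument layout, and the `high_density` flag are assumptions for illustration.

```python
def count_sequence(counts, times, gap_seconds=20, high_density=False):
    """Estimate the animal count for one image sequence.

    counts[i]: animals detected in image i.
    times[i]:  capture time of image i in seconds.
    """
    if not counts:
        return 0
    if high_density:
        # High-density images are treated as a single group.
        return max(counts)
    total = 0
    group_max = counts[0]
    for i in range(1, len(counts)):
        new_group = (
            times[i] - times[i - 1] > gap_seconds
            and counts[i] != counts[i - 1]
        )
        if new_group:
            total += group_max        # close the previous group
            group_max = counts[i]
        else:
            group_max = max(group_max, counts[i])
    return total + group_max          # include the last group


if __name__ == "__main__":
    print(count_sequence([2, 2, 3], [0, 1, 30]))
```

Taking the maximum within a group avoids double counting the same animals across a burst, while the time gap and count change together signal that a different group has likely entered the frame.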
Figure 8. Overall pipeline of DLEDetector.

5) Overall animal counting process: Figure 8 shows the overall flow of our animal counting process. After getting a sequence of images, we first check the density level of the images in the sequence. For high-density images, we first get the detection results of the images from MegaDetector and our detection model, then crop the detected animals and feed them into our binary classification model to eliminate false positives. Next, we fuse the filtered bboxes with the WBF algorithm to get the final detection result. For low-density images, we directly extract the detection results of MegaDetector with confidence score not less than 0.95, and use the WBF algorithm to fuse the obtained bboxes as the final detection result. Finally, we apply our sequence analysis algorithm and obtain the final animal count for the sequence.

V. EXPERIMENT RESULTS
Due to the lack of true labels for both the training dataset and the test dataset, the experimental results are generated through submission to the iWildCam competition. In iWildCam 2022, the test dataset is divided into two parts, called 'public score' and 'private score'. Both scores are calculated with approximately 50% of the test data. The public score is shown on the leaderboard during the competition; it can be treated as a validation score used to fine-tune the method. The private score is shown after the submission deadline and is used as a real test score to evaluate the performance of different methods.

TABLE 1. MEAN ABSOLUTE ERROR FOR IWILDCAM 2022 TEST DATASET

Methods                                                             Public Score   Private Score
FilterDetector                                                      0.247          0.240
DLEDetector                                                         0.255          0.247
Benchmark: max num of MegaDetector v4 bboxes (confidence > 0.95)    0.276          0.264
Benchmark: iWildCam 2021 winner                                     0.283          0.264
Benchmark: max num of MegaDetector v3 bboxes (confidence > 0.98)    0.299          0.289
Benchmark: iWildCam 2020 winner                                     0.467          0.443

Table 1 shows the public score and private score of our methods together with the performance of different benchmark methods. As the goal this year is hard, only ten groups achieved better performance compared to the best benchmark method. Finally, our two methods ranked No. 1 and No. 3 in the competition; the leaderboard can be checked at https://www.kaggle.com/competitions/iwildcam2022-fgvc9/leaderboard.

VI. CONCLUSION AND FUTURE WORK
In this paper, we found that a key piece of information for improving animal counting based on MegaDetector is the density of animals in the image, and proposed two new methods for animal counting in camera trap images based on MegaDetector V4. From the performance of our filter-based method, it can be seen that the animal counting accuracy of MegaDetector can be significantly improved when using different filters for images with different animal densities, combined with the NMS algorithm to remove redundancy. Furthermore, our deep learning-based approach shows that training a separate detection model for high-density images, combining it with a binary classification model to remove false positives from the detection model and MegaDetector, and then using the WBF algorithm to fuse the detection results from the two detectors can also improve animal counting performance.

ACKNOWLEDGMENT
This work is partially supported by grants from the Missouri Department of Conservation.

REFERENCES
[1] A. F. O'Connell, J. D. Nichols, and K. U. Karanth, Camera traps in animal ecology: methods and analyses. Springer, 2011.
[2] F. Trolliet, C. Vermeulen, M.-C. Huynen, and A. Hambuckers, "Use of camera traps for wildlife studies: a review," Biotechnologie, Agronomie, Société et Environnement, vol. 18, no. 3, 2014.
[3] M. A. Tabak et al., "Machine learning to classify animal species in camera trap images: Applications in ecology," Methods in Ecology and Evolution, vol. 10, no. 4, pp. 585-590, 2019.
[4] M. S. Norouzzadeh et al., "Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning," Proceedings of the National Academy of Sciences, vol. 115, no. 25, pp. E5716-E5725, 2018.
[5] S. Beery, G. Van Horn, O. Mac Aodha, and P. Perona, "The iWildCam 2018 challenge dataset," arXiv preprint arXiv:1904.05986, 2019.
[6] S. Beery, A. Agarwal, E. Cole, and V. Birodkar, "The iWildCam 2021 competition dataset," arXiv preprint arXiv:2105.03494, 2021.
[7] R. Solovyev, W. Wang, and T. Gabruseva, "Weighted boxes fusion: Ensembling boxes from different object detection models," Image and Vision Computing, vol. 107, p. 104117, 2021.
[8] S. Christin, E. Hervet, and N. Lecomte, "Applications for deep learning in ecology," Methods in Ecology and Evolution, vol. 10, no. 10, pp. 1632-1644, 2019.
[9] S. Beery, D. Morris, and S. Yang, "Efficient pipeline for camera trap image review," arXiv preprint arXiv:1907.06772, 2019.
[10] S. Ren, K. He, R. Girshick, and J. Sun, "Faster r-cnn: Towards real-time object detection with region proposal networks," Advances in neural information processing systems, vol. 28, 2015.
[11] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770-778.
[12] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask r-cnn," in Proceedings of the IEEE international conference on computer vision, 2017, pp. 2961-2969.
[13] Y. Wu, A. Kirillov, F. Massa, W.-Y. Lo, and R. Girshick, "Detectron2," 2019.