Deep Learning of Path-Based Tree Classifiers For Large-Scale Plant Species
Haixi Zhang∗ , Guiqing He∗ , Jinye Peng∗ , Zhenzhong Kuang† , Jianping Fan‡
∗ School of Electrical and Information
Northwestern Polytechnical University, Xi’an, Shaanxi, China
Email:guiqing [email protected]
† School of Computer Science and Technology
Hangzhou Dianzi University, Zhejiang, China
Email: [email protected]
‡ School of Computer Science
University of North Carolina at Charlotte, Charlotte, USA
Email: [email protected]
Abstract
In this paper, a deep learning framework is developed to enable path-based tree classifier training for supporting large-scale plant species recognition, where a deep neural network and a tree classifier are jointly trained in an end-to-end fashion. First, a two-layer plant taxonomy is constructed to organize large numbers of plant species and their genera hierarchically in a coarse-to-fine fashion. Second, a deep learning framework is developed to enable path-based tree classifier training, where a tree classifier over the plant taxonomy is used to replace the flat softmax layer in traditional deep CNNs. A path-based error function is defined to optimize the joint process for learning the deep CNN and the tree classifier, where back propagation is used to update both the classifier parameters and the network weights simultaneously. We have also constructed a large-scale plant database of the Orchid family for algorithm evaluation. Our experimental results have demonstrated that our path-based deep learning algorithm can achieve very competitive results on both the accuracy rates and the computational efficiency for large-scale plant species recognition.
Keywords-path based; tree classifier; plant species recognition; plant taxonomy
I. INTRODUCTION
Plants are enormously important to human welfare because they are a source of food, clothing, housing materials, medicines, and more besides. In the past, plant species identification was the sole domain of taxonomists, botanists, and other professionals who identified the plants of interest by comparing them with previously collected specimens or by using books or identification manuals [1]. Obviously, learning about unknown plants is also an exciting venture for amateur gardeners and outdoor enthusiasts. When anyone hikes, visits public gardens, wanders around campus, or just tours the gardens in the community, the question may arise: what is the name of that plant? Intrigued by a particular herbaceous or woody plant, the person may wonder when it blooms, its flower color, size and bloom time, cultural requirements, and its commercial availability. Will it attract pollinators, such as insects and birds? Is it an introduced (exotic) species or is it native to the region or state? Is the plant on the state or federal invasive species list? In other situations people will need to identify the plant species to determine if internal or external exposure would cause harm.
Image-based plant recognition has recently become a popular research area [2–8]. For large-scale plant species identification, some of these plant species may have strong inter-species visual similarities, thus it is unreasonable to ignore such inter-species visual similarities completely and learn their inter-related classifiers independently. In most existing deep learning schemes [18–20], softmax is used and the inter-task correlations are completely ignored. As a result, the process for learning the deep CNNs may be pushed away from the global optimum, because the gradients of the objective function are not uniform for all the object classes and such a learning process may be distracted by trying to discern object classes that are hard to discriminate. In order to leverage traditional deep CNNs [18, 20] for large-scale plant species identification, one potential solution for eliminating such distraction is to group the visually-similar plant species together (e.g., visually-similar plant species in the same group will have similar learning difficulties and the gradients of their joint objective function will be more uniform). It is therefore very attractive to investigate whether the inter-task correlations (inter-species similarities) can be leveraged to improve the learning of the deep CNNs for large-scale plant species recognition.
Another critical issue for large-scale plant species recognition is how to reduce the huge computational cost at test time.
*Guiqing He is the corresponding author (email: guiqing [email protected]).
Authorized licensed use limited to: UNIVERSITY OF JORDAN. Downloaded on October 17,2023 at 09:17:28 UTC from IEEE Xplore. Restrictions apply.
extraction network and structure tree classifier over fixed plant taxonomy. Our path-based deep learning algorithm has the following key contributions: (1) leveraging the plant taxonomy to help organize large numbers of plant species and their most relevant genera in a coarse-to-fine fashion; (2) using a tree classifier over the plant taxonomy to replace the N-way flat softmax classifier in the deep network; (3) using a path-based back propagation method and a path-based loss function to address the issue of inter-level error propagation; (4) enabling joint learning of the deep network (for feature extraction) and the tree classifier in an end-to-end fashion.
Compared with traditional approaches, our path-based learning algorithm has the following improvements: (1) A tree classifier over the plant taxonomy is used to replace the N-way flat softmax classifier, and the inter-species visual similarities are leveraged for training the inter-related classifiers for the sibling plant species under the same genus. In this paper, a two-layer plant taxonomy (genus layer and species layer) is constructed to organize large numbers of plant species and their most relevant genera hierarchically in a coarse-to-fine fashion. (2) A path-based back propagation method is used to achieve joint learning of the deep network and the tree classifier in an end-to-end fashion, so that the critical issue of inter-level error propagation can be addressed effectively.
A. Two-layer Tree Structure
Figure 2. Two layer orchid ontology
As shown in Fig. 2, we construct a fixed two-layer tree classifier for Orchid 2608. A fixed tree structure can be obtained directly from the plant taxonomy, which organizes the large number of plant species in a coarse-to-fine fashion. The high-level layer includes 158 nodes representing 158 genera, and the low-level layer includes 2608 species nodes. This fixed tree taxonomy provides a good environment for determining the inter-species relations.
B. Path-based Prediction
Given an input image x, we obtain the final prediction by searching for the maximum probability among all paths from the root node to the leaf nodes. The probability of a single path is determined by the genus probability ($P_G$) and the inner-group species probability ($P_I$). The path probability is calculated as:
$$P(x) = \max_{g \in G} p_g^{G}(x) \ast \max_{s \in G_g} p_s^{G_g}(x) \tag{1}$$
and $P_G$ and $P_I$ are defined as
$$P_G(x \mid l_g) = \max_{g \in G} p_g^{G}(x) = l \tag{2}$$
$$P_I(x \mid l_g, l_s) = \max_{s \in G_g} p_s^{G_g}(x) = l \tag{3}$$
where $p_g^{G}$ and $p_s^{G_g}$ represent the probability outputs at the genus and species levels for an input image, and $l_g$ and $l_s$ represent the genus and species ground-truth labels, respectively.
C. Path-based Training
A bottom-up approach is developed to jointly train the deep feature extraction network and the tree classifier over the fixed plant taxonomy. It is worth noting that we consider the prediction process as a path traversal from a genus node to a species node. Similarly, we treat every single path as an entity during training, which means that parameters on the same path should be updated at the same time during backpropagation. Our algorithm learns the path classifier $f_t(x) = (W_g + W_s)^T x + b$, where $s \in S_g$.
However, when a misclassification happens at a high-level node, it is impossible to make the right prediction at the low-level node; that is, errors propagate from high-level nodes to low-level nodes. Based on these observations, we propose a path-based training method that updates only the parameters on paths relevant to the prediction in each iteration. As shown in Fig. 3, when the right prediction is made at the genus level, only one single path needs to be updated, because the prediction result at the genus layer will not affect the prediction result at the species layer. On the contrary, when a misclassification happens at the genus layer, the prediction at the species layer will be affected, so all species nodes relevant to the misclassified genus node need to be updated. The path-based objective function is formulated as:
$$L = \min_{g \in G,\, s \in S_g} -\log\left((W_g + W_s)^T x + b\right) \tag{4}$$
where $S_g$ represents the nodes relevant to the predicted genus node.
The partial derivatives (gradients) of $L$ are then determined as:
$$\frac{\partial L}{\partial w_g} = \frac{\partial L}{\partial z(x)} \ast \frac{\partial z(x)}{\partial w_g} \ast \Psi(g) \tag{5}$$
$$\frac{\partial L}{\partial w_s} = \frac{\partial L}{\partial z(x)} \ast \frac{\partial z(x)}{\partial w_s} \ast \Psi(s) \tag{6}$$
where $z(x) = \mathrm{softmax}(f(x))$, and $\Psi(g)$ and $\Psi(s)$ are used to determine which nodes need to be updated. Both $\Psi(g)$ and $\Psi(s)$ are constructed based on our plant taxonomy.
The corresponding gradients for the joint objective function are backpropagated through the deep networks at the species level to fine-tune the weights.
MTCDCNN[39]). Our comparison experiments focus on assessing: (1) whether our path-based deep network can provide a better solution for jointly learning the deep network and the tree classifier; (2) whether our path-based deep learning algorithm can achieve higher performance on large-scale plant identification.
            GA      IA      PA
Caffenet    -       -       69.69
MTCDCNN     90.35   79.63   71.95
PathNet     89.86   80.44   72.28
HDCNN       -       -       71.12
Table I. Classification accuracy of our path-based deep learning algorithm on the Orchid 2608 image set
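As a rough illustration of the path-based prediction in Eq. (1) and of the node selection role played by Ψ(g) and Ψ(s), the following Python sketch works on a toy two-level taxonomy. It is not the authors' implementation. The function names, the toy taxonomy and the raw scores are hypothetical. The prediction is taken as the root-to-leaf path with the largest product of genus probability and within-genus species probability, and the update rule (the ground-truth path always, plus every species under a wrongly predicted genus) is only one assumed reading of the training description.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_path(genus_scores, species_scores):
    """Return (genus, species, probability) of the most probable path.

    genus_scores:   dict genus -> raw score
    species_scores: dict genus -> dict species -> raw score (siblings only)
    Assumption: a path probability is the genus probability times the
    species probability normalised among the siblings of that genus.
    """
    genera = list(genus_scores)
    p_genus = dict(zip(genera, softmax([genus_scores[g] for g in genera])))
    best = None
    for g in genera:
        species = list(species_scores[g])
        p_species = softmax([species_scores[g][s] for s in species])
        for s, p in zip(species, p_species):
            path_prob = p_genus[g] * p
            if best is None or path_prob > best[2]:
                best = (g, s, path_prob)
    return best


def nodes_to_update(pred_genus, true_genus, true_species, taxonomy):
    """Pick the nodes that receive gradients (hypothetical stand-in for Psi).

    Assumption: the ground-truth path is always updated; if the genus
    prediction is wrong, the predicted genus and all of its species
    nodes are updated as well.
    """
    genus_nodes = {true_genus}
    species_nodes = {true_species}
    if pred_genus != true_genus:
        genus_nodes.add(pred_genus)
        species_nodes.update(taxonomy[pred_genus])
    return genus_nodes, species_nodes


if __name__ == "__main__":
    # Toy taxonomy with made-up node names and scores.
    taxonomy = {"genus_a": ["a1", "a2"], "genus_b": ["b1", "b2", "b3"]}
    genus_scores = {"genus_a": 2.0, "genus_b": 0.0}
    species_scores = {
        "genus_a": {"a1": 0.0, "a2": 1.0},
        "genus_b": {"b1": 3.0, "b2": 0.0, "b3": 0.0},
    }
    genus, species, prob = predict_path(genus_scores, species_scores)
    print(genus, species, round(prob, 4))
    print(nodes_to_update("genus_b", "genus_a", "a2", taxonomy))
```

With a correct genus prediction the second function returns a single path, which mirrors the case in which only one path is updated. With a wrong genus prediction it also returns every species node under the predicted genus.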
REFERENCES
[18] A. Krizhevsky, I. Sutskever, G. E. Hinton, “ImageNet classification with deep convolutional neural networks”, NIPS, 2012.
[19] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, T. Darrell, “DeCAF: A deep convolutional activation feature for generic visual recognition”, ICML, 2014.
[20] K. Simonyan, A. Zisserman, “Very deep convolutional networks for large-scale image recognition”, ICLR, 2015.
[21] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, “Going deeper with convolutions”, IEEE CVPR, 2015.
[22] Y. Lecun, L. Bottou, Y. Bengio, P. Haffner, “Gradient-based learning applied to document recognition”, Proc. IEEE, 1998.
[23] T. Gao, D. Koller, “Discriminative learning of relaxed hierarchy for large-scale visual recognition”, IEEE ICCV, pp. 2072-2079, 2011.
[24] G. Griffin, P. Perona, “Learning and using taxonomies for fast visual categorization”, IEEE CVPR, 2008.
[25] S. Bengio, J. Weston, D. Grangier, “Label embedding trees for large multi-class tasks”, NIPS, 2010.
[26] J. Fan, N. Zhou, J. Peng, L. Gao, “Hierarchical learning of tree classifiers for large-scale plant species identification”, IEEE Trans. Image Processing, vol. 24, no. 11, pp. 4172-4184, 2015.
[27] J. Fan, Y. Gao, H. Luo, “Integrating concept ontology and multi-task learning to achieve more effective classifier training for multi-level image annotation”, IEEE Trans. on Image Processing, vol. 17, no. 3, pp. 407-426, 2008.
[28] J. Deng, W. Dong, R. Socher, L. Li, K. Li, F. Li, “ImageNet: A large-scale hierarchical image database”, IEEE CVPR, pp. 248-255, 2009.
[29] A. Zweig, D. Weinshall, “Hierarchical regularization cascade for joint learning”, ICML, 2013.
[30] L. Cai, T. Hofmann, “Hierarchical document categorization with support vector machines”, ACM CIKM, 2004.
[31] B. Shahbaba, R. Neal, “Improving classification when a class hierarchy is available using a hierarchy-based prior”, Bayesian Analysis, vol. 2, pp. 221-238, 2007.
[32] D. Koller, M. Sahami, “Hierarchically classifying documents using very few words”, ICML, 1997.
[33] A.K. McCallum, R. Rosenfeld, T. Mitchell, A.Y. Ng, “Improving text classification by shrinkage in a hierarchy of classes”, ICML, 1998.
[34] O. Dekel, J. Keshet, Y. Singer, “Large margin hierarchical classification”, ICML, 2004.
[35] D. Zhou, L. Xiao, M. Wu, “Hierarchical classification via orthogonal transfer”, ICML, 2011.
[36] M. Sun, W. Huang, S. Savarese, “Find the best path: an efficient and accurate classifier for image hierarchies”, IEEE ICCV, 2013.
[37] J. Wang, X. Shen, W. Pan, “On large margin hierarchical classification with multiple paths”, Journal of the American Statistical Association, vol. 104, no. 487, 2009.
[38] Z. Yan, et al., “HD-CNN: hierarchical deep convolutional neural networks for large scale visual recognition”, IEEE ICCV, 2015.
[39] Z. Kuang, Z. Li, T. Zhao, et al., “Deep multi-task learning for large-scale image classification”, IEEE Third International Conference on Multimedia Big Data (BigMM), pp. 310-317, 2017.
[40] P. Kontschieder, et al., “Deep neural decision forests”, IEEE ICCV, 2015.
[41] J. Wang, et al., “CNN-RNN: A unified framework for multi-label image classification”, IEEE CVPR, 2016.