Constrained Design of Deep Iris Networks

This article proposes designing deep iris networks using constrained optimization to consider model size and computation in addition to recognition accuracy. Current deep iris networks cannot achieve the compactness and low computational requirements of the classic IrisCode. The authors aim to automatically learn optimal iris network architectures that achieve state-of-the-art performance while requiring less computation and memory than existing approaches through constrained network design. This allows investigation of the optimality of previous methods and enables learning compact networks for practical iris recognition applications.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIP.2020.2999211, IEEE Transactions on Image Processing
IEEE TRANSACTIONS ON IMAGE PROCESSING

Constrained Design of Deep Iris Networks


Kien Nguyen, Member, IEEE, Clinton Fookes, Senior Member, IEEE, Sridha Sridharan, Life Senior
Member, IEEE

Abstract—Despite the promise of recent deep neural networks to provide more accurate and efficient iris recognition compared to
traditional techniques, there are vital properties of the classic IrisCode which are almost unable to be achieved with current deep iris
networks: the compactness of model and the small number of computing operations (FLOPs). This paper casts the iris network design
process as a constrained optimization problem which takes model size and computation into account as learning criteria. On one hand,
this allows us to fully automate the network design process to search for the optimal iris network architecture with the highest
recognition accuracy confined to the computation and model compactness constraints. On the other hand, it allows us to investigate
the optimality of the classic IrisCode and recent deep iris networks. It also enables us to learn an optimal iris network and demonstrate
state-of-the-art performance with less computation and memory requirements.

Index Terms—Iris Recognition, Deep Learning, Iris Network Design, Constrained Deep Network Design.

• K. Nguyen, C. Fookes and S. Sridharan are with Image and Video Research Laboratory, SAIVT, School of Electrical Engineering and Robotics, Queensland University of Technology, Brisbane, QLD, 4000, Australia. E-mail: {k.nguyenthanh,c.fookes,s.sridharan}@qut.edu.au
Manuscript received May 23rd, 2019

1 INTRODUCTION

Deep neural networks are extremely effective at automatic feature learning for object recognition by leveraging the power of high capacity models, vast amounts of data and high-end computing infrastructure [1], [32]. In recent years, deep networks have been shown to be able to automatically learn discriminative features with promising recognition accuracy in the iris recognition setting [11], [20], [23], [38]. However, when compared with handcrafted approaches such as the classic IrisCode [7], [18], it is pertinent to ask a question: do we really need networks that deep, with tens or even hundreds of layers and millions of parameters and floating-point operations, in the iris recognition setting? Recall that the IrisCode has been very successful with only a few parameters, e.g. orientations and scales of Gabor filters, and it has been very cheap to run (30 ms on a CPU for iris image analysis and creation of an IrisCode) [21].

Despite their superior accuracy, current deep iris networks are unable to achieve two vital properties of the classic IrisCode: the compactness of the model and the small number of computing operations (FLOPs). These properties are among the major driving forces for iris recognition to be a preferred choice compared to other biometric modalities. Current deep iris approaches cannot achieve these properties due to the inherent large size of the networks with millions of parameters and hundreds of layers that are potentially required to achieve the desired accuracy. The benefits of automatic feature engineering come at the cost of significant computational power and memory requirements. As automatic feature learning is highly desired in the iris recognition context (to both remove the pitfalls in feature design and automatically discover the best feature representation directly from the data), more investigations are required to understand the trade-off between the superior accuracy by automatic feature engineering and the excessive computation and network model size we have to adopt for this benefit.

In addition, there are a vast number of possible architectures (i.e. number of layers, filters, connections, etc.) of a neural network, which can number in the millions in the case of deep networks. In the iris recognition setting, the existing architectures can range from several layers (FeatNet [38]) to tens of layers (DeepIrisNet [11], off-the-shelf CNNs [23]) with millions of possible connections between them. Redundant layers, connections and filters would lead to non-compact feature representations and unnecessary computation. This raises two important questions: (i) how can we determine the optimal architecture? and (ii) how do we fully automate the network design process, i.e. automatic feature learning and automatic architecture learning? These are important issues that must be addressed to reach the full potential of deep networks in the iris recognition setting.

One possible solution to the above problems is to leverage recent advances in neural architecture search (NAS) to search for the network architecture that can achieve the best accuracy. Modern neural architecture search relies on reinforcement learning and evolution theory to explore the architecture space and gradually evolve the architecture to better ones. For example, Zoph et al. designed a reinforcement learning agent to explore the architecture search space to find the optimal configurations [39], [40]. Real et al. relied on evolution theory to gradually evolve the network architecture toward higher performing ones [30].

Existing NAS approaches only focus on improving the accuracy. However, there are other vital factors which are critical to the success in real-world applications. In the iris recognition setting, these include the two properties mentioned above: the model compactness and the computation. Both factors have been the main driving forces behind the success of modern iris recognition [6], [8], [27]. Taking these factors into account is important especially for practical applications. For example, if the iris recognition system is

1057-7149 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: Carleton University. Downloaded on June 28,2020 at 03:21:42 UTC from IEEE Xplore. Restrictions apply.

stored and processed on a high-computing platform, then there need be no restrictions on model compactness or matching speed, and the design process can target the highest accuracy. In contrast, if the iris recognition system is performed on embedded or mobile systems with limited resources, model compactness and matching speed have to be prioritized.

There are two main objectives of this paper: (1) Develop an algorithm to search for the optimal deep iris network that satisfies both the pre-defined constraints in computation and model compactness and achieves the highest possible performance; and (2) Use this algorithm to investigate the optimality of the classic IrisCode and recent iris networks. To achieve these objectives, we re-visit existing NAS approaches, and re-interpret them with additional constraints to model the design as a constrained optimization problem. Solving this constrained optimization problem allows us to automatically discover the optimal network (the network which achieves the highest achievable accuracy within the constraints of the computation and model compactness). In addition, applying this algorithm with two constraints similar to those of the existing approaches, we are able to: (i) understand the optimality of the classic handcrafted IrisCode and recent deep iris networks; and (ii) optimize an existing network architecture to achieve the highest accuracy under the same computational and memory cost.

Our major contributions can be highlighted as follows:
• Our work is the first effort to consider automatic architecture search for deep iris recognition networks.
• We re-interpret the search for network architecture and parameters as a constrained optimization with design constraints related to the compactness of the model along with the matching speed.
• Our approach results in the full automation of deep iris network design, in both feature learning and architecture learning.
• We discover an optimal deep network for iris recognition achieving state-of-the-art performance with less computation and memory required compared to existing deep iris networks.
• We provide a way to understand the optimality of the classic IrisCode approach.

The remainder of the paper is organized as follows: Section 2 discusses recent deep iris recognition networks along with feature learning and architecture learning approaches; Section 3 presents how we model the network design as a bi-level constrained optimization problem; Section 4 illustrates our experimental results; and the paper is concluded in Section 5.

2 RELATED WORK

This section reviews the existing attempts in the literature on applying deep learning approaches for iris recognition and architecture learning via neural architecture search.

2.1 Deep iris recognition networks

A number of deep networks have been proposed to take advantage of their automatic feature engineering capability to automatically learn feature representation for iris images, aiming to improve recognition performance of the iris biometric system. There are two lines of deep iris networks in the literature: softmax-based and pairwise-based.

The softmax-based networks train images on a fixed number of classes and seek to match a test image with those classes. The typical examples of these are DeepIrisNet [11], [12] and Off-the-shelf Iris CNNs [23]. DeepIrisNets stack classic layers to create three network versions. DeepIrisNet-A has 8 convolutional layers, 4 pooling layers, 3 fully connected layers and 2 dropout layers. DeepIrisNet-B swaps a number of late convolutional layers with Inception layers [35]. DeepIrisNet-2 adds spatial transformer layers [17] to predict the coefficients of affine transformations within the iris images. Training these big networks requires large datasets, hence Off-the-shelf Iris CNNs apply transfer learning. State-of-the-art Off-the-shelf CNNs (AlexNet, VGG, Inception, ResNet, DenseNet) pre-trained on the large-scale ImageNet dataset [31] are further fine-tuned to achieve state-of-the-art iris recognition accuracy [23]. The main requirement of the softmax-based networks is that the test image has to belong to one of the classes in the training set, which means the networks will have to be re-trained whenever a new class is added.

The pairwise-based networks deal with this drawback by learning networks to measure the similarity or dissimilarity between two images, without knowing their classes. The typical examples of these are DeepIris [20] and FeatNet [38]. DeepIris applies a pairwise filter on the two input iris images, followed by a stack of classic convolutional, pooling and fully connected layers. DeepIris outputs a similarity score between two input images [20]. FeatNet uses a fully convolutional architecture by replacing all fully connected layers with 1 × 1 convolutional layers to retain spatial correspondence between two input images. The classic pairwise loss [33] is extended to incorporate the translation and mask, leading to an accurate similarity estimation between the input pair [38].

All these deep iris networks have their architectures handcrafted, which means potential redundant connections, filters and layers, and no optimal accuracy guarantee. Learning both a feature representation and an optimal architecture is challenging considering the huge number of potential architectures and their parameters. This challenge will be addressed in this paper.

2.2 Neural Architecture Search

No existing work addresses network architecture search in the iris recognition setting. In general deep learning, modern architecture search approaches usually rely on Evolution Theory and Reinforcement Learning (RL) to design searching policies. For example, Real et al. brought across ideas from the natural evolution process to gradually update the architecture of the network to achieve higher accuracies [29], [30]. At each iteration, a number of architectures, called the population, are investigated. The best architecture is mutated (i.e. layers are randomly added or removed) to generate a new child architecture which is then added into the population. Subsequently, the worst architecture [30] or the oldest architecture [29] is discarded to generate a new population. This algorithm is considered as a discrete


Fig. 1. Re-interpreting layer connections. Traditionally, two layers, $i$ and $j$, are connected by one operation selected from the operation set, $O = \{o_1, ..., o_N\}$. This means activating only one operation, $o_k$, in the operation set with a coefficient of 1, while disabling all others with coefficients of 0. The coefficients only take a discrete value in $\{0, 1\}$. We relax this to allow the coefficients $\alpha_k^{(i,j)}$ to take a continuous value in $[0, 1]$. This re-interpretation enables us to make the architecture search space continuous.

optimization process. In contrast, Zoph et al. trained an RNN controller to iteratively sample candidate architectures and train them to convergence to measure their performance on the desired task. The controller then uses the performance as a guiding signal to find more promising architectures [39], [40]. Parameter sharing can be forced on all child models to improve the search speed at a slight cost of performance [24]. Both evolution and RL approaches are extremely computationally expensive despite their remarkable performance. For example, the network architecture search for CIFAR-10 needs 1800 GPU days of RL [40] or 3150 GPU days of evolution [29].

One of our main motivations arises from a recent body of works on gradient-based architecture search [14], [19] where the problem is encoded as a bi-level optimization problem with one level to optimize based on the architecture and the other level to optimize based on the weights of the chosen architecture. This strategy is able to discover high-performing architectures achieving high classification accuracy in only tens of GPU hours. This strategy is highly desired in the iris recognition scenario to search for the highest-accuracy architecture. However, it is not directly applicable to the iris recognition scenario because, in contrast to the natural image recognition scenario, the approaches in the iris recognition scenario need to take into account design constraints as discussed earlier to be applicable in real-world applications. In this paper, we introduce a new design procedure to address these challenges.

It is also worth noting that two related tasks, hyperparameter optimization and network simplification/pruning, are simpler tasks considering the dimension of the parameters compared with the whole architecture. In addition, a recent body of work on resource-aware network design such as MorphNet [13] and NetAdapt [36] has considered resource constraints in the design process. However, these techniques work on simplifying a pre-trained model to match the resource constraints, which is much simpler than an architecture search from scratch as addressed in this paper.

3 DESIGNING CONSTRAINTS

A network architecture, $\alpha$, is defined as a directed acyclic graph consisting of an ordered sequence of $L$ nodes [19]. Each node, $x_j$, in the graph represents a layer in the network architecture. The first node is the input node, which is the input image. The final node is the output. Each intermediate node is computed as a sum of network operations applied on its predecessors,

$$x_j = \sum_{i<j} o^{(i,j)}(x_i), \tag{1}$$

where $o^{(i,j)}$ is the candidate network operation between node $i$ and node $j$. The candidate operation $o^{(i,j)}$ is defined by two properties: its operation type, which is one of the types in the operation set $O$; and the parameters of the chosen operation type. The operation set can include a convolutional operator, conv; a pooling operator, pool; a skip connection operator, Identity; and a no connection indicator, zero. Each convolutional operator is followed by a batch norm operation by default. The size of the convolutional kernel can vary, e.g. 3 × 3 and 3 × 5. There are two types of convolution: traditional convolution and its dilated version [37]. The Identity operation will function as a skip connection similar to the ResNet architecture [15], which adds the original signal from the input feature map to the output feature map. The Zero operation will model the lack of connection between two feature maps.

With this notation, the network design is interpreted as two tasks: (1) searching for a set of network operations $\{o^{(i,j)} : j = 1..L \text{ and } i = 1..j\}$; and (2) searching for the operation weights (a.k.a. parameter values) to achieve the highest performance network in the iris recognition setting. For each network, we consider two constraints:

• Compactness of the model: number of parameters: $P$, which is directly related to memory required: $M_P$,
• Computation: number of FLOPs: $K$.

Our aim is to design a network that achieves the highest accuracy possible conditioned on these constraints.

3.1 Re-interpreting layer connections

Considering $o^{(i,j)}$ is discrete, as shown in the literature, searching in the discrete space is extremely computationally heavy and may result in missing the optimal point [14], [22]. We employ one adjustment to make the search space continuous by assigning a coefficient $\alpha_o^{(i,j)}$ for each candidate operation in the operation set $O$. Rather than activating a single operation while all others are disabled between two nodes $(i, j)$, we activate all candidate operations in the operation set but only one operation is strongly encouraged with a high coefficient value while others are strongly discouraged with small coefficient values. We call this a


Fig. 2. One search architecture example: with connection bundles, the architecture search becomes an optimization task to find the best set $\{\alpha_o^{(i,j)}\}$, $i, j = 1..L$, $i < j$ to minimize the loss.

connection bundle as shown in Figure 1. In the traditional discrete case, $\alpha_o^{(i,j)}$ takes only one of two values, $\{0, 1\}$. The most popular choice is that one of the $\alpha_o^{(i,j)}$ is 1 while all others are 0. In our continuous case, $\alpha_o^{(i,j)}$ can take any value in the range of $[0, 1]$. This adjustment has been employed in the gradient-based architecture learning approaches [19].

The sum over categorical operation choices in Equation 1 can now be interpreted as a softmax-weighted mixture over all possible operations,

$$o^{(i,j)}(x) = \sum_{o \in O} \frac{\exp(\alpha_o^{(i,j)})}{Z}\, o(x), \tag{2}$$

where the denominator $Z$ is the normalization factor defined as the sum of the exponentiated coefficients, $Z = \sum_{o' \in O} \exp(\alpha_{o'}^{(i,j)})$. This interpretation translates the architecture search from the discrete domain of $\{o^{(i,j)}\}$ to the continuous domain of $\{\alpha^{(i,j)}\}$. The translation to the continuous domain relaxes the search, which can then be solved by standard optimization approaches such as gradient descent. Once the set of continuous variables $\alpha = \{\alpha^{(i,j)}\}$ has been learned, the discrete architecture can be derived by replacing the pseudo operation $o^{(i,j)}$ with the strongest operation, i.e. $o^{(i,j)} = \arg\max_{o \in O} \alpha_o^{(i,j)}$.

One example of an architecture search space is illustrated in Figure 2. We also measure the Euclidean distances between the strongest and the second strongest operation coefficients for all layers in this example architecture, and subsequently visualize these values by plotting their histogram. The plot is shown in Figure 3. This figure shows that most of the strongest operation coefficients are far from their second strongest, centring around 0.6.

Fig. 3. Histogram of the distances between the strongest and second strongest operation coefficients. It can be seen that most of the strongest operation coefficients are far from their second strongest, centring around 0.6.

3.2 Network design as a constrained optimization

In biometrics, the dataset is usually split into three subsets: training, validation and testing. The training subset is employed to learn models’ parameters while the validation subset is for learning hyperparameters. In the same vein, we use the validation subset to learn the network architecture, $\alpha$, and the training subset to learn the weights, $w$, of the network. The network design task is now described as finding the optimal network architecture, $\alpha^*$, that minimizes the validation loss $L_{val}(w^*, \alpha)$, where the weights, $w^*$, associated with that network architecture, $\alpha^*$, are derived by minimizing the training loss, $w^*(\alpha) = \arg\min_w L_{train}(w, \alpha^*)$.

The network design is now interpreted as a double learning problem. Taking into account the constraints mentioned before, the learning process can be represented as a bi-level constrained optimization problem as follows,

$$\min_{\alpha}\; L_{val}(w^*(\alpha), \alpha) + |\alpha|, \tag{3}$$
$$\text{subject to: } \#FLOP(\alpha) < K, \tag{4}$$
$$\text{and } \#Parameter(\alpha) < P, \tag{5}$$
$$\text{and } w^*(\alpha) = \arg\min_{w} L_{train}(w, \alpha), \tag{6}$$

where the validation data is used to search for the architecture and the training data is used to search for the weights of an architecture. This is a two-level optimization problem where: (1) the lower level optimization problem is searching for the best weights of the existing architecture to minimize the training loss; and (2) the upper level optimization problem is searching for the best network architecture that, with its best weights, minimizes the validation loss.

This bi-level optimization has also arisen in hyperparameter optimization in the network design such as in [10] but in simple forms where the dimensions of the scalar-valued hyperparameters are substantially smaller than the architecture dimension.
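
The softmax relaxation of Equation 2 and the argmax discretization described in Section 3.1 can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' implementation: the three entries of `OPS` are hypothetical stand-ins for the real candidate operations (conv, pool, Identity, zero) and act on a 1-D NumPy array; the function names `mixed_op`, `discretize` and `top2_margin` are illustrative only; and the margin is computed on the raw coefficients, which is one possible reading of the distance plotted in Figure 3.

```python
import numpy as np

# Hypothetical candidate operations for one edge (i, j); toy stand-ins only.
OPS = {
    "identity": lambda x: x,                  # skip connection
    "zero": lambda x: np.zeros_like(x),       # no connection
    "scale2": lambda x: 2.0 * x,              # placeholder for a learned operator
}


def softmax(alpha):
    """Numerically stable softmax over a 1-D coefficient vector."""
    shifted = np.exp(alpha - np.max(alpha))
    return shifted / shifted.sum()


def mixed_op(x, alpha):
    """Equation 2: softmax-weighted sum of every candidate operation."""
    weights = softmax(np.asarray(alpha, dtype=float))
    return sum(w * op(x) for w, op in zip(weights, OPS.values()))


def discretize(alpha):
    """Keep only the strongest operation (argmax over the coefficients)."""
    return list(OPS)[int(np.argmax(alpha))]


def top2_margin(alpha):
    """Gap between the strongest and second strongest raw coefficients."""
    top, second = np.sort(np.asarray(alpha, dtype=float))[::-1][:2]
    return float(top - second)


if __name__ == "__main__":
    x = np.array([1.0, 2.0])
    alpha = np.array([0.9, 0.1, 0.3])   # one coefficient per operation in OPS
    print("mixed output:", mixed_op(x, alpha))
    print("chosen operation:", discretize(alpha))
    print("margin:", top2_margin(alpha))
```

During the search every operation contributes to the edge output through `mixed_op`; once the coefficients have been learned, `discretize` keeps a single operation per edge, and a large margin suggests that the discretized network stays close to the relaxed one.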


3.3 Iterative solution

Iterative gradient-based solutions have been shown to be tractable for solving the bi-level optimization [34]. However, popular gradient-based approaches such as Stochastic Gradient Descent (SGD) [2] are not applicable here due to the presence of inequality design constraints. We therefore propose an algorithm to solve this constrained bi-level optimization problem based on Projected Gradient Descent (PGD) [3], as in Algorithm 1. The gradient of the loss on the training subset, $\nabla_w L_{train}(w, \alpha)$, is used to update the weights, $w$, to learn the best weights for the current architecture. The gradient of the loss on the validation subset, $\nabla_\alpha L_{val}(w - \xi \nabla_w L_{train}(w, \alpha), \alpha)$, is used to guide the architecture update, where $\xi$ is a coefficient. The updated architecture, $\alpha$, is tested on the computation and memory constraints. If it does not satisfy the constraints, $\alpha$ will be projected into the constraint domain to generate a new architecture that satisfies the constraints. This iterative gradient-based approach has been applied in [19] but without constraints.

Algorithm 1: Constrained deep iris network design
Inputs: Constraints:
  - maximum number of FLOPs: $K$
  - maximum number of parameters: $P$
Outputs: The optimal network:
  + optimal architecture $\alpha^*$
  + optimal weights $w^*$
  + optimal performance metrics: $EER^*$
Initialize the connection bundles $o^{(i,j)}$ parameterized by $\alpha_o^{(i,j)}$ for each edge $(i,j)$;
while not converged do
  1. Check the convergence conditions;
  2. Update the weights $w$ by gradient descent on the training loss, $\nabla_w L_{train}(w, \alpha)$;
  3. Update the network architecture $\alpha$ by gradient descent on the validation loss, $\nabla_\alpha L_{val}(w - \xi \nabla_w L_{train}(w, \alpha), \alpha)$;
  4. If $\alpha$ does not satisfy the #FLOPs and #parameters constraints, project it to the constraint domain to generate a new architecture
end
Derive the optimal network architecture by replacing connection bundles with the strongest operation, $o^{(i,j)} = \arg\max_{o \in O} \alpha_o^{(i,j)}$ for each edge $(i,j)$.

4 EXPERIMENTAL RESULTS

We have conducted our experiments on three publicly available datasets:

• ND-CrossSensor-Iris-2013 dataset [1]: contains NIR iris images captured by two iris cameras: LG4000 and LG2200 at the University of Notre Dame (ND). We choose to use the images from the LG2200 cameras due to their large number of images suitable for learning deep networks. There are 116,564 NIR iris images captured in several sessions from 676 subjects over three years from 2008 to 2010. This is the largest public iris dataset in the literature in terms of the number of images [25].
• CASIA-Iris-Thousand dataset [2]: contains NIR iris images captured by the dual-eye iris IKEMB-100 camera from IrisKing at the Institute of Automation, Chinese Academy of Sciences (CASIA). There are 20,000 NIR iris images captured in several sessions from 1,000 subjects. This is the largest public iris dataset in the literature in terms of the number of subjects [4].
• UBIRIS.v2 iris dataset [3]: contains visible iris images captured when the subjects are on the move and at a distance by an off-the-shelf camera (Canon EOS 5D) at the University of Beira Interior (UBI). There are 11,102 visible iris images captured in several sessions from 261 subjects. The unconstrained conditions (at-a-distance, on-the-move and on the visible wavelength) with realistic noise factors make it challenging for the iris recognition task [26].

Fig. 4. Sample images from ND-CrossSensor-2013 (1st row), CASIA-Iris-Thousand (2nd row) and UBIRIS datasets (3rd row).

Some examples and statistics of the three datasets are presented in Figure 4 and Table 1.

Three experiments will be performed for validation. First, the proposed algorithm will be employed to investigate the optimality of the existing approaches. By targeting the computation and/or model size of the existing approaches as the constraints for the search, our constrained search algorithm automatically discovers the network that achieves the highest achievable accuracy. Comparing this accuracy with the accuracy of the existing approaches, their optimality can be validated or rejected. The classic IrisCode and a recent deep iris network will be sequentially used for demonstration in the first two experiments presented in Sections 4.3 and 4.4. The third experiment is to search for a network with competitive accuracy to the state-of-the-art approaches with less computation and model size requirements.

Dataset footnotes:
[1] https://sites.google.com/a/nd.edu/public-cvrl/data-sets
[2] http://www.cbsr.ia.ac.cn/english/IrisDatabase.asp
[3] http://iris.di.ubi.pt/ubiris2.html

4.1 Experimental setup

We first pre-process the iris images by segmentation and normalization. The iris region is defined by one inner and


TABLE 1
Statistics of the three datasets used in this research: ND-CrossSensor-2013, CASIA-Iris-Thousand and UBIRIS.v2.

Dataset              # Subjects  # Images  Distance    Imager        Image Resolution  Iris Diameter  Wavelength  Subject Cooperation
ND-CrossSensor-2013  676         116,564   Close-up    LG2200        640x480           200            NIR         Highly
CASIA-Iris-Thousand  1,000       20,000    Close-up    IKEMB-100     640x480           180            NIR         Highly
UBIRIS.v2            261         11,102    4-8 meters  Canon EOS 5D  800x600           180-80         Visible     Less

one outer circular boundary. These two circles are detected by an integro-differential operator [5] as

\max_{(r, x_0, y_0)} \left| G_\sigma(r) * \frac{\partial}{\partial r} \oint_{r, x_0, y_0} \frac{I(x, y)}{2\pi r} \, ds \right|, \qquad (7)

where I(x, y) denotes the input image and G_σ refers to a Gaussian with a standard deviation σ. This integro-differential operator detects circles by iteratively searching the circular arc ds with a radius r centered at the location (x_0, y_0). A mask is also added to eliminate the occlusion impact of eyelids and eyelashes.

We subsequently normalize the segmented iris region to a fixed-size rectangle with a rubber-sheet model [5], re-mapping the segmented iris image I_S(x, y) from the raw Cartesian coordinates (x, y) to the dimensionless polar coordinates (r, θ) as

x(r, \theta) = (1 - r)\, x_p(\theta) + r\, x_s(\theta), \qquad (8)
y(r, \theta) = (1 - r)\, y_p(\theta) + r\, y_s(\theta), \qquad (9)

where r and θ are the radius and the angle in the range of [0, 1] and [0, 2π] respectively, and (x_p, y_p) and (x_s, y_s) are the points on the inner and outer iris boundaries at angle θ. The normalization step reduces rotations of the eye (e.g., due to head movement) to simple translations during matching.

We use the open-source software USIT v2.2 from the University of Salzburg [28] for the pre-processing phase and generate normalized images with a fixed size of 64 × 512 pixels. Some manual corrections are subsequently made by removing a small portion of wrongly segmented images.

4.2 Performance metrics
To report the performance, we rely on False Rejection Rate (FRR) and Equal Error Rate (EER). In this work, we report the FRR at a False Acceptance Rate (FAR) of 0.1%, an operating point widely adopted in the field.

State-of-the-art iris networks employ two types of losses: (1) cross-entropy loss [11], [23] and (2) pairwise loss [20], [38].
• Cross-entropy loss: an important property of the cross-entropy loss is that it requires the same identities in the training and testing datasets, i.e. the split is sample-disjoint but not subject-disjoint. This is shown in the softmax classifier of [11] and the SVM classifier of [23]. Hence, to be comparable with the state of the art, we first divide our dataset into sample-disjoint but not subject-disjoint training and testing subsets.
• Pairwise loss: the pairwise loss measures the similarity or dissimilarity between two input images, deciding whether they are from the same class or not. This loss allows unseen subjects, i.e. subjects not present in the training phase, to appear in the test phase. This loss has been shown to be effective for iris recognition [20], [38]. Hence we also split the data into subject-disjoint training and testing subsets.

In summary, there are two data splitting schemes in this work: (1) sample-disjoint splitting and (2) subject-disjoint splitting. Depending on the task to be performed, our experiments apply one of these two schemes. For the sample-disjoint scheme, we split the images of each subject into 70% for training, 10% for validation and 20% for testing. For the subject-disjoint scheme, we split the subjects into 70% for training, 10% for validation and 20% for testing.

Intra-dataset performance. We perform the intra-dataset experiment on the ND-CrossSensor-Iris-2013 dataset, not the other two datasets, because its large size makes it suitable for training. The training subset is used to train a network architecture to find the best weights. The validation subset is used to find the best network architecture. The testing subset is used to report the intra-dataset performance.

The classic IrisCode is used as a handcrafted baseline. Our implementation achieves an FRR of 3.76% at FAR = 0.1% and an EER of 1.75% on the ND-CrossSensor-2013 dataset, which is comparable to the state-of-the-art implementation [38].

Cross-dataset performance. The best network learned is further investigated for generalization capability by training on one dataset and testing on others. The best network discovered on the ND-CrossSensor-Iris-2013 dataset is tested on the other two datasets, CASIA-Iris-Thousand and UBIRIS.v2, to understand its generalizability. We do not perform network search on the CASIA and UBIRIS datasets as their small number of images would restrict the search space.

4.3 Case 1: Handcrafted - IrisCode
Firstly, we examine how well deep networks perform when their computation is limited to that of the classic handcrafted IrisCode [6]. We impose one computation constraint, the maximum number of FLOPs, set to that of the classic IrisCode, i.e. 0.9M, and investigate the best accuracy a deep network can achieve compared to the accuracy of the IrisCode. We run the constrained design algorithm to find the network architecture yielding the smallest EER under the IrisCode computation budget.

Operation set. We apply popular operations, which are widely used in the existing deep iris networks, in the operation set O: 3 × 3 and 3 × 5 convolution, 2 × 2 max pooling, 2 × 2 average pooling, Identity and Zero. All operations are of stride 1.

Loss choice. We apply the most popular softmax classifier with a classification cross-entropy loss. A majority of


previous deep iris recognition networks use this classification loss trained over a set of known iris identities and then take the intermediate bottleneck layer as a representation beyond the set of identities used in training. We use the classification probabilities as scores on known subjects in the test set. Varying the score threshold generates different operating points in the DET curve. To be succinct, we only present the EER in the first two experiments. The DET curve will be presented in the third experiment when comparing with the state-of-the-art approaches.

Fig. 5. The architecture that achieves the same accuracy level as the handcrafted IrisCode. Notice that it reaches the same performance level only with 8 times the computation.

Fig. 6. The optimal network architecture that is discovered by our constrained search algorithm.

TABLE 3
Computation vs. EER of deep architectures vs. IrisCode. The first row presents the computation ratio between the deep network and IrisCode, K_R = K_deep architecture / K_IrisCode; the second row presents the best EER achieved under the selected computation constraint; and the third row presents the accuracy ratio between the deep network and IrisCode, EER_R = EER_deep architecture / EER_IrisCode.

| K_R | 2.0 | 3.0 | 4.0 | 5.0 | 6.0 | 7.0 | 8.0 |
|---|---|---|---|---|---|---|---|
| EER (%) | 16.2 | 15.1 | 13.3 | 8.9 | 5.6 | 2.9 | 1.4 |
| EER_R | 9.3 | 8.6 | 7.6 | 5.1 | 3.2 | 1.7 | 0.8 |
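The threshold sweep behind the DET curve, and the EER and FRR-at-FAR figures reported in this section, can be illustrated with a minimal sketch. It assumes similarity scores where a comparison is accepted when the score is at or above the threshold; the paper does not specify its evaluation code, so the function names and the toy scores below are hypothetical.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when a comparison is accepted if score >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Sweep every observed score as a threshold and return (EER, threshold)
    at the operating point where FAR and FRR are closest."""
    best_gap, best_eer, best_thr = None, None, None
    for thr in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, thr)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr
    return best_eer, best_thr


def frr_at_far(genuine, impostor, target_far):
    """Lowest FRR among swept thresholds whose FAR stays at or below target_far."""
    # The infinite threshold rejects everything, so a feasible point always exists.
    thresholds = sorted(set(genuine) | set(impostor)) + [float("inf")]
    points = [far_frr(genuine, impostor, t) for t in thresholds]
    return min(frr for far, frr in points if far <= target_far)


# Toy similarity scores (hypothetical, not taken from the paper).
genuine_scores = [0.9, 0.8, 0.7, 0.6, 0.4]
impostor_scores = [0.5, 0.3, 0.2, 0.1, 0.05]

eer, thr = equal_error_rate(genuine_scores, impostor_scores)
frr_zero_far = frr_at_far(genuine_scores, impostor_scores, target_far=0.0)
```

On real data the sweep would run over many more comparison scores, and the target FAR would be 0.001 to match the FAR = 0.1% operating point used in this work.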
Dataset. The experiment is performed on the ND-CrossSensor-Iris-2013 dataset. Due to the cross-entropy loss, the sample-disjoint scheme is applied to split the dataset into 70% of the images in each subject for training, 10% of the images in each subject for validation and 20% of the images in each subject for testing.

TABLE 2
Case 1 constrained design task.

| Inputs | Algorithm hyperparameters | Outputs |
|---|---|---|
| K = K_IrisCode; P = Inf | O = {3 × 3, 3 × 5 conv; 2 × 2 max pool; 2 × 2 avg pool; Identity; Zero}; Loss = cross-entropy | α*; w*; EER* |

The constrained design task, with its input constraints, its hyperparameters (the operation set and the loss choice) and its outputs, is summarized in Table 2. We found that the best network constrained by the number of FLOPs of the classic IrisCode achieves an EER of 16.7%, which is roughly 10 times the EER achieved by the IrisCode. Two interesting points can be inferred here: (i) deep networks struggle when the computation is strictly limited. Fundamentally, strictly limited computation is detrimental to the modeling capacity and learning algorithms; and (ii) this reinforces the effectiveness of the classic IrisCode in terms of both computation and recognition accuracy.

We subsequently ask how much additional computation and memory we have to sacrifice to achieve similar or even better accuracy than the handcrafted counterpart. We gradually increased the constraint K in steps of K_IrisCode, along with the initial number of layers, and report the EERs in Table 3. The same level of accuracy is only achieved with 8 times the computation, using the network architecture shown in Figure 5. This aligns with the well-known universal approximation theory of deep networks [9], which states that a deep network can approximate any function given enough resources. This illustrates the main characteristic of these deep networks: automatic feature engineering comes at the cost of heavy computation and memory requirements.

4.4 Case 2: Deep learning - ResNet18
Secondly, we examine how effective the state-of-the-art deep networks are in terms of architecture design. In other words, under the same computation and memory as the existing state-of-the-art networks, what is the best accuracy we can achieve for the iris recognition task? Nguyen et al. analyzed the layer-by-layer performance, on the iris recognition task, of the landmark networks that have won the ImageNet challenge since 2012 [23]. Despite being pre-trained on ImageNet, these networks have shown competitive performance on the iris recognition task with transfer learning. We choose one of the landmark networks, ResNet18, for experiments due to its simplicity, the uniformity of its layer connections and its wide adoption in the field [15]. We apply the proposed constrained design algorithm to search for the network with the best accuracy achievable within the computation, K_ResNet18, and the number of parameters, P_ResNet18, of the ResNet18 network. We employ the same operation set, the cross-entropy loss and the data splitting scheme as in Case


1. The constrained design task is summarized in Table 4. Our algorithm discovers a new architecture that achieves higher accuracy than the original ResNet18, with EER = 1.12% vs. 1.29% and FRR = 2.23% vs. 2.58%, at the same level of computation and memory.

TABLE 4
Case 2 constrained design task.

| Inputs | Algorithm hyperparameters | Outputs |
|---|---|---|
| K = K_ResNet18; P = P_ResNet18 | O = {3 × 3, 3 × 5 conv; 2 × 2 max pool; 2 × 2 avg pool; Identity; Zero}; Loss = cross-entropy | α*; w*; EER* |

4.5 Case 3: State-of-the-art
We also want to see whether we can achieve competitive accuracy compared with the state-of-the-art deep iris networks.

Operation set. We apply popular operations in the operation set O: 3 × 3 and 3 × 5 convolution, 3 × 3 and 3 × 5 dilated convolution, 2 × 2 max pooling, 2 × 2 average pooling, Identity and Zero. All operations are of stride 1.

Loss choice. We leverage the most recent advance in loss design for biometrics by using a pairwise loss called Extended Triplet loss, as investigated in [33], [38]. Compared with classification losses, pairwise losses directly reflect what we want to achieve, i.e. to train the representation to correspond to iris similarity. This results in irises of the same person having small distances and irises of different people having larger distances.

Dataset. The experiment is performed on the ND-CrossSensor-Iris-2013 dataset. Due to the pairwise loss, the subject-disjoint scheme is applied to split the dataset into 70% of the subjects for training, 10% of the subjects for validation and 20% of the subjects for testing.

The constrained design task is summarized in Table 5. Running on a single Nvidia GTX 1080Ti GPU, our algorithm discovered the network presented in Figure 6 in 26 hours. We compare the discovered network with the state-of-the-art approaches in four metrics: two for accuracy (FRR and EER), one for computation (K) and one for memory (P).

TABLE 5
Case 3 constrained design task.

| Inputs | Algorithm hyperparameters | Outputs |
|---|---|---|
| K = K_FeatNet; P = P_FeatNet | O = {3 × 3, 3 × 5 conv; 3 × 3, 3 × 5 dilated conv; 2 × 2 max pool; 2 × 2 avg pool; Identity; Zero}; Loss = pairwise | α*; w*; EER* |

We compare with two state-of-the-art deep iris networks using pairwise loss: DeepIris [20] and FeatNet [38]. Since the original models in their papers are not publicly available, we carefully implemented and trained the networks according to all the details in [20], [38]. A notable difference is that we use the same segmentation method from USIT v2.0 for all approaches to be comparable. The normalized iris images, with a size of 64 × 512, are fed directly to [38] since they use the same input size. The normalized iris images are resized to the expected size of [20], [38] to be compatible with their network designs. The performance achieved is comparable to that reported in the papers. Table 6 and the DET curves in Figure 7 show that the discovered network outperforms all existing deep iris networks in terms of accuracy, with fewer parameters and less computation.

TABLE 6
Comparison with state-of-the-art approaches on the ND-CrossSensor-2013 dataset.

| Method | FRR | EER | K | P |
|---|---|---|---|---|
| IrisCode | 3.76% | 1.75% | 0.5M | 5 |
| DeepIris [20] | 2.62% | 1.31% | 20M | 192K |
| FeatNet [38] | 1.79% | 0.99% | 30M | 13K |
| Ours | 1.32% | 0.68% | 19M | 11K |

Fig. 7. DET curves for comparison with other deep learning feature representations on the test set of the ND-CrossSensor-2013 dataset. Best viewed in color.

To understand the statistical significance of the difference in performance, we randomly divided the subjects into 10 different test sets. For each test set, we measured the performance of the network architecture found by our algorithm and of FeatNet, the top-performing algorithm in the literature. Then, we used a paired t-test to see whether the proposed method obtained a statistically significant improvement. The null hypothesis being tested, H0, is that the two classifiers are drawn from the same distribution. The paired t-test calculates a p-value that shows whether we can reject H0 (if p < alpha) or fail to reject H0 (if p ≥ alpha). The popular choice of the cut-off threshold alpha is 0.05. If the null hypothesis is rejected, there is evidence of a significant difference in performance. This strategy has been employed in [16]. The results are shown in Table 7. The t-test showed a statistically significant improvement of the new network architecture found by our algorithm over the state-of-the-art approach.
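The paired t-test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the per-test-set EER values are hypothetical placeholders, and instead of computing a p-value the sketch compares the t statistic against the two-sided critical value for alpha = 0.05 with 9 degrees of freedom (10 paired test sets), which leads to the same reject or fail-to-reject decision.

```python
import math
from statistics import mean, stdev

# Two-sided critical value of Student's t for alpha = 0.05 and 9 degrees of freedom.
T_CRIT_DF9 = 2.262


def paired_t_statistic(a, b):
    """t statistic of the paired differences a[i] - b[i].

    Raises ZeroDivisionError if all differences are identical (zero variance).
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


# Hypothetical EERs (%) on 10 random test sets; NOT the paper's measurements.
baseline_eer = [1.9, 1.7, 1.8, 2.0, 1.6, 1.8, 1.9, 1.7, 1.8, 1.7]
candidate_eer = [1.4, 1.3, 1.2, 1.5, 1.3, 1.4, 1.3, 1.2, 1.4, 1.3]

t_stat = paired_t_statistic(baseline_eer, candidate_eer)
# Reject H0 when |t| exceeds the critical value (equivalent to p < 0.05).
reject_h0 = abs(t_stat) > T_CRIT_DF9
```

A statistics library that provides a paired t-test could be used instead to obtain the p-value directly; the pure-Python form is chosen here only to keep the sketch self-contained.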


TABLE 7
Statistical significance of the performance.

| Methods | Avg EER |
|---|---|
| FeatNet [38] | 1.79 × 10^-2 |
| Ours | 1.32 × 10^-2 |
| p-value | 2.61 × 10^-3 |

4.6 Generalizability of the model
Finally, we want to understand the generalization capability of the network architecture discovered by performing a cross-dataset experiment on two smaller datasets, CASIA [4] and UBIRIS [26]. While the CASIA dataset captured NIR iris images using a different camera, the UBIRIS dataset captured the iris images with a visible-wavelength camera. Together they cover a wide range of imaging conditions for testing generalizability. Each of the two datasets, CASIA and UBIRIS, is split into 20% for training and 80% for testing. Three networks, i.e. ours, DeepIris and FeatNet, trained as per Section 4.5, are further fine-tuned on the training subset and tested on the testing subset of each of the two datasets for the cross-dataset performance investigation.

The performance is presented in Table 8. The network discovered by our architecture search algorithm outperforms the state-of-the-art approaches on both the CASIA and UBIRIS datasets, illustrating a high level of generalization across different sensors, different imaging distances and different levels of subject cooperation.

TABLE 8
Cross-dataset performance of the network discovered.

| Method | CASIA-Iris-1K FRR | CASIA-Iris-1K EER | UBIRISv2 FRR | UBIRISv2 EER |
|---|---|---|---|---|
| IrisCode | 5.53% | 3.46% | 14.31% | 8.33% |
| DeepIris [20] | 4.25% | 2.17% | 13.20% | 7.12% |
| FeatNet [38] | 3.98% | 1.93% | 13.93% | 6.69% |
| Ours | 3.07% | 1.54% | 11.12% | 5.98% |

5 CONCLUSIONS
This paper proposes an algorithm to design a deep iris recognition network with attention to computation and memory constraints. By modeling the design process as a bi-level constrained optimization problem, our algorithm is able to search for the optimal network which achieves the best possible performance conditioned on the pre-defined computation and model compactness constraints. This algorithm enables us to investigate the effectiveness of the classic handcrafted IrisCode compared with deep network counterparts. It also enables us to further improve the existing deep iris recognition networks to achieve similar or better accuracy with the same level of computation and memory cost. The design algorithm also discovers an optimal network with competitive performance while requiring less computation and memory than the state-of-the-art approaches, in both intra-dataset and cross-dataset experiments. More importantly, this algorithm simultaneously achieves both automatic feature engineering and network architecture engineering, opening the way to full automation in deep iris recognition network design.

REFERENCES
[1] B. Bhanu and A. Kumar. Deep Learning for Biometrics. Springer, 2017.
[2] S. Boyd and L. Vandenberghe. Convex Optimization. Stanford University, 2004.
[3] S. Bubeck. Convex optimization: Algorithms and complexity. Foundations and Trends in Machine Learning, 8(3):231–357, 2015.
[4] Chinese Academy of Sciences Institute of Automation. CASIA iris image database, http://biometrics.idealtest.org/, Aug 2017.
[5] J. Daugman. How iris recognition works? IEEE Transactions on Circuits and Systems for Video Technology, 14:21–30, 2004.
[6] J. Daugman. New methods in iris recognition. IEEE Transactions on Systems, Man and Cybernetics, 37:1167–75, 2007.
[7] J. Daugman. Information theory and the iriscode. IEEE Transactions on Information Forensics and Security, 11(2):400–409, Feb 2016.
[8] J. Daugman. Information theory and the iriscode. IEEE Transactions on Information Forensics and Security, 11:400–409, Feb 2016.
[9] J. Daugman and C. Downing. Searching for doppelgangers: assessing the universality of the iriscode impostors distribution. IET Biometrics, 5:65–75, 2016.
[10] L. Franceschi, P. Frasconi, S. Salzo, R. Grazzi, and M. Pontil. Bilevel programming for hyperparameter optimization and meta-learning. In International Conference on Machine Learning (ICML), 2018.
[11] A. Gangwar and A. Joshi. Deepirisnet: Deep iris representation with applications in iris recognition and cross-sensor iris recognition. In IEEE International Conference on Image Processing (ICIP), pages 2301–2305, Sep 2016.
[12] A. K. Gangwar, A. Joshi, P. Joshi, and R. Raghavendra. Deepirisnet2: Learning deep-iriscodes from scratch for segmentation-robust visible wavelength and near infrared iris recognition. CoRR, abs/1902.05390, 2019.
[13] Z. Gordon, E. Eban, O. Nachum, B. Chen, T.-J. Yang, and E. Choi. Morphnet: Fast & simple resource-constrained structure learning of deep networks. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[14] W. Grathwohl, E. Creager, S. K. S. Ghasemipour, and R. Zemel. Gradient-based optimization of neural network architecture. In International Conference on Learning Representation (ICLR), 2018.
[15] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, Jun 2016.
[16] K. P. Hollingsworth, K. W. Bowyer, and P. J. Flynn. Improved iris recognition through fusion of hamming distance and fragile bit distance. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 33(12):2465–2476, Dec 2011.
[17] M. Jaderberg, K. Simonyan, A. Zisserman, and K. Kavukcuoglu. Spatial transformer networks. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems (NIPS), pages 2017–2025. 2015.
[18] A. W. K. Kong, D. Zhang, and M. S. Kamel. An analysis of iriscode. IEEE Transactions on Image Processing, 19(2):522–532, Feb 2010.
[19] H. Liu, K. Simonyan, and Y. Yang. Darts: Differentiable architecture search. arXiv:1806.09055, 2018.
[20] N. Liu, M. Zhang, H. Li, Z. Sun, and T. Tan. Deepiris: Learning pairwise filter bank for heterogeneous iris verification. Pattern Recognition Letters, 82:154–161, 2016.
[21] M. Lopez, J. Daugman, and E. Canto. Hardware-software co-design of an iris recognition algorithm. IET Information Security, 5:60–68, Mar 2011.
[22] R. Luo, F. Tian, T. Qin, E. Chen, and T.-Y. Liu. Neural architecture optimization. In Advances on Neural Information Processing Systems (NIPS), 2018.
[23] K. Nguyen, C. Fookes, A. Ross, and S. Sridharan. Iris recognition with off-the-shelf cnn features: A deep learning perspective. IEEE Access, 6:18848–18855, 2017.
[24] H. Pham, M. Guan, B. Zoph, Q. Le, and J. Dean. Efficient neural architecture search via parameter sharing. In International Conference on Machine Learning (ICML), 2018.
[25] P. Phillips, W. Scruggs, A. O’Toole, P. Flynn, K. Bowyer, C. Schott, and M. Sharpe. Frvt 2006 and ice 2006 large-scale experimental results. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(5):831–846, May 2010.
[26] H. Proença, S. Filipe, R. Santos, J. Oliveira, and L. Alexandre. The UBIRIS.v2: A database of visible wavelength images captured on-the-move and at-a-distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(8):1529–1535, Aug 2010.
[27] H. Proença and J. C. Neves. Irina: Iris recognition (even) in inaccurately segmented data. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 6747–6756, Jul 2017.


[28] C. Rathgeb, A. Uhl, P. Wild, and H. Hofbauer. Design decisions for an iris recognition SDK. In K. Bowyer and M. J. Burge, editors, Handbook of Iris Recognition, Advances in Computer Vision and Pattern Recognition. Springer, 2016.
[29] E. Real, A. Aggarwal, Y. Huang, and Q. V. Le. Regularized evolution for image classifier architecture search. CoRR, abs/1802.01548, 2018.
[30] E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. Le, and A. Kurakin. Large-scale evolution of image classifiers. In International Conference on Machine Learning (ICML), 2017.
[31] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211–252, Dec 2015.
[32] J. Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 2015.
[33] F. Schroff, D. Kalenichenko, and J. Philbin. Facenet: A unified embedding for face recognition and clustering. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 815–823, Jun 2015.
[34] A. Sinha, P. Malo, and K. Deb. A review on bilevel optimization: From classical to evolutionary approaches and applications. IEEE Transactions on Evolutionary Computation, 22(2):276–295, Apr 2018.
[35] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Rethinking the inception architecture for computer vision. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2818–2826, Jun 2016.
[36] T. Yang, A. G. Howard, B. Chen, X. Zhang, A. Go, V. Sze, and H. Adam. Netadapt: Platform-aware neural network adaptation for mobile applications. In European Conference on Computer Vision (ECCV), 2018.
[37] F. Yu and V. Koltun. Multi-scale context aggregation by dilated convolutions. In International Conference on Learning Representation (ICLR), 2016.
[38] Z. Zhao and A. Kumar. Towards more accurate iris recognition using deeply learned spatially corresponding features. In IEEE International Conference on Computer Vision (ICCV), Oct 2017.
[39] B. Zoph and Q. V. Le. Neural architecture search with reinforcement learning. In International Conference on Learning Representation (ICLR), 2017.
[40] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le. Learning transferable architectures for scalable image recognition. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

Kien Nguyen is a Research Fellow at Queensland University of Technology (QUT, Australia). He has been conducting research in iris recognition for 10 years, having published in the top conferences and journals such as CVPR, PR, and CVIU. His research interests are applications of computer vision and deep learning techniques for biometrics, surveillance and scene understanding. He has been serving as an Associate Editor of the journal IEEE Access in the area of biometrics since 2016.

Clinton Fookes is a Professor in Vision and Signal Processing at the Queensland University of Technology. He holds a BEng (Aerospace/Avionics), an MBA, and a PhD in computer vision. He serves on the editorial boards for the Pattern Recognition Journal and the IEEE Transactions on Information Forensics & Security. He is a Senior Member of the IEEE, an AIPS Young Tall Poppy, an Australian Museum Eureka Prize winner, and a Senior Fulbright Scholar.

Sridha Sridharan obtained his MSc degree from the University of Manchester, UK and his PhD degree from University of New South Wales, Australia. He is currently a Professor at Queensland University of Technology (QUT) where he leads the research program in Speech, Audio, Image and Video Technologies (SAIVT).
