
2020 International Conference on Advanced Computing and Applications (ACOMP)

Efficient Palm-Line Segmentation with U-Net Context Fusion Module

Toan Pham Van, Son Trung Nguyen, Linh Bao Doan, Ngoc N. Tran
R&D Lab, Sun* Inc
{pham.van.toan,nguyen.trung.son,doan.bao.linh,tran.ngo.quang.ngoc}@sun-asterisk.com

Ta Minh Thanh
Le Quy Don Technical University, 236 Hoang Quoc Viet, Bac Tu Liem, Ha Noi
[email protected]

Abstract—Many cultures around the world believe that palm reading can be used to predict a person's future life. Palmistry uses features of the hand such as palm lines, hand shape, or fingertip position. However, research on palm-line detection is still scarce, and much of it applies traditional image processing techniques. In most real-world scenarios, images are usually not well-conditioned, causing these methods to severely under-perform. In this paper, we propose an algorithm to extract the principal palm lines from an image of a person's hand. Our method applies deep neural networks (DNNs) to improve performance. Another challenge of this problem is the lack of training data. To deal with this issue, we handcrafted a dataset from scratch. On this dataset, we compare the performance of readily available methods with ours. Furthermore, based on the U-Net segmentation architecture and the attention mechanism, we propose a highly efficient architecture to detect palm lines. We propose the Context Fusion Module to capture the most important context features, which aims to improve segmentation accuracy. The experimental results show that our method outperforms the others, with the highest F1 score of about 99.42% and an mIoU of 0.584 on the same dataset.

Index Terms—image segmentation, palm lines reading, palmistry, context fusion module.

I. INTRODUCTION

A. Overview

Nowadays, with great advances in computer science, image processing applications are becoming more and more popular. Palmistry is one of the interesting problems in the computer vision field. The main task of this problem is to extract the palm lines from a hand picture. The merit of this task is that the palm lines are believed to predict one's future life. Looking at these images, we can find principal lines, wrinkles, and ridges on one's palm. Usually, a palm has a few main lines that are the most notable and change little over time. Wrinkles are generally much thinner than principal lines and much more irregular. The ridges' shape is the same as the fingerprint's; hence it is difficult to distinguish them in low-resolution images. Our task is to manually choose the best parameters for some algorithm to distinguish the three line types [1]. However, this is not easy, as these parameters may work very well in some cases but not in others, resulting in a lack of generality.

As mentioned above, recent works have focused on traditional image processing and mathematical methods. Several algorithms were applied with computer vision techniques such as noise filtering, edge detection and directional detectors [2][3] to accomplish this purpose. With these methods, the principal palm lines can be extracted under clear conditions and at high resolution. However, the performance of previous works drops, giving low-accuracy results for images with complex backgrounds.

To overcome these problems, deep learning approaches have been proposed for palm-line extraction. An important architecture in deep learning is the convolutional neural network (CNN) [4]. This model is useful for many tasks in computer vision such as face recognition [5], image classification [4], image super-resolution [6], semantic segmentation [7], and so on. The advantage of deep learning techniques is that they are generally accurate when trained with a good dataset. That means the algorithm can cover many different cases of input images while ensuring high accuracy. Two big disadvantages of deep learning methods are the required high-quality dataset and the low evaluation speed of the model in the production phase. To solve these two problems, in addition to building a dataset carefully, we need to build a network architecture that balances high accuracy against the predictive speed of the model. We compare many network architectures to choose the optimal one for both accuracy and computing speed. Our method achieves an F1 score of 99.42% and an mIoU of 0.584, and can run at 94 FPS on a mid-range consumer-tier GPU.

B. Our contributions

To sum up, our main contributions are summarized below:
• (1) Proposing the use of a DNN to address the problem of palm-line detection. We use deep learning alongside image processing techniques instead of the pure traditional image processing of previous papers.
• (2) Providing a high-quality dataset for this problem. This dataset was carefully annotated by professionals in the field¹.

• (3) Proposing our model with the Context Fusion Module to achieve high accuracy even with complex palm-line images. With this approach, we achieve respectable performance with respect to the mIoU score.

¹ https://link.sun-asterisk.vn/palmlinedataset

Fig. 1. The architecture of our image segmentation system.

C. Roadmap

The rest of the paper is organized as follows. Section 2 presents a brief review of related works. In Section 3, the data pre-processing method and proposed model architecture are discussed. The data preparation for the experiment and the system setup are described in Section 4. Experimental results and evaluation are presented in Section 5. Finally, our conclusion is given in Section 6.

II. RELATED WORKS

A. Palm-line applications in real life

Palm reading (also known as palmistry) is an ancient technique that originated in China. It is the analysis of a human hand to foretell the owner's future and personality. The ancient Chinese believed palm lines hold information about a person, similar to how our ancestors found correlations between the movement of planets and events that happened on Earth. Be it the curve, the length, the depth or the location of the lines, every detail has its own specific meaning.

Palm prints can also be used in the field of biometric verification and recognition. Similar to fingerprints, each person's palm print is unique. Distinct palm-print features such as geometry, lines, points, and wrinkles can be used for authentication purposes. Combined with other biometrics, the security level and privacy are increased significantly [8].

B. Other approaches

At present, numerous methods have been proposed for the palm-line detection process. However, subpar accuracy is still the main issue when extracting features of palm lines. Earlier works were affected by the limitations of pure traditional image processing techniques, and some papers suggested integrating hardware devices to improve precision, which provides specific optimal circumstances for detection. For instance, among the proposals were ROI extraction [9][10] and 3D palmprint with structured light imaging [11].

III. PROPOSED METHOD

In this section, we present the deep learning algorithms used to solve the problem defined above. We conduct experiments with existing deep learning models and compare their accuracy. Each architecture has advantages and disadvantages, but a majority of them suffer from blemishes in images with complex patterns. To overcome these difficulties, we propose a network with a custom module called the Context Fusion Module (CFM) combined with the traditional U-Net architecture. This helps our network work better with input images containing complex palm prints.

A. Segment Architecture

U-Net [12] is a U-shaped convolutional neural network which was first used in the field of medical image segmentation [13][14][15]. It is a specific symmetric instance of the encoder-decoder network structure, with skip connections from layers in the encoder to the corresponding layers in the decoder. Encoder-decoder networks have been applied to many computer vision tasks, including object detection and semantic segmentation. These networks contain an encoder module that compresses feature maps to capture higher semantic information, and a decoder module that recovers the spatial information.

Feature Pyramid Network (FPN) [16] uses a standard network with multiple high-spatial-resolution features and adds a top-down pathway with lateral connections. The top-down path begins at the deepest level of the network and is progressively upsampled while adding a transformed version of the higher-resolution feature from the bottom-up path. The FPN generates a pyramid, where each level has the same channel dimension.

B. Backbone Network

ResNet-34 [17], or more generally ResNet, was developed by Microsoft in 2015. It has a structure similar to VGG but with more stacked layers. With traditional deep learning models, extra layers are added in order to achieve better accuracy, which causes a phenomenon called the vanishing/exploding gradient [18][19] as models get deeper. ResNet, with its residual blocks, is designed to solve this problem, hence providing a better outcome. ResNet-34 (34 layers) was selected among the best, based on our previous experiments.

ResNeXt-50 [20] is described as a straightforward network for the task of image classification. The authors established a new hyper-parameter named cardinality, a crucial factor in addition to the model's depth and width. According to the paper, increasing cardinality shows much better results than going deeper (increasing layers) or wider (increasing bottleneck width). We choose to experiment with this backbone since the authors showed that the network fares better than ResNet on both COCO detection and ImageNet-5k.

C. Our network - U-Net with Context Fusion Module

Attention mechanisms focus on the important regions of the local features and neglect irrelevant information in the global features. This design makes them effective in solving the long-range dependency problem. With the attention mechanism, deep learning models have become successful in many computer vision problems such as image classification [21], image captioning [22], image segmentation [23], and so on. To handle long-range dependencies in the palm-line segmentation problem, we combine local and global features in one module called the Context Fusion Module (CFM). This module is based on the attention mechanism to improve the accuracy of the overall model. It is integrated into U-Net after the encoder component, as a bottleneck layer, as shown in Figure 3. Our Context Fusion Module is shown in Figure 2. It can be divided into two sub-modules. The first module, called Context Modeling, captures the global context features with a 1x1 convolution layer followed by a softmax to obtain the attention weights. The main purpose of this module is to perform attention pooling and obtain the global context features. The second module, called the Context Transform module, is divided into two branches. The left branch includes two 1x1 convolutions, with a ReLU activation and a sigmoid after each of them, respectively. The left branch aims to compute the importance of each channel and captures channel-wise dependencies. The right branch has the same architecture as the left branch but without the sigmoid, and is independent of the left. This branch's purpose is to capture the global context feature as a piece of additional information for the fusion module. The left branch output is then used as weights to linearly combine the input of the CFM, yielding a local context vector, which is then fused with the right branch through element-wise addition. The complete module is illustrated in Figure 2.

Fig. 2. Context Fusion Module combining local and global context features.

Fig. 3. U-Net with Context Fusion Module.
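The description above maps naturally onto a custom Keras layer. The following is a minimal sketch under our reading of the text: both Context Transform branches are assumed to take the pooled context vector as input, and the channel-reduction ratio of 4 is an illustrative assumption, since the paper does not state every hyper-parameter.

```python
import tensorflow as tf
from tensorflow.keras import layers


class ContextFusionModule(layers.Layer):
    """Sketch of the CFM: attention pooling plus a two-branch transform."""

    def __init__(self, channels, reduction=4, **kwargs):
        super().__init__(**kwargs)
        # Context Modeling: 1x1 conv whose softmax yields one attention
        # weight per spatial position.
        self.attn_conv = layers.Conv2D(1, 1)
        # Context Transform, left branch: two 1x1 convs with ReLU and
        # sigmoid; its output is a per-channel importance weight.
        self.left = tf.keras.Sequential([
            layers.Conv2D(channels // reduction, 1, activation="relu"),
            layers.Conv2D(channels, 1, activation="sigmoid"),
        ])
        # Right branch: same shape, but no sigmoid on the output.
        self.right = tf.keras.Sequential([
            layers.Conv2D(channels // reduction, 1, activation="relu"),
            layers.Conv2D(channels, 1),
        ])

    def call(self, x):
        h, w, c = x.shape[1], x.shape[2], x.shape[3]
        # Attention pooling over all H*W positions -> one global
        # context vector of shape (B, 1, 1, C).
        weights = tf.nn.softmax(
            tf.reshape(self.attn_conv(x), [-1, h * w, 1]), axis=1)
        flat = tf.reshape(x, [-1, h * w, c])
        context = tf.reshape(
            tf.reduce_sum(flat * weights, axis=1), [-1, 1, 1, c])
        # Local context: the input re-weighted by channel importances,
        # fused with the global transform by element-wise addition.
        return x * self.left(context) + self.right(context)
```

Under this sketch, the module is dropped in at the U-Net bottleneck, e.g. `x = ContextFusionModule(channels=512)(encoder_output)`, where 512 is an assumed encoder output width.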
IV. DATASET AND TRAINING

A. Dataset

For our task at hand, we handcrafted our own dataset. As mentioned in the previous part, the 11K Hands dataset (1600 × 1200 pixels) [24] has 11076 images of human hands, from subjects ranging from 18 to 75 years old. All images have the same solid white background and a similar distance from the viewpoint. Based on the two labels "palmar left" and "palmar right", we gather a dataset consisting of 5243 images of palmar sides. We discard all but the 1039 best-quality images to ensure these properties of a good dataset:
• Balanced distribution. We first select only the palmar-side images, then filter out subpar images from the high-quality ones, to remain with 512 images of the "palmar left" label and 527 images of the "palmar right" label.
• Wide variety. The original dataset contains images diverse in skin color, gender, and age. Our custom dataset retains the diversity of the source, achieved by carefully choosing a variety of genders, skin colors, hand poses, and ages.
• No incorrect labels or image noise. There are many blurred or incorrectly labeled pictures, and images unsuitable for the task, for instance hands with long scars or palm lines that are not visible. We carefully inspect and select the most appropriate images for the task and pass them to the annotation process that follows.

• Annotation method. We used Supervisely [25], a professional platform for image annotation and data management. A specific tool called "add bitmap" was utilised to draw bitmap paths along the palm lines. We mainly concentrate on the visible and most meaningful lines for somatomancy purposes.

After the annotation process, we apply an augmentation step with Albumentations [26] to diversify and enrich the dataset. The techniques we used include horizontal flip, shift-scale-rotate, random brightness/contrast, and CLAHE. The final result contains 4156 images with 4156 corresponding masks that have appropriate variation in complexion, contrast, and magnitude.
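A minimal sketch of such a pipeline with the four transforms named above is shown below. The probabilities and limits are illustrative assumptions, as the paper does not list its exact augmentation parameters, and the file paths are hypothetical.

```python
import albumentations as A
import cv2

# The four transforms named above; p-values and limits are our own
# illustrative choices, not the authors' exact settings.
augment = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.1,
                       rotate_limit=15, p=0.5),
    A.RandomBrightnessContrast(p=0.5),
    A.CLAHE(p=0.5),
])

# Albumentations applies the spatial transforms identically to the
# image and its mask; "hand.jpg"/"hand_mask.png" are hypothetical paths.
image = cv2.imread("hand.jpg")
mask = cv2.imread("hand_mask.png", cv2.IMREAD_GRAYSCALE)
out = augment(image=image, mask=mask)
image_aug, mask_aug = out["image"], out["mask"]
```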
B. Training

1) Loss function: Since the output of our model is a map of probabilities denoting whether each pixel is of interest, we use binary cross-entropy as the loss function. This loss function measures the entropy difference (and equivalently, the statistical distance plus some constant) between the ground-truth Bernoulli distribution and our predicted one. We sum up the pixel-level cross-entropy values to get the loss for each image as follows:

L(x', x) = -\sum_{i,j} \left[ x'_{ij} \log x_{ij} + (1 - x'_{ij}) \log(1 - x_{ij}) \right]   (1)

where x'_{ij} is the ground-truth mask value (either 0 or 1), and x_{ij} is our predicted probability that the pixel is segmented as positive. Since our input size is fixed at 256 × 256 (i, j = 1, ..., 256), we evaluate the class predictions for each pixel and take the average of the losses over all pixels. We also experimented with mean squared error (MSE) as a loss candidate. However, MSE was slower and converged to a worse local minimum, which can be explained by the fact that it is a generic loss function taking no prior information about the problem into account.
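As a quick illustration of Eq. (1) in the TensorFlow/Keras stack used later in the paper, the built-in binary cross-entropy averages the per-pixel terms. The toy tensors below are our own example, not data from the paper.

```python
import tensorflow as tf

# y_true: binary ground-truth mask; y_pred: predicted probabilities.
y_true = tf.constant([[1.0, 0.0],
                      [0.0, 1.0]])
y_pred = tf.constant([[0.9, 0.2],
                      [0.1, 0.8]])

bce = tf.keras.losses.BinaryCrossentropy()
# Mean over pixels of -[y log p + (1 - y) log(1 - p)]; here ~0.164.
loss = bce(y_true, y_pred)
```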
2) Evaluation metrics: We use two metrics for our model's performance:
• F1 score: this is a natural measurement for a pixel-level classification like our models' setting. The formula for the F1 score is:

Precision = \frac{TP}{TP + FP}   (2)

Recall = \frac{TP}{TP + FN}   (3)

F1 = \frac{2}{Recall^{-1} + Precision^{-1}}   (4)

which is the harmonic mean of the Precision (true positives over predicted positives) and Recall (true positives over actual positives). This gives us a better evaluation than mere accuracy in the case of imbalanced data, which happens to be our case as well, since most of the regions in the picture are not palm lines.
• IoU score: a measure of accuracy for segmentation problems, defined as the ratio of the Intersection region over the Union region. In a way, this is the segmentation version of the recall.

IoU = \frac{|target \cap prediction|}{|target \cup prediction|}   (5)

The intersection consists of pixels in both the target and prediction regions, while the union is the area taken by both. The IoU score is calculated for each class separately, and then averaged over all classes to provide the mean IoU (mIoU) score of the semantic segmentation prediction.
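For concreteness, a small NumPy sketch of Eqs. (2)-(5) on binary masks is given below. The helper name and toy values are ours; the mIoU reported in the paper additionally averages the per-class IoU over the test set.

```python
import numpy as np

def f1_and_iou(target: np.ndarray, prediction: np.ndarray):
    """Pixel-level F1 (Eqs. 2-4) and IoU (Eq. 5) for binary masks."""
    tp = np.sum((prediction == 1) & (target == 1))
    fp = np.sum((prediction == 1) & (target == 0))
    fn = np.sum((prediction == 0) & (target == 1))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 / (1 / recall + 1 / precision)  # harmonic mean, Eq. (4)
    iou = tp / (tp + fp + fn)              # |A ∩ B| / |A ∪ B|, Eq. (5)
    return f1, iou

# Toy 4-pixel example: one true positive, one missed line pixel.
target = np.array([1, 1, 0, 0])
prediction = np.array([1, 0, 0, 0])
print(f1_and_iou(target, prediction))  # F1 ≈ 0.667, IoU = 0.5
```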
Every image has a corresponding binary mask. The augmented dataset with a total of 4156 images was split into 3 parts: 80% of samples for training (3324 images), 10% for validation (415 images) and 10% for testing (415 images). We experiment on our dataset in both grayscale and negative channels. In our experiments, the negative images showed the potential to give the best results. On further inspection, we find that when an image is transformed into its negative, the palm lines become brighter, which makes the segmentation process easier.
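The negative transform itself is a one-liner; the sketch below, with a hypothetical input path, simply inverts an 8-bit grayscale image.

```python
import cv2

gray = cv2.imread("hand.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical path
negative = 255 - gray  # dark palm lines become bright
```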
We train the vanilla U-Net, FPN and our custom network with training pairs S = {(x_i, y_i)}, where x_i is the i-th image and y_i ∈ {0, 1}^{256×256} is the mask (label) corresponding to x_i. The networks are initially set to train for 100 epochs with the Adam optimizer [27] and an initial learning rate of 0.0001. The learning rate is dropped to a fifth if the loss value does not decrease for 8 epochs. Also, if the loss value still remains unchanged after 10 epochs, the training process is automatically stopped.

With our network, all images for training, validation and testing are resized to 256 × 256, because the model trains faster with smaller images while using less memory and computational power, which fits our limited resources. The batch size for all processes is fixed to 64. The loss function is the binary cross-entropy mentioned above. We train the network for slightly more than two hours, with early stopping at epoch 36.
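This schedule maps directly onto standard Keras callbacks. The sketch below is our reading of it, monitoring the training loss (the paper says only "loss"), with a stand-in model and random tensors in place of the real U-Net-CFM and the annotated dataset.

```python
import numpy as np
import tensorflow as tf

# Stand-in 1x1-conv "model" and random tensors so the snippet runs;
# the real experiment uses the U-Net-CFM and the annotated dataset.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(1, 1, activation="sigmoid",
                           input_shape=(256, 256, 1)),
])
x_train = np.random.rand(8, 256, 256, 1).astype("float32")
y_train = (np.random.rand(8, 256, 256, 1) > 0.5).astype("float32")

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="binary_crossentropy")

callbacks = [
    # Drop the learning rate to a fifth (factor=0.2) after 8 epochs
    # without improvement in the loss.
    tf.keras.callbacks.ReduceLROnPlateau(monitor="loss", factor=0.2,
                                         patience=8),
    # Stop training after 10 epochs without improvement.
    tf.keras.callbacks.EarlyStopping(monitor="loss", patience=10),
]
model.fit(x_train, y_train, epochs=100, batch_size=64,
          callbacks=callbacks)
```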
C. System configuration

Our experiments are conducted on a computer with an Intel Core i5-7500 CPU @ 3.4GHz, 32GB of RAM, a GeForce GTX 1080 Ti GPU, and a 1TB SSD. The models are implemented with the TensorFlow framework [28] and the Keras API [29].

V. RESULT COMPARISONS

A. Results

In [3], the authors proposed a system that can handle hand images after various preprocessing techniques like desaturation, thresholding, dilation, and palm extraction by locating special points, applying interpolation, and determining the Region of Interest. With the extracted palm, they then used the Canny edge detector to get the palm lines. The Canny edge detector, developed by John F. Canny in 1986 [30], is a multi-stage algorithm designed for finding the edge regions.

We have successfully implemented the proposed algorithm in [3] for comparison, along with U-Net and FPN (with ResNet-34 and ResNeXt-50 as the backbone) and our network. Figure 4 gives more detail of the experimental results.

Fig. 4. The output images of some models applied to this problem. Our network (Unet-CFM) achieves a better result with complex palm-line input.

Table I shows the numerical result comparisons. For mIoU, our model surpasses U-Net with the 34-layer baseline ResNet backbone by 0.045 and with the ResNeXt-50 backbone by 0.049. Against FPN, U-Net-CF shows a superior improvement, with a 0.228 difference on ResNet-34 and 0.193 on ResNeXt-50. The F1 score likewise shows increments of 0.53% and 0.41% compared to U-Net, and gains of 3.8% and 3.41% compared to FPN. Our proposed model has fewer parameters (only 10,270,115) than the deep networks with ResNet-34 and ResNeXt-50 backbones. Nevertheless, as shown in the table, our network still outperforms the other approaches. Therefore, applying the CFM can intuitively be considered a promising method. It provides a new approach for palm-printing algorithms that in the future can be investigated for localization performance improvements.

TABLE I
QUANTITATIVE COMPARISON BETWEEN U-NET, FPN AND U-NET-CF

Method   | Backbone   | Params     | F1 Score | mIoU
Unet     | ResNet-34  | 24,456,299 | 98.89%   | 0.539
Unet     | ResNeXt-50 | 32,063,339 | 99.01%   | 0.535
FPN      | ResNet-34  | 25,696,459 | 95.62%   | 0.356
FPN      | ResNeXt-50 | 28,179,403 | 96.01%   | 0.391
Unet-CF  | Unet [12]  | 10,270,115 | 99.42%   | 0.584
B. Gaussian Filter

Gaussian filters [31] are usually used to blur images. Our main purpose in using this filter is to reduce image noise and excessive detail. The basic idea of the Gaussian filter is that a pixel's weight depends on its location: the pixel at the center of the kernel has the largest weight, and the weight of its neighbors decreases as their spatial distance from the center pixel increases:

G_0(x, y) = A \, e^{-\frac{(x - \mu_x)^2}{2\sigma_x^2} - \frac{(y - \mu_y)^2}{2\sigma_y^2}}

with \mu being the mean (peak) and \sigma^2 the variance in each of x and y. The parameter \sigma controls how strongly the Gaussian filter acts upon the image. The kernel size should be chosen wide enough; we selected 3 × 3 for this process.

Figure 5 shows the result with and without the Gaussian blur post-processing step. As expected, this optional step reduces disconnected random pixels being classified as of interest, but at the same time smoothens the edges of our segmentation output. As a result, the mIoU actually decreases, albeit by a very small margin. One may consider opting in to this step for a nicer-looking mask, for example if the next task in the pipeline requires it.

Fig. 5. Results with (right) and without Gaussian blur (left).
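As a sketch, this post-processing step amounts to a single OpenCV call on the predicted mask; the file names below are hypothetical.

```python
import cv2

# "mask.png" is a hypothetical output of the segmentation network.
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)
# 3x3 kernel as selected above; sigma=0 lets OpenCV derive it
# from the kernel size.
smoothed = cv2.GaussianBlur(mask, (3, 3), 0)
cv2.imwrite("mask_smoothed.png", smoothed)
```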
VI. FUTURE WORKS

Our research opens a plethora of possibilities to be considered. For example, one may be interested in exploring the idea of applying the Context Fusion Module on each of the skip connections in every tier of the U-Net architecture. Or, one could experiment with directing the encoding output of each U-Net tier into the CFM so that it has a more comprehensive interpretation of the information, which could then be distributed back to the respective decoding parts. Another direction to be considered is whether replacing U-Net with a Feature Pyramid Network (FPN)-like structure would be a good idea: the CFM would work well with a pyramid-pooling scheme; however, FPN models are traditionally used for object detection tasks rather than segmentation. Along with experimenting with other models, we can also improve our dataset, either with better processing to handle the variations of complicated background images, or simply by increasing the amount of data for our model to learn from.

VII. CONCLUSION

In this paper, we applied deep learning techniques to build neural networks that solve the palm-line segmentation problem. The final mIoU of our model is 0.584 and the F1 score is 99.42% on our dataset. This dataset was collected manually and will be distributed publicly for scientific purposes. The experimental results show that the proposed method has tremendous advantages over traditional image processing in palm-line image segmentation tasks. Future work on the present study would include a more robust method to handle the variations of complicated background images; further investigations can also be done using other functionalities of the CFM.

ACKNOWLEDGMENT

This work is partially supported by Sun-Asterisk Inc. We would like to thank our colleagues at Sun-Asterisk Inc for their advice and expertise. Without their support, this experiment would not have been accomplished.


REFERENCES

[1] L. Liu and D. Zhang, "Palm-line detection," in IEEE International Conference on Image Processing 2005, vol. 3. IEEE, 2005, pp. III-269–III-272.
[2] V. Kumar, A. Dua, H. Bansal, H. Aggarwal, A. Madan, and J. Bhatia, "A simple technique for palm recognition using major lines," The Scientific Bulletin of Electrical Engineering Faculty, vol. 17, no. 2, pp. 38–43, 2017.
[3] K.-P. Leung and N. Law, "An efficient automatic palm reading algorithm and its mobile applications development," 08 2016, pp. 1–6.
[4] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[5] S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back, "Face recognition: A convolutional neural-network approach," IEEE Transactions on Neural Networks, vol. 8, no. 1, pp. 98–113, 1997.
[6] C. Dong, C. C. Loy, and X. Tang, "Accelerating the super-resolution convolutional neural network," in European Conference on Computer Vision. Springer, 2016, pp. 391–407.
[7] M. Siam, M. Gamal, M. Abdel-Razek, S. Yogamani, and M. Jagersand, "Rtseg: Real-time semantic segmentation comparative study," in 2018 25th IEEE International Conference on Image Processing (ICIP). IEEE, 2018, pp. 1603–1607.
[8] S. Chandran, "Enhancement of palmprint using median filter for biometrics application," Maejo International Journal of Science and Technology, pp. 15–17, 01 2014.
[9] Saranraj S, Padmapriya V, Sudharsan S, Piruthiha D, and Venkateswaran N, "Palm print biometric recognition based on scattering wavelet transform," in 2016 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), 2016, pp. 490–495.
[10] B. Zhang, W. Li, P. Qing, and D. Zhang, "Palm-print classification by global features," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 43, no. 2, pp. 370–378, 2013.
[11] D. Zhang, G. Lu, W. Li, L. Zhang, and N. Luo, "Palmprint recognition using 3-d information," IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 39, no. 5, pp. 505–519, 2009.
[12] O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," 2015.
[13] O. Oktay, J. Schlemper, L. L. Folgoc, M. Lee, M. Heinrich, K. Misawa, K. Mori, S. McDonagh, N. Y. Hammerla, B. Kainz, B. Glocker, and D. Rueckert, "Attention u-net: Learning where to look for the pancreas," 2018.
[14] Z. Zhou, M. M. R. Siddiquee, N. Tajbakhsh, and J. Liang, "Unet++: A nested u-net architecture for medical image segmentation," 2018.
[15] M. Z. Alom, M. Hasan, C. Yakopcic, T. M. Taha, and V. K. Asari, "Recurrent residual convolutional neural network based on u-net (r2u-net) for medical image segmentation," 2018.
[16] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection," 2016.
[17] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," 2015.
[18] S. Hochreiter, "The vanishing gradient problem during learning recurrent neural nets and problem solutions," International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 6, pp. 107–116, 04 1998.
[19] G. Philipp, D. Song, and J. G. Carbonell, "The exploding gradient problem demystified - definition, prevalence, impact, origin, tradeoffs, and solutions," 2017.
[20] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, "Aggregated residual transformations for deep neural networks," 2016.
[21] F. Wang, M. Jiang, C. Qian, S. Yang, C. Li, H. Zhang, X. Wang, and X. Tang, "Residual attention network for image classification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 3156–3164.
[22] Q. You, H. Jin, Z. Wang, C. Fang, and J. Luo, "Image captioning with semantic attention," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 4651–4659.
[23] L.-C. Chen, Y. Yang, J. Wang, W. Xu, and A. L. Yuille, "Attention to scale: Scale-aware semantic image segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 3640–3649.
[24] M. Afifi, "Gender recognition and biometric identification using a large dataset of hand images," CoRR, vol. abs/1711.04322, 2017. [Online]. Available: http://arxiv.org/abs/1711.04322
[25] D. Systems, Supervisely. [Online]. Available: https://supervise.ly/
[26] A. Buslaev, V. I. Iglovikov, E. Khvedchenya, A. Parinov, M. Druzhinin, and A. A. Kalinin, "Albumentations: Fast and flexible image augmentations," Information, vol. 11, no. 2, 2020. [Online]. Available: https://www.mdpi.com/2078-2489/11/2/125
[27] D. Kingma and J. Ba, "Adam: A method for stochastic optimization," International Conference on Learning Representations, 12 2014.
[28] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, "Tensorflow: Large-scale machine learning on heterogeneous distributed systems," 2016.
[29] F. Chollet et al. (2015) Keras. [Online]. Available: https://github.com/fchollet/keras
[30] L. Ding and A. Goshtasby, "On the canny edge detector," Pattern Recognition, vol. 34, pp. 721–725, 2001.
[31] E. Gedraite and M. Hadad, "Investigation on the effect of a gaussian blur in image filtering and segmentation," 01 2011, pp. 393–396.

