An Analytical Approach for Enhancing the Automatic Detection and Recognition of Skewed Bangla License Plates
Abstract—Although there has been a large body of work on Bangla license plate detection and recognition, the successes of these works have largely been limited to the correct detection and recognition of undistorted license plates whose images are taken chiefly from the front or the back of vehicles with slight angular variations. As a result, most Bangla automatic license plate recognition (ALPR) systems in practice struggle when the license plates are skewed on the viewing plane or the image plane. In this paper, we address this issue by proposing an analytical approach that can enhance the ALPR of both normal and skewed license plates and can be incorporated into existing Bangla ALPR systems without modifying their internal structures. Specifically, we demonstrate how existing ALPR systems can be treated as black boxes and analyzed to understand what sort of license plate images they work best on, and we introduce a novel pipeline that combines deep learning and an algorithmic procedure for transforming images of both normal and skewed license plates into formats that are best suited for the ALPR systems. We note that our proposed method can be easily generalized and applied to non-Bangla license plates as well.

Index Terms—Bangla, Automatic License Plate Recognition, Deep Learning

* denotes equal contribution.

Authorized licensed use limited to: University of Glasgow. Downloaded on June 03,2020 at 22:23:22 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

Correct recognition of vehicle license plates has numerous use cases, including penalizing irresponsible driving and parking, keeping track of vehicles entering and leaving parking lots, and identifying vehicle ownership. While work on automatic detection and recognition of license plates has been done for decades, there is still a lot of progress to be made on correct Bangla automatic license plate recognition (ALPR). The variance that exists in Bangla license plates owing to the existence of different metro and vehicle types, together with the scarcity of adequate samples to capture all of that variety, makes Bangla ALPR especially challenging. A Bangladeshi company named HeadBlocks is working with the Dhaka Metropolitan Police (DMP) to address this and has developed a commercial Bangla ALPR system with a four-stage pipeline that includes license plate detection, character segmentation, character detection, and character recognition. Although the ALPR system developed by HeadBlocks (referred to as HB-ALPR henceforth) has over 96% accuracy as determined by experiments on their confidential test set, HB-ALPR finds it difficult to correctly identify license plates that are skewed by more than 30° on the image plane (ip-skewed) or the viewing plane (vp-skewed). Examples of such ip-skewed and vp-skewed license plates are shown in Fig. 1.

(a) IP-Skewed Ex. I (b) IP-Skewed Ex. II (c) VP-Skewed Ex. I (d) VP-Skewed Ex. II
Fig. 1. Examples of Different Types of Skewed License Plate Images

Since HB-ALPR has been commercially deployed and cannot be modified without affecting all four stages of its pipeline and requiring extensive testing, we approached the problem of correctly identifying skewed license plates by treating HB-ALPR as a black box. Concretely, we first analyzed what kind of license plate images are ideal for HB-ALPR, and then, by combining a deep learning method with an algorithmic procedure, we developed a pre-processing step for the license plate images that led to a performance improvement of HB-ALPR for both skewed and challenging non-skewed images.

The rest of the work is organized as follows. In Section II, we describe some related work done on Bangla and non-Bangla ALPR. In Section III, we describe our data set, the analysis we did to understand the strengths and weaknesses of HB-ALPR, and the method we developed to pre-process the images to the best format for HB-ALPR. In Section IV, we show how our method affected the performance of HB-ALPR, and then in Section V, we present our conclusions and avenues of future work.

II. RELATED WORK

While there have been a number of works on Bangla ALPR based on algorithmic approaches such as [1] (which focused on license plate detection only) and [2] (which used template matching), shallow machine learning methods such as [3] (which used support vector machines), and deep learning such as [4] (which used convolutional neural networks), none of these works present extensive evaluations of their performances on skewed license plate images. Moreover, to the best of our knowledge, no work has been done so far to augment
the performances of existing Bangla ALPR systems through pre-processing their input images while treating the systems as black boxes.

As in the case of Bangla ALPR systems, most ALPR systems developed for non-Bangla license plates such as [5], [6], [7], and [8] do not demonstrate good performances or show thorough assessments on skewed license plates.

Silva and Jung [9], however, focused on building a complete ALPR system robust to skewness of license plates, which comprised three main steps: vehicle detection, license plate detection and unwarping, and optical character recognition (OCR). In this work, they introduced a novel network named Warped Planar Object Detection Network (WPOD-NET) that searches for license plates in the region of each detected vehicle and computes parameters for an affine transformation that enables the rectification of each detected license plate to a rectangular frontal view. Our proposed solution is similar to WPOD-NET in that it too produces a rectangular frontal view of the detected license plate, but it does so in a different way. In addition, our proposed approach is applicable to any ALPR system and is not confined to any specific pipeline as WPOD-NET is.

III. METHODOLOGY

A. The Dataset

Our dataset consisted of around 3000 license plate images of varying sizes taken from different perspectives using different cameras. For uniformity, we rescaled all images to 400 × 400 because the character segmentation and recognition networks of HB-ALPR were trained using images of that size. While resizing images, we kept their original aspect ratios and added black padding where necessary. An example set of such images after converting them to the size 400 × 400 is shown in Fig. 2.

We then used three annotators to label the images as vp-skewed, ip-skewed, or normal (all being mutually exclusive) and used the majority vote of the annotators to decide the final label of each image. Next, we randomly selected 1015 images and created mask annotations of the license plates in these images using a tool called VIA [10].

B. The Proposed Solution

Our proposed method of pre-processing license plate images consists of two steps: performing instance segmentation of license plates on images and then transforming the segmented license plates into uniform rectangular views. The details of each of the steps are provided in the following sub-subsections.

1) Instance Segmentation of License Plates: We conducted transfer learning on an existing Mask R-CNN [11] model and fine-tuned it for Bangla license plates to determine exactly which pixels a license plate consisted of and to regress the smallest rectangular bounding box that contained the entirety of the license plate.

To train our Mask R-CNN, we divided the 1015 mask-annotated license plate images into a training set of 800 images and a validation set of 215 images. By selecting a learning rate of 0.001 and loading 2 images to an NVIDIA K80 GPU at a time, we trained our model for 100 epochs with the batch size set equal to 100. For the rest of the hyperparameters, we used the same values as mentioned in [11]. The outputs of our Mask R-CNN model are shown in Fig. 3.

2) Perspective Transformation of License Plates: Using the predicted masks from our Mask R-CNN model, we generated the Shi-Tomasi Corners [12] reconstructed from Harris Corners [13] of each license plate instance. Through our empirical studies, we found that padding the mask by 10 pixels led to

(a) 400x400 (b) 600x600 (c) 332x400 (d) 3840x2160
Fig. 2. Images of Different Sizes after Conversion to the Size 400x400
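The resizing rule from the dataset subsection (keep the aspect ratio, add black padding to reach 400 × 400) can be illustrated with a small geometry helper. This is only a sketch: the function name `letterbox_geometry` is hypothetical, and splitting the padding evenly between the two sides is an assumption, since the text states only that black padding was added where necessary.

```python
def letterbox_geometry(width, height, target=400):
    """Return (new_w, new_h, left, top, right, bottom) for an
    aspect-preserving resize into a target x target canvas.

    Assumption: padding is centered; the paper does not say where
    the black padding is placed.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    longest = max(width, height)
    # Scale so that the longer side becomes exactly `target` pixels.
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    pad_w = target - new_w
    pad_h = target - new_h
    left, top = pad_w // 2, pad_h // 2
    return new_w, new_h, left, top, pad_w - left, pad_h - top
```

For the source sizes listed in Fig. 2, a 3840x2160 image would be scaled to 400x225 and padded with 87 rows above and 88 rows below under these assumptions, while a 332x400 image would keep its size and receive 34 columns of padding on each side.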
(a) IP-Skewed Ex. I (b) IP-Skewed Ex. II (c) VP-Skewed Ex. I (d) VP-Skewed Ex. II
Fig. 3. Instance Segmentation of the License Plates in Fig. 1 Using Our Mask R-CNN Model (Best Viewed in Color)
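The corner step of the perspective transformation relies on the Shi-Tomasi [12] and Harris [13] detectors. As background, both detectors score a pixel from the same 2 × 2 structure tensor of image gradients and differ only in the score they derive from it. The sketch below computes both scores for one tensor. It is not the authors' implementation; the function name is hypothetical and `k = 0.04` is a commonly used Harris constant assumed here.

```python
import math


def structure_tensor_scores(sxx, sxy, syy, k=0.04):
    """Return (harris_response, shi_tomasi_score) for the symmetric
    structure tensor [[sxx, sxy], [sxy, syy]].

    The entries are windowed sums of gradient products around a pixel.
    """
    trace = sxx + syy
    det = sxx * syy - sxy * sxy
    # Closed-form smaller eigenvalue of a symmetric 2x2 matrix.
    half_gap = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    lambda_min = trace / 2.0 - half_gap
    # Harris combines determinant and trace; Shi-Tomasi uses lambda_min.
    harris = det - k * trace * trace
    return harris, lambda_min
```

A tensor with two large eigenvalues (for example `sxx = syy = 10`, `sxy = 0`) scores high on both measures, whereas an edge-like tensor (`sxx = 10`, `syy = 0`) yields a negative Harris response and a Shi-Tomasi score of zero.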
Fig. 4. Accuracy Distribution for Mixture of Good and Challenging Images

where

\[ f_1(x, y) = \frac{M_{11} x + M_{12} y + M_{13}}{M_{31} x + M_{32} y + M_{33}} \tag{9} \]

\[ f_2(x, y) = \frac{M_{21} x + M_{22} y + M_{23}}{M_{31} x + M_{32} y + M_{33}} \tag{10} \]
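Eqs. (9) and (10) can be evaluated directly for a single point. The sketch below assumes that M is stored as a 3 × 3 nested list with `M[i-1][j-1]` holding M_ij, and that f1 and f2 give the transformed x and y coordinates, as in a standard perspective warp. The function name `apply_perspective` is hypothetical.

```python
def apply_perspective(M, x, y):
    """Map the point (x, y) through the 3x3 matrix M using Eqs. (9) and (10)."""
    # Shared denominator: M31*x + M32*y + M33.
    denom = M[2][0] * x + M[2][1] * y + M[2][2]
    if denom == 0:
        raise ZeroDivisionError("point maps to infinity under M")
    f1 = (M[0][0] * x + M[0][1] * y + M[0][2]) / denom  # Eq. (9)
    f2 = (M[1][0] * x + M[1][1] * y + M[1][2]) / denom  # Eq. (10)
    return f1, f2
```

With the identity matrix every point maps to itself, and when the last row is (0, 0, 1) the mapping reduces to an affine transformation. A non-zero M31 or M32 makes the denominator depend on the point, which is what allows the mapping to undo perspective skew.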
(a) IP-Skewed Ex. I (b) IP-Skewed Ex. II (c) VP-Skewed Ex. I (d) VP-Skewed Ex. II
Fig. 5. The Deskewed Versions of the License Plates in Fig. 1
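Deskewing of this kind needs a matrix M that sends the four detected plate corners to the corners of an upright rectangle. One common way to obtain such a matrix, assumed here rather than taken from the paper, is to fix M33 = 1 and solve the eight linear equations given by four point correspondences. The helper names and the 400 × 200 output rectangle in the usage note are hypothetical.

```python
def solve_linear(A, b):
    """Solve A h = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    h = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * h[c] for c in range(r + 1, n))
        h[r] = (aug[r][n] - s) / aug[r][r]
    return h


def perspective_matrix(src, dst):
    """Estimate a 3x3 matrix M (with M33 = 1) mapping four src points to dst."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four point pairs are required")
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each pair contributes one equation for u and one for v.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]
```

For example, `perspective_matrix([(50, 80), (350, 40), (360, 200), (40, 260)], [(0, 0), (400, 0), (400, 200), (0, 200)])` would give a matrix that sends a skewed quadrilateral to a 400 × 200 rectangle. The corner ordering must be consistent between the two lists.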