Road Image Classification: Leonid Dashko
Road Image Classification: Leonid Dashko
Leonid Dashko
[email protected]
Institute of Computer Science, University of Tartu
Supervisor: Amnir Hadachi
Abstract. This article summarizes the different ap- Basically, a road image can be classified into a struc-
proaches for road detection and recognition from single tured (covered with asphalt and may have line markings,
image and shows the results from classifying images to e.g. a road in urban area) or unstructured roads (for
detect whether the image contains road or not. example, roads in rural area). The most commonly used
Keywords: road classification, road image datasets, approach for detecting structured road in image is to
convolution neural network, image classification. localize road borders or road markings. The main tech-
niques for detecting roads are algorithms based on:
1) Finding 2 most dominant lines (their angles are
I. Introduction usually close to 30 and 120 degrees) which can make a
trapezoid form by using Hough Line Transform or other
earching for a valid path is one of the most im-
1
Distributed System Seminar • Spring 2018
Figure 1: Example of the structure of CNN (Convolutional Neural Network) for classification task
(or specific patch/region) and its labels are used as input into 7 classes: daylight, night, rainy day, rainy night,
to a model and the output from the model is the mullti- snowy, sun stoke, images in the tunnel.
dimentional array (the same as image size) that shows – CamVid [9]. 701 labelled images of structured roads
which pixels are road or non-road. only with different variation of weather, contains
In computer vision, people widely use the CNNs (Con- videos.
volutional Neural Networks) for image classification, ob- – LabelMeFacade [10]. 945 labelled images of structured
ject detection, image segmentation tasks. This type of roads only, contains videos.
neural network will also be used in our experiment. Ex- – SUN2012. 426 images of different types of roads,
ample of the structure of CNN for classification task can does not have labels.
be seen in Figure 1. – Dataset provided by Amnir Hadachi (supervisor of this
As an evidence, the paper [4] is considered as one of work). 1,780 snowy road images captured on the
the breakthroughs in computer vision during the last snowy roads of Tartu neighbourhood, does not have
decade. In this article, the researchers used a large deep labels.
convolutional neural network to classify the 1.2 million – Dataset collected from ImageNet online image catalog by
high-resolution images in the ImageNet ILSVRC-2012 the author [11]. 494 road images of different type,
contests into the 1000 different classes and they. Their under different weather and day-time conditions,
approach performed extremely well and they achieved a does not have labels.
winning top-1 and top-5 error rates of 17.0% and 37.5% – Dataset created from Google, Bing, Yahoo images by the
accordingly using purely supervised learning. author [11]. 5,390 road images of different type, un-
CNN architecture for classification tasks can be sum- der different weather and day-time conditions, does
marized as following: not have labels.
– Receives a vector of images as input layer and pass II.i. Overview of operations in CNN
it further to hidden layers.
– Hidden layers include a set of mathematical oper- Convolutional filters (Figure 2). The convolutional fil-
ations for extracting low-level features such as ap- ters on the first step try to detect low-level features, for
plying convolution filters, activation functions, max- example, edges and curves. Later, when we apply ad-
pooling operation, resizing the layer, transforming ditional convolutional filters, they can be more complex
to fully-connected layer. as they are builded up to the already discovered fil-
– Finally, the last operation in CNN for classification ters. Usually, the CNN has more than one convolu-
problem will be taking the fully-connected layer as tional filter that ends up in a series of convolutional
input and flattening out the nodes into the vector of layers. For every convolution filter, we must specify the
size N (where N is the number of classes). Filter_size = WidthxHeight (this is how many pixels we
want to supply to filters).
Datasets. Among existing road image datasets, these
are the most-widely used:
2
Distributed System Seminar • Spring 2018
IV. Resuls
Figure 4: Main activation functions The false road images of different objects and environ-
ments which surrounded by nature (they were down-
Pooling (down sampling) layers. After activation loaded from Google, 1,000 non-road images). The model
functions, pooling layer can be applied to this output was trained and tested on 3 different datasets separately:
of activation function. This basically takes a filter (nor- 1) "KITTI" dataset [6] contains 612 images with labels of
mally its size 2x2) and a stride of the same length and only structured roads.
pull the maximum value in the filter. This serves two - 596 photos are chosen for training (roads: 484 (81.21%),
main purposes. The first is that the amount of parameters non-roads: 112 (19%).)
or weights is reduced by 75%, thus lessening the compu- - 150 photos are chosen for testing (roads: 128 (85.33%),
tation cost. The second is that it will control overfitting. non-roads: 22 (15%)).
Results of classified images: 150 total images, correctly
3
Distributed System Seminar • Spring 2018
classified 147, wrongly classified 3. Accuracy: 0.98% (it Actual/Predicted Road Non-road
seems that the model was overfitted) Road 0 0
Non-road 86 764
Actual/Predicted Road Non-road
Table 3: Confusion matrix for another test on non-road 850 images
Road 126 2 which were not included to training/testing data
Non-road 1 21
4
Distributed System Seminar • Spring 2018
References
[1] Z. Tian, C. Xu, X. Wang and Z. Yang, "Non-
parametic model for robust road recognition" IEEE
10th INTERNATIONAL CONFERENCE ON SIG-
NAL PROCESSING PROCEEDINGS, Beijing, 2010,
pp. 869-872.