This research proposes a big data analytics system to process real-time unstructured data from CCTV for traffic management, utilizing the YOLO framework and NoSQL databases. The system transforms unstructured video data into semi-structured JSON format for real-time visualization and analysis, enabling authorities to monitor traffic conditions and detect anomalies such as illegal parking. The implementation demonstrates effective vehicle counting and classification, providing insights into traffic patterns and improving management strategies.


International Conference on Data Science and Its Applications (ICoDSA)

Big Data Analytics for Processing Real-time Unstructured Data from CCTV in Traffic Management
Faqih Hamami
School of Industrial and System Engineering
Telkom University
Bandung, Indonesia
[email protected]

Iqbal Ahmad Dahlan
Military Engineering Faculty
Indonesia Defense University
Bogor, Indonesia
[email protected]

Setya Widyawan Prakosa
Electronic and Computer Engineering
National Taiwan University of Science & Technology
Taipei, Taiwan
[email protected]

Khamal Fauzan Somantri
Faculty of Mathematics and Science
Universitas Pendidikan Indonesia
Bandung, Indonesia
[email protected]

Abstract: Today many devices generate data everywhere and at any time. Data grows massively and becomes complex to handle. Unstructured data is one type of big data that is difficult to process because its attributes are not fixed. In traffic management, CCTV cameras are installed to monitor specific locations on the highway. CCTV generates unstructured data in image and video format, which is difficult to process due to its complexity. This research proposes big data analytics to process real-time unstructured data from CCTV into knowledge displayed in a web dashboard. We implement the YOLO framework with the YOLOv4 architecture and the COCO dataset for traffic flow counting and for detecting illegal parking, which is categorized as an abnormal situation. Unstructured data from CCTV is then transformed into a semi-structured JSON format. The data can also be visualized in real time to help the local authority understand the highway situation. Historical data is stored in a NoSQL database to derive further knowledge such as vehicle traffic patterns. The proposed system requires an ROI line to be drawn as the trigger for counting passing vehicles. The experiments are conducted on open online traffic CCTV from the Bali Tower public streaming service. The prototype is able to detect objects at 10 fps.

Keywords: big data, unstructured data, real-time, CCTV

I. INTRODUCTION

Big data is a term that describes a collection of data with huge volume. Data grows rapidly and has complex variety. Its characteristics include volume, velocity and variety. There are three types of data: 1) structured, 2) semi-structured and 3) unstructured. Structured data has a fixed format and can be easily processed with traditional databases such as an RDBMS. Semi-structured data has a structured form but dynamic attributes, e.g. JSON or XML, while unstructured data, such as image and video, has no known format.

Nowadays many applications generate data in huge volume, making big data a phenomenon everywhere [1]. It appears in many sectors including industry, banking, media, tourism, healthcare and traffic. These sectors produce a lot of data but do not know how to process it because the data is in unstructured format.

One of the main areas is traffic management, in which companies or governments gather image and video data from CCTV. Many CCTV cameras have been installed on the highway to monitor traffic. The data they generate is one example of unstructured data. The data is large and grows very fast in video format, so a big data solution is needed to handle it. Meanwhile, the complexity of traffic needs to be understood as soon as possible, which means unstructured data from CCTV should be processed in real time.

There are two ways to process data: 1) batch processing and 2) real-time processing [1]. Real-time processing is more complex than batch processing because the data must be processed within a short time period, or in near real time. Several domains require real-time processing, and traffic management is one of them. Real-time processing can help the authority understand the current traffic condition immediately, for example the density of the traffic. Big data analytics can help to analyze traffic in real time from CCTV and return structured reports to active officers.

II. RELATED WORK

Traffic management is a complicated task in every city. It usually requires officers to stand by in crowded areas to monitor the traffic. In addition, CCTV cameras are installed in many areas to monitor the traffic online. This paper is designed to accomplish the following missions:
1) Real-time Counting: A real-time automatic vehicle counting system is needed to capture the traffic situation in real time and manage the traffic efficiently [11].

Authorized licensed use limited to: Sai Vidya Institute of Technology. Downloaded on March 26,2025 at 10:00:51 UTC from IEEE Xplore. Restrictions apply.
2) Multiple-object (vehicle) Sensing: To count the traffic flow accurately, the vehicle counting system needs to be capable of sensing multiple objects and classifying them by vehicle type across multiple detections [11].
3) Vehicle Classification: In a real-world scenario, many types of objects are picked up by the camera. This data is important as feedback that lets the authorities set priorities [11].
4) Anomaly Detection: In a real system, anomalies such as traffic jams and illegal parking can also be detected in real time. This can improve the quality of traffic management and, hopefully, change people's behavior in using public roads [11].

Data generated by CCTV is in unstructured form, as video or images. Stream data from CCTV only displays the captured activity, without any understanding of what is happening in the real situation.

Deep learning is a recent technology for understanding images. It can recognize objects and understand the situation. Deep learning can recognize objects not only in traffic but also in other domains such as railway stations, plate recognition, fire detection, traffic incidents and fight events [2][3][4][5][6]. Deep learning algorithms include AE, CNN, DBN, LSTM, RBM, VAE, GAN and RNN [7].

In this research a CNN is used, combined with YOLO v4, to detect vehicles on the highway [12]. The YOLO architecture has good accuracy in object detection [8][12]. CNNs are also well suited to image classification with 2D input data in supervised learning [14].

Data streamed from CCTV is processed with a CNN to extract data from the unstructured dataset. CNN is a popular deep learning algorithm because of its excellent performance on machine learning problems [9]. A CNN has a structure that reduces the dimensions and filters the previous layer's output for faster computation with less information loss, which makes it possible to process original images directly. The number of layers and the number of filters in each layer can also be optimized to significantly reduce the calculation cost. Thus, the proposed algorithm achieves not only state-of-the-art performance for image processing but also faster and more efficient computation [12][13]. However, in our experimental system, vehicles sometimes fail to be detected because the system depends on the internet connection and the frame rate of the CCTV stream.

Unstructured data is difficult to process with traditional databases. A big data tool is needed for both real-time and batch processing. Several open source tools are available, such as Spark, Kafka, NoSQL databases, Hadoop and Redis [1]. MongoDB is a widely used document-oriented NoSQL database [10]. It can store unstructured text data and process it easily.

III. BASIC CONCEPT

Big data technology is the basic concept of this research. A NoSQL database is the big data engine used to store historical data, while a deep learning algorithm is used to read stream video and transform it into semi-structured data. The deep learning algorithm used is the Convolutional Neural Network (CNN).

A. NoSQL Database

A NoSQL database is a type of non-relational database that aims to handle structured and unstructured data with flexible schemas. There are several types of NoSQL databases, such as key-value, document, graph, column-oriented and search databases.

NoSQL databases arose because of the limited ability of SQL databases to store such data. A NoSQL database is very flexible and has a dynamic schema for storing structured and unstructured data. Here are a few reasons to use NoSQL instead of SQL databases:
1. NoSQL is able to handle large amounts of data of various types (structured / unstructured)
2. NoSQL has a dynamic data schema, so there is no need to define a fixed schema and its relationships at the beginning
3. NoSQL scales well and can be extended horizontally by adding machines
4. NoSQL has high performance in data storage and makes it easy to run queries with a fast response

MongoDB is a document-oriented NoSQL database. Data in MongoDB is stored in Binary JSON (BSON) format. Several MongoDB terms have counterparts in relational databases, as shown in Table 1.

TABLE 1. MONGODB TERMS
SQL Database    MongoDB Database
Database        Database
Table           Collection
Field/Column    Field
Row             Document

MongoDB is good for handling big data because it can distribute data horizontally across multiple machines and store data in multiple formats.

B. Convolutional Neural Network (CNN)

CCTV generates video, which consists of multiple images. CNNs are well suited to processing image data and are among the best-performing deep learning algorithms for this task [9].

Like other neural network algorithms, a CNN has weights, biases and activation functions. A CNN consists of multiple layers, i.e. convolutional layers, non-linear layers, pooling layers and fully-connected layers.

CNNs perform very well on image datasets, natural language processing (NLP) and computer vision [9]. They give accurate results with good computation speed.
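To make the layer types named in Section III.B concrete, the following is a minimal pure-Python sketch of one forward pass through a convolutional layer, a non-linear (ReLU) layer, a pooling layer and a fully-connected layer. It is an illustration only, not the YOLO network used in this paper; the function names, the 5x5 input, the kernel and the weights are all assumptions chosen for the example.

```python
# Illustrative forward pass through the four CNN layer types.
# All values below are made up for the example.

def conv2d(image, kernel):
    """Valid 2D cross-correlation (stride 1, no padding), as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Non-linear layer: negative activations are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def dense(vector, weights, bias):
    """Fully-connected layer: one weighted sum plus bias per output unit."""
    return [sum(v * w for v, w in zip(vector, row)) + b
            for row, b in zip(weights, bias)]

# A 5x5 "image" with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# A 2x2 kernel that responds to dark-to-bright transitions from left to right.
kernel = [[-1, 1], [-1, 1]]

features = max_pool(relu(conv2d(image, kernel)))
flat = [v for row in features for v in row]
scores = dense(flat, weights=[[1, 0, 1, 0], [0, 1, 0, 1]], bias=[0.0, 0.0])
```

Pooling halves each spatial dimension here, which is the dimension reduction the text refers to: the 4x4 convolution output becomes a 2x2 feature map before the fully-connected layer.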

IV. SYSTEM MODEL

This research proposes a traffic management prototype built on big data technologies to handle unstructured, real-time data from CCTV.

First, we train a YOLO object detection model for object classification with the COCO dataset. YOLO is a convolutional neural network (CNN) for real-time object detection. The algorithm divides the image into parts and makes predictions for each part.

COCO (Common Objects in Context) is one of the most popular image datasets for object classification, object detection, segmentation and captioning. The dataset is built from objects found in everyday human environments. As written in the original research paper, there were 91 object categories in COCO, but only 80 object categories from labeled images were released in the first publication [15]. Objects from the COCO dataset include: person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light, bird, cat, dog, etc.

After the model is trained, several steps are taken to analyze the real-time unstructured data from CCTV: 1) data extraction, 2) real-time and batch processing, and 3) front-end dashboard. The system phase architecture is shown in Figure 1.

Fig. 1. System architecture.

1. Data Extraction

Data extraction is the process of extracting data from its source. In phase one, traffic conditions are extracted from CCTV as the data source. The CCTV produces video, which consists of multiple images called frames. The YOLO object detection model is used to identify types of vehicles such as cars, buses and trucks, and even people who pass through the area captured by CCTV. An illustration of object detection on the highway is shown in Figure 2.

Fig. 2. Object detection with deep learning.

When an object has been recognized and extracted from a frame, objects of the same type are automatically counted, as seen in Figure 3.

Fig. 3. Object aggregation.

Other important features, such as date, time and CCTV ID, are added to the video attributes. Features are the important attributes to be analyzed from the video. The following is an example of the important features extracted by the object detection model from a video stream:

{"frame":"xxxx","datetime":"year-month-day hour:minute:second","object_1":"number-of-object-1","object_2":"number-of-object-2","object_n":"number-of-object-n","cctv":"xx"}

2. Real-time and Batch Processing

Two kinds of processing are used: real-time and batch. In real-time processing, the vehicles currently in view are analyzed as they appear, so the authority can understand the current vehicle counts or the traffic density. Not every analysis in traffic management has to be in real time, because real-time analysis consumes a lot of computation. Batch processing is also needed to find patterns in the historical database. Traditional databases are not well suited to storing traffic data from CCTV: the volume of data is huge, the growth velocity is very high and the data is unstructured.

Features from stream video are stored in the MongoDB database. The generated features appear in various forms with different attributes. Historical data in MongoDB can be analyzed both in real time and in batch. The following is an example of data stored in the MongoDB database.

3. Front End Dashboard

Both real-time and batch analytics are displayed on the dashboard. The dashboard is developed with a JavaScript framework to make the user interface more powerful and interactive. The web visualization phase was developed with NodeJS. It can visualize in real time the stream data extracted by deep learning or read from the NoSQL database.
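The feature record described under Data Extraction can be assembled from the per-frame detections roughly as follows. This is a minimal sketch, not the authors' implementation: the function name `build_frame_record` and its parameters are hypothetical, the detector output is assumed to be a plain list of class labels for one frame, and the `total` field is taken from the example record shown in the results rather than from the template. All values are written as strings, as in the paper's examples.

```python
import json
from collections import Counter
from datetime import datetime

def build_frame_record(frame_no, labels, cctv_id, timestamp):
    """Aggregate the class labels detected in one frame into a JSON-ready dict.

    labels: list of class names for this frame, e.g. ["car", "car", "bus"]
            (assumed shape of the detector output).
    """
    counts = Counter(labels)  # object aggregation: one count per object type
    record = {
        "frame": str(frame_no),
        "datetime": timestamp.strftime("%Y-%m-%d %H:%M:%S"),
        "total": str(sum(counts.values())),
    }
    for label in sorted(counts):
        record[label] = str(counts[label])
    record["cctv"] = str(cctv_id)
    return record

record = build_frame_record(
    frame_no=2135,
    labels=["car", "car", "bus", "truck"],
    cctv_id=1,
    timestamp=datetime(2020, 7, 9, 10, 10, 52),
)
print(json.dumps(record))
```

Because the keys depend on which object types appear in a frame, two records can have different attributes, which is the dynamic-schema case that a document database handles without a fixed table definition.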

V. RESULTS AND DISCUSSION

The experiments were conducted in an area with a huge volume of traffic, during the day and in sunny weather. The current traffic on the highway was captured by CCTV in real time. The CCTV stream comes from Jakarta Open CCTV (smartcity.jakarta.go.id/maps, CCTV Bali Tower). The vehicle classification detection model extracted important features from the stream data. A snapshot of the processed video is shown in Figure 4.

Fig. 4. Features extraction from real time traffic.

In Figure 4, several objects in the video stream from CCTV are detected in real time. Different object types are displayed with tooltips of different colors. The detected object types include bus, motorcycle, truck, car and traffic light. The important features extracted by the model are transformed into JSON format. The extracted JSON data is as follows:
{
"frame":"2135",
"datetime":"2020-07-09 10:10:52",
"total":"164",
"car":"119",
"bus":"11",
"truck":"23",
"person":"1",
"cctv":"1"
}
The extracted JSON can be automatically visualized in real time as a bar chart so that the authority can understand the information easily. The bar chart visualization is shown in Figure 5.

Fig. 5. Real-time objects detection visualization.

Historical data is also stored in MongoDB. MongoDB Compass is used as the graphical user interface for simpler database management.

Fig. 6. MongoDB Traffic Database.

The historical data in MongoDB, shown in Figure 6, is analyzed to understand traffic patterns in specific areas. As seen in Figure 7, when the authority wants to know the traffic patterns of vehicles in the CCTV 1 area, they appear as a time series in a line chart.

Fig. 7. Traffic pattern analysis.

After collecting data, the system can predict the traffic trend, which can help local authorities prepare for the situation in the next period.

Fig. 8. Traffic prediction pattern analysis.

In a real system, anomalies such as traffic jams and illegal parking can also be detected in real time, as shown in Figure 9, and are reported in JSON format. This can improve the quality of traffic management and, hopefully, change people's behavior in using public roads.
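The paper does not state how illegal parking is decided, so the following is only one plausible sketch under stated assumptions: a tracker (not shown) is assumed to give each vehicle a stable ID and a centroid per frame, a rectangular no-parking zone is assumed to be drawn by the operator, and a vehicle is flagged once it has stayed inside the zone for a threshold number of seconds. The class name `ParkingMonitor` and all parameters are hypothetical.

```python
class ParkingMonitor:
    """Flags tracked vehicles that stay inside a no-parking zone too long.

    zone: (x_min, y_min, x_max, y_max) in pixels (assumed rectangular).
    max_seconds: dwell time after which a vehicle counts as illegally parked.
    """

    def __init__(self, zone, max_seconds=60.0):
        self.zone = zone
        self.max_seconds = max_seconds
        self.first_seen = {}  # track_id -> time the vehicle entered the zone

    def _inside(self, x, y):
        x_min, y_min, x_max, y_max = self.zone
        return x_min <= x <= x_max and y_min <= y <= y_max

    def update(self, timestamp, tracks):
        """tracks: {track_id: (cx, cy)} for the current frame.

        Returns the sorted IDs whose dwell time has reached max_seconds.
        """
        flagged = []
        for track_id, (cx, cy) in tracks.items():
            if self._inside(cx, cy):
                start = self.first_seen.setdefault(track_id, timestamp)
                if timestamp - start >= self.max_seconds:
                    flagged.append(track_id)
            else:
                # Leaving the zone resets the dwell timer.
                self.first_seen.pop(track_id, None)
        # Forget vehicles that are no longer tracked at all.
        for track_id in list(self.first_seen):
            if track_id not in tracks:
                del self.first_seen[track_id]
        return sorted(flagged)
```

A caller would invoke `update` once per processed frame with the frame time in seconds and could then add the flagged IDs to the JSON report for that frame.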

Fig. 9. Illegal traffic parking detection anomaly case.
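The batch analysis of historical records behind the traffic pattern chart in Figure 7 can be sketched as a grouping of stored records by hour. This is an illustration under assumptions: the records are taken to be dictionaries shaped like the JSON example in this section (string values, a `datetime` and a `cctv` field), the paper does not say whether counts are per frame or cumulative, so the sketch treats them as per-frame and averages them, and the function name `hourly_mean` is hypothetical. In the described system the records would come from a MongoDB collection rather than an in-memory list.

```python
from collections import defaultdict
from datetime import datetime

def hourly_mean(records, cctv_id, field="total"):
    """Average a numeric field per hour for one CCTV, as (hour, mean) pairs."""
    buckets = defaultdict(list)
    for rec in records:
        if rec.get("cctv") != cctv_id:
            continue
        ts = datetime.strptime(rec["datetime"], "%Y-%m-%d %H:%M:%S")
        hour = ts.replace(minute=0, second=0)
        # Values are stored as strings in the example record, so convert.
        buckets[hour].append(int(rec.get(field, 0)))
    return [(hour, sum(vals) / len(vals)) for hour, vals in sorted(buckets.items())]

# Made-up sample records in the shape of the extracted JSON.
sample = [
    {"datetime": "2020-07-09 10:10:52", "total": "164", "cctv": "1"},
    {"datetime": "2020-07-09 10:40:00", "total": "100", "cctv": "1"},
    {"datetime": "2020-07-09 11:05:00", "total": "50", "cctv": "1"},
    {"datetime": "2020-07-09 10:15:00", "total": "999", "cctv": "2"},
]
series = hourly_mean(sample, cctv_id="1")
```

The resulting list of (hour, value) pairs is the kind of time series a line chart on the dashboard can plot directly; passing `field="car"` would give the same series for a single vehicle type.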

VI. CONCLUSIONS
Deep learning algorithms and NoSQL databases are big data technologies that can process unstructured data in real time. They can be very useful for understanding traffic conditions and helping officers monitor the highway. The proposed prototype is able to detect objects such as cars, trucks and buses and to aggregate them by vehicle type. The YOLO v4 model was trained with the COCO dataset to classify objects in highway traffic. It can also analyze normal and abnormal situations in real-time unstructured data. Running on a Dell Inspiron with a 1050 Ti GPU and an Intel i7 processor, the system can monitor real-time CCTV streaming at 10 fps.

REFERENCES
[1] T. Kolajo, O. Daramola, and A. Adebiyi, “Big data stream
analysis: a systematic literature review,” J. Big Data, 2019.
[2] I. A. Dahlan and F. Hamami, “Big Data Implementation of Smart
Rapid Transit using CCTV Surveillance,” no. December, pp. 0–5,
2019.
[3] S. Prabhu, S. Kalambur, and D. Sitaram, “Live Stream Videos,” pp.
2359–2365, 2017.
[4] K. Muhammad and J. Ahmad,
“Convolutional Neural Networks Based Fire Detection in
Surveillance Videos,” IEEE Access, vol. 6, pp. 18174–18183,
2018.
[5] A. P. Shah, J. Lamare, T. Nguyen-anh, and A. Hauptmann,
“CADP: A Novel Dataset for CCTV Traffic Camera based
Accident Analysis,” 2018 15th IEEE Int. Conf. Adv. Video Signal
Based Surveill., no. i, pp. 1–9.
[6] M. Perez and A. C. Kot, “Detection of Real-World Fights in
Surveillance Videos,” pp. 2662–2666, 2019.
[7] M. Mohammadi and A. Al-Fuqaha,
“Deep Learning for IoT Big Data and Streaming Analytics: A
Survey,” vol. X, no. X, pp. 1–40, 2018.
[8] J. Redmon, “YOLOv3: An Incremental Improvement.”
[9] S. Albawi and T. A. Mohammed, “Understanding of a
Convolutional Neural Network,” 2017 Int. Conf. Eng. Technol., pp.
1–6, 2017.
[10] I. Mearaj, “Data Conversion from Traditional Relational Database
to MongoDB using XAMPP and NoSQL,” 2018 Fifth HCT Inf.
Technol. Trends, pp. 94–98, 2018.
[11] J. Lin and M. Sun, “A YOLO-based Traffic Counting System,”
2018 Conf. Technol. Appl. Artif. Intell., pp. 82–85, 2018.
[12] C. Wang and H. M. Liao, “YOLOv4: Optimal Speed and Accuracy
of Object Detection.”
[13] J. Yamanaka, S. Kuwashima, and T. Kurita, “Fast and Accurate
Image Super Resolution by Deep CNN with Skip Connection and
Network in Network,” pp. 1–9.
[14] M. Mohammadi and A. Al-Fuqaha,
“Deep Learning for IoT Big Data and Streaming Analytics: A
Survey,” vol. X, no. X, pp. 1–40, 2018.
[15] T. Lin, C. L. Zitnick, and P. Dollár, “Microsoft COCO: Common
Objects in Context,” pp. 1–15.
