Batch - 5 Base Paper

Quantum enhanced AI to predict the traffic flow

Uploaded by

ABISHAKE S

Journal of Traffic and Transportation Engineering (English Edition) 2024; 11 (1): 1-15

Available online at www.sciencedirect.com

ScienceDirect

journal homepage: www.keaipublishing.com/jtte

Original Research paper

Solving traffic data occlusion problems in computer vision algorithms using DeepSORT and quantum computing

Frank Ngeni*, Judith Mwakalonge, Saidi Siuhi


Department of Engineering, South Carolina State University, Orangeburg, SC 29117, USA

highlights

• The YOLOv5 model was used for detection and the DeepSORT model for tracking to study the vehicle occlusion problem.
• The power of quantum computing with the alternating direction method of multipliers (ADMM) optimizer was leveraged.
• The multiple object tracking accuracy (MOTA) increased by 16% over the regular YOLOv5-DeepSORT model.
• A 6% increase in multiple object tracking precision (MOTP) and a 6% increase in the identification metrics (F1) score were observed.

Article info

Article history:
Received 15 August 2022
Received in revised form 13 May 2023
Accepted 15 May 2023
Available online 24 January 2024

Keywords:
Traffic classification
Traffic counting
DeepSORT
YOLOv5
Quantum computing

Abstract

Inaccuracies of traffic sensors during traffic counting and vehicle classification have persisted, prompting transportation agencies to calibrate sensors periodically. Detection of multiple objects, heavy occlusions, and similar appearances in congested places are some causes of computer vision model inaccuracies. This paper used the YOLOv5 model for detection and the DeepSORT model for tracking objects. Because the reported problem is caused by many misses and mismatches, the power of quantum computing with the alternating direction method of multipliers (ADMM) optimizer was leveraged. A basic Kalman filter and the Hungarian algorithm features were used in combination with a quantum optimizer to present robust multiple object tracking (MOT) algorithms. This hybrid combination of the classical and quantum model has sped up the learning of occlusions during frame matching of tracks and detections by generating a minimum quantum cost function value. Comparisons with the existing models indicated that the primary MOT metric, multiple object tracking accuracy (MOTA), increased by 16% over the regular YOLOv5-DeepSORT model when using a quantum optimizer. Also, a 6% increase in multiple object tracking precision (MOTP) and a 6% increase in the identification metrics (F1) score were observed using the quantum optimizer, with identity switches reduced from 6 to 4. This model is expected to assist transportation officials in improving the accuracy of traffic counts and vehicle classification and reduce the need for regular computer vision software calibration.

* Corresponding author.
E-mail addresses: [email protected] (F. Ngeni), [email protected] (J. Mwakalonge), [email protected] (S. Siuhi).
Peer review under responsibility of Periodical Offices of Chang'an University.
https://doi.org/10.1016/j.jtte.2023.05.006
2095-7564/© 2024 Periodical Offices of Chang'an University. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co. Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

The advances in object tracking technologies have widened the importance of computer vision applications for in-vehicle navigation, video surveillance, computer-human interaction, and autonomous vehicle development (Ali et al., 2014). Artificial intelligence (AI) and machine learning (ML) have become essential in these applications, especially in object classification (Liu and An, 2020), face recognition (Hashmi et al., 2021), and object detection (Lin et al., 2017). Also, speed tracking of vehicles has become easier thanks to motion recognition (Dong et al., 2021) and target tracking algorithms. These advances in intelligent transportation systems (ITS) have enabled vehicle detection, alerting, counting, and classification in various conditions (Kamkar and Safabakhsh, 2016; Ngeni et al., 2022). The technologies allow vehicles to be counted and classified by extracting features such as vehicle length in their corresponding time-spatial image and the correlation or association computed from the co-occurrence matrix of the vehicle image with a bounding box.

The challenges with object detection and tracking are illustrated in Fig. 1, where occlusion becomes a problem when tracking vehicles on a congested roadway. As the vehicles move, their motion and their proximity to other vehicles change, causing the machine to lose track. Fig. 1 shows the bounding boxes, the movements of vehicles across different frames, and what the computer algorithm is expected to detect.

Despite technological advancement, object detection and tracking face many challenges due to appearance changes, scale changes, distractions, illumination changes, motion changes, detection of multiple objects, and most importantly, occlusion (Ali et al., 2014; Hong and Prozzi, 2006; Koller et al., 2005; Li and Ghosh, 2020). Occlusion can be classified as object self-occlusion or inter-object occlusion, partial occlusion or total occlusion, and long-term occlusion or short-term occlusion (Ali et al., 2014).

Various scholars have studied solutions for handling occlusions, but most fail to handle total occlusions (Han et al., 2009; Shu et al., 2012). This failure is to some extent attributed to the fact that the vehicles furthest from the camera view are often obstructed if traffic is heavy; hence poor performance in counting is to be expected (Pancharatnam and Sonnadara, 2008). Also, past studies show that vehicle counting and classification systems require periodic calibrations due to these persistent errors (Memon et al., 2018).

In the last decade, ML and AI have developed to the extent that they require higher training and testing capabilities. Quantum computers are considered a viable option to tackle this limitation and many performance constraints due to their faster speed in execution (Dilmegani, 2022). But due to their unreliability and immature development, hybrid quantum-classical algorithms (a combination) have been used to exploit the abilities of quantum computers (Radonjić et al., 2012). In hybrid quantum-classical ML, the quantum algorithm (containing ansatz algorithms) prepares quantum states according to its inputs, and measurement outcomes are fed to the classical algorithm to calculate the cost function (Luca, 2021). During this stage, learning algorithms adjust the outputs from the quantum algorithm to minimize the cost. The updated output is then fed back to the classical algorithm as the loop continues, as shown in Fig. 2.

As different states struggle to eliminate traffic volume estimation errors from the stations, the need for accurate data is more pressing. High-accuracy traffic data from weigh-in-motion (WIM) is instrumental in traffic and pavement engineering. Data such as traffic volume are used in highway and pavement structure designs, while detection models can be used further in pavement distress detection (Li et al., 2011; Ruseruka et al., 2023). The use of quantum computers will help transportation agencies improve the efficiency of data collection, analysis, and real-time dissemination of traffic information to the traveling public and other agencies such as highway patrol and enforcement.

The objectives of this study are twofold. First, to reduce the identity switches, misses, and mismatches during vehicle counting and classification. Second, to identify factors influencing the cost function value during detections and model predictions. In the model development, this study addressed occlusion-related inaccuracy using a quantum optimizer to optimize the objective function in the regular DeepSORT algorithm with the YOLOv5 detector. A quantum alternating direction method of multipliers (ADMM) optimizer minimizes the cost function and releases its value immediately, allowing faster tracking before identities switch (Gambella and Simonetto, 2020). This study employs the YOLOv5 detection and DeepSORT tracking algorithms because they can incorporate object features during the tracking process.

This study is organized into six sections. Section 2 presents a detailed overview of past works related to object occlusions and solutions. Section 3 presents the overall model architecture and its components, from datasets used and custom training to classical and quantum model descriptions. Section 4 presents model tests for short-term and long-term video duration. Section 5 presents and discusses the results and performance comparison of the model with other DeepSORT YOLO models, and Section 6 presents conclusions and recommendations.

2. Literature review

In deep learning, deep convolutional neural networks (CNNs) are capable of learning image features for classification (Eslami and Yun, 2023; Zhao et al., 2019). Region-CNN (R-CNN) uses

Fig. 1 - Vehicle tracking. (a) Complex scenario of tracking vehicles in congested roadway condition. (b) Detection using bounding boxes. (c) Movements of vehicles across different frames and what the machine detects during losses. (d) The expectation for the machine to detect across the frames.

Fig. 2 - Quantum algorithm and classical algorithm connection.



selective region search on a fixed image size, but region-based fully convolutional networks (R-FCN), the feature pyramid network (FPN), and Mask R-CNN have improved the feature extraction, selection, and classification capabilities of CNNs in various ways (Girshick et al., 2014; He et al., 2017). These are all termed two-stage detection methods. The one-stage methods include the single shot multibox detector (SSD) and the well-known YOLO model frameworks (Liu et al., 2016; Redmon et al., 2016).

Most multi-object tracking models have employed detection-based tracking (DBT) and detection-free tracking (DFT) for object initialization (Song et al., 2019). The more prominent DBT method utilizes background modeling to detect moving objects in video frames before tracking starts because it considers the problems of similarity of inter-frame objects and intra-frame objects. Based on this, multiple object tracking (MOT) has been simplified and classified into two types: online and offline tracking. Online tracking processes two frames at a time and suits real-time applications, but it has difficulty recovering from occlusions. Offline tracking processes a batch of frames; it recovers robustly from short-term occlusions and is suitable for video analysis. The latter is not suitable for real-time applications, but most MOT methods depend on it for initialization (Milan et al., 2017). Simple online and realtime tracking (SORT) is representative of the online method, which is based on rudimentary data association and state estimation techniques such as the Kalman filter (KF) and the Hungarian algorithm for the tracking (Bewley et al., 2016). Moreover, Milan et al. (2016) proposed the first online MOT algorithm based on deep learning, with high performance on the benchmarks.

In solving the occlusion problem, Han et al. (2009) used the Kanade-Lucas-Tomasi (KLT) feature to represent an object and a trajectory estimation algorithm that considers the weighting function of tracked features. This function achieved desirable results by removing tracking errors for partially occluded objects only (Han et al., 2009). Later, Shu et al. (2012) developed a more discriminant model, robust against appearance changes and occlusions, by excluding parts of the background within the detection window; however, it also failed with total occlusions (Shu et al., 2012). Another study by Gabriel et al. (2003) approached the problem by splitting it into two groups, namely merge-split (MS) and straight-through (ST). In the MS approach, once blobs are declared occluded, the system merges them into a new blob characterized by new attributes until they split again (Gabriel et al., 2003). This creates the problem of identifying the objects again after splitting, which MS does not address. The ST approach, in contrast, builds a model for each object and does not suffer from the MS problems, since it classifies any pixel in the occlusion region as belonging to one of the occluded objects. However, studies suggest it is still insufficient (Gabriel et al., 2003).

In a further study on the occlusion problem, Huang and Essa (2005) presented an idea of object permanence to reason about occlusions by using region-level and object-level tracking. At the region level, a genetic algorithm is used to find optimal region tracks by associating foreground (FG) regions from respective frames. At the object level, every object is located based on adaptive appearance models, spatial distributions, and inter-occlusion relationships (Huang and Essa, 2005). One disadvantage of the model is that it makes no motion, shape, or size assumptions, although it can perceive the persistence of occluded objects even when they re-emerge from occlusion. Ignoring the motion, shape, and size assumptions was found to be insufficient (Huang and Essa, 2005). Later, the idea of object permanence was extended by Papadourakis and Argyros (2010), whose model did not require prior training to account for shape, size, or motion differences. The model autonomously and dynamically builds appropriate object representations. It addressed how to model objects, produced powerful data association mechanisms, and also addressed how to handle long-term occlusions (Papadourakis and Argyros, 2010).

During the last decade, detection has improved significantly using the SORT algorithm (Kline et al., 2019). DeepSORT improved on it further: the cost used during the first matching step on frames is set as a combination of the Mahalanobis and cosine distance functions (Hou et al., 2019; Wojke et al., 2017). For the persisting occlusion problem in multi-object tracking, Wojke et al. (2017) proposed the DeepSORT algorithm by introducing cascade matching based on SORT and matching the association process between the prediction and detection frame of the targets with the Hungarian algorithm. Cascading reduced the occlusion problem, but low light intensity still led to misses during detection (Wojke et al., 2017). Further improvements on DeepSORT have led to the introduction of acceleration parameter components and global trajectory generation mechanisms, but only a slight improvement has been reported (Chen et al., 2020). The current performance comparison between YOLOv3 and YOLOv5 indicates an improved performance in the tracking of objects using the DeepSORT algorithm and MOT-17 datasets (Gai et al., 2021). Table 1 shows the performance metrics comparison between YOLOv3 with DeepSORT and YOLOv5 with DeepSORT.

Table 1 - Performance metrics comparison between YOLOv3 and YOLOv5.

Clear MOT metrics    MOTA (%)   IDF1 (%)   FP     FN     IDS
YOLOv3-DeepSORT      50.2       69.5       1971   1413   29
YOLOv5-DeepSORT      60.1       76.0       156    227    6

Using quantum computing, scholars have studied solving the occlusion problem using state-of-the-art non-maximum suppression (NMS) to enable the image-retrieving process while removing redundant objects. The process is based on eliminating false positives by keeping frames with the highest detection scores. However, it suffers during detection under occlusion, where true positives with lower detection scores are suppressed (Li and Ghosh, 2020). Quantum computing can be used to remove redundant detections in the quadratic unconstrained binary optimization (QUBO) framework, with scores from every bounding box and the overlap ratio between pairs of bounding boxes optimized (Hu and Ni, 2019; Li and Ghosh, 2020). The generated QUBO optimization problem was

solved with the proposed quantum-soft QUBO suppression (QSQS) algorithm for fast and accurate detection by exploiting quantum computing powers. However, its accuracy was 75.11% for the PASCAL VOC 2007 datasets (Li and Ghosh, 2020). It is well documented that classical computers take longer than quantum computers as the complexity of a problem increases (Abbas et al., 2021). Given this, quantum computing was utilized to improve the tracking accuracy and to release the cost function value faster, reducing identity switching during occlusion.

From the literature, occlusion still poses significant challenges in traffic data collection and analysis. There is a need to find solutions to reduce occlusion-related problems in traffic data, namely identity switches, misses, and mismatches. This study aims to solve the occlusion problem using the YOLOv5 model with the DeepSORT algorithm and quantum computing powers. The main objective of this research is to reduce occlusion problems and provide a robust traffic counting and classification model. Specifically, the following are the main contributions of this study.

(1) To develop a new vehicle classification model to reduce vehicle misclassification errors. The newly developed model is essential for designing highway and pavement structures.
(2) To incorporate deep machine learning and artificial intelligence technologies to minimize the current classical cost function value by adding a quantum optimizer to the DeepSORT algorithm. These new technologies will significantly reduce computational time and occlusion errors in traffic data, especially vehicle counts and classification.

3. Model architecture

3.1. Dataset

This study classified vehicles into eight classes. However, no vehicle image dataset exists specifically for this classification; hence the Google dataset was used. These data included two-axle trucks, three-axle trucks, four-axle trucks, five-axle trucks, six-axle trucks, motorcycles, cars, buses, and airplanes, making a total of nine labels. The images were manually annotated using the tool Make Sense (Make Sense, 2022), freely available online. Nine labels were assigned to images to represent the eight vehicle categories and airplanes, and the annotations were exported in a format accepted by YOLOv5. Fig. 3 shows the label data distribution among the classes. The number of instances of cars, buses, motorcycles, and airplanes was higher than that of the other vehicle categories, and a total of 2946 images were used.

Because the image datasets for some classes were inadequate, the data augmentation offered by YOLOv5 was a useful way to increase the amount of data. Data augmentation applies transformations to the dataset, which may be simple (such as flipping) or complex (such as style transfer), to overcome the persistent need for large datasets. Augmentation is enabled through scaling, color space adjustments, and novel mosaic augmentation. It is well documented that the proper performance of detection models requires a large amount of data (Shorten and Khoshgoftaar, 2019). This requirement arises because more layers need more examples for transfer learning. Through augmentation, efficient image transformation operations were achieved by

Fig. 3 - Number of instances for each class.

Fig. 4 - Image data augmentation process.

generating a powerful image augmentation interface. Fig. 4 shows how the three-axle truck image was augmented.

3.2. Model training

The dataset was split into two sets, with 80% for training and 20% for testing or validation. An additional 300 images were used as background images without labels to reduce false negatives (FN) and false positives (FP) and so increase the model's accuracy. This set was also split into 80% and 20%; the first part was included in the training set and the latter in the validation set. The model was trained on a graphics processing unit (GPU) machine with an NVIDIA GeForce GTX 1650 and CUDA version 11.6. To achieve desirable results, the model was trained at different parameter settings, starting from a default image size of 416 pixels, a batch size of 40, 100 epochs, and a learning rate of 0.01. These parameters were adjusted to obtain optimal values with the stochastic gradient descent (SGD) optimizer.

3.3. The YOLOv5 detection model

The YOLOv5 system architecture used for the detection consists of three parts, namely, the backbone (cross stage partial darknet (CSPDarknet)), the neck (path aggregation network (PANet)), and the head (YOLO layer) (Gai et al., 2021). Data were fed to the CSPDarknet for feature extraction through a cross-stage hierarchy, then to the PANet for feature fusion, which enhances the instance segmentation process by preserving spatial information. Finally, the YOLO layer produces detection results: class, score, location, and size.

The model input adopted mosaic data enhancement with adaptive anchor frame calculation and adaptive image scaling for different datasets, so that the data can be fed easily to the later parts of the model. The backbone has a focus structure, and two CSP structures are designed, one applied in the backbone and the other in the neck. The neck connects the backbone and the head and simplifies the collection of feature maps.

3.4. The DeepSORT target-tracking algorithm

The DeepSORT tracking algorithm is a detection-based, multi-target tracking algorithm with better robustness than the SORT algorithm, though it needs powerful computers when the model is extensive. It uses either a YOLO layer or a faster region-convolutional neural network (R-CNN) as a detector for target detection in each frame (Gai et al., 2021). However, faster R-CNN is a two-stage detection process that needs to extract the region of interest using the region extraction technique and then detect the target for the specific region. These steps make it tedious, leading to a slower detection rate and a slower tracking process. DeepSORT nevertheless has two improvements over SORT. It combines motion information with apparent target features as the matching criteria for the same target. It also uses a combination of cascade matching and intersection over union (IoU) matching to reduce tracking errors due to occlusion. DeepSORT is equipped with the following vital algorithms.

3.4.1. Kalman filtering

The Kalman filter (KF) is also called linear quadratic estimation (LQE). The algorithm uses a series of prior measurements observed over time (statistical noise included) to produce more accurate estimates of unknown variables (Busu and Busu, 2021; Kalman, 1960). The KF is incorporated in the DeepSORT algorithm to accurately predict the tracked target position based on the target's initial motion state through optimal estimation of the overall system state.

3.4.2. Hungarian algorithm (Kuhn-Munkres algorithm)

For matching predictions from the KF and detection targets, the Kuhn-Munkres (KM) algorithm was used. The KF generates optimal states and predicts bounding boxes of the target state, which are later matched with the detection targets derived from the detection algorithm. This matching process has two stages (Gai et al., 2021). First, the cost matrix is calculated, defined as the weighted value of the IoU distance and the appearance similarity distance between the predicted and detected targets. Second, the total matching cost is minimized, and a matching matrix is returned containing the flags of the predicted and detected targets that have been matched and those that have failed to match. This is done by searching for the optimal solution to an assignment problem on a bipartite graph. From the matching stage in the Hungarian algorithm, the cost function optimization was passed to a quantum optimization model to minimize the total matching cost.
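
The two matching stages described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `combine_costs` and `match`, the weight of 0.5, the gate `max_cost`, and the distance values are all hypothetical, and the exhaustive search only stands in for the Hungarian algorithm on a tiny problem (a practical tracker would use a polynomial-time solver such as SciPy's `linear_sum_assignment`).

```python
from itertools import permutations


def combine_costs(iou_dist, app_dist, weight=0.5):
    """Stage 1: weighted sum of IoU distance and appearance distance.

    Both inputs are tracks x detections matrices (lists of lists).
    The weight of 0.5 is an assumption made for this sketch.
    """
    return [
        [weight * i + (1.0 - weight) * a for i, a in zip(row_i, row_a)]
        for row_i, row_a in zip(iou_dist, app_dist)
    ]


def match(cost, max_cost=0.7):
    """Stage 2: minimum total cost assignment by exhaustive search.

    Assumes a non-empty matrix with no more tracks than detections.
    Pairs whose cost exceeds the hypothetical gate max_cost are
    reported as unmatched.
    """
    n_tracks, n_dets = len(cost), len(cost[0])
    best = min(
        permutations(range(n_dets), n_tracks),
        key=lambda p: sum(cost[t][p[t]] for t in range(n_tracks)),
    )
    matches = [(t, d) for t, d in enumerate(best) if cost[t][d] <= max_cost]
    matched_t = {t for t, _ in matches}
    matched_d = {d for _, d in matches}
    unmatched_tracks = [t for t in range(n_tracks) if t not in matched_t]
    unmatched_dets = [d for d in range(n_dets) if d not in matched_d]
    return matches, unmatched_tracks, unmatched_dets


# Illustrative distances for 2 predicted tracks and 3 detections.
iou_dist = [[1.00, 0.32, 1.00], [0.32, 1.00, 1.00]]
app_dist = [[0.89, 0.01, 0.29], [0.01, 0.89, 0.29]]

cost = combine_costs(iou_dist, app_dist)
matches, unmatched_tracks, unmatched_dets = match(cost)
print(matches, unmatched_tracks, unmatched_dets)
```

In this toy case track 0 pairs with detection 1 and track 1 with detection 0, while detection 2 is left unmatched and would start a new tentative track.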

3.5. Quantum optimization

3.5.1. Cost function

The cost function assigns a numerical score to each prospective solution so that solutions can be compared and the most suitable one, the optimal solution, chosen; this is typically the solution with the lowest cost value. In quantum computing optimization, the laws of quantum physics apply: the Hamiltonian function takes the role of the cost function, and its cost value is termed the system energy. Each candidate solution is termed a state, and the lowest-energy state is called the ground state. Usually, the cost function is defined as a mathematical expression of the problem's parameters and variables. During optimization, constraints are used such that relationships between multiple variables must be satisfied for a solution to be valid. Solutions that violate the pre-defined constraints are normally assigned higher cost values or penalties by the cost function or, in other cases, excluded explicitly by the optimization solver.

3.5.2. Optimization models

The discussed power of quantum computing provides the potential to solve problems that are practically infeasible on classical computers or, in other cases, to speed up regular classical solutions. A quantum combinatorial optimization (QCO) algorithm was used to find an optimal object from a finite set of available alternative solutions. The problem was phrased as the minimization of an objective expressed as a sum of functions; such problems are usually non-deterministic polynomial-time hard (NP-hard), so approximate optimization is employed to find an approximate solution.

Because of prevailing computational errors, most importantly the noise of noisy intermediate-scale quantum (NISQ) devices that arises from using many gates in quantum computers, a hybrid classical-quantum model was developed. The quantum processing unit (QPU) computed the system energy for a given set of parameters from the classical computer, and the later steps were done on a classical computer.

For the given constrained optimization problem, the alternating direction method of multipliers (ADMM) convex optimizer with an operator splitting algorithm was used. It is known to have residual, objective, and dual variable convergence properties if the convexity assumptions hold. The resulting subproblems can also be solved in QUBO form on a quantum device via variational algorithms. One advantage of ADMM is its continuous convex-constrained subproblem, which can be solved effectively with both quantum and classical optimizers.

3.5.2.1. Model constraints. The variable constraints of the regular DeepSORT model are usually set by trial and error to achieve an optimum combination in state-of-the-art models. In this model, the system variables were incorporated into the model as equations. Before these constraints were passed to the quantum optimization algorithm, the regular DeepSORT was allowed to run so that the full cost matrix could be captured for optimization. Below are the constraints and their descriptions.

(1) Maximum matching threshold distance (u)

A new bounding box ($B_b$) from the detection algorithm will be assigned to a track if the cost is minimal, as in Eq. (1). The cost function is computed as the sum of the linear combination of the Euclidean distance between the centroid ($C_{b-1}$) of the previous bounding box assigned to the track ($T_{b-1}$) and the detected centroids ($C_b$), and the absolute difference between the detected bounding box area ($P_b$) and the previous bounding box area ($P_{b-1}$) assigned to the track ($T_{b-1}$).

\[
B_b(C_b, P_b):\quad b = \underset{i}{\arg\min} \sum_{i \in F} \bigl( u\, d_2(C_{b-1}, C_i) + (1 - u)\,\lvert P_i - P_{b-1} \rvert \bigr), \quad u \in [0, 1],\ d_2(C_{b-1}, C_i) < T \tag{1}
\]

where $T$ is the maximum distance between centroids in consecutive frames, $u$ is the adjustable parameter that determines the relative influence of the displacement and of the bounding box area change in consecutive frames, and $F$ is the set of all bounding boxes within the current frame. This distance controls the maximum allowable distance between a detected bounding box and the previous bounding box assigned to a track. It negatively influences the association matrix because as it increases, the association probability decreases; hence larger cost values are predicted.

(2) Maximum IoU threshold (v)

This value refers to the threshold used to determine the extent to which the bounding boxes should overlap when determining the identities of the unassigned tracks. It differs from the IoU value itself, which ranges from 0 to 1 and specifies the extent of overlap between the predicted and ground-truth bounding boxes during object tracking. It is expected that the greater the number, the lower the probability that association can be achieved. Lower probabilities are usually assigned larger cost values; hence keeping the value optimal is necessary. After setting the extent to which the bounding boxes should overlap, the linear assignment by matching cascade is formulated to compute the cost matrix between each detected bounding box $D_i,\ i \in \{1, 2, \ldots, N\}$ and all predicted bounding boxes $P_i,\ i \in \{1, 2, \ldots, M\}$ within a frame, with the IoU as the metric, as shown in Eq. (2).

\[
\mathrm{IoU}(D, P) =
\begin{bmatrix}
\mathrm{IoU}(D_1, P_1) & \cdots & \mathrm{IoU}(D_1, P_M) \\
\mathrm{IoU}(D_2, P_1) & \cdots & \mathrm{IoU}(D_2, P_M) \\
\vdots & & \vdots \\
\mathrm{IoU}(D_N, P_1) & \cdots & \mathrm{IoU}(D_N, P_M)
\end{bmatrix}
\tag{2}
\]

The IoU between the detected and predicted bounding boxes is given by Eq. (3).

$$\mathrm{IoU}(D_i, P_i) = \frac{D_i \cap P_i}{D_i \cup P_i} \tag{3}$$

(3) Maximum number of misses before a track is deleted (x)

This refers to the maximum age of the detected tracks, that is, the maximum number of consecutive misses before the track state is set to deleted. Newly created tracks are usually classified as tentative until enough evidence has been collected. Then, the track state is changed to confirmed, and tracks that are no longer alive are classified as deleted to mark them for removal from the set of active tracks. Lost targets and missed matches tend to occur when the tracker performs poorly or the number of targets is large, most prominently during occlusion events in congested areas. This variable therefore has a negative effect on association, so a penalty is given to the cost function.

(4) Number of frames that a track remains in the initialization phase (y)

Detection-based tracking requires manual initialization of a fixed number of objects in the first frame before localizing them in subsequent frames. This initialization stage is fundamental to the tracking algorithm, and the larger the number of frames a track remains uninitialized, the lower the accuracy. However, the effect is minimal because detection in the next frame can compensate for the losses. Hence the association is reduced to some extent, although this is not the prevalent case.

(5) Maximum size of the appearance descriptors gallery (z)

A vector that describes all the features of a given image is obtained in DeepSORT by building a classifier over a predefined dataset, training it, and then utilizing the final classification layer. This yields a dense layer that produces a feature vector for classification, called the object appearance descriptor. Once training was done, we passed all the crops of the detected bounding boxes from the image to this network and obtained a fixed-dimensional feature vector. The descriptor directly influences the Mahalanobis distance, so the updated distance metric becomes Eq. (4).

$$D = \lambda D_k + (1 - \lambda) D_a \tag{4}$$

where $D_k$ is the Mahalanobis distance, $D_a$ is the cosine distance between the appearance feature vectors, and $\lambda$ is the weighting factor.

This incorporation in DeepSORT has enabled tracking objects through longer periods of occlusion, effectively reducing the number of identity switches (Wojke et al., 2017). It indicates that the larger the value, the higher the probability of association, and it directly affects the cost function's value.

3.5.2.2. Objective function. After the formulation of a full cost matrix in the regular DeepSORT, the tracks and detections were fed to the quantum optimization algorithms to minimize the cost function value defined by $f(\mathrm{Qopt})_i$ in Eq. (5), where $i$ denotes the $i$th frame in the matching process. These variables are assumed to have equal weights.

$$f(\mathrm{Qopt})_i = u_i + v_i + x_i + y_i + z_i \tag{5}$$

where $u_i$ is the maximum matching threshold distance, $v_i$ is the maximum IoU threshold or gating threshold, $x_i$ is the maximum number of misses before a track is deleted, $y_i$ is the number of frames that a track remains in the initialization phase, and $z_i$ is the maximum size of the appearance descriptors gallery.

Fig. 5 shows the consolidated diagram of the model, which combines the regular DeepSORT model and the quantum optimization algorithm to speed up the learning of occlusions while utilizing the power of the hybrid quantum model.

4. Vehicle counting and classification analysis

4.1. Analysis of a short-duration video

The initial output of the model after incorporating training weights was the counting and classification of vehicles according to the FHWA (2014) classes, which are essential in the design of highway pavements. The vehicle counts on the 3-min video were analyzed by establishing the ground truth from manual counts of the video.

The model counts without and with the quantum optimizer were recorded and compared in terms of misses (FN), mismatches (FP), and identity switches (ID switches). Table 2 shows that the number of misses was reduced, as seen by comparing the southbound (SB) and northbound (NB) counts of the model with and without a quantum optimizer. Furthermore, the mismatches on the buses were removed as the model was able to differentiate them from trucks. Identity switching was also reduced, especially in the number of cars, from five cars to only three cars along the NB.

4.2. Analysis of a long-duration video

This study used a 34-min video recording to check the model's effectiveness over a long duration for vehicle counting and classification. The video was analyzed in the same way as the short video, by establishing the ground truth from manual counts. Fig. 6 shows the model counts compared to the established ground truth counts. It shows a slight difference in the number of counts compared to the ground truth as the number of identity switches increases with traffic. The northbound traffic was higher than the southbound traffic, explaining the higher difference. The number of cars has been scaled down by 100 to simplify visualization. Further data analysis was carried out to study the extent of the difference in traffic counts.

Fig. 7 shows the difference between the model and ground-truth counts observed. It was observed that cars, buses, and 6-

Fig. 5 – Consolidated quantum-DeepSORT model.
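Two scalar relations feed the consolidated model of Fig. 5: the blended distance of Eq. (4) and the equal-weight cost of Eq. (5). The Python sketch below only illustrates how they are evaluated. The dataclass `FrameVariables`, the function names, the restriction of the weighting factor to [0, 1], and all sample numbers are assumptions made for illustration; no optimizer (ADMM or quantum) is included.

```python
from dataclasses import dataclass


def combined_distance(d_k: float, d_a: float, lam: float) -> float:
    """Eq. (4): D = lam * D_k + (1 - lam) * D_a.

    d_k is the Mahalanobis distance, d_a the cosine appearance distance.
    Restricting lam to [0, 1] is an assumption of this sketch.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * d_k + (1.0 - lam) * d_a


@dataclass
class FrameVariables:
    """The five per-frame variables of Eq. (5)."""
    u: float  # maximum matching threshold distance
    v: float  # maximum IoU (gating) threshold
    x: float  # maximum number of misses before a track is deleted
    y: float  # number of frames a track remains in the initialization phase
    z: float  # maximum size of the appearance descriptors gallery


def cost(fv: FrameVariables) -> float:
    """Eq. (5): equal-weight sum f(Qopt)_i = u_i + v_i + x_i + y_i + z_i."""
    return fv.u + fv.v + fv.x + fv.y + fv.z


# Hypothetical values, chosen only to exercise the functions.
frame = FrameVariables(u=0.2, v=0.7, x=30, y=3, z=100)
print(combined_distance(d_k=2.0, d_a=0.4, lam=0.25))
print(cost(frame))
```

The five variables live on very different numeric scales, so an implementation would probably need some normalization before an equal-weight sum is meaningful; the text does not say how this was handled.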

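Table 2 – Vehicle counts with and without the quantum optimizer.

| S/No. | Class | Ground truth SB | Ground truth NB | DeepSORT SB | DeepSORT NB | DeepSORT comment | DeepSORT with quantum SB | DeepSORT with quantum NB | DeepSORT with quantum comment |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2-axle truck | 4 | 1 | 1 | 0 | Misses (FN) | 2 | 0 | Misses (FN) |
| 2 | 3-axle truck | 3 | 1 | 0 | 0 | Misses (FN) | 1 | 0 | Misses (FN) |
| 3 | 4-axle truck | 0 | 1 | 0 | 1 | | 0 | 1 | |
| 4 | 5-axle truck | 0 | 3 | 0 | 0 | Misses (FN) | 0 | 1 | Misses (FN) |
| 5 | 6-axle truck | 0 | 0 | 0 | 0 | | 0 | 0 | |
| 6 | Bus | 0 | 0 | 0 | 4 | Mismatch (FP) | 0 | 0 | |
| 7 | Car | 107 | 66 | 125 | 71 | ID switch | 112 | 68 | ID switch |
| 8 | Motorcycle | 1 | 0 | 0 | 0 | Misses (FN) | 1 | 0 | |
| 9 | Airplane | 0 | 0 | 0 | 0 | | 0 | 0 | |
| | Total | 115 | 72 | 126 | 76 | | 116 | 70 | |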
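The counts in Table 2 can be summarized per direction as undercounts (model below ground truth, consistent with misses) and overcounts (model above ground truth, consistent with mismatches or identity switches). The Python sketch below copies the counts from Table 2; the function name `summarize` and this under/over split are assumptions for illustration, since class-level counts alone cannot separate a true miss from a switch.

```python
CLASSES = ["2-axle truck", "3-axle truck", "4-axle truck", "5-axle truck",
           "6-axle truck", "Bus", "Car", "Motorcycle", "Airplane"]

# (SB, NB) counts per class, in the order of CLASSES, copied from Table 2.
GROUND_TRUTH = [(4, 1), (3, 1), (0, 1), (0, 3), (0, 0), (0, 0), (107, 66), (1, 0), (0, 0)]
DEEPSORT = [(1, 0), (0, 0), (0, 1), (0, 0), (0, 0), (0, 4), (125, 71), (0, 0), (0, 0)]
QUANTUM = [(2, 0), (1, 0), (0, 1), (0, 1), (0, 0), (0, 0), (112, 68), (1, 0), (0, 0)]


def summarize(model, truth):
    """Return {direction: (undercount, overcount)} summed over all classes."""
    result = {}
    for idx, direction in enumerate(("SB", "NB")):
        under = sum(max(t[idx] - m[idx], 0) for m, t in zip(model, truth))
        over = sum(max(m[idx] - t[idx], 0) for m, t in zip(model, truth))
        result[direction] = (under, over)
    return result


print("DeepSORT:", summarize(DEEPSORT, GROUND_TRUTH))
print("DeepSORT with quantum:", summarize(QUANTUM, GROUND_TRUTH))
```

Both the undercount and the overcount fall in each direction when the quantum optimizer is used, which is the pattern the comment columns of Table 2 describe.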

axle trucks differ more from the ground truth than other classes, but this can be explained by their share of the total traffic. There were a total of 2422 vehicles southbound and 2551 vehicles northbound. The model generated a total of 2286 vehicles southbound and 3018 vehicles northbound. A positive value on the graph indicates the number of counts not observed by the model, and a negative value shows the increase in the number of counts due to switching from other identities. Further observation of the traffic data indicated that 5-axle trucks and 6-axle trucks switched identities to buses as the number of buses increased. Similarly, cars were observed to switch identities among themselves as the car volume increased.

According to FHWA (2014), the traffic data measurements become unacceptable when the percentage of unclassified vehicles exceeds 4% for a single lane. Table 3 shows the differences between the ground truth and the model counts. The roadway had three lanes on each side, and on average, the error in counts was 2% per class, with the error in the number of cars across the three lanes reaching 17% in the NB direction, equivalent to an increase of about 5% per lane. This error is due to the identity-switching problem reported above.

5. Performance comparison with other models

The model performance was tested on the MOT datasets and compared with other tracking models. MOT has different metrics that are useful during comparison. MOTP, MOTA, F1

Fig. 6 – Vehicle counts on a model with the quantum optimizer and ground truth.

Fig. 7 – Difference between model and ground truth counts.
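The sign convention used for Fig. 7 (ground truth minus model, so positive values are unobserved vehicles and negative values are surplus counts) can be checked against the direction totals reported for the 34-min video. The Python sketch below is illustrative only; the function name `count_difference` is an assumption, and the totals are the ones stated in the text.

```python
def count_difference(ground_truth: int, model: int):
    """Return (difference, difference as % of ground truth).

    Positive: vehicles the model did not observe.
    Negative: surplus counts, for example from identity switches.
    """
    diff = ground_truth - model
    return diff, 100.0 * diff / ground_truth


# Direction totals stated in the text for the long-duration video.
totals = {"SB": (2422, 2286), "NB": (2551, 3018)}

for direction, (gt, model) in totals.items():
    diff, pct = count_difference(gt, model)
    print(f"{direction}: difference = {diff:+d} vehicles ({pct:+.1f}% of ground truth)")
```

The southbound total is undercounted and the northbound total is overcounted, which matches the direction-level behavior described for the long video.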

score, number of frames, number of matches, number of track switches, number of false positives (false alarms), and number of misses are some of them. The main MOT performance metrics are described below, following the classification of events, activities, and relationships (CLEAR).

5.1. MOT metrics

The classification of events, activities, and relationships (CLEAR) MOT metrics summarize the other metrics listed earlier and include MOTP, MOTA, and F1 score. With occlusion and cascading as the main challenges for the MOT models, association, detection, and localization errors are expected. Cascading, sometimes called fragmentation, is defined as the loss of tracking by the model between consecutive frames, after which the detection is assigned a new identity once it re-emerges. It can be seen in Fig. 8(a), where a pedestrian is assigned ID-24 in the first frame, is lost in the second frame, and then re-emerges in the third frame with ID-47. Fig. 8(b) shows how occlusion is experienced during vehicle tracking on a congested road where only a few vehicles can be detected.

These localization and association errors are estimated for every single frame at a time (t) in the series of frames. Then

the cumulative sum of the errors over a video is used to estimate the final MOT metric value. These values are defined as follows.

(1) Multiple object tracking accuracy (MOTA)

It is a primary measure of the overall accuracy that considers both detection and association errors. MOTA deals with both tracker output and detection output and is computed over the frames t as in Eq. (6).

$$\mathrm{MOTA} = 1 - \frac{\sum_{t=1}^{N} (\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDS}_t)}{\sum_{t=1}^{N} \mathrm{GT}_t} \tag{6}$$

where N is the total number of frames, FN is the number of false negatives (misses), IDS is the number of ID switches (mismatch errors), FP is the number of false positives, and GT is the ground truth.

(2) Multiple object tracking precision (MOTP)

MOTP is a secondary metric that quantifies the localization accuracy of the object detector and carries little information about the performance of the tracker. The threshold value is expected to strongly influence the MOTP value at time t, which is computed as in Eq. (7).

$$\mathrm{MOTP} = \frac{\sum_{i,t}^{N} d_t^i}{\sum_{t}^{N} \mathrm{TP}_t} \tag{7}$$

where $d_t^i = 1 - \mathrm{IoU}$ is the distance between the localization of an object in the ground truth and the detection at time t, and $\mathrm{TP}_t$ is the total number of matches made between the ground truth and the detections (number of true positives) at time t.

(3) Identification metrics (IDF1) or F1 score

IDF1 measures association accuracy rather than detection and is therefore considered a secondary metric. It is described as the ratio of correctly identified detections to the average of ground truth and generated detections, with the Hungarian algorithm involved in selecting trajectories, and can be computed as in Eq. (10). It combines the ID_precision (Eq. (9)) and ID_recall (Eq. (8)).

Table 3 – Vehicle counts with and without the quantum optimizer.

| Class | Count difference SB | Count difference NB | Difference (%) SB | Difference (%) NB |
|---|---|---|---|---|
| 2-axle truck | 6 | 10 | 0 | 0 |
| 3-axle truck | 0 | 2 | 0 | 0 |
| 4-axle truck | 3 | 4 | 0 | 0 |
| 5-axle truck | 8 | 12 | 0 | 0 |
| 6-axle truck | 21 | 36 | 1 | 1 |
| Bus | 82 | 106 | 3 | 4 |
| Car | 177 | 432 | 7 | 17 |
| Motorcycle | 3 | 7 | 0 | 0 |
| Average | | | 1 | 2 |

Fig. 8 – Challenges for the MOT models. (a) Cascading problem during pedestrian tracking. (b) Vehicle occlusion problem during tracking.
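The CLEAR MOT and identity metrics defined in Eqs. (6)-(10) can be sketched in a few lines of Python. The function names, the per-frame list inputs, and the toy numbers are assumptions for illustration; this is not the evaluation code used in the study, and it skips the matching step that produces the per-frame counts.

```python
from typing import Sequence


def mota(fn: Sequence[int], fp: Sequence[int], ids: Sequence[int], gt: Sequence[int]) -> float:
    """Eq. (6): 1 - sum_t(FN_t + FP_t + IDS_t) / sum_t(GT_t)."""
    total_gt = sum(gt)
    if total_gt == 0:
        raise ValueError("no ground-truth objects")
    return 1.0 - (sum(fn) + sum(fp) + sum(ids)) / total_gt


def motp(distances: Sequence[Sequence[float]], tp: Sequence[int]) -> float:
    """Eq. (7): summed match distances d = 1 - IoU divided by the number of matches."""
    total_tp = sum(tp)
    if total_tp == 0:
        raise ValueError("no matches")
    return sum(sum(frame) for frame in distances) / total_tp


def id_recall(idtp: int, idfn: int) -> float:
    """Eq. (8)."""
    return idtp / (idtp + idfn)


def id_precision(idtp: int, idfp: int) -> float:
    """Eq. (9)."""
    return idtp / (idtp + idfp)


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Eq. (10): harmonic mean of ID precision and ID recall."""
    return idtp / (idtp + 0.5 * idfp + 0.5 * idfn)


# Toy three-frame example with hypothetical per-frame counts.
print(mota(fn=[1, 0, 2], fp=[0, 1, 0], ids=[0, 0, 1], gt=[10, 10, 10]))
print(motp(distances=[[0.1, 0.2], [0.3], [0.2, 0.2]], tp=[2, 1, 2]))
print(idf1(idtp=80, idfp=10, idfn=30))
```

Because every error type in Eq. (6) is divided by the same ground-truth total, a single burst of identity switches during an occlusion lowers MOTA as much as the same number of misses.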

$$\mathrm{ID\_recall} = \frac{|\mathrm{IDTP}|}{|\mathrm{IDTP}| + |\mathrm{IDFN}|} \tag{8}$$

$$\mathrm{ID\_precision} = \frac{|\mathrm{IDTP}|}{|\mathrm{IDTP}| + |\mathrm{IDFP}|} \tag{9}$$

$$\mathrm{IDF1} = \frac{|\mathrm{IDTP}|}{|\mathrm{IDTP}| + 0.5\,|\mathrm{IDFP}| + 0.5\,|\mathrm{IDFN}|} \tag{10}$$

where IDTP is the number of true positive IDs, that is, the predicted IDs that have been matched to ground truth IDs; IDFN is the number of false negative IDs, that is, the remaining ground truth IDs that have not been matched; and IDFP is the number of false positive IDs, that is, the remaining predicted ID trajectories that are not matched with any ground truth IDs.

Since there are no vehicle datasets for comparing accuracy with other state-of-the-art models, pedestrian datasets were used.

5.2. Datasets

The multiple object tracking 17 (MOT17) datasets released in 2017 were used to study the performance comparison in solving occlusion problems. These datasets contain seven different indoor and outdoor scenes of public places with pedestrians as the objects of interest for tracking evaluation. The video scenes in the MOT17 datasets are divided into training and testing, and three different detectors are used: scale dependent pooling (SDP) (Yang et al., 2016), faster-RCNN, and the deformable parts model (DPM) (Felzenszwalb et al., 2009). These three models provided the basis for comparison (ground truth) with the proposed MOT model. The best accuracy, the best precision, and the fewest identity switches are usually selected as the model's performance metrics.

5.3. Evaluation metrics output

The MOTP value of 91.97% compared to the GT in Table 4 indicates that the system is highly accurate in average localization, considering the area occupied on average by a person. However, it is essential to note the influence of the threshold value when discussing the results. Once the threshold is set to a higher value, the impact is felt on the MOTA value in measuring the correct detection and association of the objects, since it usually disregards the track swaps, which may result in an unusable MOTP value. Also, once the threshold is set very low, the number of misses increases, and the MOTA and MOTP values become useless. Hence the optimization stage addresses this persistent problem by automatically choosing the threshold values.

Table 4 shows that the MOTA value reached 76.21%. It is a very useful value that describes how the model fails. The value is lower than the MOTP because, in multiple object tracking, the accuracy is strongly affected by mismatches and misses. Other parameters that show how well the tracking algorithm recovers the trajectories were also extracted from the tracking evaluation. The number of objects tracked for at least 80% of their maximum age, mostly tracked (MT), reached a maximum of 49, and the number of objects tracked between 20% and 80% of their maximum age, partially tracked (PT), reached a value of 65. These parameters may become useless when the identity remains the same throughout the tracking. Furthermore, the number of objects tracked for less than 20% of their maximum age, mostly lost (ML), was between 1 and 36. In practice, a higher number of MT and a lower number of ML are desirable. In addition, the total number of times a ground truth trajectory was interrupted, that is, the number of times the trajectory was fragmented (fragmentation, Frag), ranged from 6 for the better-performing tracker to 130 for the poorly performing tracker. Table 4 summarizes the best and the poorest performance metrics from the evaluation (high and low values).

Table 4 – Summary of the evaluation metrics.

| Metric | Acronym | High | Low |
|---|---|---|---|
| Multiple object tracking accuracy | MOTA | 76.21 | 39.04 |
| Multiple object tracking precision | MOTP | 91.97 | 84.85 |
| Identity switches | IDSW | 4.00 | 72.00 |
| CLEAR MOT recall | CLR_Re | 79.68 | 39.65 |
| CLEAR MOT precision | CLR_Pr | 99.83 | 91.71 |
| CLEAR MOT true positive | CLR_TP | 36,463.00 | 3913.00 |
| CLEAR MOT false negative | CLR_FN | 16,432.00 | 1217.00 |
| CLEAR MOT false positive | CLR_FP | 683.00 | 53.00 |
| Mostly tracked | MT | 49.00 | 11.00 |
| Partially tracked | PT | 60.00 | 10.00 |
| Mostly lost | ML | 36.00 | 1.00 |
| Fragmentation | Frag | 130.00 | 6.00 |

5.4. Evaluation metrics comparisons

The MOT evaluation algorithm was run on the MOT17 datasets, and the results of the model were compared with those of other models. Table 5 shows that the quantum tracking model generated the best results in terms of MOTA, which reflects both detection and association accuracy: it was 16 percentage points higher than the regular YOLOv5-DeepSORT model (76.21% versus 60.18%). The MOTP, which focuses on localization precision, was about 6 percentage points higher than that of the YOLOv5-DeepSORT model (91.97% versus 86.26%), and the F1 score, which reflects association accuracy rather than detection, was also about 6 percentage points higher (82.09% versus 76.00%).

Table 5 – Results of the quantum model in comparison with other models.

| Model | MOTA (%) | MOTP (%) | IDF1 (%) | IDS |
|---|---|---|---|---|
| YOLOv3-DeepSORT | 50.32 | 65.46 | 69.50 | 29 |
| YOLOv5-DeepSORT | 60.18 | 86.26 | 76.00 | 6 |
| YOLOv5-DeepSORT and quantum | 76.21 | 91.97 | 82.09 | 4 |

6. Conclusions and recommendations

6.1. Conclusions

This study used the YOLOv5 model equipped with a DeepSORT algorithm to address the tracking accuracy errors in

object tracking, such as pedestrians and vehicles. The quantum optimizer was used as a novelty to speed up the learning of occlusions during frame interpretation. The study also compared the outputs of YOLOv3-DeepSORT, regular YOLOv5-DeepSORT, and YOLOv5-DeepSORT with the quantum optimizer to check how the model overcame identity switching, misses, and false detections. These errors usually accompany the occlusion problem, hence the need to reduce them. In summary, the following results were observed.

(1) The model showed a decrease in misses and mismatches. This decrease indicates the effectiveness of the quantum optimizers in improving the tracking accuracy and in releasing the quantum cost function value faster, before new identities are assigned to vehicles, compared to regular YOLO models. The decrease is considerable in a short video, but as the traffic increased, the number of identity switches increased while the number of mismatches and misses remained proportional.
(2) The optimization stage addressed the persistent problem of tentatively choosing the intersection over union (IoU) value by automatically choosing its threshold values, hence controlling the values of the multiple object tracking accuracy (MOTA) and multiple object tracking precision (MOTP), which may be regarded as useless if the threshold is not optimal.
(3) Comparisons of the model with selected state-of-the-art models indicated a significant increase in the primary classification of events, activities, and relationships (CLEAR) multiple object tracking metric (MOTA, 76%) when using a quantum optimizer. The regular DeepSORT model with YOLOv3 reached 50%, while the regular DeepSORT model with YOLOv5 reached 60%. This result indicated an increase of 16 percentage points in the MOTA value over the YOLOv5 baseline.
(4) During comparisons, the MOTP value reached 92% after adding a quantum optimizer, significantly higher than other state-of-the-art models. This metric is affected considerably by the IoU threshold value set in the regular DeepSORT model. Applying the optimization removed the need to set this value, since it is chosen based on other parameters.
(5) The study observed a higher value of the secondary metric, the F1 score, when using the quantum optimizer on a DeepSORT model with the pedestrian MOT17 datasets, with identity switches reduced from six to four.

6.2. Recommendations and future studies

This study enhances the knowledge of computer vision software that uses quantum computing to solve the occlusion problem in multiple object tracking. The fragmentation/cascading and occlusion problems of MOT algorithms are well documented in computer vision. However, quantum computers have speed and accuracy that can help solve these issues. Based on the results of this study, the following recommendations can be drawn.

(1) The accuracy of computer vision applications in traffic counting and classification can be enhanced by incorporating quantum computing, owing to its faster learning of occlusions and the DeepSORT tendency to store identities in the maximum age variable, which reduces mismatches, misses, and identity switches.
(2) The comparison of the model with the state-of-the-art models showed a lack of datasets with established ground truth ready for deriving the MOT metrics, particularly vehicle datasets. The only available datasets are pedestrian datasets, which have their advantages, such as suitability for analysis in different environmental complexities, but lack vehicle characteristics, especially the vehicle profiles that have led to identity switching. The model exhibited extensive identity switching of five-axle and six-axle trucks to buses. Thus, there is a need for further analysis in the future to solve this problem.
(3) For future studies, other quantum optimizers, such as adiabatic quantum computation (AQC), can minimize an objective function by interpolating between two Hamiltonians, which will need to be defined based on the problem, and their accuracy can later be compared to the state-of-the-art models.
(4) Another metric for tracking evaluation, such as higher order tracking accuracy (HOTA), can be utilized to check accuracy. It can combine three IoU scores in terms of detection, association, and localization metrics. The study shows it has better performance and explanatory parameters for MOT.

Author contributions

The authors confirm their contribution to the paper as follows. Study conception and design: F. Ngeni, J. Mwakalonge, S. Siuhi; model architecture: F. Ngeni, J. Mwakalonge, S. Siuhi; model results analysis and interpretation: F. Ngeni, J. Mwakalonge, S. Siuhi; draft manuscript preparation: F. Ngeni, J. Mwakalonge, S. Siuhi. All authors reviewed the results and approved the final version of the manuscript.

Conflict of interest

The authors do not have any conflict of interest with other entities or researchers.

Acknowledgments

We acknowledge the contributions from the Center for Connected Multimodal Mobility (C2M2) (Tier 1 University Transportation Center) administered by the transportation program of the South Carolina State University (SCSU) and Benedict College (BC) for the quantum training knowledge.

References

Abbas, A., Sutter, D., Zoufal, C., et al., 2021. The power of quantum neural networks. Nature Computational Science 1, 403–409.
Ali, H., Mohamed, M., El-Sayed, M.S., et al., 2014. Multiple objects tracking under occlusions: a survey. In: International Conference on Advances in Computing, Electronics and Electrical Technology, Kuala Lumpur, 2014.
Bewley, A., Ge, Z., Ott, L., et al., 2016. Simple online and realtime tracking. In: 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, 2016.
Busu, C., Busu, M., 2021. An application of the Kalman filter recursive algorithm to estimate the Gaussian errors by minimizing the symmetric loss function. Symmetry 13 (2), 240.
Chen, Y., Wang, H., Zhu, Y., et al., 2020. A multi-target tracking algorithm based on improved DeepSORT algorithm. Computer Application Research 37 (S2), 311–315.
Dilmegani, C., 2022. In-depth Guide to Quantum Artificial Intelligence in 2022. Available at: https://research.aimultiple.com/quantum-ai/ (Accessed 28 October 2022).
Dong, M., Fang, Z., Li, Y., et al., 2021. AR3D: attention residual 3D network for human action recognition. Sensors 21 (5), 1656.
Eslami, E., Yun, H.-B., 2023. Comparison of deep convolutional neural network classifiers and the effect of scale encoding for automated pavement assessment. Journal of Traffic and Transportation Engineering (English Edition) 10 (2), 258–275.
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., et al., 2009. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (9), 1627–1645.
FHWA, 2014. Traffic Monitoring Guide-Appendix C. Vehicle Types. Available at: https://www.fhwa.dot.gov/policyinformation/tmguide/tmg_2013/vehicle-types.cfm (Accessed 28 October 2022).
Gabriel, P.F., Verly, J.G., Piater, J.H., et al., 2003. The State of the Art in Multiple Object Tracking under Occlusion in Video Sequences. University of Liège, Liège.
Gai, Y., He, W., Zhou, Z., 2021. Pedestrian target tracking based on DeepSORT with YOLOv5. In: 2nd International Conference on Computer Engineering and Intelligent Control (ICCEIC), Chongqing, 2021.
Gambella, C., Simonetto, A., 2020. Multi-block ADMM heuristics for mixed-binary optimization on classical and quantum computers. IEEE Transactions on Quantum Engineering 1, 3102022.
Girshick, R., Donahue, J., Darrell, T., et al., 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition, Columbus, 2014.
Han, B., Paulson, C., Lu, T., et al., 2009. Tracking of Multiple Objects under Partial Occlusion. University of Florida, Gainesville.
Hashmi, M.F., Ashish, B.K.K., Sharma, V., et al., 2021. LARNet: real-time detection of facial micro expression using lossless attention residual network. Sensors 21 (4), 1098.
He, K., Gkioxari, G., Dollár, P., et al., 2017. Mask R-CNN. In: IEEE International Conference on Computer Vision (ICCV), Venice, 2017.
Hong, F., Prozzi, J.A., 2006. Comparison of equivalent single-axle loads from empirical and mechanistic-empirical approaches. In: Transportation Research Board 85th Annual Meeting, Washington DC, 2006.
Hou, X., Wang, Y., Chau, L.-P., 2019. Vehicle tracking using DeepSORT with low confidence track filtering. In: 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Taipei, 2019.
Hu, L., Ni, Q., 2019. Quantum automated object detection algorithm. In: 25th International Conference on Automation and Computing, Lancaster, 2019.
Huang, Y., Essa, I., 2005. Tracking multiple objects through occlusions. In: 2005 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, 2005.
Kalman, R.E., 1960. A new approach to linear filtering and prediction problems. Journal of Fluids Engineering 82 (1), 35–45.
Kamkar, S., Safabakhsh, R., 2016. Vehicle detection, counting and classification in various conditions. The Institution of Engineering and Technology 10 (6), 406–413.
Kline, K., Salvo, M., Johnson, D., 2019. How Artificial Intelligence and Quantum Computing are Evolving Cyber Warfare. Available at: https://www.iwp.edu/cyber-intelligence-initiative/2019/03/27/how-artificial-intelligence-and-quantum-computing-are-evolving-cyber-warfare/ (Accessed 28 October 2022).
Koller, D., Weber, J., Malik, J., 2005. Robust Multiple Car Tracking with Occlusion Reasoning. University of California, Berkeley.
Li, J., Ghosh, S., 2020. Quantum-soft QUBO Suppression for Accurate Object Detection. The Pennsylvania State University, University Park.
Li, Q., Xiao, D.X., Wang, K.C.P., et al., 2011. Mechanistic-empirical pavement design guide (MEPDG): a bird's-eye view. Journal of Modern Transportation 19, 114–133.
Lin, T., Dollár, P., Girshick, R., et al., 2017. Feature pyramid networks for object detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 2017.
Liu, J., An, F., 2020. Image classification algorithm based on deep learning-kernel function. Scientific Programming 2020, 7607612.
Liu, W., Anguelov, D., Erhan, D., et al., 2016. SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N. (Eds.), Computer Vision-ECCV 2016. Springer, Cham, pp. 21–37.
Luca, G.D., 2021. A survey of NISQ era hybrid quantum-classical machine learning research. Journal of Artificial Intelligence and Technology 2 (1), 9–15.
Make Sense, 2022. Make Sense. Available at: https://www.makesense.ai/ (Accessed 28 October 2022).
Memon, S., et al., 2018. A video-based vehicle detection, counting and classification system. International Journal of Image, Graphics and Signal Processing 10 (9), 34–41.
Milan, A., Leal-Taixé, L., Reid, I., et al., 2016. MOT16: a benchmark for multi-object tracking. arXiv 1603, 00831.
Milan, A., Rezatofighi, S.H., Dick, A., et al., 2017. Online multi-target tracking using recurrent neural networks. In: 31st AAAI Conference on Artificial Intelligence, San Francisco, 2017.
Ngeni, F., Mwakalonge, J.L., Comert, G., et al., 2022. Monitoring of Illegal Removal of Road Barricades Using Intelligent Transportation Systems in Connected and Non-connected Environments. Center for Connected Multimodal Mobility (C2M2), Clemson.
Pancharatnam, M., Sonnadara, U., 2008. Vehicle counting and classification from a traffic scene. In: 26th National Information Technology Conference, Colombo, 2008.
Papadourakis, V., Argyros, A., 2010. Multiple objects tracking in the presence of long-term occlusions. Computer Vision and Image Understanding 114 (7), 835–846.
Radonjić, M., Prvanović, S., Burić, N., 2012. System of classical nonlinear oscillators as a coarse-grained quantum system. Journal of Physics Conference Series 442 (85), 022117.
Redmon, J., Divvala, S., Girshick, R., et al., 2016. You only look once: unified, real-time object detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, 2016.
Ruseruka, C., Mwakalonge, J., Comert, G., et al., 2023. Pavement distress identification based on computer vision and controller area network (CAN) sensor models. Sustainability 15 (8), 6438.

Shorten, C., Khoshgoftaar, T.M., 2019. A survey on image data augmentation for deep learning. Journal of Big Data 6 (1), 1–48.
Shu, G., Dehghan, A., Oreifej, O., et al., 2012. Part-based multiple-person tracking with partial occlusion handling. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, 2012.
Song, H., Liang, H., Li, H., et al., 2019. Vision-based vehicle detection and counting system using deep learning in highway scenes. European Transport Research Review 11 (1), 1–16.
Wojke, N., Bewley, A., Paulus, D., 2017. Simple online and realtime tracking with a deep association metric. In: IEEE International Conference on Image Processing (ICIP), Beijing, 2017.
Yang, F., Choi, W., Lin, Y., 2016. Exploit all the layers: fast and accurate CNN object detector with scale dependent pooling and cascaded rejection classifiers. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, 2016.
Zhao, Z., Zheng, P., Xu, S., et al., 2019. Object detection with deep learning: a review. IEEE Transactions on Neural Networks and Learning Systems 30 (11), 3212–3232.

Frank Ngeni is a civil and transportation engineering professional specializing in transportation engineering. He has over seven years of experience in planning, design, supervision, engineering, and budgeting for transportation networks. He is a dedicated, resourceful, and innovative transportation engineering researcher with more than three years of experience with interest in intelligent transportation systems (ITS), quantum computing, artificial intelligence (AI), multimodal mobility, connected and automated vehicles (CAVs), transportation planning, travel demand modeling, transportation systems analysis, transportation economics, and traffic safety. Furthermore, he has worked with clients, contractors, and consulting firms in the construction industry for more than 4 years and supervised construction projects as a project manager, project planner, and materials engineer.

Dr. Judith Mwakalonge is a professor of Department of Engineering at South Carolina State University with a specialty in transportation engineering. She has more than 13 years of experience planning and modeling transportation networks, analysis of traffic operational efficiency and safety, and evaluating self-driving in the connected vehicle environment. She has published and presented numerous papers in various international journals and proceedings.

Dr. Saidi Siuhi is an assistant professor of Department of Engineering at South Carolina State University specializing in civil and transportation engineering. He has more than 14 years of experience in transportation planning, travel demand modeling, transportation systems analysis, transportation economics, traffic safety, and evaluation of self-driving in the connected vehicle environment. He has published and presented numerous papers in various international journals and conferences.
