A Comprehensive Analysis of Road Traffic Prediction Using Machine Learning Algorithms
Authorized licensed use limited to: Indian Institute Of Technology Jammu. Downloaded on April 18,2025 at 11:36:02 UTC from IEEE Xplore. Restrictions apply.

Abstract—Road traffic prediction provides dynamic directions and information for traffic management, improves road safety, forecasts traffic flow density, and fosters effective driving arrangements. Traffic flow forecasting is essential for urban development. Therefore, intelligent traffic management systems are increasingly being utilized by transportation planners and government agencies to plan for informed construction projects. The challenges posed by temporal and spatial dependencies in traffic flow prediction are further compounded by limitations in monitoring equipment. To address these challenges, Machine Learning (ML) and Deep Learning (DL) techniques are employed in road traffic prediction, allowing for the handling of both historical and real-time information. This integration enhances the accuracy of traffic forecasts by studying both historical data and real-time trends. The ML and DL techniques deployed in road traffic prediction include Support Vector Regression (SVR), Logistic Regression (LR), Random Forest algorithm, K-Nearest Neighbor (KNN), Auto Regressive Integrated Moving Average (ARIMA), Long Short-Term Memory (LSTM), Convolutional Neural Network classifier (CNN), and Cross-Modality CNN (CM-CNN). This survey analyses methodologies and advancements in road traffic prediction, contributing to the development of improved traffic management systems.

Keywords—convolutional neural network, deep learning, machine learning, road traffic prediction, support vector machine, urban planning.

I. INTRODUCTION

Transportation is a crucial industry in urban cities, providing regular services for a large population. The growth of transportation systems is largely driven by the increasing population in modern cities [1]. This expansion has also led to multiple disadvantages such as increased fuel consumption, higher air pollution, and elevated costs due to the time wasted by drivers in traffic, as well as losses from road accidents [2]. Many countries have implemented various policies and measures to address traffic management, such as traffic restrictions and the promotion of green travel options [3].

Effective traffic management encompasses a range of facilities aimed at road management and public safety. It plays a vital role in transportation logistics, vehicle tracking and identifying strategies for congestion relief [4]. Traditional methods for measuring traffic flow include spatial sensors and TV cameras placed at various points along city roads which provide comparable data on traffic density, vehicle counts, and speed [5].

Traditional traffic flow prediction models frequently suffer from low accuracy and poor robustness against interference because of inherently non-linear and non-stationary traffic flow, as well as difficulties in adjusting to changing conditions [6]. Traffic flow prediction is influenced by various complex issues including intricate network structure, trip patterns, and differing climatic conditions which render the prediction process unreliable [7]. Accurate traffic movement prediction is key to effective modern transportation systems, as it enables route planning that allows travellers to choose the best routes, thereby reducing overall traffic flow and improving travel efficiency [8]. Traffic prediction can be categorized into two types: non-parametric and parametric. Machine Learning (ML) and temporal models are used to address complex traffic problems and enhance the prediction accuracy. The non-parametric models do not assume a specific distribution, while parametric models rely on predefined assumptions and equations regarding data distribution [9].

The remainder of this paper is organized as follows: Section 2 provides a literature review, Section 3 outlines the taxonomy for road traffic prediction, Section 4 presents the comparative analysis of various models, while Section 5 discusses the problems in road traffic prediction, and finally, Section 6 summarizes the paper’s findings.

II. LITERATURE REVIEW

Qin [1] developed a multi-modal traffic flow prediction model that integrated road network data with historical traffic flow and climate information. A Weighted Spatio-Temporal Graph (STG) was constructed from the traffic flow series, and a weighted ST Synchronous Graph Convolutional Networks (STSGCN) model was used to extract the STG characteristics. Further, the image sequence method was applied to track vehicle and road networks by converting road layouts into visual features. The Multi-Channel Attention (MCA) mechanism was then used to fuse STG data and visual characteristics, creating an aligned fusion vector that was integrated with climate feature vectors to achieve high precision forecasting of road traffic movement. The multi-modal fusion model without the feature module of the STG did not perform well in terms of multi-step prediction accuracy.

Huang et al. [10] presented the Multi-mode Dynamic Residual Graph Convolutional Network (MDR-GCN), a traffic flow prediction model designed to address the dynamic effects of various factors on the road network. The MDGC module captured the flow characteristics from various traffic modes through two distinct relationships between the pedestrians and traffic flow matrices. It employed a Dynamic Fusion Module (DFM) to effectively merge these flow characteristics. Furthermore, the MDGRU module combined spatial and temporal dependencies, efficiently enhancing the model’s prediction capabilities. The Dynamic Residual Module (DRM) integrated original traffic flow data with spatial features extracted by MDGRU to estimate future traffic flow. The MDR-GCN outperformed other baseline models, and ablation experiments confirmed the usefulness of each component. However, applying graph convolution twice was noted to be less resource-efficient and to increase computational demands.
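To illustrate why stacking graph convolutions adds cost, the following is a minimal sketch of one propagation step in the widely used form H' = ReLU(Â H W), where Â is the symmetrically normalized adjacency matrix with self-loops. This is an assumption made for illustration only: the survey does not specify the internal layers of MDR-GCN, and the function names, the three-sensor chain graph, and the flow values below are all hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(a_norm, features, weights):
    """One propagation step: ReLU(a_norm @ features @ weights)."""
    hidden = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]


# Hypothetical road graph: sensor 0 - sensor 1 - sensor 2 (a chain).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
flows = [[10.0], [20.0], [30.0]]   # illustrative flow reading per sensor
weights = [[1.0]]                  # identity weight, to isolate the mixing

a_norm = normalize_adjacency(adj)
once = gcn_layer(a_norm, flows, weights)    # mixes 1-hop neighbours
twice = gcn_layer(a_norm, once, weights)    # mixes 2-hop neighbours
```

In this toy graph, one layer lets sensor 0 see only sensor 1, while the second layer also brings in sensor 2. Each additional layer repeats the adjacency-feature product, which is one plausible reading of the extra computation noted above.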
Zheng et al. [11] developed a Wavelet Decomposition
Attention (WDA) mechanism integrated with a GCN for
regional-level traffic flow prediction, emphasizing the
spatiotemporal connections among traffic monitors. The
WDA-GCN model contained three main components: the
Data Spatial-Temporal Architecture Module, which analysed
spatial correlations among various traffic monitors, the GCN
module, which calculated spatial correlation region, and the
Gated Recurrent Unit (GRU) which learned the spatial
features extracted by the GCN. The GRU employed a
decoder-encoder structure to facilitate multi-step predictions
at the regional level. Modifications to the GCN model
demonstrated that incorporating wavelet decomposition and
attention mechanisms significantly enhanced the forecasting
performance of the WDA-GCN model. However, the performance of the WDA-GCN model was dependent on the tuning parameters such as those for WDA and GCN. Poor tuning of these parameters led to inaccurate predictions.

Wang et al. [12] presented the Spatial Temporal (ST) GCN using a hierarchical method for traffic movement forecasting, efficiently merging connections with and without historical information for more accurate dynamic forecasting. The ST-GCN predicted traffic movement at intersections using historical information and extracting spatial dependencies through graph convolutional layers and temporal dependencies using a GRU. A similar adjacent algorithm predicted their spatiotemporal correlations established on data from connections with historical data. The historical method used in this approach allowed for the combination of real-time and historical data to forecast sudden traffic events and variations. Nonetheless, the ST-GCN model relied on historical data which included noise that made the data incomplete and non-reflective of present traffic conditions, which led to inaccurate forecasting.

Bao et al. [13] implemented a Deep Belief Network (DBN) technique to enhance traffic forecasting under poor weather conditions for improved prediction accuracy. The combination of Support Vector Regression (SVR) and DBN proved efficient in regression tasks with improved learning procedures. The structure used a traditional DBN which learned main traffic characteristics in an unsupervised manner for traffic analysis and prediction. The upper layer of DBN employed SVR in a supervised capacity, effectively mapping the complex connections within the traffic system while also integrating weather conditions. However, the computation of the improved DBN was slightly longer than ARIMA, but significantly shorter than that of a standard neural network.

III. TAXONOMY

Road traffic prediction is primarily divided into two categories based on the use of ML and DL models. Each category encompasses various methods designed to enhance the accuracy of road flow predictions. Accurate road flow prediction significantly impacts travel comfort, the health of drivers and passengers, and the overall efficiency of transportation systems. Figure 1 illustrates the taxonomy of road traffic prediction methods, highlighting the diverse approaches within both ML and DL frameworks.

Fig. 1. Existing Machine Learning Methods for Road Traffic Prediction

A. Machine Learning Methods

ML is a subdivision of AI that allows algorithms to discover hidden patterns in data, allowing them to make predictions on new, similar data without requiring detailed instructions for each task. ML can be categorized into two types: supervised learning and unsupervised learning. It is used for pattern recognition, prediction, and data classification by learning from existing datasets. Available ML techniques include ARIMA, Support Vector Machine, logistic regression, and random forest algorithms.

1) Logistic Regression and Random Forest: Chuanxia et al. [14] developed a model to predict traffic accidents using IoT and ML to address inaccurate traffic accident prediction. Logistic Regression (LR) was used as a classification technique to derive a regression formula for establishing the classification boundary. The Random Forest algorithm is an ensemble learning algorithm derived from bagging, where the decision trees are used as base learners to create an ensemble bagging while ensuring randomness of feature selection for training the decision trees. In this study, the dataset was classified into rural and urban areas to account for regional heterogeneity of accident data, and the model’s performance was evaluated.

2) ARIMA and SVM: Pandey et al. [15] proposed an improved GPS method for road traffic flow forecasting by combining classified vehicle totals such as trucks, cars, and groups, resulting in precise forecasting of traffic congestion and interruptions. The ML-based overlay process focused on improving the route selection through various classification techniques to predict the traffic capacity at specific places. The Auto-Regressive Integrated Moving Average (ARIMA) and SVM were used to predict the road traffic volume with the previously gathered dataset and forecast the density of traffic in a zone at a specific time and traffic size. ARIMA was employed to capture time-series patterns, while the SVM model handled complex patterns and non-linear relationships within the data for traffic forecasting.

3) SVR and KNN: Lin et al. [16] implemented a technique for screening spatial time-delayed traffic series using the Maximal Information Coefficient. The SVR and KNN were applied through a combination of traffic state vectors to determine the traffic flow. During the SVR training
procedure, errors were identified and addressed by using the KNN algorithm to forecast the errors, thereby enhancing the accuracy of SVR predictions. However, the grid search method used for SVR parameter selection was computationally inefficient and time consuming.

B. Deep Learning Methods

DL is a method in artificial intelligence based on artificial neural networks and the relationships within data. Hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM), Particle Swarm Optimization Bi-directional LSTM (PSO-Bi-LSTM), and Bi-directional Gated Recurrent Unit (Bi-GRU) are types of DL methods.

1) Hybrid CNN-LSTM: Obayya et al. [17] proposed a DL method, an Attention-based Hybrid CNN with LSTM, namely AHCNLS, incorporating an attention mechanism to estimate the traffic flow. The attention mechanism was employed to efficiently estimate road traffic by focusing on relevant features of the input data. The Hybrid CNN-LSTM technique was used to evaluate traffic at upcoming periods. Research indicated that the CNN-LSTM technique performed worse than the LSTM due to its unpredictable performance. The output of the CNN was connected to the input of the separate LSTM component, displaying both spatial and temporal information. A fully connected layer was used to make the final predictions. The CNN-LSTM underperformed when compared to the traditional methods, which achieved better outcomes than the hybrid CNN-LSTM model.

2) PSO-Bi-LSTM: Bharti et al. [6] developed a prediction framework combining Particle Swarm Optimization with Bidirectional LSTM (Bi-LSTM), called PSO-Bi-LSTM. The PSO-Bi-LSTM model was designed to enhance the Bi-LSTM network architecture by incorporating the features of road data. The PSO algorithm demonstrated quick convergence, and parameter reliability was enhanced by learning from the features of the data. The PSO model allowed the Bi-LSTM model to be constructed and optimized based on the characteristics of the road traffic data.

3) Bi-GRU: Chauhan et al. [18] introduced a hybrid DL model for predicting traffic flow, consisting of two units. The first unit was a Bi-GRU with a confined attention mechanism that extracted the temporal attributes of traffic data. The second unit was another Bi-GRU, designed to capture the periodic features of the traffic data. By incorporating a refined attention mechanism into the first Bi-GRU, the model enhanced its ability to focus on the relevant features of the input data, leading to more accurate predictions of traffic flow. The Bi-GRU module incorporated external features such as weekends, weekdays, and holidays, which improved its performance.

IV. COMPARATIVE ANALYSIS

The road traffic prediction is compared with other methods to improve the performance of the approach. The comparative analysis is significant for developing and effectively enhancing its performance. Table I presents the comparative analysis of the existing methods, while Figure 2 illustrates a graphical representation of accuracy results. The accuracy results of the existing methods, such as the hybrid stacking ensemble model [2], DBN [13], and ARIMA and SVM [15], are illustrated in Figure 2.
TABLE I. COMPARATIVE ANALYSIS OF ROAD TRAFFIC PREDICTION USING MACHINE LEARNING

| Author Name | Methodology | Advantages | Limitations | Performance Metrics | Accuracy Result |
|---|---|---|---|---|---|
| Qin [1] | STSGCN | The integration of weather features with the aligned fusion vectors achieved high accuracy in traffic flow forecasting. | The multi-modal fusion model without the feature unit of STG performed badly in multi-step prediction. | Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) | N/A |
| Amiri and Pierre [2] | A hybrid stacking ensemble model | The meta-learner aimed to enhance the performance by aggregating predictions from multiple base models, emphasizing the most relevant features to refine its final predictions. | Hybrid models were tough to interpret when compared to simpler models, making it challenging to understand how predictions were made. | Accuracy, precision, recall, F1 score | Accuracy = 0.941 |
| Huang et al. [10] | MDR-GCN | MDR-GCN captured the dynamic effects of traffic by considering both time and location when predicting traffic flow in different areas of the road network. | Applying graph convolution twice was not effective and increased the computation, which led to inefficiency and overfitting. | RMSE, MSE, MAPE and Pearson Correlation Coefficient (PCC) | N/A |
| Wang et al. [12] | STGCN | The hierarchical approach in STGCN handled both historical and non-historical data, allowing it to adapt to various conditions while performing traffic forecasting tasks. | The STGCN combined both historical and current data, and its performance depended heavily on the quality of the historical data; when the data was disorganized, it was hard to predict traffic patterns. | RMSE, MAE | N/A |
| Bao et al. [13] | DBN | SVR integrated with DBN and weather conditions mapped the complex connections within the system, which improved prediction accuracy and enhanced both the models. | DBN was sensitive to variations in input data like outliers or noise, which was frequently impaired in poor weather conditions, leading to inaccurate forecasts. | Accuracy | Accuracy = 90.5% |
| Pandey et al. [15] | ARIMA and SVM | ARIMA was utilized to identify time series patterns, while the SVM model handled complex patterns and non-linear relationships within the data. | The ARIMA and SVM model depended on accurate classification of vehicles such as trucks and cars. As a result, inaccurate forecasts occurred when the model found the error in classification. | Accuracy | Accuracy = 95% |
| Obayya et al. [17] | Hybrid CNN-LSTM | The CNN captured spatial structure, while the LSTM model captured the temporal dynamics, so as to handle multi-dimensional data for video prediction and traffic forecasting. | The simplifications and specific assumptions used in the hybrid model did not fully capture the complex dynamics of real-life traffic situations. | MAE, RMSE, MAPE | N/A |
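The error metrics named in Table I can be made concrete with a short sketch. The code below computes RMSE, MAE, and MAPE using their standard definitions; the function names and the vehicle counts are hypothetical and are not taken from any of the surveyed studies.

```python
import math


def _check(actual, predicted):
    """Guard against empty or mismatched series."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")


def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large errors more heavily."""
    _check(actual, predicted)
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the errors."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Undefined when an observed value is zero, so that case is rejected.
    """
    _check(actual, predicted)
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)


# Hypothetical hourly vehicle counts at one sensor (illustrative only).
actual = [100, 200, 400]
predicted = [110, 190, 380]

scores = {
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

RMSE and MAE are expressed in the unit of the target (here, vehicles per hour), whereas MAPE is scale-free. This is one reason results reported with different metrics, as in Table I, cannot be compared directly.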
Fig. 2. Graphical Representation of Accuracy Result

V. PROBLEM STATEMENT

• The multi-modal fusion model performed poorly in multi-step prediction when the STG feature module, which contributed information for accurate forecasting of behaviours or states, was removed.

• Temporal and spatial dependencies posed challenges in traffic flow estimation on roads, and limitations in monitoring equipment led to inaccurate data collection, delaying traffic management and impairing the forecasting accuracy.

• Traffic flow forecasting was impacted by various complex factors, including intricate network structures, trip patterns, and weather situations, which complicated the accuracy of predictions.

• The grid search method used for selecting parameters in SVR was computationally inefficient and time consuming, suggesting problems in the process.

VI. CONCLUSION

This paper presents ML and DL based forecasting techniques for road traffic and traffic volume prediction. Methods such as hybrid CNN-LSTM, PSO-Bi-LSTM, SVM, and KNN were used to predict vehicular traffic at intersections, laying the groundwork for adaptive traffic control. Challenges such as inaccurate data collection, delayed traffic management, inefficient traffic flow estimation, inadequate monitoring equipment, and low quality of historical data complicated the predictions of road traffic patterns. Evaluation metrics such as RMSE, MAE, accuracy, F1 score, and precision were used to estimate the ML and DL models’ accuracy of traffic predictions. The traffic flow prediction model addresses the dynamic effects of various factors on the road network by capturing the flow characteristics from various traffic modes. Challenges arise because of spatial and temporal dependencies in traffic patterns. To overcome these challenges, further research will focus on developing enhanced Machine Learning techniques to improve the accuracy of road traffic prediction.

REFERENCES

[1] X. Qin, “Traffic flow prediction based on Two-Channel Multi-Modal fusion of MCB and attention,” IEEE Access, vol. 11, pp. 58745–58753, May 2023.
[2] P. A. D. Amiri and S. Pierre, “An ensemble-based machine learning model for forecasting network traffic in VANET,” IEEE Access, vol. 11, pp. 22855–22870, March 2023.
[3] D. Yang and L. Lv, “A graph deep learning-based fast traffic flow prediction method in urban road networks,” IEEE Access, vol. 11, pp. 93754–93763, August 2023.
[4] Y. Xiong and H. Wang, “Spatio-Temporal Contextual Conditions Causality and Spread Delay-Aware Modeling for Traffic Flow Prediction,” IEEE Access, vol. 12, pp. 21250–21261, January 2024.
[5] S. Bilotta, E. Collini, P. Nesi, and G. Pantaleo, “Short-term prediction of city traffic flow via convolutional deep learning,” IEEE Access, vol. 10, pp. 113086–113099, October 2022.
[6] Bharti, P. Redhu, and K. Kumar, “Short-term traffic flow prediction based on optimized deep learning neural network: PSO-Bi-LSTM,” Physica A, vol. 625, p. 129001, September 2023.
[7] X. Huang, Y. Ye, X. Yang, and L. Xiong, “Multi-view dynamic graph convolution neural network for traffic flow prediction,” Expert Syst. Appl., vol. 222, p. 119779, July 2023.
[8] S. S. Sepasgozar and S. Pierre, “Network traffic prediction model considering road traffic parameters using artificial intelligence methods in VANET,” IEEE Access, vol. 10, pp. 8227–8242, January 2022.
[9] A. Navarro-Espinoza, O. R. López-Bonilla, E. E. García-Guerrero, E. Tlelo-Cuautle, D. López-Mancilla, C. Hernández-Mejía, and E. Inzunza-González, “Traffic flow prediction for smart traffic lights using machine learning algorithms,” Technologies, vol. 10(1), p. 5, January 2022.
[10] X. Huang, Y. Ye, X. Yang, and L. Xiong, “Multi-view dynamic graph convolution neural network for traffic flow prediction,” Expert Syst. Appl., vol. 222, p. 119779, July 2023.
[11] Y. Zheng, S. Wang, C. Dong, W. Li, W. Zheng, and J. Yu, “Urban road traffic flow prediction: A graph convolutional network embedded with wavelet decomposition and attention mechanism,” Physica A, vol. 608, p. 128274, December 2022.
[12] H. Wang, R. Zhang, X. Cheng, and L. Yang, “Hierarchical traffic flow prediction based on spatial-temporal graph convolutional network,” IEEE Trans. Intell. Transp. Syst., vol. 23, pp. 16137–16147, February 2022.
[13] X. Bao, D. Jiang, X. Yang, and H. Wang, “An improved deep belief network for traffic prediction considering weather factors,” Alexandria Eng. J., vol. 60, pp. 413–420, February 2021.
[14] S. Chuanxia, Z. Han, and Y. Peixuan, “Machine learning and IoTs for forecasting prediction of smart road traffic flow,” Soft Comput., vol. 27, pp. 323–335, November 2022.
[15] A. D. Pandey, B. Kumar, M. Parida, A. K. Chouksey, and R. Mishra, “A machine learning-based overlay technique for improving the mechanism of road traffic prediction using global positioning system,” Innovative Infrastruct. Solutions, vol. 9, p. 300, June 2024.
[16] G. Lin, A. Lin, and D. Gu, “Using support vector regression and K-nearest neighbors for short-term traffic flow prediction based on maximal information coefficient,” Inf. Sci., vol. 608, pp. 517–531, August 2022.
[17] M. Obayya, F. N. Al-Wesabi, R. Alabdan, M. Khalid, M. Assiri, M. I. Alsaid, A. E. Osman, and A. A. Alneil, “Artificial Intelligence for Traffic Prediction and Estimation in Intelligent Cyber-Physical Transportation Systems,” IEEE Trans. Consum. Electron., vol. 70, pp. 1706–1715, October 2023.
[18] N. S. Chauhan, N. Kumar, and A. Eskandarian, “A Novel Confined Attention Mechanism Driven Bi-GRU Model for Traffic Flow Prediction,” IEEE Trans. Intell. Transp. Syst., vol. 25, pp. 9181–9191, March 2024.