Journal of Rail Transport Planning & Management 25 (2023) 100368

Machine learning-assisted macro simulation for yard arrival prediction
Niloofar Minbashi ∗, Hans Sipilä, Carl-William Palmqvist, Markus Bohlin,
Behzad Kordnejad
Division of Transport Planning, KTH Royal Institute of Technology, Brinellvägen 23, SE-100 44 Stockholm, Sweden

ABSTRACT

Increasing the modal share of the single wagonload transport in Europe requires improving the reliability and predictability
of freight trains running between the yards. In this paper, we propose a novel machine learning-assisted macro simulation
framework to increase the predictability of yard departures and arrivals. Machine learning is applied through a random forest
algorithm to implement a yard departure prediction model. Our yard departure prediction approach is less complex compared to
previous yard simulation approaches, and provides an accuracy level of 92% in predictions. Then, departure predictions assist a
macro simulation network model (PROTON) to predict arrivals to the succeeding yards. We tested this framework using data from
a stretch between two main yards in Sweden; our experiments show that the current framework performs better than the timetable
and a basic machine learning arrival prediction model by $R^2$ of 0.48 and a mean absolute error of 35 minutes. Our current
results indicate that combination of approaches, including yard and network interactions, can yield competitive results for
complex yard arrival time prediction tasks which can assist yard operators and infrastructure managers in yard re-planning
processes and yard-network coordination respectively.

Keywords: Yards; Delay prediction; Macroscopic simulation; Machine learning; Rail traffic

1. Introduction

The European Union has set an ambitious target to transport at least 30% of freight by rail by 2030. This target is in line
with the European Green Deal to achieve climate neutrality by 2050 (Rail Freight Forward, 2018). One of the challenges in
fulfilling this target is to make better use of yards as the main rail freight capacity nodes. One requirement for enhancing yard
capacity utilization and performance is improved coordination between yards and the railway network, which can be achieved
by increasing the predictability of yard arrivals and departures.
Inherently, yards are large areas where the arrangement of wagons in incoming trains is changed to form outgoing trains to
new destinations. Yard operations are sequential: first, trains arrive at the arrival yard. Then, wagons are sent over a small hill
called a hump to the classification yard, where they are sorted onto the track dedicated to their destination. Finally, the wagons
are pulled to the departure yard, attached to a locomotive, and dispatched to the network. In some yard practices, the locomotive
is attached to the wagons in the classification yard and the train is directly dispatched to the network (see the conventional yard
layout in Fig. 1).
Fig. 1. A conventional European yard layout (Minbashi, 2020).

Increased arrival predictability from the network side assists yard operators in adapting yard operations to the unwanted effects
that arrival deviations may have, for example on wagon re-bookings and final punctual yard departures (Boysen et al., 2012; Bohlin
et al., 2018). On the other side, increased departure predictability from the yard side helps the infrastructure manager to reduce
unwanted impacts that yard departures may have on other trains on the line. It also helps yard operators to better evaluate the
reliability of the freight services they provide and to aim for an increase in on-time departures, which is required to guarantee
certain levels of wagon travel times and meet satisfactory delivery times for customers (Jaehn and Michaelis, 2016).

∗ Corresponding author.
E-mail address: [email protected] (N. Minbashi).

https://doi.org/10.1016/j.jrtpm.2022.100368
Received 26 August 2022; Accepted 12 December 2022
Available online 4 January 2023
2210-9706/© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/by/4.0/).
So far, researchers have mostly applied simulation to model optimal yard performance through delay minimization.
Simulation models are apt for capturing the dynamics of yard operations; however, incorporating an acceptable level of detail
of the yard operations into a simulation model is time-consuming and somewhat dependent on the yard layout (Belošević et al., 2015;
Bohlin et al., 2018; Deleplanque et al., 2022), which makes it more difficult to generalize a simulation model and apply it to other
yards.
Additionally, yard simulation models focus on internal yard operations, which leads to yards being treated as separate
entities within a railway network or with a controlled level of impact from the network side, even though recent research shows
how important the impact of the surrounding network on yard performance is (Licciardello et al., 2020; Dick, 2021; Minbashi et al.,
2021b).
Data-driven approaches are an alternative; prediction models implemented through these approaches implicitly reproduce
the relationships between various variables (Wang and Work, 2015) in a faster modeling process. However, developing data-driven
approaches requires large and reliable data sets, which are rather difficult to obtain in the rail freight domain.
We address part of this gap in the literature by developing a data-driven approach that increases the predictability
of arrivals to and departures from yards by coordinating yards and the network. To the best of
our knowledge, this is the first study in which a machine learning-assisted macro simulation model is applied for yard arrival and
departure predictions.
The remainder of this paper is organized as follows: Section 2 reviews previous research on yard departure delay, yard-network
interaction, and the application of combined methods in railway operational problems. Section 3 describes the method development
on yard departure prediction using machine learning and macroscopic network simulation. Section 4 presents data and the case study
to test the proposed model framework. Section 5 presents and discusses the results of the paper. Section 6 concludes the paper.

2. Literature review

In this section, we summarize the most relevant part of the literature related to yard departure delay modeling, yard-network
interaction, and combined models in railway operational problems.

2.1. Yard departure delay

Yard departure delays due to yard operator performance were identified among the main causes of freight train delays (Krüger
et al., 2013; Palmqvist et al., 2022). Thus, minimizing yard departure delays has been studied, among other objectives, as one of
the ways to optimize yard performance rather than a mere train delay prediction problem. Kraft (2000) used arrival, departure, and
processing times in yard operations to minimize the lateness of all outbound trains.
Jaehn et al. (2015a) proposed algorithms to minimize the weighted tardiness of departing trains, leading to more successful wagon
connections between arrivals and departures. Similarly, Jaehn et al. (2015b) minimized weighted departure times considering
the priority values of departing trains.
Minbashi et al. (2021b) fitted general distributions to departure deviations of two different yards in Sweden, and found
log-normal to be the best fit for delayed departures, whereas for early departures log-normal and gamma were the best fits. This
showed the impact of yard individuality and of comparing different yards when finding relationships between the parameters that
impact yard departures.
Minbashi et al. (2020) attempted to develop an analytical model for departure delay estimation, considering the arrival yard
congestion. They emphasized the complexity of the yard departure delay problem, and suggested incorporating elaborate yard
operational parameters extracted from historical data in yard departure delay models. In further research, Minbashi et al. (2021a)
investigated the application of decision tree and random forest algorithms to predict the departure status from yards as a first step
in building yard departure prediction models. They obtained promising accuracy levels and discussed the importance of the amount
of data for obtaining generalizable results.
As described through the previous studies, the yard departure delay has not received sufficient attention as a train delay
prediction problem for various reasons. For example, in most railways, freight trains do not follow a strict schedule; thus, proper yard
operation models are more practical than mere yard departure prediction models. On the other hand, in railways where freight trains
are scheduled, passenger trains are often prioritized, which gives freight train operations secondary importance and shifts the focus
to practical network models for passenger trains. Both of these have led to separate yard and network models which, if connected,
may increase the predictability of freight trains on railway networks.

2.2. Yard and network interaction

In recent years, interest has grown in studying yards in conjunction with the network, mainly with the aim of providing
comprehensive models that can assist yard operators and infrastructure managers in providing fluid freight train runs between yards
and the network.
One example is the OPTIYARD European Shift2Rail project (Licciardello et al., 2020), in which a decision support system was
developed for yard dispatchers. This decision support system aimed at integrating a yard-micro simulation model with an optimization
module in order to assist yard dispatchers in making optimized dispatching decisions in real-time operations. Only the integration of
the yard-micro model and the optimization module was evaluated. OPTIYARD also introduced a conceptual integration between
the optimization yard-micro model and a network micro simulation model to forecast estimated train arrival (ETA) for future
research (Liu et al., 2016).
The rest of the literature in this area has mainly focused on the impact of arrival deviations on yard performance and
punctuality. Minbashi et al. (2020) hypothesized that the number of arriving trains in a predefined period of time before train
departures can represent congestion imposed from the network and may impact departure delays. Finding a negative correlation
between this parameter and departure delays, they concluded that the impact of arrival deviations from the network side on yard
departure delays needs to be elaborated through more complex quantification.
Dick (2021) and Dick and Nishio (2019) investigated the impact of schedule flexibility in terms of the train arrival time variability
using the YardSYM simulation model. The results showed that increasing train arrival time variability affects yard performance
by increasing the proportion of wagons missing their planned connection and the overall wagon dwell time, and also increases
the variability in volume/length of departing trains.
The effect of train arrival variability on yard performance has also been examined from different aspects in earlier work (Dong,
1997; Li et al., 2002; Khoshniyat, 2012).
The correlation between the network usage, a parameter representing the available capacity on the line, and delayed departures
was analyzed by Minbashi et al. (2021b) using historical data from two main yards in Sweden. They concluded that on a daily basis
the network usage has a different impact on delayed departures from individual yards.
Barbour et al. (2018b,a) applied machine learning algorithms for estimated train arrival (ETA) of freight trains in the US with
promising results. They suggested incorporating the status of the origin and destination yards into freight train ETA models.

2.3. Combined method developments

In recent years, there has been an interest in the application of combined and/or integrated methods in solving railway
operational planning problems. Combined approaches can be complex to implement due to inherent differences between the methods
and the difficulty of connecting models. However, combined approaches bring the advantages and strengths of different approaches
together, which may lead to promising solutions as shown in the train timetabling problem.
Lee et al. (2017) proposed a simulation-based heuristic approach to improve train timetables from efficiency and robustness
perspectives. Their approach showed how time supplement manipulation can reduce system delays. Högdahl et al. (2019) combined
optimization with micro-simulation to improve train timetable robustness. Their results showed increased socio-economic benefits
and increased punctuality.
Integrating micro and macro simulation has received more attention from researchers. Bešinović et al. (2016) proposed an
integrated micro-macro model framework to generate feasible timetables. Placido et al. (2014) combined a macro-optimization
approach and a micro-simulation one to manage operations in the occurrence of disruptions. The macro-optimization provided
the optimal timetable and rolling stock composition in terms of operating costs and passenger needs. The micro-simulation part
evaluated the feasibility of the solutions proposed by the macro-optimization.
Schlechte et al. (2011) developed an algorithmic bottom-up approach to transform a microscopic railway network to an
aggregated macroscopic network model and vice-versa to create an optimal train timetable. To sum up, upgrading macro simulation
models or integrating them with more detailed models can enhance timetable feasibility (Bešinović et al., 2016).
A conceptual application of combining a regression tree with macroscopic simulation for train traffic simulation was proposed
by Watanabe et al. (2019). In this approach, a regression tree is used to extract the rules for running times, dwell times and train
intervals per station. Then, these rules are incorporated into a macroscopic simulation model based on the longest path algorithm
to simulate the arrival and departure times from and to stations.

Fig. 2. The proposed model framework.

3. Method

3.1. The proposed model framework

We propose a novel machine learning-assisted macro simulation (von Rueden et al., 2020) model framework to connect yards
and the railway network to predict yard arrivals and departures. The purpose is to capture an important aspect of an integrated rail
system: the impact that yard performance has on departures and, ultimately, the predictability of downstream train arrivals.
Fig. 2 shows a schema of the proposed model framework, which combines machine learning and macro simulation. The approach
is data-driven, making use of the data from the infrastructure manager and the yard manager, and thereby does not require that
yard operations be modeled and simulated in detail. It is called a machine learning-assisted macro simulation approach since
machine learning provides input to the simulation.
The machine learning part is the yard prediction model where yard departure deviations are predicted through a random forest
algorithm implementation. Then, these departure deviations are incorporated into the macro simulation model as a part of the input
to simulate the running of freight trains along the line and predict arrivals to the next yard.

3.2. Yard prediction model

The yard prediction model is dependent on a regression-type prediction algorithm, in this paper represented by a random forest
algorithm, which has been indicated (Minbashi et al., 2021a; Barbour et al., 2018b) to provide better results compared to other
machine learning algorithms for freight train delay prediction purposes.
The random forest algorithm is an ensemble version of the well-known decision tree algorithm, where a set of decision trees
is trained on randomly selected subsets of the training data set. Each decision tree is composed of nodes where a decision
rule, based on a single input feature, separates the remaining data set (obtained by applying all rules from the root node) into
subsets. Decision trees thereby separate data instances with similar output values. Training is in essence performed by splitting
the remaining data set at a node by adding a new decision rule until a fixed stopping criterion is reached, for example, when
the prediction error, the number of samples, or the decrease in prediction error in the node is lower than a specified threshold.
In a random forest, the predicted output value for each terminal node in a tree is calculated from the training samples that
terminated in that node, and the ensemble prediction is the average of the predictions of the individual trees. Some trees may
give better or worse predictions than others, so averaging makes the final result of a random forest model more balanced.
Combining many weak learner regression trees in the random forest predictor has been shown to be an effective way to avoid
overfitting (Kuhn and Johnson, 2013).
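
As an illustration of the bagging-and-averaging idea described above, the following minimal Python sketch trains one-split
regression trees (stumps) on bootstrap samples of a toy data set and averages their predictions. It is only a sketch and not the
model configuration used in this paper: the function names (fit_stump, fit_forest, predict_forest), the toy data, and the
restriction to a single feature and a single split per tree are assumptions made for brevity.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean of `values`."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single feature (illustrative only)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # candidate split points
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            # Each terminal node predicts the mean of its training samples.
            best = (error, threshold, mean(left), mean(right))
    if best is None:  # all x values identical: no split possible
        overall = mean(ys)
        return lambda x: overall
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees


def predict_forest(trees, x):
    """Ensemble prediction: the average of the individual tree predictions."""
    return mean(tree(x) for tree in trees)


# Toy data (hypothetical): one predictor and a departure deviation in minutes.
xs = [1, 2, 3, 4, 10, 11, 12, 13]
ys = [0, 1, 0, 1, 20, 21, 20, 21]
forest = fit_forest(xs, ys)
print(predict_forest(forest, 2), predict_forest(forest, 12))
```

A production model would use many features and deeper trees; the sketch only shows why averaging over trees trained on different
samples smooths out the errors of any single tree.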
The random forest model in our approach is built in the KNIME analytics platform (Berthold et al., 2008). The optimal tuning
parameters for the model were a tree depth of 40 in a forest of 100 trees. We applied a common 10-fold cross-validation for the
re-sampling procedure. In the data section, the procedure of the data pre-processing and the predictor selection is described.

3.3. PROTON macro simulation

The network simulations are carried out in PROTON (Punctuality and Railway Operation Simulation) which is a macroscopic
railway traffic simulation tool developed by DB Analytics (Deutsche Bahn AG) mainly within the Shift2Rail projects PLASA and
PLASA-2. PROTON has the capability to handle large networks with a large number of trains with a reasonably short simulation
time. It was formerly known as PRISM (PLASA Railway Interaction Simulation Model).
In PROTON, the macroscopic infrastructure consists of nodes and edges. Nodes typically represent stations or other types of
operational control or time measurement points in a railway network. The edges connect nodes and have properties such as length,
number of tracks, type of train protection system and average block section length.

Fig. 3. Simulation setup showing the data sources used and a simplified description of the different steps.

Then, PROTON requires a timetable as input. The timetable specifies the sequence of nodes and the respective scheduled arrival
and departure times for each individual train along with train ID, timing load, and train category. Timing loads describe the train
configurations used during timetable construction. Technical minimum running times for all relevant timing loads, edges, and drive
modes are needed for determining any available running time allowance per edge, which can be used to reduce delays. PROTON
does not calculate running times for trains; therefore, the technical running times must be provided in order to establish the available
allowance, i.e. the difference between technical and scheduled running times.
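
The role of the running time allowance can be illustrated with a short Python sketch. It assumes, for simplicity, that a delayed
train uses the full allowance on an edge to recover and that early or on-time trains are left unchanged. The function name and the
minute-based units are hypothetical and are not taken from PROTON.

```python
def delay_after_edge(entry_delay, scheduled_run, technical_run):
    """Delay (minutes) at the end of an edge after using the allowance.

    The allowance is the scheduled running time minus the technical
    minimum running time. Assumption: a late train recovers as much as
    the allowance permits; an early or on-time train is not changed.
    """
    allowance = scheduled_run - technical_run
    if entry_delay <= 0:
        return entry_delay
    return max(0, entry_delay - allowance)


# A train entering 5 min late on an edge scheduled at 22 min with a
# technical minimum of 20 min can recover 2 min.
print(delay_after_edge(5, 22, 20))  # 3
```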
In PROTON, train conflicts are modeled based on minimum headway times, which are calculated from block occupation times
by considering the infrastructure edge block length, train travel time through the block, route set-up, and release times. When
arrival and departure times are calculated for a train, the temporal distance to the preceding train is checked. If it is lower than
the calculated minimum headway time on this edge for the specific train types, the event time for the second train is delayed
until the minimum headway condition is met. The dispatching scheme in this paper is based on train priorities, which gave the
closest results to the empirical data in the calibration phase.
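
The headway check described above can be sketched in a few lines of Python. This is a simplified illustration under stated
assumptions, not PROTON's implementation: the train order on the edge is taken as fixed (no priority-based re-sequencing), a single
minimum headway value is used for all train pairs, and the function name is hypothetical.

```python
def resolve_headway(event_times, min_headway):
    """Delay each event until it trails its predecessor by min_headway.

    event_times: planned event times (minutes) in running order on one edge.
    Returns the conflict-free event times.
    """
    resolved = []
    for t in event_times:
        if resolved and t - resolved[-1] < min_headway:
            # Too close to the preceding train: push the event back.
            t = resolved[-1] + min_headway
        resolved.append(t)
    return resolved


print(resolve_headway([0, 2, 10], 3))  # [0, 3, 10]
```

Note that a pushed-back event can in turn delay the next train, which is how secondary delays propagate along the edge.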
PROTON and its functionality are described in more detail in Zinser et al. (2019). An early use case where results from the
macroscopic tool, later to evolve into PROTON, are compared with results from the microscopic tool RailSys is also presented
in Zinser et al. (2018).

3.3.1. PROTON simulation setup


Fig. 3 shows the simulation setup in a simplified way. The macroscopic infrastructure and some train configuration parameters
are mainly based on information from the national RailSys microscopic model maintained by the Swedish Transport Administration.
Timetables are taken from TrainPlan data, and technical driving times are taken from a system called Tigris, which contains
the driving times used in TrainPlan. Delay distributions are compiled from historical data extracted from a database called LUPP.
During the calibration phase, simulations are run with different scaling on dwell and run time distributions with the aim of finding
a suitable scaling level for representing primary delays. This is explained further in Section 3.3.2. Input from the Yard Prediction
Model is not used in the calibration phase.
Once the distribution scaling level is established, freight trains which have a match between the Yard Prediction Model output
and the timetable (train ID and date) will get their initial/entry deviation value from the model instead of using a more general
distribution. Input distributions come in histogram format. Initial/entry and dwell distributions can be used directly as they come,
but run time distributions must be converted to Weibull format to be used in PROTON. The setup with initial/entry distributions
is made so that passenger trains will never be initiated before scheduled time, whereas freight trains and empty trains can also be
initiated ahead of scheduled time according to the histograms.
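
The selection of the initial/entry deviation can be sketched as follows. This is a minimal Python illustration, assuming that the
yard model output is available as a dictionary keyed by (train ID, date) and that each histogram maps a deviation in minutes to a
probability. The function and argument names are hypothetical, and clamping passenger trains at zero is one simple way to express
the rule that they are never initiated early.

```python
import random


def entry_deviation(train_id, date, is_passenger, yard_predictions, histogram, rng):
    """Return the entry deviation (minutes) for one train on one day."""
    key = (train_id, date)
    if key in yard_predictions:
        # Matched freight train: use the yard prediction model output.
        return yard_predictions[key]
    # Otherwise sample from the general empirical histogram.
    values, weights = zip(*histogram.items())
    deviation = rng.choices(values, weights=weights, k=1)[0]
    # Passenger trains are never initiated before the scheduled time.
    return max(0, deviation) if is_passenger else deviation


rng = random.Random(42)
predictions = {("4001", "2019-03-01"): 12.5}   # hypothetical model output
freight_hist = {-5: 0.2, 0: 0.5, 10: 0.3}      # hypothetical histogram
print(entry_deviation("4001", "2019-03-01", False, predictions, freight_hist, rng))  # 12.5
```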
Apart from the simplifications in modeling train interactions in the macro modeling, there are three additional limitations which
affect the outcome to varying degrees. First, crossing inside stations is not modeled; only arrival and departure events are simulated.
In order not to overestimate capacity, the simulation blocks overtaking at most of the nodes of the double-track lines.
The second limitation is that if nodes are enabled for overtakings (and meetings on single tracks), the track resources can be viewed
as infinite, i.e., PROTON does not model the number of usable tracks. A node can either allow or not allow a change in train sequence,
which means that more trains can be overtaken at a node than is possible in reality given the available track resources at this
particular node. The third limitation is that PROTON simulates in 24 h periods (from 0 to 24), which means that trains running over
midnight are cut.

Fig. 4. Map showing the railway lines (highlighted) included in the simulations.

3.3.2. PROTON calibration


PROTON requires distributions of primary delays as input. Separate distributions are used for entry, dwell, and run time delays,
and for different train types (freight, long-distance, regional, and local passenger trains), further separated by direction of travel.
These primary delay distributions are based on empirical data.
However, the empirical delay distributions contain both primary and secondary delays, with no obvious way to disentangle them
from each other. To compensate for this, we adjust the empirical distributions by scaling down the probability of being delayed
(shifting the probability mass to being on time). The idea is that only a certain proportion of delays are primary, and the rest are
secondary. This can be done straightforwardly with run and dwell time delays, but is conceptually more difficult with entry delays.
For this reason, we do not scale down entry delays in this paper.
The question is then by how much run and dwell time delays should be scaled down. This is essentially a calibration problem:
we tried a range of settings in 5% intervals from 0% to 40%, and found that a scale factor of 25% yielded a punctuality
output that best fit the empirically observed punctuality. Simply put, assuming that 25% of the empirical run and dwell time delays
are primary, with the remaining 75% being secondary delays, makes the simulation yield a realistic punctuality output of about 90%.
The size of the scale factor will of course vary from case to case. A factor of 25% implies that every minute of primary delay generates
three minutes of secondary delays (see for instance Johansson et al., 2021).
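
The scaling step and the resulting primary-to-secondary ratio can be made concrete with a small Python sketch. It assumes that an
empirical distribution is stored as a dictionary from non-negative delay in minutes to probability, with key 0 meaning on time; the
function names and the example histogram are hypothetical.

```python
def scale_delay_histogram(hist, primary_share):
    """Keep only `primary_share` of the delay probability.

    hist: maps delay in minutes (>= 0) to probability; key 0 is on time.
    The removed probability mass is shifted to the on-time bin.
    """
    scaled = {d: p * primary_share for d, p in hist.items() if d != 0}
    scaled[0] = 1.0 - sum(scaled.values())
    return scaled


def secondary_per_primary(primary_share):
    """Minutes of secondary delay implied per minute of primary delay."""
    return (1.0 - primary_share) / primary_share


empirical = {0: 0.6, 5: 0.3, 10: 0.1}           # hypothetical histogram
print(scale_delay_histogram(empirical, 0.25))   # delay bins shrink to 25%
print(secondary_per_primary(0.25))              # 3.0
```

With a primary share of 25%, the ratio is 0.75 / 0.25 = 3, which is the three minutes of secondary delay per minute of primary
delay mentioned above.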

4. Data description

4.1. Case study

We selected the stretch between two main yards of Sweden, Malmö and Hallsberg, due to the high mutual interactions between them.
Malmö yard is located in the south of Sweden, on the Scandinavian-Mediterranean Rail Freight Corridor of Europe, which makes it
an important yard from a larger European perspective. Hallsberg is the largest yard in Scandinavia.
The simulated area is shown in Fig. 4. The total distance between Malmö and Hallsberg is around 400 km. The line is double-track
except for shorter sections near Hallsberg. The Malmö-Mjölby section forms part of the Southern Main Line. Generally, the traffic on
the Southern Main Line consists of a mix of high speed, regional, local passenger, and freight trains. Scheduled maintenance activities
require track closure between Lund and Hässleholm 3–4 weekends per year. Therefore, some additional lines in that area are included
in the simulations as well (see Fig. 4).

4.2. Yard data

For the yard prediction model, we used operational data provided by the main yard operator in Sweden covering the year 2019.
This data includes three different data sets: wagon connection data, train punctuality data, and train features data. We used the
Python data analysis library pandas for data pre-processing, which is presented in the following.

Table 1
The predictors in yard prediction model.
Data set            Predictor
Wagon connection    Minimum wagon dwell time
                    Maximum wagon dwell time
                    Number of wagons per departing train
                    Number of arriving trains to a departing train
Train punctuality   Departing train number
                    Scheduled departure hour
                    Departure week-day
                    Departure month
Train features      Operated length
                    Operated weight

Wagon connection data: the main yard operator in Sweden uses a wagon booking system to allocate wagons to departing trains.
Wagons are booked to their departing trains before entering the yard. In total, the number of booked wagons for this period was
161,162. The main discrepancy observed in this data set was double bookings, which included infeasible dwell times of less than
140 min, some even being negative. Excluding these removed 10% of the data set.
Train features data: this data set gives information on the operated and planned train weight and length, yard operator train
category, infrastructure manager train category, train owner, and the station where the operative train number changes. This data set
did not contain duplicates. The main issue with this data set is that the main train number used by the yard operator and the
operative train number used by the infrastructure manager can sometimes be different. The operative train number can change at
a station without any actual operation being conducted on the train. This change in train number makes the matching of trains in the
three data sets difficult.
Train punctuality data: this data set records the time trains are ready to depart, the actual departure and arrival times of trains,
and, for some delays, the delay cause. Combining these three data sets led to a large data frame at the wagon level. Since the
purpose of our modeling is to predict train departures, we aggregated it into a data frame at the train connection level comprising
30,548 train connections. The extracted predictors from the combined data frame are shown in Table 1.
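
The cleaning and aggregation steps can be sketched as below. The paper's pre-processing was done with pandas; this illustration
uses only the Python standard library so that it is self-contained. The field names (dep_train, dep_date, arr_train, dwell_min) are
hypothetical, and grouping by departing train and date is one plausible reading of the train connection level, not a confirmed
description of the authors' procedure.

```python
from collections import defaultdict

MIN_FEASIBLE_DWELL_MIN = 140  # dwell times below this are treated as infeasible


def aggregate_connections(wagon_rows):
    """Aggregate wagon-level bookings into per-train predictors."""
    groups = defaultdict(list)
    for row in wagon_rows:
        # Drop double bookings with infeasible (too short or negative) dwell.
        if row["dwell_min"] >= MIN_FEASIBLE_DWELL_MIN:
            groups[(row["dep_train"], row["dep_date"])].append(row)

    features = {}
    for key, rows in groups.items():
        dwell = [r["dwell_min"] for r in rows]
        features[key] = {
            "min_dwell": min(dwell),
            "max_dwell": max(dwell),
            "n_wagons": len(rows),
            "n_arriving_trains": len({r["arr_train"] for r in rows}),
        }
    return features


rows = [  # hypothetical wagon bookings
    {"dep_train": "A", "dep_date": "2019-03-01", "arr_train": "X", "dwell_min": 200},
    {"dep_train": "A", "dep_date": "2019-03-01", "arr_train": "Y", "dwell_min": 500},
    {"dep_train": "A", "dep_date": "2019-03-01", "arr_train": "Y", "dwell_min": -30},
]
print(aggregate_connections(rows))
```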

4.3. Network simulation data

We used the 2019 timetable for the network simulation. The simulation is run for all days from February 4 to June 2, 2019.
The timetable for passenger trains is largely similar for all weekdays, with less traffic on Saturdays and Sundays. Freight trains have
more variation; some trains run several or all days a week and others less frequently. On the double-tracked sections, the majority of
the scheduled stops for passenger trains are commercial stops, i.e., the operators have requested the stops with associated stop times
(dwell).
Freight trains also have commercial stops, e.g., for a change of drivers and/or shunting activities. They also have a large number of
scheduled technical stops in the timetable; these are mainly necessary stops for allowing faster passenger trains to overtake. A scheduled
stop time may also be a mix of requested time for a commercial stop and added technical time. On single-tracked lines, technical
stop times usually arise to deal with train meetings. During operations, pure scheduled technical stops may be canceled and new
ones created depending on the operational situation.
We also used the operational network data from the railway line between Malmö and Hallsberg in 2019. This line spans 83
stations or timing points, along which we have almost 4.2 million train movements; 20% of these are from long-distance passenger
trains, 25% from regional, and 24% from local passenger trains, while 31% of the movements are from freight trains. The records
contain timestamps for departures and arrivals, both scheduled and actual, which we have processed to create distributions of entry,
run, and dwell time delays, separated by train category and direction of travel.

5. Results and discussion

5.1. Departure prediction accuracy

The first part of the results evaluates the yard prediction model in Table 2. Results are compared to a simple base line model,
which uses the median of the actual departure times from the historical data. The evaluation metrics are 𝑅2 , mean absolute error
(MAE), root mean square error (RMSE), and mean signed difference (MSD).
𝑅2 = 0.92 means that the model is able to explain 92% of the variance in the data. Mean absolute error (MAE) calculates the
average absolute difference between the predicted departure deviation and the actual departure deviation from the historical data.
The MAE for the random forest model is less than three minutes, whereas using the median of the historical data shows a value of
19 minutes. In terms of RMSE, the random forest model is almost 4 times better than the base line model: 11 minutes compared to
40 minutes.
The final metric is the mean signed difference (MSD): a positive value means that the model predicts higher values than observed,
and a negative value means that it predicts lower values. Our model has a positive MSD of 0.12, showing
that the departure deviations predicted by the yard model are slightly higher than the actual departure deviations. The
high accuracy in terms of 𝑅² and MAE makes the results acceptable and realistic enough to be integrated in PROTON.
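The four metrics can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `evaluation_metrics`, the sample values, and the use of the sample median as the baseline are assumptions made for the example. The sign convention for MSD (predicted minus actual) follows the description above, where a positive value means the model predicts higher than reality.

```python
import math
from statistics import median

def evaluation_metrics(actual, predicted):
    """Return R^2, MAE, RMSE and MSD for paired deviations (minutes)."""
    n = len(actual)
    # Signed error: predicted minus actual, so positive means over-prediction.
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    msd = sum(errors) / n
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "rmse": rmse, "msd": msd}

# Hypothetical departure deviations in minutes; not data from the study.
actual = [-10.0, 0.0, 5.0, 30.0, 120.0]
model_pred = [-8.0, 1.0, 4.0, 33.0, 115.0]
# Stand-in for the median baseline: one constant prediction for all trains.
baseline_pred = [median(actual)] * len(actual)

model_scores = evaluation_metrics(actual, model_pred)
baseline_scores = evaluation_metrics(actual, baseline_pred)
```

Because RMSE squares the errors before averaging, a few large misses raise it much more than they raise MAE, which is one reason both are reported side by side.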


Table 2
The evaluation metrics for yard prediction model.
Model               𝑅²     MAE (min)   RMSE (min)   MSD
Baseline (median)   –      19.0        40           –
Random forest       0.92   2.9         11           0.12

Fig. 5. The basic machine learning arrival prediction model.

Table 3
The evaluation metrics for arrival prediction accuracy.
Model                          𝑅²     MAE (min)   RMSE (min)
Timetable                      –      42          72
ML-assisted macro simulation   0.48   35          51
Basic ML                       0.19   39          56

5.2. Arrival prediction accuracy

5.2.1. Comparison of the ML-assisted macro simulation model performance to basic models
The arrival prediction accuracy of the ML-assisted macro simulation approach is compared to two basic
alternatives: the timetable and a basic machine learning model (Basic ML). According to the timetable, all deviations should be zero.
The Basic ML model uses the same concept and predictors as the yard departure prediction model to predict the
arrival deviation at the next yard. We call it basic because it does not consider any parameter related to the running of the freight trains
on the line (see Fig. 5). For all three approaches, the estimated arrival deviations are deviations from the scheduled timetable.
𝑅², MAE, and RMSE are computed from the difference between the predicted arrival deviation
and the actual arrival deviation in the historical data for the same period.
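Treating the timetable as a predictor means predicting a deviation of zero for every train, so its errors are simply the observed deviations. The sketch below illustrates this under that assumption; the function name `timetable_errors` and the sample values are hypothetical and not taken from the study.

```python
import math

def timetable_errors(actual_deviations):
    """MAE and RMSE when every predicted deviation is zero (the timetable)."""
    n = len(actual_deviations)
    # With a zero prediction, the error magnitude equals the deviation itself.
    mae = sum(abs(a) for a in actual_deviations) / n
    rmse = math.sqrt(sum(a * a for a in actual_deviations) / n)
    return mae, rmse

# Hypothetical arrival deviations in minutes (negative = early arrival).
arrival_deviations = [-60.0, -15.0, 0.0, 20.0, 90.0]
mae, rmse = timetable_errors(arrival_deviations)
```

Under this view, early and late arrivals both count against the timetable, since both are deviations from the scheduled time.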
The evaluation metrics are presented in Table 3. The timetable gives an MAE of 42 minutes and an RMSE of 72 minutes, whereas
the combined approach performs better on both metrics: an MAE of 35 minutes and an RMSE of 51 minutes. For the
Basic ML model, both metrics (an MAE of 39 minutes and an RMSE of 56 minutes) are close to those of the ML-assisted macro
simulation approach, since it uses the same concept as the
yard departure prediction, and there is a high correlation between the departure deviations from Malmö and the arrival deviations at
Hallsberg (see Fig. 6).
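The strength of a linear relationship such as the one between departure and arrival deviations can be quantified with the Pearson correlation coefficient. The following is a minimal sketch; the paper does not state which correlation measure underlies Fig. 6, so the choice of Pearson's coefficient, the function name `pearson`, and the sample values are assumptions for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

# Hypothetical deviations in minutes for the same five trains.
departure = [-30.0, -5.0, 0.0, 15.0, 60.0]   # at the origin yard
arrival = [-45.0, -10.0, 5.0, 10.0, 80.0]    # at the destination yard
r = pearson(departure, arrival)
```

A coefficient close to 1 would be consistent with a model based only on origin-yard data still reaching an MAE near that of the combined approach.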
However, with 𝑅² = 0.19, the Basic ML approach captures little of the variation in the data. This is not surprising, since the
Basic ML approach considers only operational data from the origin yard and therefore misses operational
predictors that are implicitly present in the ML-assisted macro simulation approach. There may therefore be potential to improve
yard arrival predictions with machine learning-based approaches if they are combined with delay propagation models and include
destination yard operational parameters (Barbour et al., 2018a).
Although the MAE values are rather close for all three models, it should be noted that the three approaches are inherently
different and are implemented on data sets of different sizes. For example, the Basic ML model is implemented on trains between Malmö
and Hallsberg, whereas the yard departure prediction is implemented on all departures from Malmö. This is due to limited access
to data. Therefore, a fully fair comparison between the three models may not be possible.

5.2.2. The ML-assisted macro simulation model performance in predicting the range of deviations
To examine the model performance in predicting the range of deviations, we plot the distribution of the arrival deviations
in Fig. 7.
The simulation assigns higher probabilities to small deviations than the actual arrival deviations
show. For larger delays, up to 300 minutes, the simulation underestimates the frequency of occurrence. In addition, the simulation
results are more concentrated around the center and tend to predict that trains arrive on time. This means that the simulation does not
fully capture the early arrivals of trains (Johansson et al., 2021).


Fig. 6. The correlation between departure deviations from Malmö and arrival deviations to Hallsberg.

Fig. 7. The distribution of the arrival deviations.

A closer look at the distribution of the prediction errors in Fig. 8 shows that the model performs better in predicting smaller
deviations, as small prediction errors have the highest probability. However, the highest probabilities mostly occur at negative prediction
errors, which indicates that our model framework predicts lower deviations than observed. The probability of larger prediction errors is
lower, which may also be due to the fewer occurrences of larger deviations in general.
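A distribution of prediction errors like the one described here can be built by binning the signed errors and normalizing the counts. The sketch below is an illustration under assumptions: the 30-minute bin width, the function name `error_histogram`, and the sample values are hypothetical, and the error is taken as predicted minus actual so that negative values mean under-prediction.

```python
from collections import Counter

def error_histogram(actual, predicted, bin_width=30):
    """Relative frequency of signed prediction errors per bin (left bin edge as key)."""
    errors = [p - a for a, p in zip(actual, predicted)]
    # Floor division maps each error to the left edge of its bin.
    bins = Counter(int(e // bin_width) * bin_width for e in errors)
    n = len(errors)
    return {left: count / n for left, count in sorted(bins.items())}

# Hypothetical arrival deviations in minutes; not data from the study.
actual = [10.0, 50.0, -20.0, 200.0, 0.0]
predicted = [5.0, 20.0, -15.0, 90.0, 0.0]

histogram = error_histogram(actual, predicted)
negative_share = sum(1 for a, p in zip(actual, predicted) if p - a < 0) / len(actual)
```

In this toy sample, most of the probability mass sits in the bins below zero, which is the pattern the text describes as predicting less than reality.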
A box plot of the most frequent trains in the simulation shows the spread of the actual arrival deviations and the
simulated ones (Fig. 9). Showing the spread of deviations per train is interesting from an operational perspective
because it tells operators what level of deviation to expect, and react to, for each train. In general,
the spread of deviations in the simulation is smaller than the spread of deviations in reality. Our prediction
error range is generally less than 100 minutes on both sides, except for train number three. We also see that capturing large deviations is
difficult for the model; there is one instance, on train number one, where a large delay is estimated by the model. For some
trains, such as trains number two and five, the model captures the range better.
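The comparison of spreads per train can be made concrete with the statistics a box plot displays. The following sketch uses Python's standard `statistics.quantiles`; the function name `box_stats`, the choice of the inclusive quartile method, and the sample values are assumptions for illustration and do not come from the study.

```python
from statistics import quantiles

def box_stats(values):
    """Quartiles, interquartile range, and full range of deviations (minutes)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": q3 - q1,
        "range": max(values) - min(values),
    }

# Hypothetical arrival deviations in minutes for one train number.
actual = [-40.0, -10.0, 0.0, 20.0, 90.0]
simulated = [-15.0, -5.0, 0.0, 5.0, 25.0]

actual_stats = box_stats(actual)
simulated_stats = box_stats(simulated)
```

A simulated interquartile range and full range that are both narrower than the observed ones correspond to the smaller spread seen for the simulation.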
To sum up, three possible explanations for the underestimation of large deviations and the performance of the current setup are:

1. In reality, train delays are often correlated: a train that is delayed at one point is more prone to be delayed again
later, and some trains are systematically more likely to suffer from delays than others. The current version of the
simulation model does not account for any such correlations or systematic delays.
2. Even with a large number of simulation cycles, the simulation period is limited to four months due to data storage limitations. Longer
simulation periods may better capture the variation in the deviations and lead to a wider range of predicted
arrival times. By longer periods, we mean both more months and more hours, since the current version can only
simulate operational traffic days as calendar days of 24 h, which cuts off trains running past midnight.


Fig. 8. The distribution of the prediction error.

Fig. 9. The box plot of the most frequent trains.

3. Although small, the current departure prediction errors have a slight bias towards predicting higher than reality (an MAE of about
three minutes and a positive MSD), which may slightly affect the final arrival predictions.
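The first explanation can be illustrated with a standard result on sums of correlated random variables: if a train accumulates delay over several sections, positive correlation between the section delays widens the spread of the total. The sketch below is not part of the simulation model described in the paper; the equal pairwise correlation, the function name `total_delay_std`, and all parameter values are simplifying assumptions.

```python
import math

def total_delay_std(sigma, n_sections, rho):
    """Standard deviation of the sum of n equally correlated section delays.

    Uses Var(sum) = n * sigma^2 * (1 + (n - 1) * rho), which holds when all
    sections share the standard deviation sigma and pairwise correlation rho.
    """
    return sigma * math.sqrt(n_sections * (1.0 + (n_sections - 1) * rho))

# Hypothetical values: 10 sections, 5 minutes standard deviation per section.
independent = total_delay_std(sigma=5.0, n_sections=10, rho=0.0)
correlated = total_delay_std(sigma=5.0, n_sections=10, rho=0.3)
```

With these assumed values, a moderate correlation roughly doubles the spread of the accumulated delay, so sampling section delays independently would be expected to produce a narrower range of arrival deviations.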

6. Conclusion

In this paper, we proposed the first application of a machine learning-assisted macro simulation framework to increase
the predictability of yard departures and arrivals. The novelty of this approach is that the macro simulation uses the results of a
machine learning-based yard model to initiate freight trains instead of the previously used aggregate distributions. The yard
model is implemented with a random forest algorithm and provides efficient departure predictions without considering detailed
yard operations.
We showcased the potential of the framework on departures and arrivals between two main yards in Sweden (Malmö and
Hallsberg) for a period of four months. The current model setup captures 48% of the variation in the actual arrival deviations, compared
to a basic ML model, which captures only 19%. It has an MAE of 35 min, compared to 39 min for the benchmark
ML model. However, the macro simulation does not yet fully capture large deviations or the bias towards early arrivals. Future
improvements in these respects can decrease the prediction error and pave the way for real-time applications.


This modeling framework connects yards in sequence through the network, and increases the predictability of yard arrivals.
These results can provide improved flexibility for yard practitioners, and assist them in better utilization of yard capacity, ad-hoc
re-planning of yard operations, and necessary wagon re-bookings.
Based on the findings in this area, the following steps are promising for future research: (1) Improving the macro simulation
model by incorporating larger distributions that handle correlations between primary delays. (2) Analyzing the results from a temporal
perspective to find potential patterns in the prediction errors. (3) Running the simulations over longer periods, in both months and hours,
for improved generalization. (4) Implementing a machine learning approach that uses the origin and destination yard data to connect
the departures from one yard and the arrivals at the next yard.

CRediT authorship contribution statement

Niloofar Minbashi: Conceptualization, Methodology, Software, Validation, Formal analysis, Data curation, Investigation, Visualization,
Writing – original draft, Writing – review & editing. Hans Sipilä: Methodology, Software, Validation, Writing – original
draft, Writing – review & editing. Carl-William Palmqvist: Data curation, Writing – original draft, Writing – review & editing,
Supervision. Markus Bohlin: Methodology, Review & editing, Supervision. Behzad Kordnejad: Review & editing, Supervision,
Funding acquisition, Project administration.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared
to influence the work reported in this paper.

Acknowledgments

We would like to thank Magnus Wahlborg and Fredrik Lundström from the Swedish Transport Administration (Trafikverket) for
their support and availability during the time of conducting this research. We also thank Jonatan Gjerdrum from Green Cargo for
support with the data collection and sharing his operational expertise. This study has received funding from the Shift2Rail Joint
Undertaking (JU) under grant agreement No. 881778. The JU receives support from the European Union’s Horizon 2020 research
and innovation programme and the Shift2Rail JU members other than the Union. Additional funding has been received from the
Swedish Transport Administration (Trafikverket) and KAJT (Capacity in the Railway Traffic System) via the PRATA project.

References

Barbour, W., Martinez Mori, J.C., Kuppa, S., Work, D.B., 2018a. Prediction of arrival times of freight traffic on US railroads using support vector regression.
Transp. Res. C 93, 211–227. http://dx.doi.org/10.1016/j.trc.2018.05.019.
Barbour, W., Samal, C., Kuppa, S., Dubey, A., Work, D.B., 2018b. On the data-driven prediction of arrival times for freight trains on U.S. railroads. In: 21st
International Conference on Intelligent Transportation Systems. ITSC, Maui, Hawaii, USA, http://dx.doi.org/10.1109/ITSC.2018.8569406.
Belošević, I., Ivić, M., Kosijer, M., Pavlović, N., Aćimović, S., 2015. Challenges in the railway yards layout designing regarding the implementation of intermodal
technologies. In: 2nd Logistics International Conference. Belgrade, Serbia.
Berthold, M.R., Cebron, N., Dill, F., Gabriel, T.R., Kötter, T., Meinl, T., Ohl, P., Sieb, C., Thiel, K., Wiswedel, B., 2008. KNIME: the Konstanz information miner.
Stud. Classification Data Anal. Knowledge Organ. 319–326, https://link.springer.com/chapter/10.1007/978-3-540-78246-9_38.
Bešinović, N., Goverde, R.M., Quaglietta, E., Roberti, R., 2016. An integrated micro–macro approach to robust railway timetabling. Transp. Res. B 87, 14–32.
http://dx.doi.org/10.1016/J.TRB.2016.02.004.
Bohlin, M., Hansmann, R., Zimmermann, U.T., 2018. Optimization of Railway Freight Shunting. In: International Series in Operations Research and Management
Science, vol. 268, pp. 181–212. http://dx.doi.org/10.1007/978-3-319-72153-8_9.
Boysen, N., Fliedner, M., Jaehn, F., Pesch, E., 2012. Shunting yard operations: Theoretical aspects and applications. European J. Oper. Res. 220 (1), 1–14.
http://dx.doi.org/10.1016/j.ejor.2012.01.043.
Deleplanque, S., Hosteins, P., Pellegrini, P., Rodriguez, J., 2022. Train management in freight shunting yards: Formalisation and literature review. IET Intell.
Transp. Syst. http://dx.doi.org/10.1049/itr2.12216, URL: https://onlinelibrary.wiley.com/doi/10.1049/itr2.12216.
Dick, C.T., 2021. Influence of mainline schedule flexibility and volume variability on railway classification yard performance. J. Rail Transp. Plan. Manag. 20,
http://dx.doi.org/10.1016/J.JRTPM.2021.100269, URL: https://linkinghub.elsevier.com/retrieve/pii/S2210970621000342.
Dick, C.T., Nishio, N., 2019. Influence of mainline schedule flexibility and volume variability on railway classification yard performance. In: Peterson, A.,
Joborn, M., Bohlin, M. (Eds.), RailNorrköping 2019. 8th International Conference on Railway Operations Modelling and Analysis. ICROMA, Linköping
University Electronic Press, Linköpings universitet, Norrköping, pp. 406–425.
Dong, Y., 1997. Modeling Rail Freight Operations Under Different Operating Strategies (Ph.D. thesis). Massachusetts Institute of Technology, p. 251.
Högdahl, J., Bohlin, M., Fröidh, O., 2019. A combined simulation-optimization approach for minimizing travel time and delays in railway timetables. Transp.
Res. B 126, 192–212. http://dx.doi.org/10.1016/j.trb.2019.04.003.
Jaehn, F., Michaelis, S., 2016. Shunting of trains in succeeding yards. Comput. Ind. Eng. 102, 1–9. http://dx.doi.org/10.1016/j.cie.2016.10.006.
Jaehn, F., Rieder, J., Wiehl, A., 2015a. Minimizing delays in a shunting yard. OR Spectrum 37 (2), 407–429. http://dx.doi.org/10.1007/s00291-015-0391-1.
Jaehn, F., Rieder, J., Wiehl, A., 2015b. Single-stage shunting minimizing weighted departure times. Omega (United Kingdom) 52, 133–141.
http://dx.doi.org/10.1016/j.omega.2014.11.001.
Johansson, I., Palmqvist, C.-W., Sipilä, H., Warg, J., 2021. Microscopic and macroscopic simulation of early freight train departures. J. Rail Transp. Plan. Manag.
21 (November 2021), 100295. http://dx.doi.org/10.1016/j.jrtpm.2022.100295.
Khoshniyat, F., 2012. Simulation of Planning Strategies for Track Allocation At Marshalling Yards (Master Thesis). KTH Royal Institute of Technology.
Kraft, E.R., 2000. A hump sequencing algorithm for real time management of train connection reliability. J. Transp. Res. Forum 39 (4),
URL: https://trid.trb.org/view/668945.
Krüger, N.A., Vierth, I., Roudsari, F.F., 2013. Spatial, Temporal and Size Distribution of Freight Train Delays: Evidence from Sweden. Centre of Transport Studies,
Stockholm, URL: http://www.diva-portal.org/smash/get/diva2:1157840/FULLTEXT01.pdf.


Kuhn, M., Johnson, K., 2013. Applied Predictive Modeling. Springer Science and Business Media LLC, pp. 1–600. http://dx.doi.org/10.1007/978-1-4614-6849-3.
Lee, Y., Lu, L.S., Wu, M.L., Lin, D.Y., 2017. Balance of efficiency and robustness in passenger railway timetables. Transp. Res. B 97, 142–156.
http://dx.doi.org/10.1016/j.trb.2016.12.004.
Li, H.L., Zhang, C., Zhou, Z., Yang, Z., Xie, H., 2002. Computation of capacity at marshalling yard under imbalanced transportation circumstance. Traffic Transp.
Stud. 538–543.
Licciardello, R., Adamko, N., Deleplanque, S., Hosteins, P., Liu, R., Pellegrini, P., Peterson, A., Wahlborg, M., Zatko, M., 2020. Integrating Yards, network and
optimisation models towards real-time rail freight yard operations. Scienza E Tecnica 417–440.
Liu, R., Ye, H., Whiteing, T., 2016. DITTO Project Deliverable 3.2 Milestone 7 Simulation and Control of ERTMS Level 2. Technical Report, University of Leeds.
Minbashi, N., 2020. Applying Data Analytics to Freight Train Delays in Shunting Yards. KTH Royal Institute of Technology, Stockholm, Sweden, p. 33, URL:
https://www.diva-portal.org/smash/record.jsf?pid=diva2:14853783.
Minbashi, N., Bohlin, M., Kordnejad, B., 2020. A departure delay estimation model for freight trains. In: Lusikka, T. (Ed.), Proceedings of TRA2020, the 8th
Transport Research Arena 2020: Rethinking Transport – Towards Clean and Inclusive Mobility. Helsinki, URL:
https://www.researchgate.net/publication/339900221_A_Departure_Delay_Estimation_Model_for_Freight_Trains.
Minbashi, N., Bohlin, M., Palmqvist, C.W., Kordnejad, B., 2021a. The application of tree-based algorithms on classifying Shunting Yard departure status. J. Adv.
Transp. 2021, http://dx.doi.org/10.1155/2021/3538462.
Minbashi, N., Palmqvist, C.-W., Bohlin, M., Kordnejad, B., 2021b. Statistical analysis of departure deviations from Shunting Yards: Case study from
Swedish railways. J. Rail Transp. Plan. Manag. 18, http://dx.doi.org/10.1016/j.jrtpm.2021.100248, URL:
https://linkinghub.elsevier.com/retrieve/pii/S2210970621000159.
Palmqvist, C.-W., Lind, A., Ahlqvist, V., 2022. How and why freight trains deviate from the timetable: Evidence from Sweden. IEEE Open J. Intell. Transp. Syst.
3, 210–221. http://dx.doi.org/10.1109/OJITS.2022.3160546.
Placido, A., Cadarso, L., D’Acierno, L., 2014. Benefits of a combined micro-macro approach for managing rail systems in case of disruptions. Transp. Res. Procedia
3, 195–204. http://dx.doi.org/10.1016/J.TRPRO.2014.10.105.
Rail Freight Forward, 2018. 30 by 2030 Rail Freight Strategy to Boost Modal Shift. Technical Report, Coalition of European Rail Freight Companies.
Schlechte, T., Borndörfer, R., Erol, B., Graffagnino, T., Swarat, E., 2011. Micro–macro transformation of railway networks. J. Rail Transp. Plan. Manag. 1 (1),
38–48. http://dx.doi.org/10.1016/J.JRTPM.2011.09.001.
von Rueden, L., Mayer, S., Sifa, R., Bauckhage, C., Garcke, J., 2020. Combining machine learning and simulation to a hybrid modelling approach: Current and
future directions. In: Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),
vol. 12080 LNCS, Springer, pp. 548–560, URL: https://link.springer.com/chapter/10.1007/978-3-030-44584-3_43.
Wang, R., Work, D.B., 2015. Data driven approaches for passenger train delay estimation. In: IEEE Conference on Intelligent Transportation Systems, Proceedings,
ITSC. 2015-Octob. Institute of Electrical and Electronics Engineers Inc. pp. 535–540. http://dx.doi.org/10.1109/ITSC.2015.94.
Watanabe, S., Mori, Y., Takatori, Y., Yonemoto, K., Tomii, N., 2019. Train traffic simulation algorithm based on historical train traffic records. Comput. Railw.
XVI http://dx.doi.org/10.2495/CR180261, URL: www.witpress.com.
Zinser, M., Betz, T., Becker, M., Geilke, M., Terschlüsen, C., Kaluza, A., Johansson, I., Warg, J., 2019. PRISM: A macroscopic Monte Carlo railway simulation.
In: Proceedings of 12th World Congress on Railway Research. WCRR, Tokyo.
Zinser, M., Betz, T., Warg, J., Solinen, E., Bohlin, M., 2018. Comparison of microscopic and macroscopic approaches to simulating the effects of infrastructure
disruptions on railway networks. In: Proceedings of 7th Transport Research Arena TRA.
