Vehicle trajectory dataset
A R T I C L E  I N F O

Keywords:
Vehicle trajectory dataset
Traffic flow
Traffic safety
Computer vision

A B S T R A C T

Vehicle trajectory data have become essential for many research fields, such as traffic flow, traffic safety, and automated driving. To make trajectory data useable for researchers, an overview of the included road section and traffic situation as well as a description of the data processing methodology is necessary. In this paper, we present a trajectory dataset from a German highway with two lanes per direction, an off-ramp and congested traffic in one direction, and an on-ramp in the other direction. The dataset contains 8,648 trajectories and covers 87 min and an ~1,200 m long section of the road. The trajectories were extracted from drone videos using a posttrained YOLOv5 object detection model and projected onto the road surface via three-dimensional (3D) camera calibration. The postprocessing methodology can compensate for most false detections and yield accurate speeds and accelerations. The trajectory data are also compared with induction loop data and vehicle-based smartphone sensor data to evaluate the plausibility and quality of the trajectory data. The deviations of the speeds and accelerations are estimated at 0.45 m/s and 0.3 m/s², respectively. We also present some applications of the data, including traffic flow analysis and accident risk analysis.
1. Introduction

Vehicle trajectory data or microscopic traffic data are highly valuable for a wide range of applications. The oldest application, starting with the famous Greenshields' study (Greenshields et al., 1935), was the analysis of traffic flow and the development and validation of traffic flow models. Treiterer (1975) was among the first researchers to collect vehicle trajectories using aerial images. The second application emerged in the 1980s with the Swedish Traffic Conflict Technique, which uses vehicle trajectories to evaluate traffic safety based on surrogate safety measures (SSMs), e.g., time to collision (TTC) (Hyden and Linderholm, 1984). While vehicle movements were analyzed manually in the beginning, image processing methods helped to automatically compute SSMs and identify traffic conflicts (Messelodi and Modena, 2005). The most recent application is the development of automated driving systems, which rely on naturalistic driving data to ensure that these systems interact safely with human drivers in every possible situation (Roesener et al., 2017). All these applications benefit from a wide variety of trajectory datasets with different road characteristics and traffic situations.

With the increasing number of applications, progress in computer vision technology has led to a growing number of trajectory datasets. The interest in publicly available datasets, such as the NGSIM dataset (U.S. Federal Highway Administration (FHWA), 2006), has shown that these datasets can be used for more than the application they were created for. Most datasets contain only limited information on the data collection methodology and quality. Commonly used metrics in object detection, such as the mean average precision (mAP), represent only the quality of the position measurements, while most applications require velocity and acceleration. With appropriate postprocessing of the raw trajectory data, accurate velocities and accelerations can be achieved even with noisy position measurements. However, there is no standardized metric for evaluating the quality of processed trajectories. The trajectory dataset presented in this paper is compared with data from an induction loop and data from a smartphone sensor that was present in one of the vehicles included in the trajectory data. By comparing the speeds and accelerations, we can estimate the accuracy of the speed and acceleration values.

Although the purpose of publishing trajectory datasets is to make them useable for other researchers, it is often difficult for them to assess whether a dataset contains a suitable road section and suitable traffic scenarios for their research question. We therefore conduct a traffic flow analysis with time–space diagrams, fundamental diagrams, and time series of flow, density, and mean speed.
* Corresponding author.
E-mail address: [email protected] (M. Berghaus).
https://doi.org/10.1016/j.commtr.2024.100133
Received 26 January 2024; Received in revised form 9 April 2024; Accepted 9 April 2024
2772-4247/© 2024 The Authors. Published by Elsevier Ltd on behalf of Tsinghua University Press. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
M. Berghaus et al. Communications in Transportation Research 4 (2024) 100133
The trajectory dataset presented in this paper originates from a German highway with two lanes per direction, an off-ramp and congested traffic in one direction, and an on-ramp in the other direction. Based on the information on data quality, road characteristics, and traffic situations provided in this paper, researchers can evaluate whether the dataset meets their requirements.

The remainder of this paper is structured as follows. In Section 2, we provide an overview of other publicly available trajectory datasets. In Section 3, we present the data collection and data processing methodology. Section 4 contains the description of the data and an evaluation of the data quality. In Section 5, we present two possible applications of the dataset.

2. Related work

2.1. Vehicle trajectory datasets

When searching for comparable datasets, the high requirements in the area of traffic safety analysis and the complex boundary conditions that allow an evaluation of the infrastructure should not be neglected. The investigated datasets are compared and analyzed in this section.

The first category of datasets is generated from infrastructure-based sensors (Creß et al., 2022; U.S. FHWA, 2006; Wang et al., 2023). Here, the focus does not lie in the generation of particularly diverse data. Rather, with this kind of data generation, large amounts of data can be collected easily, e.g., for training vehicle behavior models. As the first large dataset of its kind, NGSIM (U.S. FHWA, 2006) demonstrated the possibility of microscopic traffic data collection by placing cameras on buildings, understandably with a small amount of data and noncomparable accuracy. The dataset presented in Creß et al. (2022) comes from a large test site on the A9 highway in Germany, which covers a 3 km stretch of highway captured with lidar and cameras. A similar approach using radar was taken by Wang et al. (2023), where a dataset covering several kilometers of highway in China was created. All three datasets and their corresponding methodologies show promising results but are too complex and expensive to be deployed on many different routes with little lead time and at low cost.

The remaining datasets used a similar approach as in this work: flying drones at a very high altitude either above or directly next to the highway. For example, Krajewski et al. (2018) acquired ~16.5 h of trajectory data from straight highway sections, and Moers et al. (2022) supplemented these data with on-ramp and off-ramp trajectories. The AUTOMATUM DATA dataset also provides a good basis for many applications with 30 h of material (Spannaus et al., 2021). The MAGIC dataset focuses on using a large number of drones simultaneously and uses a commercial service to extract the trajectories, which are considerably longer than those of any other dataset because of the large number of drones (Ma et al., 2022).

Although all of these works are very relevant and the data can be considered for many use cases, there is still room for improvement. These studies generally rely on a road surface consisting of a single flat area with no change in elevation. However, this approach works only for simple road geometries. The more complex the road infrastructure is, the larger the resulting errors will be. In contrast, this work uses three-dimensional (3D) models of the infrastructure, which allows these inaccuracies to be avoided.

2.2. Trajectory extraction methods

The generation of trajectory data has been a research topic for many years, such that a whole series of works is available. Various sensors and evaluation algorithms are used, often depending on the use case and users. For example, several laser scanners have been fused to obtain trajectory data (Zhao et al., 2009), and several lidar sensors have been used to capture trajectories at intersections (Kloeker et al., 2020). Furthermore, camera systems installed in the infrastructure can be used for recording trajectories. For example, Clausse et al. (2019) presented a general framework with a method based on Mask R-CNN for the extraction of 3D trajectories from traffic cameras.

However, even when focusing on the acquisition of trajectories from drone videos, a multitude of works have attempted different methods with varying success. Azevedo et al. (2014) recognized the possibilities of a drone-based approach early on and extracted vehicle trajectories through background subtraction and a k-shortest disjoint path algorithm for tracking, at a speed of two frames per second and without classification. Apeltauer et al. (2015) used an AdaBoost classifier with multiblock local binary patterns (MB-LBP) for detection, followed by a sequential Monte Carlo method to track vehicles through an intersection. Khan et al. (2017) used the Lucas–Kanade optical flow algorithm in combination with background extraction to detect and track vehicles. Zhao and Li (2019) used Mask R-CNN for detection combined with a semiautomatic extraction of different lanes but lacked a description of the method for camera calibration. Kim et al. (2019) compared the results of the aggregated channel feature (ACF) and Faster R-CNN methods for tracking vehicles in congested traffic situations and achieved promising results despite some problems with false positive detections. Ahmadi and Mohammadzadeh (2017) extracted vehicle trajectories from spaceborne videos using background subtraction, with a much larger field of view but with no classification and a lower precision due to the lower resolution of the videos. Masouleh and Shah-Hosseini (2019) developed a new algorithm for the semantic segmentation of vehicles from UAV-based thermal infrared imagery using a Gaussian–Bernoulli restricted Boltzmann machine (GB-RBM), with improvements over other semantic segmentation networks. Feng et al. (2020) included pedestrians and cyclists in their YOLOv3-based approach, which can therefore be used to record scenes on urban roads with all traffic participants, with an accuracy of approximately 92% for motor vehicles. Shi et al. (2021) used videos from multiple helicopters to capture a larger section of a highway; their YOLOv3-based trajectory extraction simultaneously detected lane markings and calculated vehicle motion characteristics. Yeom and Nam (2021) approached the detection problem through the difference between two consecutive images of a driving vehicle and tracked the detected vehicles using a Kalman filter, although they applied their method to only 22 vehicles in the recorded videos.

Furthermore, there are methods already listed in Section 2.1 that have proven their potential with the publication of large datasets. As one of the first applications of neural networks for the extraction of vehicle trajectories from drone videos, Krajewski et al. (2018) used U-Net for the detection of vehicles and further started the extraction of parameters of interest for the automotive industry, such as maneuver classification. Using the same method, trajectories from intersections and exits were recorded in subsequent years (Krajewski et al., 2018; Moers et al., 2022).

Since the number of works on this topic is too large to cite them all here, the interested reader can refer to two reviews. In their review paper on drone-based road traffic monitoring systems, Bisio et al. (2022) presented additional approaches and datasets and compared different detection and tracking algorithms. Unfortunately, there is no mention of the different calibration methods used in the reviewed papers. Butila and Boboc (2022) provide a systematic overview of 34 works on drone-based trajectory extraction, categorizing the works according to aim and feasibility.

3. Methods

In this section, we describe the workflow used to obtain the trajectory data presented in this paper. Section 3.1 contains all relevant information on the study area and the collection of the videos. Section 3.2 describes how the trajectories are extracted from the drone videos, including video stabilization, camera calibration, and vehicle detection. Section 3.3 contains the necessary steps of data processing to achieve a high-quality
trajectory dataset.

3.1. Study area and material collection

The dataset presented in this paper was collected at the highway A43 near Münster, Germany. The highway has two lanes in each direction, with a two-lane off-ramp in direction 1 (west to east) and a one-lane on-ramp in direction 2 (east to west). The dataset covers an ~1,200 m long stretch of road during the morning peak hour (7:11 to 8:38) on September 6, 2021. Direction 1 is partly congested during this time due to a nearby on-ramp. The data were collected at two locations using DJI Mavic Pro drones. Two drones per location were used alternately due to limited battery capacity. As a result, there are some temporal overlaps and gaps in the videos, which must be considered during data processing. The drones flew 500 m above ground, and each covered more than 600 m of the road with a spatial overlap of ~50 m. The videos were recorded in 4K resolution (3,840 × 2,160) at 25 frames per second, which corresponds to a vehicle size of approximately 30 × 12 pixels. Fig. 1 shows two aerial views of the filmed road section.

To check the plausibility of the vehicle trajectories derived from drone data, additional measurements were taken with a vehicle that was traveling on the highway section during the time of video recording. A smartphone's GPS sensor and inertial measurement unit captured the vehicle's position, speed, and acceleration at frequencies of 1 Hz (position and speed) and 100 Hz (acceleration), respectively. The smartphone was mounted on a flat surface approximately at the center of the vehicle to ensure that it remained aligned with the vehicle coordinate system at all times. The data were collected with an iPhone X and the app PhyPhox, which allows the recording of raw sensor data (Staacks et al., 2018). To ensure that the vehicle was easily visible in the video images, an orange van (VW Transporter) was used. This ensures that the drone data can be correctly assigned to this vehicle and that the drone and vehicle data can be compared correctly.

3.2. Trajectory extraction

3.2.1. Stabilization
The first step in evaluating drone images is usually video stabilization. Since the calibration step explained in the next section can only be performed on the first frame of each video, each subsequent frame in the video is transformed to match this first frame. The stabilization is performed with a standard pipeline using the feature detection of Shi and Tomasi (1994), the Lucas–Kanade feature tracking method presented in Bouguet (1999), and the computation of a homography via RANSAC. If the computation of the homography is successful, the current image can be warped accordingly. In the case of particularly abrupt movements of the drone, or of continuous displacement of the image and thus too large a deviation, stabilization is interrupted and the video is split, removing up to 2 s of video to eliminate any blurred images. Splitting of the video is initiated in the event that less than 5% of the found features can be recovered by the homography.

3.2.2. Calibration
Calibration of the video corresponds to the process of finding a transformation matrix for converting two-dimensional (2D) image coordinates to 3D world coordinates. The intrinsic camera parameters (focal length and distortion) of the drone were computed in preparation for the evaluation and used in the extrinsic calibration process.

For the extrinsic calibration, the first step is to create a 3D model of the road markings using publicly available geodata (georeferenced orthophotos and an elevation model from laser scanner data) from North Rhine–Westphalia (GDI NRW, 2024) (Fig. 2). Since the resulting model does not show a single straight road surface as in other works but rather a more realistic 3D surface, it is approximated with a triangulation. Instead of calculating a single transformation for the entire road surface, as is usually the case, each triangulation surface receives a separate transformation. With the 3D model, reference points can be captured, and the different transformations can be computed in a final step.

3.2.3. Detection and tracking
The detection of vehicles in single frames is performed with a posttrained model based on YOLOv5 (Jocher et al., 2022). For this purpose, several minutes of video footage were labeled by hand and used via transfer learning to adapt an existing model. The trained model provides excellent results in detecting vehicles ([email protected] of 95%). A slight tendency toward false-positive results can be remedied in the future via further training; in our case, false detections are reliably eliminated by tracking and postprocessing of the trajectories. Accordingly, in the present work, not single detections but the complete trajectories are the criterion of data quality (see Section 4). These trajectories are created in the first step by matching the positions and the driving direction. In all further steps, the distance between two detections is no longer used; instead, we use the distance between the current detection and the position predicted on the basis of the speed recorded thus far. The algorithm of Munkres (1957) is used for matching. Even if a vehicle is covered by a bridge or gantry for a short period, this algorithm can track it in most cases. Fig. 3 shows a snapshot of a drone video with the vehicle detections reprojected into the image.

3.3. Data processing

Data processing ensures that the vehicle trajectories are plausible and that the trajectories extracted from each video are combined into a single dataset.

The first step is to convert the vehicle positions from world coordinates to road coordinates, i.e., a coordinate system in which the x-axis lies on the right edge of the rightmost lane and the y-axis is orthogonal to the x-axis. Thus, the x value represents the distance traveled by a vehicle, and the y value represents the distance to the right edge of the road, which can be used to determine the lane in which the vehicle drives. This coordinate system is convenient for most applications involving interurban roads, where all vehicles drive in the same direction and mostly only the velocities and accelerations in the driving direction are relevant. The road markings were extracted as polygonal chains from georeferenced orthophotos (see Section 3.2) and then smoothed.

The road coordinates are also useful for checking the plausibility of the data. It can be assumed that vehicles only drive forward, i.e., the velocity in the x-direction must not be smaller than zero. It can also be assumed that the velocity has an upper bound. Therefore, the difference between two subsequent x values (Δx) is a criterion for plausibility. The
Fig. 1. Aerial views of the filmed road section. (a) Western region and (b) eastern region.

Fig. 2. 3D street model of the recorded highway segment, color-coded according to the z-axis.
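The Munkres assignment used in Section 3.2.3 — matching new detections to the positions predicted from each vehicle's speed so far — can be sketched with SciPy's Hungarian solver. This is an illustrative sketch; the gating distance `max_dist` is an assumed parameter, not a value from the paper:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_detections(predicted, detected, max_dist=5.0):
    """Assign detections to tracked vehicles by minimising the total
    distance between predicted positions and detections (Munkres, 1957).

    predicted: (n_tracks, 2) predicted positions, detected: (n_det, 2).
    Returns a list of (track_index, detection_index) pairs.
    """
    # pairwise Euclidean distances form the assignment cost matrix
    cost = np.linalg.norm(predicted[:, None, :] - detected[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    # discard pairs that are too far apart (new or temporarily lost vehicles)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
```

Unmatched detections would start new trajectories, and unmatched tracks would coast on their predicted positions, which is what allows vehicles to be followed under bridges and gantries.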
Fig. 4. (a) Raw data with implausible points marked in red. (b) Data after removal of implausible points.
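The conversion to road coordinates and the Δx plausibility check illustrated in Fig. 4 can be sketched as follows. This is a sketch under stated assumptions — the road-edge polyline, the frame interval, and the speed bound are assumed inputs, not values from the paper:

```python
import numpy as np

def to_road_coords(edge, points):
    """Project world positions onto a polyline (the right road edge).

    Returns (s, d): s = distance travelled along the edge (the x value in
    the text), d = signed lateral offset (the y value), positive to the
    left of the edge direction.
    """
    seg = np.diff(edge, axis=0)                      # segment vectors
    seg_len = np.linalg.norm(seg, axis=1)
    cum = np.concatenate(([0.0], np.cumsum(seg_len)))
    s_out, d_out = [], []
    for p in points:
        rel = p - edge[:-1]
        # parameter of the orthogonal projection onto each segment, clipped
        t = np.clip((rel * seg).sum(axis=1) / seg_len**2, 0.0, 1.0)
        proj = edge[:-1] + t[:, None] * seg
        dist = np.linalg.norm(p - proj, axis=1)
        i = np.argmin(dist)                          # closest segment
        diff = p - proj[i]
        # sign of the lateral offset via the 2D cross product
        cross = seg[i, 0] * diff[1] - seg[i, 1] * diff[0]
        s_out.append(cum[i] + t[i] * seg_len[i])
        d_out.append(np.sign(cross) * dist[i])
    return np.array(s_out), np.array(d_out)

def plausible(s, dt=0.04, v_max=60.0):
    """Flag positions whose forward step dx is negative (driving backward)
    or unrealistically large (v_max in m/s, dt = frame interval in s)."""
    dx = np.diff(s, prepend=s[0])
    return (dx >= 0.0) & (dx <= v_max * dt)
```

Points failing the mask would be removed, as in Fig. 4b.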
Fig. 6. Identifying the shift between two adjacent (temporal or spatial) videos.
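The shift identification illustrated in Fig. 6 can be approximated by testing integer frame shifts between the position series of a vehicle observed in two adjacent videos and keeping the shift with the smallest mismatch. A sketch, assuming the overlap is long enough to make the minimum unique:

```python
import numpy as np

def best_shift(x_a, x_b, max_shift=50):
    """Find the integer frame shift that best aligns two position series
    of the same vehicle recorded in two overlapping (temporal or spatial)
    videos. Returns (shift, mean squared error at that shift)."""
    best, best_err = 0, np.inf
    for k in range(-max_shift, max_shift + 1):
        if k >= 0:
            a, b = x_a[k:], x_b[:len(x_b) - k]
        else:
            a, b = x_a[:len(x_a) + k], x_b[-k:]
        n = min(len(a), len(b))
        if n == 0:
            continue
        err = np.mean((a[:n] - b[:n]) ** 2)
        if err < best_err:
            best, best_err = k, err
    return best, best_err
```

Averaging the estimated shift over all vehicles in the overlap would make the alignment robust to individual tracking noise.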
two induction loops (one on each lane, only direction 1) that are located in the middle of the filmed road section. Fig. 8 shows that both the flows and the mean speeds from the drone dataset are consistent with the flows and mean speeds from the induction loop data. However, the flows from the drone dataset are mostly slightly smaller, which indicates that not all vehicles have been detected. The root mean square (RMS) deviation of the flows is 2.2 vehicles per minute in lane 1 and 3.1 vehicles per minute in lane 2. The speeds from the drone data are also generally slightly lower. The RMS deviation of the speeds is 0.78 m/s (2.8 km/h) in both lanes. Since the accuracy of the speed measurement of the induction loops is unknown, it cannot be concluded which values are more accurate.

Fig. 8. Comparison of induction loop data and drone data with respect to (a) flow and (b) mean speed in 1-min intervals.

The data from the adjacent induction loops can be used to interpolate the mean speeds in the filmed road section. Due to the propagation of shock waves (forward in free-flow conditions and backward in congested flow conditions), linear interpolation is not appropriate. Instead, we use the adaptive smoothing method (ASM) proposed by Treiber and Helbing (2002), which takes the propagation of shock waves into account. Fig. 9 shows good agreement between the mean speeds obtained from the trajectory data and the interpolated mean speeds obtained from the induction loop data.

Fig. 9. Comparison of (a) mean speeds obtained from induction loop data and the adaptive smoothing method and (b) the mean speeds obtained from our trajectory data.

The induction loops categorize the vehicle flows into passenger cars and trucks. In direction 1, the truck ratio is 8.2% (direction 2: 10.5%) according to the induction loop data and 12.3% (direction 2: 11.5%) according to the trajectory data. These differences indicate that some passenger cars might have been falsely labeled as trucks in the trajectory data. However, the accuracy of vehicle categorization in induction loop data cannot be validated.

4.2. Microscopic comparison with in-vehicle sensors

To check the plausibility of the calculated speeds and accelerations obtained from drone data, eight test runs with in-vehicle sensors were performed, four in each direction. Similar to the drone data, the in-vehicle data were smoothed to reduce signal noise. A smoothing spline with break points of 0.5 s for the accelerometer and 2 s for the GPS was used. These parameters were chosen to filter out sensor noise without compromising the validity of the data. For a valid comparison between drone data and in-vehicle data, the clocks of both data sources were aligned based on the position data. The signals of the in-vehicle sensors were shifted by the time difference between the drone and in-vehicle sensor data at the position where the measurement vehicle was first detected by the drone.

Fig. 10 shows good agreement between the speeds derived from drone data and the speeds obtained from in-vehicle sensors. The deviations between the signals of the two data sources are normally distributed, with a mean of 0.01 m/s and a standard deviation of 0.47 m/s. Thus, no systematic deviation can be identified. The mean correlations between the signals are 0.99 (direction 1) and 0.90 (direction 2). The RMS deviations between the signals are on average 0.43 m/s (direction 1) and 0.46 m/s (direction 2).

Fig. 10. Comparison of the speeds of the measurement vehicle obtained from drone data (blue lines) and from in-vehicle data (orange lines) of eight test runs (left: direction 1, right: direction 2).

The acceleration is compared in the travel direction (x-direction) and in the direction orthogonal to the travel direction (y-direction). The general patterns of the drone data and in-vehicle sensor data are similar in both the x-direction (Fig. 11) and the y-direction (Fig. 12), which is also indicated by low average RMS deviations (Table 1). However, high-frequency changes in acceleration cannot be detected in the drone data. This is particularly noticeable in the y-direction. Lane changes are clearly recognizable as peaks in the in-vehicle data, while the signal derived from drone data is substantially smoothed. This is also reflected in a low correlation in the y-direction between the two data sources. In addition, there is an offset in the x- and y-directions, which changes between runs. This could be due to small movements of the smartphone mount between runs, as the offset is constant during each test run.

Fig. 11. Comparison of the accelerations along the road (x-direction) of the measurement vehicle obtained from drone data (blue lines) and from in-vehicle data (orange lines) from eight test runs (left: direction 1, right: direction 2).

Compared to the speed differences between induction loops and drone data, the differences between in-vehicle data and drone data are substantially smaller. This indicates that the induction loops on this section of the road might not be well calibrated for speed measurements. Therefore, the comparison with the in-vehicle data confirms the high quality of the drone data.

5. Possible applications of the dataset

5.1. Traffic flow analysis

In the following, we present a microscopic and macroscopic traffic flow analysis based on the trajectory data. Fig. 13 shows two time–space diagrams of an excerpt of the data. The color coding of the lanes (Fig. 13a) illustrates the frequency and locations of lane changes, the gaps between vehicles, and the speed differences between the lanes. The color coding of the speed (Fig. 13b) illustrates the propagation of shock waves.

The time series of flow, density, and mean speed in 1-min intervals (Fig. 14) allow the identification of congestion and the distribution of vehicles between lanes. There are two stop-and-go waves (7:46 and 7:53) with large densities and low speeds in both lanes. Due to the small length of these stop-and-go waves, the flow does not decrease significantly. As expected, the mean speed in lane 2 is greater than that in lane 1 due to the presence of fewer trucks in lane 1.

The flow–density diagram, speed–flow diagram, and speed–density diagram (Fig. 15) show the capacity, optimal speed, and critical density on this road section, respectively. Again, the small length of the stop-and-go waves leads to densities of up to 57 veh/km per lane, which is well below the maximum jam density.

5.2. Accident risk analysis

The concept of traffic conflict analysis is based on the assumption that each traffic interaction can lead to a collision. The less likely the participants of a traffic interaction are to react and avoid a crash, the more dangerous the situation is evaluated to be. To determine this "closeness" to a crash, different surrogate safety measures (SSMs) have been developed in recent decades. According to Mahmud et al. (2017), these SSMs can be categorized as follows: (1) temporal–proximal indicators, (2) deceleration-based indicators, and (3) distance-based proximal indicators.

In the first two categories, time-to-collision (TTC) and deceleration rate to avoid crash (DRAC) are two of the most commonly used measures
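The ASM interpolation behind the loop-based speed field in Fig. 9 can be sketched as follows. This is a minimal illustrative estimator; the characteristic wave speeds and kernel widths below are typical values from the ASM literature, not the parameters used by the authors:

```python
import numpy as np

# Assumed parameters (typical ASM values, not from this paper):
C_FREE = 22.2             # m/s, perturbations propagate forward in free flow
C_CONG = -4.2             # m/s, shock waves propagate backward in congestion
V_C, DV = 16.7, 2.8       # crossover speed and transition width (m/s)
SIGMA, TAU = 300.0, 30.0  # kernel widths in space (m) and time (s)

def asm_speed(x, t, x_det, t_det, v_det):
    """Speed estimate at (x, t) from detector samples (x_det, t_det, v_det),
    smoothed along the free-flow and congested characteristics and blended
    adaptively (after Treiber and Helbing, 2002)."""
    def smooth(c):
        # exponential kernel centred on the characteristic line with slope c
        w = np.exp(-np.abs(x - x_det) / SIGMA
                   - np.abs(t - t_det - (x - x_det) / c) / TAU)
        return np.sum(w * v_det) / np.sum(w)

    v_free, v_cong = smooth(C_FREE), smooth(C_CONG)
    # weight the congested estimate more strongly where speeds are low
    w = 0.5 * (1.0 + np.tanh((V_C - min(v_free, v_cong)) / DV))
    return w * v_cong + (1.0 - w) * v_free
```

Evaluating this on a grid of (x, t) points yields the interpolated speed field that is compared with the trajectory-based mean speeds.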
Fig. 12. Comparison of the accelerations orthogonal to the road (y-direction) of the measurement vehicle obtained from drone data (blue lines) and from in-vehicle data (orange lines) from eight test runs (left: direction 1, right: direction 2).

Table 1
Parameters for comparing acceleration data derived from drone data and acceleration data obtained from the in-vehicle sensor.
Signal | Mean of deviations (m/s²) | Std. dev. (m/s²) | Correlation | Average RMS deviations (m/s²)

Fig. 13. Time–space diagrams with (a) color-coded lanes and color-coded speeds in (b) lane 1 and (c) lane 2.
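The macroscopic quantities shown in Figs. 13–15 can be estimated directly from the trajectories. One standard way of doing so — an assumption here, since the paper does not state its exact estimator — is Edie's generalized definitions over a space–time region:

```python
import numpy as np

def edie(trajs, x0, x1, t0, t1):
    """Flow, density and mean speed in the space-time region
    [x0, x1] x [t0, t1] using Edie's generalised definitions.

    trajs: list of (t, x) arrays, one per vehicle, t in s and x in m.
    Returns (flow in veh/s, density in veh/m, mean speed in m/s).
    """
    area = (x1 - x0) * (t1 - t0)               # size of the region, m*s
    dist = time = 0.0
    for t, x in trajs:
        inside = (x >= x0) & (x <= x1) & (t >= t0) & (t <= t1)
        if inside.sum() < 2:
            continue
        dist += x[inside][-1] - x[inside][0]   # distance travelled inside
        time += t[inside][-1] - t[inside][0]   # time spent inside
    q = dist / area                            # multiply by 3600 for veh/h
    k = time / area                            # multiply by 1000 for veh/km
    v = dist / time if time > 0 else float("nan")
    return q, k, v
```

Sliding the region over 1-min windows per lane yields time series such as those in Fig. 14, and plotting the (k, q), (q, v), and (k, v) pairs yields the fundamental diagrams of Fig. 15.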
to analyze traffic safety. Both of these SSMs rely on the assumption that the analyzed traffic participants will maintain their course and momentum from the initial moment until a collision occurs. In this way, the TTC determines the remaining time until the collision from this initial moment, whereas the DRAC estimates the smallest deceleration rate needed to avoid the collision (Almqvist et al., 1991; Hayward, 1971). However, both in inner-city traffic and on highways, the acceleration and deceleration of the traffic participants cannot be neglected. Therefore, to evaluate the traffic on the recorded road section, we applied modified versions of these two indicators: the modified time-to-collision (MTTC) and the deceleration rate to avoid crashes using constant initial acceleration (DCIA). The MTTC was developed by Ozbay et al. (2008) and can be calculated as Eq. (1):
\[
\mathrm{MTTC} =
\begin{cases}
\dfrac{D}{v_d}, & \text{if } v_d > 0 \text{ and } a_d = 0 \\[2ex]
\dfrac{-v_d \pm \sqrt{v_d^2 + 2 a_d D}}{a_d}, & \text{if } a_d \neq 0
\end{cases}
\tag{1}
\]
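A direct transcription of the MTTC, and of the DCIA defined by Eqs. (2) and (3) below, might look as follows. Variable names mirror the text (D is the gap, v_d and a_d the relative speed and acceleration; v_F, v_L, d_F, d_L, R for the DCIA); this is an illustrative sketch, not the authors' implementation:

```python
import math

def mttc(D, v_d, a_d, eps=1e-9):
    """Modified time-to-collision, Eq. (1). D: gap (m), v_d: relative
    speed (m/s), a_d: relative acceleration (m/s^2). Returns the
    smallest positive root, or inf when no collision is predicted."""
    if abs(a_d) < eps:
        return D / v_d if v_d > 0 else math.inf
    disc = v_d ** 2 + 2.0 * a_d * D
    if disc < 0:                              # gap never closes
        return math.inf
    roots = [(-v_d + s) / a_d for s in (math.sqrt(disc), -math.sqrt(disc))]
    positive = [r for r in roots if r > 0]
    return min(positive) if positive else math.inf

def dcia(v_f, v_l, d_f, d_l, D, R=1.0):
    """DCIA, Eqs. (2)-(3). v_f/v_l and d_f/d_l: initial speeds and
    decelerations of follower and leader, D: gap (m), R: reaction time
    of the follower (s). Returns None when no crash is predicted."""
    denom = v_l + d_l * R - v_f - d_f * R
    if abs(denom) < 1e-9:                     # Eq. (3) undefined
        return None
    T = (v_f * R - v_l * R - 2.0 * D) / denom  # Eq. (3): time until crash
    if T <= R:
        return None
    return (d_l * T + v_l - d_f * R - v_f) / (T - R)  # Eq. (2)
```

Using the thresholds cited in the text, an MTTC at or below 1.5 s or a DCIA above 3.4 m/s² would flag the interaction as risky.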
Fig. 15. (a) Flow–density diagram, (b) speed–flow diagram, and (c) speed–density diagram.
vehicles. MTTC is the smallest positive result. When the MTTC is equal to
or lower than 1.5 s, the interaction is considered risky (Ozbay et al.,
2008).
The DCIA was developed by Fazekas et al. (2017) and can be calcu-
lated as Eq. (2):
dL T þ vL dF R vF
DCIA ¼ ; if T > R (2)
T R
where vF , vL , dF , and dL are the initial speed and initial deceleration of the
follower and leader vehicles, respectively. R is the reaction time of the Fig. 17. Average DCIA values in 20 m long sections from the west to the east.
follower vehicle. T is the time until a crash, which can be calculated as Eq. (3):

T = (vF·T − vL·T − 2D) / (vL + dL·T − vF − dF·T),  if denominator ≠ 0    (3)

When the DCIA is above 3.4 m/s², the interaction is considered dangerous (Fazekas et al., 2017).

Due to the traffic data collection method described above, SSMs can be calculated at any timestamp in this dataset. This allows us to determine the extreme values of TTC and DCIA between interacting vehicles, which enables the analysis of the whole traffic scene. For this purpose, we built so-called pairs of interacting vehicles (one follower and one leader vehicle) that could collide based on their momentum. In the case of MTTC, we then determined the lowest value of each vehicle pair, whereas for DCIA, we identified the highest deceleration rates between the paired vehicles. In the next step, we located these extreme values at the positions of the follower vehicles on the road. Then, we calculated the average value of these results in each 20 m long section of the analyzed road section for each traffic lane separately. To present only relevant information, we considered only MTTC values under 5 s. Accordingly, Fig. 16 shows the average MTTC, and Fig. 17 shows the average DCIA values in the west-to-east travel direction. The values are color-coded: the riskier the interaction according to the SSM, the darker the red.

The average MTTC values were less than the threshold value of 1.5 s in 23% of the 20 m long road sections in the main lane (lane 1), in 27.9% in the passing lane (lane 2), and in 18.2% in the first exit lane (lane 0). In contrast to the MTTC, the average DCIA values did not reach the threshold value of 3.4 m/s² on the road section. In fact, the highest average value present in the data was 1.69 m/s², which, although corresponding to stronger braking, still falls within the range of normal occurrences in traffic. These results can be explained by the congested traffic, where vehicles are relatively close to each other; therefore, the time to a crash is low, while due to lower speed values, traffic participants do not need to brake strongly to avoid a collision.

6. Conclusions

This study presented a vehicle trajectory dataset from a German highway with two lanes and an off-ramp as well as the methods implemented to create the dataset. The data contain both free and congested traffic. The data extraction and processing methodology is applicable to other drone videos and, to some extent, to videos from stationary cameras. We performed a traffic flow analysis and an accident risk analysis, which showed that the trajectory data are suitable for these two applications. We also evaluated the plausibility and quality of the data by comparing the speeds, accelerations and flows with the results from induction loop data and smartphone accelerometer data. The results showed good agreement between our dataset and the other sensor data, which indicates good data quality. We therefore conclude that the dataset is useable for traffic flow and traffic safety analyses.

Replication and data sharing

The data presented in the manuscript can be accessed at https://data.isac.rwth-aachen.de/?p=58.
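The pair-wise SSM computation and the 20 m aggregation described in the accident risk analysis can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sample layout and function names are assumptions, the crash time is obtained from the closed-form root of the same constant-acceleration gap-closure relation that underlies Eq. (3), and the 5 s cutoff and 20 m section length follow the text.

```python
import math
from collections import defaultdict


def crash_time(gap, v_f, v_l, a_f, a_l):
    """Time T until the follower closes the gap, assuming both vehicles keep
    their current acceleration: gap = dv*T + 0.5*da*T**2.
    Returns math.inf when the vehicles are not on a collision course."""
    dv = v_f - v_l  # closing speed (m/s)
    da = a_f - a_l  # closing acceleration (m/s^2)
    if abs(da) < 1e-9:  # accelerations cancel -> plain TTC
        return gap / dv if dv > 0 else math.inf
    disc = dv * dv + 2.0 * da * gap
    if disc < 0:
        return math.inf  # gap never closes
    roots = ((-dv + math.sqrt(disc)) / da, (-dv - math.sqrt(disc)) / da)
    times = [t for t in roots if t > 0]
    return min(times) if times else math.inf


def average_mttc_per_section(pairs, section_len=20.0, cap=5.0):
    """pairs maps a pair id to a list of samples (x_follower, gap, v_f, v_l,
    a_f, a_l), one per timestamp. Keeps the lowest MTTC of each pair, locates
    it at the follower position, and averages per road section of length
    section_len, considering only values under cap (here 5 s)."""
    sections = defaultdict(list)
    for samples in pairs.values():
        x, t_min = min(
            ((s[0], crash_time(*s[1:])) for s in samples),
            key=lambda item: item[1],
        )
        if t_min <= cap:
            sections[int(x // section_len)].append(t_min)
    return {sec: sum(v) / len(v) for sec, v in sections.items()}
```

The DCIA aggregation works analogously, except that the highest required deceleration of each pair is kept instead of the lowest crash time.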
M. Berghaus et al. Communications in Transportation Research 4 (2024) 100133
draft, Visualization, Methodology, Formal analysis, Conceptualization. Markus Oeser: Supervision, Project administration, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The work presented in this study is part of the projects BueLaMo (Bürgerlabor Mobiles Münsterland - Citizens' Laboratory Mobile Münsterland) funded by the German Federal Ministry for Education and Research, FeGis+ (Früherkennung von Gefahrenstellen im Straßenverkehr - Early Detection of Dangerous Areas in Traffic) funded by the German Federal Ministry for Digital and Transport, and NeMo (Neue Ansätze der Verkehrsmodellierung unter Berücksichtigung komplexer Geometrien und Daten - New traffic models considering complex geometries and data) funded by the German Research Foundation (DFG).

References

Ahmadi, S.A., Mohammadzadeh, A., 2017. A simple method for detecting and tracking vehicles and vessels from high resolution spaceborne videos. In: 2017 Joint Urban Remote Sensing Event (JURSE), pp. 1–4.
Almqvist, S., Hyden, C., Risser, R., 1991. Use of speed limiters in cars for increased safety and a better environment. Transport. Res. Rec., 34–39.
Apeltauer, J., Babinec, A., Herman, D., Apeltauer, T., 2015. Automatic vehicle trajectory extraction for traffic analysis from aerial video data. Int. Arch. Photogram. Rem. Sens. Spatial Inf. Sci. 40, 9–15.
Azevedo, C.L., Cardoso, J.L., Ben-Akiva, M., Costeira, J.P., Marques, M., 2014. Automatic vehicle trajectory extraction by aerial remote sensing. Procedia Soc. Behav. Sci. 111, 849–858.
Berghaus, M., Lamberty, S., Ehlers, J., Kallo, E., Oeser, M., 2022. Vehicle trajectory dataset from drone videos including off-ramp and congested traffic. https://data.isac.rwth-aachen.de.
Bisio, I., Garibotto, C., Haleem, H., Lavagetto, F., Sciarrone, A., 2022. A systematic review of drone based road traffic monitoring system. IEEE Access 10, 101537–101555.
Bouguet, J., 1999. Pyramidal Implementation of the Lucas Kanade Feature Tracker, vol. 16. Intel Corp., Microprocessor Research Labs, Santa Clara, CA.
Butila, E.V., Boboc, R.G., 2022. Urban traffic monitoring and analysis using unmanned aerial vehicles (UAVs): a systematic literature review. Rem. Sens. 14, 620.
Clausse, A., Benslimane, S., de La Fortelle, A., 2019. Large-scale extraction of accurate vehicle trajectories for driving behavior learning. In: 2019 IEEE Intelligent Vehicles Symposium (IV), pp. 2391–2396.
Creß, C., Zimmer, W., Strand, L., Fortkord, M., Dai, S., Lakshminarasimhan, V., et al., 2022. A9-dataset: multi-sensor infrastructure-based dataset for mobility research. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 965–970.
Fazekas, A., Hennecke, F., Kalló, E., Oeser, M., 2017. A novel surrogate safety indicator based on constant initial acceleration and reaction time assumption. J. Adv. Transport. 2017, 8376572.
Feng, R., Fan, C., Li, Z., Chen, X., 2020. Mixed road user trajectory extraction from moving aerial videos based on convolution neural network detection. IEEE Access 8, 43508–43519.
GDI NRW, 2024. Geoportal.NRW. https://www.geoportal.nrw.
Greenshields, B., Bibbins, J., Miller, H., 1935. A study of traffic capacity. https://onlinepubs.trb.org/Onlinepubs/hrbproceedings/14/14P1-023.pdf.
Hayward, J.C., 1971. Near Misses as a Measure of Safety at Urban Intersections. M.S. Thesis. The Pennsylvania State University, Philadelphia, PA, USA.
Hyden, C., Linderholm, L., 1984. The Swedish traffic-conflicts technique. In: Asmussen, E. (Ed.), International Calibration Study of Traffic Conflict Techniques, pp. 133–139.
Jocher, G., Chaurasia, A., Stoken, A., Borovec, J., Kwon, Y., Fang, J., et al., 2022. Ultralytics/yolov5: v6.1 - TensorRT, TensorFlow Edge TPU and OpenVINO export and inference. https://github.com/ultralytics/yolov5/discussions/6740.
Khan, M.A., Ectors, W., Bellemans, T., Janssens, D., Wets, G., 2017. Unmanned aerial vehicle-based traffic analysis: methodological framework for automated multivehicle trajectory extraction. Transport. Res. Rec. 2626, 25–33.
Kim, E.J., Park, H.C., Ham, S.W., Kho, S.Y., Kim, D.K., 2019. Extracting vehicle trajectories using unmanned aerial vehicles in congested traffic conditions. J. Adv. Transport. 2019, 9060797.
Kloeker, L., Geller, C., Kloeker, A., Eckstein, L., 2020. High-precision digital traffic recording with multi-LiDAR infrastructure sensor setups. In: 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), pp. 1–8.
Krajewski, R., Bock, J., Kloeker, L., Eckstein, L., 2018. The highD dataset: a drone dataset of naturalistic vehicle trajectories on German highways for validation of highly automated driving systems. In: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 2118–2125.
Ma, W., Zhong, H., Wang, L., Jiang, L., Abdel-Aty, M., 2022. MAGIC dataset: multiple conditions unmanned aerial vehicle group-based high-fidelity comprehensive vehicle trajectory dataset. Transport. Res. Rec. 2676, 793–805.
Mahmud, S.M.S., Ferreira, L., Hoque, M.S., Tavassoli, A., 2017. Application of proximal surrogate indicators for safety evaluation: a review of recent developments and research needs. IATSS Res. 41, 153–163.
Masouleh, M.K., Shah-Hosseini, R., 2019. Development and evaluation of a deep learning model for real-time ground vehicle semantic segmentation from UAV-based thermal infrared imagery. ISPRS J. Photogrammetry Remote Sens. 155, 172–186.
Messelodi, S., Modena, C.M., 2005. A computer vision system for traffic accident risk measurement. A case study. Adv. Transport. Stud. 7, 51–66.
Mobilithek, 2022. Querschnittsdaten (Q und v) von Messstellen auf BAB in Nordrhein-Westfalen [Cross-sectional data (q and v) from measuring stations on federal motorways in North Rhine-Westphalia]. https://mobilithek.info/offers/110000000003477001.
Moers, T., Vater, L., Krajewski, R., Bock, J., Zlocki, A., Eckstein, L., 2022. The exiD dataset: a real-world trajectory dataset of highly interactive highway scenarios in Germany. In: 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 958–964.
Munkres, J., 1957. Algorithms for the assignment and transportation problems. J. Soc. Ind. Appl. Math. 5, 32–38.
Ozbay, K., Yang, H., Bartin, B., Mudigonda, S., 2008. Derivation and validation of new simulation-based surrogate safety measure. Transport. Res. Rec. 2083, 105–113.
Roesener, C., Sauerbier, J., Zlocki, A., Fahrenkrog, F., Wang, L., Varhelyi, A., et al., 2017. A comprehensive evaluation approach for highly automated driving. In: 25th International Technical Conference on the Enhanced Safety of Vehicles (ESV), Detroit, 0259.
Shi, J., Tomasi, C., 1994. Good features to track. In: 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 593–600.
Shi, X., Zhao, D., Yao, H., Li, X., Hale, D.K., Ghiasi, A., 2021. Video-based trajectory extraction with deep learning for High-Granularity Highway Simulation (HIGH-SIM). Commun. Transport. Res. 1, 100014.
Spannaus, P., Zechel, P., Lenz, K., 2021. Automatum data: drone-based highway dataset for the development and validation of automated driving software for research and commercial applications. In: 2021 IEEE Intelligent Vehicles Symposium (IV), pp. 1372–1377.
Staacks, S., Hütz, S., Heinke, H., Stampfer, C., 2018. Advanced tools for smartphone-based experiments: phyphox. Phys. Educ. 53, 045009.
Treiber, M., Helbing, D., 2002. Reconstructing the spatio-temporal traffic dynamics from stationary detector data. Cooper@tive Tr@nsport@tion Dyn@mics 1, 1–3.
Treiterer, J., 1975. Investigation of traffic dynamics by aerial photogrammetry techniques. Transport. Res. Rec. 224.
U.S. Federal Highway Administration (FHWA), 2006. Next generation simulation program (NGSIM). http://ops.fhwa.dot.gov/trafficanalysistools/ngsim.htm.
Wang, J., Fu, T., Xue, J., Li, C., Song, H., Xu, W., et al., 2023. Realtime wide-area vehicle trajectory tracking using millimeter-wave radar sensors and the open TJRD TS dataset. Int. J. Transp. Sci. Technol. 12, 273–290.
Yeom, S., Nam, D.H., 2021. Moving vehicle tracking with a moving drone based on track association. Appl. Sci. 11, 4046.
Zhao, D., Li, X., 2019. Real-world trajectory extraction from aerial videos - a comprehensive and effective solution. In: 2019 IEEE Intelligent Transportation Systems Conference (ITSC), pp. 2854–2859.
Zhao, H., Cui, J., Zha, H., Katabira, K., Shao, X., Shibasaki, R., 2009. Sensing an intersection using a network of laser scanners and video cameras. IEEE Intell. Transport. Syst. Mag. 1, 31–37.

Moritz Berghaus received the M.Sc. degree in transportation engineering from RWTH Aachen University in 2017, where he is currently pursuing the Ph.D. degree. Since 2018, he has been a Research Assistant with the Institute of Highway Engineering, RWTH Aachen University. His research interests include traffic safety, traffic flow, and simulation.

Serge Lamberty received the M.Sc. degree in electrical engineering, information technology, and computer engineering from RWTH Aachen University in 2017. From 2017 to 2022, he was a Researcher in the field of traffic digitalization, and since 2022, he has been heading the Digitization Division at the Institute of Highway Engineering, RWTH Aachen University. His research interests include computer vision in traffic surveillance, real-time traffic data acquisition and analysis, and adaptive warning systems.
Jörg Ehlers received the M.Sc. degree in transportation engineering from RWTH Aachen University. He currently works as a Research Assistant with the Institute of Highway Engineering, RWTH Aachen University. His research interests include traffic safety and transportation systems.

Markus Oeser received his degrees in civil engineering from TU Dresden in 1998 and 2004, respectively. He was a university Lecturer at the Institute of Geotechnics, Road Construction and Traffic Engineering, University of New South Wales (UNSW), Sydney, from 2007 to 2011. He is currently a Professor with the Institute of Highway Engineering, RWTH Aachen University. From 2015 to 2021, he was Dean of the Faculty of Civil Engineering, RWTH Aachen University. Since 2021, he has been the President of the German Federal Highway Research Institute (BASt). His research interests include pavement and traffic engineering.