
Understanding Pedestrian Behavior in Complex Traffic Scenes

Amir Rasouli, Iuliia Kotseruba, and John K. Tsotsos

The authors are with the Department of Electrical Engineering and Computer Science, York University, Toronto, ON, Canada, e-mail: {aras,yulia_k,tsotsos}@eecs.yorku.ca

Abstract—Designing autonomous vehicles for urban environments remains an unresolved problem. One major dilemma faced by autonomous cars is understanding the intention of other road users and communicating with them.

To investigate one aspect of this, specifically pedestrian crossing behavior, we have collected a large dataset of pedestrian samples at crosswalks under various conditions (e.g. weather) and in different types of roads. Using the data, we analyzed pedestrian behavior from two different perspectives: the way they communicate with drivers prior to crossing and the factors that influence their behavior.

Our study shows that changes in head orientation in the form of looking or glancing at the traffic is a strong indicator of crossing intention. We also found that context in the form of the properties of a crosswalk (e.g. its width), traffic dynamics (e.g. speed of the vehicles) as well as pedestrian demographics can alter pedestrian behavior after the initial intention of crossing has been displayed. Our findings suggest that the contextual elements can be interrelated, meaning that the presence of one factor may increase/decrease the influence of other factors. Overall, our work formulates the problem of pedestrian-driver interaction and sheds light on its complexity in typical traffic scenarios.

Index Terms—Autonomous driving, intelligent vehicles, driver-pedestrian interaction, human intention and behavior analysis, pedestrian behavior understanding, safety and collision avoidance

Fig. 1. An example of a pedestrian intending to cross with corresponding behavioral labels, the pedestrian's characteristics (age, gender), static context (weather, time of day, street width) and Time-To-Collision (TTC). Will this pedestrian cross the street?
I. INTRODUCTION

Autonomous driving technologies have come a long way since the first introduction of commercial automobiles in the 1920s. Ever since the early attempts on achieving autonomy [1], we have witnessed great success stories throughout the past decades such as autonomous driving on highways [2], platooning [3], unsupervised driving on rough terrains [4] and urban environments [5], and even autonomous racing cars [6]. Today, autonomous driving is one of the major topics in technology research and a large number of companies have been heavily investing in it. According to some economists, the size of the global autonomous vehicle industry and related software and hardware technologies is estimated to be more than 40 billion dollars by the year 2030 [7].

The current level of autonomy available on some commercial cars such as Tesla is level 2, and some manufacturers such as Audi are promising level 3 capability on their latest models such as the A8 [8]. According to the SAE standard, autonomy level 3 means that the car can handle all aspects of the dynamic driving task in specific environments with the exception that the human driver may be requested to intervene [9]. Now the question is how far are we from achieving level 5 or fully autonomous behavior? The answer to this question is controversial. Some companies such as Tesla [10] and BMW [11] are more optimistic and claim that they will have their first fully autonomous vehicles entering the market by 2020 and 2022 respectively. Other companies such as Toyota are more skeptical and believe that we are nowhere close to achieving level 5 autonomy yet [12].

Aside from challenges associated with developing suitable infrastructure [13] and regulating autonomous cars [14], the technologies currently used in autonomous vehicles have not achieved the level of robustness to handle various traffic scenarios such as different weather or lighting conditions (e.g. snow, rain), road types (e.g. driving on roads without clear marking or bridges) or environments (e.g. the GPS localization problem in cities with high-rise buildings) [15].

In addition, when it comes to driving in complex scenes, such as urban environments, autonomous vehicles face another dilemma, namely interacting with other road users [16]. In order to do so, the autonomous vehicle needs to understand their intentions, which can be achieved by communicating with them and predicting what they are going to do next (see Fig. 1).

Interaction between traffic participants is vital for a number of reasons:

1) Ensures the flow of traffic. We as humans, in addition to official traffic laws, often rely on informal laws (or social norms) to interact with other road users. Such norms influence the way we perceive other road users and how we interpret their actions [17]. We also often engage in non-verbal communication to exchange our intentions in order to disambiguate certain situations. For instance, if a driver wants to turn onto a road without traffic signals, he/she might wait for another driver's signal to indicate the right of way. Failure to understand others' intentions can often result in accidents, which is evident from recent reports on Google's autonomous test vehicles [18], [19].

2) Increases safety. Interaction also ensures the safety of road users, in particular, pedestrians as the most vulnerable traffic participants. For instance, at the point of crossing, pedestrians often establish eye contact with drivers or wait for an explicit signal from the drivers to ensure that they have been noticed and it is safe to cross [20].

3) Prevents being subject to malicious behaviors. Given that autonomous cars may potentially commute without any passengers on board, they may be subject to bullying [17]. For example, people might jump in front of the vehicles, forcing them to stop or change direction. Instances of bullying have been reported for the autonomous robots that are currently being used in malls. Some of these robots were defaced, kicked or pushed over by drunk pedestrians [21].
To address the above issues, one might argue that autonomous vehicles can be designed to be very cautious and conservative by always assuming the worst-case scenario and being prepared for it. However, it has been shown that conservative driving not only does not reduce accidents but can be very disruptive itself [22].

The automotive industry offers a number of solutions for dealing with the interaction problem. For instance, vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) techniques that use cellular technology for communication are widely studied [23], [24]. Similar vehicle-to-everything (V2X) approaches also have been proposed for communicating with pedestrians [25]. In this method, the pedestrian's phone broadcasts its position to warn the autonomous cars about the pedestrian's presence. V2X technologies, however, raise a number of concerns. First, the communication is highly dependent on all components to function properly. A malfunction in any communication device in any of the sub-systems involved can lead to catastrophic safety issues. In addition, sharing information can raise some privacy concerns regarding the transmission of personal information. Last but not least, although these technologies can be effective in communicating dynamic information (e.g. velocity and position) of the road users, they cannot capture the higher level intention of the involved parties.

The aforementioned shortcomings point to the need for a new approach to resolving the interaction problem in traffic. In this paper, we address the issue of interaction between traffic participants, in particular, between drivers and pedestrians. For this purpose, we collected a large dataset consisting of videos of pedestrians crossing (or intending to cross) the street. We annotated this data with various behavioral and contextual tags in an attempt to identify factors that influence pedestrian decision-making at the point of crossing.

More specifically, in this work we address the following issues through the analysis of our dataset: identifying pedestrian crossing patterns from a moving car perspective, realizing how pedestrians communicate their intention, and finding various factors that influence pedestrian decision-making process.

II. RELATED WORKS

A. Communication in traffic

The role of non-verbal communication in resolving traffic ambiguities is emphasized by a number of scholars [26], [27], [28]. In traffic scenes, communication is particularly challenging because, first, there is no official set of signals and many of them are ambiguous, and second, the type of communication may change depending on the atmosphere of the traffic situation, e.g. city or country [29].

The lack of communication or miscommunication can greatly contribute to traffic conflicts. It has been shown that more than a quarter of traffic conflicts occur due to the absence of effective communication between road users. Out of the conflicts caused by miscommunication, 47% of the cases occurred with no communication, 11% were due to the ineffective communication and 42% happened during communication [29].

Traffic participants use different methods to communicate with each other. Sucha [28] lists some of the means of communication used by road users. According to the author, pedestrians use eye contact (glancing/staring), handwave, smile or nod. Drivers, on the other hand, flash lights, wave hands or make eye contact.

In the context of autonomous driving, a number of scholars evaluate different means whereby an autonomous vehicle in the absence of the driver can communicate with pedestrians [30]. Methods such as using LED lights [31] or displays on the vehicle [32] have been investigated.

To interpret communication signals, one should take the context into account. For instance, pedestrians initiate eye contact to ask for the right of way whereas drivers establish eye contact to yield to the pedestrians. Handwave by a pedestrian may convey different meanings too, e.g. request for the right of way or signal to show gratitude. Therefore, to understand the intended meaning of a signal, it is important to know the context within which the signal is observed (see Fig. 2).

Fig. 2. Examples of pedestrian hand gestures in a traffic scene without context. What are these pedestrians trying to say? (a) yielding, (b) asking for the right of way, (c) showing gratitude, and (d) greeting a person on the other side of the street.
B. Context and pedestrian behavior understanding

In the psychology literature, pedestrian crossing behavior is linked to various factors among which the most influential ones are dynamic factors, social factors and physical context.

Dynamic factors are related to the distance to approaching vehicles and their velocities. For instance, gap acceptance, or the time gap between vehicles, determines how safe pedestrians feel about crossing [33]. Gap acceptance is measured in terms of Time-To-Collision (TTC) or how far (in seconds) the approaching vehicles are from the point of impact [34]. Vehicle speed in isolation also can influence pedestrian visual perception. Studies show that the faster the speed of the vehicle, the less accurate is the pedestrian's estimation of the vehicle's speed [27].
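To make the TTC notion concrete, the sketch below shows one common way of estimating it from assumed inputs: the distance to the expected point of impact divided by the closing speed. This is only an illustrative computation, not the measurement procedure used in the paper.

```python
def time_to_collision(distance_m: float, vehicle_speed_mps: float,
                      pedestrian_closing_speed_mps: float = 0.0) -> float:
    """Estimate TTC in seconds as distance to the impact point over closing speed.

    Assumes straight-line motion; returns infinity when the gap is not closing.
    """
    closing_speed = vehicle_speed_mps + pedestrian_closing_speed_mps
    if closing_speed <= 0.0:
        return float("inf")
    return distance_m / closing_speed


# Example: a vehicle 28 m from the crosswalk approaching at 36 km/h (10 m/s)
# gives a TTC of 2.8 s, i.e. below the 3-7 s range in which most crossings
# were observed in our data (Section IV-E).
print(time_to_collision(28.0, 10.0))  # -> 2.8
```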
Besides dynamics, social factors are found to have a great impact on pedestrian behavior. In particular, social norms determine the extent to which pedestrians obey the law [26], take risks [34] or the way they communicate with one another [17]. Another determining factor is pedestrian group size which influences pedestrians' speed [35], the level of risk taking [36], or even the way drivers would react to it [37]. For instance, larger groups tend to reduce the speed of pedestrians. At the same time, pedestrians feel safer in large groups and are willing to accept a shorter time gap than when they are crossing individually [36].

Moreover, pedestrian behavior may vary in different physical contexts. For instance, a pedestrian who is crossing a signalized intersection will more likely cross without looking at the traffic because he/she expects the road users to comply with the signal. On the other hand, in the absence of a traffic signal, the pedestrian may be more conservative and will only cross the street if the approaching vehicles yield to him/her [26]. The geometrical features of the road also can impact crossing behavior. For example, pedestrians have a longer gap acceptance in wide streets compared to narrow ones [34].

Behavioral analysts have found factors such as demographics [38], pedestrian state [39] and traffic characteristics [40] to play a role in pedestrian decision-making. However, detailed discussion of these factors is beyond the scope of this paper.

The missing element in the pedestrian behavioral studies is establishing connection between different factors. The majority of the studies focus on an isolated issue such as the role of physical context or dynamics on pedestrian behavior and typically do not explain how these elements are connected to one another. Moreover, these studies generally focus on analyzing pedestrian behavior in a fixed or limited context, e.g. only at signalized intersections, fixed camera position, or specific groups of pedestrians such as children or the elderly.

C. Pedestrian intention estimation

In intelligent transportation research, the task of pedestrian intention estimation is generally treated as a tracking problem. This means that the algorithms developed for this purpose treat the pedestrian as a moving object and try to model its dynamic behavior to estimate its future location [41], [42]. Some works also consider the vehicle's motion and distance to the pedestrian and show that better prediction results can be achieved [43]. A major drawback of these models is that they cannot predict non-motion, i.e. when the pedestrian stops, the motion tracking prediction fails. This means that dynamics-based models are only effective when the motion is continuous.
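To see why such dynamics-based predictors struggle with non-motion, consider a minimal constant-velocity extrapolation. This is a deliberate simplification for illustration, not an implementation of any of the cited models.

```python
import numpy as np

def constant_velocity_forecast(track_xy: np.ndarray, horizon: int) -> np.ndarray:
    """Extrapolate future positions from the last observed displacement.

    track_xy: array of shape (T, 2) with observed (x, y) positions.
    Returns an array of shape (horizon, 2) with predicted positions.
    """
    velocity = track_xy[-1] - track_xy[-2]            # last per-frame displacement
    steps = np.arange(1, horizon + 1).reshape(-1, 1)
    return track_xy[-1] + steps * velocity

# A pedestrian walking towards the curb at 1 unit per frame:
walking = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
print(constant_velocity_forecast(walking, 3))  # -> [[4. 0.] [5. 0.] [6. 0.]]

# Nothing in the position history signals an upcoming stop, so the model keeps
# predicting motion into the road and only "catches up" after the stop has
# already been observed -- the failure mode discussed above.
```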
Very few works have attempted to explicitly take advantage of contextual information for pedestrian path prediction. In some works, for instance, the authors assign a goal to each pedestrian based on which they predict the trajectory of the pedestrians [44]. Others look at head orientation as a sign of pedestrian awareness. They argue that when a pedestrian is looking towards the traffic, he/she is more likely to coordinate with the approaching vehicles prior to crossing [45].

A number of studies also use contextual elements such as group size [46] or street structure [47] for predicting pedestrian behavior. However, these works are done in simulation and none of them propose a practical solution to the visual perception problem for identifying contextual elements.

III. METHOD

A. Instrumentation

Three drivers over the age of 30 participated in our data collection procedure. The drivers utilized two SUV passenger cars (one small and one mid-size) during the recording sessions. The data was collected using three HD camera models, namely GoPro HERO+, Garmin GDR-35 and Highscreen Black Box Connect. In each recording session only one camera was used at a time, which was placed inside the car below the rear view mirror (see Fig. 3). This ensures that the camera remains inconspicuous to the eyes of pedestrians.

Fig. 3. The placement of the camera inside the test vehicle.

The data was collected in a naturalistic setting, meaning that recording took place as part of the participants' daily activities and pedestrians were not notified about the recording. In addition, no particular instruction was given to the drivers for changing their driving habits.

B. The data

We recorded approximately 240 hours of driving footage over a period of 6 months in 5 geographical locations including Canada, USA, Germany and Ukraine. To allow further analysis and future research on the topic of pedestrian behavior, we released our dataset in the form of 346 short video clips (on average 5-15s). The data is annotated with ground truth information including bounding boxes for pedestrian detection, behavioral data of pedestrians and the driver of the recording vehicle, and contextual information such as demographics (e.g. gender and age of pedestrians), ambient conditions (e.g. weather, time of day) and environmental factors (e.g. signal, road structure).
An example of behavioral annotation is depicted in Fig. 4. We divide the behavioral data into two groups: pedestrian and driver behavior. For pedestrians there is a crossing label indicating the period when the crossing takes place. A crossing event starts when the pedestrian steps off the curb into the crosswalk and finishes when the pedestrian clears the crosswalk. The crossing label can co-occur with any other label. Labels such as 'looking' or 'glancing' reflect the attentive state of the pedestrian, showing at what points the pedestrian is looking towards the recording vehicle. Looking can also happen at the same time with other actions. For instance, a pedestrian can be crossing while looking at the traffic.

Fig. 4. An example of the behavioral annotation of the image sequences in our dataset.

In addition, there are two types of labels which describe the dynamic state ('moving slow', 'moving fast' or 'standing') and the reactions of the pedestrians. The speed of pedestrians is estimated qualitatively based on the judgment of the labelers. The reactions are either explicit, such as 'nod' and 'handwave', or implicit, such as 'slow down' and 'speed up'. The explicit reactions can co-occur with any dynamic state. However, if the reaction is implicit, the dynamic state and reaction labels are mutually exclusive. The reason behind this exclusivity is that an implicit reaction might change the dynamic state of the pedestrian. Therefore, splitting the pedestrian's dynamic state better highlights how and when the transition has taken place. Consider the following scenario as an example: a pedestrian's initial state is 'moving slow', he/she reacts by speeding up and his/her new dynamic state is 'moving fast'.

The driver behavioral tags only contain the last two labels described above, one showing the state of the vehicle (e.g. 'moving slow' or 'moving fast') and the other the reaction of the driver (e.g. 'slow down' or 'speed up'). The current state of the vehicle (moving slow or fast) is based on the actual speed of the vehicle at the time. Moving slow means the vehicle is driving below 20 km/h whereas moving fast means the vehicle is moving with a speed equal to 20 km/h or higher. The changes in the speed reflect the reaction of the driver, whether he is slowing down, stopping or speeding up. It should be noted that, similar to pedestrians, the driver behavioral labels are mutually exclusive.

In addition, our dataset has temporal correspondences between the frames, meaning that each pedestrian has a unique id throughout the sequence and can be tracked easily.
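To make the labeling scheme concrete, the sketch below shows what one annotated pedestrian track could look like in memory. The field names are hypothetical and chosen for illustration only; the released JAAD annotations define their own file format.

```python
# A hypothetical in-memory representation of one annotated pedestrian track.
pedestrian_track = {
    "id": "ped_0042",
    "demographics": {"age_group": "adult", "gender": "female"},
    "context": {"weather": "rainy", "time_of_day": "daytime",
                "crosswalk": "zebra crossing", "num_lanes": 2},
    # Behavioral labels as (label, start_frame, end_frame); 'crossing' and the
    # attentive labels ('looking', 'glancing') may overlap other labels, while
    # dynamic-state and implicit-reaction labels are mutually exclusive.
    "behavior": [
        ("standing", 0, 45),
        ("looking", 20, 60),
        ("crossing", 46, 180),
    ],
    "bounding_boxes": {0: (712, 310, 60, 150)},  # frame -> (x, y, w, h)
}

def driver_state(speed_kmh: float) -> str:
    """Map the recorded vehicle speed to the driver's dynamic-state label
    using the 20 km/h threshold described above."""
    return "moving slow" if speed_kmh < 20.0 else "moving fast"
```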
Attempts have been made to increase the variability of the data to capture more diverse pedestrian behaviors. Our data is collected in different seasons and contains samples with different weather (snowy, rainy or sunny) and illumination (different time of the day) conditions (see Fig. 5).

Fig. 5. Examples of samples collected under various weather and lighting conditions: (a) rainy, (b) snowy, (c) sunny, (d) nighttime.

The data also contains various road structures including urban streets, regional roads and parking lots. Each of these structures has different characteristics in terms of availability of traffic signals, density of the crowd or driving speed limits. Such characteristics may give rise to different behavioral patterns, which are important for our analysis.

Our proposed dataset is called Joint Attention in Autonomous Driving (JAAD). The JAAD dataset can be obtained at http://data.nvision2.eecs.yorku.ca/JAAD_dataset/ (ethics certificate #2016-203 from York University).

C. Behavioral labeling procedure

There are two main challenges when it comes to labeling behavioral data: uncertainty about the intention of the study subjects and subjective bias of the labeler in assessing the data. We tried to minimize these effects by using multiple labelers to process our data. If the labelers could not achieve consensus about the nature of a pedestrian action, it was excluded from the study to minimize error.

D. Pedestrian Samples

We identified more than 2600 pedestrians in our data, out of which we annotated 654 pedestrians crossing or near crossing (the pedestrian intends to cross but does not do so for some reasons). Unfortunately, not all of these samples contain a full crossing event, i.e. showing the pedestrian before, during and after the crossing. Also, in some cases the pedestrian's intention was ambiguous. Therefore, we excluded 208 samples from the analysis, leaving a total of 446 samples.

For the analysis we only looked at the pedestrians' age. Thus we categorized pedestrians as children (mid teen and below), adults and the elderly (above 65). Other factors such as gender and group size were excluded from the study.
Fig. 6. A visualization of sequences of events observed in the dataset: (a) crossing events, (b) no-crossing events. Diagram a) shows a summary of 345 sequences of pedestrians' actions before and after crossing takes place. Diagram b) shows 92 sequences of actions which did not result in crossing. Vertical bars represent the start of actions. Different types of actions are color-coded as the precondition to crossing, attention, reaction to driver's actions, crossing or ambiguous actions. Curved lines between the bars show connections between consecutive actions. The thickness of lines reflects the frequency of the action in the 'crossing' or 'no-crossing' subset. The sequences longer than 10 actions (e.g. when the pedestrian hesitates to cross) are extremely rare, and are truncated from both ends in the figure.

IV. OBSERVATIONS AND ANALYSIS

A. Crossing patterns

We observed a high variability of pedestrian behaviors before and after crossing events as well as in the cases when the crossing did not take place. To quantify these behaviors for further analysis we collected action labels for each pedestrian and sorted them by the start time. This led to a list of more than a hundred unique sequences of actions. We visualized these sequences in Fig. 6 for crossing and no-crossing events separately.
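A minimal sketch of how such action sequences can be assembled and counted is shown below. It assumes tracks shaped like the hypothetical annotation record from Section III-B, not the actual analysis scripts used for the paper.

```python
from collections import Counter

def action_sequence(behavior):
    """Order a pedestrian's behavioral labels by their start frame,
    e.g. [('standing', 0, 45), ('looking', 20, 60), ('crossing', 46, 180)]
    becomes ('standing', 'looking', 'crossing')."""
    return tuple(label for label, start, end in sorted(behavior, key=lambda b: b[1]))

def sequence_frequencies(tracks):
    """Count how often each unique action sequence occurs across all tracks."""
    return Counter(action_sequence(t["behavior"]) for t in tracks)

# Usage: sequence_frequencies(crossing_tracks).most_common(5) would surface
# patterns such as ('standing', 'looking', 'crossing') or
# ('moving', 'looking', 'crossing').
```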
In the figure, the beginning of each new action is marked with a vertical bar and curved lines are used to show connections between consecutive actions. The thickness of each line reflects the frequency of the action occurrence in the dataset. We categorize actions based on their semantic meaning into 5 types: action, precondition to crossing, attention, reaction to driver's actions, crossing. These are shown in the diagram in different colors. Note that only some actions belong to a single type. For example, attention includes only 'looking' and 'glancing', whereas 'standing' may be either a precondition to crossing (e.g. standing at the curb) or a reaction to driver's action (e.g. stopping when the driver did not yield). The diagram does not reflect the durations of actions and overlapping actions; however, it demonstrates the variability of pedestrian behavior and uneven distribution of occurrences of certain actions. The driver's actions are not explicitly shown in order to simplify the diagram.

The diagram in Fig. 6a shows 345 sequences of pedestrians' actions prior to and after the crossing takes place. Two patterns, namely 'standing, looking, crossing' and 'moving, looking, crossing', describe more than 1/3 of the crossing events. This means that many pedestrians attend to traffic as they are waiting at the curb or approaching the road before crossing. The remaining two-thirds of the crossing scenarios are more complex and involve multiple actions before and after the crossing. For instance, a pedestrian may move towards the curb, stop, attend to the traffic, acknowledge the yielding driver by nodding and finally cross the street, while checking again whether it is safe to cross ('moving, looking, nod, crossing, looking'). In rare ambiguous situations pedestrians and drivers may go through a cycle of actions and reactions before one of them yields to the other.

Similarly, in no-crossing events, 1/3 of all pedestrians are waiting at the curb and observing the traffic, which corresponds to the 'standing, looking' sequence in Fig. 6b. In other scenarios pedestrians may have started crossing already but are forced to clear the way ('clear path'), slow down or stop if the driver is not giving them the right of way. Pedestrians also may yield to the drivers. In one of the videos, the pedestrian is approaching the road while looking at the traffic, slows down, waves his hand at the approaching car and stops to let it pass ('moving, looking, slow down, handwave, standing, looking').

The diagram also shows the frequency of occurrence of actions at certain points during the crossing/no-crossing events. This is reflected in the vertical bar height (taller bars correspond to more common actions) as well as the thickness of the lines connecting this bar to the next (frequency for different types of subsequent actions). For example, it can be inferred that most no-crossing events observed in the data start with pedestrians standing at the curb. In approximately 20% of the crossing events, pedestrians who already started crossing look again at the traffic to check that it is safe to continue.

B. Non-verbal communication

In more than 90% of the clips in our dataset, we observed some form of non-verbal communication. Perhaps the most prominent form of body language (which was present in all these cases) is the change in pedestrians' head orientation.
Head movement and looking towards the approaching traffic is often a strong indicator of pedestrian crossing intention. In fact, out of the total number of instances of pedestrians' head movements, more than 80% occurred prior to crossing. The remaining 20% were the cases when the vehicles were fully stopped behind a traffic signal.

Changes in head orientation also serve as evidence of pedestrians paying attention to their surroundings. For example, they may check the state of the traffic signal or the distance to the approaching vehicle. This assessment can happen either before or during crossing. Sometimes pedestrians do not show initial intention of crossing but might attend to the traffic while crossing (which accounted for 20% of the attention occurrences).

Pedestrian head orientation can be either in the form of looking (90% of the time) or glancing (10% of the time). Looking lasts for a longer period of time (more than 1s), whereas glancing is very brief (less than 1s) and is in the form of a quick head movement towards the traffic. The range of motion that pedestrians exhibit while paying attention also varies significantly. As illustrated in Fig. 7, head orientation can be very subtle (Fig. 7a) or rather extreme, involving major changes in body posture (Fig. 7d).

Fig. 7. Examples of pedestrian looking to assess the environment.

Other forms of communication are rarer and often are used as a sign of acknowledgment or response to the driver's behavior. These non-verbal cues are either explicit or implicit. The explicit forms include nodding and hand gesture, which could convey a different meaning depending on the context. For example, hand gesture is used as a form of showing gratitude, yielding or asking for the right of way. The implicit responses, on the other hand, are in the form of changes in pedestrian action including stopping, clearing path, slowing down or speeding up.

The occurrence frequency of attention and response behaviors is shown in Fig. 8.

Fig. 8. Types and frequency of attention and response occurrences.

C. Environmental factors

To analyze the effects of environmental factors, we consider the following two features that characterize a crosswalk: the presence of a traffic signal or a zebra crossing (Fig. 9), and the width of the street (Fig. 10). The former factor identifies a crosswalk as non-designated (no form of traffic signal, sign or zebra crossing is present), zebra crossing (either a zebra crossing or a pedestrian crossing sign is present), or signal (a traffic signal such as a traffic light or stop sign is present). We treat the cases in which both a signal and a zebra crossing are available as signalized crosswalks.

Fig. 9. Examples of crosswalks with/without signal and zebra crossing: (a) non-designated, (b) zebra crossing, (c) signal, (d) zebra crossing and signal.

Note that in the above categorization we differentiate between the crosswalks with only a zebra crossing (or/and pedestrian crossing sign) and the ones with a signal. The reason is that traffic signals have a stronger prohibitive strength in forcing vehicles to yield to pedestrians. For example, a vehicle must stop before a stop sign or a red light, whereas it can choose whether to yield to the pedestrian when there is only a zebra crossing present. Naturally, pedestrians are more cautious when crossing a street without a signal compared to crossing at a signalized crosswalk.

Being more cautious means that pedestrians are more likely to assess their surroundings prior to crossing. This was confirmed in our analysis. According to our data, pedestrians paid attention (turned their heads towards the traffic) in 81% of the times at non-designated crosswalks, compared to 69% at zebra crossings and only 36% at signalized crosswalks.

Besides attention frequency, the presence of traffic signals can be a direct indicator of how likely pedestrians will cross. To show this, we split our data into crossing and no-crossing events.
Fig. 11. a) the distribution of data for crossing and no-crossing events, and b) the probability of crossing based on crosswalk characteristics.

Fig. 11a shows the distribution of the data in each category. By computing the likelihood of crossing for each crosswalk type (Fig. 11a) we can see that in over 95% of the cases pedestrians crossed the street (regardless of other factors) when some form of designated signal or zebra crossing was present. On the other hand, the crossing probability is less than 50% when there is no zebra crossing or traffic signal available (see Fig. 11b).
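This kind of likelihood can be computed directly from the annotated samples. The following is a minimal sketch under the hypothetical record layout used earlier (a 'crosswalk' type and a boolean 'crossed' flag per sample).

```python
from collections import defaultdict

def crossing_probability(samples):
    """Fraction of samples that ended in a crossing, grouped by crosswalk type.

    Each sample is assumed to carry a 'crosswalk' field ('non-designated',
    'zebra crossing' or 'signal') and a boolean 'crossed' flag.
    """
    crossed = defaultdict(int)
    total = defaultdict(int)
    for s in samples:
        total[s["crosswalk"]] += 1
        crossed[s["crosswalk"]] += int(s["crossed"])
    return {kind: crossed[kind] / total[kind] for kind in total}

# Usage: crossing_probability(annotated_samples) yields per-type likelihoods
# comparable to Fig. 11b (above 0.95 for designated crosswalks, below 0.5 for
# non-designated ones in our data).
```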
Street width also plays a role in defining pedestrian behavior. The types of streets in our data can be grouped into narrow (1 and 2 lanes) and wide (3 and 4 lanes) streets. Narrow streets are mainly residential roads in which vehicles drive slowly. Wide streets, on the other hand, are main roads where traffic is usually bidirectional and vehicles commute with a higher speed.

Fig. 10. Examples of crosswalks with different widths: (a) single lane, (b) two lanes, (c) three lanes, (d) four lanes.

Given the above characteristics, the width of the street can impact crossing in two ways: first, the width of the street determines how long it would take a pedestrian to cross. Therefore, the pedestrian's crossing affordance decreases as the street width increases. Second, the fact that vehicles drive faster in wider streets means that the associated risk of crossing is also higher. These two factors directly affect pedestrian behavior. According to our findings, on average pedestrians pay attention to the approaching traffic prior to crossing 87% of the times when crossing narrow streets, whereas they do so over 95% of the times in wide streets. This means that pedestrians are generally more cautious when crossing wider streets.

D. Pedestrian factors

In this section, we analyze the effect of pedestrian demographics, in particular, age on crossing behavior. Here, we are interested to see how age influences the frequency of attention prior to crossing. We found that the older a pedestrian, the more conservatively he/she behaves, therefore he/she will be more likely to pay attention to the traffic. In fact, in our data, the frequency of attention was below 40% for children, 72% for adults and 76% for the elderly.

Another finding is that the attention duration of pedestrians may vary. On average, adults tend to look at the traffic for 1.32s, children for 1.43s and the elderly for 1.45s.

E. Dynamic factors

In this section we examine pedestrian gap acceptance in terms of Time-To-Collision (TTC). Based on our data, pedestrians on average cross the street between the TTC of 3 to 7 seconds. TTC is also shown to influence how frequently pedestrians will pay attention to the approaching traffic prior to crossing. As depicted in Fig. 12, there are no instances of pedestrians attempting to cross without paying attention when the TTC is below 3s. Furthermore, it can be seen that the frequency of crossing without paying attention is correlated with TTC, i.e. the chance of crossing without attention rises as the TTC increases. This is expected because the farther (or slower) the vehicles are, the safer pedestrians feel to cross.

Fig. 12. The relationship between the TTC and the probability of attention occurrence prior to crossing.
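The attention-versus-TTC relationship in Fig. 12 can be reproduced by binning samples by TTC. Below is a minimal sketch, again assuming hypothetical per-sample fields (a 'ttc' value in seconds and an 'attended' flag) rather than the actual analysis code.

```python
def attention_by_ttc(samples, bin_width_s=1.0):
    """Frequency of attention (head turned towards traffic) per TTC bin.

    Assumes each sample has a 'ttc' value in seconds and a boolean 'attended'
    flag; returns {bin_start_seconds: attention_frequency}.
    """
    attended, totals = {}, {}
    for s in samples:
        b = int(s["ttc"] // bin_width_s) * bin_width_s
        totals[b] = totals.get(b, 0) + 1
        attended[b] = attended.get(b, 0) + int(s["attended"])
    return {b: attended[b] / totals[b] for b in sorted(totals)}

# Plotting the returned dictionary gives a curve comparable to Fig. 12:
# attention is near-universal at low TTC and drops as the TTC grows.
```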
We further split our findings into two groups of non-designated and designated (with a signal or a zebra crossing) crosswalks to measure the influence of street structure on attention occurrence. The results are illustrated in Fig. 13. Here, once again, we can see that when some form of traffic signal or designated crossing is present, pedestrians feel more certain about crossing, and thus, tend to pay less attention prior to crossing. In the case of designated crosswalks, the results are somewhat complementary. Pedestrians tend to pay more attention between the TTC of 2 to 5s, whereas they do so less between the TTC of 6 to 9 seconds, and, of course, much less when the TTC is 15s or higher, meaning that the car is far away or moving very slow. As for non-designated crosswalks, in general, pedestrians do pay attention prior to crossing, and they do more so when the TTC is low.

Fig. 13. Pedestrian attention frequency based on TTC: (a) designated, (b) non-designated.

TTC in conjunction with the street's structure also influences the probability of attention occurrence prior to crossing. As shown in Fig. 14, although attention occurrence is less likely in narrow streets, overall, the higher the TTC gets, the lower is the chance of pedestrians paying attention when crossing both wide and narrow streets.

Fig. 14. Occurrence of attention based on TTC and the structure of the street.

We found that pedestrians behave differently at different TTC values. For example, as shown in Fig. 15, children and adults, on average, spend less time assessing the traffic prior to crossing, the higher the TTC gets. This is expected because the associated risk of crossing decreases when the vehicles are far away or their speed is low. In the case of the elderly, however, we observe that the attention duration increases steadily, reaching a maximum point somewhere between the TTC of 7 to 8s and suddenly plummets when approaching the TTC of 10s. At first, this result might seem counterintuitive but a closer look at the data reveals that these are the cases where the vehicle is at a far distance. Given that the elderly take more care in developing their decisions, they require more time for observing the traffic.

Fig. 15. The relationship between pedestrian attention duration and TTC.

F. Driver action and pedestrian crossing

Besides the contextual elements we discussed earlier, driver actions are of vital importance in determining pedestrian crossing behavior. To investigate their impact, we divide driver actions into three types: when the driver either maintains the current speed or speeds up (speeds), reduces the speed (slows down), or comes to a full stop (stops). In addition, we divide the scenarios into non-designated, zebra crossing and traffic signal (see Section IV-C for definitions).

TABLE I
PEDESTRIAN CROSSING FREQUENCY AND THE DRIVER'S ACTION. DA AND DL STAND FOR DRIVER'S ACTION AND DELINEATION RESPECTIVELY.

DL \ DA            no-crossing                         crossing
                   Speeds   Slows down   Stops         Speeds   Slows down   Stops
Non-designated     0.962    0.013        0.025         0.196    0.643        0.161
Zebra crossing     1        0            0             0.25     0.6          0.15
Traffic signal     0        1            0             0.571    0.286        0.143
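The entries of Table I are conditional frequencies of driver actions within each (delineation, crossing-outcome) group. A minimal sketch of that computation is shown below, assuming hypothetical per-event fields for delineation, driver action and outcome.

```python
from collections import Counter, defaultdict

def driver_action_distribution(events):
    """Normalized frequency of driver actions per (delineation, outcome) group,
    in the spirit of Table I.

    Each event is assumed to provide 'delineation' ('non-designated',
    'zebra crossing' or 'traffic signal'), 'driver_action' ('speeds',
    'slows down' or 'stops') and a boolean 'crossed'.
    """
    counts = defaultdict(Counter)
    for e in events:
        key = (e["delineation"], "crossing" if e["crossed"] else "no-crossing")
        counts[key][e["driver_action"]] += 1
    return {key: {action: n / sum(c.values()) for action, n in c.items()}
            for key, c in counts.items()}

# Usage: driver_action_distribution(events)[("non-designated", "crossing")]
# returns proportions comparable to the corresponding row of Table I,
# e.g. {'speeds': 0.196, 'slows down': 0.643, 'stops': 0.161}.
```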

Table I shows the distribution of each scenario according to driver action. As can be seen, when no signal or zebra crossing is present, in less than 20% of the cases a crossing event takes place even if the vehicle had maintained or increased its speed. In these cases, either the TTC is very high (average of 25.7s) or there is traffic congestion and the pedestrian anticipates that the car would stop shortly (see Fig. 16 for an example). More than 98% of no-crossing scenarios occurred when the driver did not slow down or stop for the pedestrian. There were cases, however, in which the driver yielded to the pedestrian and stopped but the pedestrian did not cross and gave the right of way to the driver by making a hand gesture (see Fig. 17 for an example).

Fig. 16. The pedestrian is crossing the street regardless of the driver's action because he anticipates that the vehicle will stop due to traffic congestion.

Fig. 17. The pedestrian is giving the right of way to the driver, despite the driver stopping.

Our data suggests that pedestrians are still fairly conservative despite the presence of zebra crossings (or pedestrian crossing signs). In this case, only 25% of pedestrians crossed the street when the driver did not slow down or stop. At the signalized crosswalks, however, we observe a very different pedestrian behavior. In about 60% of the cases pedestrians cross the street even though the driver speeds. This means that, in this context, driver action is almost irrelevant because pedestrians expect the driver to comply with the law and stop at the signal. It should be noted that no-crossing events only occurred at signalized crosswalks when the pedestrian chose not to cross and yielded to the driver.
V. CONCLUSION

Communicating with pedestrians and understanding their intentions is a complex problem. Not only is the state of pedestrians, such as their pose, head orientation or gait, an indicator of crossing intention, but also the context in which they are observed can play an important role.

Pedestrians often use explicit means of communication such as handwave to resolve conflicts in traffic scenes, e.g. yielding to the driver, requesting the right of way, etc. The variability of these behaviors is high and they can convey very different meaning depending on the context.

We showed that the elements present in a scene can help to predict what a pedestrian is going to do next. Street properties such as width, the presence of zebra crossings or traffic signals can determine pedestrians' level of confidence while crossing. In addition, the driver's dynamic state with respect to the pedestrians is important. Factors such as TTC, which reflects the speed and the position of the vehicle, should be considered.

Our findings also suggest that there is an interrelationship between contextual elements. For instance, although the majority of pedestrians tend to look at the traffic prior to crossing, they do so less when the street is narrow or when TTC is high. This is also true if the crosswalk is signalized because pedestrians feel safer and are, therefore, less cautious while crossing.

Understanding the context of a traffic scene is not limited to what we investigated in this work. There are other aspects that need to be further studied such as environmental conditions, e.g. weather or lighting, social conditions, e.g. group vs individual behavior, and demographics of the participants, e.g. the elderly vs children.

Furthermore, to better understand the nature of communication that takes place between drivers and pedestrians, it would be beneficial to collect data that contains both the motions of pedestrians and the driver (e.g. by recording the driver's facial expressions).

One particular limitation of this study was that our analysis was only based on a naturalistic dataset. Therefore, some subjectivity bias was present in judging pedestrians' intention or determining whether they are looking towards the recording vehicle or not. This issue can be resolved by conducting a survey of drivers and pedestrians after crossing, asking them about their true intention or whether they noticed the vehicle (or the pedestrian's gaze) prior to crossing. We intend to investigate this in the future.

ACKNOWLEDGMENT

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the NSERC Canadian Field Robotics Network (NCFRN), the Air Force Office for Scientific Research (USA), and the Canada Research Chairs Program through grants to JKT.

REFERENCES

[1] F. Kröger, "Automated driving in its social, historical and cultural contexts," in Autonomous Driving. Springer, 2016, pp. 41–68.
[2] E. D. Dickmanns and A. Zapp, "A curvature-based scheme for improving road vehicle guidance by computer vision," in Cambridge Symposium Intelligent Robotics Systems. International Society for Optics and Photonics, 1987, pp. 161–168.
[3] A. Broggi, M. Bertozzi, A. Fascioli, C. G. L. Bianco, and A. Piazzi, "The ARGO autonomous vehicle's vision and control systems," International Journal of Intelligent Control and Systems, vol. 3, no. 4, pp. 409–441, 1999.
[4] S. Thrun, M. Montemerlo, H. Dahlkamp, D. Stavens, A. Aron, J. Diebel, P. Fong, J. Gale, M. Halpenny, G. Hoffmann et al., "Stanley: The robot that won the DARPA grand challenge," Journal of Field Robotics, vol. 23, no. 9, pp. 661–692, 2006.
[5] C. Urmson, J. Anhalt, D. Bagnell, C. Baker, R. Bittner, M. Clark, J. Dolan, D. Duggins, T. Galatali, C. Geyer et al., "Autonomous driving in urban environments: Boss and the urban challenge," Journal of Field Robotics, vol. 25, no. 8, pp. 425–466, 2008.
[6] "Watch Stanford's self-driving vehicle hit 120mph: Autonomous Audi proves to be just as good as a race car driver," Online, 2017-05-28. [Online]. Available: http://www.dailymail.co.uk/sciencetech/article-3472223/Watch-Stanford-s-self-driving-vehicle-hit-120mph-Autonomous-Audi-proves-just-good-race-car-driver.html
[7] (2014, Nov.) Think Act: Autonomous Driving. Online. [Online]. Available: https://new.rolandberger.com/wp-content/uploads/Roland_Berger_Autonomous-Driving1.pdf
[8] V. Nguyen, "2019 Audi A8 level 3 autonomy first-drive: Chasing the perfect 'jam'," Online, 2017-11-10. [Online]. Available: https://www.slashgear.com/2019-audi-a8-level-3-autonomy-first-drive-chasing-the-perfect-jam-11499082/
[9] SAE On-Road Automated Vehicle Standards Committee and others, "Taxonomy and definitions for terms related to on-road motor vehicle automated driving systems," SAE Standard J3016, pp. 01–16, 2014.
[10] F. Lambert, "Elon Musk clarifies Tesla's plan for level 5 fully autonomous driving: 2 years away from sleeping in the car," Online, 2017-05-30. [Online]. Available: https://electrek.co/2017/04/29/elon-musk-tesla-plan-level-5-full-autonomous-driving/
[11] G. Nica, "BMW CEO wants autonomous driving cars within five years," Online, 2017-05-28. [Online]. Available: http://www.bmwblog.com/2016/08/02/bmw-ceo-wants-autonomous-driving-cars-within-five-years/
[12] E. Ackerman, "Toyota's Gill Pratt on self-driving cars and the reality of full autonomy," Online, 2017-05-30. [Online]. Available: http://spectrum.ieee.org/cars-that-think/transportation/self-driving/toyota-gill-pratt-on-the-reality-of-full-autonomy
[13] B. Friedrich, "The effect of autonomous vehicles on traffic," Autonomous Driving, Technical, Legal and Social Aspects, pp. 317–334, 2016.
[14] T. M. Gasser, "Fundamental and special legal questions for autonomous vehicles," Autonomous Driving, Technical, Legal and Social Aspects, pp. 523–551, 2016.
[15] D. Muoio, "6 scenarios self-driving cars still can't handle," Online, 2017-05-30. [Online]. Available: http://www.businessinsider.com/autonomous-car-limitations-2016-8/#1-driverless-cars-struggle-going-over-bridges-1
[16] I. Wolf, "The interaction between humans and autonomous agents," in Autonomous Driving. Springer, 2016, pp. 103–124.
[17] B. Färber, "Communication and communication problems between autonomous vehicles and human drivers," in Autonomous Driving. Springer, 2016, pp. 125–144.
[18] M. Richtel, "Google's driverless cars run into problem: Cars with drivers," Online, 2017-05-30. [Online]. Available: https://www.nytimes.com/2015/09/02/technology/personaltech/google-says-its-not-the-driverless-cars-fault-its-other-drivers.html?_r=2
[19] S. E. Anthony, "The trollable self-driving car," Online, 2017-05-30. [Online]. Available: http://www.slate.com/articles/technology/future_tense/2016/03/google_self_driving_cars_lack_a_human_s_intuition_for_what_other_drivers.html
[20] M. Gough, "Machine smarts: how will pedestrians negotiate with driverless cars?" Online, 2017-05-30. [Online]. Available: https://www.theguardian.com/sustainable-business/2016/sep/09/machine-smarts-how-will-pedestrians-negotiate-with-driverless-cars
[21] M. McFarland, "Robots hit the streets – and the streets hit back," Online, 2017-05-30. [Online]. Available: http://money.cnn.com/2017/04/28/technology/robot-bullying/
[22] D. Johnston, "Road accident causality: A critique of the literature and an illustrative case," Ontario: Grand Rounds, Department of Psychiatry, Hotel Dieu Hospital, 1973.
[23] G. Silberg and R. Wallace. (2012) Self-driving cars: The next revolution. Online. [Online]. Available: https://www.kpmg.com/Ca/en/IssuesAndInsights/ArticlesPublications/Documents/self-driving-cars-next-revolution.pdf
[24] W. Knight. (2015) Car-to-Car Communication. Online. [Online]. Available: https://www.technologyreview.com/s/534981/car-to-car-communication/
[25] (2016, May) Honda tech warns drivers of pedestrian presence. Online. [Online]. Available: http://www.cnet.com/roadshow/news/nikola-motor-company-the-ev-startup-with-the-worst-most-obvious-name-ever/
[26] G. Wilde, "Immediate and delayed social interaction in road user behaviour," Applied Psychology, vol. 29, no. 4, pp. 439–460, 1980.
[27] D. Clay, "Driver attitude and attribution: implications for accident prevention," Ph.D. dissertation, Cranfield University, 1995.
[28] M. Sucha, D. Dostal, and R. Risser, "Pedestrian-driver communication and decision strategies at marked crossings," Accident Analysis & Prevention, vol. 102, pp. 41–50, 2017.
[29] R. Risser, "Behavior in traffic conflict situations," Accident Analysis & Prevention, vol. 17, no. 2, pp. 179–197, 1985.
[30] D. Rothenbücher, J. Li, D. Sirkin, B. Mok, and W. Ju, "Ghost driver: A field study investigating the interaction between pedestrians and driverless vehicles," in International Symposium on Robot and Human Interactive Communication (RO-MAN), 2016, pp. 795–802.
[31] T. Lagstrom and V. M. Lundgren, "AVIP–autonomous vehicles interaction with pedestrians," Ph.D. dissertation, Thesis, 2015.
[32] C. P. Urmson, I. J. Mahon, D. A. Dolgov, and J. Zhu, "Pedestrian notifications," US Patent US 9 196 164 B1, 11 24, 2015.
[33] E. C. Yingzi Du, K. Yang, F. Jiang, P. Jiang, R. Tian, M. Luzetski, Y. Chen, R. Sherony, and H. Takahashi, "Pedestrian behavior analysis using 110-car naturalistic driving data in USA," Online, 2017-06-3. [Online]. Available: https://www-nrd.nhtsa.dot.gov/pdf/Esv/esv23/23ESV-000291.pdf
[34] S. Schmidt and B. Färber, "Pedestrians at the kerb–recognising the action intentions of humans," Transportation Research Part F: Traffic Psychology and Behaviour, vol. 12, no. 4, pp. 300–310, 2009.
[35] M. M. Ishaque and R. B. Noland, "Behavioural issues in pedestrian speed choice and street crossing behaviour: a review," Transport Reviews, vol. 28, no. 1, pp. 61–85, 2008.
[36] T. Wang, J. Wu, P. Zheng, and M. McDonald, "Study of pedestrians' gap acceptance behavior when they jaywalk outside crossing facilities," in Intelligent Transportation Systems (ITSC), 2010, pp. 1295–1300.
[37] D. Sun, S. Ukkusuri, R. F. Benekohal, and S. T. Waller, "Modeling of motorist-pedestrian interaction at uncontrolled mid-block crosswalks," Urbana, vol. 51, p. 61801, 2002.
[38] E. Papadimitriou, G. Yannis, and J. Golias, "A critical assessment of pedestrian behaviour models," Transportation Research Part F: Traffic Psychology and Behaviour, vol. 12, no. 3, pp. 242–255, 2009.
[39] R. R. Oudejans, C. F. Michaels, B. van Dort, and E. J. Frissen, "To cross or not to cross: The effect of locomotion on street-crossing behavior," Ecological Psychology, vol. 8, no. 3, pp. 259–267, 1996.
[40] A. Lindgren, F. Chen, P. W. Jordan, and H. Zhang, "Requirements for the design of advanced driver assistance systems–the differences between Swedish and Chinese drivers," International Journal of Design, vol. 2, no. 2, 2008.
[41] M. M. Trivedi, T. B. Moeslund et al., "Trajectory analysis and prediction for improved pedestrian safety: Integrated framework and evaluations," in Intelligent Vehicles Symposium (IV), 2015, pp. 330–335.
[42] J. F. Kooij, N. Schneider, and D. M. Gavrila, "Analysis of pedestrian dynamics from a vehicle perspective," in Intelligent Vehicles Symposium (IV), 2014, pp. 1445–1450.
[43] B. Völz, K. Behrendt, H. Mielenz, I. Gilitschenski, R. Siegwart, and J. Nieto, "A data-driven approach for pedestrian intention estimation," in Intelligent Transportation Systems (ITSC), 2016, pp. 2607–2612.
[44] H. Bai, S. Cai, N. Ye, D. Hsu, and W. S. Lee, "Intention-aware online POMDP planning for autonomous driving in a crowd," in ICRA, 2015, pp. 454–460.
[45] J. F. P. Kooij, N. Schneider, F. Flohr, and D. M. Gavrila, "Context-based pedestrian path prediction," in ECCV. Springer, 2014, pp. 618–633.
[46] Y. Hashimoto, Y. Gu, L.-T. Hsu, and S. Kamijo, "Probability estimation for pedestrian crossing intention at signalized crosswalks," in International Conference on Vehicular Electronics and Safety (ICVES), 2015, pp. 114–119.
[47] F. Schneemann and P. Heinemann, "Context-based detection of pedestrian crossing intention for autonomous driving in urban environments," in IROS, 2016, pp. 2243–2248.

Amir Rasouli received his B.Eng. degree in Computer Systems Engineering at Royal Melbourne Institute of Technology in 2010 and his M.A.Sc. degree in Computer Engineering at York University in 2015. He is currently working towards the PhD degree in Computer Science at the Laboratory for Active and Attentive Vision, York University. His research interests are autonomous robotics, computer vision, visual attention, autonomous driving and related applications.

Iuliia Kotseruba received her B.Sc. degree in Computer Science at University of Toronto in 2010 and her M.Sc. degree in Computer Science at York University in 2016. She is currently working as a Research Associate at the Laboratory for Active and Attentive Vision, York University. Her research interests include visual attention, computer vision, cognitive systems and autonomous driving.

John K. Tsotsos is Distinguished Research Professor of Vision Science at York University. He received his doctorate in Computer Science from the University of Toronto. After a postdoctoral fellowship in Cardiology at Toronto General Hospital, he joined the University of Toronto on faculty in Computer Science and in Medicine. In 1980 he founded the Computer Vision Group at the University of Toronto, which he led for 20 years. He was recruited to York University in 2000 as Director of the Centre for Vision Research. He has been a Canadian Heart Foundation Research Scholar, Fellow of the Canadian Institute for Advanced Research and Canada Research Chair in Computational Vision. He received many awards and honours including several best paper awards, the 2006 Canadian Image Processing and Pattern Recognition Society Award for Research Excellence and Service, the 1st President's Research Excellence Award by York University in 2009, and the 2011 Geoffrey J. Burton Memorial Lectureship from the United Kingdom's Applied Vision Association for significant contribution to vision science. He was elected as Fellow of the Royal Society of Canada in 2010 and was awarded its 2015 Sir John William Dawson Medal for sustained excellence in multidisciplinary research, the first computer scientist to be so honoured. Over 125 trainees have passed through his lab. His current research focuses on a comprehensive theory of visual attention in humans. A practical outlet for this theory embodies elements of the theory into the vision systems of mobile robots.
