Yamato OKAMOTO
2018/12/16
IntelliLight: A Reinforcement
Learning Approach for Intelligent
Traffic Light Control
(KDD’18)
Who is Yamato ??
- Master of Informatics, Kyoto University, Japan (2013)
- Working as a business developer & AI researcher at OMRON.Inc (2013~)
twitter RoadRoller_DESU
@ICDM’18
Banquet
Today’s paper
IntelliLight: A Reinforcement Learning Approach for Intelligent
Traffic Light Control (KDD’18)
Hua Wei, Guanjie Zheng, Huaxiu Yao, Zhenhui Li
Pennsylvania State University, University Park, PA, USA
Why I chose this paper
- This paper has two distinctive points:
1. They tested the methods on real-world traffic data.
2. They tried to interpret the learned policies.
Motivation
Traffic congestion has become increasingly costly.
One way to reduce the traffic congestion is by intelligently
controlling traffic lights.
Related Work (1/2)
Self-Organizing Traffic Light Control (SOTL)
controls the traffic light according to the current traffic state
(the “state” includes the elapsed time and the number of vehicles
waiting at the red light).
The traffic light changes when the number of waiting cars exceeds
a hand-tuned threshold.
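The threshold rule described above can be sketched as follows. This is a minimal illustration, not the SOTL reference implementation: the function name, the default `threshold`, and the `min_green` guard on elapsed time are all assumptions made for the example.

```python
def sotl_should_switch(waiting_at_red: int, elapsed_green: float,
                       threshold: int = 10, min_green: float = 5.0) -> bool:
    """Return True when the light should change phase.

    Hypothetical rule: switch once the current phase has lasted at least
    `min_green` seconds and more than `threshold` vehicles wait at red.
    Both defaults are hand-tuned placeholders.
    """
    return elapsed_green >= min_green and waiting_at_red > threshold


# Usage: 12 cars waiting after 6 s of green triggers a switch.
print(sotl_should_switch(waiting_at_red=12, elapsed_green=6.0))
```

The rule only looks at the present state, which is why it cannot anticipate traffic that has not arrived yet.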
Remaining Challenges
The rule reacts only to the current state, without taking future traffic into account.
To control traffic lights intelligently…
Detect waiting cars
and change the traffic light
Related Work (2/2)
Deep Reinforcement Learning for Traffic Light Control
Apply deep Q-learning to cope with the unmanageably large state space.
Learn a Q-function (e.g. a deep neural network) that maps a state and
an action to the expected reward. These works vary in their state representation
and reward design.
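For reference, the standard one-step Q-learning update that such methods build on can be written in a few lines. This is the textbook rule, not code from any of the cited works; the discount `gamma` and learning rate `lr` are placeholder values, and a deep Q-network would replace the scalar update with a gradient step toward the same target.

```python
from typing import Sequence


def td_target(reward: float, next_q_values: Sequence[float],
              gamma: float = 0.9) -> float:
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    return reward + gamma * max(next_q_values)


def q_update(q_sa: float, target: float, lr: float = 0.1) -> float:
    """Move the current estimate Q(s, a) a fraction `lr` toward the target."""
    return q_sa + lr * (target - q_sa)


# Usage: reward 1.0, next-state estimates for "keep" and "change".
target = td_target(1.0, [0.0, 2.0], gamma=0.5)
print(q_update(0.0, target, lr=0.5))
```

The `max` over next-state values is what lets the agent account for future traffic rather than only the immediate reward.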
Remaining Challenges
Previous studies all treat the traffic light phase as just one feature among many, and
this single feature does not have enough influence on the decision.
As a result, agents have difficulty distinguishing the decision
process for different traffic light phases.
To take future traffic into account…
Key Idea (1/2)
1. Phase-gated model learning
To distinguish the decision process for different phases, they
design a separate learning process for Q(s, a) for each phase.
These separate processes are selected through a gate
controlled by the phase:
when phase P = 0, the left branch is activated;
when phase P = 1, the right branch is activated.
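The gating idea can be illustrated with a toy model. The sketch below is an assumption-laden simplification: it uses one linear head per phase instead of the paper's neural network, and the class name `PhaseGatedQ` and its layout are invented for this example.

```python
from typing import List, Sequence


class PhaseGatedQ:
    """Toy phase-gated Q-function with one linear head per phase."""

    def __init__(self, n_features: int, n_phases: int = 2, n_actions: int = 2):
        # weights[phase][action] is a weight vector over the state features.
        self.weights = [
            [[0.0] * n_features for _ in range(n_actions)]
            for _ in range(n_phases)
        ]

    def q_values(self, state: Sequence[float], phase: int) -> List[float]:
        """Return Q(s, a) for every action, using only the branch for `phase`."""
        head = self.weights[phase]  # the gate: the other branch is never evaluated
        return [sum(w * x for w, x in zip(w_a, state)) for w_a in head]


# Usage: the same state gets different Q-values depending on the phase.
q = PhaseGatedQ(n_features=2)
q.weights[0][1] = [1.0, 0.0]  # phase 0 favours "change" when feature 0 is large
q.weights[1][0] = [0.0, 1.0]  # phase 1 favours "keep" when feature 1 is large
print(q.q_values([3.0, 4.0], phase=0), q.q_values([3.0, 4.0], phase=1))
```

Because each phase has its own parameters, the phase cannot be drowned out by the other input features.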
Key Idea (2/2)
2. Memory Palace and Model Updating
Imbalanced samples of traffic on different lanes lead to
inferior performance in less frequent situations.
To solve this, they use a different memory palace for each
traffic-light-phase-action combination:
training samples for different phase-action combinations
are stored in different memory palaces.
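One way to picture the memory palace is as a set of replay buffers keyed by (phase, action). The sketch below is a guess at the mechanism for illustration only: the class name, the per-palace capacity, and the equal-share sampling in `sample_balanced` are assumptions, not details confirmed by the slides.

```python
import random
from collections import defaultdict, deque


class MemoryPalace:
    """Replay memory split into one buffer per (phase, action) pair."""

    def __init__(self, capacity_per_palace: int = 1000):
        # Each palace is a bounded FIFO buffer; old samples drop off the left.
        self.palaces = defaultdict(lambda: deque(maxlen=capacity_per_palace))

    def store(self, phase: int, action: int, sample) -> None:
        self.palaces[(phase, action)].append(sample)

    def sample_balanced(self, per_palace: int, rng=random) -> list:
        """Draw up to `per_palace` samples from every palace."""
        batch = []
        for memory in self.palaces.values():
            k = min(per_palace, len(memory))
            batch.extend(rng.sample(list(memory), k))
        return batch


# Usage: a rare combination is not swamped by a frequent one.
palace = MemoryPalace()
for i in range(100):
    palace.store(0, 0, ("common", i))
for i in range(2):
    palace.store(1, 1, ("rare", i))
print(len(palace.sample_balanced(per_palace=2)))
```

With a single shared buffer, the two rare samples would almost never be drawn; per-combination buffers keep them in every training batch.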
Traffic Light Optimization Framework
State
A single intersection is considered. For each lane 𝒊 at this intersection:
𝑳𝒊 : queue length
𝑽𝒊 : number of vehicles
𝑾𝒊 : waiting time of vehicles
𝑴 : image representation of vehicles’ positions
Action
The traffic light has two actions:
a = 1: change the light to the next phase
a = 0: keep the current phase
Traffic Light Optimization Framework
Reward
The reward is defined as a weighted sum of the following factors:
𝑳𝒊 : sum of queue length L over all approaching lanes
𝑫𝒊 : sum of delay D over all approaching lanes
𝑾𝒊 : sum of updated waiting time W over all approaching lanes
𝑪 : C = 0 for keeping and C = 1 for changing the current phase
𝑵 : number of vehicles that passed the intersection during time interval ∆t after the last action
𝑻 : total time that those vehicles spent on approaching lanes
(*) This reward design is heuristic.
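The weighted sum can be sketched as below. The field names and, in particular, the weight values are placeholders invented for this example; they are not the weights used in the paper.

```python
from dataclasses import dataclass


@dataclass
class RewardTerms:
    queue_length: float   # sum of L_i over approaching lanes
    delay: float          # sum of D_i over approaching lanes
    waiting_time: float   # sum of W_i over approaching lanes
    phase_changed: int    # C: 1 if the phase was changed, else 0
    vehicles_passed: int  # N: vehicles counted during the last interval
    travel_time: float    # T: total time spent on approaching lanes


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
EXAMPLE_WEIGHTS = {
    "queue_length": -1.0,
    "delay": -1.0,
    "waiting_time": -1.0,
    "phase_changed": -5.0,
    "vehicles_passed": 1.0,
    "travel_time": 1.0,
}


def reward(terms: RewardTerms, weights: dict = EXAMPLE_WEIGHTS) -> float:
    """Weighted sum of the six reward factors."""
    return sum(weights[name] * getattr(terms, name) for name in weights)


# Usage: -4 - 2 - 3 - 5 + 6 + 1 = -7
print(reward(RewardTerms(4.0, 2.0, 3.0, 1, 6, 1.0)))
```

Keeping the weights in one dictionary makes it easy to see how a different heuristic weighting changes the agent's trade-off between queue length, switching cost, and throughput.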
Traffic Light Optimization Framework
Training (1) ~Offline Part~
• To collect data samples, traffic first runs under a fixed light timetable.
Training (2) ~Online Part~
• At every time interval ∆t, the traffic light agent observes the state of the
environment and takes the action with the maximum estimated reward,
following a greedy strategy.
• After that, the agent observes the environment again and gets the reward.
Then the tuple (s, a, r) is stored in memory.
• After several timestamps, the agent updates the network using the logs
in the memory.
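The online part can be sketched as a short loop. Everything here is a hypothetical scaffold: `env_step`, `observe`, `q_fn`, and `update_fn` stand in for the simulator and the network, `update_every` is an arbitrary choice for "several timestamps", and the optional `epsilon` exploration parameter is an addition that defaults to 0 (pure greedy, as on the slide).

```python
import random
from typing import Callable, List, Sequence, Tuple


def choose_action(q_values: Sequence[float], epsilon: float = 0.0,
                  rng=random) -> int:
    """Greedy choice; with probability `epsilon` pick a random action instead."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def run_online(env_step: Callable[[int], float],
               observe: Callable[[], object],
               q_fn: Callable[[object], Sequence[float]],
               update_fn: Callable[[List[Tuple]], None],
               n_steps: int, update_every: int = 4) -> List[Tuple]:
    """Observe, act, store (s, a, r), and update the model periodically."""
    memory: List[Tuple] = []
    state = observe()
    for t in range(1, n_steps + 1):
        action = choose_action(q_fn(state))
        r = env_step(action)           # apply the action, wait one interval, get reward
        memory.append((state, action, r))
        state = observe()              # state for the next decision
        if t % update_every == 0:
            update_fn(memory)          # e.g. fit the Q-network on the stored logs
    return memory


# Usage with stub callables in place of a real simulator and network.
log = run_online(env_step=lambda a: float(a), observe=lambda: 0,
                 q_fn=lambda s: [0.0, 1.0], update_fn=lambda m: None,
                 n_steps=8)
print(len(log))
```

In a real setup the stubs would be replaced by calls into the traffic simulator and a trainable Q-function.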
Experiment
In this paper, they conduct experiments using both synthetic
and real-world traffic data.
To evaluate the effectiveness of the proposed model, they
compare it with the following baseline methods:
- Fixed-time Control (FT)
- Self-Organizing Traffic Light Control (SOTL)
- Deep Reinforcement Learning for Traffic Light Control (DRL)
To interpret the learned signal policy, they show the percentage
of each action.
Experiment (1/3)
About Synthetic Data
Four traffic flow settings:
1. simple changing traffic (configuration 1)
2. equally steady traffic (configuration 2)
3. unequally steady traffic (configuration 3)
4. complex traffic (configuration 4) (*) a combination of the previous configurations
Experiment (1/3)
Performance on Synthetic Data
The proposed method, IntelliLight, achieves the best reward.
The proposed components MP (Memory Palace) and PG (Phase Gate) boost the
reward (but not in every configuration).
Experiment (2/3)
About Real-world Data and Performance on it
Data was collected by 1,704 surveillance cameras in Jinan (China) over the
period from 08/01/2016 to 08/31/2016.
By analyzing the records together with camera locations, the trajectories of vehicles
are recorded as they pass through road intersections.
The dataset covers 935 locations, and they feed this real-world traffic
setting into SUMO for online experiments.
The proposed method, IntelliLight, achieves
the best reward.
Experiment (3/3)
IntelliLight adjusts intelligently to different traffic conditions.
Peak hour vs. non-peak hour, weekday vs. weekend
Conclusion
This paper addresses the traffic light control problem
using a well-designed reinforcement learning
approach.
The proposed method distinguishes the decision process
for different traffic light phases.
They conducted experiments using both synthetic
and real-world data.
The proposed method showed superior performance
over state-of-the-art methods.
Thank you
r2d.info

