Artificial Intelligence Operated Elevator Using RL AIOERL
Abstract - Our paper explores the implementation of an Artificial Intelligence (AI) operated elevator system aimed at reducing user waiting times in a residential complex. With two elevators servicing a 14-story building, each floor accommodating six flats with approximately four residents per home, efficiency is paramount. Leveraging AI algorithms, our system dynamically adjusts elevator operations based on user demand patterns, traffic flow, and predictive analysis, ensuring minimal wait times and optimal passenger distribution. By integrating AI into elevator management, we aim to enhance user experience and streamline vertical transportation in high-density residential settings.

Key Words: Artificial Intelligence, elevator optimization, residential complexes, waiting time reduction, predictive analysis, passenger distribution, efficiency improvement, traffic flow management.

INTRODUCTION
The advent of Artificial Intelligence (AI) has revolutionized various domains, and its application in elevator systems holds significant promise for enhancing vertical transportation efficiency in high-rise residential complexes. With the proliferation of urbanization and the construction of taller buildings, the demand for efficient elevator operations has intensified. Our paper delves into the integration of AI technology to mitigate user waiting times within a residential complex comprising a 14-story building. With six flats per floor and an average of four occupants per home, optimizing elevator operations becomes imperative to ensure smooth passenger flow and minimal congestion. This introduction outlines the necessity and potential benefits of employing AI in elevator management to address the challenges of vertical transportation in densely populated residential environments.
RL and Model-Free RL

Reinforcement Learning (RL) is a branch of machine learning concerned with decision-making and control processes. Unlike supervised learning, where an algorithm learns from labeled input-output pairs, and unsupervised learning, where the algorithm discovers patterns in unlabeled data, RL focuses on learning from interactions with an environment to maximize a cumulative reward. At the core of RL is the concept of an agent, which learns to navigate an environment through trial and error, aiming to maximize its cumulative reward over time.

Model-Free Reinforcement Learning (MFRL) is a subset of RL that does not require knowledge of the environment's dynamics or transition probabilities. In other words, the agent learns directly from experience without explicitly modeling the environment. MFRL algorithms are particularly useful in scenarios where the environment is complex and obtaining a precise model is difficult or impractical. Instead, these algorithms focus on learning optimal policies through exploration and exploitation of the environment's state-action space.

Exploration of Model-Free Reinforcement Learning Methods

Various MFRL methods have been developed to tackle different types of problems, each with its strengths and weaknesses. Some common MFRL algorithms include Q-Learning, SARSA (State-Action-Reward-State-Action), Deep Q-Networks (DQN), and Policy Gradient methods such as REINFORCE. Each of these approaches has unique characteristics and is suited to specific types of tasks and environments.
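Before turning to DQN, a minimal sketch of the model-free idea may help: the tabular Q-Learning update below learns action-values from observed transitions alone, without any model of the environment. The state and action sizes, learning rate, and sample transition are placeholder values for illustration, not parameters of our elevator system.

import numpy as np

# Illustrative tabular Q-Learning update; sizes and values are placeholders.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))   # table of estimated action-values
alpha, gamma = 0.1, 0.95              # learning rate and discount factor

def q_learning_update(s, a, r, s_next):
    # Model-free: only the observed transition (s, a, r, s_next) is used.
    td_target = r + gamma * np.max(Q[s_next])    # bootstrap from best next action
    Q[s, a] += alpha * (td_target - Q[s, a])     # move estimate toward the target

# Example transition: in state 0, action 1 gave reward -1 and led to state 2.
q_learning_update(0, 1, -1.0, 2)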
For the elevator optimization problem described in our paper, a suitable MFRL method would be Deep Q-Networks (DQN). DQN is a powerful algorithm that combines Q-learning with deep neural networks, enabling it to handle large state-action spaces efficiently. Here are a few reasons why DQN is preferable for this application:

• Complex State-Action Space: Elevator systems operate in dynamic environments with multiple floors, varying passenger demand, and different elevator states (e.g., idle, moving, loading/unloading). DQN's ability to approximate the optimal action-values for large state spaces makes it well-suited for handling the complexity of elevator control.

• Continuous Learning: Elevator systems are subject to changing traffic patterns and user preferences, requiring continuous adaptation to optimize performance. DQN's iterative learning process allows the agent to update its policy based on new experiences, enabling it to adapt to evolving conditions over time.

• Exploration and Exploitation: Balancing exploration (trying new actions to discover optimal strategies) and exploitation (leveraging known information to maximize rewards) is crucial for elevator optimization. DQN incorporates epsilon-greedy exploration strategies, allowing the agent to explore different actions while gradually shifting towards exploiting learned policies as it gains experience.

• Scalability and Efficiency: With two elevators serving a 14-story building with multiple flats on each floor, scalability and computational efficiency are essential considerations. DQN's use of deep neural networks enables it to scale to larger environments while efficiently approximating Q-values, making it suitable for real-time elevator control.
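To give a rough sense of why a lookup table is impractical here, the short calculation below counts configurations for the building described in our paper (two elevators, 14 floors) under a deliberately coarse, assumed encoding (car floor, travel direction, and one call flag per floor); even this simplification yields tens of millions of states.

# Rough, illustrative count of system configurations (assumed coarse encoding).
floors = 14
positions = floors             # current floor of one car
directions = 3                 # up, down, idle
per_elevator = positions * directions            # 42 configurations per car
hall_calls = 2 ** floors                         # call button pressed or not per floor
total_states = (per_elevator ** 2) * hall_calls  # two cars
print(total_states)                              # 42 * 42 * 16384 = 28,901,376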
Algorithm for AIOERL

• State Representation: Define the state space for the elevator system. This could include information such as the current floor of each elevator, the direction of each elevator, the number of passengers in each elevator, the destination floors of the passengers, and the waiting time of passengers in the lobby (one possible encoding, together with an action set and a reward sketch, follows this list).

• Action Space: Define the action space for the elevators. Actions could include moving up, moving down, stopping at a floor, or opening/closing doors.

• Reward Function: Design a reward function that incentivizes efficient elevator operation. For example, rewards could be based on minimizing the waiting time of passengers, minimizing the time taken to reach destinations, and minimizing energy consumption.

• Q-Network: Implement a deep neural network (DNN) to approximate the Q-values for state-action pairs. The input to the network is the state representation, and the output is the Q-values for each possible action.

• Experience Replay: Implement experience replay to store and sample past experiences (state, action, reward, next state) for training the Q-network. This helps stabilize training and improve sample efficiency.

• Target Q-Network: Use a separate target Q-network to stabilize training. Periodically update the parameters of the target network with the parameters of the main Q-network.

• Epsilon-Greedy Exploration: Implement epsilon-greedy exploration to balance exploration and exploitation. With probability epsilon, select a random action (explore); otherwise, select the action with the highest Q-value (exploit).

• Training Procedure: Train the Q-network using a variant of the DQN algorithm such as Double DQN or Dueling DQN. Use gradient descent to minimize the temporal-difference error between the predicted Q-values and the target Q-values (a sketch of such a training loop follows the code listing below).

• Deployment: Once the Q-network is trained, deploy it to control the elevators in real time. At each time step, use the trained Q-network to select actions for the elevators based on the current state of the system.
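As a minimal sketch of the first three steps, the fragment below shows one possible way to encode a single elevator's state vector, enumerate its actions, and shape its reward. The chosen features, normalization constants, and penalty weights are illustrative assumptions rather than a fixed specification.

import numpy as np

NUM_FLOORS = 14
# Hypothetical action set matching the Action Space step above.
ACTIONS = ["up", "down", "stop", "open_door"]

def encode_state(car_floor, direction, passengers, max_wait, hall_calls):
    # Illustrative state vector for one elevator.
    # direction: -1 down, 0 idle, +1 up; hall_calls: 14 binary flags.
    features = [
        car_floor / NUM_FLOORS,   # normalized car position
        direction,                # travel direction
        passengers / 10.0,        # assumed car capacity of 10, for scaling
        max_wait / 60.0,          # longest lobby wait in seconds, scaled
    ]
    return np.array(features + list(hall_calls), dtype=np.float32)

def reward(total_wait_time, travel_time, energy_used):
    # Negative cost: penalize waiting, travel time, and energy (weights assumed).
    return -(1.0 * total_wait_time + 0.5 * travel_time + 0.1 * energy_used)

With this encoding the state vector has 4 + 14 = 18 entries, so state_size would be 18 rather than the placeholder value of 10 used in the code listing below.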
Python code for AIOERL

import numpy as np
import random
from collections import deque
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam


class DQNAgent:
    def __init__(self, state_size, action_size):
        self.state_size = state_size
        self.action_size = action_size
        self.memory = deque(maxlen=2000)   # replay buffer of past transitions
        self.gamma = 0.95                  # discount rate
        self.epsilon = 1.0                 # exploration rate
        self.epsilon_min = 0.01
        self.epsilon_decay = 0.995
        self.learning_rate = 0.001
        self.model = self._build_model()

    def _build_model(self):
        # Feed-forward network approximating Q-values, one output per action.
        model = Sequential()
        model.add(Dense(24, input_dim=self.state_size, activation='relu'))
        model.add(Dense(24, activation='relu'))
        model.add(Dense(self.action_size, activation='linear'))
        model.compile(loss='mse',
                      optimizer=Adam(learning_rate=self.learning_rate))
        return model

    def remember(self, state, action, reward, next_state, done):
        # Store a transition for experience replay.
        self.memory.append((state, action, reward, next_state, done))

    def act(self, state):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if np.random.rand() <= self.epsilon:
            return random.randrange(self.action_size)
        act_values = self.model.predict(state)
        return np.argmax(act_values[0])

    def replay(self, batch_size):
        # Train on a random minibatch of stored transitions.
        minibatch = random.sample(self.memory, batch_size)
        for state, action, reward, next_state, done in minibatch:
            target = reward
            if not done:
                target = (reward + self.gamma *
                          np.amax(self.model.predict(next_state)[0]))
            target_f = self.model.predict(state)
            target_f[0][action] = target   # update only the chosen action's value
            self.model.fit(state, target_f, epochs=1, verbose=0)
        if self.epsilon > self.epsilon_min:
            self.epsilon *= self.epsilon_decay   # decay exploration over time


# Define state and action sizes
state_size = 10   # Example: number of elevator state features
action_size = 4   # Example: number of elevator actions (up, down, stop, open door)
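To connect the agent above with the Training Procedure and Deployment steps, the fragment below sketches one possible training loop. ElevatorEnv is a hypothetical simulator with a reset()/step() interface and is not part of the code above; the episode count and batch size are arbitrary placeholders.

# Hypothetical training loop for the agent above. ElevatorEnv is an assumed
# simulator exposing reset() -> state and step(action) -> (next_state, reward, done);
# it is not defined in this paper.
env = ElevatorEnv(num_floors=14, num_elevators=2)   # assumed constructor
agent = DQNAgent(state_size, action_size)
batch_size = 32                                     # placeholder minibatch size

for episode in range(500):                          # placeholder episode count
    state = np.reshape(env.reset(), [1, state_size])
    done = False
    while not done:
        action = agent.act(state)                   # epsilon-greedy selection
        next_state, reward, done = env.step(action)
        next_state = np.reshape(next_state, [1, state_size])
        agent.remember(state, action, reward, next_state, done)
        state = next_state
        if len(agent.memory) > batch_size:
            agent.replay(batch_size)                # learn from replayed experience

# Deployment: after training, act greedily on the live system's current state,
# e.g. by setting agent.epsilon = 0.0 before calling agent.act(current_state).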
BIOGRAPHIES