MCA2 CA435 M1 Intro Agent Problem Solving
INTELLIGENCE
Text books:
[T1] Russell, Norvig: Artificial Intelligence: A Modern Approach, 2nd edition, Prentice Hall
[RefT1] https://github.com/aimacode
https://github.com/aimacode/aima-pseudocode/blob/master/algorithms.pdf
Slides contain content adapted from the text/reference books for lectures to BIT Students only
Syllabus
• MODULE-I
• Introduction: Overview of Artificial Intelligence- Problems of AI, AI Technique,
Tic-Tac-Toe Problem.
• Intelligent Agents: Agents & Environment, Nature of Environment, Structure of
Agents, Goal Based Agents, Utility Based Agents, Learning Agents.
• Problem Solving: Problems, Problem Space & Search: Defining the Problem as
State Space Search, Production System, Problem Characteristics, Issues In The
Design Of Search Programs.
Target Sample Questions
1. What is AI technique? What is Turing-test? Explain different application areas of AI
2. With suitable example show the difference between AI programming and conventional
programming.
3. Relationship/ contrast between AI, ML, DL, and logic programming
4. What is an agent? Describe briefly how an agent should act in a computer program
5. Explain i) goal-based agents ii) utility-based agents iii) simple reflex agents iv) model based reflex
agents v) learning agents
6. Differentiate between – i) Deterministic and stochastic environment ii) Episodic and sequential
environment
7. How can an AI technique be applied to the Tic-Tac-Toe problem? Explain in detail
8. Suggest one application of artificial intelligence (AI) in each of the following, indicating the
sensors, environment, actuators, and performance measure: i) healthcare ii) housekeeping iii) public
transport iv) banking v) manufacturing and production vi) defence vii) space mission viii)
environment care ix) agriculture x) e-Governance xi) education
9. What capability does each of the following components add to a computing system to make it act like a human:
natural language processing, knowledge representation, automated reasoning, machine learning,
computer vision, robotics
Module I
AI Concepts
“Can a machine think and behave like humans do?”
• Defining Intelligence
• Introduction to AI and Intelligent Agents
• AI problems and Solution approaches
• Problem solving using Search and Heuristics
• AI knowledge base: creation, updating, and
reasoning
• Broad categories of branches in AI and
intelligent systems
Types of Intelligence [Howard Gardner]
A machine or a system is artificially intelligent when it is equipped with at least one,
or all, of the intelligences listed below
Review Questions
• Suggest the type of intelligence for following AI systems:
• i) computer based medical diagnosis assistant ii) automated surgical system iii)
brain tumor detector iv) vacuum cleaner v) automated washing machine vi)
automated taxi vii) banking fraud detector viii) manufacturing part picking robot ix)
automated surveillance drone x) new celestial body detector xi) earthquake
detector xii) crop growth monitor xiii) geographical information system (GIS)
based water management system xiv) crime monitoring xv) Intelligent tutoring
system xvi) Gaming system for children with mental disorder xvii) Airport robot
like RADA xviii) socially assistive robot like PARO, iCube, NAO and OHMNI
xix) Pathfinder Game
Review Questions
• Examine the AI literature to discover whether the following
tasks can currently be solved by computers:
a. Playing a decent game of table tennis (ping-pong).
b. Driving in the center of Cairo.
c. Buying a week's worth of groceries at the market.
d. Buying a week’s worth of groceries on the web.
e. Playing a decent game of bridge at a competitive level.
f. Discovering and proving new mathematical theorems.
g. Writing an intentionally funny story.
h. Giving competent legal advice in a specialized area of law.
i. Translating spoken English into spoken Swedish in real time.
j. Performing a complex surgical operation.
What is Intelligence Composed Of?
• Reasoning - It is the set of processes that enable us to provide
basis for judgement, making decisions, and prediction.
– Inductive reasoning/Deductive reasoning
• Learning
– Auditory/ Episodic/ Motor/ Observational/ Perceptual/
Relational/ Sensory/
• Problem Solving - includes decision making, which is the process
of selecting the best suitable alternative out of multiple alternatives
to reach the desired goal.
• Perception - process of acquiring, interpreting, selecting, and
organizing sensory information.
• Linguistic Intelligence - ability to use, comprehend, speak,
and write the verbal and written language.
Foundations of AI
• Philosophy
– Can formal rules be used to draw valid conclusions?
– How does the mind arise from a physical brain?
– Where does knowledge come from?
– How does knowledge lead to action?
• Mathematics
– What are the formal rules to draw valid conclusions?
– What can be computed?
– How do we reason with uncertain information?
• Economics
– How should we make decisions so as to maximize payoff?
– How should we do this when others may not go along?
– How should we do this when the payoff may be far in the future?
• Neuroscience
– How do brains process information?
• Psychology
– How do humans and animals think and act?
• Computer Engineering
– How can we build an efficient computer?
• Control Theory and Cybernetics
– How can artifacts operate under their own control?
• Linguistics
– How does language relate to thought?
[R2. pp-VI]
Module I : AI Concepts- Introduction to AI
https://www.cs.utexas.edu/~mooney/cs343/slide-handouts/intro.4.pdf
➢ Definition of AI- Art of designing Systems that { think like humans +
think rationally + act like humans + act rationally }
➢ Thinking Humanly: Cognitive Modeling- Interdisciplinary field (AI,
psychology, linguistics, philosophy, anthropology) that tries to form
computational theories of human cognition
➢ Thinking Rationally : Laws of Thought- Formalize “correct”
reasoning using a mathematical model (e.g. of deductive reasoning)
➢ Acting Humanly: The Turing Test - If the response of a computer to
an unrestricted textual natural-language conversation cannot be
distinguished from that of a human being then it can be said to be
intelligent
➢ Acting Rationally: Rational Agents- rationality involves maximizing
goals within the computational and other resources available.
Module I : AI Concepts- Introduction to AI
https://www.cs.utexas.edu/~mooney/cs343/slide-handouts/intro.4.pdf
➢ Foundations of AI: Philosophy, Mathematics, Psychology; Computer
Science; Linguistics
…Rational agents
• Agents can perform actions in order to modify future percepts
so as to obtain useful information (information gathering,
exploration)
• An agent is autonomous if its behavior is determined by its own
experience (with ability to learn and adapt)
Vacuum-cleaner world
[R3] Russell, Norvig: Artificial Intelligence: A Modern Approach, 2nd edition, Prentice Hall
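The two-square vacuum world has a standard reflex agent program. A minimal Python sketch of the REFLEX-VACUUM-AGENT pseudocode (the location names and percept encoding below are illustrative assumptions):

```python
# Sketch of REFLEX-VACUUM-AGENT for the two-square vacuum world.
# Percept is assumed to be a (location, status) pair.

def reflex_vacuum_agent(percept):
    """Return an action for the current percept only (no history)."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    elif location == "B":
        return "Left"

print(reflex_vacuum_agent(("A", "Dirty")))  # -> Suck
```

Note that the agent ignores its percept history entirely, which is exactly what makes it a simple reflex agent.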
PEAS
PEAS
• Agent: Part-picking robot
– Performance measure: Percentage of parts in
correct bins
– Environment: Conveyor belt with parts, bins
– Actuators: Jointed arm and hand
– Sensors: Camera, joint angle sensors
PEAS
• Agent: Interactive English tutor
– Performance measure: Maximize student's score on test
– Environment: Set of students
– Actuators: Screen display (exercises, suggestions, corrections)
– Sensors: Keyboard
Rational Agent?
Definition of a rational agent: What is rational at any
given time depends on four things:
• The performance measure that defines the criterion of
success.
• The agent's prior knowledge of the environment.
• The actions that the agent can perform.
• The agent's percept sequence to date.
Review Questions
2.3 This exercise explores the differences between agent functions and agent
programs.
a. Can there be more than one agent program that implements a given agent function? Give
an example, or show why one is not possible.
b. Are there agent functions that cannot be implemented by any agent program?
c. Given a fixed machine architecture, does each agent program implement exactly one agent
function?
d. Given an architecture with n bits of storage, how many different possible agent programs
are there?
2.4 Let us examine the rationality of various vacuum-cleaner agent functions.
a. Show that the simple vacuum-cleaner agent function described in Figure 2.3 is indeed
rational under the assumptions listed on page 36.
b. Describe a rational agent function for the modified performance measure that deducts one
point for each movement. Does the corresponding agent program require internal state?
c. Discuss possible agent designs for the cases in which clean squares can become dirty and
the geography of the environment is unknown. Does it make sense for the agent to learn from its
experience in these cases? If so, what should it learn?
Review Questions
2.12 The vacuum environments in the preceding exercises have all been deterministic.
Discuss possible agent programs for each of the following stochastic versions:
a. Murphy's law: 25% of the time, the action fails to clean the floor if it is dirty and
deposits dirt onto the floor if the floor is clean. How is your agent program affected if
the dirt sensor gives the wrong answer 10% of the time?
b. Small children: At each time step, each clean square has a 10% chance of becoming
dirty. Can you come up with a rational agent design for this case?
Agent types
• Four basic types in order of increasing
generality:
– Simple reflex agents
– Model-based reflex agents
– Goal-based agents
– Utility-based agents
Simple reflex agents
[R3] Russell, Norvig: Artificial Intelligence: A Modern Approach, 2nd edition, Prentice Hall
• Simple reflex agents act only on the basis of the current percept,
ignoring the rest of the percept history. The agent function is based
on the condition-action rule: "if condition, then action".
https://aimacode.github.io/aima-exercises/agents-exercises/
Agent Design Paradigms
• [ https://hellotars.com/blog/understanding-ai-agents-and-environments-a-comprehensive-guide ]
• Learning from environments: Reinforcement learning: Reinforcement
learning is a method where agents learn from interactions with their
environment. Key concepts include:
• Rewards and penalties: Agents receive rewards or penalties based on
their actions to reinforce desirable behaviors (such as the robot
vacuum in the previous example).
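The reward/penalty idea can be sketched with a single tabular Q-learning update. The states, actions, and reward values below are invented for illustration and are not part of the slides:

```python
# Hedged sketch: one tabular Q-learning update, showing how a reward
# reinforces an action. States/actions/rewards here are invented.

def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update rule."""
    best_next = max(Q[next_state].values()) if Q[next_state] else 0.0
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
    return Q

# Toy vacuum-like state space: reward the agent for sucking when dirty.
Q = {"dirty": {"suck": 0.0, "move": 0.0}, "clean": {"move": 0.0}}
Q = q_update(Q, "dirty", "suck", reward=1.0, next_state="clean")
print(Q["dirty"]["suck"])  # -> 0.5 (value rises toward the reward)
```

Repeated updates of this kind are how the agent's estimate of each action's value converges, reinforcing desirable behaviors.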
[R3] Russell, Norvig: Artificial Intelligence: A Modern Approach, 2nd edition, Prentice Hall
• Goal-based agents only distinguish between goal states and non-goal
states.
[R3] Russell, Norvig: Artificial Intelligence: A Modern Approach, 2nd edition, Prentice Hall
Learning Element
• The most important distinction is between the
– "learning element", which is responsible for making
improvements: uses feedback from the "critic" on
how the agent is doing and determines how the
performance element should be modified to do better
in the future.
– "performance element", which is responsible for
selecting external actions: what we have previously
considered to be the entire agent: it takes in percepts
and decides on actions
– "problem generator“: responsible for suggesting
actions that will lead to new and informative
experiences.
Module I : AI Concepts- Intelligent Agents
[Russel] [http://aima.cs.berkeley.edu/algorithms.pdf]
TABLE-DRIVEN-AGENT; REFLEX-VACUUM-AGENT; SIMPLE-REFLEX-AGENT; MODEL-BASED-REFLEX-AGENT
def TableDrivenAgentProgram(table):
    """
    [Figure 2.7]
    This agent selects an action based on the percept sequence.
    It is practical only for tiny domains.
    To customize it, provide as table a dictionary of all
    {percept_sequence: action} pairs.
    """
    percepts = []

    def program(percept):
        percepts.append(percept)
        action = table.get(tuple(percepts))
        return action

    return program
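A runnable usage sketch of the table-driven agent (the function is restated here so the example is self-contained; the percept table is an invented two-square vacuum-world fragment):

```python
# Self-contained usage sketch of TableDrivenAgentProgram; the table
# entries below are illustrative, not from the book.

def TableDrivenAgentProgram(table):
    percepts = []
    def program(percept):
        percepts.append(percept)
        return table.get(tuple(percepts))
    return program

table = {
    (("A", "Clean"),): "Right",
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}
program = TableDrivenAgentProgram(table)
print(program(("A", "Clean")))  # -> Right
print(program(("B", "Dirty")))  # -> Suck (keyed on the 2-percept sequence)
```

The table must enumerate every possible percept *sequence*, which is why this design is practical only for tiny domains.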
def SimpleReflexAgentProgram(rules, interpret_input):
    """
    This agent takes action based solely on the current percept,
    using condition-action rules.
    """
    def program(percept):
        state = interpret_input(percept)
        rule = rule_match(state, rules)
        action = rule.action
        return action

    return program
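A self-contained sketch of the simple reflex agent in use; the `interpret_input` and `rule_match` helpers and the rule set below are illustrative stand-ins, not the aima-python implementations:

```python
from collections import namedtuple

# Illustrative SIMPLE-REFLEX-AGENT sketch: the rule representation,
# interpret_input, and rule_match below are assumptions for the demo.

Rule = namedtuple("Rule", ["state", "action"])

def interpret_input(percept):
    # In this toy example the percept is already the state description.
    return percept

def rule_match(state, rules):
    for rule in rules:
        if rule.state == state:
            return rule

def SimpleReflexAgentProgram(rules, interpret_input):
    def program(percept):
        state = interpret_input(percept)
        rule = rule_match(state, rules)
        return rule.action
    return program

rules = [Rule(("A", "Dirty"), "Suck"), Rule(("A", "Clean"), "Right")]
program = SimpleReflexAgentProgram(rules, interpret_input)
print(program(("A", "Dirty")))  # -> Suck
```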
[R3] Russell, Norvig: Artificial Intelligence: A Modern Approach, 2nd edition, Prentice Hall
Python code at - https://github.com/aimacode/aima-python/blob/master/agents.py
Module I : AI Concepts- Intelligent Agents
[Russel] [http://aima.cs.berkeley.edu/algorithms.pdf]
SEARCH BASED PROBLEM SOLVING AGENTS
[R3] Russell, Norvig: Artificial Intelligence: A Modern Approach, 2nd edition, Prentice Hall
Implement Agent
https://github.com/aimacode/aima-python/blob/master/agents.py
An agent program is a callable instance, taking
percepts and choosing actions
• 1. Self-Driving Car
• Static vs. Dynamic: Dynamic (The environment changes over time, such as
traffic lights, other vehicles, and pedestrians.)
• Discrete vs. Continuous: Continuous (The car's position, speed, and steering
require continuous values.)
• Deterministic vs. Stochastic: Stochastic (The outcomes depend on other
drivers, weather, and unpredictable events.)
• Episodic vs. Sequential: Sequential (Decisions impact future states, such as
navigating an intersection affecting subsequent paths.)
• 2. Chess-Playing AI
• Static vs. Dynamic: Static (The board does not change unless the agent or
opponent moves.)
• Discrete vs. Continuous: Discrete (The board has finite positions and moves.)
• Deterministic vs. Stochastic: Deterministic (Outcomes of moves are
predictable and depend only on the current state and the agent’s actions.)
• Episodic vs. Sequential: Sequential (Every move impacts the game’s future
state.)
• 3. Online Shopping Recommender System
• Static vs. Dynamic: Dynamic (Customer preferences and inventory may change in real-
time.)
• Discrete vs. Continuous: Discrete (Actions involve selecting from a finite set of
recommendations.)
• Deterministic vs. Stochastic: Stochastic (User behavior can be unpredictable;
recommendations might not always lead to purchases.)
• Episodic vs. Sequential: Episodic (Each recommendation episode is independent of
previous ones.)
• 4. Robot Vacuum Cleaner
• Static vs. Dynamic: Dynamic (The environment may change, such as humans moving
furniture or dropping objects.)
• Discrete vs. Continuous: Discrete (If the cleaner divides the room into a grid) or
Continuous (If it operates freely in space.)
• Deterministic vs. Stochastic: Stochastic (Outcomes like moving to a spot might be
affected by sensor noise or unexpected obstacles.)
• Episodic vs. Sequential: Sequential (Cleaning a particular area impacts the next steps,
such as avoiding cleaned areas.)
• 5. Medical Diagnosis AI
• Static vs. Dynamic: Static (Data remains fixed during analysis.)
• Discrete vs. Continuous: Discrete (Symptoms and test results may have categorical
outcomes) or Continuous (Some results, like blood pressure, are measured on a scale.)
• Deterministic vs. Stochastic: Stochastic (Patient conditions and responses to treatment
can be uncertain.)
• Episodic vs. Sequential: Episodic (Each diagnosis is a separate event.)
• 6. Weather Forecasting System
• Static vs. Dynamic: Dynamic (Weather conditions change continuously.)
• Discrete vs. Continuous: Continuous (Weather parameters like temperature, humidity,
and wind speed are continuous values.)
• Deterministic vs. Stochastic: Stochastic (Weather depends on complex and
unpredictable factors.)
• Episodic vs. Sequential: Sequential (Forecasting today’s weather impacts predictions for
the coming days.)
Practical Resources- Agents
• Solution – agent design: vacuum cleaner
– https://github.com/abhin4v/russell-norvig-ai-
problems/blob/master/chapter2/AI/Vacuum/Cleaner.hs
– https://github.com/abhin4v/russell-norvig-ai-
problems/tree/master/chapter2
• Execute on Haskell online editor
– https://www.tutorialspoint.com/compile_haskell_online.php
• Python Solution – Agent design – “Dog in a Park”
– https://github.com/aimacode/aima-
python/blob/master/agents.ipynb
– GitHub - aimacode/aima-python: Python implementation of
algorithms from Russell And Norvig's "Artificial Intelligence - A
Modern Approach"
PROBLEM SOLVING
Problem Description- Four components
=> State-space description
• Initial state that the agent starts in.
• Successor function - A description of the possible actions available to the
agent. Given a particular state x, returns a set of (action, successor)
ordered pairs, where each action is one of the legal actions in state x and
each successor is a state that can be reached from x by applying the
action.
– ➔ State space – The set of all states reachable from the initial state. The state
space forms a graph in which the nodes are states and the arcs between
nodes are actions. A path in the state space is a sequence of states connected
by a sequence of actions.
• The goal test, which determines whether a given state is a goal state.
• A path cost function that assigns a numeric cost to each path. The
problem-solving agent chooses a cost function that reflects its own
performance measure.
– The step cost of taking action a to go from state x to state y is denoted by C(x,
a , y).
– an optimal solution has the lowest path cost among all solutions.
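The four components above can be sketched as a plain structure. The field names and the toy counting problem below are illustrative assumptions, not a library API:

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# Hedged sketch: the four-component problem description as a plain
# Python dataclass; names mirror the text, not any library.

@dataclass
class Problem:
    initial_state: Any
    successors: Callable[[Any], List[Tuple[Any, Any]]]  # x -> [(action, y)]
    goal_test: Callable[[Any], bool]
    step_cost: Callable[[Any, Any, Any], float]         # C(x, a, y)

# Toy instance (invented): reach 5 from 0 via +1/+2 actions, unit cost.
toy = Problem(
    initial_state=0,
    successors=lambda x: [("+1", x + 1), ("+2", x + 2)],
    goal_test=lambda x: x == 5,
    step_cost=lambda x, a, y: 1.0,
)
print(toy.goal_test(5))  # -> True
```

Any search algorithm then needs only this interface: it never inspects the problem beyond the initial state, the successor function, the goal test, and the step costs.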
Problem Solving
•Rational agents need to perform sequences of actions in order
to achieve goals.
•Intelligent behavior can be generated by having a look-up table
or reactive policy that tells the agent what to do in every
circumstance, but:
- Such a table or policy is difficult to build
- All contingencies must be anticipated
•A more general approach is for the agent to have knowledge of
the world and how its actions affect it and be able to
simulate execution of actions in an internal model of the
world in order to determine a sequence of actions that will
accomplish its goals.
•This is the general task of problem solving and is typically
performed by searching through an internally modeled space
of world states.
Problem Solving Task
(after internal model of the problem has been specified)
•Given:
-An initial state of the world
-A set of possible actions or operators (with constraints) that can be
performed.
-A goal test that can be applied to a single state of the world to determine if it
is a goal state.
•Find:
-A solution stated as a path of states and operators that shows how to
transform the initial state into one that satisfies the goal test.
•The initial state and set of operators implicitly define a space of states of the
world and operator transitions between them. May be infinite.
Measuring Performance
•Path cost: a function that assigns a cost to a path,
typically by summing the cost of the individual
operators in the path. May want to find minimum
cost solution.
Problem types
• Deterministic, fully observable → single-state problem
– Agent knows exactly which state it will be in; solution is a
sequence
• Non-observable → sensorless problem (conformant problem)
– Agent may have no idea where it is; solution is a sequence
• Nondeterministic and/or partially observable → contingency
problem
– percepts provide new information about current state
– often interleave {search, execution}
• Unknown state space → exploration problem
Module I : AI Concepts- AI problems and Solution approaches
[Russel/ Mooney] [https://www.cs.utexas.edu/~mooney/cs343/slide-handouts/search.4.pdf]
➢Route Finding,
➢8-Puzzle
➢8-Queen
➢Missionaries and cannibals
Module I : AI Concepts- Solution Approaches
[Russel/ Mooney] [https://www.cs.utexas.edu/~mooney/cs343/slide-handouts/search.4.pdf]
➢ Search Problem
➢ TSP: optimization problem, redefined as a search problem = search for the min-cost
tour over the set of permutations:
➢ Given: Internal model: Graph G = <V,E>: a set V of nodes, with connectivity given as a set E of
edges: V = {A,B,C,D}, n = |V| = 4
➢ Given: Initial state: the first node is A (the given source node): |{A}| = 1
➢ Given: Goal state: a sequence of nodes ending at A with all nodes included: |{A,_,_,_,A}|
= n+1 = 5
➢ Find: output O = sequence of vertices, initially O = {A} and finally O = {A,_,_,_,A}
(begin with A, add the next connected node until a path returning to A is obtained)
➢ Find: optimization criterion: min-cost path
➢ Find: constraint: return to the source node A only when |O| = 4;
e.g., if the instantaneous O = {A,C,B}, so |O| = 3, we do not yet extend to O = {A,C,B,A}; but when
O = {A,C,B,D}, we extend to O = {A,C,B,D,A} and return it as one possible solution.
➢ TSP is thus a search across S = {permutations of the nodes in V with A as the first and last
node} = {perm({A,B,C,D})}, accepting the min-cost permutation; this constraint is
implicit if we evaluate all possible permutations of ABCD having A as the first
node.
➢ Solution (permutation-vector) search space size = n!
➢ Approaches
➢ Uninformed search
➢ Informed search
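The TSP-as-search formulation above can be sketched as a brute-force enumeration of the n! permutations; the edge costs for the 4-node example are invented for illustration:

```python
from itertools import permutations

# Brute-force TSP-as-search over the slide's 4-node example.
# The edge costs below are invented for illustration.

cost = {("A", "B"): 1, ("B", "C"): 2, ("C", "D"): 3, ("D", "A"): 4,
        ("A", "C"): 5, ("B", "D"): 6}

def c(u, v):
    return cost.get((u, v)) or cost.get((v, u))

def tsp(nodes, source="A"):
    rest = [n for n in nodes if n != source]
    best_tour, best_cost = None, float("inf")
    for perm in permutations(rest):          # search space: (n-1)! tours
        tour = (source, *perm, source)       # |O| = n + 1, starts/ends at A
        total = sum(c(tour[i], tour[i + 1]) for i in range(len(tour) - 1))
        if total < best_cost:
            best_tour, best_cost = tour, total
    return best_tour, best_cost

print(tsp(["A", "B", "C", "D"]))  # -> (('A', 'B', 'C', 'D', 'A'), 10)
```

This is uninformed search over the full permutation space; informed methods prune it using cost estimates.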
Module I : AI Concepts- Solution Approaches
[Russel/ Mooney] [https://www.cs.utexas.edu/~mooney/cs343/slide-handouts/search.4.pdf]
➢Search Problem
➢Knapsack: optimization problem, redefined as a search
problem = search for the max-profit subset of objects from a set of
objects with given weights, profits, and a capacity:
➢ Given: Internal model: a set O of n objects = <W,P>, with the weight and
profit of each
➢ Given: Initial state: the first object selected: |{A}| = 1
➢ Given: Goal state: the subset of objects with maximum profit
➢ Find: output O = subset of objects with maximum profit within
capacity C
➢ Find: optimization criterion: max profit
➢ Find: constraint: s.t. sum of weights <= C
Knapsack is thus a search across S = {subsets of the n objects}.
➢Solution (subset-vector) search space size = 2^n
➢Approaches
➢Uninformed search
➢Informed search
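The knapsack-as-search formulation can likewise be sketched by enumerating all 2^n subsets; the item weights, profits, and capacity below are invented for illustration:

```python
from itertools import combinations

# 0/1 knapsack as subset search over all 2^n subsets.
# Item data and capacity are invented for illustration.

items = [("a", 2, 3), ("b", 3, 4), ("c", 4, 5)]  # (name, weight, profit)
capacity = 5

def knapsack(items, capacity):
    best_set, best_profit = (), 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):       # all 2^n subsets
            weight = sum(w for _, w, _ in subset)
            profit = sum(p for _, _, p in subset)
            if weight <= capacity and profit > best_profit:
                best_set, best_profit = subset, profit
    return [n for n, _, _ in best_set], best_profit

print(knapsack(items, capacity))  # -> (['a', 'b'], 7)
```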
Example: MinCost Path from Single Source=> Multistage
Graph (internal model) => Informed Search based Solution
Example: MultiStage Graph Search
Example: The 8-puzzle Problem
More Realistic Problems
• Route finding
• Travelling salesman problem
• VLSI layout
• Robot navigation
• Web searching
Searching Concepts
•A state can be expanded by generating all states that can be
reached by applying a legal operator to the state.
•State space can also be defined by a successor function that
returns all states produced by applying a single legal operator.
•A search tree is generated by generating search nodes by
successively expanding states starting from the initial state as
the root.
•A search node in the tree can contain
-Corresponding state
-Parent node
-Operator applied to reach this node
-Length of path from root to node (depth)
-Path cost of path from initial state to node
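The search-node bookkeeping listed above (state, parent, action, depth, path cost) can be sketched with a small Node class and breadth-first expansion; the toy graph is invented for illustration:

```python
from collections import deque

# Sketch of search-node bookkeeping with BFS expansion on a toy graph.

class Node:
    def __init__(self, state, parent=None, action=None, step_cost=1):
        self.state, self.parent, self.action = state, parent, action
        self.depth = 0 if parent is None else parent.depth + 1
        self.path_cost = 0 if parent is None else parent.path_cost + step_cost

def bfs(initial, goal, successors):
    frontier, explored = deque([Node(initial)]), set()
    while frontier:
        node = frontier.popleft()
        if node.state == goal:
            return node
        explored.add(node.state)
        for action, state in successors(node.state):   # expand the node
            if state not in explored:
                frontier.append(Node(state, node, action))

graph = {"S": [("a", "A"), ("b", "B")], "A": [("g", "G")], "B": [], "G": []}
goal = bfs("S", "G", lambda s: graph[s])
print(goal.depth, goal.path_cost)  # -> 2 2
```

Following the `parent` pointers back from the goal node recovers the solution path, which is exactly why each node stores its parent and the operator applied.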
Summary
• AI Agent’s process needs
– internal model of the problem : matrix, graph, statistical
formulation, linear equations, ….. : suitable definition of
STATE and STATE-TRANSITIONS
– Can be represented as optimization problem
– Problem to be defined in proper format : {Given, FIND,
Process}
– Problems can be mapped as SEARCH problem- (simple
reflex agents have obvious SEARCH task to refer to
LookUpTable to decide action)
– Suitable Search Method to be applied
• Informed Search- If PATH-COST/SEARCH-COST can be determined
• Uninformed Search- BFS, DFS
Tic-Tac-Toe Problem
• https://github.com/aimacode/aima-python/blob/master/games4e.ipynb
Initial State: Board position, a 3x3 matrix of O's and X's.
Operators: Putting O's or X's in vacant positions alternately.
Terminal test: Determines whether the game is over.
Utility function:
e(p) = (number of complete rows, columns, or diagonals
still open for the player) – (number of complete rows, columns, or
diagonals still open for the opponent)
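The e(p) evaluation above can be sketched directly; the board encoding (strings 'X', 'O', '' in a 3x3 list) is an illustrative assumption:

```python
# Sketch of the e(p) evaluation: count lines (rows, columns, diagonals)
# still open for each player. Board cells: 'X', 'O', or '' (empty).

LINES = [[(r, c) for c in range(3)] for r in range(3)] + \
        [[(r, c) for r in range(3)] for c in range(3)] + \
        [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]

def open_lines(board, player):
    """A line is open for player if the opponent has no mark on it."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES
               if all(board[r][c] != opponent for r, c in line))

def e(board, player="X"):
    opponent = "O" if player == "X" else "X"
    return open_lines(board, player) - open_lines(board, opponent)

board = [["X", "", ""], ["", "", ""], ["", "", ""]]
print(e(board))  # -> 3 (X in a corner: 8 lines open for X, 5 for O)
```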
Minimax Algorithm
• Generate the game tree
• Apply the utility function to each terminal state to get its
value
• Use these values to determine the utility of the nodes one
level higher up in the search tree
– From bottom to top
– For a max level, select the maximum value of its successors
– For a min level, select the minimum value of its successors
• From root node select the move which leads to highest value
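The steps above can be sketched as a short recursive function; the game tree is encoded as nested lists (an illustrative assumption), with the leaf utilities taken from the standard AIMA two-ply example:

```python
# Minimax over a nested-list game tree: leaves are utility values,
# inner lists are alternating MAX/MIN nodes.

def minimax(node, maximizing=True):
    if isinstance(node, (int, float)):        # terminal state: utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Two-ply tree: MAX at the root chooses among three MIN nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))  # -> 3 (MIN values are 3, 2, 2; MAX picks 3)
```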
Game tree for Tic-Tac-Toe
Figure 6.1 A (partial) search tree for the game of tic-tac-toe. The top node is the initial
state, and MAX moves first, placing an X in an empty square. We show part of the search
tree, giving alternating moves by MIN (O) and MAX, until we eventually reach terminal
states, which can be assigned utilities according to the rules of the game.
Game tree for Tic-Tac-Toe
Figure 6.2 A two-ply game tree. The triangle-up nodes are "MAX nodes," in which it is MAX's turn to
move, and the triangle-down nodes are "MIN nodes." The terminal nodes show the utility
values for MAX; the other nodes are labeled with their minimax values. MAX's best move at
the root is the action leading to the successor with the highest minimax value, and MIN's
best reply is the action leading to the successor with the lowest minimax value.
Courtesy : Principles of Artificial Intelligence , Nilsson
Properties of Minimax
(Board figures omitted.) Example e(p) values for tic-tac-toe positions:
• After X's first move (X's turn shown): center gives e = 8 – 4 = 4; edge gives e = 8 – 6 = 2; corner gives e = 8 – 5 = 3.
• After O's reply (O's turn shown): e = 5 – 4 = 1 and e = 5 – 3 = 2 for the two positions shown.
Courtesy : CS621-Artificial Intelligence, 2007, Prof. Pushpak Bhattacharyya
Alpha-Beta Search Algorithm
• At a MAXIMIZER node, the α value (a lower bound on the node's final backed-up
value) is updated as α = max(current α, value propagated from the child just
evaluated); at a MINIMIZER node, the β value (an upper bound) is updated as
β = min(current β, value propagated from the child just evaluated).
• Search below a node can be cut off (pruned) as soon as α ≥ β there, since the
pruned branches cannot affect the final minimax decision.
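These update rules can be sketched as follows; this is a generic alpha-beta implementation over a nested-list game tree (an illustrative encoding, not from the slides), reusing the AIMA two-ply example values:

```python
# Alpha-beta pruning: MAX nodes raise alpha, MIN nodes lower beta,
# and search is cut off whenever alpha >= beta.

def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    if isinstance(node, (int, float)):        # terminal state: utility
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)         # alpha = max(alpha, child value)
            if alpha >= beta:
                break                         # cutoff: MIN ancestor won't allow this
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)           # beta = min(beta, child value)
            if alpha >= beta:
                break                         # cutoff: MAX ancestor won't allow this
        return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))  # -> 3 (same value as plain minimax)
```

On this tree the second MIN node is abandoned after its first child (value 2 ≤ α = 3), illustrating that pruning never changes the minimax value, only the work done.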