
Artificial Intelligence

Dr. Sadegh
Soran University
Today we'll discuss:
• Intelligence
• Artificial Intelligence
• A brief history of AI
• Cool current projects in AI
What’s involved in Intelligence?
• Ability to interact with the world (speech, vision, motion,
manipulation)
• Ability to model the world and to reason about it
• Ability to learn and to adapt
Goals in AI
• To build systems that exhibit intelligent behavior
• To understand intelligence in order to model it
Modeling people?
• Sometimes
• But sometimes we want AI systems to be better and smarter than we
are
Computer Chess
• 2/96: Kasparov vs. Deep Blue
• Kasparov victorious: 3 wins, 2 draws, 1 loss
• 5/97: Kasparov vs. the upgraded "Deeper Blue"
• First match won by a computer against a reigning world champion
• 512 processors: 200 million chess positions per second

How do you think it works?


Machine Learning
• Looking for patterns in vast amounts of data that is just too big for
humans to analyze
Speech Systems
• Movie ticket reservations by phone: 1-800-Fandango
• You talk, it types: IBM's ViaVoice
Problem Formulation
Topics of this lecture
• Review of tree structure
• Review of graph structure
• Graph implementation
• State space representation
• Search graph and search tree
The Tree Structure
• Linked lists, stacks, and queues are 1-D data structures.
• A tree is a 2-D data structure (see the figure).
• Examples:
– Family tree
– Tournament tree for a football game
– Organization tree of a company
– Directory tree of a file management system
Queue vs Stack: Reminder

Stack:
• Data insertion and removal occur only at one end.
• Follows the LIFO (last in, first out) mechanism.
• The adding operation is called PUSH; the removing operation is called POP.
• Requires only one pointer, the "top" pointer.
• Example: used in the depth-first search (DFS) algorithm.

Queue:
• Data insertion and removal occur at two different ends.
• Follows the FIFO (first in, first out) mechanism.
• The adding operation is called Enqueue; the removing operation is called Dequeue.
• Uses two pointers, the "front" and "rear" pointers.
• Example: system interrupts are processed in arrival order (the element added first is handled first).

By contrast, an array is a data structure where you can add or remove an element at any position.
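A minimal Python sketch of the two mechanisms (Python is an assumption throughout these examples; the slides name no language):

```python
from collections import deque

# Stack: LIFO, one access point ("top").
stack = []
stack.append('A')   # PUSH
stack.append('B')   # PUSH
print(stack.pop())  # POP -> 'B' (last in, first out)

# Queue: FIFO, two access points ("rear" for insertion, "front" for removal).
queue = deque()
queue.append('A')        # Enqueue at the rear
queue.append('B')        # Enqueue at the rear
print(queue.popleft())   # Dequeue from the front -> 'A' (first in, first out)
```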
Useful terminologies
• A tree consists of a set of nodes and a set of edges.
• An edge is the connection between two nodes.
• There are different kinds of nodes:
• Root: the first (top) node
• Parent: a node above some other node(s) (connected)
• Child: A node below another node
• Internal (non-terminal) node: any parent node
• Terminal (external or leaf) node: nodes that do not have any child
Useful terminologies
• Siblings: nodes that share the same parent
• Path: a sequence of nodes connected by edges
• Level of a node: number of edges contained in the path from the root
to this node
• Height of the tree: maximum distance (number of edges) from the
root to the terminal nodes
Multi-way tree and binary tree
• Multi-way tree or multi-branch tree: An internal node may have m
(m>2) child nodes.
• Binary tree: An internal node has at most 2 child nodes. Binary trees are useful both for information retrieval (binary search trees) and for pattern classification (e.g. decision trees).
• Complete binary tree : A binary tree in which every level, except
possibly the last, is completely filled, and nodes in the last level are as
far left as possible.
Tree Traversal
• Tree traversal is the process of visiting all nodes of a tree, without repetition. Examples: printing the contents of all nodes; searching for all nodes that have a specified property (e.g. a key).
• Orders of traversal: pre-order, in-order, post-order, and level-order. The first three correspond to depth-first search, and the last one to breadth-first search.
Pre-order, in-order, and post-order traversal
• Pre-order: start from the root; visit (recursively) the current node, the left subtree, and the right subtree.
• In-order: start from the root; visit (recursively) the left subtree, the current node, and the right subtree.
• Post-order: start from the root; visit (recursively) the left subtree, the right subtree, and the current node.

Level-order traversal
• A tree can be traversed in level-order using a queue. Put the root in the queue first, and then repeat the following: get a node (if any) from the queue, visit it, and put its children (if any) into the queue.
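A minimal sketch of the four traversal orders for a binary tree (the Node class and field names are illustrative, not from the slides):

```python
from collections import deque

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def pre_order(node):                 # current, left, right
    if node:
        print(node.value, end=' ')
        pre_order(node.left)
        pre_order(node.right)

def in_order(node):                  # left, current, right
    if node:
        in_order(node.left)
        print(node.value, end=' ')
        in_order(node.right)

def post_order(node):                # left, right, current
    if node:
        post_order(node.left)
        post_order(node.right)
        print(node.value, end=' ')

def level_order(root):               # breadth-first, using a queue
    queue = deque([root] if root else [])
    while queue:
        node = queue.popleft()       # get a node from the queue and visit it
        print(node.value, end=' ')
        if node.left:                # put its children (if any) into the queue
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
```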
Graph structure
• A graph is a more general data structure.
• Formally, a graph is defined as a 2-tuple G = (V, E), where V is a set of vertices or nodes, and E is a set of edges, arcs, or connections.
• A tree is a special graph without cycles:
– Each node has one path from the root.
– That path is unique.
– All nodes are connected to the root.
Examples
• Computer networks
• Flight maps of airlines
• Highway networks
• City water / sewage networks
• Electrical circuits
Graphs are a convenient way to represent all these physical networks in digital (virtual) form.
Useful terminologies
• Path: A sequence of nodes connected by edges.
• Simple path: A path in which all nodes are different.
• Cycle: A simple path with the same start and end nodes.
• Connected graph: There is a path between any two nodes.
• Connected component: A sub-graph which is itself a connected graph.
• Directed graph: The edges have directions (e.g. a one-way route).
• Undirected graph: The edges do not have directions (or the direction does not matter).
Useful terminologies
• Weighted graph: Each edge has a weight (e.g. direct
cost to move from one city to another).
• Node expansion: The process to get all child nodes of a
node (useful for graph traversal or graph-based search).
• Spanning tree: A tree that contains all nodes of a
graph.
• Directed acyclic graph (DAG): A special case of a directed graph in which there are no cycles or loops. DAGs are often used to construct a larger classification system from many two-class classifiers.
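A minimal sketch of a graph stored as an adjacency list, with node expansion (the graph and names below are illustrative):

```python
# A small undirected graph as an adjacency list (dict of node -> neighbours).
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A', 'D'],
    'D': ['B', 'C'],
}

def expand(node):
    """Node expansion: return all child (neighbour) nodes of a node."""
    return graph.get(node, [])

print(expand('A'))  # ['B', 'C']
```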
State space representation of AI problems
(Search Method)
The maze problem
• Initial state: (0,0)
• Target state: (2,2)
• Available operations:
– Move forward
– Move backward
– Move left
– Move right
• Depending on the current state, the same operation may have different results.
• Also, an operation may not be executable in some states.
State space representation of AI problems
The Tower of Hanoi
• Initial state: (123;000;000), i.e., all three disks on the first peg
• Target state: (000;123;000), i.e., all three disks on the second peg
• Available operations:
– Move a disk from one peg to another.
• Restriction:
– A larger disk cannot be put on a smaller one.
Why state space representation?
• Any problem can be represented in the same form, formally.
• Any problem can be solved by finding the target state via state
transition, using the available operations → Search problem!
• The results (i.e. the state transition process or the method for finding
this process) can be reused as knowledge.
• Problem: The computation cost can be large!
The maze problem → search graph
• To find the solution, we can just
traverse the graph, starting from (0,0),
and stop when we visit (2,2).
• The result is a path from the initial
node to the target node.
• The result can differ, depending on the order of graph traversal.
Tower of Hanoi problem → search tree
• The Tower of Hanoi can also be solved in a similar way, but it is harder to do manually because the number of nodes is much larger.
• Instead of using a search graph, we can use a search tree.
• That is, start from the initial node, expand the current node recursively, and stop when we find the target node.
Tower of Hanoi (figure)
Simple Search Algorithms
Topics of this lecture
• Random search
• Search with closed list
• Search with open list
• Depth-first and breadth-first search again
• Uniform-cost search
Random search
• Step 1: Current node x=initial node;
• Step 2: If x=target node, stop with success;
• Step 3: Expand x, and get a set S of child nodes;
• Step 4: Select a node x’ from S at random;
• Step 5: x=x’, and return to Step 2.
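A minimal Python sketch of this procedure. The expand function and the max_steps cap are illustrative additions: without a cap, the loop need not terminate, which is exactly the weakness discussed next.

```python
import random

def random_search(initial, target, expand, max_steps=10_000):
    """Random search sketch: follow random child nodes until the target
    is found or the step budget runs out."""
    x = initial                      # Step 1: current node = initial node
    path = [x]
    for _ in range(max_steps):
        if x == target:              # Step 2: success
            return path
        children = expand(x)         # Step 3: expand x into a set S
        if not children:
            return None
        x = random.choice(children)  # Steps 4-5: pick a child at random
        path.append(x)
    return None                      # budget exhausted: no guarantee of success
```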
Random search is not good
• At each step, the next node
is determined at random.
• We cannot guarantee to
reach the target node.
• Even if we can, the path so
obtained can be very
redundant (extremely long).
Search with a closed list
• Do not visit the same node more than once!
• Step 1: Current node x=initial node;
• Step 2: If x=target node, stop with success;
• Step 3: Expand x, and get a set S of child nodes. If
S is empty, stop with failure. Add x to the closed
list.
• Step 4: Select from S a new node x’ that is not in
the closed list.
• Step 5: x=x’, and return to Step 2.
Closed list is not enough!
• Using a closed list, we can
guarantee termination of
search in finite steps.
• However, we may never
reach the target node!
Search with open list
Keep all “un-visited” nodes in another list!
• Step 1: Add the initial node to the open list.
• Step 2: Take a node x from the top of the open list. If the open list is empty, stop with failure; if x is the target node, stop with success.
• Step 3: Expand x to obtain a set S of child nodes, and put x into the
closed list.
• Step 4: For each node x’ in S, if it is not in the closed list, add it to the
open list along with the edge (x, x’).
• Step 5: Return to Step 2.
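A minimal Python sketch of this algorithm (search with open and closed lists). The depth_first flag is an addition of this sketch: it switches the open list between stack and queue behaviour, which is the DFS/BFS distinction discussed below.

```python
from collections import deque

def graph_search(initial, target, expand, depth_first=False):
    """Search with open and closed lists. With depth_first=True the open
    list behaves as a stack (DFS); otherwise as a queue (BFS)."""
    open_list = deque([(initial, None)])     # (node, parent) pairs
    closed, parent = set(), {}
    while open_list:                         # empty open list -> failure
        x, p = open_list.pop() if depth_first else open_list.popleft()
        if x in closed:                      # skip nodes already expanded
            continue
        parent[x] = p
        if x == target:                      # target found -> success
            path = [x]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]                # path from initial node to target
        closed.add(x)                        # Step 3: put x into the closed list
        for child in expand(x):              # Step 4: add unvisited children
            if child not in closed:
                open_list.append((child, x))
    return None
```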
Open and Closed Lists: an example trace (visiting all nodes, starting from B)
Step 1: Open: A | Closed: B
Step 2: Open: S | Closed: B A
Step 3: Open: C G | Closed: B A S
Step 4: Open: G D E F | Closed: B A S C
Step 5: Open: G D F H | Closed: B A S C E
Step 6: Open: G D F | Closed: B A S C E H
Step 7: Open: D F | Closed: B A S C E H G
Step 8: Open: D | Closed: B A S C E H G F
Step 9: Open: (empty) | Closed: B A S C E H G F D
Property of the open-list algorithm
• It is known that search based on the open list is complete in the sense
that we can always find the solution in a finite number of steps if the
search graph is finite.
Depth-first search (DFS) and breadth-first search (BFS)
• Algorithm I is a depth-first search if we implement the open list using a stack.
• Algorithm I becomes a breadth-first search if the open list is implemented using a queue.
• It is known that breadth-first search is better because, even for an infinite search graph, we can get the solution in a finite number of steps, if a solution exists.
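Reusing the graph_search sketch above on a small hypothetical graph (the edges below are invented for illustration, not taken from the slides), the only change between DFS and BFS is how the open list is popped:

```python
graph = {'A': ['B', 'S'], 'B': [], 'S': ['C', 'G'], 'C': ['D', 'E', 'F'],
         'D': [], 'E': ['H'], 'F': [], 'G': ['F', 'H'], 'H': ['G']}
expand = lambda n: graph.get(n, [])

print(graph_search('A', 'H', expand, depth_first=True))   # stack-like open list (DFS)
print(graph_search('A', 'H', expand, depth_first=False))  # queue-like open list (BFS)
```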
Depth-first search example (stack-trace figure)
Output 1: A B S C D E H G F (children expanded in alphabetical order, the preferred choice)
Output 2: A S G H E C F D B (an alternative, equally valid expansion order)

Breadth-first search example (queue-trace figure)
Nodes are enqueued level by level and dequeued in arrival order.
Output: A B S C G D E F H
Uniform-cost search: Dijkstra's algorithm
• Usually, the solution is not unique. It is expected to find the BEST one.
• For example, if we want to travel around the world, we may try to find the fastest route, the most economical route, or the route that lets us visit the most friends.
• The uniform-cost search or Dijkstra’s algorithm is a method for solving
this problem.
Uniform-cost search
• Step 1: Add the initial node x0 and its cost C(x0)=0 to the open list.
• Step 2: Get a node x from the top of the open list. If the open list is
empty, stop with failure. If x is the target node, stop with success.
• Step 3: Expand x to get a set S of child nodes, and move x to the closed
list.
• Step 4: For each x’ in S but not in the closed list, find its accumulated cost
C(x’)=C(x)+d(x,x’); and add x’, C(x’), and (x, x’) to the open list. If x’ is
already in the open list, update its cost and link if the new cost is smaller.
• Step 5: Sort the open list based on the node costs, and return to Step 2.
Uniform-cost search
• During uniform-cost search, we can always find the best path from
the initial node to the current node. That is, when search stops with
success, the solution must be the best one.
• In the algorithm, C(x) is the cost of the node x accumulated from the initial node, and d(x,x') is the cost of the state transition (e.g. the distance between two adjacent cities).
• If we set d(x,x’)=1 for all edges, uniform-cost search is equivalent to
the breadth-first search.
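A minimal sketch of uniform-cost search. A binary heap stands in for the explicit "sort the open list" step, and stale heap entries are skipped rather than updated in place, a standard variation; expand(x) is assumed to yield (child, step_cost) pairs.

```python
import heapq

def uniform_cost_search(initial, targets, expand):
    """Uniform-cost (Dijkstra) search sketch. targets is a set of goal nodes."""
    open_list = [(0, initial, None)]          # (accumulated cost C(x), node, parent)
    closed, parent = set(), {}
    while open_list:                          # empty open list -> failure
        cost, x, p = heapq.heappop(open_list) # cheapest node first
        if x in closed:                       # stale entry: already expanded cheaper
            continue
        closed.add(x)
        parent[x] = p
        if x in targets:                      # nearest goal reached -> success
            path = [x]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return cost, path[::-1]
        for child, d in expand(x):            # C(x') = C(x) + d(x, x')
            if child not in closed:
                heapq.heappush(open_list, (cost + d, child, x))
    return None
```

Setting every step cost d(x,x') to 1 makes this behave like breadth-first search, as noted above.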
Uniform-cost search example: three goals, find the nearest one
(figure: a weighted search graph with start S and goals G1, G2, G3; TC = total cost from S)
• From S we reach A (TC=5), B (TC=9), and D (TC=6). The cheapest is A. Visited: S, A. Open list: B, D, G1.
• The next cheapest is D (TC=6). Visited: S, A, D.
• Expanding gives B (TC=8), C (TC=8), and E (TC=8). We close the earlier B and C entries because those nodes are now reached with cost 8, lower than the previous cost 9. The cheapest nodes are B, C, and E; we continue with B alphabetically (the tie-breaking order is optional).
• Then the cheapest are C and E; we continue with C, and then E. Visited: S, A, D, B, C, E.
• We do not continue through F because D and G3 have already been visited.
• Among the three goals, G1 has TC=14, G2 has TC=13, and G3 has TC=15, so G2 wins.
• The lowest-cost path to a goal is S, D, C, G2.
Example: search for the path with the minimum cost
(figure: a weighted graph exercise; find the minimum-cost path)
Heuristic Search
Algorithms
Topics of this lecture
• What are heuristics?
• What is heuristic search?
• Best first search
• A* algorithm
• Generalization of search problems
What are heuristics?
• Heuristics are know-how obtained through a lot of experience.
• Heuristics often enable us to make decisions quickly without thinking deeply about the reasons.
• In many cases, heuristics are "tacit knowledge" that cannot be explained verbally.
• The more experience we have, the better the heuristics will be.
Why heuristic search?
• Based on the heuristics, we can get good solutions without
investigating all possible cases.
• In fact, a deep learner used in AlphaGo can learn heuristics for playing the game of Go, and this learner helps the system make decisions more efficiently.
• Without using heuristics, many AI-related problems cannot be solved at all (it may take many years to get a solution).
Some heuristics for finding H(x)
• For any given node x, we need an estimation function to find H(x).
• For the maze problem, for example, H(x) can be estimated by the Manhattan distance between the current node and the target node. This distance is usually smaller than the true value (and this is good) because some edges may not exist in practice.
• For more complex problems, we may need a method (e.g. neural
network) to learn the function from experiences or observed data.
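A concrete sketch of the maze heuristic above, for nodes represented as (row, column) grid coordinates:

```python
def manhattan(node, target):
    """H(x) for a grid/maze problem: |x1-x2| + |y1-y2|.
    On a grid with unit moves this never overestimates the true
    remaining path length, which is what makes it a good estimate."""
    (x1, y1), (x2, y2) = node, target
    return abs(x1 - x2) + abs(y1 - y2)

print(manhattan((0, 0), (2, 2)))  # 4: at least four moves are needed
```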
Heuristics: direct distance from nodes to the goal (figure)
Heuristics: Manhattan distance from nodes to goals (figure)
Heuristics: more than distance!
(figure) Manhattan distance and Manhattan cost: the available Manhattan costs are AHGFEB = 9 and AHKDB = 13, so the Manhattan cost = min(9, 13) = 9. The Manhattan distance itself is unique.
Best first search
• The basic idea of best first search is similar to uniform cost search.
• The only difference is that the “cost” (or in general, the evaluation
function) is estimated based on some heuristics, rather than the real
one calculated during search.
Best First Search
• Step 1: Put the initial node x0 and its heuristic value H(x0) to the open list.
• Step 2: Take a node x from the top of the open list. If the open list is empty,
stop with failure. If x is the target node, stop with success.
• Step 3: Expand x and get a set S of child nodes. Add x to the closed list.
• Step 4: For each x’ in S but not in the closed list, estimate its heuristic value
H. If x’ is not in the open list, put x’ along with the edge (x,x’) and H into the
open list; otherwise, if H is smaller than the old value H(x’), update x’ with
the new edge and the new heuristic value.
• Step 5: Sort the open list according to the heuristic values of the nodes, and
return to Step 2.
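A minimal sketch of best-first search under the same conventions as the earlier examples; it differs from uniform-cost search only in ordering the open list by the heuristic H(x) instead of the accumulated cost:

```python
import heapq

def best_first_search(initial, target, expand, H):
    """Greedy best-first search sketch: the open list is ordered by the
    heuristic value H(x) alone; the real accumulated cost is ignored."""
    open_list = [(H(initial), initial, None)]
    closed, parent = set(), {}
    while open_list:
        _, x, p = heapq.heappop(open_list)    # node with the smallest H first
        if x in closed:
            continue
        closed.add(x)
        parent[x] = p
        if x == target:
            path = [x]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
        for child in expand(x):               # estimate H for each child
            if child not in closed:
                heapq.heappush(open_list, (H(child), child, x))
    return None
```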
Example
The A* Algorithm
• We cannot guarantee obtaining the optimal solution using best-first search, because the true cost from the initial node to the current node is ignored.
• The A* algorithm solves this problem (A = admissible).
• In the A* algorithm, the cost of a node is evaluated using both the true cost accumulated so far and the estimated remaining cost: F(x) = C(x) + H(x)
• It has been proved that the A* algorithm obtains the best solution provided that the estimated cost H(x) never exceeds the best possible value H*(x) (i.e., the estimate is conservative).
The A* Algorithm
• Step 1: Put the initial node x0 and its cost F(x0)=H(x0) to the open list.
• Step 2: Get a node x from the top of the open list. If the open list is empty, stop
with failure. If x is the target node, stop with success.
• Step 3: Expand x to get a set S of child nodes. Put x to the closed list.
• Step 4: For each x' in S, find its cost F = F(x) + d(x, x') + [H(x') − H(x)]:
– If x' is in the closed list but the new cost is smaller than the old one, move x' to the open list and update the edge (x, x') and the cost.
– Else, if x' is in the open list but the new cost is smaller than the old one, update the edge (x, x') and the cost.
– Else (x' is in neither the open list nor the closed list), put x' along with the edge (x, x') and the cost F into the open list.
• Step 5: Sort the open list according to the costs of the nodes, and return to Step 2.
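A minimal A* sketch under the same conventions as the earlier search examples (heap-ordered open list; stale entries are skipped, and a cheaper rediscovery reopens a node, mirroring Step 4). expand(x) is assumed to yield (child, step_cost) pairs, and H must be admissible.

```python
import heapq

def a_star(initial, target, expand, H):
    """A* sketch: order the open list by F(x) = C(x) + H(x), where C(x) is
    the real accumulated cost and H a conservative heuristic estimate."""
    open_list = [(H(initial), 0, initial, None)]   # (F, C, node, parent)
    best_cost, parent = {}, {}
    while open_list:
        f, c, x, p = heapq.heappop(open_list)
        if x in best_cost and best_cost[x] <= c:   # already reached more cheaply
            continue
        best_cost[x], parent[x] = c, p
        if x == target:
            path = [x]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return c, path[::-1]
        for child, d in expand(x):                 # F(x') = C(x) + d(x,x') + H(x')
            heapq.heappush(open_list, (c + d + H(child), c + d, child, x))
    return None
```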
Worked example (A to P): suppose we are at H; then F(H) = (5 + 8, the real cost accumulated from A) + (the heuristic distance from H to P).
Example (Erbil to Bokan): if we do not consider heuristics, then between Shaqlawa and Makhmur, Makhmur is chosen, which is wrong.
Road distance = real cost (+10% for the Piranshahr/Mariwan border crossings, or +20% for the Sardasht border crossing); heuristic = direct (straight-line) distance.
https://en.wikipedia.org/wiki/A*_search_algorithm
(figure: S to G1, G2, or G3)
Examples (figures)
Property of the A* Algorithm
• The A* algorithm can obtain the best solution because it considers
both the cost calculated up to now and the estimated future cost.
• A sufficient condition for obtaining the best solution is that the estimated future cost never exceeds the best possible (true) value.
• If this condition is not satisfied, we may not be able to get the best solution. Such algorithms are called A algorithms (without the star).
Generalization of Search
• Generally speaking, search is the problem of finding the best solution x* from a given domain D.
• Here, D is a set containing all "states" or candidate solutions.
• Each state is often represented as an n-dimensional state vector or feature vector.
• To see whether a state x is the best (or desired) one, we need to define an objective function (e.g. a cost function) f(x). The problem can then be formulated as: min f(x), for x in D.
Regression Models

Types of Probabilistic Models
Probabilistic models fall into three groups: regression models, correlation models, and other models.
Regression Models
• Relationship between one dependent variable and one or more explanatory variables
• Use an equation to set up the relationship:
– a numerical dependent (response) variable
– one or more numerical or categorical independent (explanatory) variables
• Used mainly for prediction and estimation

Regression Modeling Steps
• 1. Hypothesize the deterministic component; estimate the unknown parameters.
• 2. Specify the probability distribution of the random error term; estimate the standard deviation of the error.
• 3. Evaluate the fitted model.
• 4. Use the model for prediction and estimation.

Model Specification

Specifying the deterministic component
• 1. Define the dependent variable and the independent variable(s).
• 2. Hypothesize the nature of the relationship:
– expected effects (i.e., the signs of the coefficients)
– functional form (linear or non-linear)
– interactions

Choosing a curve for classification
(figure: candidate curves fitted to data)

Types of Regression Models
Regression models are classified by the number of explanatory variables: simple (1 explanatory variable) versus multiple (2+ explanatory variables). Each type may be linear or non-linear.

Linear Regression Model

Linear Equations
Y = mX + b, where the slope m = (change in Y) / (change in X), and b is the Y-intercept.
(figure: a straight line on X-Y axes)
Linear Regression Model
• 1. The relationship between the variables is a linear function:
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
where $\beta_0$ is the population Y-intercept, $\beta_1$ the population slope, and $\varepsilon_i$ the random error.
• $Y_i$ is the dependent (response) variable (e.g., CD4+ count); $X_i$ is the independent (explanatory) variable (e.g., years since seroconversion).
Population & Sample Regression Models
• Population: the true relationship $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ is unknown.
• Random sample: from the sampled observations we estimate the relationship $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\varepsilon}_i$.
Population Linear Regression Model
(figure) Observed values $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ scatter around the population line $E(Y) = \beta_0 + \beta_1 X_i$; $\varepsilon_i$ is the random error.
Sample Linear Regression Model
(figure) The fitted line is $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$; $\hat{\varepsilon}_i$ is the residual (estimated random error), the vertical distance from an observed value to the fitted line. An unsampled observation also scatters around the line.
Estimating Parameters: Least Squares Method

Scatter plot
• 1. Plot of all $(X_i, Y_i)$ pairs
• 2. Suggests how well the model will fit
(figure: scatter plot of Y against X)
Thinking Challenge
How would you draw a line through the points? How do you determine which line 'fits best'?
(figures: the same scatter plot with candidate lines; slope changed with intercept unchanged, intercept changed with slope unchanged, and both changed)
Least Squares
• 1. 'Best fit' means the differences between the actual Y values and the predicted Y values are a minimum. But positive differences offset negative ones, so square the errors:
$\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2$
• 2. Least squares (LS) minimizes the sum of the squared differences (errors), the SSE.

Least Squares Graphically
(figure) LS minimizes $\sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \hat{\varepsilon}_1^2 + \hat{\varepsilon}_2^2 + \hat{\varepsilon}_3^2 + \hat{\varepsilon}_4^2$: the squared vertical distances from observed points, e.g. $Y_2 = \hat{\beta}_0 + \hat{\beta}_1 X_2 + \hat{\varepsilon}_2$, to the fitted line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$.
Coefficient Equations
• Prediction equation: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
• Sample slope: $\hat{\beta}_1 = \dfrac{SS_{xy}}{SS_{xx}} = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
• Sample Y-intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
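A small numerical sketch of these coefficient equations (the data below are invented for illustration):

```python
import numpy as np

# Hypothetical (x, y) pairs; not from the slides.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

# Sample slope: beta1 = SS_xy / SS_xx = sum((x-xbar)(y-ybar)) / sum((x-xbar)^2)
xbar, ybar = x.mean(), y.mean()
beta1 = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()

# Sample Y-intercept: beta0 = ybar - beta1 * xbar
beta0 = ybar - beta1 * xbar

y_hat = beta0 + beta1 * x           # prediction equation
sse = ((y - y_hat) ** 2).sum()      # sum of squared errors minimized by LS
print(f"beta0={beta0:.3f}, beta1={beta1:.3f}, SSE={sse:.4f}")
```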
