CI
1. Mini-Max Algorithm
The working of the minimax algorithm is easily described using an example: a game tree representing a two-player game.
In this example there are two players, one called the Maximizer and the other called the Minimizer. The Maximizer tries to obtain the maximum possible score, while the Minimizer tries to obtain the minimum possible score.
The algorithm applies DFS, so it traverses the game tree all the way down to the terminal nodes (the leaves).
At the terminal nodes the utility values are given, so we compare those values and back them up the tree, level by level, until the initial state is reached.
Properties of the Mini-Max Algorithm:
Complete- The Min-Max algorithm is complete. It will definitely find a solution (if one exists) in a finite search tree.
Optimal- The Min-Max algorithm is optimal if both opponents play optimally.
Time complexity- As it performs DFS on the game tree, the time complexity of the Min-Max algorithm is O(b^m), where b is the branching factor of the game tree and m is the maximum depth of the tree.
Space complexity- The space complexity of the Min-Max algorithm is similar to DFS, which is O(bm).
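For illustration (not part of the original notes), a minimal Python sketch of the minimax recursion described above, assuming the game tree is given as nested lists whose innermost values are the terminal utilities:

def minimax(node, is_maximizer):
    """Return the minimax value of a game-tree node.
    A node is either a terminal score (a number) or a list of child nodes."""
    if not isinstance(node, list):                 # terminal node: return its utility
        return node
    values = [minimax(child, not is_maximizer) for child in node]
    return max(values) if is_maximizer else min(values)

tree = [[3, 5], [2, 9]]                            # depth-2 example tree, Maximizer moves first
print(minimax(tree, True))                         # backed-up value at the root: 3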
2. Alpha-Beta Pruning
Note: To better understand this topic, kindly study the minimax algorithm.
The main condition required for alpha-beta pruning is: α >= β. Here alpha is the best (highest) value found so far along the path for the Maximizer, and beta is the best (lowest) value found so far for the Minimizer; whenever α >= β holds at a node, the remaining branches of that node are pruned because they cannot influence the final decision.
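A hedged sketch of the same recursion with alpha-beta pruning added, using the nested-list tree representation assumed above; the alpha >= beta test is exactly where branches are cut off:

import math

def alphabeta(node, is_maximizer, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta pruning on a nested-list game tree."""
    if not isinstance(node, list):                 # terminal node
        return node
    if is_maximizer:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:                      # Minimizer will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:                          # Maximizer will never allow this branch
            break
    return value

print(alphabeta([[3, 5], [2, 9]], True))           # 3; the leaf 9 is pruned after 2 is seen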
3. Expert System
An expert system is a computer program designed to solve complex problems and to provide decision-making ability like a human expert. It does this by extracting knowledge from its knowledge base, using reasoning and inference rules according to the user's queries.
The performance of an expert system is based on the expert knowledge stored in its knowledge base. The more knowledge stored in the KB, the more the system improves its performance.
Characteristics of Expert System:
High Performance: The expert system provides high performance for solving any type of complex problem of a specific domain with high efficiency and accuracy.
Understandable: It responds in a way that can be easily understood by the user. It can take input in human language and provides the output in the same way.
Reliable: It is highly reliable for generating efficient and accurate output.
Highly responsive: An ES provides the result for any complex query within a very short period of time.
An expert system mainly consists of three components:
User Interface
Inference Engine
Knowledge Base
1. User Interface
With the help of a user interface, the expert system interacts with the user, takes queries as input in a readable format, and passes them to the inference engine. After getting the response from the inference engine, it displays the output to the user. In other words, it is an interface that helps a non-expert user communicate with the expert system to find a solution.
2. Inference Engine(Rules of Engine)
The inference engine is known as the brain of the expert system as it is the main processing unit
of the system. It applies inference rules to the knowledge base to derive a conclusion or deduce
new information. It helps in deriving an error-free solution of queries asked by the user.
With the help of an inference engine, the system extracts the knowledge from the knowledge
base.
There are two types of inference engine:
Deterministic Inference engine: The conclusions drawn from this type of inference engine are
assumed to be true. It is based on facts and rules.
Probabilistic Inference engine: This type of inference engine contains uncertainty in its conclusions and is based on probability.
Forward Chaining: It starts from the known facts and rules, and applies the inference rules to
add their conclusion to the known facts.
Backward Chaining: It is a backward reasoning method that starts from the goal and works backward, through the rules, to the known facts that support the goal.
3. Knowledge Base
The knowledge base is a type of storage that stores knowledge acquired from different experts of a particular domain. It is considered a large store of knowledge. The larger and more accurate the knowledge base, the more precise the expert system will be.
It is similar to a database that contains information and rules of a particular domain or subject.
One can also view the knowledge base as collections of objects and their attributes. Such as a
Lion is an object and its attributes are it is a mammal, it is not a domestic animal, etc.
Factual Knowledge: The knowledge which is based on facts and accepted by knowledge
engineers comes under factual knowledge.
Heuristic Knowledge: This knowledge is based on practice, the ability to guess, evaluation, and
experiences.
Knowledge Representation: It is used to formalize the knowledge stored in the knowledge base using
the If-else rules.
Knowledge Acquisition: It is the process of extracting, organizing, and structuring the domain knowledge, specifying the rules to acquire the knowledge from various experts, and storing that knowledge in the knowledge base.
Advantages of Expert System:
1. No memory limitations: It can store as much data as required and can recall it at the time of application, whereas human experts are limited in how much they can remember at any given time.
2. High Efficiency: If the knowledge base is updated with the correct knowledge, then it provides a
highly efficient output, which may not be possible for a human.
3. Expertise in a domain: There are many human experts in each domain, each with different skills and different experiences, so it is not easy to get a single final answer for a query. But if we put the knowledge gained from human experts into the expert system, it provides an efficient output by combining all the facts and knowledge.
4. Not affected by emotions: These systems are not affected by human emotions such as fatigue, anger, depression, or anxiety, so the performance remains constant.
5. High security: These systems provide high security to resolve any query.
6. Considers all the facts: To respond to any query, it checks and considers all the available facts
and provides the result accordingly. But it is possible that a human expert may not consider
some facts due to any reason.
7. Regular updates improve the performance: If there is an issue in the result provided by the
expert systems, we can improve the performance of the system by updating the knowledge
base.
Capabilities of Expert System:
Advising: It is capable of advising a human being on queries in the particular domain of the ES.
Provide decision-making capabilities: It provides the capability of decision making in any
domain, such as for making any financial decision, decisions in medical science, etc.
Demonstrate a device: It is capable of demonstrating any new products such as its features,
specifications, how to use that product, etc.
Problem-solving: It has problem-solving capabilities.
Explaining a problem: It is also capable of providing a detailed description of an input problem.
Interpreting the input: It is capable of interpreting the input given by the user.
Predicting results: It can be used for the prediction of a result.
Diagnosis: An ES designed for the medical field is capable of diagnosing a disease without using
multiple components as it already contains various inbuilt medical tools.
Limitations of Expert System:
The response of the expert system may be wrong if the knowledge base contains wrong information.
Like a human being, it cannot produce a creative output for different scenarios.
Its maintenance and development costs are very high.
Knowledge acquisition for designing an ES is very difficult.
For each domain, we require a specific ES, which is one of the big limitations.
It cannot learn from itself and hence requires manual updates.
4. Backward and Forward Chaining
Backward Chaining
In logic programming and artificial intelligence, backward chaining is the technique used to get from an objective (goal) back to the assumptions or conditions that support it.
Backward chaining starts with a hypothesis or objective and works backward through a set of conditions or rules to see whether the goal is supported by those conditions. The system verifies each requirement until it reaches a point where all requirements are met, or until it reaches a requirement that cannot be met, at which point it terminates and reports the outcome.
Backward chaining, for instance, could be employed in a medical diagnosis system to identify the
primary reason behind a group of symptoms. In order to identify the diseases or disorders that might be
producing such symptoms, the system starts with the symptoms as the goal and works backward
through a series of criteria and conditions.
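As a rough sketch, goal-driven (backward) chaining can be written in a few lines of Python; the rule and fact names below are hypothetical placeholders for a diagnosis-style knowledge base:

rules = {                       # hypothetical rules: goal -> lists of premises that prove it
    "flu": [["fever", "cough"]],
    "fever": [["high_temperature"]],
}
facts = {"high_temperature", "cough"}

def prove(goal):
    """True if the goal is a known fact or all premises of some rule for it can be proven."""
    if goal in facts:
        return True
    return any(all(prove(p) for p in premises) for premises in rules.get(goal, []))

print(prove("flu"))             # True: fever follows from high_temperature, cough is a fact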
Disadvantages of Backward Chaining:
Restricted reasoning − Backward chaining only works in one direction and might not be able to produce fresh insights or solutions that were not specifically coded into the system.
Incomplete search − Backward chaining occasionally generates partial findings or fails to fully
investigate all potential solutions.
Handling conflicts − Conflict resolution may be challenging when using backward chaining to reconcile inconsistencies or conflicts between several rules or facts.
Forward Chaining
By starting with the premises or conditions and applying them one at a time to arrive at a conclusion,
forward chaining is a reasoning technique used in artificial intelligence and logic programming.
By applying a set of rules to an initial set of facts or circumstances, the system can then generate new
facts or conditions. This process is known as forward chaining. The system keeps using these rules and
producing new facts until it reaches a conclusion or a goal.
For instance, forward chaining might be employed in a rule-based system for diagnosing automobile issues to identify a specific problem with the vehicle. Starting with observations of the car's behaviour, the system applies a set of rules to generate potential causes of the issue. As it keeps applying the rules and ruling out unlikely explanations, it narrows the options and eventually comes to a conclusion about the issue.
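A hedged Python sketch of data-driven (forward) chaining in the spirit of the car-diagnosis example above; the rules and facts are illustrative placeholders, not an actual diagnostic rule base:

rules = [                                          # hypothetical rules: (premises, conclusion)
    ({"engine_cranks", "no_start"}, "fuel_or_spark_problem"),
    ({"fuel_or_spark_problem", "fuel_gauge_empty"}, "out_of_fuel"),
]
facts = {"engine_cranks", "no_start", "fuel_gauge_empty"}

changed = True
while changed:                                     # keep firing satisfied rules until no new fact appears
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)                                       # now also contains the two derived conclusions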
Disadvantages of Forward Chaining
Incomplete search: In some circumstances, forward chaining may not fully investigate all potential
solutions or may produce partial results.
Absence of a global perspective: As forward chaining simply takes into account the current set of facts or
circumstances, it might not evaluate the problem's wider context, which could result in inaccurate
conclusions.
Difficulty in handling conflicts: Conflict resolution may be challenging with forward chaining when there
are inconsistencies or conflicts between several facts or rules.
5. Informed Search Algorithms
An informed search algorithm uses knowledge such as how far we are from the goal, the path cost, how to reach the goal node, etc. This knowledge helps agents explore less of the search space and find the goal node more efficiently.
The informed search algorithm is more useful for large search spaces. Since it uses the idea of a heuristic, it is also called heuristic search.
Heuristic function: A heuristic is a function used in informed search to find the most promising path. It takes the current state of the agent as input and produces an estimate of how close the agent is to the goal. The heuristic method might not always give the best solution, but it is guaranteed to find a good solution in reasonable time. The heuristic function estimates how close a state is to the goal. It is represented by h(n), and it estimates the cost of an optimal path from that state to a goal state. The value of the heuristic function is always positive.
Admissibility condition: h(n) <= h*(n), where h(n) is the heuristic (estimated) cost and h*(n) is the actual optimal cost. Hence the heuristic cost should be less than or equal to the actual cost.
Greedy Best-First Search:
The greedy best-first search algorithm always selects the path which appears best at that moment. It is the combination of depth-first search and breadth-first search algorithms. It uses the heuristic function to guide the search, so best-first search allows us to take advantage of both algorithms. With the help of best-first search, at each step we can choose the most promising node. In the best-first search algorithm, we expand the node which is closest to the goal node, where the closeness is estimated by the heuristic function, i.e.
f(n) = h(n).
Advantages:
Best first search can switch between BFS and DFS by gaining the advantages of both the
algorithms.
This algorithm is more efficient than BFS and DFS algorithms.
Disadvantages:
It can behave like an unguided depth-first search in the worst case.
It is not optimal, and it can get stuck in loops, so it is not complete.
A* Search Algorithm:
A* search is the most commonly known form of best-first search. It uses the heuristic function h(n) and the cost to reach node n from the start state, g(n). It combines features of UCS and greedy best-first search, which lets it solve problems efficiently. The A* search algorithm finds the shortest path through the search space using the heuristic function. This algorithm expands a smaller search tree and provides optimal results faster. A* is similar to UCS except that it uses g(n) + h(n) instead of g(n).
In the A* search algorithm, we use the search heuristic as well as the cost to reach the node. Hence we can combine both costs as follows, and this sum is called the fitness number:
f(n) = g(n) + h(n)
Advantages:
A* is complete and, with an admissible heuristic, optimal.
It can solve very complex problems.
Disadvantages:
If the heuristic is not admissible, it may not produce the shortest path, as it relies on heuristics and approximation.
A* search algorithm has some complexity issues.
The main drawback of A* is memory requirement as it keeps all generated nodes in the
memory, so it is not practical for various large-scale problems.
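A compact Python sketch of A* on a small weighted graph, assuming the graph, edge costs, and heuristic values h(n) are given as dictionaries (illustrative data); nodes are expanded in order of lowest f(n) = g(n) + h(n):

import heapq

def a_star(graph, h, start, goal):
    """Return (cost, path) of a cheapest path, expanding nodes by f(n) = g(n) + h(n)."""
    frontier = [(h[start], 0, start, [start])]     # entries are (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for neighbour, step_cost in graph[node]:
            new_g = g + step_cost
            if new_g < best_g.get(neighbour, float("inf")):
                best_g[neighbour] = new_g
                heapq.heappush(frontier,
                               (new_g + h[neighbour], new_g, neighbour, path + [neighbour]))
    return None

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 12)], "B": [("G", 5)], "G": []}
h = {"S": 7, "A": 6, "B": 2, "G": 0}               # assumed admissible heuristic values
print(a_star(graph, h, "S", "G"))                  # (8, ['S', 'A', 'B', 'G'])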
6. Genetic Algorithm
A genetic algorithm is an adaptive heuristic search algorithm inspired by "Darwin's theory of evolution
in Nature." It is used to solve optimization problems in machine learning. It is one of the important
algorithms as it helps solve complex problems that would take a long time to solve.
It basically involves five phases to solve the complex optimization problems, which are given as below:
Initialization
Fitness Assignment
Selection
Reproduction
Termination
1. Initialization
The process of a genetic algorithm starts by generating a set of individuals, called the population. Each individual is a candidate solution to the given problem. An individual is characterized by a set of parameters called genes. Genes are joined into a string to form a chromosome, which represents a solution to the problem. One of the most popular techniques for initialization is the use of random binary strings.
2. Fitness Assignment
The fitness function is used to determine how fit an individual is, i.e., the ability of an individual to compete with other individuals. In every iteration, individuals are evaluated using the fitness function, which assigns a fitness score to each individual. This score determines the probability of being selected for reproduction: the higher the fitness score, the greater the chances of being selected.
3. Selection
The selection phase involves selecting individuals for the reproduction of offspring. The selected individuals are arranged in pairs, and these individuals then transfer their genes to the next generation.
4. Reproduction
After the selection process, the creation of a child occurs in the reproduction step. In this step, the
genetic algorithm uses two variation operators that are applied to the parent population. The two
operators involved in the reproduction phase are given below:
Crossover: The crossover plays a most significant role in the reproduction phase of the genetic
algorithm. In this process, a crossover point is selected at random within the genes. Then the
crossover operator swaps genetic information of two parents from the current generation to
produce a new individual representing the offspring.
The genes of the parents are exchanged among themselves until the crossover point is reached. The newly generated offspring are added to the population. This process is also called recombination or crossover.
Types of crossover styles available:
o One point crossover
o Two-point crossover
o Uniform crossover
Mutation
The mutation operator inserts random genes in the offspring (new child) to maintain the
diversity in the population. It can be done by flipping some bits in the chromosomes.
Mutation helps solve the problem of premature convergence and enhances diversification.
Types of mutation styles available,
o Flip bit mutation
o Gaussian mutation
o Exchange/Swap mutation
5. Termination
After the reproduction phase, a stopping criterion is applied as a base for termination. The algorithm
terminates after the threshold fitness solution is reached. It will identify the final solution as the best
solution in the population.
Limitations of Genetic Algorithms:
Genetic algorithms are not efficient for solving simple problems.
It does not guarantee the quality of the final solution to a problem.
Repetitive calculation of fitness values may generate some computational challenges.
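A small Python sketch of the five phases above applied to a toy problem (maximizing the number of 1-bits in a binary string); the population size, tournament selection, and mutation rate are illustrative choices, not prescribed by the notes:

import random

def fitness(chromosome):                            # phase 2: fitness = number of 1-bits
    return sum(chromosome)

def run_ga(length=20, pop_size=30, generations=50, mutation_rate=0.01):
    # Phase 1: initialization with random binary strings
    population = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def select():                               # phase 3: tournament selection of two
            a, b = random.sample(population, 2)
            return a if fitness(a) >= fitness(b) else b
        next_gen = []
        while len(next_gen) < pop_size:             # phase 4: one-point crossover + bit-flip mutation
            p1, p2 = select(), select()
            point = random.randint(1, length - 1)
            child = p1[:point] + p2[point:]
            child = [1 - g if random.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)             # phase 5: stop after a fixed generation count

best = run_ga()
print(best, fitness(best))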
7. Heuristic Search
A heuristic is a technique used to solve a problem faster than classic methods, or to find an approximate solution when classic methods fail to find an exact one. Heuristics are problem-solving techniques that result in practical and quick solutions.
Heuristics are strategies that are derived from past experience with similar problems. Heuristics use
practical methods and shortcuts used to produce the solutions that may or may not be optimal, but
those solutions are sufficient in a given limited timeframe.
Limitations and Challenges of AI:
1. Data Bias:
- AI systems heavily depend on training data. If the data used for training contains biases, the AI model
can perpetuate and even amplify those biases in its predictions or decisions.
2. Lack of Understanding:
- Many AI systems operate as "black boxes," making it challenging to understand the decision-making
process. This lack of transparency can be a significant barrier, especially in critical applications where
explanations for decisions are essential.
3. Ethical Concerns:
- As AI systems become more integrated into daily life, ethical considerations become crucial. Issues
like privacy invasion, job displacement, and unintended consequences of AI decisions need to be
addressed.
4. Over-reliance on Data:
- AI models often require vast amounts of data to perform well. In situations where data is scarce or
biased, the performance of AI systems can be compromised.
Challenges of AI in Expert Tasks:
1. Contextual Understanding:
- AI models may struggle to grasp the broader context of a situation, leading to potential errors or
misinterpretations, especially in complex expert tasks where a deep understanding of the domain is
crucial.
2. Resource Requirements:
- Developing and training advanced AI models for expert tasks often demands significant computational resources and expertise, limiting accessibility for smaller organizations or researchers.
3. Interdisciplinary Challenges:
- Expert tasks often require interdisciplinary knowledge. Integrating information from diverse fields
into AI systems can be challenging and might require collaboration between experts in different
domains.
4. Security Concerns:
- In expert tasks, security is a paramount concern. AI systems used in critical domains like healthcare or
finance must be resilient against adversarial attacks and unauthorized access.
UNIT 2
1. Propositional Logic
Propositional logic (PL) is the simplest form of logic, where all statements are made up of propositions. A proposition is a declarative statement which is either true or false. It is a technique for representing knowledge in logical and mathematical form.
The syntax of propositional logic defines the allowable sentences for the knowledge representation.
There are two types of Propositions:
1. Atomic Propositions
2. Compound propositions
Atomic Proposition: Atomic propositions are simple propositions consisting of a single proposition symbol. These are sentences which must be either true or false.
Example: "2 + 2 is 4" is an atomic proposition which is true, and "The Sun is cold" is an atomic proposition which is false.
Compound Proposition: Compound propositions are constructed by combining simpler or atomic propositions using logical connectives.
Example: "It is raining today and the street is wet" is a compound proposition.
Logical Connectives: Logical connectives are used to connect two simpler propositions or to represent a sentence logically. The main connectives are negation (¬), conjunction (∧), disjunction (∨), implication (→), and biconditional (↔), each defined by its truth table.
Properties of Operators:
Commutativity:
o P∧ Q= Q ∧ P, or
o P ∨ Q = Q ∨ P.
Associativity:
o (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
Identity element:
o P ∧ True = P,
o P ∨ True= True.
Distributive:
o P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
DE Morgan's Law:
o ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
o ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
Double-negation elimination:
o ¬ (¬P) = P.
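These operator properties can be checked mechanically; a short Python sketch that verifies De Morgan's laws and double negation over every truth assignment:

from itertools import product

for p, q in product([True, False], repeat=2):       # every truth assignment of P and Q
    assert (not (p and q)) == ((not p) or (not q))  # ¬(P ∧ Q) = ¬P ∨ ¬Q
    assert (not (p or q)) == ((not p) and (not q))  # ¬(P ∨ Q) = ¬P ∧ ¬Q
    assert (not (not p)) == p                       # ¬(¬P) = P
print("All checked properties hold for every truth assignment.")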
Limitations of Propositional Logic:
We cannot represent relations like ALL, some, or none with propositional logic. Example:
1. All the girls are intelligent.
2. Some apples are sweet.
Propositional logic has limited expressive power.
In propositional logic, we cannot describe statements in terms of their properties or logical
relationships.
2. FOL
PL is not sufficient to represent complex sentences or natural language statements; propositional logic has very limited expressive power. Consider sentences such as "All the girls are intelligent" or "Some apples are sweet", which we cannot represent using PL.
To represent such statements, PL is not sufficient, so we require a more powerful logic, such as first-order logic.
First-Order logic:
The syntax of FOL determines which collection of symbols is a logical expression in first-order logic. The
basic syntactic elements of first-order logic are symbols. We write statements in short-hand notation in
FOL.
Variables: x, y, z, a, b, ...
Equality: ==
Quantifiers: ∀, ∃
Consider the statement: "x is an integer.", it consists of two parts, the first part x is the subject of the
statement and second part "is an integer," is known as a predicate.
Universal Quantifier:
The universal quantifier is a symbol of logical representation which specifies that the statement within its range is true for everything or every instance of a particular thing. It is denoted by ∀. If x is a variable, then ∀x is read as:
For all x
For each x
For every x
Existential Quantifier:
Existential quantifiers are the type of quantifiers, which express that the statement within its scope is
true for at least one instance of something.
It is denoted by the logical operator ∃, which resembles an inverted E. When it is used with a predicate variable, it is called an existential quantifier.
If x is a variable, then the existential quantifier will be ∃x or ∃(x), and it will be read as:
There exists an x
For at least one x
For some x
Properties of Quantifiers:
The quantifiers interact with variables which appear in a suitable way. There are two types of variables
in First-order logic which are given below:
Free Variable: A variable is said to be a free variable in a formula if it occurs outside the scope of the
quantifier.
Bound Variable: A variable is said to be a bound variable in a formula if it occurs within the scope of the
quantifier.
Inference engine:
The inference engine is the component of the intelligent system in artificial intelligence, which applies
logical rules to the knowledge base to infer new information from known facts. The first inference
engine was part of the expert system. Inference engine commonly proceeds in two modes, which are:
1. Forward chaining
2. Backward chaining
Horn clause and definite clause are the forms of sentences, which enables knowledge base to use a
more restricted and efficient inference algorithm. Logical inference algorithms use forward and
backward chaining approaches, which require KB in the form of the first-order definite clause.
Definite clause: A clause which is a disjunction of literals with exactly one positive literal is known as a
definite clause or strict horn clause.
Horn clause: A clause which is a disjunction of literals with at most one positive literal is known as horn
clause. Hence all the definite clauses are horn clauses.
Example: the definite clause (¬p ∨ ¬q ∨ k) has exactly one positive literal, and it is equivalent to p ∧ q → k.
A. Forward Chaining
Forward chaining is also known as a forward deduction or forward reasoning method when using an
inference engine. Forward chaining is a form of reasoning which start with atomic sentences in the
knowledge base and applies inference rules (Modus Ponens) in the forward direction to extract more
data until a goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises are satisfied, and adds their conclusions to the known facts. This process repeats until the problem is solved.
Properties of Forward-Chaining:
It is a bottom-up, data-driven approach: it moves from the known facts towards the goal, and it is commonly used in expert systems and production-rule systems.
B. Backward Chaining:
Backward-chaining is also known as a backward deduction or backward reasoning method when using
an inference engine. A backward chaining algorithm is a form of reasoning, which starts with the goal
and works backward, chaining through rules to find known facts that support the goal.
4. Unification
Unification is a process of making two different logical atomic expressions identical by finding a
substitution. Unification depends on the substitution process.
It takes two literals as input and makes them identical using substitution.
Let Ψ1 and Ψ2 be two atomic sentences and σ be a unifier such that Ψ1σ = Ψ2σ; then the unification is expressed as UNIFY(Ψ1, Ψ2) = σ.
Example: Find the MGU for UNIFY{King(x), King(John)}.
The substitution θ = {John/x} is a unifier for these atoms: applying it makes both expressions identical, namely King(John).
The UNIFY algorithm is used for unification, which takes two atomic sentences and returns a
unifier for those sentences (If any exist).
Unification is a key component of all first-order inference algorithms.
It returns fail if the expressions do not match with each other.
The simplest, most general such substitution is called the Most General Unifier (MGU).
Predicate symbol must be same, atoms or expression with different predicate symbol can never
be unified.
Number of Arguments in both expressions must be identical.
Unification will fail if there are two similar variables present in the same expression.
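A simplified Python sketch of the UNIFY procedure, assuming atoms are represented as tuples and variables as lowercase strings (a representation chosen only for illustration; the occurs check is omitted):

def is_variable(term):
    return isinstance(term, str) and term[:1].islower()      # e.g. "x", "y" are variables

def unify(x, y, subst=None):
    """Return the most general unifier of x and y as a dict, or None on failure."""
    if subst is None:
        subst = {}
    if x == y:
        return subst
    if is_variable(x):
        return unify_var(x, y, subst)
    if is_variable(y):
        return unify_var(y, x, subst)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):                    # predicate and argument count must match
            subst = unify(xi, yi, subst)
            if subst is None:
                return None
        return subst
    return None                                     # different predicates or constants: failure

def unify_var(var, term, subst):
    if var in subst:
        return unify(subst[var], term, subst)
    return {**subst, var: term}

print(unify(("King", "x"), ("King", "John")))       # MGU {'x': 'John'}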
5. Resolution
Resolution is a theorem-proving technique that proceeds by building refutation proofs. Resolution is used when several statements are given and we need to prove a conclusion from those statements.
Unification is a key concept in proofs by resolutions. Resolution is a single inference rule which can
efficiently operate on the conjunctive normal form or clausal form.
Clause: A disjunction of literals is called a clause. A clause containing exactly one literal is known as a unit clause.
The resolution rule for first-order logic is simply a lifted version of the propositional rule. Resolution can
resolve two clauses if they contain complementary literals, which are assumed to be standardized apart
so that they share no variables.
This rule is also called the binary resolution rule because it only resolves exactly two literals.
Example:
Here the two complementary literals are Loves(f(x), x) and ¬Loves(a, b).
These literals can be unified with the unifier θ = [a/f(x), b/x], and resolving them generates a resolvent clause.
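For intuition, a minimal propositional version of the binary resolution rule in Python, with clauses as sets of literal strings and '~' marking negation; the first-order version additionally unifies the complementary literals as described above:

def resolve(c1, c2):
    """Return the resolvents of two propositional clauses given as sets of literals."""
    resolvents = []
    for lit in c1:
        complement = lit[1:] if lit.startswith("~") else "~" + lit
        if complement in c2:
            resolvents.append((c1 - {lit}) | (c2 - {complement}))
    return resolvents

print(resolve({"P", "Q"}, {"~Q", "R"}))             # resolving on Q/¬Q gives [{'P', 'R'}]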
6. Inference in FOL
Inference in First-Order Logic is used to deduce new facts or sentences from existing sentences. Before
understanding the FOL inference rule, let's understand some basic terminologies used in FOL.
As in propositional logic, we also have inference rules in first-order logic; the following are some basic inference rules in FOL:
1. Universal Generalization:
Universal generalization is a valid inference rule which states that if premise P(c) is true for any arbitrary element c in the universe of discourse, then we can conclude ∀x P(x).
Example: Let P(c) be "A byte contains 8 bits"; then ∀x P(x), "All bytes contain 8 bits," is also true.
2. Universal Instantiation:
Universal instantiation is also called as universal elimination or UI is a valid inference rule. It can
be applied multiple times to add new sentences.
The new KB is logically equivalent to the previous KB.
As per UI, we can infer any sentence obtained by substituting a ground term for the variable.
The UI rule states that we can infer any sentence P(c) by substituting a ground term c (a constant within the domain of x) into ∀x P(x), for any object in the universe of discourse.
Example:
"All kings who are greedy are Evil." Suppose our knowledge base contains this detail in FOL form: ∀x King(x) ∧ Greedy(x) → Evil(x).
From this information we can infer, by Universal Instantiation, statements such as King(John) ∧ Greedy(John) → Evil(John) for any constant (e.g., John) in the domain.
3. Existential Instantiation:
Existential instantiation is also called as Existential Elimination, which is a valid inference rule in
first-order logic.
It can be applied only once to replace the existential sentence.
The new KB is not logically equivalent to old KB, but it will be satisfiable if old KB was satisfiable.
This rule states that one can infer P(c) from the formula given in the form of ∃x P(x) for a new
constant symbol c.
The restriction with this rule is that the constant c used in the rule must be a new term for which P(c) is true.
It can be represented as: from ∃x P(x), infer P(c).
Example: From ∃x Crown(x) ∧ OnHead(x, John), we can infer Crown(K) ∧ OnHead(K, John), as long as K does not appear anywhere else in the knowledge base.
4. Existential Introduction:
Existential introduction (also called existential generalization) states that if P(c) is true for some element c in the universe of discourse, then we can infer ∃x P(x).
7. Knowledge Representation
Knowledge representation and reasoning (KR, KRR) is the part of artificial intelligence concerned with how AI agents think and how thinking contributes to the intelligent behavior of agents.
It is responsible for representing information about the real world so that a computer can understand it and utilize this knowledge to solve complex real-world problems, such as diagnosing a medical condition or communicating with humans in natural language.
It is also a way which describes how we can represent knowledge in artificial intelligence.
Knowledge representation is not just storing data into some database, but it also enables an
intelligent machine to learn from that knowledge and experiences so that it can behave
intelligently like a human.
What to Represent:
Object: All the facts about objects in our world domain. E.g., guitars contain strings; trumpets are brass instruments.
Events: Events are the actions which occur in our world.
Performance: It describes behavior which involves knowledge about how to do things.
Meta-knowledge: It is knowledge about what we know.
Facts: Facts are the truths about the real world and what we represent.
Knowledge-Base: The central component of the knowledge-based agents is the knowledge
base. It is represented as KB. The Knowledgebase is a group of the Sentences (Here, sentences
are used as a technical term and not identical with the English language).
Knowledge: Knowledge is awareness or familiarity gained by experiences of facts, data, and
situations.
Types of knowledge:
1. Declarative Knowledge: Knowledge about something, expressed as facts, concepts, and objects in declarative sentences.
2. Procedural Knowledge: Knowledge about how to do something; it includes rules, strategies, and procedures.
3. Meta-knowledge: Knowledge about other types of knowledge.
4. Heuristic knowledge: Rules of thumb based on the experience of experts in a domain.
5. Structural knowledge: Basic problem-solving knowledge describing relationships between concepts, such as kind-of and part-of relations.
There are mainly four approaches to knowledge representation, which are given below:
1. Simple relational knowledge:
It is the simplest way of storing facts, using the relational method: each fact about a set of objects is set out systematically in columns.
This approach of knowledge representation is famous in database systems where the
relationship between different entities is represented.
This approach has little opportunity for inference.
2. Inheritable knowledge:
In the inheritable knowledge approach, all data must be stored into a hierarchy of classes.
All classes should be arranged in a generalized form or a hierarchal manner.
In this approach, we apply inheritance property.
Elements inherit values from other members of a class.
This approach contains inheritable knowledge which shows a relation between instance and
class, and it is called instance relation.
Every individual frame can represent the collection of attributes and its value.
In this approach, objects and values are represented in Boxed nodes.
We use Arrows which point from objects to their values.
3. Inferential knowledge:
The inferential knowledge approach represents knowledge in the form of formal logic, from which new facts can be derived, and it guarantees correctness. Example:
man(Marcus)
∀x: man(x) → mortal(x)
4. Procedural knowledge:
The procedural knowledge approach uses small programs and code which describe how to do specific things and how to proceed.
In this approach, one important rule is used which is If-Then rule.
In this knowledge, we can use various coding languages such as LISP language and Prolog
language.
We can easily represent heuristic or domain-specific knowledge using this approach.
But it is not necessary that we can represent all cases in this approach.
8. Non-monotonic Reasoning
In non-monotonic reasoning, some conclusions may be invalidated when more knowledge is added to the knowledge base, so the set of provable conclusions does not grow monotonically. For example, from "Birds can fly" and "Tweety is a bird" we conclude that Tweety can fly; adding the fact "Tweety is a penguin" forces us to retract that conclusion. Non-monotonic logic is therefore useful for representing default assumptions and incomplete knowledge.
9. English to Prolog Facts
Prolog, short for "Programming in Logic," is a declarative programming language primarily used in the
field of artificial intelligence. Unlike traditional imperative languages, Prolog focuses on expressing
relationships and logical rules rather than step-by-step instructions. Its distinctive feature is the ability to
perform automated reasoning and inference based on a set of rules and facts.
Prolog Facts:
In Prolog, facts are fundamental building blocks used to represent statements or truths. They can range
from simple declarations to complex relationships. Here's an exploration of Prolog facts:
1. Simple Facts:
sky_color(blue).
2. Binary Relations:
mother(mary, john).
4. Variable Usage:
chases(cat, mouse).
A query such as ?- chases(cat, Who). uses the variable Who, which Prolog binds to mouse.
5. Negation in Facts:
not(planet(sun)).
6. Rules with Arithmetic:
twice(X, Y) :- Y is 2 * X.
Prolog uses the `=` operator for matching (unifying) terms: two terms match if they are equal or can be made equal by assigning values to variables. The `=:=` operator compares the values of arithmetic expressions:
equation(X, Y) :- X + 3 =:= 2 * Y.
By combining these elements, Prolog provides a powerful framework for representing and reasoning
about relationships, making it particularly well-suited for applications in artificial intelligence, natural
language processing, and databases. Its declarative nature allows developers to focus on specifying what
they want, leaving the Prolog interpreter to determine how to achieve the desired outcomes.
10. Categories and Objects
Categories:
1. Definition:
- Categories are high-level, conceptual groupings that help organize knowledge. They represent classes
or sets of entities that share common characteristics.
2. Examples:
- Animals: This category encompasses various subcategories like mammals, birds, and reptiles.
3. Hierarchical Structure:
- Categories often have a hierarchical structure, forming a taxonomy. This structure allows for the
representation of broader and more specific categories.
Example:
% Animal hierarchy
is_a(mammal, animal).
is_a(bird, animal).
Objects:
1. Definition:
- Objects are instances or individual entities within a category. They represent specific occurrences or
items belonging to a particular class.
2. Examples:
- Golden Retriever: An instance of the "Dog" category under the broader "Mammal" category.
Example:
% Dog attributes (illustrative; predicate name assumed)
instance_of(golden_retriever, dog).
1. Semantic Networks:
- Example: Representing relationships between categories and objects using nodes and edges. Nodes
can represent categories, and edges signify relationships.
2. Frames:
- Example: Structuring information using frames, where each frame corresponds to a category or
object. Frames contain slots for attributes and properties.
Example:
3. Ontologies:
- Example: Developing ontologies to formally define relationships and hierarchies between different
categories and objects.
Example:
% Ontology relationships
has_parent(mammal, animal).
11. Reasoning System
Reasoning systems in computational intelligence play a crucial role in enabling machines to make
decisions, draw inferences, and solve problems. These systems emulate human-like reasoning
processes, allowing AI systems to navigate complex scenarios. Here's an overview of reasoning systems
in the context of artificial intelligence:
1. Symbolic Reasoning:
- Definition:
- Symbolic reasoning involves manipulating symbols and logical operations to derive new information
from existing knowledge.
- Example:
- Using logical rules to infer that if an entity is a bird and can fly, then it belongs to the category of
flying creatures.
2. Rule-Based Reasoning:
- Definition:
- Rule-based reasoning relies on a set of predefined rules to make decisions or draw conclusions.
- Example:
- If it's raining, carry an umbrella. This rule guides the decision-making process based on a specific
condition.
3. Case-Based Reasoning:
- Definition:
- Case-based reasoning involves solving new problems by recalling and adapting solutions from
similar past cases.
- Example:
- Diagnosing a medical condition based on similarities to previous cases with known outcomes.
4. Fuzzy Logic Reasoning:
- Definition:
- Fuzzy logic reasoning deals with uncertainty by allowing values to range between true and false,
enabling a more flexible approach to decision-making.
- Example:
5. Probabilistic Reasoning:
- Definition:
- Probabilistic reasoning involves assessing and calculating probabilities to make decisions under
uncertainty.
- Example:
6. Machine Learning Reasoning:
- Definition:
- Machine learning reasoning involves learning patterns and making predictions based on data
without explicit programming.
- Example:
- Training a model to recognize handwritten digits and making predictions on new, unseen data.
7. Abductive Reasoning:
- Definition:
- Abductive reasoning involves making the best explanation or hypothesis for observed facts or
evidence.
- Example:
8. Commonsense Reasoning:
- Definition:
- Commonsense reasoning enables AI systems to make deductions based on general knowledge and
common understanding.
- Example:
9. Inference Engines:
- Definition:
- Inference engines are components that process rules and information to draw conclusions in a
reasoning system.
- Example:
- Utilizing an inference engine to evaluate rules and deduce outcomes in a medical diagnostic system.
10. Automated Reasoning:
- Definition:
- Automated reasoning involves the use of algorithms and computational methods to perform logical
inference and solve problems.
- Example:
UNIT 3
1. Fuzzy Logic/Set/Rules
Fuzzy Logic is a form of many-valued logic in which the truth values of variables may be any real number
between 0 and 1, instead of just the traditional values of true or false. It is used to deal with imprecise
or uncertain information and is a mathematical method for representing vagueness and uncertainty in
decision-making. It allows for partial truths, where a statement can be partially true or false, rather than
fully true or false.
The fundamental concept of Fuzzy Logic is the membership function, which defines the degree of
membership of an input value to a certain set or category. The membership function is a mapping from
an input value to a membership degree between 0 and 1, where 0 represents non-membership and 1
represents full membership.
ARCHITECTURE
RULE BASE: It contains the set of rules and the IF-THEN conditions provided by the experts to
govern the decision-making system, on the basis of linguistic information. Recent developments
in fuzzy theory offer several effective methods for the design and tuning of fuzzy controllers.
Most of these developments reduce the number of fuzzy rules.
FUZZIFICATION: It is used to convert inputs i.e. crisp numbers into fuzzy sets. Crisp inputs are
basically the exact inputs measured by sensors and passed into the control system for
processing, such as temperature, pressure, rpm’s, etc.
INFERENCE ENGINE: It determines the matching degree of the current fuzzy input with respect
to each rule and decides which rules are to be fired according to the input field. Next, the fired
rules are combined to form the control actions.
DEFUZZIFICATION: It is used to convert the fuzzy sets obtained by the inference engine into a
crisp value. There are several defuzzification methods available and the best-suited one is used
with a specific expert system to reduce the error.
Membership function
Definition: A graph that defines how each point in the input space is mapped to membership value
between 0 and 1. Input space is often referred to as the universe of discourse or universal set (u), which
contains all the possible elements of concern in each particular application.
What is Fuzzy Control? Fuzzy control is a control approach that uses fuzzy logic to adjust the output of a system based on imprecise or uncertain input data (it is discussed further in the Neuro-Fuzzy section below).
Advantages of Fuzzy Logic Systems:
This system can work with any type of inputs whether it is imprecise, distorted or noisy input
information.
The construction of Fuzzy Logic Systems is easy and understandable.
Fuzzy logic comes with mathematical concepts of set theory and the reasoning of that is quite
simple.
It provides a very efficient solution to complex problems in all fields of life as it resembles
human reasoning and decision-making.
The algorithms can be described with little data, so little memory is required.
Disadvantages of Fuzzy Logic Systems:
Many researchers have proposed different ways to solve a given problem through fuzzy logic, which leads to ambiguity; there is no systematic approach for solving a given problem through fuzzy logic.
Proof of its characteristics is difficult or impossible in most cases because every time we do not
get a mathematical description of our approach.
As fuzzy logic works on both precise and imprecise data, accuracy is often compromised.
Application
Fuzzy logic is used in Natural language processing and various intensive applications in Artificial
Intelligence.
Fuzzy logic is extensively used in modern control systems such as expert systems.
Fuzzy Logic is used with Neural Networks as it mimics how a person would make decisions, only
much faster. It is done by Aggregation of data and changing it into more meaningful data by
forming partial truths as Fuzzy sets.
Fuzzy set:
1. A fuzzy set is a set whose elements have degrees of membership between 0 and 1. Fuzzy sets are represented with a tilde character (~). For example, the proportion of cars following traffic signals at a particular time, out of all cars present, will have a membership value in [0, 1].
2. Partial membership exists when member of one fuzzy set can also be a part of other fuzzy sets in
the same universe.
3. The degree of membership or truth is not same as probability, fuzzy truth represents
membership in vaguely defined sets.
4. A fuzzy set A~ in the universe of discourse U can be defined as a set of ordered pairs:
A~ = {(y, μA~(y)) | y ∈ U}, where μA~(y) is the degree of membership of y in A~.
Given two fuzzy sets A~ and B~ and an element y of the universe, the following relations express the union, intersection and complement operations on fuzzy sets.
Union/Fuzzy ‘OR’
Let us consider the following representation to understand how the Union/Fuzzy ‘OR’ relation works −
μA˜∪B˜(y) = μA˜(y) ∨ μB˜(y), ∀ y ∈ U
Intersection/Fuzzy ‘AND’
Let us consider the following representation to understand how the Intersection/Fuzzy ‘AND’ relation works −
μA˜∩B˜(y) = μA˜(y) ∧ μB˜(y), ∀ y ∈ U
Complement/Fuzzy ‘NOT’
Let us consider the following representation to understand how the Complement/Fuzzy ‘NOT’ relation
works −
μ¬A˜(y) = 1 − μA˜(y), ∀ y ∈ U
Commutative Property
Associative Property
Distributive Property
Idempotency Property
Identity Property
Involution Property
De Morgan’s Law
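A short Python sketch of the union, intersection, and complement relations above on discrete fuzzy sets, represented as dictionaries mapping each element of the universe to its membership degree (the values are illustrative):

A = {"a": 0.2, "b": 0.7, "c": 1.0}                 # illustrative fuzzy sets over universe {a, b, c}
B = {"a": 0.5, "b": 0.3, "c": 0.8}

union        = {y: max(A[y], B[y]) for y in A}      # μA∪B(y) = μA(y) ∨ μB(y)
intersection = {y: min(A[y], B[y]) for y in A}      # μA∩B(y) = μA(y) ∧ μB(y)
complement_A = {y: round(1 - A[y], 2) for y in A}   # μ¬A(y) = 1 − μA(y)

print(union)                                        # {'a': 0.5, 'b': 0.7, 'c': 1.0}
print(intersection)                                 # {'a': 0.2, 'b': 0.3, 'c': 0.8}
print(complement_A)                                 # {'a': 0.8, 'b': 0.3, 'c': 0.0}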
2. Fuzzy Inference System
Fuzzy Inference System is the key unit of a fuzzy logic system having decision making as its primary work.
It uses the “IF…THEN” rules along with connectors “OR” or “AND” for drawing essential decision rules.
The output from FIS is always a fuzzy set irrespective of its input which can be fuzzy or crisp.
It is necessary to have fuzzy output when it is used as a controller.
A defuzzification unit would be there with FIS to convert fuzzy variables into crisp variables.
The following five functional blocks will help you understand the construction of FIS −
Working of FIS
A fuzzification unit supports the application of numerous fuzzification methods, and converts
the crisp input into fuzzy input.
A knowledge base - collection of rule base and database is formed upon the conversion of crisp
input into fuzzy input.
In the defuzzification unit, the fuzzy output is finally converted into a crisp output.
Methods of FIS
Let us now discuss the different methods of FIS. Following are the two important methods of FIS, having
different consequent of fuzzy rules −
Mamdani Fuzzy Inference System
This system was proposed in 1975 by Ebrahim Mamdani. It was originally designed to control a steam engine and boiler combination by synthesizing a set of fuzzy rules obtained from people working on the system.
Following steps need to be followed to compute the output from this FIS −
Takagi-Sugeno Fuzzy Model (TS Method)
This model was proposed by Takagi, Sugeno and Kang in 1985. The format of its rules is given as −
IF x is A and y is B THEN z = f(x, y)
Here, A and B are fuzzy sets in the antecedent, and z = f(x, y) is a crisp function in the consequent.
The fuzzy inference process under Takagi-Sugeno Fuzzy Model (TS Method) works in the following way −
Step 1: Fuzzifying the inputs − Here, the inputs of the system are made fuzzy.
Step 2: Applying the fuzzy operator − In this step, the fuzzy operators must be applied to get
the output.
Let us now understand the comparison between the Mamdani System and the Sugeno Model.
Output Membership Function − The main difference between them is on the basis of output
membership function. The Sugeno output membership functions are either linear or constant.
Aggregation and Defuzzification Procedure − The difference between them also lies in the
consequence of fuzzy rules and due to the same their aggregation and defuzzification procedure
also differs.
Mathematical Rules − More mathematical rules exist for the Sugeno rule than the Mamdani
rule.
Adjustable Parameters − The Sugeno controller has more adjustable parameters than the
Mamdani controller.
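A minimal zero-order Sugeno-style sketch in Python for a single input (room temperature driving fan speed); the membership functions, rule consequents, and numbers are illustrative assumptions rather than a full FIS implementation:

def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fan_speed(temp):
    # Fuzzification: degree to which the temperature is 'cool', 'warm', 'hot'.
    firing = {"cool": tri(temp, 0, 10, 20),
              "warm": tri(temp, 15, 25, 35),
              "hot":  tri(temp, 30, 40, 50)}
    consequent = {"cool": 10, "warm": 50, "hot": 90}   # zero-order Sugeno constants (fan speed %)
    # Defuzzification: weighted average of the consequents by rule firing strength.
    num = sum(firing[r] * consequent[r] for r in firing)
    den = sum(firing.values()) or 1.0
    return num / den

print(fan_speed(32))                                   # 66.0, between the 'warm' and 'hot' outputs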
3. Neuro Fuzzy
A Neuro-Fuzzy Inference System (NFIS) is a type of artificial intelligence system that combines fuzzy logic
with neural network technology to improve the accuracy and performance of fuzzy inference systems.
The goal of a NFIS is to provide a more flexible and adaptive system that can better handle complex and
uncertain data.
ANFIS combines the ability of neural networks to learn from data with the ability of fuzzy logic to handle
uncertainty and imprecision in the data.
ANFIS uses a hybrid learning algorithm that combines the backpropagation (gradient descent) algorithm of neural networks with the least-squares method for adjusting the parameters of the fuzzy inference system. ANFIS is essentially a Sugeno-type fuzzy inference system with the parameters of the fuzzy rules determined by this hybrid learning procedure.
Fuzzy control is a type of control system that uses fuzzy logic to adjust the output of a system based on
input data that is uncertain or imprecise. The goal of fuzzy control is to provide a flexible and adaptable
control system that can handle changing conditions and uncertain data. It is a high level representation
language with local semantics and an interpreter/compiler to synthesize non-linear (control) surfaces.
The fuzzy control types are:
– Type I: RHS is a monotonic function
– Type II: RHS is a fuzzy set
– Type III: RHS is a (linear) function of state
Note that Type II fuzzy control must be tuned manually, while Type III fuzzy control (Takagi-Sugeno type) has automatic right-hand-side (RHS) tuning.
4. Temporal Logic and Temporal Reasoning
Temporal logic is a subfield of mathematical logic that deals with reasoning about time and the temporal relationships between events. In artificial intelligence, temporal logic is used as a formal language to describe and reason about the temporal behavior of systems and processes.
Temporal logic extends classical propositional and first-order logic with constructs for specifying
temporal relationships, such as “before,” “after,” “during,” and “until.” This allows for the expression of
temporal constraints and the modeling of temporal aspects of a system, such as its evolution over time
and the relationships between events.
Advantages of using Temporal Logic in Artificial Intelligence:
1. Formal specification: Temporal logic provides a formal language for specifying the desired
behavior of systems and processes, making it easier to ensure that these systems behave
correctly and satisfy the specified requirements.
2. Verification: Temporal logic can be used to verify that a system satisfies the specified temporal
properties, providing a rigorous method for checking the correctness of systems and reducing
the risk of errors.
3. Modeling: Temporal logic allows for the modeling of complex temporal behavior of systems and
processes, making it useful for a wide range of applications in artificial intelligence, such as
robotics and control systems.
4. Completeness: Temporal logic provides a complete system for reasoning about time, making it
well-suited for applications that involve temporal reasoning.
Disadvantages of using Temporal Logic in Artificial Intelligence:
1. Complexity: The formal syntax and semantics of temporal logic can be complex, making it
challenging to use for some applications and requiring a high level of mathematical expertise.
2. Limitations: Temporal logic is a formal language and may not be well-suited for certain
applications that involve uncertain or vague temporal relationships.
Temporal reasoning in AI refers to the ability of an artificial intelligence system to understand and
reason about events, actions, and relationships that occur over time. It involves the capability to
perceive, represent, and manipulate temporal information to make predictions, plan actions, or make
decisions in dynamic environments.
1. Temporal Logic: Temporal logic is a formal language used to reason about time and temporal
relationships. It provides a set of operators and rules for expressing temporal constraints, such
as "before," "after," "during," and "until." Temporal logic can be used to specify properties of
systems and verify their behavior.
2. Time Series Analysis: Time series analysis involves analyzing and forecasting data points
collected over time. It includes techniques such as autoregressive integrated moving average
(ARIMA), exponential smoothing, and recurrent neural networks (RNNs). Time series analysis
enables AI systems to detect patterns, trends, and anomalies in temporal data.
3. Temporal Probabilistic Models: Probabilistic models, such as hidden Markov models (HMMs)
and dynamic Bayesian networks (DBNs), can incorporate temporal information and uncertainty.
These models can capture dependencies between variables over time and make predictions
based on probabilistic reasoning.
4. Temporal Reasoning in Planning: Temporal reasoning is crucial in planning and scheduling
problems, where actions and events must be ordered and coordinated over time. Techniques
like temporal planning networks (TPNs) and temporal constraint networks (TCNs) help AI
systems reason about temporal constraints and create effective plans.
5. Temporal Reasoning in Natural Language Processing: Understanding and generating natural
language often requires temporal reasoning. Temporal expressions such as dates, durations, and
temporal relations play a significant role in language comprehension and generation tasks. AI
models need to interpret and reason about these temporal aspects to generate accurate and
contextually appropriate responses.
5. Back Propagation Neural Networks
Backpropagation is an algorithm that propagates the errors from the output nodes back to the input nodes; therefore, it is simply referred to as the backward propagation of errors. It is used in many applications of neural networks in data mining, such as character recognition and signature verification.
Neural Network:
Neural networks are an information processing paradigm inspired by the human nervous
system. Just like in the human nervous system, we have biological neurons in the same way in
neural networks we have artificial neurons, artificial neurons are mathematical functions
derived from biological neurons. The human brain is estimated to have about 10 billion neurons, each connected to an average of 10,000 other neurons. Each neuron receives a signal through a synapse, which controls the effect of the signal on the neuron.
Features of Backpropagation:
• it is the gradient descent method as used in the case of simple perceptron network with
the differentiable unit.
• it is different from other networks in respect to the process by which the weights are
calculated during the learning period of the network.
• training is done in the three stages :
• the feed-forward of input training pattern
• the calculation and backpropagation of the error
• updation of the weight
• Working of Backpropagation:
• Neural networks use supervised learning to generate output vectors from the input vectors that the network operates on. It compares the generated output to the desired output and generates an error report if they do not match; it then adjusts the weights according to the error report to obtain the desired output.
Backpropagation Algorithm:
Step 1: Inputs X arrive through the preconnected path.
Step 2: The input is modeled using real weights W; the weights are usually chosen randomly.
Step 3: Calculate the output of each neuron from the input layer, through the hidden layer, to the output layer.
Step 4: Calculate the error at the outputs: Error = Actual output − Desired output.
Step 5: Propagate the error back from the output layer to the hidden layer and adjust the weights so as to reduce the error.
Step 6: Repeat the process until the desired output is achieved.
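A compact numpy sketch of these steps for a network with one hidden layer, trained on the XOR problem; the layer sizes, learning rate, and sigmoid activation are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # Step 1: inputs
y = np.array([[0], [1], [1], [0]], dtype=float)               # desired outputs (XOR)

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))            # Step 2: random initial weights
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0                                                      # learning rate (illustrative)

for _ in range(10000):
    h = sigmoid(X @ W1 + b1)            # Step 3: feed-forward, input -> hidden
    out = sigmoid(h @ W2 + b2)          #         hidden -> output
    error = y - out                     # Step 4: error at the output
    d_out = error * out * (1 - out)     # Step 5: backpropagate the error ...
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 += lr * h.T @ d_out              # ... and adjust the weights (gradient step)
    b2 += lr * d_out.sum(axis=0, keepdims=True)
    W1 += lr * X.T @ d_h
    b1 += lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out).ravel())            # Step 6: after enough repetitions this usually matches [0. 1. 1. 0.]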
6. Conventional and Non-Conventional Reasoning Systems
Conventional (rule-based and symbolic) reasoning approaches:
1. Symbolic Logic:
- Description:
- Symbolic logic is a conventional reasoning approach that uses symbols and logical operations to represent and manipulate knowledge.
- Example:
- Utilizing propositional or first-order logic to express relationships and rules.
2. Rule-Based Systems:
- Description:
- Rule-based systems use a set of predefined rules to guide decision-making and inference.
- Example:
- Implementing rules such as "if condition A and condition B, then action C" in expert systems.
3. Decision Trees:
- Description:
- Decision trees are hierarchical structures that use a series of if-else conditions to make
decisions.
- Example:
- Constructing a decision tree for diagnosing medical conditions based on patient symptoms.
4. Expert Systems:
- Description:
- Expert systems emulate human expertise by encoding domain-specific knowledge in a set of
rules.
- Example:
- A medical expert system diagnosing diseases based on symptoms and medical history.
5. Predicate Logic:
- Description:
- Predicate logic extends symbolic logic by introducing predicates, allowing for more complex
expressions.
- Example:
- Expressing relationships with predicates like "is_a(animal, mammal)" in knowledge
representation.
Non-conventional (nature-inspired and learning-based) reasoning approaches:
2. Neural Networks:
- Description:
- Neural networks are computational models inspired by the human brain, capable of learning
patterns and making predictions based on data.
- Example:
- Training a neural network to recognize objects in images.
3. Genetic Algorithms:
- Description:
- Genetic algorithms mimic the process of natural selection to evolve solutions to
optimization and search problems.
- Example:
- Optimizing parameters for a complex system through iterative evolution.
4. Case-Based Reasoning:
- Description:
- Case-based reasoning involves solving new problems by recalling and adapting solutions
from similar past cases.
- Example:
- Diagnosing technical issues based on similarities to previously resolved cases.
5. Probabilistic Reasoning:
- Description:
- Probabilistic reasoning involves assessing and calculating probabilities to make decisions
under uncertainty.
- Example:
- Predicting the likelihood of a stock price increase based on historical data.
6. Swarm Intelligence:
- Description:
- Swarm intelligence models collective behavior observed in natural systems, such as ant
colonies or bird flocks, to solve problems.
- Example:
- Swarm robotics coordinating multiple robots to perform a task collectively.
Conventional and non-conventional reasoning systems in computational intelligence offer
diverse approaches to problem-solving and decision-making. While conventional systems rely on
rule-based and logical methods, non-conventional systems leverage approaches inspired by
nature and learning algorithms. The choice of reasoning system depends on the nature of the
problem, the type of data available, and the desired outcomes in a given AI application.
Conventional reasoning systems follow a rule-based approach, relying on explicit programming of rules and logical operations; non-conventional reasoning systems often adopt a learning-based approach, allowing systems to learn patterns and relationships from data.
In conventional systems, decisions are made based on predefined rules, making the reasoning process explicit and deterministic; non-conventional systems learn from examples, adapting and evolving their behavior over time.
In conventional systems, the decision-making process is often clear, providing a straightforward representation of knowledge and rules; non-conventional systems can manage imprecision, uncertainty, and incomplete information, providing more flexibility in decision-making.
UNIT - 4
Bayes' Theorem
Bayes' theorem is used to determine the probability of an event with uncertain knowledge. In probability theory, it relates the conditional probabilities and marginal probabilities of two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. The Bayesian inference is an
application of Bayes' theorem, which is fundamental to Bayesian statistics.
Bayes' theorem allows updating the probability prediction of an event by observing new information of
the real world.
Bayes' theorem is stated as:
P(A|B) = P(B|A) * P(A) / P(B)
P(A|B) is known as the posterior, which we need to calculate; it is read as the probability of hypothesis A given that evidence B has occurred.
P(B|A) is called the likelihood: assuming the hypothesis is true, it is the probability of the evidence.
P(A) is called the prior probability, the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability, the probability of the evidence.
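A tiny worked example in Python, assuming illustrative numbers for a disease-test scenario (1% prevalence, 90% sensitivity, 8% false-positive rate):

p_d = 0.01                       # prior P(disease)
p_pos_given_d = 0.90             # likelihood P(positive | disease)
p_pos_given_not_d = 0.08         # false-positive rate P(positive | no disease)

p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)   # marginal P(positive)
posterior = p_pos_given_d * p_d / p_pos                        # Bayes' theorem
print(round(posterior, 3))       # 0.102: the positive test raises the 1% prior to about 10%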
Bayesian Belief Network
A Bayesian belief network is a key computer technology for dealing with probabilistic events and solving problems that involve uncertainty. We can define a Bayesian network as:
"A Bayesian network is a probabilistic graphical model which represents a set of variables and their
conditional dependencies using a directed acyclic graph."
Bayesian networks are probabilistic, because these networks are built from a probability distribution,
and also use probability theory for prediction and anomaly detection.
Real world applications are probabilistic in nature, and to represent the relationship between multiple
events, we need a Bayesian network. It can also be used in various tasks including prediction, anomaly
detection, diagnostics, automated insight, reasoning, time series prediction, and decision making
under uncertainty.
Bayesian networks can be used for building models from data and experts' opinions, and a network consists of
two parts: the Directed Acyclic Graph (DAG) and the table of conditional probabilities (CPT).
The generalized form of a Bayesian network that represents and solves decision problems under uncertain
knowledge is known as an influence diagram.
A Bayesian network graph is made up of nodes and Arcs (directed links), where:
Each node corresponds to a random variable, and a variable can be continuous or discrete.
Arcs or directed arrows represent the causal relationships or conditional probabilities between
random variables. These directed links or arrows connect pairs of nodes in the graph. A link
indicates that one node directly influences the other node; if there is no directed link between
two nodes, they are independent of each other.
o In the above diagram, A, B, C, and D are random variables represented by the nodes of
the network graph.
o If we are considering node B, which is connected with node A by a directed arrow,
then node A is called the parent of Node B.
o Node C is independent of node A.
A worked numerical example (sum) is available at https://www.javatpoint.com/bayesian-belief-network-in-artificial-intelligence
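The factorisation behind a Bayesian network can be sketched in a few lines of Python: the joint probability of all variables is the product of each node's conditional probability given its parents. The three-node chain and its probability tables below are a separate hypothetical example, unrelated to the diagram described above.

```python
# Minimal sketch of the chain-rule factorisation used by a Bayesian network:
# P(X1, ..., Xn) = product over i of P(Xi | Parents(Xi)).
# The three-node chain A -> B -> C and all numbers are hypothetical.

p_a = {True: 0.3, False: 0.7}                      # P(A)
p_b_given_a = {True: {True: 0.8, False: 0.2},      # P(B | A)
               False: {True: 0.1, False: 0.9}}
p_c_given_b = {True: {True: 0.5, False: 0.5},      # P(C | B)
               False: {True: 0.05, False: 0.95}}

def joint(a, b, c):
    """P(A=a, B=b, C=c) from the conditional probability tables above."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# Probability that all three variables are True.
print(joint(True, True, True))   # 0.3 * 0.8 * 0.5 = 0.12
```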
Hidden Markov Models (HMMs) are a type of probabilistic model that are commonly used in machine
learning for tasks such as speech recognition, natural language processing, and bioinformatics. They are
a popular choice for modelling sequences of data because they can effectively capture the underlying
structure of the data, even when the data is noisy or incomplete. In this article, we will give a
comprehensive overview of Hidden Markov Models, including their mathematical foundations,
applications, and limitations.
A Hidden Markov Model (HMM) is a probabilistic model that consists of a sequence of hidden states,
each of which generates an observation. The hidden states are usually not directly observable, and the
goal of HMM is to estimate the sequence of hidden states based on a sequence of observations. An
HMM is defined by the following components:
The basic idea behind an HMM is that the hidden states generate the observations, and the observed
data is used to estimate the hidden state sequence. This is often referred to as the forward-backwards
algorithm.
The Hidden Markov Model (HMM) algorithm can be implemented using the following steps:
1. Define the state space and observation space: the state space is the set of all possible hidden states, and the observation space is the set of all possible observations.
2. Define the transition probabilities: these are the probabilities of transitioning from one state to another. They form the transition matrix, which describes the probability of moving from one state to another.
3. Define the emission probabilities: these are the probabilities of generating each observation from each state. They form the emission matrix, which describes the probability of generating each observation from each state.
4. Decode the hidden states: given the observed data, the Viterbi algorithm is used to compute the most likely sequence of hidden states. This can be used to predict future observations, classify sequences, or detect patterns in sequential data.
5. Evaluate the model: the performance of the HMM can be evaluated using various metrics, such as accuracy, precision, recall, or F1 score.
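As a minimal sketch of the decoding step, the snippet below runs the Viterbi algorithm on a tiny hypothetical two-state weather HMM; the states, observations, and all probabilities are made up for illustration.

```python
# Minimal sketch of Viterbi decoding for a tiny hypothetical HMM.
states = ["Rainy", "Sunny"]
observations = ["walk", "shop", "clean"]

start_p = {"Rainy": 0.6, "Sunny": 0.4}                        # initial distribution
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},             # transition matrix
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},  # emission matrix
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def viterbi(obs):
    """Return the most likely hidden state sequence for the observations."""
    # V[t][s] = (best probability of a path ending in state s at time t, that path)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p][0] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states)
            V[t][s] = (prob, V[t - 1][prev][1] + [s])
    return max(V[-1].values())

print(viterbi(observations))  # (0.01344, ['Sunny', 'Rainy', 'Rainy'])
```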
Speech Recognition - HMMs are widely used in speech recognition systems. The model is
trained on a large corpus of speech data, and the transitions between phonemes are modeled
using a Markov process. The output of the model is a sequence of phonemes, which can be used
to recognize words and sentences.
Natural Language Processing - HMMs are also used in natural language processing (NLP)
applications such as part-of-speech tagging and named entity recognition. In these applications,
the HMM is trained on a corpus of text data, and the model learns to predict the sequence of
parts of speech or named entities based on the context of the input text.
Bioinformatics - HMMs have applications in bioinformatics, where they are used for protein
structure prediction and sequence alignment. In protein structure prediction, the HMM is
trained on a set of known protein structures, and the model is used to predict the structure of
new proteins. In sequence alignment, HMMs are used to find the best match between two or
more protein or DNA sequences.
Finance - HMMs have applications in finance, where they are used for modeling stock prices,
interest rates, and credit risk. In stock price modeling, the HMM is used to predict the future
movement of stock prices based on historical data. In interest rate modeling, the HMM is used
to predict changes in interest rates based on economic factors. In credit risk modeling, the HMM
is used to predict the probability of default based on the borrower's credit history.
3. EM Algorithm
A latent variable model consists of both observable and unobservable variables where observable can
be predicted while unobserved are inferred from the observed variable. These unobservable variables
are known as latent variables.
Being an iterative approach, it consists of two modes. In the first mode, we estimate the missing or
latent variables. Hence it is referred to as the Expectation/estimation step (E-step). Further, the other
mode is used to optimize the parameters of the models so that it can explain the data more clearly. The
second mode is known as the maximization-step or M-step.
Convergence here has its intuitive probabilistic meaning: if two random variables have only a very small
difference in their probability, they are said to have converged. In other words, whenever the values of the
given variables stop changing and match each other from one iteration to the next, it is called convergence.
Steps in EM Algorithm
The EM algorithm is completed mainly in 4 steps, which include Initialization Step, Expectation Step,
Maximization Step, and convergence Step. These steps are explained as follows:
1st Step: The very first step is to initialize the parameter values. Further, the system is provided
with incomplete observed data with the assumption that data is obtained from a specific model.
2nd Step: This step is known as Expectation or E-Step, which is used to estimate or guess the
values of the missing or incomplete data using the observed data. Further, E-step primarily
updates the variables.
3rd Step: This step is known as Maximization or M-step, where we use complete data obtained
from the 2nd step to update the parameter values. Further, M-step primarily updates the
hypothesis.
4th step: The last step is to check if the values of latent variables are converging or not. If it gets
"yes", then stop the process; else, repeat the process from step 2 until the convergence occurs.
4. Reinforcement Learning
Reinforcement learning is used to find the best possible behavior or path an agent should take in a specific
situation. Reinforcement learning differs from supervised learning in that, in supervised learning, the
training data comes with the answer key, so the model is trained with the correct answers, whereas in
reinforcement learning there is no answer key and the reinforcement agent decides what to do to perform
the given task. In the absence of a training dataset, it is bound to learn from its experience.
Reinforcement learning is an autonomous, self-teaching system that essentially learns by trial and error.
It performs actions with the aim of maximizing rewards, or in other words, it is learning by doing in
order to achieve the best outcomes.
Reinforcement learning uses algorithms that learn from outcomes and decide which action to take next.
After each action, the algorithm receives feedback that helps it determine whether the choice it made
was correct, neutral or incorrect. It is a good technique to use for automated systems that have to make
a lot of small decisions without human guidance.
Elements of Reinforcement Learning:
Policy: A policy defines the learning agent's behavior at a given time. It is a mapping from perceived
states of the environment to actions to be taken when in those states.
Reward function: Reward function is used to define a goal in a reinforcement learning problem. A
reward function is a function that provides a numerical score based on the state of the environment
Value function: Value functions specify what is good in the long run. The value of a state is the total
amount of reward an agent can expect to accumulate over the future, starting from that state.
The agent has sensors to decide on its state in the environment and takes action that modifies its state.
The reinforcement learning problem is modelled as an agent continuously interacting with an
environment. The agent and the environment interact in a sequence of time steps. At each time
step t, the agent receives the state of the environment and a scalar numerical reward for the
previous action, and then selects an action.
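One common way to realize this loop is tabular Q-learning; the sketch below uses a hypothetical five-cell corridor environment in which the agent is rewarded only for reaching the rightmost cell, and every hyperparameter is an arbitrary illustrative choice.

```python
# Minimal sketch of tabular Q-learning on a hypothetical five-cell corridor:
# the agent starts in cell 0 and gets a reward of +1 only when it reaches cell 4.
import random

n_states, actions = 5, [-1, +1]           # actions: move left / move right
alpha, gamma, epsilon = 0.5, 0.9, 0.2     # learning rate, discount, exploration
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}

def greedy_action(state):
    """Pick the best-valued action, breaking ties at random."""
    best = max(Q[(state, a)] for a in actions)
    return random.choice([a for a in actions if Q[(state, a)] == best])

for episode in range(500):
    state = 0
    while state != n_states - 1:
        # Epsilon-greedy: mostly exploit the learned values, sometimes explore.
        action = random.choice(actions) if random.random() < epsilon else greedy_action(state)
        next_state = min(max(state + action, 0), n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Q-learning update: move Q(s, a) toward reward + gamma * max_a' Q(s', a').
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# The learned greedy policy should move right (+1) from every non-terminal cell.
print([greedy_action(s) for s in range(n_states - 1)])
```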
5. SVM
Support Vector Machine or SVM is one of the most popular Supervised Learning algorithms, which is
used for Classification as well as Regression problems. However, primarily, it is used for Classification
problems in Machine Learning.
The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-
dimensional space into classes so that we can easily put the new data point in the correct category in
the future. This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are
called support vectors, and hence the algorithm is termed a Support Vector Machine. Consider the
below diagram, in which two different categories are classified using a decision boundary or
hyperplane:
Hyperplane: There can be multiple lines/decision boundaries to segregate the classes in n-dimensional
space, but we need to find out the best decision boundary that helps to classify the data points. This
best boundary is known as the hyperplane of SVM.
The dimensions of the hyperplane depend on the number of features present in the dataset: if there
are 2 features (as shown in the image), the hyperplane will be a straight line, and if there are 3 features,
the hyperplane will be a two-dimensional plane.
We always create the hyperplane that has the maximum margin, i.e. the maximum distance between
the hyperplane and the nearest data points of either class.
Support Vectors:
The data points or vectors that are closest to the hyperplane and which affect the position of the
hyperplane are termed support vectors. Since these vectors support the hyperplane, they are called
support vectors.
Types of SVM
Linear SVM: Linear SVM is used for linearly separable data, which means that if a dataset can be
classified into two classes by using a single straight line, then such data is termed linearly
separable data, and the classifier used is called a Linear SVM classifier.
Linear SVMs work by finding a hyperplane that separates data points into different classes in a
linearly separable space. The hyperplane is a linear equation that is defined by the weights and
biases of the model. The decision boundary is a straight line that separates the classes. Linear
SVMs are efficient and easy to interpret, but they can only be used for linearly separable data.
Non-linear SVM: Non-Linear SVM is used for non-linearly separable data, which means that if a
dataset cannot be classified by using a straight line, then such data is termed non-linear data,
and the classifier used is called a Non-linear SVM classifier.
Non-linear SVMs are used when the data is not linearly separable. In non-linear SVMs, the data
is mapped to a higher-dimensional feature space where it becomes linearly separable. The
mapping is done using a kernel function, which transforms the data from the input space to a
higher-dimensional space. Once the data is mapped, a hyperplane is used to separate the
classes. The decision boundary is a curved line that separates the classes. Non-linear SVMs are
more powerful than linear SVMs, but they are also more computationally expensive.
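A small sketch of both variants, assuming scikit-learn as the library (an assumption; the notes do not prescribe one), could look like this. The "moons" dataset is a standard non-linearly separable toy example.

```python
# Minimal sketch of linear vs. non-linear (kernel) SVMs using scikit-learn.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# A toy two-class dataset that is NOT linearly separable.
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear", C=1.0).fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)

print("linear SVM accuracy:", linear_svm.score(X_test, y_test))
print("RBF-kernel SVM accuracy:", rbf_svm.score(X_test, y_test))
print("number of support vectors:", rbf_svm.n_support_)
```

On this data the RBF-kernel SVM should score noticeably higher than the linear one, mirroring the trade-off described above.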
6. Decision Tree
Decision Tree is a Supervised learning technique that can be used for both classification and
Regression problems, but mostly it is preferred for solving Classification problems. It is a tree-
structured classifier, where internal nodes represent the features of a dataset, branches
represent the decision rules and each leaf node represents the outcome.
In a Decision tree, there are two types of nodes, which are the Decision Node and the Leaf Node. Decision
nodes are used to make decisions and have multiple branches, whereas Leaf nodes are the
outputs of those decisions and do not contain any further branches.
The decisions or the test are performed on the basis of features of the given dataset.
It is a graphical representation for getting all the possible solutions to a problem/decision
based on given conditions.
It is called a decision tree because, similar to a tree, it starts with the root node, which expands
on further branches and constructs a tree-like structure.
In order to build a tree, we use the CART algorithm, which stands for Classification and
Regression Tree algorithm.
A decision tree simply asks a question and, based on the answer (Yes/No), further splits the
tree into subtrees.
Below diagram explains the general structure of a decision tree:
Root Node: Root node is from where the decision tree starts. It represents the entire dataset,
which further gets divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output node, and the tree cannot be segregated further
after getting a leaf node.
Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes
according to the given conditions.
Branch/Sub Tree: A tree formed by splitting the tree.
Pruning: Pruning is the process of removing the unwanted branches from the tree.
Parent/Child node: The root node of the tree is called the parent node, and other nodes are
called the child nodes.
In a decision tree, for predicting the class of a given dataset, the algorithm starts from the root node
of the tree. The algorithm compares the value of the root attribute with the corresponding attribute of
the record (real dataset) and, based on the comparison, follows the branch and jumps to the next node.
For the next node, the algorithm again compares the attribute value with the other sub-nodes and moves
further. It continues this process until it reaches a leaf node of the tree. The complete process can be
better understood using the below algorithm:
Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
Step-3: Divide S into subsets that contain the possible values for the best attribute.
Step-4: Generate the decision tree node which contains the best attribute.
Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3.
Continue this process until a stage is reached where you cannot classify the nodes any further;
the final node is then called a leaf node.
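Step-2 above leaves the Attribute Selection Measure open; one common choice is information gain based on entropy. The sketch below computes it for a tiny hypothetical dataset (the attribute names and rows are made up for illustration).

```python
# Minimal sketch of one common Attribute Selection Measure: information gain.
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attribute, target="label"):
    """Reduction in entropy obtained by splitting the rows on one attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

data = [
    {"outlook": "sunny",    "windy": False, "label": "no"},
    {"outlook": "sunny",    "windy": True,  "label": "no"},
    {"outlook": "overcast", "windy": False, "label": "yes"},
    {"outlook": "rain",     "windy": False, "label": "yes"},
    {"outlook": "rain",     "windy": True,  "label": "no"},
]

# The attribute with the highest gain would become the decision node.
print(information_gain(data, "outlook"), information_gain(data, "windy"))
```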
Pruning is a process of deleting the unnecessary nodes from a tree in order to get the optimal decision
tree.
A too-large tree increases the risk of overfitting, while a small tree may not capture all the important
features of the dataset. A technique that decreases the size of the learning tree without reducing
accuracy is therefore known as pruning. There are mainly two tree pruning techniques used: Cost
Complexity Pruning and Reduced Error Pruning.
Advantages of the Decision Tree:
It is simple to understand, as it follows the same process which a human follows while making a
decision in real life.
It can be very useful for solving decision-related problems.
It helps to think about all the possible outcomes for a problem.
There is less requirement of data cleaning compared to other algorithms.
7. Statistical Learning
Statistical learning is a fundamental concept in artificial intelligence (AI) and machine learning (ML). It
refers to the process of developing algorithms and models that can automatically learn patterns and
make predictions or decisions from data. Statistical learning methods use statistical techniques to
analyze and interpret data, allowing machines to learn from examples and make informed decisions.
In AI, statistical learning is often used in supervised learning tasks, where the algorithm learns from
labeled training data to make predictions or classify new, unseen data points. The algorithm learns a
function or model that maps input features to output labels by optimizing a specific objective or loss
function. Examples of supervised learning algorithms include:
Linear Regression: Linear regression is used for predicting a continuous dependent variable
based on one or more independent variables. It fits a linear equation to the data by minimizing
the sum of squared differences between the observed and predicted values.
Logistic Regression: Logistic regression is used for binary classification problems. It models the
relationship between a set of independent variables and the probability of a binary outcome
using a logistic function.
Decision Trees: Decision trees are versatile algorithms used for both classification and regression
tasks. They split the data based on the values of input features to create a tree-like model that
can be used for predictions.
Support Vector Machines (SVM): SVM is a powerful algorithm used for both classification and
regression tasks. It finds an optimal hyperplane that separates the data into different classes
while maximizing the margin between the classes.
Naive Bayes: Naive Bayes is a probabilistic classifier based on Bayes' theorem. It assumes that
features are independent of each other, which is a naive assumption, but often performs well in
practice.
In statistical learning, the underlying assumption is that the observed data has some underlying
statistical properties and patterns. By using statistical techniques, such as probability theory,
optimization methods, and hypothesis testing, algorithms can uncover these patterns and make
predictions or inferences about future or unseen data.
Statistical learning also encompasses unsupervised learning, where the algorithm learns from unlabeled
data to discover patterns, structures, or relationships within the data. Clustering algorithms, such as k-
means clustering or hierarchical clustering, and dimensionality reduction techniques, such as principal
component analysis (PCA), are common examples of unsupervised learning algorithms.
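As a minimal sketch of the k-means clustering mentioned above, the following NumPy snippet alternates the assignment and update steps on two synthetic blobs; the data and the choice of k = 2 are arbitrary.

```python
# Minimal sketch of k-means clustering (unsupervised learning) with NumPy.
import numpy as np

rng = np.random.default_rng(42)
points = np.vstack([rng.normal([0, 0], 0.5, (100, 2)),
                    rng.normal([3, 3], 0.5, (100, 2))])

k = 2
centroids = points[[0, 100]].copy()   # one initial centroid from each blob

for _ in range(20):
    # Assignment step: attach each point to its nearest centroid.
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    # Update step: move each centroid to the mean of its assigned points.
    centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])

print(centroids)   # should end up near (0, 0) and (3, 3)
```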
1. Finance: Statistical learning is used for credit scoring, fraud detection, stock market prediction,
portfolio optimization, and risk assessment.
2. Healthcare: It is employed for medical image analysis, disease diagnosis, patient monitoring,
drug discovery, and personalized medicine.
3. Marketing and Advertising: Statistical learning helps in customer segmentation, market analysis,
targeted advertising, recommender systems, and customer churn prediction.
4. Natural Language Processing (NLP): Statistical learning techniques are used for sentiment
analysis, text classification, information extraction, machine translation, and speech recognition.
8. Regression and Classification
Regression and Classification algorithms are Supervised Learning algorithms. Both the algorithms are
used for prediction in Machine learning and work with the labeled datasets. But the difference between
both is how they are used for different machine learning problems.
Classification:
Classification is a process of finding a function which helps in dividing the dataset into classes based on
different parameters. In Classification, a computer program is trained on the training dataset and based
on that training, it categorizes the data into different classes.
The task of the classification algorithm is to find the mapping function to map the input(x) to the
discrete output(y).
Example: The best example to understand the Classification problem is Email Spam Detection. The
model is trained on the basis of millions of emails on different parameters, and whenever it receives a
new email, it identifies whether the email is spam or not. If the email is spam, then it is moved to the
Spam folder.
Some popular classification algorithms are:
Logistic Regression
K-Nearest Neighbours
Support Vector Machines
Kernel SVM
Naïve Bayes
Decision Tree Classification
Random Forest Classification
Regression:
Regression is a process of finding the correlations between dependent and independent variables. It
helps in predicting the continuous variables such as prediction of Market Trends, prediction of House
prices, etc.
The task of the Regression algorithm is to find the mapping function to map the input variable(x) to the
continuous output variable(y).
Example: Suppose we want to do weather forecasting, so for this, we will use the Regression algorithm.
In weather prediction, the model is trained on the past data, and once the training is completed, it can
easily predict the weather for future days.
Supervised learning is a machine learning method in which models are trained using labeled data. In
supervised learning, models need to find the mapping function to map the input variable (X) with the
output variable (Y).
In supervised learning, models are trained using a labelled dataset, where the model learns about each
type of data. Once the training process is completed, the model is tested on the basis of test data (data
held out from training), and then it predicts the output.
1. Regression
Regression algorithms are used if there is a relationship between the input variable and the output
variable. It is used for the prediction of continuous variables, such as Weather forecasting, Market
Trends, etc. Below are some popular Regression algorithms which come under supervised learning:
Linear Regression
Regression Trees
Non-Linear Regression
Bayesian Linear Regression
Polynomial Regression
2. Classification
Classification algorithms are used when the output variable is categorical, which means there are two
classes, such as Yes-No, Male-Female, True-False, etc. Spam filtering is a typical example. Popular
classification algorithms which come under supervised learning include:
Random Forest
Decision Trees
Logistic Regression
Support Vector Machines
Example: Suppose we have an image of different types of fruits. The task of our supervised learning
model is to identify the fruits and classify them accordingly. So to identify the image in supervised
learning, we will give the input data as well as output for that, which means we will train the model by
the shape, size, color, and taste of each fruit. Once the training is completed, we will test the model by
giving the new set of fruit. The model will identify the fruit and predict the output using a suitable
algorithm.
Unsupervised Machine Learning:
Unsupervised learning is another machine learning method in which patterns are inferred from
unlabeled input data. The goal of unsupervised learning is to find the structure and patterns in the
input data. Unsupervised learning does not need any supervision. Instead, it finds patterns from the
data on its own.
Clustering: Clustering is a method of grouping objects into clusters such that the objects with
the most similarities remain in one group and have few or no similarities with the objects of another
group. Cluster analysis finds the commonalities between the data objects and categorizes them
as per the presence and absence of those commonalities.
Association: An association rule is an unsupervised learning method which is used for finding
relationships between variables in a large database. It determines the set of items that
occur together in the dataset. Association rules make marketing strategies more effective; for
example, people who buy item X (say, bread) also tend to purchase item Y (butter/jam). A
typical example of an association rule is Market Basket Analysis.
Example: So unlike supervised learning, here we will not provide any supervision to the model. We will
just provide the input dataset to the model and allow the model to find the patterns from the data. With
the help of a suitable algorithm, the model will train itself and divide the fruits into different groups
according to the most similar features between them.
9. ANN
Artificial Neural Networks contain artificial neurons, which are called units. These units are arranged in a
series of layers that together constitute the whole Artificial Neural Network in a system. A layer can
have just a dozen units or millions of units, depending on how complex the neural network must be to
learn the hidden patterns in the dataset. Commonly, an Artificial Neural Network has an input
layer, an output layer as well as hidden layers. The input layer receives data from the outside world
which the neural network needs to analyze or learn about. Then this data passes through one or
multiple hidden layers that transform the input into data that is valuable for the output layer. Finally,
the output layer provides an output in the form of a response of the Artificial Neural Networks to input
data provided.
In the majority of neural networks, units are interconnected from one layer to another. Each of these
connections has weights that determine the influence of one unit on another unit. As the data transfers
from one unit to another, the neural network learns more and more about the data which eventually
results in an output from the output layer.
The structures and operations of human neurons serve as the basis for artificial neural networks. It is
also known as neural networks or neural nets. The input layer of an artificial neural network is the first
layer, and it receives input from external sources and releases it to the hidden layer, which is the second
layer. In the hidden layer, each neuron receives input from the previous layer neurons, computes the
weighted sum, and sends it to the neurons in the next layer. These connections are weighted, meaning
the effect of each input from the previous layer is scaled up or down by the weight assigned to it, and
these weights are adjusted during the training process to improve model performance.
Structure: The structure of artificial neural networks is inspired by biological neurons. A biological
neuron has a cell body or soma to process the impulses, dendrites to receive them, and an axon that
transfers them to other neurons. The input nodes of artificial neural networks receive input signals, the
hidden layer nodes compute these input signals, and the output layer nodes compute the final output
by processing the hidden layer’s results using activation functions.
Synapses: Synapses are the links between biological neurons that enable the transmission of impulses
from dendrites to the cell body. Synapses are the weights that join the one-layer nodes to the next-layer
nodes in artificial neurons. The strength of the links is determined by the weight value.
Learning: In biological neurons, learning happens in the cell body nucleus or soma, which has a nucleus
that helps to process the impulses. An action potential is produced and travels through the axons if the
impulses are powerful enough to reach the threshold. This becomes possible by synaptic plasticity,
which represents the ability of synapses to become stronger or weaker over time in reaction to changes
in their activity. In artificial neural networks, backpropagation is a technique used for learning, which
adjusts the weights between nodes according to the error or differences between predicted and actual
outcomes.
Activation: In biological neurons, activation is the firing rate of the neuron which happens when the
impulses are strong enough to reach the threshold. In artificial neural networks, a mathematical
function known as an activation function maps the input to the output and executes activations.
Feedforward Neural Network: The feedforward neural network is one of the most basic artificial neural
networks. In this ANN, the data or the input provided travels in a single direction. It enters into the ANN
through the input layer and exits through the output layer while hidden layers may or may not exist. So
the feedforward neural network has a front-propagated wave only and usually does not have
backpropagation.
Convolutional Neural Network: A Convolutional neural network has some similarities to the feed-
forward neural network, where the connections between units have weights that determine the
influence of one unit on another unit. But a CNN has one or more than one convolutional layer that uses
a convolution operation on the input and then passes the result obtained in the form of output to the
next layer. CNN has applications in speech and image processing which is particularly useful in computer
vision.
Modular Neural Network: A Modular Neural Network contains a collection of different neural networks
that work independently towards obtaining the output with no interaction between them. Each of the
different neural networks performs a different sub-task by obtaining unique inputs compared to other
networks. The advantage of this modular neural network is that it breaks down a large and complex
computational process into smaller components, thus decreasing its complexity while still obtaining the
required output.
Radial basis function Neural Network: Radial basis functions are functions whose value depends on the
distance of a point from a center. RBF networks have two layers. In the first layer, the input is
mapped into all the Radial basis functions in the hidden layer and then the output layer computes the
output in the next step. Radial basis function nets are normally used to model the data that represents
any underlying trend or function.
Recurrent Neural Network: The Recurrent Neural Network saves the output of a layer and feeds this
output back to the input to better predict the outcome of the layer. The first layer in the RNN is quite
similar to the feed-forward neural network and the recurrent neural network starts once the output of
the first layer is computed. After this layer, each unit will remember some information from the
previous step so that it can act as a memory cell in performing computations.
Applications of Artificial Neural Networks
Social Media: Artificial Neural Networks are used heavily in Social Media. For example, let’s take the
‘People you may know’ feature on Facebook that suggests people that you might know in real life so
that you can send them friend requests. Well, this magical effect is achieved by using Artificial Neural
Networks that analyze your profile, your interests, your current friends, and also their friends and
various other factors to calculate the people you might potentially know. Another common application
of Machine Learning in social media is facial recognition. This is done by finding around 100 reference
points on the person’s face and then matching them with those already available in the database using
convolutional neural networks.
Marketing and Sales: When you log onto e-commerce sites like Amazon and Flipkart, they recommend
products for you to buy based on your previous browsing history. Similarly, suppose you love
Pasta, then Zomato, Swiggy, etc. will show you restaurant recommendations based on your tastes and
previous order history. This is true across all new-age marketing segments like Book sites, Movie
services, Hospitality sites, etc. and it is done by implementing personalized marketing. This uses Artificial
Neural Networks to identify the customer likes, dislikes, previous shopping history, etc., and then tailor
the marketing campaigns accordingly.
Healthcare: Artificial Neural Networks are used in Oncology to train algorithms that can identify
cancerous tissue at the microscopic level at the same accuracy as trained physicians. Various rare
diseases may manifest in physical characteristics and can be identified in their premature stages by
using Facial Analysis on the patient photos. So the full-scale implementation of Artificial Neural
Networks in the healthcare environment can only enhance the diagnostic abilities of medical experts
and ultimately lead to the overall improvement in the quality of medical care all over the world.
Personal Assistants: You have surely heard of Siri, Alexa, Cortana, etc., and have probably used one of
them on your phone. These are personal assistants and an example of speech recognition that uses
Natural Language Processing to interact with the users and formulate a response accordingly. Natural
Language Processing uses artificial neural networks that are made to handle many tasks of these
personal assistants such as managing the language syntax, semantics, correct speech, the conversation
that is going on, etc.
10. Feed Forward Neural Network
"The process of receiving an input to produce some kind of output to make some kind of prediction is
known as Feed Forward." Feed Forward neural network is the core of many other important neural
networks such as convolution neural network.
In the feed-forward neural network, there are no feedback loops or connections in the network.
There is simply an input layer, a hidden layer, and an output layer.
• Input Layer:
The input layer accepts the input data and passes it to the next layer.
• Hidden Layers:
One or more hidden layers that process and transform the input data. Each hidden layer has a
set of neurons connected to the neurons of the previous and next layers. These layers use
activation functions, such as ReLU or sigmoid, to introduce non-linearity into the network,
allowing it to learn and model more complex relationships between the inputs and outputs.
• Output Layer:
The output layer generates the final output. Depending on the type of problem, the number of
neurons in the output layer may vary. For example, in a binary classification problem, it would
only have one neuron. In contrast, a multi-class classification problem would have as many
neurons as the number of classes.
The purpose of a feedforward neural network is to approximate certain functions. The input to the
network is a vector of values, x, which is passed through the network, layer by layer, and transformed
into an output, y. The network's final output predicts the target function for the given input. The
network makes this prediction using a set of parameters, θ (theta), adjusted during training to minimize
the error between the network's predictions and the target function.
The training involves adjusting the θ (theta) values to minimize errors. This is done by presenting the
network with a set of input-output pairs (also called training data) and computing the error between
the network's prediction and the true output for each pair. This error is then used to compute
the gradient of the error concerning the parameters, which tells us how to adjust the parameters to
reduce the error. This is done using optimization techniques like gradient descent. Once the training
process is completed, the network has " learned " the function and can be used to predict new input.
Finally, the network stores this optimal value of θ (theta) in its memory, so it can use it to predict new
inputs.
• I:
Input node (the starting point for data entering the neural network)
• W:
Connection weight (used to determine the strength of the connection between nodes)
• H:
Hidden node (a layer within the network that processes input)
• HA:
Activated hidden node (the value of the hidden node after passing through a predefined
function)
• O:
Output node (the final output of the network, calculated as a weighted sum of the last hidden
layer)
• OA:
Activated output node (the final output of the network after passing through a predefined
function)
• B:
Bias node (a constant value, typically set to 1.0, used to adjust the output of the network)
Activation Function:
- Common Types:
- Sigmoid: Maps to values between 0 and 1 (for binary classification).
- ReLU: Outputs input for positive values, 0 for negatives (common and computationally efficient).
- Usage: Each neuron can have its own activation function, but it is common to use the same function for all
neurons in a layer. In some cases, a linear activation function is used in the output layer for regression
problems.
Training
To train a feedforward neural network, the following steps are typically followed:
1. Initialize the weights and biases of the network.
2. Feed the training inputs forward through the network to obtain predictions.
3. Compute the error (loss) between the predictions and the true outputs.
4. Backpropagate the error to obtain the gradient of the loss with respect to the parameters θ.
5. Update the parameters using an optimization technique such as gradient descent, and repeat until the error stops decreasing.
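A minimal NumPy sketch of this loop for a tiny one-hidden-layer network on the XOR problem is shown below; the layer sizes, learning rate, and data are arbitrary choices for illustration.

```python
# Minimal sketch of training a tiny feed-forward network (one hidden layer,
# sigmoid activations) with gradient descent on the XOR problem.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros((1, 4))   # input  -> hidden
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros((1, 1))   # hidden -> output
lr = 1.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(10000):
    # Feed forward: input layer -> hidden layer -> output layer.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backpropagation: error gradients flow from the output layer backwards.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Gradient-descent parameter updates (the theta values in the text).
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(out.round(3))   # predictions typically approach [[0], [1], [1], [0]]
```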
1. NLP
NLP stands for Natural Language Processing, which is a part of Computer Science, Human language,
and Artificial Intelligence. It is the technology that is used by machines to understand, analyse,
manipulate, and interpret human's languages. It helps developers to organize knowledge for performing
tasks such as translation, automatic summarization, Named Entity Recognition (NER), speech
recognition, relationship extraction, and topic segmentation.
Advantages of NLP
NLP helps users to ask questions about any subject and get a direct response within seconds.
NLP offers exact answers to the question means it does not offer unnecessary and unwanted
information.
NLP helps computers to communicate with humans in their languages.
It is very time efficient.
Most companies use NLP to improve the efficiency of documentation processes and the accuracy
of documentation, and to identify information from large databases.
Disadvantages of NLP
Components of NLP
Natural Language Understanding (NLU) helps the machine to understand and analyse human language
by extracting the metadata from content such as concepts, entities, keywords, emotion, relations, and
semantic roles.
NLU is mainly used in business applications to understand the customer's problem in both spoken and
written language.
Natural Language Generation (NLG) acts as a translator that converts the computerized data into natural
language representation. It mainly involves Text planning, Sentence planning, and Text Realization.
Applications of NLP
1. Question Answering
2. Spam Detection
3. Sentiment Analysis
4. Machine Translation
Step 1: Sentence Segmentation
Sentence segmentation is the first step in building the NLP pipeline. It breaks the paragraph into separate
sentences. For example:
Independence Day is one of the important festivals for every Indian citizen. It is celebrated on the
15th of August each year ever since India got independence from the British rule. The day celebrates
independence in the true sense.
1. "Independence Day is one of the important festivals for every Indian citizen."
2. "It is celebrated on the 15th of August each year ever since India got independence from the
British rule."
3. "This day celebrates independence in the true sense."
Step 2: Word Tokenization
Word tokenization is used to break a sentence into separate words or tokens.
Example:
JavaTpoint offers Corporate Training, Summer Training, Online Training, and Winter Training.
Step 3: Stemming
Stemming is used to normalize words into their base form or root form. For example, celebrates,
celebrated and celebrating all originate from the single root word "celebrate." The big problem with
stemming is that sometimes it produces a root word which may not have any meaning. For example,
intelligence, intelligent, and intelligently all reduce to the single root "intelligen," and in English the
word "intelligen" does not have any meaning.
Step 4: Lemmatization
Lemmatization is quite similar to stemming. It is used to group different inflected forms of a word,
called the lemma. The main difference between stemming and lemmatization is that lemmatization
produces a root word which has a meaning.
For example: in lemmatization, the words intelligence, intelligent, and intelligently have the root word
intelligent, which has a meaning.
Step 5: Identifying Stop Words
In English, there are a lot of words that appear very frequently, like "is", "and", "the", and "a". NLP
pipelines will flag these words as stop words. Stop words might be filtered out before doing any
statistical analysis.
Note: When you are building a rock band search engine, then you do not ignore the word "The."
Step 7: POS Tags
POS stands for parts of speech, which include noun, verb, adverb, and adjective. It indicates how a
word functions, both in meaning and grammatically, within the sentence. A word can have one or more
parts of speech depending on the context in which it is used.
Step 8: Named Entity Recognition (NER)
Named Entity Recognition (NER) is the process of detecting named entities such as a person name,
movie name, organization name, or location.
Example: Steve Jobs introduced iPhone at the Macworld Conference in San Francisco, California.
Step 9: Chunking
Chunking is used to collect individual pieces of information and group them into bigger pieces of the
sentence.
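The pipeline steps above can be sketched with NLTK (an assumed library choice; the notes do not name one). The required NLTK data packages are noted in the comments and must be downloaded once with nltk.download(...).

```python
# Minimal sketch of the NLP pipeline steps using NLTK.
import nltk
from nltk.corpus import stopwords                        # needs 'stopwords'
from nltk.stem import PorterStemmer, WordNetLemmatizer   # lemmatizer needs 'wordnet'
from nltk import sent_tokenize, word_tokenize, pos_tag, ne_chunk
# sent/word_tokenize need 'punkt'; pos_tag needs 'averaged_perceptron_tagger';
# ne_chunk needs 'maxent_ne_chunker' and 'words'.

text = ("Steve Jobs introduced iPhone at the Macworld Conference "
        "in San Francisco, California. It was celebrated widely.")

sentences = sent_tokenize(text)                  # Step 1: sentence segmentation
tokens = word_tokenize(sentences[0])             # Step 2: word tokenization
stems = [PorterStemmer().stem(t) for t in tokens]             # Step 3: stemming
lemmas = [WordNetLemmatizer().lemmatize(t) for t in tokens]   # Step 4: lemmatization
content = [t for t in tokens                     # Step 5: drop stop words
           if t.lower() not in stopwords.words("english")]
tagged = pos_tag(tokens)                         # Step 7: POS tagging
entities = ne_chunk(tagged)                      # Steps 8-9: NER and chunking

print(sentences, stems, lemmas, content, tagged, sep="\n")
```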
Phases of NLP
1. Lexical Analysis
The first phase of NLP is lexical analysis. This phase scans the source text as a stream of characters
and converts it into meaningful lexemes. It divides the whole text into paragraphs, sentences, and
words.
2. Syntactic Analysis
Syntactic analysis is used to check grammar and word arrangement, and it shows the relationships
among the words.
In the real world, "Agra goes to the Poonam" does not make any sense, so this sentence is rejected by
the syntactic analyzer.
3. Semantic Analysis
Semantic analysis is concerned with the meaning representation. It mainly focuses on the literal
meaning of words, phrases, and sentences.
4. Discourse Integration
Discourse Integration depends upon the sentences that precede it and also invokes the meaning of the
sentences that follow it.
5. Pragmatic Analysis
Pragmatic analysis is the fifth and last phase of NLP. It helps you to discover the intended effect by
applying a set of rules that characterize cooperative dialogues.
Information Retrieval (IR) can be defined as a software program that deals with the organization,
storage, retrieval, and evaluation of information from document repositories, particularly textual
information. Information Retrieval is the activity of obtaining material of an unstructured nature,
usually text, that satisfies an information need from within large collections stored on computers.
For example, an information retrieval process begins when a user enters a query into the system.
An IR system has the ability to represent, store, organize, and access information items. A set of
keywords is required to search. Keywords are what people are searching for in search engines. These
keywords summarize the description of the information.
What is an IR Model?
An Information Retrieval (IR) model selects and ranks the document that is required by the user or the
user has asked for in the form of a query. The documents and the queries are represented in a similar
manner, so that document selection and ranking can be formalized by a matching function that returns a
retrieval status value (RSV) for each document in the collection. Many of the Information Retrieval
systems represent document contents by a set of descriptors, called terms, belonging to a vocabulary V.
An IR model determines the query-document matching function. The main steps of an information retrieval system are:
Acquisition: In this step, the selection of documents and other objects from various web
resources that consist of text-based documents takes place. The required data is collected by web
crawlers and stored in the database.
Representation: It consists of indexing, which uses free-text terms, controlled vocabulary, and both
manual and automatic techniques. For example, abstracting involves summarizing, and a bibliographic
description contains the author, title, source, date, and metadata.
File Organization: There are two basic file organization methods: sequential (documents are stored
document by document) and inverted (records are stored term by term, with a list of records under
each term); a combination of both can also be used (see the inverted-index sketch after this list).
Query: An IR process starts when a user enters a query into the system. Queries are formal
statements of information needs, for example, search strings in web search engines. In
information retrieval, a query does not uniquely identify a single object in the collection. Instead,
several objects may match the query, perhaps with different degrees of relevancy.
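As a sketch of the inverted file organization and keyword-based query matching described above, the snippet below builds a tiny inverted index over three hypothetical documents and answers AND-style keyword queries.

```python
# Minimal sketch of an inverted index (term -> documents containing it)
# and a simple AND-style keyword query. The toy documents are hypothetical.
from collections import defaultdict

documents = {
    1: "information retrieval deals with storage and retrieval of documents",
    2: "an inverted index maps each term to the documents that contain it",
    3: "web search engines rank documents for a user query",
}

# Indexing (representation step): record, term by term, which documents hold it.
inverted_index = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.lower().split():
        inverted_index[term].add(doc_id)

def search(query):
    """Return the ids of documents containing every keyword of the query."""
    result = None
    for kw in query.lower().split():
        postings = inverted_index.get(kw, set())
        result = postings if result is None else result & postings
    return sorted(result or [])

print(search("documents retrieval"))   # -> [1]
print(search("documents query"))       # -> [3]
```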
User Interaction With Information Retrieval System
The user's information need first has to be translated into a query. In an information retrieval system,
the query is a set of words that convey the semantics of the information that is required, whereas in
a data retrieval system, a query expression is used to convey the constraints which must be satisfied by
the objects. Example: a user wants to search for something but ends up searching for something else;
this means that the user is browsing and not searching. The above figure shows the interaction of the
user through different tasks.
3. Information Extraction
Information Extraction's main goal is to find meaningful information in a document set. IE is
one type of IR. IE automatically extracts structured information from a set of unstructured documents or
a corpus. IE focuses more on texts that can be read and written by humans and utilizes them with NLP
(natural language processing). An information retrieval system, by contrast, finds information that is
relevant to the user's information need and that is stored on a computer; it returns documents of text
(in unstructured form) from a large collection of corpora.
An information extraction system used for online text extraction should come at a low cost, be flexible
to develop, and be easy to adapt to new domains. Take natural language processing by a machine as an
example: here IE (information extraction) is able to recognize the information a person needs. Using
information extraction, we want to make a machine
capable of extracting structured information from documents. The importance of an information
extraction system is determined by the growing amount of information available in unstructured form
(data without metadata), like on the Internet. This knowledge can be made more accessible utilizing
transformation into relational form, or by marking-up with XML tags.
Automated learning systems are commonly used in information extraction. This type of IE system
decreases the faults in information extraction and also reduces dependence on a domain by diminishing
the requirement for supervision. IE of structured information relies on the basic content management
principle: "Content must be in context to have value". Information Extraction is more difficult than
Information Retrieval.
4. Machine Translation
Machine translation of languages refers to the use of artificial intelligence (AI) and machine learning
algorithms to automatically translate text or speech from one language to another. This technology has
been developed over the years and has become increasingly sophisticated, with the ability to produce
accurate translations across a wide range of languages.
Machine translation (MT) is a subfield of natural language processing (NLP) that focuses on
automatically translating text or speech from one language to another. It involves the use of
computational algorithms and models to enable computers to understand the structure, meaning, and
context of the source language and generate equivalent translations in the target language.
1. Rule-Based Machine Translation (RBMT): RBMT relies on a set of linguistic rules and dictionaries that
are manually created by human experts. These rules define the grammatical and syntactic structures of
both the source and target languages. RBMT systems often require extensive linguistic knowledge and
rule engineering, making them labor-intensive to develop and maintain.
2. Statistical Machine Translation (SMT): SMT relies on statistical models that are trained on large
parallel corpora, which are collections of aligned source and target language texts. These models learn
the probabilities of word or phrase translations based on their co-occurrence patterns in the training
data. SMT models use techniques like phrase-based translation and language models to generate
translations. They are flexible and can handle complex language phenomena but may struggle with
translating rare or unseen phrases.
3. Neural Machine Translation (NMT): NMT is an advanced approach that utilizes neural networks,
specifically recurrent neural networks (RNNs) or transformers, to translate text. NMT models learn an
end-to-end mapping between source and target language sequences, allowing them to capture long-
range dependencies and generate fluent translations. NMT has shown significant improvements over
traditional approaches and is currently the dominant method in machine translation research and
applications.
4. Hybrid Approaches: Hybrid approaches combine the strengths of different machine translation
techniques. For example, a hybrid system may use rule-based methods to handle specific linguistic rules
and exceptions, while employing statistical or neural models for general translation tasks.
Machine translation faces several challenges, including ambiguity, idiomatic expressions, word sense
disambiguation, and handling language-specific nuances. Translating accurately and capturing the
intended meaning can be particularly challenging when dealing with languages that have different word
orders or grammatical structures.
Despite these challenges, machine translation has made significant advancements over the years and is
widely used for various applications, including web page translation, multilingual customer support,
localization of software and content, and language accessibility. Machine translation systems continue
to improve with the availability of larger parallel corpora, advancements in neural network
architectures, and the integration of techniques such as transfer learning and reinforcement learning.
i. Syntactic Analysis
Syntactic analysis, or parsing, or syntax analysis, is a phase of NLP. The purpose of this phase is to draw
the exact, or dictionary, meaning from the text. Syntax analysis checks the text for meaningfulness by
comparing it to the rules of formal grammar; a phrase like "hot ice-cream", for example, would be
rejected by the semantic analyzer.
In this sense, syntactic analysis or parsing may be defined as the process of analyzing strings of
symbols in natural language conforming to the rules of formal grammar. The word 'parsing' originates
from the Latin word 'pars', which means 'part'.
Concept of Parser
It is used to implement the task of parsing. It may be defined as the software component designed for
taking input data (text) and giving structural representation of the input after checking for correct syntax
as per formal grammar. It also builds a data structure generally in the form of parse tree or abstract
syntax tree or other hierarchical structure.
The main roles of the parser include −
Types of Parsing
Top-down Parsing
Bottom-up Parsing
Top-down Parsing
In this kind of parsing, the parser starts constructing the parse tree from the start symbol and then tries
to transform the start symbol into the input. The most common form of top-down parsing uses recursive
procedures to process the input. The main disadvantage of recursive descent parsing is backtracking.
Bottom-up Parsing
In this kind of parsing, the parser starts with the input symbols and tries to construct the parse tree up
to the start symbol.
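A minimal recursive-descent (top-down) parser for a tiny hypothetical grammar of arithmetic expressions illustrates the idea; it consumes tokens from left to right and builds a parse tree as nested tuples.

```python
# Minimal sketch of top-down (recursive descent) parsing for a tiny grammar:
#   expr -> term (('+' | '-') term)*
#   term -> NUMBER | '(' expr ')'
import re

def tokenize(text):
    return re.findall(r"\d+|[()+\-]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def eat(self, expected=None):
        token = self.peek()
        if token is None or (expected and token != expected):
            raise SyntaxError(f"unexpected token {token!r}")
        self.pos += 1
        return token

    def expr(self):                       # expr -> term (('+'|'-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.eat()
            node = (op, node, self.term())
        return node

    def term(self):                       # term -> NUMBER | '(' expr ')'
        if self.peek() == "(":
            self.eat("(")
            node = self.expr()
            self.eat(")")
            return node
        return ("num", self.eat())

print(Parser(tokenize("1 + (2 - 3) + 4")).expr())
# ('+', ('+', ('num', '1'), ('-', ('num', '2'), ('num', '3'))), ('num', '4'))
```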
The semantic analysis looks after the meaning. It allocates the meaning to all the structures built
by the syntactic analyzer. Then every syntactic structure and the objects are mapped together into
the task domain. If mapping is possible the structure is sent, if not then it is rejected. For
example, “hot ice-cream” will give a semantic error. During semantic analysis two main
operations are executed:
First, each separate word will be mapped with appropriate objects in the database. The
dictionary meaning of every word will be found. A word might have more than one
meaning.
Secondly, all the meanings of each different word will be integrated to find a proper
correlation between the word structures. This process of determining the correct meaning
is called lexical disambiguation. It is done by associating each word with the context.
This process defined above can be used to determine the partial meaning of a sentence. However
semantic and syntax are two completely contrasting concepts. It might be possible that a
syntactically correct sentence is semantically incorrect.
For example, “A rock smelled the colour nine.” It is syntactically correct as it obeys all the rules
of English, but is semantically incorrect. The semantic analysis verifies that a sentence is abiding
by the rules and creates correct information.
While performing the morphological analysis, each particular word is analyzed. Non-word
tokens such as punctuation are removed from the words. Hence the remaining words are
assigned categories. For instance, Ram’s iPhone cannot convert the video from .mkv to .mp4. In
Morphological analysis, word by word the sentence is analyzed.
So here, Ram is a proper noun, Ram’s is assigned as possessive suffix and .mkv and .mp4 is
assigned as a file extension.
As shown above, the sentence is analyzed word by word and each word is assigned a syntactic
category. The file extensions present in the sentence are also identified; in the above example they
behave as adjectives. The possessive suffix is identified as well.
This is a very important step, as the interpretation of prefixes and suffixes depends on the syntactic
category of the word. For example, "swims" and "swim's" are different: the suffix -s can mark a plural
noun or a third-person singular verb, while 's marks possession. If a prefix or suffix is interpreted
incorrectly, the meaning and understanding of the sentence change completely. The interpretation
assigns a category to each word and hence removes the uncertainty from the word.
6. Machine Learning
Machine learning algorithms create a mathematical model that, without being explicitly programmed,
aids in making predictions or decisions with the assistance of sample historical data, or training data. For
the purpose of developing predictive models, machine learning brings together statistics and computer
science. Algorithms that learn from historical data are either constructed or utilized in machine learning.
The performance will rise in proportion to the quantity of information we provide.
A machine can learn if it can gain more data to improve its performance.
A machine learning system builds prediction models, learns from previous data, and predicts the output
of new data whenever it receives it. The amount of data helps to build a better model that accurately
predicts the output, which in turn affects the accuracy of the predicted output.
The demand for machine learning is steadily rising. Because it is able to perform tasks that are too complex
for a person to directly implement, machine learning is required. Humans are constrained by our inability
to manually access vast amounts of data; as a result, we require computer systems, which is where
machine learning comes in to simplify our lives.
By providing them with a large amount of data and allowing them to automatically explore the data, build
models, and predict the required output, we can train machine learning algorithms. The cost function can
be used to determine the amount of data and the machine learning algorithm's performance. We can
save both time and money by using machine learning.
The significance of machine learning can be easily understood from its use cases. Currently, machine
learning is used in self-driving cars, cyber fraud detection, face recognition, friend suggestions by
Facebook, and so on. Various top companies such as Netflix and Amazon have built machine learning
models that use a huge amount of data to analyze user interest and recommend products accordingly.
Following are some key points which show the importance of Machine Learning:
PageRank (PR) is an algorithm used by Google Search to rank websites in their search engine results.
PageRank was named after Larry Page, one of the founders of Google. PageRank is a way of measuring
the importance of website pages. According to Google:
PageRank works by counting the number and quality of links to a page to determine a rough estimate of
how important the website is. The underlying assumption is that more important websites are likely to
receive more links from other websites.
It is not the only algorithm used by Google to order search engine results, but it is the first algorithm
that was used by the company, and it is the best-known.
Algorithm
The PageRank algorithm outputs a probability distribution used to represent the likelihood that a person
randomly clicking on links will arrive at any particular page. PageRank can be calculated for collections of
documents of any size. It is assumed in several research papers that the distribution is evenly divided
among all documents in the collection at the beginning of the computational process. The PageRank
computations require several passes, called “iterations”, through the collection to adjust approximate
PageRank values to more closely reflect the theoretical true value.
Simplified algorithm
1. Initialization:
- Assume a small universe of four web pages: A, B, C, and D. Each page starts with an equal initial PageRank of 0.25 (1 divided by the number of pages).
2. Link Structure:
- Ignore links from a page to itself or multiple links from one page to another.
- If a page links to other pages, the PageRank it transfers to its targets is divided equally among its
outbound links.
3. Iteration 1:
- If only pages B, C, and D link to A, each transfers 0.25 PageRank to A, summing up to 0.75 for A.
4. Iteration 2:
- If page B has links to pages A and C, it transfers half of its PageRank (0.125 each) to A and C.
- Page D, with links to A, B, and C, transfers one-third of its PageRank (approximately 0.083) to A.
5. Final PageRank:
- After iterations, the PageRank for each page stabilizes.
- Page A's final PageRank is approximately 0.458, considering the contributions from B, C, and D.
6. Generalization:
- PageRank for any page is the sum of contributions from pages linking to it, divided by the number of
outbound links from each contributing page.
- It's like each page distributes its importance to the pages it links to, sharing equally among its links.
7. Damping Factor:
- Introduce a damping factor to simulate the probability that a user randomly clicks a link on a page
rather than following the links on the page.
- The damping factor is like a tax on the PageRank, ensuring that not all PageRank is transferred, and
some "evaporates" in the process.
In essence, the algorithm models the idea that more important pages, as determined by their incoming
links, contribute more to the PageRank of the pages they link to. The process iterates until the PageRank
values stabilize, providing a measure of the relative importance of each page in the given web graph.
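A minimal sketch of this iterative computation with a damping factor is given below; the four-page link graph is hypothetical and the damping factor of 0.85 is the commonly quoted value.

```python
# Minimal sketch of the iterative PageRank computation with a damping factor.
# The four-page link graph below is hypothetical.

links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
pages = list(links)
damping = 0.85
n = len(pages)

# Start with the rank evenly divided among all pages.
rank = {page: 1.0 / n for page in pages}

for _ in range(50):   # iterations until the values stabilise
    new_rank = {}
    for page in pages:
        # Sum the shares passed on by every page that links to this one;
        # each linking page divides its rank equally among its outbound links.
        incoming = sum(rank[p] / len(links[p]) for p in pages if page in links[p])
        # Damping: only a fraction of the rank is transferred through links;
        # the rest models a user jumping to a random page.
        new_rank[page] = (1 - damping) / n + damping * incoming
    rank = new_rank

print({page: round(score, 3) for page, score in rank.items()})
```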