Artificial Intelligence and Machine Learning Lab
LAB MANUAL
B.TECH
Vision
To be a premier centre for academic excellence and research through innovative
interdisciplinary collaborations and making significant contributions to the
community, organizations, and society as a whole.
Mission
• To impart cutting-edge Artificial Intelligence technology in accordance with
industry norms.
• To instill in students a desire to conduct research in order to tackle challenging
technical problems for industry.
• To develop effective graduates who are responsible for their professional
growth, leadership qualities and are committed to lifelong learning.
Quality Policy
PEO1: To possess knowledge and analytical abilities in areas such as maths, science,
and fundamental engineering.
PEO2: To analyse, design, create products, and provide solutions to problems in
Computer Science and Engineering.
PEO3: To leverage the professional expertise to enter the workforce, seek higher
education, and conduct research on AI-based problem resolution.
PEO4: To be solution providers and business owners in the field of computer
science and engineering with an emphasis on artificial intelligence and machine
learning.
• Familiarity with the Prolog programming environment & Systematic introduction to Prolog
programming constructs
• Learning basic concepts of Prolog through illustrative examples and small exercises &
Understanding list data structure in Prolog.
• To introduce the basic concepts and techniques of Machine Learning and the need of
Machine Learning techniques in real-world problems.
• To provide understanding of various Machine Learning algorithms and the way to evaluate
performance of the Machine Learning algorithms.
• To apply Machine Learning to learn, predict and classify the real-world problems in the
Supervised Learning paradigms as well as discover the Unsupervised Learning paradigms of
Machine Learning.
• To understand, learn and design simple Artificial Neural Networks of Supervised Learning
for the selected problems.
• To understand the concept of Reinforcement Learning and Ensemble Methods.
Lab Outcomes:
Upon successful completion of this course, the students will be able to:
Guidelines to students
A. Standard operating procedure
a) Explanation on today’s experiment by the concerned faculty using PPT covering
the following aspects:
1) Name of the experiment
2) Aim
3) Software/Hardware requirements
4) Writing the python programs by the students
5) Commands for executing programs
Writing of the experiment in the Observation Book
The students will write the day's experiment in the Observation book as per the
following format:
a) Name of the experiment
b) Aim
c) Writing the program
d) Viva-Voce Questions and Answers
e) Errors observed (if any) during compilation/execution
• Students are required to carry their lab observation book and record book with
completed experiments while entering the lab.
• Students must use the equipment with care. Any damage caused by a student is punishable.
• Students are not allowed to use their cell phones/pen drives/CDs in labs.
• Students need to maintain proper dress code along with ID Card.
• Students are supposed to occupy the computers allotted to them and are not supposed
to talk or make noise in the lab.
• After completion of each experiment, students need to update it in the observation
notes and then in the record.
• Lab records need to be submitted after completion of the experiment and corrected
by the concerned lab faculty.
• If a student is absent for any lab, they need to complete the same experiment in their
free time before attending the next lab.
Steps to perform experiments in the lab by the student
Step1: Students have to write the date and aim of that experiment in the observation book.
Step2: Students have to listen and understand the experiment explained by the faculty and
note down the important points in the observation book.
Step3: Students need to write procedure/algorithm in the observation book.
Step4: Students analyze and develop/implement the logic of the program on the
respective platform.
Step5: After the faculty approves the logic of the experiment, the experiment has to be
executed on the system.
Step6: After successful execution, the results are to be shown to the faculty and noted
in the observation book.
Step7: Students need to attend the Viva-Voce on that experiment and write the same in the
observation book.
Step8: Update the completed experiment in the record and submit to the concerned
faculty in-charge.
Regularity 3 Marks
Program written 3 Marks
Execution & Result 3 Marks
Viva-Voce 3 Marks
Dress Code 3 Marks
Allocation of Marks for Lab Internal
Total marks for lab internal are 30 Marks as per Autonomous (JNTUH.)
These 30 Marks are distributed as:
Average of day to day evaluation marks: 15 Marks
Lab Mid exam: 10 Marks
VIVA & Observation: 5 Marks
INDEX
Artificial Intelligence Programs Using PROLOG
S.No Name of the Program Page No
1. Study of PROLOG Programming language and its Functions. Write simple facts for the statements using PROLOG.
4. Solve 8-puzzle problem using Best First Search. Write a program to Implement A*.
Program:
Clauses
likes(ram ,mango).
girl(seema).
red(rose).
likes(bill ,cindy).
owns(john ,gold).
?-likes(Who,cindy).
Who= bill
?-red(What).
What= rose
?-owns(Who,What).
Who= john
What= gold.
Viva Questions:
1- First create a source file for the genealogical logicbase application. Start by adding a few members of your family
tree. It is important to be accurate, since we will be exploring family relationships. Your own knowledge of who your
relatives are will verify the correctness of your Prolog programs.
2- Enter a two-argument predicate that records the parent-child relationship. One argument represents the parent, and
the other the child. It doesn't matter in which order you enter the arguments, as long as you are consistent. Often
Prolog programmers adopt the convention that parent(A,B) is interpreted "A is the parent of B".
3- Create a source file for the customer order entry program. We will begin it with three record types (predicates). The
first is customer/3 where the three arguments are
arg1: Customer name
arg2: City
arg3: Credit rating (aaa, bbb, etc)
4- Next add clauses that define the items that are for sale. It should also have three arguments
arg1: Item identification number
arg2: Item name
arg3: The reorder point for inventory (when at or below this level, reorder)
5- Next add an inventory record for each item. It has two arguments.
arg1: Item identification number (same as in the item record)
arg2: Amount in stock
Assignment:
Aim: Write facts for following:
1. Ram likes apple.
2. Ram is taller than Mohan.
3. My name is Subodh.
4. Apple is fruit.
5. Orange is fruit.
6. Ram is male.
AIM: Write simple queries for following facts.
Simple Queries
Now that we have some facts in our Prolog program, we can consult the program in the listener and query, or call, the facts.
?- room(office).
yes
Prolog will respond with a 'yes' if a match was found. If we wanted to know if the attic was a room, we
would enter that goal.
?- room(attic).
No
Solution:-
clauses
likes(ram ,mango).
girl(seema).
red(rose).
likes(bill ,cindy).
owns(john ,gold).
queries
?-likes(ram,What).
What= mango
?-likes(Who,cindy).
Who= bill
?-red(What).
What= rose
?-owns(Who,What).
Who= john
What= gold
Viva Questions:
Week-2
Aim:
Implementation of Depth First Search for Water Jug problem.
Theory/Description:
In the water jug problem in Artificial Intelligence, we are provided with two jugs: one having the capacity to
hold 3 gallons of water and the other has the capacity to hold 4 gallons of water. There is no other measuring
equipment available and the jugs also do not have any kind of marking on them. So, the agent’s task here is
to fill the 4-gallon jug with 2 gallons of water by using only these two jugs and no other material. Initially,
both our jugs are empty.
So, to solve this problem, following set of rules were proposed:
Production rules for solving the water jug problem
Solution:
The state space for this problem can be described as the set of ordered pairs of integers (x,y)
Where,
X represents the quantity of water in the 4-gallon jug X= 0,1,2,3,4
Y represents the quantity of water in 3-gallon jug Y=0,1,2,3
Start State: (0,0)
Goal State: (2,0)
Generate production rules for the water jug problem
Initialization:
Start State: (0,0)
Apply Rule 2: (X,Y | Y<3) -> (X,3) {Fill 3-gallon jug}
Now the state is (0,3)
Iteration 1:
Current State: (0,3)
Apply Rule 7: (X,Y | X+Y<=4 ^ Y>0) -> (X+Y,0) {Pour all water from 3-gallon jug into 4-gallon jug}
Now the state is (3,0)
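The rule applications above can also be explored mechanically. The following is a minimal Python sketch of a depth-first search over (x, y) states, assuming the six usual moves (fill, empty, pour in either direction); the function name and the successor ordering are illustrative choices and not part of the manual's Prolog listing.

```python
def water_jug_dfs(cap_x=4, cap_y=3, goal_x=2):
    """Depth-first search over (x, y) states.

    x is the amount in the 4-gallon jug, y the amount in the 3-gallon jug.
    Returns the list of states from (0, 0) to a state with x == goal_x,
    or None if no such state is reachable.
    """
    start = (0, 0)
    stack = [(start, [start])]          # each entry: (state, path so far)
    visited = {start}
    while stack:
        (x, y), path = stack.pop()      # LIFO order gives depth-first search
        if x == goal_x:
            return path
        pour_xy = min(x, cap_y - y)     # amount that fits when pouring X into Y
        pour_yx = min(y, cap_x - x)     # amount that fits when pouring Y into X
        successors = [
            (cap_x, y),                 # fill the 4-gallon jug
            (x, cap_y),                 # fill the 3-gallon jug
            (0, y),                     # empty the 4-gallon jug
            (x, 0),                     # empty the 3-gallon jug
            (x - pour_xy, y + pour_xy), # pour 4-gallon into 3-gallon
            (x + pour_yx, y - pour_yx), # pour 3-gallon into 4-gallon
        ]
        for state in successors:
            if state not in visited:
                visited.add(state)
                stack.append((state, path + [state]))
    return None


if __name__ == "__main__":
    for step in water_jug_dfs():
        print(step)
```

Depth-first search returns the first path it finds, which is not necessarily the shortest one.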
Program:
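solve_dfs(State, _History, []) :- final_state(State).
solve_dfs(State, History, [Move|Moves]) :-
    move(State, Move),
    update(State, Move, State1),   % update/3 is assumed to be defined elsewhere (not in this listing)
    legal(State1), \+ member(State1, History),
    solve_dfs(State1, [State1|History], Moves).
test_dfs(Problem, Moves) :-
    initial_state(Problem, State), solve_dfs(State, [State], Moves).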
capacity(1, 10).
capacity(2, 7).
legal(jugs(V1, V2)).
move(jugs(V1, V2), fill(1)) :- capacity(1, C1), V1 < C1, capacity(2, C2), V2 < C2.
move(jugs(V1, V2), fill(2)) :- capacity(2, C2), V2 < C2, capacity(1, C1), V1 < C1.
Output:
Query: test_dfs(jugs, Moves).
Moves = [fill(1),
transfer(1,2),
empty(1),
transfer(2,1),
fill(2),
transfer(2,1),
empty(1),
transfer(2,1),
fill(2),
transfer(2,1),
empty(1),
transfer(2,1),
fill(2),
transfer(2,1),
fill(2),
transfer(2,1),
empty(1),
transfer(2,1),
fill(2),
transfer(2,1),
empty(1),
transfer(2,1),
fill(2),
transfer(2,1),
fill(2),
transfer(2,1),
empty(1),
transfer(2,1)]
Week-3
Aim:
Implementation of Breadth First Search for Tic-Tac-Toe problem.
Theory/Description:
The game Tic-Tac-Toe is also known as Noughts and Crosses or Xs and Os. The players take turns
marking the spaces in a 3x3 grid with their own marks. If 3 consecutive marks
(horizontal, vertical or diagonal) are formed, the player who owns these marks wins.
Assume,
Player 1 - X
Player 2 - O
So, the player who gets 3 consecutive marks first wins the game.
Let's discuss how the board's data structure looks and how the Tic-Tac-Toe algorithm works.
Move Table
It is a vector of 3^9 elements, each element of which is a nine element vector representing board
position.
Total of 3^9(19683) elements in move table
Move Table
Index Current Board position New Board position
0 000000000 000010000
1 000000001 020000001
2 000000002 000100002
3 000000010 002000010
.
.
Algorithm
To make a move, do the following:
1. View the vector (board) as a ternary number and convert it to its corresponding decimal number.
2. Use the computed number as an index into the move table and access the vector stored there.
3. The vector selected in step 2 represents the way the board will look after the move that should be
made. So set board equal to that vector.
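The three steps above can be sketched in Python as follows. The function names are hypothetical, and the table below is only a partial stand-in holding the four entries listed in the Move Table (a full table would have 3^9 = 19683 entries).

```python
def board_to_index(board):
    """Read a 9-element board of digits 0, 1, 2 as a base-3 (ternary) number."""
    index = 0
    for digit in board:
        index = index * 3 + digit
    return index


# Partial, illustrative move table: index -> new board position.
MOVE_TABLE = {
    0: [0, 0, 0, 0, 1, 0, 0, 0, 0],
    1: [0, 2, 0, 0, 0, 0, 0, 0, 1],
    2: [0, 0, 0, 1, 0, 0, 0, 0, 2],
    3: [0, 0, 2, 0, 0, 0, 0, 1, 0],
}


def make_move(board, move_table=MOVE_TABLE):
    """Steps 1-3: convert the board to an index and look up the new board."""
    return move_table[board_to_index(board)]


if __name__ == "__main__":
    empty = [0] * 9
    print(board_to_index(empty))   # index of the empty board
    print(make_move(empty))        # board after the table's move
```

The lookup is fast, but the approach needs a precomputed entry for every possible board, which is why the table is so large.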
Let's start with empty board
Step 1: Our board looks like 000 000 000 (a ternary number). Convert it into a decimal number. The
decimal number is 0.
Step 2: Use the computed number, i.e. 0, as an index into the move table and access the vector stored in
New Board Position.
The new board position is 000 010 000.
Step 3: The vector selected in step 2 (000 010 000) represents the way the board will look after the
move that should be made. So set board equal to that vector.
After completing the 3rd step, the board holds the new position 000 010 000.
Program:
other(x,o).
other(o,x).
display([A,B,C,D,E,F,G,H,I]) :- write([A,B,C]),nl,write([D,E,F]),nl,
write([G,H,I]),nl,nl.
selfgame :- game([b,b,b,b,b,b,b,b,b],x).
orespond(Board,Newboard) :-
move(Board, o, Newboard),
win(Newboard, o),
!.
orespond(Board,Newboard) :- move(Board, o, Newboard).   % assumed fallback: otherwise make any legal move
xmove([b,B,C,D,E,F,G,H,I], 1, [x,B,C,D,E,F,G,H,I]).
xmove([A,b,C,D,E,F,G,H,I], 2, [A,x,C,D,E,F,G,H,I]).
xmove([A,B,b,D,E,F,G,H,I], 3, [A,B,x,D,E,F,G,H,I]).
xmove([A,B,C,b,E,F,G,H,I], 4, [A,B,C,x,E,F,G,H,I]).
xmove([A,B,C,D,b,F,G,H,I], 5, [A,B,C,D,x,F,G,H,I]).
xmove([A,B,C,D,E,b,G,H,I], 6, [A,B,C,D,E,x,G,H,I]).
xmove([A,B,C,D,E,F,b,H,I], 7, [A,B,C,D,E,F,x,H,I]).
xmove([A,B,C,D,E,F,G,b,I], 8, [A,B,C,D,E,F,G,x,I]).
xmove([A,B,C,D,E,F,G,H,b], 9, [A,B,C,D,E,F,G,H,x]).
xmove(Board, _, Board) :- write('Illegal move.'), nl.
explain :-
write('You play X by entering integer positions followed by a period.'),
nl,
display([1,2,3,4,5,6,7,8,9]).
Output:
Query: playo.
You play X by entering integer positions followed by a period.
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
[x, o, b]
[b, b, b]
[b, b, b]
9
[x, o, b]
[b, b, b]
[b, b, x]
[x, o, b]
[b, o, b]
[b, b, x]
7
[x, o, b]
[b, o, b]
[x, b, x]
[x, o, b]
[b, o, b]
[x, o, x]
I win!
true
Another Output:
[x, b, b]
[b, b, b]
[b, b, b]
[x, o, b]
[b, b, b]
[b, b, b]
[x, o, x]
[b, b, b]
[b, b, b]
[x, o, x]
[o, b, b]
[b, b, b]
[x, o, x]
[o, x, b]
[b, b, b]
[x, o, x]
[o, x, o]
[x, b, b]
[x, o, x]
[o, x, o]
[x, o, b]
[player, x, wins]
true
Week:4
Aim:
Solve 8-puzzle problem using best first search.
Theory/Description:
The Eight puzzle problem is an instance of the N-puzzle problem, also called the sliding puzzle problem.
An N-puzzle consists of N tiles (N+1 cells including the empty tile), where N can be 8, 15, 24 and so on.
In our example N = 8 (that is, square root of (8+1) = 3 rows and 3 columns).
In the same way, if N = 15 or 24, the board has square root of (N+1) rows and square root
of (N+1) columns.
That is, if N = 15 then the number of rows and columns = 4, and if N = 24 the number of rows and columns = 5.
So, in these types of problems we are given an initial state or initial configuration (Start state) and
a Goal state or Goal configuration.
Here we are solving the 8-puzzle problem, which uses a 3x3 matrix.
Initial state Goal state
Solution:
The puzzle can be solved by moving the tiles one by one in the single empty space and thus achieving the
Goal state.
Rules of solving puzzle
Instead of moving the tiles in the empty space we can visualize moving the empty space in place of the tile.
The empty space can only move in four directions (Movement of empty space)
1. Up
2. Down
3. Right or
4. Left
The empty space cannot move diagonally and can take only one step at a time.
All possible moves of the empty tile:
o-position (corner): total possible moves are 2; x-position (edge): total possible moves are 3; and
#-position (centre): total possible moves are 4.
Let's solve the problem without Heuristic Search that is Uninformed Search or Blind Search
Solve Eight puzzle problem
Note: If we solve this problem with depth-first search, it will go deep along one branch instead of exploring the nodes layer by
layer.
Time complexity: In the worst case the time complexity of BFS is O(b^d), read as order of b raised to the power d. In
this particular case it is (3^20).
b - branching factor
d - depth
Let's solve the problem with Heuristic Search that is Informed Search (A* , Best First Search (Greedy
Search))
To solve the problem with Heuristic search or informed search we have to calculate Heuristic values of each
node to calculate cost function. (f=g+h)
Initial state Goal state
Note: Compare the initial state and the goal state carefully: all values except 4, 5 and 8 are at their
respective places. So, the heuristic value h for the first node is 3 (three tiles are misplaced with respect to the goal).
The actual cost g is taken as the depth of the node.
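The cost function f = g + h with the misplaced-tiles heuristic can be sketched in Python as below. The start and goal boards are assumed examples (the manual's figures are not reproduced here), chosen so that tiles 4, 5 and 8 are the misplaced ones; 0 stands for the empty space.

```python
def misplaced_tiles(state, goal, blank=0):
    """h: number of tiles (ignoring the blank) that are not in their goal cell."""
    return sum(1 for s, g in zip(state, goal) if s != blank and s != g)


def f_cost(state, goal, depth):
    """f = g + h, where g is the depth of the node in the search tree."""
    return depth + misplaced_tiles(state, goal)


# Assumed example boards, read row by row.
GOAL = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)
START = (1, 2, 3,
         0, 4, 6,
         7, 5, 8)

if __name__ == "__main__":
    print("h(start) =", misplaced_tiles(START, GOAL))
    print("f(start) =", f_cost(START, GOAL, depth=0))
```

Greedy best-first search orders nodes by h alone, while A* orders them by f = g + h.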
Program:
ids :-
start(State),
length(Moves, N),
dfs([State], Moves, Path), !,
show([start|Moves], Path),
format('~nmoves = ~w~n', [N]).
show([], _).
show([Move|Moves], [State|States]) :-
State = state(A,B,C,D,E,F,G,H,I),
format('~n~w~n~n', [Move]),
format('~w ~w ~w~n',[A,B,C]),
format('~w ~w ~w~n',[D,E,F]),
format('~w ~w ~w~n',[G,H,I]),
show(Moves, States).
start( state(6,1,3,4,*,5,7,2,0) ).
goal( state(*,0,1,2,3,4,5,6,7) ).
613
4*5
720
left
613
*45
720
up
*13
645
720
right
1*3
645
720
down
143
6*5
right
143
65*
720
down
143
650
72*
left
143
650
7*2
*67
up
143
*02
567
right
143
0*2
567
right
143
02*
567
up
14*
023
567
left
1*4
023
567
*14
023
567
down
014
*23
567
right
014
2*3
567
right
014
23*
567
up
01*
234
567
left
0*1
234
567
left
*01
234
567
moves = 26
true
Week:5
Aim:
Write a PROLOG program to solve N-Queens problem.
Theory/Description:
N-Queens Problem
The N-Queens problem is to place n queens on an n x n chessboard in such a manner that no two queens attack
each other by being in the same row, column or diagonal.
It can be seen that for n = 1 the problem has a trivial solution, and no solution exists for n = 2 and n = 3. So
first we will consider the 4-queens problem and then generalize it to the n-queens problem.
Consider a 4 x 4 chessboard and number the rows and columns of the chessboard 1 through 4.
We have to place 4 queens q1, q2, q3 and q4 on the chessboard such that no two queens attack
each other. Under this condition each queen must be placed on a different row, i.e., we put queen "i" on row
"i."
Now, we place queen q1 in the very first acceptable position (1, 1). Next, we put queen q2 so
that both these queens do not attack each other. We find that if we place q2 in column 1 and
2, then the dead end is encountered. Thus the first acceptable position for q2 in column 3, i.e.
(2, 3) but then no position is left for placing queen 'q3' safely. So we backtrack one step and
place the queen 'q2' in (2, 4), the next best possible solution. Then we obtain the position for
placing 'q3' which is (3, 2). But later this position also leads to a dead end, and no place is
found where 'q4' can be placed safely. Then we have to backtrack till 'q1' and place it to (1, 2)
and then all other queens are placed safely by moving q2 to (2, 4), q3 to (3, 1) and q4 to (4, 3).
That is, we get the solution (2, 4, 1, 3). This is one possible solution for the 4-queens problem.
The implicit tree for the 4-queens problem for the solution (2, 4, 1, 3) is as follows:
The figure shows the complete state space for the 4-queens problem. But we can use the backtracking
method to generate only the necessary nodes and stop if the next node violates the rule, i.e., if two
queens are attacking.
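The backtracking procedure described above can be sketched in Python as follows. This is an illustrative implementation, not a translation of the Prolog program; the function name is hypothetical. Queen i is placed on row i, and columns are tried from left to right.

```python
def solve_n_queens(n):
    """Return the first solution found as a list of 1-based columns
    (one entry per row), or None if no solution exists."""
    columns = []                      # columns[r] = column of the queen on row r

    def safe(col):
        row = len(columns)            # row where the next queen would go
        for r, c in enumerate(columns):
            # same column, or same diagonal (column gap equals row gap)
            if c == col or abs(c - col) == row - r:
                return False
        return True

    def place(row):
        if row == n:
            return True               # all queens placed
        for col in range(1, n + 1):
            if safe(col):
                columns.append(col)
                if place(row + 1):
                    return True
                columns.pop()         # dead end: backtrack and try next column
        return False

    return columns if place(0) else None


if __name__ == "__main__":
    print(solve_n_queens(4))
    print(solve_n_queens(3))
```

For n = 4 the search follows the same dead ends as the walkthrough before reaching (2, 4, 1, 3).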
Program:
% render solutions nicely.
:-use_rendering(chess).
queens(N, Queens) :-
length(Queens, N),
board(Queens, Board, 0, N, _, _),
queens(Board, 0, Queens).
constraints(0, _, _, _) :- !.
constraints(N, Row, [R|Rs], [C|Cs]) :-
arg(N, Row, R-C),
M is N-1,
constraints(M, Row, Rs, Cs).
queens([], _, []).
queens([C|Cs], Row0, [Col|Solution]) :-
Row is Row0+1,
select(Col-Vars, [C|Cs], Board),
arg(Row, Vars, Row-Row),
queens(Board, Row, Solution).
/** <examples>
?- queens(8, Queens).
*/
Output:
Query: queens(8, Queens).
Depth-first search
The depth-first search algorithm starts at the root node and explores as deep as possible along each branch
before backtracking. In our TSP, when a state node with all city labels is visited, its total distance is
memorized. This information will later be used to define the shortest path.
Let VISIT be a stack to save visited nodes, PATH be a set to save distances from the root node to the goal.
The depth-first algorithm can be written as
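One possible rendering in Python is given below as a sketch, not as the manual's own listing. It assumes the distances from the edge facts listed later in this section, with a-h and h-d treated as one-way edges because only one direction appears there; the function names are illustrative.

```python
TWO_WAY = [("a", "b", 3), ("a", "c", 4), ("a", "d", 2), ("a", "e", 7),
           ("b", "c", 4), ("b", "d", 6), ("b", "e", 3),
           ("c", "d", 5), ("c", "e", 8), ("d", "e", 6)]
ONE_WAY = [("a", "h", 2), ("h", "d", 1)]


def build_distances():
    """Map (from_city, to_city) -> distance."""
    dist = {}
    for u, v, w in TWO_WAY:
        dist[(u, v)] = w
        dist[(v, u)] = w
    for u, v, w in ONE_WAY:
        dist[(u, v)] = w
    return dist


def tsp_dfs(dist, start):
    """Visit every route depth-first and return (distance, tour) of the shortest tour."""
    cities = {u for u, _ in dist} | {v for _, v in dist}
    visit = [(start, [start], 0)]     # VISIT: a stack of (node, route, cost)
    path = []                         # PATH: (total distance, tour) of complete tours
    while visit:
        node, route, cost = visit.pop()
        if len(route) == len(cities):             # all city labels present
            if (node, start) in dist:             # close the tour
                path.append((cost + dist[(node, start)], route + [start]))
            continue
        for nxt in sorted(cities - set(route)):
            if (node, nxt) in dist:
                visit.append((nxt, route + [nxt], cost + dist[(node, nxt)]))
    return min(path) if path else None


if __name__ == "__main__":
    print(tsp_dfs(build_distances(), "a"))
```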
Breadth-first search
The breadth-first search algorithm starts at the root node and explores all of the nodes at the present depth
level before moving on to the nodes at the next depth level. In our TSP, when a state node with all city labels
is visited, its total distance is memorized. This information will later be used to define the shortest path.
Let VISIT be a queue to save visited nodes, PATH be a set to save distances from the root node to the goal.
The breadth-first algorithm can be written as
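A possible Python sketch follows; it differs from a depth-first version only in that VISIT is a queue. The distances are assumed from the edge facts listed later in this section (a-h and h-d treated as one-way), and the function names are illustrative.

```python
from collections import deque

TWO_WAY = [("a", "b", 3), ("a", "c", 4), ("a", "d", 2), ("a", "e", 7),
           ("b", "c", 4), ("b", "d", 6), ("b", "e", 3),
           ("c", "d", 5), ("c", "e", 8), ("d", "e", 6)]
ONE_WAY = [("a", "h", 2), ("h", "d", 1)]


def build_distances():
    """Map (from_city, to_city) -> distance."""
    dist = {}
    for u, v, w in TWO_WAY:
        dist[(u, v)] = w
        dist[(v, u)] = w
    for u, v, w in ONE_WAY:
        dist[(u, v)] = w
    return dist


def tsp_bfs(dist, start):
    """Visit every route level by level and return (distance, tour) of the shortest tour."""
    cities = {u for u, _ in dist} | {v for _, v in dist}
    visit = deque([(start, [start], 0)])   # VISIT: a queue of (node, route, cost)
    path = []                              # PATH: (total distance, tour) of complete tours
    while visit:
        node, route, cost = visit.popleft()    # FIFO order gives breadth-first search
        if len(route) == len(cities):
            if (node, start) in dist:
                path.append((cost + dist[(node, start)], route + [start]))
            continue
        for nxt in sorted(cities - set(route)):
            if (node, nxt) in dist:
                visit.append((nxt, route + [nxt], cost + dist[(node, nxt)]))
    return min(path) if path else None


if __name__ == "__main__":
    print(tsp_bfs(build_distances(), "a"))
```

Both brute-force variants enumerate every route, so they return the same shortest tour and differ only in the order of exploration.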
Heuristic Search
In Brute-force search, all nodes are visited and the information from each node (distance from a node to a
node) is not considered. This leads to a large amount of time and memory consumption. To solve this
problem, a heuristic search is a solution. The information of each state node is used to decide whether to visit a
node or not. This information is represented by a heuristic function, which is commonly set up from the user's
experience. For example, we can define the heuristic function by the distance from the root node to the
present visit node, or the distance from the present visit node to the goal node.
Best-first search
In the Best-first search, we use the information of distance from the root node to decide which node to visit
first. Let g(X) be the distance from the root node to node-X. Therefore, the distance from the root node to
node-Y by visiting node-X is g(Y)=g(X)+d(X, Y), where d(X, Y) is the distance between X and Y.
Let VISIT be a list to save visited nodes. The best-first algorithm can be written as
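A possible Python sketch is shown below. It keeps VISIT ordered by g using a heap, so the route with the smallest distance from the root is always expanded first. The distances are assumed from the edge facts listed later in this section (a-h and h-d treated as one-way), and the function names are illustrative.

```python
import heapq

TWO_WAY = [("a", "b", 3), ("a", "c", 4), ("a", "d", 2), ("a", "e", 7),
           ("b", "c", 4), ("b", "d", 6), ("b", "e", 3),
           ("c", "d", 5), ("c", "e", 8), ("d", "e", 6)]
ONE_WAY = [("a", "h", 2), ("h", "d", 1)]


def build_distances():
    """Map (from_city, to_city) -> distance."""
    dist = {}
    for u, v, w in TWO_WAY:
        dist[(u, v)] = w
        dist[(v, u)] = w
    for u, v, w in ONE_WAY:
        dist[(u, v)] = w
    return dist


def tsp_best_first(dist, start):
    """Expand the route with the smallest g first; g(Y) = g(X) + d(X, Y)."""
    cities = {u for u, _ in dist} | {v for _, v in dist}
    visit = [(0, [start])]                    # VISIT: heap of (g, route)
    while visit:
        g, route = heapq.heappop(visit)
        node = route[-1]
        if len(route) == len(cities) + 1:     # closed tour popped: no cheaper route remains
            return g, route
        if len(route) == len(cities):
            candidates = [start]              # all cities seen: return to the start
        else:
            candidates = sorted(cities - set(route))
        for nxt in candidates:
            if (node, nxt) in dist:
                heapq.heappush(visit, (g + dist[(node, nxt)], route + [nxt]))
    return None


if __name__ == "__main__":
    print(tsp_best_first(build_distances(), "a"))
```

Because all distances are non-negative, the first closed tour taken from the heap is a shortest one, so the search can stop there instead of enumerating every route.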
A-algorithm (A*-algorithm)
In the A algorithm search, we use the information of distance from the present visit node to the goal as a
heuristic function, h(X). Let g(X) be the distance from the root node to node-X. In this case, we consider the
priority of node visit order by f(X)=g(X)+h(X).
In real-world problems, it is usually impossible to obtain the exact value of h(X). In that case, an estimate of
h(X), h’(X), is used. However, a poorly chosen h’(X) risks settling on a locally optimal answer. To prevent this
problem, choosing h’(X) such that h’(X)≤h(X) for all X is recommended. In this case, the method is known as the A*-
algorithm and it can be shown that the obtained answer is the globally optimal answer.
In our experiment described in the following part, we are setting h’(X) as the sum of the minimum distance
of all possible routes from each city which is not presented in the current visit node label, and the present
edge(a, b, 3).
edge(a, c, 4).
edge(a, d, 2).
edge(a, e, 7).
edge(b, c, 4).
edge(b, d, 6).
edge(b, e, 3).
edge(c, d, 5).
edge(c, e, 8).
edge(d, e, 6).
edge(b, a, 3).
edge(c, a, 4).
edge(d, a, 2).
edge(e, a, 7).
edge(c, b, 4).
edge(d, b, 6).
edge(e, b, 3).
edge(d, c, 5).
edge(e, c, 8).
edge(e, d, 6).
edge(a, h, 2).
edge(h, d, 1).
/* Finds the length of a list, while there is something in the list it increments N
when there is nothing left it returns.*/
len([], 0).
len([H|T], N):-len(T, X), N is X+1 .
/*When we find a path back to the starting point, make that the total distance and make
sure the graph has touch every node*/
/*This is called to find the shortest path, takes all the paths, collects them in holder.
Then calls pick on that holder which picks the shortest path and returns it*/
/* Is called, compares 2 distances. If cost is smaller than bcost, no need to go on. Cut it.*/
best(Cost-Holder,Bcost-_,Cost-Holder):- Cost<Bcost,!.
best(_,X,X).
/*Takes the top path and distance off of the holder and recursively calls it.*/
pick([Cost-Holder|R],X):- pick(R,Bcost-Bholder),best(Cost-Holder,Bcost-Bholder,X),!.
pick([X],X).
Output:
Query: shortest_path(Path).
Path = 20-[a, h, d, e, b, c, a]
Week-1:
Implementation of Python Basic Libraries such as Math, Numpy and Scipy
Theory/Description:
• Python Libraries
There are a lot of reasons why Python is popular among developers, and one of them is that it has an
amazingly large collection of libraries that users can work with. In this section, we will discuss the
Python Standard Library and different libraries available for the Python programming language: scipy, numpy,
etc.
We know that a module is a file with some Python code, and a package is a directory for sub packages and
modules. A Python library is a reusable chunk of code that you may want to include in your programs/
projects. Here, a library loosely describes a collection of core modules. Essentially, then, a library is a
collection of modules. A package is a library that can be installed using a package manager like pip.
To display a list of all available modules, use the following command in the Python console:
>>> help('modules')
• List of important Python Libraries
o Python Libraries for Data Collection
▪ Beautiful Soup
▪ Scrapy
▪ Selenium
o Python Libraries for Data Cleaning and Manipulation
▪ Pandas
▪ PyOD
▪ NumPy
▪ Scipy
▪ Spacy
o Python Libraries for Data Visualization
▪ Matplotlib
▪ Bokeh
o Python Libraries for Modeling
▪ Scikit-learn
▪ TensorFlow
▪ Keras
▪ PyTorch
The math module is a standard module in Python and is always available. To use mathematical functions
under this module, you have to import the module using import math. It gives access to the underlying C
library functions. This module does not support complex datatypes. The cmath module is the complex
counterpart.
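A short sketch of the difference between math and cmath is given below; the particular values are illustrative.

```python
import math
import cmath

# Real-valued functions from the math module
print(math.sqrt(16))         # square root of a non-negative real number
print(math.factorial(5))     # 5! = 120
print(math.pi)               # the constant pi

# math does not accept results outside the real numbers
try:
    math.sqrt(-1)
except ValueError as err:
    print("math.sqrt(-1) failed:", err)

# cmath is the complex counterpart
print(cmath.sqrt(-1))        # the imaginary unit, 1j
```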
Program-1
Program-2
Program-3
Program-4
Program-5
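o The math module provides the boolean function math.isnan(), which returns True if the given argument is a NaN and
returns False otherwise. We can also take a value and convert it to float to check whether it is NaN.
o A missing value is denoted as NaN (stands for Not a Number, e.g. the result of 0/0 in floating-point arithmetic).
o Inf: the floating-point result of 1/0 is one example of Inf. We can define the negative infinity as float('-inf') and the positive infinity as float('inf')
in Python. Note that plain Python raises ZeroDivisionError for 0/0 and 1/0; NumPy floating-point division produces nan and inf instead.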
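The points above can be checked with a few lines of standard-library Python; the variable names are illustrative.

```python
import math

nan = float("nan")           # Not a Number
pos_inf = float("inf")       # positive infinity
neg_inf = float("-inf")      # negative infinity

print(math.isnan(nan))       # True
print(nan == nan)            # False: NaN never compares equal, even to itself
print(math.isinf(pos_inf))   # True
print(neg_inf < 0 < pos_inf) # True

# Converting a value to float before testing it
value = "NaN"
print(math.isnan(float(value)))   # True
```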
Program-6
NumPy is an open source library available in Python that aids in mathematical, scientific, engineering, and
data science programming. NumPy is a very useful library for performing mathematical and statistical operations.
It works well for multi-dimensional arrays and matrix multiplication.
For any scientific project, NumPy is the tool to know. It has been built to work with the N-dimensional
array, linear algebra, random numbers, Fourier transforms, etc. It can be integrated with C/C++ and Fortran.
NumPy is a Python library that deals with multi-dimensional arrays and matrices. On top of the
arrays and matrices, NumPy supports a large number of mathematical operations.
NumPy is memory efficient, meaning it can handle large amounts of data more compactly than plain Python
lists. Besides, NumPy is very convenient to work with, especially for matrix multiplication and reshaping.
On top of that, NumPy is fast. In fact, TensorFlow and scikit-learn both work with NumPy arrays as
input data.
A NumPy array is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.
In NumPy dimensions are called axes. The number of axes is the rank.
NumPy’s array class is called ndarray. It is also known by the alias array.
We use python numpy array instead of a list because of the below three reasons:
1. Less Memory
2. Fast
3. Convenient
Numpy Functions
Numpy arrays carry attributes around with them. The most important ones are:
ndim: The number of axes or rank of the array. ndim returns an integer that tells us how many dimensions the
array has.
shape: A tuple containing the length in each dimension
size: The total number of elements
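The three attributes can be inspected on a small example array; the array contents here are an arbitrary illustration.

```python
import numpy as np

# 12 consecutive integers arranged as 3 rows and 4 columns
a = np.arange(12).reshape(3, 4)

print(a)
print("ndim :", a.ndim)    # number of axes
print("shape:", a.shape)   # length along each axis
print("size :", a.size)    # total number of elements
```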
Program-1
Program-2
• Built-in Methods
Many standard numerical functions are available as methods out of the box:
Program-3
SciPy is an open source Python-based library, which is used in mathematics, scientific computing,
engineering, and technical computing. SciPy is pronounced "Sigh Pi."
SciPy contains a variety of sub packages which help to solve the most common issues related to scientific
computation.
SciPy is one of the most widely used scientific libraries, alongside the GNU Scientific Library for C/C++ and MATLAB's toolboxes.
It is easy to use and understand, and computationally fast.
It operates on NumPy arrays.
Numpy VS SciPy
Numpy:
1. Numpy is largely written in C and used for mathematical or numeric calculation.
2. It is faster than equivalent pure-Python loops over lists.
3. Numpy is the most useful library for Data Science to perform basic calculations.
4. Numpy is built around the array data type, which supports the most basic operations like sorting, shaping,
indexing, etc.
SciPy:
Program-1
Program-2
Exercise programs:
1. Consider a list datatype (1D) then reshape it into 2D, 3D matrix using numpy
2. Generate random matrices using numpy
3. Find the determinant of a matrix using scipy
4. Find eigenvalue and eigenvector of a matrix using scipy
• Pandas Library
The primary two components of pandas are the Series and DataFrame.
A Series is essentially a column, and a DataFrame is a multi-dimensional table made up of a collection of
Series.
DataFrames and Series are quite similar in that many operations that you can do with one you can do with the
other, such as filling in null values and calculating the mean.
With CSV files all you need is a single line to load in the data:
df = pd.read_csv('purchases.csv')
df
Another fast and useful attribute is .shape, which outputs just a tuple of (rows, columns):
movies_df.shape
Note that .shape has no parentheses and is a simple tuple of format (rows, columns). So we have 1000 rows
and 11 columns in our movies DataFrame.
You'll be going to .shape a lot when cleaning and transforming data. For example, you might filter some rows
based on some criteria and then want to know quickly how many rows were removed.
We haven't defined an index in our example, but we see two columns in our output: The right column
contains our data, whereas the left column contains the index. Pandas created a default index starting with 0
going to 5, which is the length of the data minus 1.
dtype('int64'): The type int64 tells us that pandas is storing each value in this Series as a 64-bit integer.
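The default index described above can be seen with a six-element Series; the values are assumed for illustration and are not taken from the manual's missing listing.

```python
import pandas as pd

# No index is given, so pandas creates the default index 0..5
S = pd.Series([11, 28, 72, 3, 5, 8])

print(S)            # left column: index, right column: data
print(S.index)      # the default integer index
print(S.values)     # the underlying data as a NumPy array
print(S.dtype)      # element type of the Series
```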
Program-2
We can directly access the index and the values of our Series S:
Program-3
If we compare this to creating an array in numpy, we will find lots of similarities:
So far our Series have not been very different to ndarrays of Numpy. This changes, as soon as we start
defining Series objects with individual indices:
Program-4
OUTPUT:
apples 37
oranges 46
cherries 83
pears 42
dtype: int64
sum of S: 115
Program-6
The indices do not have to be the same for the Series addition. The index will be the "union" of both indices.
If an index doesn't occur in both Series, the value for this Series will be NaN:
fruits = ['peaches', 'oranges', 'cherries', 'pears']
fruits2 = ['raspberries', 'oranges', 'cherries', 'pears']
OUTPUT:
cherries 83.0
oranges 46.0
peaches NaN
pears 42.0
raspberries NaN
dtype: float64
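The union behaviour can be reproduced with a sketch like the following. The two index lists are the ones shown above; the quantities are assumed values chosen to be consistent with the printed sums.

```python
import pandas as pd

fruits = ['peaches', 'oranges', 'cherries', 'pears']
fruits2 = ['raspberries', 'oranges', 'cherries', 'pears']

# Assumed quantities for illustration
S = pd.Series([20, 33, 52, 10], index=fruits)
S2 = pd.Series([17, 13, 31, 32], index=fruits2)

# Labels present in only one Series ('peaches', 'raspberries') give NaN
total = S + S2
print(total)
```

Because NaN is a floating-point value, the result is stored as float64 even though both inputs hold integers.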
Program-7
In principle, the indices can be completely different, as in the following example. We have two indices. One
is the Turkish translation of the English fruit names:
fruits = ['apples', 'oranges', 'cherries', 'pears']
OUTPUT:
apples NaN
Program-8
Indexing
It's possible to access single values of a Series.
print(S['apples'])
OUTPUT:
20
• Matplotlib Library
Pyplot is a module of Matplotlib which provides simple functions to add plot elements like lines, images,
text, etc. to the current axes in the current figure.
plot(x-axis values, y-axis values) - plots a simple line graph with x-axis values against y-axis values
show() - displays the graph
title("string") - sets the title of the plot as specified by the string
xlabel("string") - sets the label for the x-axis as specified by the string
ylabel("string") - sets the label for the y-axis as specified by the string
figure() - used to control figure-level attributes
subplot(nrows, ncols, index) - adds a subplot to the current figure
suptitle("string") - adds a common title to the figure as specified by the string
subplots(nrows, ncols, figsize) - a convenient way to create subplots in a single call. It returns a tuple
of a figure and an array of axes.
set_title("string") - an axes-level method used to set the title of subplots in a figure
bar(categorical variables, values, color) - used to create vertical bar graphs
barh(categorical variables, values, color) - used to create horizontal bar graphs
legend(loc) - used to add a legend to the graph
xticks(index, categorical variables) - gets or sets the current tick locations and labels of the x-axis
pie(value, categorical variables) - used to create a pie chart
Here we import Matplotlib's Pyplot module and the Numpy library, as most of the data that we will be working
with will be in the form of arrays.
Program-1
We pass two arrays as our input arguments to Pyplot's plot() method and use the show() method to display the
required plot. Note that the first array appears on the x-axis and the second array appears on the y-axis of
the plot. Now that our first plot is ready, let us add the title, and name the x-axis and y-axis using the methods title(),
xlabel() and ylabel() respectively.
Program-3
We can also specify the size of the figure using the method figure() and passing the values as a tuple of
(width, height) in inches to the argument figsize.
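The calls described above can be combined in a short sketch. The data, labels and figure size are illustrative, and the non-interactive "Agg" backend is selected only so that the sketch runs without a display; in an interactive session that line can be dropped and plt.show() called at the end.

```python
import matplotlib
matplotlib.use("Agg")              # assumption: no display available
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4])         # first array: x-axis values
y = x ** 2                         # second array: y-axis values

fig = plt.figure(figsize=(6, 4))   # (width, height) in inches
plt.plot(x, y)
plt.title("Square numbers")
plt.xlabel("x")
plt.ylabel("y")
# plt.show() would display the figure when an interactive backend is used
```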
EXERCISE:
1. Write a Python program to declare two Series and assign index names to them. Use the division operator to divide one
Series by the other. The output must contain at least one NaN value and one Inf value.
2. Write a Python program that takes some values as (x, y) coordinates and plots them as a line graph. The
color of the line should be red.
Week-3
1. Creation and loading different datasets in Python
Program-1
Method-I
B.Tech – CSE (Computational intelligence) R-20
Program-2
Method-II:
2. Write a Python program to compute the Mean, Median, Mode, Variance and Standard Deviation
of a dataset
• Measures of spread
These functions from Python's statistics module calculate a measure of how much the population or sample tends to deviate
from the typical or average values.
pstdev(): population standard deviation of data.
pvariance(): population variance of data.
stdev(): sample standard deviation of data.
variance(): sample variance of data.
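The functions listed above can be tried out as in the sketch below, which also uses mean(), median() and mode() from the same statistics module. The data list is an assumed sample, not taken from any dataset used in this manual.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]    # illustrative sample

# Measures of central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of spread
pvar = statistics.pvariance(data)   # population variance: divides by n
pstd = statistics.pstdev(data)      # square root of the population variance
svar = statistics.variance(data)    # sample variance: divides by n - 1
sstd = statistics.stdev(data)       # square root of the sample variance

print("mean:", mean, "median:", median, "mode:", mode)
print("population variance:", pvar, "population std dev:", pstd)
print("sample variance:", svar, "sample std dev:", sstd)
```

The sample versions divide by n - 1 instead of n, so they come out slightly larger than the population versions for the same data.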
Program-1
Program-2
Program-3
Program-4
Program-5
Demonstrate various data pre-processing techniques for a given dataset. Write a Python
program to perform
a) Reshaping the data,
b) Filtering the data,
c) Merging the data
d) Handling the missing values in datasets
e) Feature Normalization: Min-max normalization
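Of the techniques listed above, min-max normalization (item e) rescales each feature to the range [0, 1] using x' = (x - min) / (max - min). A minimal sketch with NumPy follows; the function name min_max_normalize and the sample array are assumptions made for this example.

```python
import numpy as np


def min_max_normalize(X):
    """Rescale each column of X to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    span = col_max - col_min
    span[span == 0] = 1.0           # constant columns: avoid division by zero
    return (X - col_min) / span


features = np.array([[1.0, 200.0],
                     [2.0, 400.0],
                     [3.0, 600.0]])  # illustrative data
print(min_max_normalize(features))
```

Each column is scaled independently, so features measured in very different units end up on the same scale.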
Program-1
Reshaping the data:
Method-I
Program-2
Method-II
Assigning the data:
Program-3
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING LAB MANUAL 2022-2023
Program-1
Program-2
Program-3
Merge data:
The merge operation combines two data frames on a common field to bring raw data into the desired format.
Syntax:
pd.merge(data_frame1, data_frame2, on="field")
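The syntax above can be tried on two small data frames as sketched below. The frame names, column names and values are hypothetical and chosen only to show how rows are matched on the common field.

```python
import pandas as pd

# Two illustrative data frames sharing the column "id"
students = pd.DataFrame({"id": [1, 2, 3], "name": ["Asha", "Ravi", "Meena"]})
marks = pd.DataFrame({"id": [1, 2, 4], "score": [85, 72, 90]})

# Rows are matched on "id"; by default only ids present in both frames are kept
merged = pd.merge(students, marks, on="id")
print(merged)
```

Here ids 3 and 4 are dropped because each appears in only one of the two frames.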
Program-4
First type of data:
Program-5
Second type of data:
Program-6
Program-2
To check for null values in a Pandas DataFrame, we use the isnull() function. This function
returns a DataFrame of Boolean values which are True for NaN values.
Program-3
To check for non-null values in a Pandas DataFrame, we use the notnull() function. This function
returns a DataFrame of Boolean values which are False for NaN values.
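The two checks can be compared side by side as in the sketch below. The data frame is an assumed example with one missing entry in each column.

```python
import numpy as np
import pandas as pd

# Illustrative data frame with missing entries
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", None, "z"]})

null_mask = df.isnull()        # True where a value is missing
present_mask = df.notnull()    # True where a value is present

print(null_mask)
print(present_mask)
print(df.isnull().sum())       # number of missing values per column
```

The two results are exact opposites of each other, so either one can be used to locate missing values.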
Program-4
Program-5
Program-6
Program-7
Method-I
Drop Columns with Missing Values
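One way to drop columns with missing values is the dropna() method with axis=1, sketched below. The column names and values are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "area": [1200.0, 1500.0, 900.0],
    "price": [50.0, np.nan, 35.0],
    "rooms": [3, 4, 2],
})  # illustrative data

# axis=1 drops every column that contains at least one NaN
without_nan_columns = df.dropna(axis=1)
print(without_nan_columns.columns.tolist())

# axis=0 (the default) drops the rows that contain a NaN instead
without_nan_rows = df.dropna()
print(len(without_nan_rows))
```

dropna() returns a new data frame and leaves the original df unchanged.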
Program-8
Method-II
fillna() lets the user replace NaN values with a value of their own
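A minimal sketch of fillna() is given below, first with a single replacement value and then with a separate value for each column. The data and the replacement values are assumed for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"score": [10.0, np.nan, 30.0],
                   "hours": [np.nan, 2.0, 4.0]})  # illustrative data

# Replace every missing value with one constant
filled_zero = df.fillna(0)

# Or give a separate replacement value for each column
filled_per_column = df.fillna({"score": -1.0, "hours": 0.5})
print(filled_zero)
print(filled_per_column)
```

Like dropna(), fillna() returns a new data frame unless the result is assigned back to df.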
Program-9
Program-10
Filling missing values with mean
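Filling with the mean can be sketched by passing the column means to fillna(), as below. The column names and values are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"height": [150.0, np.nan, 170.0],
                   "weight": [50.0, 60.0, np.nan]})  # illustrative data

# df.mean() gives one mean per column, ignoring NaN;
# fillna() then fills each column with its own mean
filled = df.fillna(df.mean())
print(filled)
```

Each missing entry is replaced by the mean of the values that are present in the same column.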
Program-11
Filling missing values in csv files:
df = pd.read_csv(r'E:\mldatasets\Machine_Learning_Data_Preprocessing_Python-master\Sample_real_estate_data.csv', na_values='NAN')
Program-12
Program-13
Code:
missing_value = ["n/a", "na", "--"]
data1 = pd.read_csv(r'E:\mldatasets\Machine_Learning_Data_Preprocessing_Python-master\Sample_real_estate_data.csv', na_values=missing_value)
df = data1
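The effect of na_values can be checked without the CSV file used above. The sketch below reads a small in-memory CSV instead; its contents are invented for illustration and are not the real estate data.

```python
import io
import pandas as pd

# In-memory stand-in for a CSV file (illustrative contents)
csv_text = """street,beds,price
Main Road,3,100
Lake View,n/a,--
Hill Top,na,250
"""

# Strings that should be treated as missing values while reading
missing_value = ["n/a", "na", "--"]
df = pd.read_csv(io.StringIO(csv_text), na_values=missing_value)

print(df)
print(df.isnull().sum())   # missing values per column
```

Every cell that matches one of the listed strings is read as NaN, so isnull() can then find it.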
Exercise programs:
1. Load two standard ML datasets by using Method II and III shown in the above examples.
2. Write a Python program to compute the Mean, Median, Mode, Variance and Standard Deviation using
the first 5 or more rows from the Iris dataset.
3. Load a real dataset (For example, Iris). Then apply Min-max normalization on the features of
the dataset.
Week-4
Write a program to demonstrate the working of the decision tree based ID3 algorithm by
considering a dataset.
Decision Tree: A decision tree consists of a root node, interior nodes, and leaf
nodes, which are connected by branches. The main idea of decision trees (ID3) is to find
those descriptive features which contain the most "information" regarding the target feature and
then split the dataset along the values of these features such that the target feature values for the
resulting sub-datasets are as pure as possible. The descriptive feature that leaves the target
feature values purest is said to be the most informative one. This process of finding the "most
informative" feature is repeated until a stopping criterion is met, at which point we end
up in so-called leaf nodes. Information gain is a measure of how well a descriptive feature is
suited to split a dataset. To be able to calculate the information gain, we first have to introduce
the entropy of a dataset. The entropy of a dataset measures its impurity, and we will use this
measure of informativeness in our calculations.
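The two measures can be sketched in a few lines of Python. The entropy of a set of labels is H = -sum(p_i * log2(p_i)) over the class proportions p_i, and the information gain of a feature is the entropy of the labels minus the weighted entropy of the subsets produced by splitting on that feature. The function names and the tiny dataset below are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


# Tiny illustrative dataset (hypothetical values)
outlook = ["sunny", "sunny", "rain", "rain"]
windy = ["yes", "no", "yes", "no"]
play = ["no", "no", "yes", "yes"]

print(information_gain(outlook, play))  # outlook separates the classes completely
print(information_gain(windy, play))    # windy tells us nothing about the class
```

ID3 would choose outlook as the feature to split on here, because it has the larger information gain.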
#ID3 from sklearn
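A hedged sketch using scikit-learn is given below. Note that scikit-learn does not provide ID3 itself: its DecisionTreeClassifier implements an optimized version of CART, so setting criterion="entropy" only approximates ID3's information-gain based splitting. The Iris dataset, the split ratio and the random seed are assumed choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)   # split ratio and seed are illustrative

# criterion="entropy" chooses splits by information gain, in the spirit of ID3
tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
tree.fit(X_train, y_train)

accuracy = accuracy_score(y_test, tree.predict(X_test))
print("Accuracy:", accuracy)
```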
Random Forest: The random forest classifier builds a set of decision trees, each from a randomly
selected subset of the training set. It collects the votes from the different decision trees to decide the
final prediction.
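A minimal sketch with scikit-learn's RandomForestClassifier follows. The Iris dataset, the number of trees, the split ratio and the random seed are assumed values for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)   # split ratio and seed are illustrative

# n_estimators is the number of decision trees in the forest
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

accuracy = forest.score(X_test, y_test)     # fraction of correct predictions
print("Trees:", len(forest.estimators_), "Accuracy:", accuracy)
```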
Exercise:
a) Apply ID3 on a different dataset.
b) Apply Random forest by varying the number of trees to 50, 100, 200, 500 and analyze the
variation in the accuracies obtained.
Write a Python program to implement Simple Linear Regression and plot the graph.
Linear Regression: Linear regression is an algorithm that models a linear relationship between
an independent variable and a dependent variable in order to predict the outcome of future events. It is a statistical
method used in data science and machine learning for predictive analysis. Linear regression is a supervised
learning algorithm that fits a mathematical relationship between variables and makes predictions for
continuous or numeric variables such as sales, salary, age, product price, etc.
Program:
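One possible sketch is shown below. It is not the manual's original listing: it fits the line y = slope * x + intercept with the closed-form least-squares formulas using NumPy, and the function name and the data values are hypothetical.

```python
import numpy as np


def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Illustrative data: years of experience vs. salary (hypothetical values)
experience = [1, 2, 3, 4, 5]
salary = [30, 35, 45, 50, 60]

slope, intercept = fit_simple_linear_regression(experience, salary)
predicted = slope * 6 + intercept    # prediction for a new x value
print(slope, intercept, predicted)
```

To plot the graph, the data points and the fitted line slope * x + intercept can be passed to Pyplot's plot() function.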
Write a Python program to implement Logistic Regression for iris using sklearn
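A minimal sketch using scikit-learn's LogisticRegression on the Iris dataset follows. The split ratio, the random seed and the max_iter value are assumed choices, not taken from the manual.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)   # split ratio and seed are illustrative

# max_iter is raised above the default to help the solver converge
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
matrix = confusion_matrix(y_test, y_pred)   # rows: true class, columns: predicted
accuracy = accuracy_score(y_test, y_pred)
print(matrix)
print("Accuracy:", accuracy)
```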
Exercise:
a) Implement Simple Linear Regression on a different dataset and plot the graph.
b) Implement Logistic Regression on a different dataset and plot the confusion matrix.
Week-6
# k-nearest neighbours (KNN) classification on the Iris dataset
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

# Load the features X and the class labels y
X, y = datasets.load_iris(return_X_y=True)

# Hold out 40% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.40)

# Standardize the features using statistics from the training set only
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Train a classifier with k = 1 and predict the test labels
classifier = KNeighborsClassifier(n_neighbors=1)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Evaluate the predictions
result = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(result)
result1 = classification_report(y_test, y_pred)
print("Classification Report:")
print(result1)
result2 = accuracy_score(y_test, y_pred)
print("Accuracy:", result2)
Program:
Exercise:
a) Write programs to implement KNN for k=3,5,7,11 and compare the results.
b) Write programs to implement the Linear, Polynomial and RBF kernels for SVM on the Iris dataset and compare
the results.
c) Vary the number of clusters k as follows on the Iris dataset and compare the results. Remove the class labels (y)
from the dataset as pre-processing.
i. 1
ii. 3
iii. 5
iv. 7
v. 11