Dhana Doc 1
Submitted by
KANDULA DHANAVANTH
Roll No:216N1A0527
Department of COMPUTER SCIENCE AND ENGINEERING
SRINIVASA INSTITUTE OF ENGINEERING AND TECHNOLOGY
(UGC-Autonomous)
(Approved by AICTE, New Delhi & Permanently Affiliated to JNTUK, Kakinada)
(An ISO 9001:2015 Certified Institute, Accredited by NAAC with ‘A’ Grade) NH-216,
Amalapuram-Kakinada Highway, Cheyyeru (V), Amalapuram.
(2024-2025)
CERTIFICATE
This is to certify that KANDULA DHANAVANTH, Reg. No. 216N1A0527, has completed his/her
Internship in AICTE on Machine Learning in partial fulfilment of the requirements for the
Degree of Bachelor of Technology in the Department of Computer Science and Engineering for the
academic year 2024-2025.
Principal
CONTENTS
S.No    Contents
1       Certificate
2       Acknowledgement
3       Abstract
4       Internship Activities
5       Week-1
6       Week-2
7       Week-3
8       Week-4
9       Week-5
10      Week-6
11      Week-7
12      Week-8
13      Results & Discussions
14      Conclusion
INTERNSHIP CERTIFICATE :
ACKNOWLEDGEMENT
Our sincere gratitude goes to P. Chaitanya, our Internship Coordinator, whose constant
support, valuable feedback, and motivating presence steered us through the challenges
we encountered during the project. His leadership played a critical role in the successful
completion of our internship.
I would also like to express my sincere gratitude to my mentor, Dr. K. Vijay Kumar, for
their invaluable guidance, mentorship, and support throughout the internship. Their
expertise and encouragement were instrumental in my learning and growth during this
experience. I am truly grateful for their dedication and commitment to my success.
I am deeply indebted to Mrs. V. Sai Priya, Head of the Department, for her guidance and
for ensuring we had access to the necessary resources and support throughout the
internship. Her encouragement has been a driving force in our progress.
KANDULA DHANAVANTH
(216N1A0527)
ABSTRACT
The Machine Learning using Python Virtual Internship is a comprehensive program designed to equip
participants with essential skills in Machine Learning and Python. This internship spans several weeks,
offering a blend of theoretical knowledge and practical experience through hands-on projects and
collaborative learning.
Machine learning has revolutionized the way we approach data-driven problems, enabling computers to learn
from data and make predictions or decisions without explicit programming. Python, with its rich ecosystem
of libraries and tools, has become the de facto language for implementing machine learning algorithms.
Whether you’re new to the field or looking to expand your skills, understanding the fundamentals of machine
learning and how to apply them using Python is essential.
Throughout the program, interns engage in real-world projects, applying their skills to solve complex
problems and gain insights into the ML lifecycle. Mentorship and feedback sessions enhance the learning
experience, culminating in a final presentation of project findings.
In partnership with EduSkills, a large-scale machine learning skilling program has been initiated, aimed at
training over 2,000 educators and 5,000 students across India. This virtual internship, endorsed by AICTE,
offers students a unique opportunity to gain practical, outcome-driven skills in machine learning, enhancing
their career prospects in today’s tech-driven world. Through the internship, participants gain the knowledge
and tools to confidently design, deploy, and manage machine learning solutions.
INTERNSHIP ACTIVITIES
List of Tables:
Week-2: Class, Objects and Constructors, Inheritance, Introduction to NumPy
WEEK-1
INTRODUCTION TO PYTHON
Python is a widely used high-level, general-purpose, interpreted, dynamic programming
language. Its design philosophy emphasizes code readability, and its syntax allows
programmers to express concepts in fewer lines of code than would be possible in languages
such as C++ or Java. The language provides constructs intended to enable clear programs on
both a small and large scale.
Python supports multiple programming paradigms, including object-oriented,
imperative, functional and procedural styles. It features a dynamic type system
and automatic memory management and has a large and comprehensive standard library.
Python interpreters are available for installation on many operating systems, allowing Python
code execution on a wide variety of systems.
VARIABLES
Variables are nothing but reserved memory locations to store values. This means that when
you create a variable you reserve some space in memory.
Based on the data type of a variable, the interpreter allocates memory and decides what can be
stored in the reserved memory. Therefore, by assigning different data types to variables, you
can store integers, decimals or characters in these variables.
EX:
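A minimal illustration of the paragraph above. The variable names and values are hypothetical and chosen only for this sketch; they are not taken from the internship material.

```python
# Variable names and values here are illustrative assumptions.
count = 10            # an integer
price = 99.5          # a floating-point (decimal) number
name = "Python"       # a string of characters

# type() reports the data type of the value each name refers to.
print(type(count).__name__)   # int
print(type(price).__name__)   # float
print(type(name).__name__)    # str

# Python is dynamically typed: a name can later be rebound
# to a value of a different type.
count = "ten"
print(type(count).__name__)   # str
```

No type declaration is needed; the interpreter decides how to store each value from the value itself.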
DATA TYPES
CONDITIONAL STATEMENTS
Statement Description
LOOPS IN PYTHON
Programming languages provide various control structures that allow for more
complicated execution paths.
Loop type       Description
nested loops    You can use one or more loops inside another while or for loop
                (Python has no do..while loop).
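The control structures named in this section can be sketched as follows. This is a small example written for illustration; the variable names are assumptions.

```python
# for loop: iterate over a sequence of values.
squares = []
for n in range(1, 6):
    squares.append(n * n)

# while loop: repeat as long as a condition holds.
total = 0
i = 1
while i <= 5:
    total += i
    i += 1

# Nested loops with a conditional statement:
# collect the pairs (a, b) whose sum is even.
even_pairs = []
for a in range(1, 4):
    for b in range(1, 4):
        if (a + b) % 2 == 0:
            even_pairs.append((a, b))

print(squares)      # [1, 4, 9, 16, 25]
print(total)        # 15
print(even_pairs)   # [(1, 1), (1, 3), (2, 2), (3, 1), (3, 3)]
```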
WEEK-2
CLASS, OBJECTS AND CONSTRUCTORS
PYTHON CLASS
A class is a blueprint for objects. Just as function definitions begin with the def keyword in
Python, class definitions begin with the class keyword.
class class_name:
    pass
Syntax: obj_1 = class_name()
PYTHON CONSTRUCTORS
Like methods, a constructor contains a collection of statements that are executed at the
time of object creation. It runs as soon as an object of a class is instantiated. The method is
useful for any initialization you want to do with your object.
Constructors are generally used when instantiating an object. The task of a constructor is to
initialize the data members of the class when an object is created. In Python the __init__()
method is called the constructor and is always called when an object is created.
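A short sketch of a class with a constructor. The class name `Student` and its attributes are hypothetical, introduced only to show how `__init__()` initializes data members.

```python
class Student:
    """Hypothetical class used only to illustrate __init__()."""

    def __init__(self, name, roll_no):
        # Runs automatically when Student(...) is called.
        self.name = name
        self.roll_no = roll_no

    def describe(self):
        # An ordinary method that uses the initialized data members.
        return f"{self.name} ({self.roll_no})"


s = Student("Asha", "R001")   # __init__ is invoked here
print(s.describe())           # Asha (R001)
```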
INHERITANCE IN PYTHON
Inheritance is the capability of one class to derive or inherit the properties from another
class.
TYPES OF INHERITANCE
Single inheritance:
Single inheritance enables a derived class to inherit properties from a single parent class,
thus enabling code reusability and the addition of new features to existing code.
Multiple inheritance:
When a class is derived from more than one base class, this type of inheritance is
called multiple inheritance. In multiple inheritance, all the features of the base classes are
inherited into the derived class.
Hierarchical inheritance:
When more than one derived class is created from a single base class, this type of
inheritance is called hierarchical inheritance. For example, a parent class may have two
child classes.
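The three kinds of inheritance described above can be sketched in a few lines. The class names (`Animal`, `Dog`, `Cat`, `Swimmer`, `Duck`) are illustrative assumptions, not part of the original material.

```python
class Animal:                      # base (parent) class
    def speak(self):
        return "some sound"


class Dog(Animal):                 # single inheritance: one parent class
    def speak(self):               # adds new behaviour by overriding
        return "woof"


class Cat(Animal):                 # hierarchical: Dog and Cat share the parent Animal
    pass


class Swimmer:                     # a second, unrelated base class
    def swim(self):
        return "swimming"


class Duck(Animal, Swimmer):       # multiple inheritance: two parent classes
    pass


print(Dog().speak())    # woof
print(Cat().speak())    # some sound (inherited unchanged)
print(Duck().swim())    # swimming (inherited from Swimmer)
```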
INTRODUCTION TO NUMPY
NumPy (Numerical Python) is an open-source Python library that’s used in almost every field
of science and engineering. It’s the universal standard for working with numerical data in
Python, and it’s at the core of the scientific Python and PyData ecosystems. NumPy users
include everyone from beginning coders to experienced researchers doing state-of-the-art
scientific and industrial research and development. The NumPy API is used extensively in
pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific
Python packages.
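A brief sketch of typical NumPy usage, assuming NumPy is installed. The array contents are arbitrary example values.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # a 2 x 3 array

print(a.shape)          # (2, 3)
print(a * 2)            # element-wise multiplication, no explicit loop
print(a.sum(axis=0))    # column sums: [5 7 9]
print(a.mean())         # mean of all elements: 3.5
```

Operations apply to whole arrays at once, which is why NumPy arrays are usually preferred over nested Python lists for numerical work.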
WEEK-3
INTRODUCTION TO MACHINE LEARNING
INTRODUCTION
Machine Learning is the science of getting computers to learn without being explicitly
programmed. It is closely related to computational statistics, which focuses on making
predictions using computers. In its application across business problems, machine learning is
also referred to as predictive analytics.
Machine Learning focuses on the development of computer programs that can access data and
use it to learn for themselves. The process of learning begins with observations or data, such as
examples, direct experience, or instruction, in order to look for patterns in data and make better
decisions in the future based on the examples that we provide. The primary aim is to allow
computers to learn automatically without human intervention or assistance and adjust actions
accordingly.
TYPES OF MACHINE LEARNING
The types of machine learning algorithms differ in their approach, the type of data they
input and output, and the type of task or problem that they are intended to solve. Broadly,
Machine Learning can be categorized into three categories.
I. Supervised Learning
II. Unsupervised Learning
III. Reinforcement Learning
Supervised Learning:
Supervised Learning is a type of learning in which we are given a data set and we
already know what our correct output should look like, having the idea that there is a
relationship between the input and output. Basically, it is the task of learning a function
that maps an input to an output based on example input-output pairs. It infers a function from
labeled training data consisting of a set of training examples. Supervised learning problems are
categorized into regression and classification problems.
Unsupervised Learning:
Unsupervised Learning is a type of learning that allows us to approach problems with
little or no idea what our results should look like. We can derive the structure by clustering
the data based on relationships among the variables in the data. With unsupervised learning there
is no feedback based on the prediction results. Basically, it is a type of self-organized learning
that helps in finding previously unknown patterns in a data set without pre-existing labels.
Reinforcement Learning:
Reinforcement learning is a learning method that interacts with its environment by
producing actions and discovers errors or rewards. Trial and error search and delayed reward
are the most relevant characteristics of reinforcement learning. This method allows machines
and software agents to automatically determine the ideal behaviour within a specific context in
order to maximize its performance. Simple reward feedback is required for the agent to learn
which action is best.
WEEK-4
LINEAR AND LOGISTIC REGRESSION
LINEAR REGRESSION
Linear regression is a supervised machine learning algorithm that takes continuous
features and predicts a continuous outcome. Depending on whether it runs on a
single variable or on many features, we call it simple linear regression or multiple linear
regression.
It is one of the most popular ML algorithms in Python and is often under-appreciated.
It assigns optimal weights to the variables to create a line a*X+b that predicts the output. We
often use linear regression to estimate real values, such as the number of calls or the cost of
houses, based on continuous variables. The regression line is the best-fitting line Y=a*X+b,
which denotes the relationship between the independent and dependent variables.
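As a minimal sketch of how the line Y=a*X+b can be fitted, the example below uses the ordinary least-squares formulas for simple linear regression in plain Python. The function name `fit_line` and the sample data are assumptions made for illustration; in practice a library such as scikit-learn would usually be used.

```python
def fit_line(xs, ys):
    """Return slope a and intercept b of the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical data that lies exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

a, b = fit_line(xs, ys)
print(a, b)            # 2.0 1.0
print(a * 5.0 + b)     # prediction for x = 5: 11.0
```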
LOGISTIC REGRESSION
Logistic regression is a supervised classification algorithm in Machine Learning
that is used to estimate discrete values like 0/1, yes/no, and
true/false. This is based on a given set of independent variables. We use a logistic function to
predict the probability of an event, and this gives us an output between 0 and 1. Although it says
‘regression’, this is actually a classification algorithm. Logistic regression fits data into a logit
function and is also called logit regression.
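The logistic function mentioned above can be sketched as follows. The weights `a` and `b` passed in the usage lines are hypothetical values, not fitted parameters, and the 0.5 threshold is a common default assumed here.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, a, b):
    """Probability of the positive class for a single feature x."""
    return sigmoid(a * x + b)


def predict_label(x, a, b, threshold=0.5):
    """Turn the probability into a discrete 0/1 decision."""
    return 1 if predict_proba(x, a, b) >= threshold else 0


# Hypothetical weights, chosen only for illustration.
print(sigmoid(0))                      # 0.5
print(predict_label(0.0, 1.5, -3.0))   # 0
print(predict_label(4.0, 1.5, -3.0))   # 1
```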
DECISION TREE
A decision tree is a supervised machine learning algorithm that is used for both
classification and regression, although mostly for classification. This
model takes an instance, traverses the tree, and compares important features with a determined
conditional statement. Whether it descends to the left child branch or the right depends on the
result. Usually, more important features are closer to the root. Decision Tree, a Machine
Learning algorithm in Python can work on both categorical and continuous dependent
variables. Here, we split a population into two or more homogeneous sets. Tree models where
the target variable can take a discrete set of values are called classification trees; in these tree
structures, leaves represent class labels and branches represent conjunctions of features that
lead to those class labels. Decision trees where the target variable can take continuous values
(typically real numbers) are called regression trees.
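The traversal described above, where an instance descends left or right depending on a condition at each node, can be sketched with a small hand-built classification tree. The features (`age`, `income`), thresholds and labels are hypothetical; a real tree would be learned from data.

```python
# A tiny hand-built classification tree (hypothetical features and thresholds).
# Internal nodes test one feature; leaves carry a class label.
tree = {
    "feature": "age", "threshold": 30,
    "left": {"label": "no"},                       # taken when age <= 30
    "right": {                                     # taken when age > 30
        "feature": "income", "threshold": 50000,
        "left": {"label": "no"},                   # income <= 50000
        "right": {"label": "yes"},                 # income > 50000
    },
}


def classify(node, instance):
    """Walk from the root to a leaf and return the leaf's class label."""
    while "label" not in node:
        if instance[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["label"]


print(classify(tree, {"age": 25, "income": 80000}))   # no
print(classify(tree, {"age": 40, "income": 60000}))   # yes
```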
WEEK-5
APPLICATIONS OF MACHINE LEARNING
APPLICATIONS:
Machine learning is one of the most exciting technologies that one could
come across. As is evident from the name, it gives the computer the ability that makes it more
similar to humans: the ability to learn. Machine learning is actively being used today, perhaps
in many more places than one would expect. We probably use a learning algorithm dozens of
times a day without even knowing it. Applications of Machine Learning include:
• Web Search Engine: One of the reasons why search engines like Google, Bing, etc. work
so well is that the system has learnt how to rank pages through a complex learning
algorithm.
• Photo tagging Applications: Be it Facebook or any other photo tagging application,
the ability to tag friends makes it more engaging. It is all possible because of a
face recognition algorithm that runs behind the application.
• Spam Detector: Our mail agent like Gmail or Hotmail does a lot of hard work for us
in classifying the mails and moving the spam mails to spam folder. This is again
achieved by a spam classifier running in the back end of mail application.
• Image Recognition: Image recognition is one of the most common applications of
machine learning. It is used to identify objects, persons, places, digital images, etc. The
popular use case of image recognition and face detection is, Automatic friend tagging
suggestion.
• Traffic prediction: If we want to visit a new place, we take help of Google Maps, which
shows us the correct path with the shortest route and predicts the traffic condition.
• Self-driving cars: One of the most exciting applications of machine learning is
self-driving cars. Machine learning plays a significant role in self-driving cars. Tesla, a
well-known car manufacturer, is working on self-driving cars. It uses
machine learning methods to train the car models to detect people and objects while driving.
CONFUSION MATRIX
A confusion matrix summarizes the performance of a machine learning model on a set of test
data. It is a means of displaying the number of accurate and inaccurate instances based on the
model’s predictions. It is often used to measure the performance of classification models, which
aim to predict a categorical label for each input instance.
The matrix displays the number of instances produced by the model on the test data.
• True Positive (TP): The model correctly predicted a positive outcome (the actual
outcome was positive).
• True Negative (TN): The model correctly predicted a negative outcome (the actual
outcome was negative).
• False Positive (FP): The model incorrectly predicted a positive outcome (the actual
outcome was negative). Also known as a Type I error.
• False Negative (FN): The model incorrectly predicted a negative outcome (the actual
outcome was positive). Also known as a Type II error.
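The four counts defined above can be computed directly from the true and predicted labels. The sketch below uses plain Python; the function name `confusion_counts` and the sample labels are assumptions for illustration. Accuracy, precision and recall are then derived from the counts.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification task."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1          # correctly predicted positive
        elif predicted != positive and actual != positive:
            tn += 1          # correctly predicted negative
        elif predicted == positive and actual != positive:
            fp += 1          # Type I error
        else:
            fn += 1          # Type II error
    return tp, tn, fp, fn


# Hypothetical test labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(tp, tn, fp, fn)                 # 3 3 1 1
print(accuracy, precision, recall)    # 0.75 0.75 0.75
```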
WEEK-6
K-NEAREST NEIGHBORS
KNN ALGORITHM
k-NN is a supervised machine learning algorithm used for classification and regression, though
mostly for classification. It stores the training points and uses a distance function, usually
Euclidean, to compare a new point with them. It classifies a new case by a majority vote of its
k nearest neighbors: the class assigned to the case is the one most common among those k
neighbors. k-NN is a type of instance-based learning, or lazy learning, where the function is
only approximated locally and all computation is deferred until classification. k-NN is a special
case of a variable-bandwidth, kernel density "balloon" estimator with a uniform kernel.
The K-NN working can be explained on the basis of the below algorithm:
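As a minimal sketch (not the original listing), the usual k-NN classification steps can be written in plain Python: compute distances, keep the k nearest points, and take a majority vote. The function name `knn_predict` and the sample points are assumptions made for illustration; `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 1: Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 2: keep the k closest points.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: majority vote over their labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-class data: class "A" near (1, 1), class "B" near (6, 6).
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (2, 2)))   # A
print(knn_predict(points, labels, (6, 5)))   # B
```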
JUPYTER NOTEBOOK INTRO
Jupyter Notebook is a notebook authoring application, under the Project
Jupyter umbrella. Built on the power of the computational notebook format, Jupyter Notebook
offers fast, interactive new ways to prototype and explain your code, explore and visualize your
data, and share your ideas with others.
Notebooks extend the console-based approach to interactive computing
in a qualitatively new direction, providing a web-based application suitable for capturing the
whole computation process: developing, documenting, and executing code, as well as
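communicating the results. The Jupyter notebook combines two components: a web application,
which is a browser-based tool for interactive authoring of documents that combine explanatory
text, mathematics, computations and their rich media output; and notebook documents, which
represent all content visible in the web application, including the inputs and outputs of the
computations.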
WEEK-7
CHRONIC KIDNEY DISEASE ANALYSIS
ANALYSIS:
Chronic kidney disease (CKD) is a non-communicable disease that has significantly
contributed to the morbidity, mortality and admission rate of patients worldwide.
In this week, participants embark on a practical application of machine learning
techniques by analyzing a real-world healthcare dataset focusing on chronic kidney disease.
Participants begin by understanding the importance of healthcare data analysis and the
challenges associated with diagnosing chronic kidney disease. They gain insights into the
structure and characteristics of the dataset, including the various features and the target variable
related to kidney disease diagnosis.
The week emphasizes the data preprocessing stage, where participants learn
techniques for handling missing values, encoding categorical variables, and scaling numerical
features. They understand the significance of data preprocessing in ensuring the quality and
reliability of the machine learning model.
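The three preprocessing steps named above can be sketched in plain Python. The records, the column names (`age`, `appetite`) and the imputation choices (mean for numeric values, most frequent value for categories) are assumptions for illustration and are not taken from the internship dataset; in practice Pandas and Scikit-learn provide equivalent tools.

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"age": 48, "appetite": "good"},
    {"age": None, "appetite": "poor"},
    {"age": 62, "appetite": "good"},
    {"age": 70, "appetite": None},
]

# 1. Handle missing values: the mean for the numeric column,
#    the most frequent value for the categorical column.
ages = [r["age"] for r in rows if r["age"] is not None]
mean_age = sum(ages) / len(ages)
appetites = [r["appetite"] for r in rows if r["appetite"] is not None]
mode_appetite = max(set(appetites), key=appetites.count)
for r in rows:
    if r["age"] is None:
        r["age"] = mean_age
    if r["appetite"] is None:
        r["appetite"] = mode_appetite

# 2. Encode the categorical variable as integers.
encoding = {"poor": 0, "good": 1}
encoded = [encoding[r["appetite"]] for r in rows]

# 3. Scale the numerical feature to the range [0, 1] (min-max scaling).
lo, hi = min(r["age"] for r in rows), max(r["age"] for r in rows)
scaled = [(r["age"] - lo) / (hi - lo) for r in rows]

print(mean_age)   # 60.0
print(encoded)    # [1, 0, 1, 1]
print(scaled)
```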
Following data preprocessing, participants explore different machine learning
algorithms suitable for classification tasks. They learn how to train and evaluate models using
techniques such as cross-validation and performance metrics like accuracy, precision, recall,
and F1-score.
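Cross-validation splits the data into k folds and uses each fold once as the test set while training on the rest. The following sketch only generates the index splits; the function name `k_fold_indices` is an assumption, and no shuffling is applied, which a practical setup would normally add.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k consecutive folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

A model would be trained and scored once per fold, and the k scores averaged to estimate performance.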
Moreover, participants delve into model interpretation and feature importance
analysis to understand the factors contributing to chronic kidney disease diagnosis. They gain
insights into the clinical relevance of the predictive features identified by the model.
Through hands-on exercises and practical examples using Python libraries
such as Pandas, Scikit-learn, and Matplotlib, participants build predictive models for chronic
kidney disease diagnosis. They iteratively refine their models, fine-tune hyperparameters, and
optimize performance to achieve accurate and reliable predictions.
By the end of the week, participants have gained valuable experience in
applying machine learning techniques to healthcare datasets, specifically for chronic kidney
disease analysis. They are equipped with the skills to preprocess healthcare data, build
predictive models, and interpret model results effectively, contributing to improved diagnosis
and patient care in real-world healthcare settings.
HOUSING PRICE PREDICTION USING LINEAR REGRESSION
In this week, participants engage in a practical project focused on predicting
housing prices using linear regression, a foundational machine learning algorithm.
The week begins with an overview of the housing price prediction task, highlighting its
significance in real estate, finance, and urban planning. Participants learn about the key features
affecting housing prices and the importance of data collection and preprocessing in building
accurate prediction models.
Participants delve into the process of collecting housing data from various sources,
including public datasets, real estate websites, and government databases. They explore
techniques for data cleaning, handling missing values, and feature engineering to prepare
the dataset for model training.
WEEK-8
CORRELATION AND COMPUTER VISION
CORRELATION:
Correlation explains how one or more variables are related to each other. These
variables can be input data features which are used to forecast the target variable.
Correlation is a statistical technique which determines how one variable
moves or changes in relation to another variable. It gives us an idea of the degree of the
relationship between the two variables. It is a bivariate analysis measure which describes the
association between different variables. In most business settings it is useful to express one
subject in terms of its relationship with others.
Positive Correlation:
Two features (variables) can be positively correlated with each other. It means
that when the value of one variable increases, the value of the other variable(s) also
increases.
Negative Correlation:
Two features (variables) can be negatively correlated with each other. It means
that when the value of one variable increases, the value of the other variable(s) decreases.
No Correlation:
Two features (variables) are not correlated with each other. It means that when the
value of one variable increases or decreases, the value of the other variable(s) does not
consistently increase or decrease.
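The degree of correlation is commonly measured with the Pearson correlation coefficient, which ranges from -1 to +1. The sketch below computes it in plain Python; the choice of the Pearson coefficient, the function name `pearson_r` and the sample values are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])    # positive correlation, close to +1
r_neg = pearson_r(x, [10, 8, 6, 4, 2])    # negative correlation, close to -1
r_none = pearson_r(x, [3, 1, 4, 1, 3])    # no linear correlation, close to 0

print(r_pos, r_neg, r_none)
```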
COMPUTER VISION:
Computer vision is a fascinating field at the intersection of computer science and
artificial intelligence that enables computers to analyze images or video data, unlocking a
multitude of applications across industries, from autonomous vehicles to facial recognition
systems.
This Computer Vision is designed for both beginners and experienced professionals, covering
both basic and advanced concepts of computer vision, including Digital
RESULTS & DISCUSSIONS
Results:
• Accomplished Tasks:
Discussions:
• Challenges Faced:
CONCLUSION