
SILVER OAK COLLEGE OF COMPUTER APPLICATION

SUBJECT :MACHINE LEARNING

TOPIC : Flow & Tools of machine learning


Machine Learning Workflow: From Data
Collection to Model Deployment
• Machine learning is a powerful approach that enables computers to learn patterns from data and make
predictions or decisions without explicit programming. The machine learning workflow consists of several
key steps that guide the process of building and deploying effective machine learning models. Let's explore
these steps in detail with examples:

• Step 1: Data Collection


- The first step in the machine learning workflow is to gather relevant data that represents the problem
we want to solve.
- Example: Suppose we want to build a machine learning model to predict house prices. We collect
data on various features of houses (e.g., size, number of bedrooms, location) and their corresponding
prices.
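- For illustration, a minimal sketch of this step, assuming the collected house data lives in a hypothetical houses.csv file:

Example:
import pandas as pd

# Load the collected house data (hypothetical file name)
houses = pd.read_csv("houses.csv")
print(houses.head())  # inspect the first few rows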

• Step 2: Data Preprocessing


- Once we have the data, we need to preprocess it to make it suitable for the machine learning
algorithms.
- Example: In the house price prediction example, we might need to handle missing values, scale
numerical features, and encode categorical variables (e.g., convert location names into numerical
values).
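- A minimal preprocessing sketch covering these three tasks (the column names and values here are illustrative, not from a real dataset):

Example:
import pandas as pd
from sklearn.preprocessing import StandardScaler

houses = pd.DataFrame({
    "size": [1200, 1500, None, 2000],
    "bedrooms": [2, 3, 3, 4],
    "location": ["east", "west", "east", "north"],
})

# Handle missing values: fill the missing size with the column mean
houses["size"] = houses["size"].fillna(houses["size"].mean())

# Encode the categorical location column as numeric dummy columns
houses = pd.get_dummies(houses, columns=["location"])

# Scale the numerical features to zero mean and unit variance
scaler = StandardScaler()
houses[["size", "bedrooms"]] = scaler.fit_transform(houses[["size", "bedrooms"]])
print(houses)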
• Step 3: Feature Engineering
- Feature engineering involves selecting or creating the most relevant features that contribute to the
model's performance.
- Example: In the house price prediction, we may derive additional features like the age of the house or
the distance to the nearest school to improve the model's accuracy.
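- A small sketch of deriving such a feature, assuming a hypothetical year_built column:

Example:
import pandas as pd

houses = pd.DataFrame({"year_built": [1995, 2005, 2018],
                       "price": [150, 200, 260]})

# Derived feature: age of the house (using 2024 as the reference year)
houses["age"] = 2024 - houses["year_built"]
print(houses)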
• Step 4: Model Selection
- Model selection is about choosing the appropriate algorithm or model that fits the problem's
requirements.
- Example: For the house price prediction task, we could consider using regression models like linear
regression or decision tree regression.
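- One common way to compare candidate models is cross-validation; a sketch with toy house data (all values are illustrative):

Example:
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

X = [[1200, 2], [1500, 3], [1700, 3], [2000, 4], [2400, 4], [2600, 5]]
y = [150, 200, 210, 260, 300, 330]

# Score each candidate model with 3-fold cross-validation
for model in (LinearRegression(), DecisionTreeRegressor(random_state=0)):
    scores = cross_val_score(model, X, y, cv=3)  # default R^2 scoring for regressors
    print(type(model).__name__, scores.mean())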

• Step 5: Model Training


- Once we select the model, we need to train it on the preprocessed data to learn patterns and
relationships.
- Example: The selected model is trained using the historical data on house features and prices, so it
learns how features influence house prices.
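- A minimal training sketch with the same toy house data:

Example:
from sklearn.linear_model import LinearRegression

X_train = [[1200, 2], [1500, 3], [2000, 4], [2400, 4]]
y_train = [150, 200, 260, 300]

model = LinearRegression()
model.fit(X_train, y_train)           # learn how features influence price
print(model.coef_, model.intercept_)  # the learned relationship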

• Step 6: Model Evaluation


- Model evaluation assesses the performance of the trained model on unseen data to ensure it
generalizes well.
- Example: We split the data into training and testing sets. The model's predictions on the test set are
compared with the actual house prices to measure accuracy or other metrics.
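- A sketch of this split-and-evaluate pattern (toy numbers, for illustration only):

Example:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

X = [[1200, 2], [1500, 3], [1700, 3], [2000, 4], [2400, 4], [2600, 5]]
y = [150, 200, 210, 260, 300, 330]

# Hold out a third of the data for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

model = LinearRegression().fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))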

• Step 7: Hyperparameter Tuning


- Many models have hyperparameters (settings that control the learning process) that need to be fine-tuned for optimal performance.
- Example: In a decision tree model, the maximum depth of the tree is a hyperparameter that affects the model's complexity. We adjust this parameter to improve accuracy.
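- A tuning sketch using scikit-learn's GridSearchCV to search over max_depth (the toy data and grid values are illustrative):

Example:
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import GridSearchCV

X = [[1200, 2], [1500, 3], [1700, 3], [2000, 4], [2400, 4], [2600, 5]]
y = [150, 200, 210, 260, 300, 330]

# Try several depths with 3-fold cross-validation and keep the best
search = GridSearchCV(DecisionTreeRegressor(random_state=0),
                      param_grid={"max_depth": [1, 2, 3, None]},
                      cv=3)
search.fit(X, y)
print(search.best_params_)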
• Step 8: Model Deployment
- After selecting the best model, we deploy it in a real-world environment to make predictions on new
data.
- Example: The trained house price prediction model is deployed as a web application where users can
enter house features, and the model predicts the price.
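- A minimal deployment sketch using Flask (one option among many; the saved-model file name is hypothetical):

Example:
import joblib
from flask import Flask, request, jsonify

# Load a previously trained and saved model (hypothetical file name)
model = joblib.load("house_model.joblib")
app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()  # e.g. {"size": 1500, "bedrooms": 3}
    X = [[features["size"], features["bedrooms"]]]
    return jsonify({"predicted_price": float(model.predict(X)[0])})

if __name__ == "__main__":
    app.run()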

• Step 9: Monitoring and Maintenance


- Once the model is deployed, it is essential to continuously monitor its performance and update it as
needed.
- Example: If the house price prediction model starts making inaccurate predictions due to changing
market trends, we may need to retrain it with new data.
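- A monitoring sketch: track recent prediction error and flag the model for retraining when it drifts (the threshold and numbers are illustrative):

Example:
from sklearn.metrics import mean_absolute_error

recent_actual = [250, 310, 275]     # latest observed sale prices
recent_predicted = [240, 290, 320]  # the model's predictions for the same houses

mae = mean_absolute_error(recent_actual, recent_predicted)
if mae > 25:  # hypothetical alert threshold
    print(f"MAE {mae:.1f} exceeds threshold; retrain with fresh data")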

• Conclusion:
The machine learning workflow involves several iterative steps, starting from data collection to
deploying the trained model for real-world use. Each step is crucial for building accurate and reliable
machine learning models. By following this workflow and continuously improving the model, we can
leverage the power of machine learning to solve a wide range of problems effectively.
• Machine Learning, as the name suggests, is the science of programming computers so that they are able to learn from different kinds of data.
• A more general definition given by Arthur Samuel is – “Machine Learning is the field of study that gives
computers the ability to learn without being explicitly programmed.” It is typically used to solve various
types of real-life problems.
• In the early days, people performed machine learning tasks by manually coding all the algorithms and
mathematical and statistical formulas.
• This made the process time-consuming, tedious, and inefficient. Today, it has become much easier and
more efficient thanks to various Python libraries, frameworks, and modules.
• Python is now one of the most popular programming languages for this task, and it has replaced many
languages in the industry; one of the reasons is its vast collection of libraries.
• Python libraries that are used in Machine Learning are:

• Numpy
• Scikit-learn
• Pandas
• Matplotlib
• NumPy is a very popular Python library for large multi-dimensional array and matrix processing, with the help
of a large collection of high-level mathematical functions. It is very useful for fundamental scientific
computations in Machine Learning, particularly for its linear algebra, Fourier transform, and random
number capabilities. High-end libraries like TensorFlow use NumPy internally for the manipulation of tensors.
• NumPy is a Python library used for working with arrays.
• Functions:
• array() : We can create a NumPy ndarray object by using the array() function.
• concatenate() : This function joins two or more arrays.
• arange() : This function is used to create an array with a range of values.
• array_split() : We use array_split() for splitting arrays; we pass it the array we want to split and the
number of splits.
• sort() : The NumPy ndarray object has a function called sort() that will sort a specified array.
• random.randint() : This function is used to create an array with random integer values within a
specified range.
• max(): This function is used to find the maximum value in an array.
• min(): This function is used to find the minimum value in an array.
• mean(): This function is used to find the mean value of an array.
• median(): This function is used to find the median value of an array.
• dot(): This function is used to find the dot product of two arrays.
• Example:
import numpy as np

# Creating two arrays of rank 2
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

# Creating two arrays of rank 1
v = np.array([9, 10])
w = np.array([11, 12])

# Inner product of vectors
print(np.dot(v, w), "\n")

# Matrix and vector product
print(np.dot(x, v), "\n")

# Matrix and matrix product
print(np.dot(x, y))

# A simple array for the statistics functions
arr = np.array([1, 2, 3])

# Find Median
median_value = np.median(arr)
print(median_value)

# Find Mean
mean_value = np.mean(arr)
print(mean_value)

# Find Min
min_value = np.min(arr)
print(min_value)

# Find Max
max_value = np.max(arr)
print(max_value)
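• A short supplementary sketch of the remaining functions from the list above (concatenate, arange, array_split, sort, random.randint):

Example:
import numpy as np

a = np.arange(1, 7)       # array([1, 2, 3, 4, 5, 6])
b = np.array([10, 9, 8])

print(np.concatenate((a, b)))             # join two arrays
print(np.array_split(a, 3))               # split into 3 sub-arrays
print(np.sort(b))                         # sorted copy: [ 8  9 10]
print(np.random.randint(0, 100, size=5))  # 5 random integers in [0, 100)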
• Scikit-learn is one of the most popular ML libraries for classical ML algorithms. It is built on top of two basic
Python libraries, viz., NumPy and SciPy. Scikit-learn supports most of the supervised and unsupervised learning
algorithms. It can also be used for data mining and data analysis, which makes it a great tool for anyone
starting out with ML.
• Scikit-learn is a machine learning library for Python. It features several regression, classification, and clustering
algorithms, including SVMs, gradient boosting, k-means, random forests, and DBSCAN. It is designed to work with
the Python libraries NumPy and SciPy.
• Functions:
• load_iris() : This function loads the Iris dataset.
• head() : This command returns the first five rows of the Iris Flower dataset.
• train_test_split() : This method splits the data into train and test datasets.
• predict() : This method generates predictions from a fitted model. Scikit-learn's algorithms share a
unified/consistent interface for fitting, predicting, etc.
• fit() : The classifier is trained using X_train data; the process is termed fitting. We pass the feature matrix and
the corresponding response vector.
Example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Create a Dataset
data = {
    'Hours_Studied': [1, 2, 3, 4, 5],
    'Scores': [10, 20, 30, 40, 50]
}
df = pd.DataFrame(data)

# Split the dataset into training and testing sets.
X = df[['Hours_Studied']]
y = df['Scores']
# Use 40% for testing so the test set has more than one sample (R^2 needs at least two)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

# Create and Train the Model
model = LinearRegression()
model.fit(X_train, y_train)

# Make Predictions
y_pred = model.predict(X_test)

# Evaluate the Model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f'Mean Squared Error: {mse}')
print(f'R^2 Score: {r2}')

# Visualize the Results
plt.scatter(X, y, color='blue')
plt.plot(X, model.predict(X), color='red')
plt.xlabel('Hours Studied')
plt.ylabel('Scores')
plt.title('Hours Studied vs Scores')
plt.show()
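• A short sketch of load_iris(), head(), and train_test_split() from the function list above (as_frame requires scikit-learn 0.23 or newer):

Example:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris(as_frame=True)  # load the Iris dataset as a DataFrame
print(iris.frame.head())         # first five rows

X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)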
• Pandas is a popular Python library for data analysis. It is not directly related to Machine Learning, but as we
know, the dataset must be prepared before training. In this case, Pandas comes in handy, as it was developed
specifically for data extraction and preparation. It provides high-level data structures and a wide variety of tools
for data analysis, including many inbuilt methods for grouping, combining, and filtering data.
• It is one of the most used libraries in Python for data science or data analysis. It can read data from CSV or
Excel files, manipulate the data, and generate insights from it. Pandas can also be used to clean data, filter
data, and visualize data.
• Functions:
• read_csv() : This function is used to retrieve data from CSV files in the form of a DataFrame.
• head() : This function is used to return the top n (5 by default) rows of a DataFrame or Series.
• info() : This method is used to generate a summary of the DataFrame, including information about
columns: their names, their datatypes, and missing values.
• describe() : Returns descriptive statistics about the data, like mean, minimum, maximum, standard
deviation, etc.
• isnull() : Returns a DataFrame/Series of boolean values. Missing values get mapped to True and
non-missing values get mapped to False.
• sort_values() : This method sorts the DataFrame in ascending or descending order of the passed column.
• value_counts() : Returns the counts of the unique values in a Series or from a DataFrame's column.
• std() : Returns the sample standard deviation over the requested axis.
Example:
import pandas as pd

data = {"country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
        "population": [200.4, 143.5, 1252, 1357, 52.98]}

data_table = pd.DataFrame(data)
print(data_table)
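• A short sketch of the inspection methods listed above, applied to a similar DataFrame:

Example:
import pandas as pd

data_table = pd.DataFrame({
    "country": ["Brazil", "Russia", "India", "China", "South Africa"],
    "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    "population": [200.4, 143.5, 1252, 1357, 52.98]})

print(data_table.describe())                 # mean, std, min, max, quartiles
print(data_table.isnull().sum())             # missing values per column
print(data_table.sort_values("population"))  # sort rows by a column
print(data_table["country"].value_counts())  # counts of unique values
print(data_table["area"].std())              # sample standard deviation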
• Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.
• Matplotlib is a very popular Python library for data visualization. Like Pandas, it is not directly related to
Machine Learning, but it particularly comes in handy when a programmer wants to visualize the patterns in the
data. It is a 2D plotting library used for creating 2D graphs and plots. A module named pyplot makes plotting
easy for programmers, as it provides features to control line styles, font properties, axis formatting,
etc. It provides various kinds of graphs and plots for data visualization, viz., histograms, error charts, bar
charts, etc.
• Functions:
• plot() : This function is used to create line charts, scatter plots, and other types of plots. It takes x and y
values as inputs and can also accept optional arguments such as line style, marker style, and color. This
function is the backbone of most Matplotlib visualizations.
• xlabel() & ylabel() : These functions are used to label the x-axis and y-axis, respectively. They take a
string as input and can also accept optional arguments such as font size, font weight, and color.
• title() : This function is used to add a title to the plot. It takes a string as input and can also accept
optional arguments such as font size, font weight, and color.
• grid() : This function is used to add grid lines to the plot.
• scatter() : This function plots one dot for each observation. It needs two arrays of the same length, one
for the values of the x-axis, and one for values on the y-axis.
• bar() : This function is used to create bar graphs.
Example:
import matplotlib.pyplot as plt
# data to display on plots
x = [3, 1, 3]
y = [3, 2, 1]
plt.plot(x, y)
plt.title("Line Chart")
plt.legend(["Line"])
plt.show()
# This will plot a simple bar chart
plt.bar(x, y)
# Title to the plot
plt.title("Bar Chart")

# Adding the legends
plt.legend(["bar"])
plt.show()
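• A short sketch of scatter(), grid(), and the axis-label functions from the list above (toy data):

Example:
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

plt.scatter(x, y, color='green')  # one dot per observation
plt.xlabel('x values')
plt.ylabel('y values')
plt.title('Scatter Plot')
plt.grid(True)                    # add grid lines
plt.show()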
