1.introduction To Machine Learning and Toolkit

Uploaded by

Ehab Emam

Introduction to Machine Learning and Toolkit

Legal Notices and Disclaimers
This presentation is for informational purposes only. INTEL MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. Check with your system manufacturer or retailer or learn more at intel.com.

This sample source code is released under the Intel Sample Source Code License Agreement. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others. Copyright © 2021, Intel Corporation. All rights reserved.
Overview of Course
Topics include:
• Introduction and exploratory analysis (Week 1)
• Supervised machine learning (Weeks 2 – 10)
• Unsupervised machine learning (Weeks 11 – 12)
Audience includes:
• University-level professors who may wish to use this content in their courses
• University-level students or others who want to prepare for using machine learning and applying machine learning principles to data
Prerequisites:
• Python* programming
• Calculus
• Linear algebra
• Statistics
Each week:
• Lecture
• Exercises with solutions
• Time commitment: ~3 hours per week
Total time: 12 weeks of lectures and exercises, at about three hours per week.
Our Toolset: Intel® oneAPI AI Analytics Toolkit (AI Kit)
• Intel® Extension for Scikit-learn*
Learning Objectives
• Demonstrate supervised learning algorithms
• Explain key concepts like under- and over-fitting, regularization, and cross-validation
• Classify the type of problem to be solved, choose the right algorithm, tune parameters, and validate a model
• Apply Intel® Extension for Scikit-learn* to leverage underlying compute capabilities of hardware
Our Toolset: Intel® oneAPI AI Analytics Toolkit (AI Kit)
Installation options
https://software.intel.com/ai
• Monolithic Distribution: intel-distribution-for-python
• Anaconda Package Manager: articles/using-intel-distribution-for-python-with-anaconda
Our Toolset: Intel® Distribution for Python
Installation options
https://software.intel.com/ai
• Monolithic Distribution: intel-distribution-for-python
• Anaconda Package Manager: articles/using-intel-distribution-for-python-with-anaconda
Seaborn is also required: conda install seaborn
Our Toolset: Intel® oneAPI AI Analytics Toolkit (AI Kit)
• Jupyter notebooks: interactive coding and visualization of output (Week 1)
• NumPy, SciPy, Pandas: numerical computation (Week 1)
• Matplotlib, Seaborn: data visualization (Week 1)
• Scikit-learn: machine learning (Weeks 2 – 12)
Introduction to Jupyter Notebook
• Polyglot analysis environment: blends multiple languages
• The name Jupyter is a reference to Julia, Python, and R
• Supports multiple content types: code, narrative text, images, movies, etc.
Source: http://jupyter.org/
Introduction to Jupyter Notebook
• HTML & Markdown
• LaTeX (equations)
• Code
Source: http://jupyter.org/
Introduction to Jupyter Notebook
• Code is divided into cells to control execution
• Enables interactive development
• Ideal for exploratory analysis and model building
Jupyter Cell Magics
• %matplotlib inline: display plots inline in Jupyter notebook
• %%timeit: time how long a cell takes to execute
• %run filename.ipynb: execute code from another notebook or Python file
• %load filename.py: copy contents of the file and paste into the cell
Jupyter Keyboard Shortcuts
Keyboard shortcuts can be viewed from Help → Keyboard Shortcuts
Making Jupyter Notebooks Reusable
To extract Python code from a Jupyter notebook:
• Convert from the command line:
  $ jupyter nbconvert --to python notebook.ipynb
• Export from within the notebook
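A notebook file is a JSON document whose cells each carry a type and a source. The sketch below uses only the standard library to collect the code cells from a small hand-written notebook. It illustrates the idea and does not reproduce what nbconvert does: the helper name extract_code and the demo notebook content are assumptions made for this example, and cell magics are passed through untouched.

```python
import json


def extract_code(notebook_json: str) -> str:
    """Return the source of all code cells, separated by blank lines."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The source may be stored as a list of lines or as one string
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks)


# A tiny hand-written notebook (hypothetical content) for demonstration
demo_notebook = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": ["x = 1\n", "y = x + 1"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": "print(y)"},
    ],
})

script = extract_code(demo_notebook)
print(script)
```

For real notebooks, the nbconvert command shown above remains the simpler route.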
Introduction to Pandas
• Library for computation with tabular data
• Mixed types of data allowed in a single table
• Columns and rows of data can be named
• Advanced data aggregation and statistical functions
Introduction to Pandas
Basic data structures
Type                    Pandas Name
Vector (1 Dimension)    Series
Array (2 Dimensions)    DataFrame
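The difference between the two structures can be checked directly through their ndim and shape attributes. The snippet below is a small sketch with illustrative values; the variable names are chosen for this example only.

```python
import pandas as pd

# A one-dimensional Series and a two-dimensional DataFrame (illustrative values)
steps = pd.Series([3620, 7891, 9761], name="steps")
table = pd.DataFrame({"steps": [3620, 7891, 9761],
                      "cycling": [10.7, 0.0, 2.4]})

print(steps.ndim, steps.shape)   # one dimension
print(table.ndim, table.shape)   # two dimensions

# Selecting a single column of a DataFrame yields a Series
column = table["steps"]
print(type(column).__name__)
```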
Pandas Series Creation and Indexing
Use data from a step-tracking application to create a Pandas Series
Code
import pandas as pd

step_data = [3620, 7891, 9761, 3907, 4338, 5373]
step_counts = pd.Series(step_data, name='steps')

print(step_counts)
Output
0    3620
1    7891
2    9761
3    3907
4    4338
5    5373
Name: steps, dtype: int64
Pandas Series Creation and Indexing
Add a date range to the Series
Code
step_counts.index = pd.date_range('20150329', periods=6)

print(step_counts)
Output
2015-03-29    3620
2015-03-30    7891
2015-03-31    9761
2015-04-01    3907
2015-04-02    4338
2015-04-03    5373
Freq: D, Name: steps, dtype: int64
Pandas Series Creation and Indexing
Select data by the index values
Code
# Just like a dictionary
print(step_counts['2015-04-01'])

# Or by index position, like an array
# (.iloc makes the positional lookup explicit; plain step_counts[3]
# is deprecated on a non-integer index in recent pandas)
print(step_counts.iloc[3])

# Select all of April
print(step_counts['2015-04'])
Output
3907
3907
2015-04-01    3907
2015-04-02    4338
2015-04-03    5373
Freq: D, Name: steps, dtype: int64
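The same selections can be written with the explicit .loc (label-based) and .iloc (position-based) indexers, which avoids any ambiguity between labels and positions. The following is a minimal sketch that rebuilds the step-count Series so it runs on its own; the names by_label, by_position, april, and span are chosen for this example.

```python
import pandas as pd

step_counts = pd.Series(
    [3620, 7891, 9761, 3907, 4338, 5373],
    index=pd.date_range("20150329", periods=6),
    name="steps",
)

by_label = step_counts.loc["2015-04-01"]            # label-based lookup
by_position = step_counts.iloc[3]                   # position-based lookup
april = step_counts.loc["2015-04"]                  # partial-string match on the month
span = step_counts.loc["2015-03-30":"2015-04-01"]   # label slices include both ends

print(by_label, by_position)
print(len(april), len(span))
```

Note that a label slice includes its end label, whereas a positional slice such as .iloc[1:4] excludes the end position.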
Pandas Data Types and Imputation
Data types can be viewed and converted
Code
# View the data type
print(step_counts.dtypes)

# Convert to a float
# (the np.float alias was removed from NumPy; the built-in float maps to float64)
step_counts = step_counts.astype(float)

# View the data type
print(step_counts.dtypes)
Output
int64
float64
Pandas Data Types and Imputation
Invalid data points can be easily filled with values
Code
import numpy as np

# Create invalid data (np.nan is the spelling supported by all NumPy versions)
step_counts[1:3] = np.nan

# Now fill it in with zeros
step_counts = step_counts.fillna(0.)
# equivalently,
# step_counts.fillna(0., inplace=True)

print(step_counts[1:3])
Output
2015-03-30    0.0
2015-03-31    0.0
Freq: D, Name: steps, dtype: float64
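Filling with zeros is only one possible imputation choice. As a hedged sketch, the snippet below compares it with two other options pandas offers, filling with the column mean and linear interpolation, on a Series built for this example. Which choice is appropriate depends on the data and is not prescribed by the slides.

```python
import numpy as np
import pandas as pd

# Step counts with two missing entries (constructed for this example)
steps = pd.Series([3620.0, np.nan, np.nan, 3907.0, 4338.0, 5373.0], name="steps")

n_missing = int(steps.isna().sum())          # count the invalid points
filled_zero = steps.fillna(0.0)              # constant fill, as on the slide
filled_mean = steps.fillna(steps.mean())     # mean() skips NaN by default
filled_interp = steps.interpolate()          # linear interpolation between neighbours

print(n_missing)
print(filled_mean.iloc[1])
print(filled_interp.iloc[1:3].tolist())
```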
Pandas DataFrame Creation and Methods
DataFrames can be created from lists, dictionaries, and Pandas Series
Code
# Cycling distance
cycling_data = [10.7, 0, None, 2.4, 15.3, 10.9, 0, None]

# Create a list of (steps, distance) tuples;
# zip stops at the shorter input, so only the first six cycling values are used
joined_data = list(zip(step_data, cycling_data))

# The dataframe
activity_df = pd.DataFrame(joined_data)

print(activity_df)
Output: table of the six (steps, distance) rows with default integer row and column labels
Pandas DataFrame Creation and Methods
Labeled columns and an index can be added
Code
# Add column names to dataframe
activity_df = pd.DataFrame(
    joined_data,
    index=pd.date_range('20150329', periods=6),
    columns=['Walking', 'Cycling'])

print(activity_df)
Output: the same six rows, indexed by date, with columns Walking and Cycling
Indexing DataFrame Rows
DataFrame rows can be selected with the 'loc' (label-based) and 'iloc' (position-based) indexers
Code
# Select row of data by index name
print(activity_df.loc['2015-04-01'])

# Select row of data by integer position
print(activity_df.iloc[-3])
Output (the same row in both cases)
Walking    3907.0
Cycling       2.4
Name: 2015-04-01, dtype: float64
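The two indexers also accept slices and a column selector, and they differ in how they treat the end of a slice. The sketch below rebuilds the activity table so it is self-contained; the variable names first_three, by_dates, and one_value are assumptions made for this example.

```python
import pandas as pd

activity_df = pd.DataFrame(
    {"Walking": [3620, 7891, 9761, 3907, 4338, 5373],
     "Cycling": [10.7, 0.0, None, 2.4, 15.3, 10.9]},
    index=pd.date_range("20150329", periods=6),
)

first_three = activity_df.iloc[:3]                       # positions 0, 1, 2 (end excluded)
by_dates = activity_df.loc["2015-03-29":"2015-03-31"]    # both end labels included
one_value = activity_df.loc["2015-04-01", "Cycling"]     # row label, then column label

print(first_three)
print(one_value)
```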
Indexing DataFrame Columns
DataFrame columns can be indexed by name
Code
# Name of column
print(activity_df['Walking'])
DataFrame columns can also be accessed as attributes
Code
# Object-oriented approach
print(activity_df.Walking)
DataFrame columns can be indexed by integer position
Code
# First column
print(activity_df.iloc[:,0])
Output (identical for all three)
2015-03-29    3620
2015-03-30    7891
2015-03-31    9761
2015-04-01    3907
2015-04-02    4338
2015-04-03    5373
Freq: D, Name: Walking, dtype: int64
Reading Data with Pandas
CSV and other common file types can be read with a single command
Code
# The location of the data file
filepath = 'data/Iris_Data/Iris_Data.csv'

# Import the data
data = pd.read_csv(filepath)

# Print a few rows
print(data.iloc[:5])
Output: the first five rows of the Iris data (table not reproduced)
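To try read_csv without the data file at hand, the text can be supplied through an in-memory buffer. The sketch below assumes the column names used elsewhere in these slides (sepal_length, sepal_width, petal_length, petal_width, species) and includes only three illustrative rows, not the full data set.

```python
import io

import pandas as pd

# Three illustrative rows in the assumed column layout
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
"""

# read_csv accepts a file path or any file-like object
data = pd.read_csv(io.StringIO(csv_text))

print(data.shape)
print(data.dtypes)
```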
Assigning New Data to a DataFrame
Data can be (re-)assigned to a DataFrame column
Code
# Create a new column that is a product
# of both measurements
data['sepal_area'] = data.sepal_length * data.sepal_width

# Print a few rows and columns
print(data.iloc[:5, -3:])
Output: the first five rows of the last three columns (table not reproduced)
Applying a Function to a DataFrame Column
Functions can be applied to columns or rows of a DataFrame or Series
Code
# The lambda function is applied to each
# value in the species column
data['abbrev'] = (data
                  .species
                  .apply(lambda x: x.replace('Iris-','')))

# Note that there are other ways to
# accomplish the above

print(data.iloc[:5, -3:])
Output: the first five rows of the last three columns (table not reproduced)
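One of the other ways mentioned in the comment is the vectorized string accessor. The sketch below compares it with apply on a three-value Series built for this example; both produce the same abbreviations.

```python
import pandas as pd

species = pd.Series(["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
                    name="species")

# Element-wise Python function, as on the slide
via_apply = species.apply(lambda x: x.replace("Iris-", ""))

# Vectorized string method; regex=False treats the pattern as literal text
via_str = species.str.replace("Iris-", "", regex=False)

print(via_str.tolist())
```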
Concatenating Two DataFrames
Two DataFrames can be concatenated along either dimension
Code
# Concatenate the first two and
# last two rows
small_data = pd.concat([data.iloc[:2],
                        data.iloc[-2:]])

print(small_data.iloc[:,-3:])

# See the 'join' method for
# SQL style joining of dataframes
Output: the four selected rows of the last three columns (table not reproduced)
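As a sketch of what "either dimension" means, the snippet below stacks rows (axis=0, the default), places columns side by side (axis=1), and shows a SQL-style join using merge, a close relative of the join method named above. The two small tables and their values are illustrative assumptions.

```python
import pandas as pd

left = pd.DataFrame({"species": ["setosa", "virginica"],
                     "sepal_length": [5.1, 6.3]})
right = pd.DataFrame({"species": ["setosa", "virginica"],
                      "petal_length": [1.4, 6.0]})

# Stack rows; ignore_index renumbers the result 0..n-1
stacked = pd.concat([left, left], ignore_index=True)

# Place columns side by side, aligned on the row index
side_by_side = pd.concat([left, right[["petal_length"]]], axis=1)

# SQL-style inner join on a shared key column
merged = left.merge(right, on="species", how="inner")

print(stacked.shape, side_by_side.shape, merged.shape)
```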
Aggregated Statistics with GroupBy
The groupby method calculates aggregated DataFrame statistics
Code
# Use the size method with a
# DataFrame to get count
# For a Series, use the .value_counts
# method
group_sizes = (data
               .groupby('species')
               .size())

print(group_sizes)
Output
species
Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50
dtype: int64
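Beyond counting, a grouped column can be summarized with several statistics at once through agg. The sketch below uses a five-row table invented for this example, so its numbers are not those of the Iris data.

```python
import pandas as pd

# Small illustrative table (not the real data set)
data = pd.DataFrame({
    "species": ["setosa", "setosa", "virginica", "virginica", "virginica"],
    "petal_length": [1.4, 1.6, 6.0, 5.0, 5.5],
})

sizes = data.groupby("species").size()        # rows per group (DataFrame)
counts = data["species"].value_counts()       # the Series equivalent
summary = data.groupby("species")["petal_length"].agg(["mean", "min", "max"])

print(sizes)
print(summary)
```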
Performing Statistical Calculations
Pandas contains a variety of statistical methods: mean, median, and mode
Code
# Mean calculated on a DataFrame
# (numeric_only=True skips the text column 'species',
# which recent pandas no longer drops silently)
print(data.mean(numeric_only=True))

# Median calculated on a Series
print(data.petal_length.median())

# Mode calculated on a Series
print(data.petal_length.mode())
Output
sepal_length    5.843333
sepal_width     3.054000
petal_length    3.758667
petal_width     1.198667
dtype: float64
4.35
0    1.5
dtype: float64
Performing Statistical Calculations
Standard deviation, variance, SEM and quantiles can also be calculated
Code
# Standard dev, variance, and SEM
print(data.petal_length.std(),
      data.petal_length.var(),
      data.petal_length.sem())

# As well as quantiles
# (numeric_only=True skips the text column 'species')
print(data.quantile(0, numeric_only=True))
Output
1.76442041995 3.11317941834 0.144064324021
sepal_length    4.3
sepal_width     2.0
petal_length    1.0
petal_width     0.1
Name: 0, dtype: float64
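The three figures above are related: the variance is the square of the standard deviation (1.7644² ≈ 3.1132), and the standard error of the mean is the standard deviation divided by the square root of the number of observations (1.7644 / √150 ≈ 0.1441 for the 150 rows counted earlier). The sketch below checks the same relations on a short Series of illustrative values.

```python
import math

import pandas as pd

values = pd.Series([1.4, 1.6, 6.0, 5.0, 5.5])   # illustrative values

std = values.std()   # sample standard deviation (ddof=1 by default)
var = values.var()   # sample variance
sem = values.sem()   # standard error of the mean

print(math.isclose(var, std ** 2))
print(math.isclose(sem, std / math.sqrt(len(values))))

# Quantile 0 is the minimum and quantile 0.5 is the median
q_min = values.quantile(0)
q_mid = values.quantile(0.5)
print(q_min, q_mid)
```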
Performing Statistical Calculations
Multiple calculations can be presented in a DataFrame
Code
print(data.describe())
Output: table of count, mean, std, min, quartiles, and max for each numeric column (not reproduced)
Sampling from DataFrames
DataFrames can be randomly sampled from
Code
# Sample 5 rows without replacement
sample = (data
          .sample(n=5,
                  replace=False,
                  random_state=42))

print(sample.iloc[:,-3:])
Output: five randomly chosen rows of the last three columns (table not reproduced)
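Fixing random_state makes a sample reproducible, and the rows that were not drawn can be recovered by dropping the sampled index labels. The sketch below demonstrates both on a ten-row table invented for this example; the names drawn, remaining, and again are assumptions.

```python
import pandas as pd

data = pd.DataFrame({"x": range(10), "y": [v * 2 for v in range(10)]})

# Draw 30% of the rows without replacement
drawn = data.sample(frac=0.3, replace=False, random_state=42)

# The rows that were not drawn
remaining = data.drop(drawn.index)

# The same seed yields the same rows again
again = data.sample(frac=0.3, replace=False, random_state=42)

print(len(drawn), len(remaining))
print(drawn.index.equals(again.index))
```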

SciPy and NumPy also contain a variety of statistical functions.
Visualization Libraries
Visualizations can be created in multiple ways:
• Matplotlib
• Pandas (via Matplotlib)
• Seaborn
  ◦ Statistically-focused plotting methods
  ◦ Global preferences incorporated by Matplotlib
Basic Scatter Plots with Matplotlib
Scatter plots can be created from Pandas Series
Code
import matplotlib.pyplot as plt

plt.plot(data.sepal_length,
         data.sepal_width,
         ls='', marker='o')
Output: scatter plot of sepal_length (x axis) against sepal_width (y axis)
Basic Scatter Plots with Matplotlib
Multiple layers of data can also be added
Code
plt.plot(data.sepal_length,
         data.sepal_width,
         ls='', marker='o',
         label='sepal')

plt.plot(data.petal_length,
         data.petal_width,
         ls='', marker='o',
         label='petal')

# Draw the legend so the labels appear on the plot
plt.legend()
Output: scatter plot with two layers of points, labeled sepal and petal in the legend
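The layered plot can also be built through the object-oriented interface, which returns the figure and axes for further customization. The sketch below assumes four illustrative measurements per series and selects the non-interactive Agg backend so that it runs without a display; inside a notebook the backend line is unnecessary.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; an assumption for running outside a notebook
import matplotlib.pyplot as plt

# Illustrative measurements (not the full data set)
sepal_length = [5.1, 4.9, 7.0, 6.3]
sepal_width = [3.5, 3.0, 3.2, 3.3]
petal_length = [1.4, 1.4, 4.7, 6.0]
petal_width = [0.2, 0.2, 1.4, 2.5]

fig, ax = plt.subplots()

# Each call to plot adds one layer; ls="" suppresses the connecting line
ax.plot(sepal_length, sepal_width, ls="", marker="o", label="sepal")
ax.plot(petal_length, petal_width, ls="", marker="o", label="petal")

ax.set(xlabel="length", ylabel="width")
legend = ax.legend()  # the labels only appear once a legend is drawn
```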
Histograms with Matplotlib
Histograms can be created from Pandas Series
Code
plt.hist(data.sepal_length, bins=25)
Output: histogram of sepal_length with 25 bins
Customizing Matplotlib Plots
Every feature of Matplotlib plots can be customized
Code
import numpy as np

fig, ax = plt.subplots()

ax.barh(np.arange(10),
        data.sepal_width.iloc[:10])

# Set position of ticks and tick labels
# (bars are centered on their positions by default in current
# Matplotlib, so the ticks sit at 0..9 rather than at a 0.4 offset)
ax.set_yticks(np.arange(10))
ax.set_yticklabels(np.arange(1,11))
ax.set(xlabel='xlabel', ylabel='ylabel',
       title='Title')
Output: horizontal bar chart of the first ten sepal_width values with custom ticks, axis labels, and title
Incorporating Statistical Calculations
Statistical calculations can be included with Pandas methods
Code
# numeric_only=True keeps text columns out of the mean
(data
 .groupby('species')
 .mean(numeric_only=True)
 .plot(color=['red','blue',
              'black','green'],
       fontsize=10.0, figsize=(4,4)))
Output: line plot of the per-species mean of each measurement
Statistical Plotting with Seaborn
Joint distribution and scatter plots can be created
Code
import seaborn as sns

# 'height' replaces the older 'size' argument in current Seaborn
sns.jointplot(x='sepal_length',
              y='sepal_width',
              data=data, height=4)
Output: joint plot of sepal_length (x axis) against sepal_width (y axis) with marginal distributions; annotation: pearsonr = -0.11, p = 0.18
Statistical Plotting with Seaborn
Correlation plots of all variable pairs can also be made with Seaborn
Code
# 'height' replaces the older 'size' argument in current Seaborn
sns.pairplot(data, hue='species', height=3)
Output: grid of pairwise scatter plots for sepal_length, sepal_width, petal_length, and petal_width, with per-variable distributions on the diagonal, colored by species (Iris-setosa, Iris-versicolor, Iris-virginica)
