
NumPy Boolean Indexing

● In NumPy, boolean indexing allows us to filter elements from an array based on a specific condition.
● Boolean indexing is commonly described as filtering with a boolean mask that encodes the condition.
● Boolean indexing uses the result of a Boolean operation over the data, producing a mask with True or False for each element.
● The elements marked True in the mask are selected.
● In NumPy, a Boolean mask is a NumPy array of truth values (True/False) that correspond one-to-one to the elements of the array being filtered.
Example of Boolean Masks

● Suppose we have an array named array1.
array1 = np.array([12, 24, 16, 21, 32, 29, 7, 15])
● Now let's create a mask that selects all elements of array1 that are greater than 20.
boolean_mask = array1 > 20
● Here, array1 > 20 creates a boolean mask that evaluates to True for elements that are greater than 20, and False for elements that are less than or equal to 20.
● The resulting mask is an array stored in the boolean_mask variable as:
[False, True, False, True, True, True, False, False]
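● Applying the mask to array1 keeps only the elements at the True positions. A minimal sketch of the full filter (imports included for completeness):
import numpy as np
array1 = np.array([12, 24, 16, 21, 32, 29, 7, 15])
boolean_mask = array1 > 20
# indexing with the boolean mask returns only the elements where the mask is True
result = array1[boolean_mask]
print(result)
[24 21 32 29]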
array1 = np.array([1, 2, 4, 9, 11, 16, 18, 22, 26, 31, 33, 47, 51, 52])
# create a boolean mask using combined logical operators
boolean_mask = (array1 < 10) | (array1 > 40)
# apply the boolean mask to the array
result = array1[boolean_mask]
print(result)
[ 1 2 4 9 47 51 52]

numbers = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
numbers_copy = numbers.copy()
# change all even numbers to 0 in the copy
numbers_copy[numbers % 2 == 0] = 0
# print the modified copy
print(numbers_copy)
[1 0 3 0 5 0 7 0 9 0]
2D Boolean Indexing in NumPy
# create a 2D array
array1 = np.array([[1, 7, 9], [14, 19, 21], [25, 29, 35]])
# create a boolean mask for elements greater than 9
boolean_mask = array1 > 9
result = array1[boolean_mask]
print(result)
[14 19 21 25 29 35]
● Note that applying a boolean mask to a 2D array returns a flattened 1D array of the selected elements.
Pandas Library for Data Manipulation and Analysis
∙ Pandas provides two types of classes for handling data:
∙ DataFrame: a two-dimensional data structure that holds data like a two-dimensional array or a table with rows and columns.
∙ Each row in a DataFrame has an index label used to access rows and columns; the labels can be any name or value.
∙ In Pandas, each column is a Series: a one-dimensional sequence of values in which each value has an index (see the short sketch after this list).
∙ Values can be integers, strings, Python objects, etc.
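A minimal sketch (with made-up data) illustrating that selecting a DataFrame column returns a Series:
import pandas as pd
df = pd.DataFrame({"Name": ["Ana", "Ben"], "Score": [8.5, 9.0]})
col = df["Score"]
# a single column is a Series; each value keeps its row index label
print(type(col))
print(col)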
● python -m pip install --upgrade pip
● python3
● pip install pandas
Series in Pandas

● import pandas as pd
● data = [10, 20, 30, 40, 50]
● my_series = pd.Series(data)
● print(my_series[2])
● a = [1, 3, 5]
● my_series = pd.Series(a, index = ["x", "y", "z"])
● print(my_series)
● print(my_series["y"])
import pandas as pd
# create a dictionary
grades = {"Sem1": 8.25, "Sem2": 9.5, "Sem3": 7.75}
# create a series from the dictionary
my_series = pd.Series(grades)
print(my_series)
first_year = pd.Series(grades, index = ["Sem1", "Sem2"])
Series in Pandas
import pandas as pd
import numpy as np
s = pd.Series([1, 3, 5, np.nan, 6, 8])
s
0 1.0
1 3.0
2 5.0
3 NaN
4 6.0
5 8.0
dtype: float64
import pandas as pd
data = [['John', 25, 'New York'],
['Alice', 30, 'London'],
['Bob', 35, 'Paris']]
# create a DataFrame from the list
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])
print(df)
Pandas DataFrame Using Python Dictionary
data = {'year': [2010, 2011, 2012, 2010, 2011, 2012, 2010, 2011, 2012],
        'team': ['FCBarcelona', 'FCBarcelona', 'FCBarcelona', 'RMadrid', 'RMadrid', 'RMadrid', 'ValenciaCF', 'ValenciaCF', 'ValenciaCF'],
        'wins': [30, 28, 32, 29, 32, 26, 21, 17, 19],
        'draws': [6, 7, 4, 5, 4, 7, 8, 10, 8],
        'losses': [2, 3, 2, 4, 2, 5, 9, 11, 11]
        }

football = pd.DataFrame(data, columns=['year', 'team', 'wins', 'draws', 'losses'])
df = pd.DataFrame() # create an empty DataFrame
df = pd.read_csv('data.csv') #from CSV

df = pd.read_csv('./csv_files/data.csv', header = 0)

Employee ID,First Name,Last Name,Department,Position,Salary
101,John,Doe,Marketing,Manager,50000
102,Jane,Smith,Sales,Associate,35000
103,Michael,Johnson,Finance,Analyst,45000
104,Emily,Williams,HR,Coordinator,40000
Contents of a data.csv file without a header row (read by the example below):
23, 'Hello', 45.6
56, 'World', 78.9
89, 'Foo', 12.3
34, 'Bar', 56.7

# read csv file with some arguments
df = pd.read_csv('data.csv', header = None, names = ['col1', 'col2', 'col3'], skiprows = 2)
print(df)
>>> data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
...         'Age': [25, 30, 35, 28],
...         'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
>>> df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])
>>> df
Name Age City
A Alice 25 New York
B Bob 30 Los Angeles
C Charlie 35 Chicago
D David 28 Houston
>>> selected_row = df.loc['A']
>>> print(selected_row)
Name Alice
Age 25
City New York
Name: A, dtype: object
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 28], 'City': ['New York', 'Los Angeles',
'Chicago', 'Houston']}
df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])
# select specific rows and columns
selected_data = df.loc[['A', 'C'], ['Name', 'Age']]
print(selected_data)
cd1 = df.loc['B':'C', ['Name', 'Age']]
cd2 = df.loc[:, ['Name', 'Age']]
cd3 = df.loc[:]
sr2 = df.loc[['A','C'],:]
sr1 = df.loc[df['Age'] >= 30]
arr = np.array([1, 2, 3, 4, 5])
arr = np.array((1, 2, 3, 4, 5))  # from a tuple

arr = np.array([1, 2, 3, 4])
print(arr[2] + arr[3])

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print('5th element on 2nd row: ', arr[1, 4])
>>> 3 + 2.5 * np.random.randn(2, 4)
array([[ 3.56443934,  0.21240777,  1.65220694,  6.32284338],
       [-0.90036278,  4.78487666,  3.40952793,  1.71824131]])
>>> np.array([3] * 4, dtype="int32")
array([3, 3, 3, 3], dtype=int32)
>>> z = np.arange(3, dtype=np.uint8) #Array Range
>>> z
array([0, 1, 2], dtype=uint8)
https://www.programiz.com/python-programming/pandas/getting-started

● Categoricals are a pandas data type corresponding to categorical variables in statistics.
● A categorical variable takes a limited, and usually fixed, number of possible values.
● Categorical data might have an order,
● like 'strongly agree' vs. 'agree', or
● 'first observation' vs. 'second observation', or
● 'Test Data' vs. 'Train Data'.
● Order is defined by the order of the categories, not the lexical order of the values (an ordered example follows the code below).
All values are either in the categories or np.nan.
s = pd.Series(["a", "b", "c", "a"], dtype="category")

df = pd.DataFrame({"A": ["a", "b", "c", "a"]})
df["B"] = df["A"].astype("category")

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
        'Age': [25, 32, 18, 47, 33],
        'City': ['New York', 'Paris', 'London', 'Tokyo', 'Sydney']}
df = pd.DataFrame(data)
names = df['Name']
name_city = df[['Name', 'City']]
df2 = pd.DataFrame(
    {"A": 1.0,
     "B": pd.Timestamp("20250128"),
     "C": pd.Series(1, index=list(range(4)), dtype="float32"),
     "D": np.array([3] * 4, dtype="int32"),
     "E": pd.Categorical(["test", "train", "test", "train"]),
     "F": "foo", })
>>> df2
     A          B    C  D      E    F
0  1.0 2025-01-28  1.0  3   test  foo
1  1.0 2025-01-28  1.0  3  train  foo
2  1.0 2025-01-28  1.0  3   test  foo
3  1.0 2025-01-28  1.0  3  train  foo
>>> df2.dtypes
A float64
B datetime64[s]
C float32
D int32
E category
F object
dtype: object
>>> dates = pd.date_range("20250101", periods=6)
>>> df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))
>>> df
A B C D
2025-01-01 0.293879 0.324915 0.434401 -1.391992
2025-01-02 -0.701108 -0.011810 0.835216 -0.586246
2025-01-03 -0.677587 0.348766 -0.457098 1.147319
2025-01-04 -1.671191 0.651669 -0.685242 -1.954809
2025-01-05 0.526734 -1.297472 0.177927 0.612196
2025-01-06 0.778206 0.865262 -0.970947 -0.460400
>>> df.head()
A B C D
2025-01-01 0.293879 0.324915 0.434401 -1.391992
2025-01-02 -0.701108 -0.011810 0.835216 -0.586246
2025-01-03 -0.677587 0.348766 -0.457098 1.147319
2025-01-04 -1.671191 0.651669 -0.685242 -1.954809
2025-01-05 0.526734 -1.297472 0.177927 0.612196
>>> df.tail(2)
A B C D
2025-01-05 0.526734 -1.297472 0.177927 0.612196
2025-01-06 0.778206 0.865262 -0.970947 -0.460400

>>> df.index
DatetimeIndex(['2025-01-01', '2025-01-02',
'2025-01-03', '2025-01-04','2025-01-05',
'2025-01-06'],dtype='datetime64[ns]', freq='D')
>>> df.columns
Index(['A', 'B', 'C', 'D'], dtype='object')

>>> df.to_numpy()
array([[ 0.29387942,  0.32491506,  0.43440078, -1.39199244],
       [-0.70110762, -0.01181039,  0.83521647, -0.58624567],
       [-0.67758743,  0.34876597, -0.45709763,  1.14731948],
       [-1.67119052,  0.65166926, -0.68524221, -1.95480876],
       [ 0.52673407, -1.29747191,  0.17792695,  0.6121957 ],
       [ 0.77820621,  0.8652619 , -0.97094701, -0.46040001]])
>>> df.describe()
A B C D
count 6.000000 6.000000 6.000000 6.000000
mean -0.241844 0.146888 -0.110957 -0.438989
std 0.934028 0.768723 0.702184 1.170421
min -1.671191 -1.297472 -0.970947 -1.954809
25% -0.695228 0.072371 -0.628206 -1.190556
50% -0.191854 0.336841 -0.139585 -0.523323
75% 0.468520 0.575943 0.370282 0.344047
max 0.778206 0.865262 0.835216 1.147319
>>> df.T
2025-01-01 2025-01-02 2025-01-03 2025-01-04 2025-01-05 2025-01-06
A 0.293879 -0.701108 -0.677587 -1.671191 0.526734 0.778206
B 0.324915 -0.011810 0.348766 0.651669 -1.297472 0.865262
C 0.434401 0.835216 -0.457098 -0.685242 0.177927 -0.970947
D -1.391992 -0.586246 1.147319 -1.954809 0.612196 -0.460400
>>> df["A"]
2025-01-01 0.293879
2025-01-02 -0.701108
2025-01-03 -0.677587
2025-01-04 -1.671191
2025-01-05 0.526734
2025-01-06 0.778206
Freq: D, Name: A, dtype: float64
>>> df.A
2025-01-01 0.293879
2025-01-02 -0.701108
2025-01-03 -0.677587
2025-01-04 -1.671191
2025-01-05 0.526734
2025-01-06 0.778206
Freq: D, Name: A, dtype: float64
data = {'Name': ['John', 'Alice', 'Bob'],
'Age': [25, 30, 35],
'City': ['New York', 'London', 'Paris']}
# create a dataframe from the dictionary
df = pd.DataFrame(data)
# write dataframe to csv file
df.to_csv('output.csv', index=False)

df = pd.DataFrame(data)
# duplicated() returns a boolean Series marking repeated rows (checked here on the Name and Age columns)
print(df.duplicated(subset=['Name', 'Age']))
df.drop_duplicates(inplace=True)
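The three rows above are all distinct, so drop_duplicates() leaves them unchanged. A small sketch with an assumed duplicate row shows the effect:
import pandas as pd
df = pd.DataFrame({'Name': ['John', 'Alice', 'John'],
                   'Age': [25, 30, 25],
                   'City': ['New York', 'London', 'New York']})
# the third row repeats the first, so duplicated() marks it True
print(df.duplicated())
df.drop_duplicates(inplace=True)
print(df)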
import pandas as pd

# create dataframe
data = {'Name': ['Tom', 'Nick', 'John', 'Tom'],
'Age': [20, 21, 19, 18],
'City': ['New York', 'London', 'Paris', 'Berlin']}
df = pd.DataFrame(data)

# write to csv file
df.to_csv('output.csv', sep = ';', index = False, header = True)
data = { 'A': [1, 2, 3, None, 5], 'B': [None, 2, 3, 4, 5],
'C': [1, 2, None, None, 5] }

df = pd.DataFrame(data)
print("Original Data:\n",df)
# use dropna() to remove rows with any missing values
df_cleaned = df.dropna()
print("Cleaned Data:\n",df_cleaned)
Cleaned Data:
A B C
1 2.0 2.0 2.0
4 5.0 5.0 5.0
import pandas as pd
data = { 'A': [1, 2, 3, None, 5],
'B': [None, 2, 3, 4, 5], 'C': [1, 2, None, None, 5]}
df = pd.DataFrame(data)
print("Original Data:\n", df)
# filling NaN values with 0
df.fillna(0, inplace=True)
print("\nData after filling NaN with 0:\n", df)
import pandas as pd
data = {
'Name': ['John', 'Michael', 'Tom', 'Alex', 'Ryan'],
'Age': [8, 9, 7, 80, 100], 'Gender': ['M', 'M', 'M', 'F', 'M'],
'Standard': [3, 4, 12, 3, 5]}
df = pd.DataFrame(data)
# replace the 'F' at row index 3 of the Gender column with 'M'
df.loc[3, 'Gender'] = 'M'
print(df)
import pandas as pd
data = {
'Name': ['John', 'Michael', 'Tom', 'Alex', 'Ryan'],
'Age': [8, 9, 7, 80, 100], 'Gender': ['M', 'M', 'M', 'M', 'M'],
'Standard': [3, 4, 12, 3, 5] }
df = pd.DataFrame(data)
# replace values based on a condition: ages entered with an extra zero
for i in df.index:
    age_val = df.loc[i, 'Age']
    if (age_val > 14) and (age_val % 10 == 0):
        df.loc[i, 'Age'] = age_val // 10
print(df)
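The same correction can be written without an explicit loop by using a boolean mask, echoing the boolean indexing covered earlier. A minimal sketch on the same assumed data:
import pandas as pd
data = {'Name': ['John', 'Michael', 'Tom', 'Alex', 'Ryan'],
        'Age': [8, 9, 7, 80, 100], 'Gender': ['M', 'M', 'M', 'M', 'M'],
        'Standard': [3, 4, 12, 3, 5]}
df = pd.DataFrame(data)
# boolean mask selecting the rows whose Age meets the condition
mask = (df['Age'] > 14) & (df['Age'] % 10 == 0)
df.loc[mask, 'Age'] = df.loc[mask, 'Age'] // 10
print(df)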
Resources: Datasets
◻ UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html
◻ Statlib: http://lib.stat.cmu.edu/
◻ European Union (Eurostat): https://ec.europa.eu/eurostat/data/database
