LAB 2 DWM

This document provides an introduction to essential Python libraries including Pandas, NumPy, and Matplotlib, detailing their functionalities and usage. It covers installation, data manipulation, and visualization techniques, along with practical examples and exercises. The objective is to equip learners with foundational knowledge to work with data using these libraries.

Uploaded by

hamza.tariqkwl
Copyright
© All Rights Reserved

Experiment # 2: Introduction to Python Libraries

Objective: To provide basic knowledge of essential Python libraries: Pandas, NumPy, and Matplotlib.

Time Required: 3 hrs

Programming Language: Python

Software Required: Anaconda / Google Colab

Python

Python is a popular programming language. It was created by Guido van Rossum, and released in 1991.

It is used for:

• web development (server-side),
• software development,
• mathematics,
• system scripting.

What can Python do?

• Python can be used on a server to create web applications.
• Python can be used alongside software to create workflows.
• Python can connect to database systems. It can also read and modify files.
• Python can be used to handle big data and perform complex mathematics.
• Python can be used for rapid prototyping, or for production-ready software development.

Pandas

Pandas is a Python library used for working with data sets.

It has functions for analyzing, cleaning, exploring, and manipulating data.

The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was created by Wes McKinney in 2008.

Why Use Pandas?

Pandas allows us to analyze large data sets and draw conclusions based on statistical theories.

Pandas can clean messy data sets and make them readable and relevant.
Relevant data is very important in data science.

Installation of Pandas

If you already have Python and pip installed on your system, installing Pandas is straightforward.

Install it using this command:

pip install pandas

Import Pandas

Once Pandas is installed, import it in your applications by adding the import keyword:

import pandas

Pandas as pd

Pandas is usually imported under the pd alias:

import pandas as pd

Pandas Series

A Pandas Series is like a column in a table.

It is a one-dimensional array holding data of any type.

With the index argument, you can name your own labels.
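The Series description above can be illustrated with a short sketch. The values and the labels "day1" to "day3" are made up for illustration; only pd.Series and its index argument come from the text.

```python
import pandas as pd

# A Series built from a plain list gets the default integer labels 0, 1, 2.
calories = pd.Series([420, 380, 390])
print(calories[0])        # value stored under the default label 0

# The index argument replaces the default labels with custom ones.
labeled = pd.Series([420, 380, 390], index=["day1", "day2", "day3"])
print(labeled)
print(labeled["day2"])    # value stored under the label "day2"
```

Once custom labels are set, values are looked up by label rather than by position.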

Pandas DataFrames
A Pandas DataFrame is a two-dimensional data structure, like a two-dimensional array or a table with rows and columns.

import pandas as pd

data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}

#load data into a DataFrame object:


df = pd.DataFrame(data)

print(df)

Named Indexes

With the index argument, you can name your own indexes.

import pandas as pd

data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}

df = pd.DataFrame(data, index = ["day1", "day2", "day3"])

print(df)
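As a possible follow-up, named indexes make it easy to pick out rows by label. The sketch below assumes the same made-up data as above and uses the loc attribute, which the original text does not cover.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# loc with a single label returns that row as a Series.
row = df.loc["day2"]
print(row)

# loc with a list of labels returns a smaller DataFrame.
subset = df.loc[["day1", "day3"]]
print(subset)
```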

Read CSV Files

A simple way to store large data sets is to use CSV (comma-separated values) files.

CSV files contain plain text and use a well-known format that can be read by almost any tool, including Pandas.

Load the CSV into a DataFrame

import pandas as pd

df = pd.read_csv('data.csv')

print(df.to_string())
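The example above needs a file named data.csv on disk. To try read_csv() without such a file, one option is to wrap CSV text in io.StringIO, which read_csv() accepts like an open file. The column names and values below are invented for this sketch.

```python
import io
import pandas as pd

# Hypothetical CSV content; the first line holds the column names.
csv_text = "Duration,Pulse,Calories\n60,110,409.1\n45,117,282.4\n"

# io.StringIO makes the string behave like an open text file.
df = pd.read_csv(io.StringIO(csv_text))

print(df.to_string())   # print the whole DataFrame
print(df.shape)         # (number of rows, number of columns)
```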

Read JSON

Large data sets are often stored or exchanged as JSON.

JSON is plain text structured like an object. It is widely used in programming and is supported by Pandas.

Load the JSON file into a DataFrame

import pandas as pd

df = pd.read_json('data.json')

print(df.to_string())
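The same approach works for JSON when no data.json file is available. The sketch below assumes a small, made-up JSON document in the column-oriented layout (column name, then row label, then value) and passes it through io.StringIO.

```python
import io
import pandas as pd

# Hypothetical JSON content: each column maps row labels to values.
json_text = '{"Duration": {"0": 60, "1": 45}, "Pulse": {"0": 110, "1": 117}}'

# read_json() accepts a file-like object as well as a file path.
df = pd.read_json(io.StringIO(json_text))

print(df.to_string())
```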

NumPy

NumPy is a Python library used for working with arrays.

It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.

NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.

NumPy stands for Numerical Python.

Create a NumPy ndarray Object

NumPy is used to work with arrays. The array object in NumPy is called ndarray.

We can create a NumPy ndarray object by using the array() function.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

print(type(arr))

Dimensions in Arrays

A dimension in arrays is one level of array depth (nested arrays).

An array can have any number of dimensions.
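A short sketch of how nesting depth maps to dimensions, using the ndim and shape attributes. The variable names and values are chosen only for illustration.

```python
import numpy as np

a0 = np.array(42)                                        # 0-D: a single scalar
a1 = np.array([1, 2, 3])                                 # 1-D: a list of scalars
a2 = np.array([[1, 2, 3], [4, 5, 6]])                    # 2-D: a list of 1-D arrays
a3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])      # 3-D: a list of 2-D arrays

# ndim is the number of dimensions; shape is the length along each one.
for a in (a0, a1, a2, a3):
    print(a.ndim, a.shape)
```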

When the array is created, you can define the number of dimensions by using the ndmin argument.

Create an array with 5 dimensions and verify that it has 5 dimensions:

import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('number of dimensions :', arr.ndim)

Iterating Arrays

Iterating means going through elements one by one.

Because NumPy arrays can be multi-dimensional, we can iterate over them using Python's basic for loop.

If we iterate over a 1-D array, the loop goes through each element one by one.
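A minimal sketch of plain for-loop iteration, with made-up arrays. It also shows what happens on a 2-D array, where each step of the outer loop yields a whole row rather than a scalar.

```python
import numpy as np

arr_1d = np.array([1, 2, 3])
for x in arr_1d:
    print(x)              # each x is a single element

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
for row in arr_2d:
    print(row)            # each row is itself a 1-D array

# One nested loop per dimension is needed to reach the scalars.
scalars = []
for row in arr_2d:
    for x in row:
        scalars.append(int(x))
print(scalars)
```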

Iterating Arrays Using nditer()

The function nditer() is a helper function that can be used for anything from very basic to very advanced iterations. It solves some basic issues that we face in iteration; let's go through it with examples.

Iterating on Each Scalar Element

With basic for loops, reaching each scalar of an n-dimensional array requires n nested loops, which can be difficult to write for arrays with very high dimensionality.

Iterate through the following 3-D array:

import numpy as np

arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for x in np.nditer(arr):
    print(x)

Joining NumPy Arrays

Joining means putting contents of two or more arrays in a single array.

In SQL we join tables based on a key, whereas in NumPy we join arrays by axes.

We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis. If
axis is not explicitly passed, it is taken as 0.

import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2))

print(arr)
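The effect of the axis argument is easier to see with 2-D arrays. The following sketch, with made-up values, joins the same two arrays along axis 0 (the default) and along axis 1.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# axis=0 (the default) appends the rows of arr2 below the rows of arr1.
rows_joined = np.concatenate((arr1, arr2))
print(rows_joined.shape)

# axis=1 appends the columns of arr2 to the right of arr1.
cols_joined = np.concatenate((arr1, arr2), axis=1)
print(cols_joined)
```

In both cases the result has the same number of dimensions as the inputs; only the length along the chosen axis grows.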
Joining Arrays Using Stack Functions

Stacking is similar to concatenation; the difference is that stacking joins the arrays along a new axis.

For example, two 1-D arrays can be stacked into a single 2-D array, either one over the other (axis 0) or side by side as columns (axis 1).

We pass a sequence of arrays that we want to join to the stack() function, along with the axis. If the axis is not explicitly passed, it is taken as 0.

import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.stack((arr1, arr2), axis=1)

print(arr)
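To compare the two axis choices directly, this sketch stacks the same pair of 1-D arrays along axis 0 and along axis 1 and prints the resulting shapes. The variable names are chosen for illustration.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# axis=0: each input becomes one row of the result.
stacked_rows = np.stack((arr1, arr2), axis=0)
print(stacked_rows, stacked_rows.shape)

# axis=1: each input becomes one column of the result.
stacked_cols = np.stack((arr1, arr2), axis=1)
print(stacked_cols, stacked_cols.shape)
```

Unlike concatenate(), the result has one more dimension than the inputs.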

Splitting NumPy Arrays

Splitting is the reverse operation of joining.

Joining merges multiple arrays into one, and splitting breaks one array into multiple arrays.

We use array_split() for splitting arrays. We pass it the array we want to split and the number of splits.

Split the array in 3 parts:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr)
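When the array length is not a multiple of the number of splits, array_split() still works and makes the earlier parts one element longer. The sketch below uses a made-up 7-element array; it also shows that the related split() function raises a ValueError in the same situation.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# 7 elements cannot be divided evenly into 3 parts,
# so the first part receives the extra element.
parts = np.array_split(arr, 3)
for part in parts:
    print(part)

# np.split() insists on an equal division and raises otherwise.
try:
    np.split(arr, 3)
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)
```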

NumPy Searching Arrays

You can search an array for a certain value and return the indexes of the matching elements.

To search an array, use the where() function.

Find the indexes where the value is 4:

import numpy as np
arr = np.array([1, 2, 3, 4, 5, 4, 4])

x = np.where(arr == 4)

print(x)
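Note that where() returns a tuple with one index array per dimension, which is why the printed result is wrapped in parentheses. A small sketch of unpacking it, and of searching with a different condition (even values, chosen here only as an example):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

# For a 1-D array the tuple has exactly one entry: the matching indexes.
result = np.where(arr == 4)
indexes = result[0]
print(indexes)

# Any boolean condition can be used, for example "value is even".
even_indexes = np.where(arr % 2 == 0)[0]
print(even_indexes)
```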

Search Sorted

The searchsorted() function performs a binary search in a sorted array and returns the index where the specified value would be inserted to maintain the sort order.

Find the index where the value 7 should be inserted:

import numpy as np

arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7)

print(x)
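By default the search returns the leftmost suitable position. The side argument of searchsorted() can be set to 'right' to get the rightmost position instead, as this sketch shows for a value that already exists in the array.

```python
import numpy as np

arr = np.array([6, 7, 8, 9])

# Default (side='left'): insert before the existing 7.
left = np.searchsorted(arr, 7)
print(left)

# side='right': insert after the existing 7.
right = np.searchsorted(arr, 7, side='right')
print(right)
```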

Multiple Values

To search for more than one value, use an array with the specified values.

Find the indexes where the values 2, 4, and 6 should be inserted:

import numpy as np

arr = np.array([1, 3, 5, 7])

x = np.searchsorted(arr, [2, 4, 6])

print(x)

NumPy Sorting Arrays

Sorting means putting elements in an ordered sequence.

An ordered sequence is any sequence whose elements follow a defined order, such as numeric or alphabetical, ascending or descending.

NumPy provides a sort() function that returns a sorted copy of the specified array and leaves the original array unchanged.

Sort the array:


import numpy as np

arr = np.array([3, 2, 0, 1])

print(np.sort(arr))
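A few variations that may be useful, sketched with made-up data: np.sort() returns a copy, the ndarray method arr.sort() sorts in place, strings sort alphabetically, and reversing a sorted copy gives descending order.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

# np.sort() returns a new array; arr itself is not modified.
sorted_copy = np.sort(arr)
original_after_copy = arr.tolist()
print(sorted_copy, original_after_copy)

# Reversing the sorted copy with a slice gives descending order.
descending = np.sort(arr)[::-1]
print(descending)

# The ndarray method sort() modifies the array in place and returns None.
arr.sort()
print(arr)

# String arrays are sorted alphabetically.
words = np.sort(np.array(['banana', 'cherry', 'apple']))
print(words)
```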

NumPy Filter Array

Getting some elements out of an existing array and creating a new array out of them is called filtering.

Create a filter array that will return only values higher than 42:

import numpy as np

arr = np.array([41, 42, 43, 44])

filter_arr = arr > 42

newarr = arr[filter_arr]

print(filter_arr)
print(newarr)
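The filter array is simply an array of True and False values, so the condition can also be written directly inside the square brackets, and several conditions can be combined with & (and) or | (or). A minimal sketch with the same values; the second condition is chosen only for illustration.

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

# The condition can be used directly as the index.
above_42 = arr[arr > 42]
print(above_42)

# Combine conditions with &; each condition needs its own parentheses.
mask = (arr > 41) & (arr % 2 == 0)
print(mask)
print(arr[mask])
```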

Matplotlib

Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.

Matplotlib was created by John D. Hunter.

Matplotlib is open source and we can use it freely.

Matplotlib is mostly written in Python; a few segments are written in C, Objective-C, and JavaScript for platform compatibility.

Matplotlib Pyplot

Most of the Matplotlib utilities lie under the pyplot submodule, and are usually imported under the plt alias:

import matplotlib.pyplot as plt

Example

Draw a line in a diagram from position (0,0) to position (6,250)


import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([0, 6])


ypoints = np.array([0, 250])

plt.plot(xpoints, ypoints)
plt.show()

Plotting Without Line

To plot only the markers, you can pass the shortcut format string 'o', which means 'rings', as the third argument.

Draw two points in the diagram, one at position (1, 3) and one in position (8, 10):

import matplotlib.pyplot as plt


import numpy as np

xpoints = np.array([1, 8])


ypoints = np.array([3, 10])

plt.plot(xpoints, ypoints, 'o')


plt.show()

Matplotlib Markers

You can use the keyword argument marker to emphasize each point with a specified marker.

Marker Reference

You can choose any of these markers

Marker Description

'o' Circle

'*' Star

'.' Point

',' Pixel
'x' X

'X' X (filled)

'+' Plus

'P' Plus (filled)

's' Square

'D' Diamond

'd' Diamond (thin)

'p' Pentagon

'H' Hexagon

'h' Hexagon

'v' Triangle Down

'^' Triangle Up

'<' Triangle Left

'>' Triangle Right

'1' Tri Down

'2' Tri Up

'3' Tri Left

'4' Tri Right

'|' Vline
'_' Hline

Line Reference

Line Syntax Description

'-' Solid line

':' Dotted line

'--' Dashed line

'-.' Dashed/dotted line

Color Reference

Color Syntax Description

'r' Red

'g' Green

'b' Blue

'c' Cyan

'm' Magenta

'y' Yellow

'k' Black

'w' White
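The marker, line, and color codes listed above can be combined, either as separate keyword arguments or as one short format string of the form marker, line, color. The sketch below uses made-up points. The "Agg" backend and the output file name marker_demo.png are assumptions so that the code also runs without a display; in Jupyter or Google Colab, drop the matplotlib.use() line and call plt.show() instead of plt.savefig().

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend; not needed in Jupyter/Colab
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

# Keyword form: circle markers, dotted line, red color.
line_kw, = plt.plot(ypoints, marker='o', linestyle=':', color='r')

# Format-string shortcut: 'o' (marker), ':' (line), 'r' (color) in one string.
line_fmt, = plt.plot(ypoints + 5, 'o:r')

plt.savefig("marker_demo.png")   # hypothetical file name
```

Both calls produce the same style; the format string is just a more compact spelling.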
Matplotlib Labels and Title

With Pyplot, you can use the xlabel() and ylabel() functions to set a label for the x- and y-axis.

With Pyplot, you can use the title() function to set a title for the plot.

Example

Add a plot title and labels for the x- and y-axis:

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.plot(x, y)

plt.title("Sports Watch Data")


plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.show()

Matplotlib Subplot

With the subplot() function you can draw multiple plots in one figure.

The subplot() function takes three arguments that describe the layout of the figure.

The layout is organized in rows and columns, which are represented by the first and second argument.

The third argument represents the index of the current plot.

Example

Draw 2 plots:

import matplotlib.pyplot as plt


import numpy as np

#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

plt.subplot(1, 2, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])

plt.subplot(1, 2, 2)
plt.plot(x,y)

plt.show()
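Swapping the first two arguments changes the layout. As a sketch, the same two plots can be placed one above the other by asking for 2 rows and 1 column. The titles, the "Agg" backend, and the output file name subplot_demo.png are assumptions for illustration; in Jupyter or Google Colab, drop the matplotlib.use() line and call plt.show() instead of plt.savefig().

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend; not needed in Jupyter/Colab
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
x = np.array([0, 1, 2, 3])

# 2 rows, 1 column, first plot (top).
plt.subplot(2, 1, 1)
plt.plot(x, np.array([3, 8, 1, 10]))
plt.title("Plot 1")

# 2 rows, 1 column, second plot (bottom).
plt.subplot(2, 1, 2)
plt.plot(x, np.array([10, 20, 30, 40]))
plt.title("Plot 2")

plt.tight_layout()               # keep titles from overlapping the plots
plt.savefig("subplot_demo.png")  # hypothetical file name
```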

Exercise:

Q1. Write code to find the indexes where the values are negative in the array [5, -3, 7, -9, 2, -1].

Q2. Create a Pandas Series with a custom index for the sales data:

• Sales: [300, 500, 700, 600]
• Days: ["Monday", "Tuesday", "Wednesday", "Thursday"]

Perform a query to display sales greater than 400.

Q3. Given the array [12, 24, 7, 35, 50, 9, 21], use NumPy to filter and display only the elements greater
than 15.

Q4. Given the array [15, 22, 8, 19, 30, 42], filter and display only elements divisible by 3 using NumPy.

Q5. Write code to generate a plot with a dashed red line from point (0, 0) to point (5, 10).

References:

https://www.w3schools.com/python/default.asp
