UNIT-5 NOTES
Python Packages:
● A module is a file containing Python definitions (i.e. functions) and
statements; a package is a collection of related modules.
● Python's standard library is made available to the programmer as a
set of modules and packages.
● Definitions from the Package/module can be used within the code of
a program. To use these modules in the program, a programmer
needs to import the Package/module.
● Once you import a module, you can reference (use) any of its
functions or variables in your code.
● There are many ways to import a module in your program; the ones
you should know are:
i. import
ii. from
Import Statement:
● It is the simplest and most common way to use modules in our code.
Its syntax is:
import modulename1 [, modulename2, ...]
Example:
>>> import random
● On execution of this statement, Python will
(i) search for the module, for example the file ‘random.py’,
(ii) create a namespace in which the module's definitions and variables
will be stored,
(iii) then execute the statements in the module.
● Now the definitions of the module will become part of the code in
which the module was imported.
● To use/access/invoke a function, you specify the module name
and the name of the function, separated by a dot (.).
● This format is also known as dot notation.
Example:
>>> random.random()
From Statement:
● It is used to import specific functions into the code instead of the
complete module.
● If we know beforehand which function(s) we will need, then we
may use from.
● For modules having a large number of functions, it is recommended to
use from instead of import.
● Its syntax is
>>> from modulename import functionname [, functionname, ...]
Example:
>>> from random import randint
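The two import forms can be compared side by side. The following is a minimal sketch using the standard random module; the seed value 5 is an arbitrary choice made only so that repeated runs behave the same way.

```python
import random                  # import the whole module
from random import randint     # import only one function from it

random.seed(5)                 # arbitrary seed, only to make runs repeatable

# Dot notation: module name, a dot, then the function name
value = random.random()        # a float in the range [0.0, 1.0)

# A name brought in with 'from' is used directly, without the module prefix
roll = randint(1, 6)           # an integer from 1 to 6, both ends included

print(value, roll)
```

With import, every use names the module (random.random()), which keeps the origin of the function visible; with from, the shorter name randint() is available directly.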
NUMPY:
● NumPy stands for ‘Numerical Python’. It is a package for data
analysis and scientific computing with Python.
● NumPy uses a multidimensional array object, and has functions and
tools for working with these arrays.
● The powerful n-dimensional array in NumPy speeds up data
processing.
● NumPy can be easily interfaced with other Python packages and
provides tools for integrating with other programming languages like
C, C++ etc.
Installing NumPy:
● NumPy can be installed by typing following command:
pip install numpy
Array:
● An array is a data type used to store multiple values using a single
identifier (variable name).
● An array contains an ordered collection of data elements where each
element is of the same type and can be referenced by its index
(position).
● The important characteristics of an array are:
• Each element of the array is of the same data type, though the
values stored in them may be different.
• The entire array is stored contiguously in memory. This makes
operations on arrays fast.
• Each element of the array is identified or referred using the name of
the Array along with the index of that element, which is unique for
each element.
• The index of an element is an integral value associated with the
element, based on the element’s position in the array.
For example consider an array with 5 numbers:
[ 10, 9, 99, 71, 90 ]
NumPy Array:
● NumPy arrays are used to store lists of numerical data, vectors and
matrices.
● The NumPy library has a large set of built-in functions for creating,
manipulating, and transforming NumPy arrays.
● The Python language also has an array data structure, but it is not
as versatile, efficient and useful as the NumPy array. The NumPy
array is officially called ndarray but commonly known as array.
List                                     Array
List can have elements of different      All elements of an array are of the same
data types, for example,                 data type, for example, an array of
[1, 3.4, ‘hello’, ‘a@’]                  floats may be: [1.2, 5.4, 2.7]
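The difference between a list and an array can be checked directly. This is a small sketch, assuming NumPy is installed and imported under the conventional alias np; the variable names are illustrative only.

```python
import numpy as np

mixed_list = [1, 3.4, 'hello', 'a@']      # a list may mix data types
float_array = np.array([1.2, 5.4, 2.7])   # every element is a float

# Each list element keeps its own type
print(type(mixed_list[0]), type(mixed_list[2]))

# An array has a single data type shared by all of its elements
print(float_array.dtype)

# Integers given alongside floats are converted so that all elements share one type
converted = np.array([1, 2.5, 3])
print(converted.dtype)
```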
Indexing:
● For 2-D arrays indexing for both dimensions starts from 0, and each
element is referenced through two indexes i and j, where i represents
the row number and j represents the column number.
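Consider a 2-D array marks holding the marks of four students (one row
per student, one column per subject; the second column is English):
Ramesh 78 67 56
Ramesh 76 75 47
Harun 84 59 60
Prasad 67 72 54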
● Here, marks[i,j] refers to the element at (i+1)th row and (j+1)th column
because the index values start at 0.
● Thus marks[3,1] is the element in the 4th row and second column
which is 72 (marks of Prasad in English).
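The indexing rule can be tried on the marks data itself. A minimal sketch, assuming the numeric columns of the table are stored in a NumPy array named marks (the student names are left out because an array holds a single data type):

```python
import numpy as np

# One row per student, one column per subject (values from the table above)
marks = np.array([[78, 67, 56],
                  [76, 75, 47],
                  [84, 59, 60],
                  [67, 72, 54]])

fourth_row_second_col = marks[3, 1]   # row index 3, column index 1
first_row_third_col = marks[0, 2]     # row index 0, column index 2

print(fourth_row_second_col)   # 72
print(first_row_third_col)     # 56
print(marks.shape)             # (4, 3): 4 rows and 3 columns
```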
Slicing:
● Sometimes we need to extract part of an array. This is done through
slicing.
● We can define which part of the array to be sliced by specifying the
start and end index values using [start : end] along with the array
name.
E.g. 1
>>> array8
array([-2, 2, 6, 10, 14, 18, 22])
>>> array8[3:5] # excludes the value at the end index
array([10, 14])
E.g. 2
>>> array8[ : : -1] # reverse the array
array([22, 18, 14, 10, 6, 2, -2])
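The slicing examples can be reproduced as a script. This sketch rebuilds array8 from the values shown above; the step-2 slice is an extra illustration that does not appear in the notes.

```python
import numpy as np

array8 = np.array([-2, 2, 6, 10, 14, 18, 22])

middle = array8[3:5]        # indexes 3 and 4; the end index 5 is excluded
stepped = array8[::2]       # every second element, starting from index 0
reversed_view = array8[::-1]  # a step of -1 walks the array backwards

print(middle)
print(stepped)
print(reversed_view)
```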
Operations on Arrays:
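1. Arithmetic Operations:
● Arithmetic operators work element-wise on arrays of the same shape,
while @ performs matrix multiplication. The outputs below correspond
to array1 = [[3, 6], [4, 2]] and array2 = [[10, 20], [15, 12]].
#Subtraction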
>>> array1 - array2
array([[ -7, -14],
[-11, -10]])
#Multiplication
>>> array1 * array2
array([[ 30, 120],
[ 60, 24]])
#Matrix Multiplication
>>> array1 @ array2
array([[120, 132],
[ 70, 104]])
#Exponentiation
>>> array1 ** 3
array([[ 27, 216],
[ 64, 8]], dtype=int32)
#Division
>>> array2 / array1
array([[3.33333333, 3.33333333],
[3.75 , 6. ]])
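The operations above can be run as one script. The notes do not show how array1 and array2 were created, so the values below are inferred from the printed results (they reproduce every output shown); treat them as a reconstruction rather than the original definitions.

```python
import numpy as np

# Inferred from the outputs above, not given explicitly in the notes
array1 = np.array([[3, 6], [4, 2]])
array2 = np.array([[10, 20], [15, 12]])

difference = array1 - array2     # element-wise subtraction
product = array1 * array2        # element-wise multiplication
matrix_product = array1 @ array2 # matrix multiplication (rows times columns)
cubes = array1 ** 3              # each element raised to the power 3
quotient = array2 / array1       # element-wise division, always gives floats

print(difference)
print(product)
print(matrix_product)
print(cubes)
print(quotient)
```

Note that * multiplies matching elements, whereas @ follows the rules of matrix multiplication, which is why the two results differ.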
Concatenating Arrays:
● Concatenation means joining two or more arrays.
● Concatenating 1-D arrays means appending the sequences one after
another.
● The numpy.concatenate() function can be used to concatenate two or
more 2-D arrays either row-wise or column-wise.
● All the dimensions of the arrays to be concatenated must match
exactly except for the dimension or axis along which they need to be
joined.
● Any mismatch in the dimensions results in an error. By default, the
concatenation of the arrays happens along axis=0.
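The shape rule for concatenation can be seen in a short sketch. The arrays a, b and c are hypothetical examples chosen so that one pair matches along each axis and one pair does not.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6]])           # shape (1, 2): same number of columns as a
c = np.array([[7], [8]])         # shape (2, 1): same number of rows as a

rows_joined = np.concatenate((a, b))          # axis=0 by default: b is added below a
cols_joined = np.concatenate((a, c), axis=1)  # c is added to the right of a

# a and b have different numbers of rows, so joining them column-wise fails
try:
    np.concatenate((a, b), axis=1)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(rows_joined)
print(cols_joined)
print(mismatch_raised)
```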
Reshaping Arrays:
● We can modify the shape of an array using the reshape() function.
● Reshaping an array cannot be used to change the total number of
elements in the array.
● Attempting to change the number of elements in the array using
reshape() results in an error.
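A brief sketch of reshape(), assuming a hypothetical 12-element array named array3. Any new shape must multiply out to the same 12 elements.

```python
import numpy as np

array3 = np.arange(10, 22)      # 12 elements: 10, 11, ..., 21

grid = array3.reshape(3, 4)     # 3 rows x 4 columns = 12 elements
tall = array3.reshape(6, 2)     # 6 rows x 2 columns = 12 elements

# 5 x 3 would need 15 elements, so NumPy refuses
try:
    array3.reshape(5, 3)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(grid)
print(tall.shape)
print(reshape_failed)
```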
Splitting Arrays:
● We can split an array into two or more subarrays.
● numpy.split() splits an array along the specified axis.
● We can either specify a sequence of index values where an array is to
be split, or we can specify an integer N that indicates the number of
equal parts into which the array is to be split, as a parameter to the
numpy.split() function.
● By default, numpy.split() splits along axis = 0. Consider the array given
below:
>>> array4
array ( [ [ 10, -7, 0, 20],
[ -5, 1, 200, 40],
[ 30, 1, -1, 4],
[ 1, 2, 0, 4],
[ 0, 1, 0, 2 ] ] )
# array4 is split after the first row and up to the third row; the middle
# part is stored in the sub-array second
>>> second
array ( [ [ -5, 1, 200, 40],
[ 30, 1, -1, 4 ] ] )
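The call that produces second is not shown in the notes. One way to obtain it, sketched here as an assumption, is to pass the split positions [1, 3] to numpy.split(); the names first and third for the other two parts are hypothetical.

```python
import numpy as np

array4 = np.array([[10, -7, 0, 20],
                   [-5, 1, 200, 40],
                   [30, 1, -1, 4],
                   [1, 2, 0, 4],
                   [0, 1, 0, 2]])

# Split positions 1 and 3 along axis 0 give three parts:
# rows [0:1], rows [1:3] and rows [3:5]
first, second, third = np.split(array4, [1, 3])

# An integer asks for that many equal parts; here the 4 columns become 2 + 2
left, right = np.split(array4, 2, axis=1)

print(second)
print(first.shape, third.shape)
print(left.shape, right.shape)
```

Asking for equal parts only works when the axis length is divisible by N; splitting the 5 rows of array4 into 2 equal parts would raise an error.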
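Statistical Operations on Arrays:
● mean() returns the arithmetic mean and std() the standard deviation of
the elements. With axis=0 the result is computed column-wise and with
axis=1 row-wise.
>>> arrayA.mean()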
3.125
>>> arrayB.mean()
3.75
>>> arrayB.mean(axis=0)
array([3.5, 4. ])
>>> arrayB.mean(axis=1)
array([4.5, 3. ])
>>> arrayA.std()
3.550968177835448
>>> arrayB.std()
1.479019945774904
>>> arrayB.std(axis=0)
array([0.5, 2. ])
>>> arrayB.std(axis=1)
array([1.5, 1. ])
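The statistics above can be reproduced with the sketch below. The notes do not show the arrays themselves: arrayB is inferred from the axis-wise results, and arrayA is only one possible array that is consistent with the printed mean and standard deviation, so both should be read as assumptions.

```python
import numpy as np

# Assumed values: they reproduce the printed results but are not given in the notes
arrayA = np.array([1, 0, 2, -3, 6, 8, 4, 7])
arrayB = np.array([[3, 6], [4, 2]])

mean_all = arrayB.mean()          # mean of all four elements
mean_cols = arrayB.mean(axis=0)   # one mean per column
mean_rows = arrayB.mean(axis=1)   # one mean per row
std_cols = arrayB.std(axis=0)     # one standard deviation per column
std_rows = arrayB.std(axis=1)     # one standard deviation per row

print(arrayA.mean(), arrayA.std())
print(mean_all, mean_cols, mean_rows)
print(arrayB.std(), std_cols, std_rows)
```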
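Loading Arrays from Files:
● Consider a text file data.txt whose first row is a header row; its
data rows include the following:
2 22 23 45
3 43 51 37
4 41 40 60
5 13 18 37
● We can load the data from the data.txt file into an array, say
studentdata, using the numpy.loadtxt() function with the parameters
described below.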
● The parameter skiprows=1 indicates that the first row is the header
row and therefore we need to skip it as we do not want to load it in
the array.
● The delimiter parameter specifies the character that separates the
values, such as a comma, semicolon, tab or space. By default, values
are split on whitespace (spaces and tabs).
● We can also specify the data type of the array to be created
through the dtype argument. By default, dtype is float.
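The parameters described above can be tried with the following sketch. It first writes a small temporary file so that the example is self-contained: the header names are hypothetical, and the four data rows are the ones visible in the notes. Because the values are separated by spaces, the default delimiter is sufficient here; a comma-separated file would need delimiter=','.

```python
import os
import tempfile
import numpy as np

# Hypothetical header line followed by the data rows shown in the notes
text = ("RollNo Marks1 Marks2 Marks3\n"
        "2 22 23 45\n"
        "3 43 51 37\n"
        "4 41 40 60\n"
        "5 13 18 37\n")

path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w") as f:
    f.write(text)

# skiprows=1 skips the header row; dtype=int stores the values as integers
studentdata = np.loadtxt(path, skiprows=1, dtype=int)

# Without dtype, the values are loaded as floats
default_data = np.loadtxt(path, skiprows=1)

print(studentdata)
print(default_data.dtype)
```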
PANDAS:
● PANDAS (PANel DAta) is a high-level data manipulation tool used for
analyzing data.
● It is very easy to import and export data using Pandas library which
has a very rich set of functions.
● It is built on packages like NumPy and Matplotlib and gives us a
single, convenient place to do most of our data analysis and
visualization work.
● Pandas has two main data structures, Series and DataFrame, which
make the process of analyzing data organized, effective and efficient.
(A third structure, Panel, existed in older versions but has been
removed from current releases.)
Why do we need Pandas when NumPy can be used for data analysis?
Following are some of the differences between Pandas and NumPy:
1. A NumPy array requires homogeneous data, while a Pandas
DataFrame can have different data types (float, int, string, datetime,
etc.).
2. Pandas has a simpler interface for operations like file loading,
plotting, selection, joining and GROUP BY, which come in very handy in
data-processing applications.
3. Pandas DataFrames (with column names) make it very easy to keep
track of data.
4. Pandas is used when data is in tabular format, whereas NumPy is
used for numeric array-based data manipulation.
Installing Pandas
● Installing Pandas is very similar to installing NumPy. To install Pandas
from command line, we need to type in:
pip install pandas
Series:
● A Series is a one-dimensional array containing a sequence of values
of any data type (int, float, list, string, etc) which by default have
numeric data labels starting from zero.
● The data label associated with a particular value is called its index.
● We can also assign values of other data types as indexes.
● We can imagine a Pandas Series as a column in a spreadsheet.
● Example of a series containing names of students is given below:
Index Value
0 Arnab
1 Samridhi
2 Ramit
3 Divyam
4 Kritika
Creation of Series: There are different ways in which a series can be
created in Pandas. To create or use series, we first need to import the
Pandas library.
(A) Creation of Series from Scalar Values
● A Series can be created using scalar values as shown in the
example below:
>>> import pandas as pd
>>> series1 = pd.Series([10,20,30])
>>> print(series1)
Output:
0 10
1 20
2 30
dtype: int64
(A) Indexing
● Indexing in Series is similar to that for NumPy arrays, and is
used to access elements in a series.
● Indexes are of two types: positional index and labelled index.
● Positional index takes an integer value that corresponds to its
position in the series starting from 0, whereas labelled index
takes any user-defined label as index.
● Following example shows usage of the positional index for
accessing a value from a Series.
>>> seriesNum = pd.Series([10,20,30])
>>> seriesNum[2]
30
● When labels are specified, we can use labels as indices while
selecting values from a Series, as shown below. Here, the value 3
is displayed for the labelled index Mar.
>>> seriesMnths = pd.Series([2,3,4],index=["Feb","Mar","Apr"])
>>> seriesMnths["Mar"]
3
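Both kinds of index can be compared in one short script. This sketch reuses the two series from the examples above; the .iloc accessor is not mentioned in the notes, but it is the explicit Pandas way to ask for a position, which avoids any confusion when the labels are themselves integers.

```python
import pandas as pd

seriesNum = pd.Series([10, 20, 30])                               # default labels 0, 1, 2
seriesMnths = pd.Series([2, 3, 4], index=["Feb", "Mar", "Apr"])   # user-defined labels

third_value = seriesNum.iloc[2]         # positional index: third element
mar_by_label = seriesMnths["Mar"]       # labelled index
mar_by_position = seriesMnths.iloc[1]   # the same element, by position

print(third_value, mar_by_label, mar_by_position)
```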
(B) Slicing
● Sometimes, we may need to extract a part of a series. This can
be done through slicing.
● We can define which part of the series is to be sliced by
specifying the start and end parameters [start : end] with the
series name.
● When we use positional indices for slicing, the value at the
end index position is excluded, i.e., only (end - start) number of
data values of the series are extracted.
● Consider the following series seriesCapCntry:
>>> seriesCapCntry = pd.Series(['NewDelhi', 'WashingtonDC', 'London',
'Paris'], index=['India', 'USA', 'UK', 'France'])
● If labelled indexes are used for slicing, then value at the end index
label is also included in the output, for example:
>>> seriesCapCntry['USA' : 'France']
USA WashingtonDC
UK London
France Paris
dtype: object
● We can also get the series in reverse order, for example:
>>> seriesCapCntry[ : : -1]
France Paris
UK London
USA WashingtonDC
India NewDelhi
dtype: object
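The difference between positional and labelled slicing can be checked with the sketch below. The series is rebuilt from the values printed in these notes; .iloc is used for the positional slice to make the intent explicit, which is a choice made for this example.

```python
import pandas as pd

seriesCapCntry = pd.Series(['NewDelhi', 'WashingtonDC', 'London', 'Paris'],
                           index=['India', 'USA', 'UK', 'France'])

# Positional slice: positions 1 and 2 only, the end position 3 is excluded
by_position = seriesCapCntry.iloc[1:3]

# Labelled slice: the end label 'France' is included
by_label = seriesCapCntry['USA':'France']

# A step of -1 returns the series in reverse order
reversed_series = seriesCapCntry[::-1]

print(by_position)
print(by_label)
print(reversed_series)
```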
Attributes of Series
● We can access certain properties called attributes of a series by
using that property with the series name.
Example:
>>> seriesCapCntry
India NewDelhi
USA WashingtonDC
UK London
France Paris
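A few commonly used Series attributes are sketched below on seriesCapCntry. The names 'Capitals' and 'Country' assigned to the series and its index are hypothetical values chosen for illustration.

```python
import pandas as pd

seriesCapCntry = pd.Series(['NewDelhi', 'WashingtonDC', 'London', 'Paris'],
                           index=['India', 'USA', 'UK', 'France'])

seriesCapCntry.name = 'Capitals'        # name: a label for the series itself
seriesCapCntry.index.name = 'Country'   # index.name: a label for the index

print(seriesCapCntry.values)   # the data as a NumPy array
print(seriesCapCntry.size)     # number of values in the series
print(seriesCapCntry.empty)    # True only if the series has no values
print(seriesCapCntry)
```

Attributes are written without parentheses because they are properties of the series, not function calls.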
Methods of Series
● Pandas Series provides many built-in methods for inspecting and
processing its data, for example head(), tail() and count().
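Three frequently used Series methods are sketched below. The series names and values are hypothetical; head(), tail() and count() are standard Pandas methods.

```python
import numpy as np
import pandas as pd

seriesTenTwenty = pd.Series(np.arange(10, 20))   # values 10 to 19

first_two = seriesTenTwenty.head(2)    # first n values (5 if n is omitted)
last_three = seriesTenTwenty.tail(3)   # last n values (5 if n is omitted)
how_many = seriesTenTwenty.count()     # number of non-missing values

# count() ignores missing values (NaN)
with_gap = pd.Series([1.0, np.nan, 3.0])
non_missing = with_gap.count()

print(first_two)
print(last_three)
print(how_many, non_missing)
```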
Creation of DataFrame
● There are a number of ways to create a DataFrame.
>>> students['Preeti']=[89,78,76]
>>> students
Arnab Ramit Samridhi Riya Mallika Preeti
Maths 90 92 89 81 94 89
Science 91 81 91 71 95 78
Hindi 97 96 88 67 99 76
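The creation of the students DataFrame is not shown in the notes. One common approach, sketched here as an assumption, is to build it from a dictionary of lists, with the values taken from the output above, and then add the new column by assignment.

```python
import pandas as pd

# Keys become column labels; the index argument supplies the row labels
students = pd.DataFrame({'Arnab': [90, 91, 97],
                         'Ramit': [92, 81, 96],
                         'Samridhi': [89, 91, 88],
                         'Riya': [81, 71, 67],
                         'Mallika': [94, 95, 99]},
                        index=['Maths', 'Science', 'Hindi'])

# Assigning to a new column label adds that column to the DataFrame
students['Preeti'] = [89, 78, 76]

print(students)
```

Assigning to a label that already exists would overwrite that column instead of adding a new one.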
● If the DataFrame has more than one row with the same label, the
DataFrame.drop() method will delete all the matching rows from it.
>>> students.loc['Science']
Arnab 91
Ramit 81
Samridhi 91
Riya 71
Mallika 95
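Row selection with loc and row deletion with drop() can be tried together. In this sketch the students DataFrame is rebuilt from the values in these notes, and the frame named repeated is a hypothetical example constructed only to show what drop() does with a duplicated row label.

```python
import pandas as pd

students = pd.DataFrame({'Arnab': [90, 91, 97],
                         'Ramit': [92, 81, 96],
                         'Samridhi': [89, 91, 88],
                         'Riya': [81, 71, 67],
                         'Mallika': [94, 95, 99]},
                        index=['Maths', 'Science', 'Hindi'])

# loc with a single row label returns that row as a Series
science = students.loc['Science']

# Build a frame in which the label 'Science' appears twice
repeated = pd.concat([students, students.loc[['Science']]])

# drop() removes every row that carries the given label
remaining = repeated.drop('Science', axis=0)

print(science)
print(list(repeated.index))
print(list(remaining.index))
```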
# DataFrame.append() was removed in Pandas 2.0; pd.concat() gives the same result
>>> dFrame1 = pd.concat([dFrame1, dFrame2])
>>> dFrame1
C1 C2 C3 C5
R1 1.0 2.0 3.0 NaN
R2 4.0 5.0 NaN NaN
R3 6.0 NaN NaN NaN
R4 NaN 10.0 NaN 20.0
R2 NaN 30.0 NaN NaN
R5 NaN 40.0 NaN 50.0
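The combined output above can be reproduced with pd.concat(). The two input DataFrames are not shown in the notes, so the definitions below are inferred from the result and should be read as a reconstruction.

```python
import numpy as np
import pandas as pd

# Inferred from the combined output above
dFrame1 = pd.DataFrame({'C1': [1, 4, 6],
                        'C2': [2, 5, np.nan],
                        'C3': [3, np.nan, np.nan]},
                       index=['R1', 'R2', 'R3'])
dFrame2 = pd.DataFrame({'C2': [10, 30, 40],
                        'C5': [20, np.nan, 50]},
                       index=['R4', 'R2', 'R5'])

# Rows of dFrame2 are placed below those of dFrame1;
# a column missing from one frame is filled with NaN
combined = pd.concat([dFrame1, dFrame2])

print(combined)
```

The row label R2 appears twice in the result because concatenation keeps the original labels of both frames.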
Attributes of Pandas DataFrame
Customisation of Plots
● Pyplot library gives us numerous functions, which can be used to
customize charts such as adding titles or legends.
● grid([visible, which, axis]): configures the grid lines (the first
parameter was named b in older Matplotlib versions).
Colour
● It is also possible to format the plot further by changing the color of
the plotted data.
● We can either use character codes or the color names as values to
the parameter color in the plot().
Linewidth and Line Style
● The linewidth and linestyle properties can be used to change the width
and the style of the line chart.
● Linewidth is specified in points.
● The default line width in current Matplotlib versions is 1.5 points,
which shows a thin line.
● Thus, a larger number will output a thicker line depending on
the value provided.
● We can also set the line style of a line chart using the linestyle
parameter.
● It can take a string such as "solid", "dotted", "dashed" or "dashdot".
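The colour, width and style settings can be combined in a single plot() call. This is a minimal sketch with hypothetical data; the "Agg" backend is selected only so that the script also runs on a machine without a display, and it can be removed (with plt.show() added at the end) to see the chart in a window.

```python
import matplotlib
matplotlib.use("Agg")            # non-interactive backend, no window needed
import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5]                  # hypothetical x values
sales = [500, 620, 580, 700, 650]       # hypothetical y values

# color, linewidth and linestyle are passed as keyword arguments to plot()
lines = plt.plot(days, sales, color='red', linewidth=2, linestyle='dashed')

plt.title('Sample Sales')
plt.xlabel('Days')
plt.ylabel('Sales in Rs')
plt.grid(True)                          # show grid lines

line = lines[0]                         # plot() returns a list of line objects
print(line.get_color(), line.get_linewidth())
```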
Depict the sales for the three weeks using a Line chart. It should have the
following:
i. Chart title as “Mela Sales Report”.
ii. x axis label as “Days”.
iii. y axis label as “Sales in Rs”.
iv. Line colours are red for week 1, blue for week 2 and brown
for week 3.
SOLUTION:
import pandas as pd
import matplotlib.pyplot as plt
# reads "MelaSales.csv" to df by giving path to the file
df=pd.read_csv("MelaSales.csv")
#create a line plot of different color for each week
df.plot(kind='line', color=['red','blue','brown'])
# Set title to "Mela Sales Report"
plt.title('Mela Sales Report')
# Label x axis as "Days"
plt.xlabel('Days')
# Label y axis as "Sales in Rs"
plt.ylabel('Sales in Rs')
#Display the figure
plt.show()
Tkinter Introduction
● Tkinter is a module in Python's standard library used for creating
Graphical User Interfaces (GUIs) for desktop applications.
● Tkinter makes developing desktop applications relatively easy, so it
is a good way to start creating simple projects in Python.
● The Tkinter library provides many built-in widgets (also called Tk
widgets) that can be used directly to create different desktop
applications.
● Among the various GUI frameworks, Tkinter is the only one that is
built into Python's standard library, so you don't have to install it
separately.
● An important feature in favor of Tkinter is that it is cross-platform, so
the same code can easily work on Windows, macOS, and Linux.
● Tkinter is a lightweight module.
● Tkinter is based upon the Tk toolkit, which was originally designed for
the Tool Command Language (Tcl). Because Tk is very popular, it has
been ported to a variety of other scripting languages, including Perl
(Perl/Tk), Ruby (Ruby/Tk), and Python (Tkinter).
● The wide variety of widgets, portability, and flexibility of Tk make it
a suitable tool for designing and implementing both simple and
complex projects.
● Python with Tkinter provides a faster and more efficient way to build
useful desktop applications that would take much more time if you
had to program directly in C/C++ with the help of native OS system
libraries.
The basic steps of creating a simple desktop application using the Tkinter
module in Python are as follows:
● When you create a desktop application, the first thing that you will
have to do is create a new window for the desktop application.
● The main window object is created by the Tk class in Tkinter.
● Once you have a window, you can add text, input fields, buttons, etc.
to it.
import tkinter as tk
win = tk.Tk()
win.title('Hello World!')
# you can add widgets here
win.mainloop()
The two main methods that are used while creating desktop applications in
Python are:
1. Tk( )
This creates the main window. This is how you can use it, just like in the
Hello World code example,
win = tk.Tk() ## where win indicates name of the main window object
2. mainloop( )
This starts the application's event loop. It will wait for events to occur
and process the events as long as the window is not closed.
win.mainloop()
import tkinter as tk
root = tk.Tk()
root.title("Tkinter World")
entry = tk.Entry(root)
entry.pack()
root.mainloop()