Why to Use Pytho1

Uploaded by

ankushpandey900
Copyright
© © All Rights Reserved

Why to use Python:

The following are the primary reasons to use Python in day-to-day work:
1. Python is object-oriented
Its structure supports concepts such as polymorphism, operator overloading
and multiple inheritance.
2. Indentation
Indentation is one of the greatest features of Python: blocks of code are
marked by indentation rather than by braces.
3. It’s free (open source)
Downloading and installing Python is free and easy.
4. It’s powerful
• Dynamic typing
• Built-in types and tools
• Library utilities
• Third-party utilities (e.g. Numeric, NumPy, SciPy)
• Automatic memory management
5. It’s portable
• Python runs on virtually every major platform used today.
• As long as you have a compatible Python interpreter installed, Python
programs will run in the same manner, irrespective of platform.
6. It’s easy to use and learn
• No intermediate compile step
• Python programs are compiled automatically to an intermediate form called
byte code, which the interpreter then reads.
• This gives Python the development speed of an interpreter without the
performance loss inherent in purely interpreted languages.
• Structure and syntax are pretty intuitive and easy to grasp.
7. Interpreted language
Python is processed at runtime by the Python interpreter.
8. Interactive programming language
Users can interact with the Python interpreter directly to write programs.
9. Straightforward syntax
Python syntax is simple and straightforward, which also makes it popular.
Installation:
Many free tools are available to run Python scripts, such as IDLE (Integrated
Development and Learning Environment), which is installed when you install
Python from http://python.org/downloads/
Steps to be followed and remembered:
Step 1: Select the version of Python to install.
Step 2: Download the Python executable installer.
Step 3: Run the executable installer.
Step 4: Verify that Python was installed on Windows.
Step 5: Verify that pip was installed.
Step 6: Add the Python path to the environment variables (optional).

Working with Python

Python Code Execution:

Python’s traditional runtime execution model: the source code you type is
translated to byte code, which is then run by the Python Virtual Machine
(PVM). Your code is automatically compiled, but then it is interpreted.

Source (m.py)  ->  Byte code (m.pyc)  ->  Runtime (PVM)

Source code extension is .py
Byte code extension is .pyc (compiled Python code)
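
As a minimal sketch of this pipeline, the standard-library functions compile() and dis.dis() make the intermediate byte code visible. The one-line program and the "<example>" label below are illustrative assumptions, not part of the original notes.

```python
import dis

# Hypothetical one-line program used only for illustration.
source = "result = 2 + 3"

# Step 1: the source text is compiled to a code object holding byte code.
code_obj = compile(source, "<example>", "exec")

# Step 2: the interpreter (PVM) executes that byte code.
namespace = {}
exec(code_obj, namespace)
print(namespace["result"])  # 5

# Optional: print a human-readable listing of the byte code instructions.
dis.dis(code_obj)
```

When a module is imported, Python normally does the compile step for you and caches the result in a .pyc file, so this explicit call is rarely needed in everyday scripts.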

There are two modes for using the Python interpreter:

• Interactive Mode

• Script Mode
Running Python in interactive mode:
In interactive mode you type code directly at the Python prompt, without
passing a script file to the interpreter. Once you are inside the Python
interpreter, you can start typing statements.
>>> print("hello world")
hello world
# Relevant output is displayed on subsequent lines without the >>> symbol
>>> x=[0,1,2]
# Quantities stored in memory are not displayed by default.
>>> x
[0, 1, 2]
# If a quantity is stored in memory, typing its name will display it.
>>> 2+3
5

The chevron at the beginning of the first line, i.e., the symbol >>>, is a
prompt the Python interpreter uses to indicate that it is ready. If the
programmer types 2+6, the interpreter replies 8.
Running Python in script mode:

Alternatively, programmers can store Python source code in a file with the .py
extension and use the interpreter to execute the contents of the file. To do
so, you have to tell the interpreter the name of the file. For example, if you
have a script named MyFile.py and you're working on Unix, to run the script you
type:

python MyFile.py
Interactive mode is better for small pieces of code, as you can type and
execute them immediately. When the code is longer than 2-4 lines, storing it
in a script makes it easier to modify and reuse in future.
Data types:
The data stored in memory can be of many types. For example, a student roll
number is stored as a numeric value and his or her address is stored as
alphanumeric characters. Python has various standard data types that are used
to define the operations possible on them and the storage method for each of
them.

Int:
Int, or integer, is a whole number, positive or negative, without
decimals, of unlimited length.
>>> print(24656354687654+2)
24656354687656
>>> print(20)
20
>>> print(0b10)
2
>>> print(0B10)
2
>>> print(0X20)
32
>>> 20
20
>>> 0b10
2
>>> a=10
>>> print(a)
10
# To verify the type of any object in Python, use the type() function:
>>> type(10)
<class 'int'>
>>> a=11
>>> print(type(a))
<class 'int'>
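
A short sketch of the two points above, unlimited length and literals in other bases. The variable name big and the chosen values are illustrative assumptions.

```python
# Integers have no fixed size limit: this value needs far more than 64 bits.
big = 2 ** 100
print(big)                    # 1267650600228229401496703205376
print(type(big))              # <class 'int'>

# The same value written as binary, octal and hexadecimal literals.
print(0b100000, 0o40, 0x20)   # 32 32 32

# int() can also parse text in a given base.
print(int("20", 16))          # 32
```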
Float:
Float, or "floating point number" is a number, positive or negative,
containing one or more decimals.
Float can also be scientific numbers with an "e" to indicate the power of 10.
>>> y=2.8
>>> y
2.8
>>> y=2.8
>>> print(type(y))
<class 'float'>
>>> type(.4)
<class 'float'>
>>> 2.
2.0
Example:
x = 35e3
y = 12E4
z = -87.7e100
print(type(x))
print(type(y))
print(type(z))
Output:
<class 'float'>
<class 'float'>
<class 'float'>
Boolean:
Objects of Boolean type may have one of two values, True or False:
>>> type(True)
<class 'bool'>
>>> type(False)
<class 'bool'>
String:

Strings in Python are identified as a contiguous set of characters
represented in quotation marks. Python allows either pairs of
single or double quotes.

• 'hello' is the same as "hello".

• Strings can be output to the screen using the print function. For
example: print("hello").
>>> print("ICFAI")
ICFAI
>>> type("ICFAI")
<class 'str'>
>>> print('ICFAI')
ICFAI
>>> " "
' '
If you want to include either type of quote character within the string, the
simplest way is to delimit the string with the other type. If a string is to
contain a single quote, delimit it with double quotes and vice versa:
>>> print("mrcet is an autonomous (') college")
mrcet is an autonomous (') college
>>> print('mrcet is an autonomous (") college')
mrcet is an autonomous (") college
Suppressing Special Character:
Specifying a backslash (\) in front of the quote character in a string “escapes”
it and causes Python to suppress its usual special meaning. It is then
interpreted simply as a literal quote character:
>>> print("mrcet is an autonomous (\') college")
mrcet is an autonomous (') college
>>> print('mrcet is an autonomous (\") college')
mrcet is an autonomous (") college
The following is a table of escape sequences which cause Python to suppress
the usual special interpretation of a character in a string:

Escape Sequence | Usual Interpretation of Character(s) After Backslash | “Escaped” Interpretation
\' | Terminates string with single quote opening delimiter | Literal single quote (') character
\" | Terminates string with double quote opening delimiter | Literal double quote (") character
\newline | Terminates input line | Newline is ignored
\\ | Introduces escape sequence | Literal backslash (\) character

>>> print('a\
....b')
a....b
>>> print('a\
b\
c')
abc
>>> print('a \n b')
a
 b
>>> print("mrcet \n college")
mrcet
 college

In Python (and almost all other common computer languages), a tab character
can be specified by the escape sequence \t:
>>> print("a\tb")
a       b
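
One way to check how escape sequences are stored is to measure the string length: each escape sequence counts as a single character. The sketch below also uses a raw string (the r prefix), which is not covered in these notes and is included only for contrast. All variable names are illustrative.

```python
# Each escape sequence stands for a single character inside the string.
tabbed = "a\tb"
print(len(tabbed))       # 3: 'a', a tab, 'b'

quoted = 'it\'s'
print(quoted)            # it's

backslash = "a\\b"
print(backslash)         # a\b
print(len(backslash))    # 3

# A raw string (r prefix) switches escape processing off.
raw = r"a\tb"
print(raw)               # a\tb
print(len(raw))          # 4
```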

List:
• It is a general-purpose data structure and one of the most widely used.

• A list is a collection which is ordered and changeable and allows
duplicate members (grows and shrinks as needed, sequence type, sortable).

• To use a list, you must create it first. Do this using square brackets
and separate values with commas.

• We can construct / create a list in many ways. Ex:

>>> list1=[1,2,3,'A','B',7,8,[10,11]]
>>> print(list1)
[1, 2, 3, 'A', 'B', 7, 8, [10, 11]]
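
The properties listed above (changeable, duplicates allowed, grows and shrinks, sortable) can be sketched as follows. The list contents and names are illustrative assumptions.

```python
numbers = [3, 1, 2]        # create with square brackets
numbers.append(1)          # grow: duplicates are allowed
print(numbers)             # [3, 1, 2, 1]

numbers[0] = 10            # changeable: replace an element in place
numbers.remove(2)          # shrink: drop the first matching value
print(numbers)             # [10, 1, 1]

numbers.sort()             # sortable
print(numbers)             # [1, 1, 10]

# Two other ways to construct a list.
letters = list("abc")
squares = [n * n for n in range(4)]
print(letters, squares)    # ['a', 'b', 'c'] [0, 1, 4, 9]
```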

Dictionary
Dictionaries are used to store data values in key:value pairs.
A dictionary is a collection which is ordered (as of Python 3.7), changeable
and does not allow duplicate keys.
Dictionaries are written with curly brackets, and have keys and values:
Example
Create and print a dictionary:
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
print(thisdict)
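
A small sketch of what "changeable" and "no duplicates" mean in practice. The second "year" value and the "colour" key are hypothetical additions for illustration.

```python
# The key "year" appears twice; the later value replaces the earlier one.
thisdict = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964,
    "year": 2020,
}
print(thisdict)         # {'brand': 'Ford', 'model': 'Mustang', 'year': 2020}
print(len(thisdict))    # 3

# Changeable: update a value and add a new key.
thisdict["model"] = "Fiesta"
thisdict["colour"] = "red"

# Ordered: keys come back in insertion order (Python 3.7+).
print(list(thisdict))   # ['brand', 'model', 'year', 'colour']
```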

Nested Dictionaries
A dictionary can contain dictionaries; this is called nested dictionaries.
Example
Create a dictionary that contains three dictionaries:
myfamily = {
"child1" : {
"name" : "Emil",
"year" : 2004
},
"child2" : {
"name" : "Tobias",
"year" : 2007
},
"child3" : {
"name" : "Linus",
"year" : 2011
}
}

Accessing from Nested Dictionary

# Defining a dictionary
d = {'k1':[1,2,3,{'tricky':['oh', 'man', 'inception', {'target':[1,2,3,'hello']}]}]}
# Accessing 1 from the list stored under the key 'target'
d['k1'][3]['tricky'][3]['target'][0]
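
The chained lookup is easier to follow one level at a time. In this sketch the intermediate names level1 to level4 and target are hypothetical; the dictionary itself is the one defined above.

```python
d = {'k1': [1, 2, 3, {'tricky': ['oh', 'man', 'inception',
                                 {'target': [1, 2, 3, 'hello']}]}]}

level1 = d['k1']            # list: [1, 2, 3, {...}]
level2 = level1[3]          # dict with the key 'tricky'
level3 = level2['tricky']   # list: ['oh', 'man', 'inception', {...}]
level4 = level3[3]          # dict with the key 'target'
target = level4['target']   # list: [1, 2, 3, 'hello']

print(target[0])            # 1
print(target[3])            # hello

# The chained form gives the same result in one expression.
print(d['k1'][3]['tricky'][3]['target'][0])   # 1
```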
In the table below, each code row is JupyterLab input (shown in green font in
the original). Press Shift + Enter to run it; the row that follows shows the
output obtained.
Code / Output | Remark
1 | Type declaration is not needed
1 | Integer output
1.0 | Type declaration is not needed
1.0 | Float output
1+1 | Understands arithmetic calculations
2 | Addition output
1*3 | Understands arithmetic calculations
3 | Multiplication output
1/2 | Understands arithmetic calculations
0.5 | Division output
1/2.0 | Provides output in suitable type
0.5 | Float output
2**5 | ** is for power
32 | 2*2*2*2*2 is the output
2+3*5+5 | Understands BODMAS rule
22 | Output
(2+3)*(5+5) | Understands BODMAS rule
50 | Output
4%2 | % is used for finding the remainder
0 | Remainder in output
5/2 | Provides output in suitable type
2.5 | Float output
25%7 | % is used for finding the remainder
4 | Remainder in output
'Single Quote' | Input in quote sign
'Single Quote' | Output in single quote
print('single quote') | Using print()
single quote | Gives output without quote sign
num = 20 | Automatically considers the type
name='ICFAI' |
'I study at {} and my ID is {}'.format(num, name) | Using {} to print variables in a user-friendly way (01)
'I study at 20 and my ID is ICFAI' | Output
print('I study at {} and my ID is {}'.format(name, num)) | Using {} to print variables in a user-friendly way (02)
I study at ICFAI and my ID is 20 | Output
'I study at {one} and my ID is {two}'.format(one=name, two=num) | Using {} to print variables in a user-friendly way (03)
'I study at ICFAI and my ID is 20' | Output
s='abcdefghijk' | String type variable
s[0] | Access first character (indexing starts from 0)
'a' | Output
s[2:] | Accessing characters from the 3rd onwards
'cdefghijk' | Output
s[:4] | Accessing the first 4 characters
'abcd' | Output
s[1:5] | Accessing the 2nd to 5th characters (indices 1 to 4)
'bcde' | Output
my_list = ['a', 'b', 'c'] | Defining list
my_list | Accessing list
['a', 'b', 'c'] | Output
my_list.append('d') | Appending to list
my_list | Accessing list
['a', 'b', 'c', 'd'] | Output

Feature Scaling

In data processing, we try to change the data in such a way that the model can
process it without any problems. Feature scaling is one such process, in which
we transform the features of the dataset so that they fall into a common,
finite range.

There are several ways to do feature scaling. I will be discussing five of
the most commonly used feature scaling techniques.
1. Absolute Maximum Scaling
2. Min-Max Scaling
3. Normalization
4. Standardization
5. Robust Scaling

Absolute Maximum Scaling

• Find the absolute maximum value of the feature in the dataset
• Divide all the values in the column by that maximum value

If we do this for all the numerical columns, then all their values will lie between
-1 and 1. The main disadvantage is that the technique is sensitive to outliers.
Consider the feature *square feet*: if 99% of the houses have an area of less
than 1000 square feet and just one house has an area of 20,000 square feet,
then all those other house values will be scaled down to less than 0.05.
I will be working with the sine and cosine functions throughout the article and
show you how the scaling techniques affect their magnitude. sin() will be ranging
between -1 and +1, and 50*cos() will be ranging between -50 and +50.
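
The code in this section uses the names x, y1 and y2 without showing how they were created. A minimal sketch of one possible setup follows; the sampling grid (200 points over two periods) is an assumption, not taken from the article.

```python
import numpy as np

# Assumed sample points: 200 values over two full periods.
x = np.linspace(0, 4 * np.pi, 200)

y1 = np.sin(x)         # ranges between -1 and +1
y2 = 50 * np.cos(x)    # ranges between -50 and +50

print(y1.min(), y1.max())
print(y2.min(), y2.max())
```

With matplotlib imported as plt, plt.plot(x, y1, 'red') and plt.plot(x, y2, 'blue') would draw the two curves in the colours the text refers to.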

This is how they actually look, you will not even be able to see that the red one
is a sine graph, it basically looks like a straight squiggly line when compared
to the big blue graph.
y1_new = y1/max(abs(y1))
y2_new = y2/max(abs(y2))
See from the graph that both datasets now range from -1 to +1 after the
scaling.
If there is even a single big outlier, however, the scaled values can become very
small, with many data points falling below even 0.01.
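
The outlier sensitivity can be checked with the square-feet example described earlier. This is a sketch with made-up areas: 99 houses below 1000 square feet and one house of 20,000 square feet.

```python
# Hypothetical areas in square feet: 99 ordinary houses and one outlier.
areas = [500 + 5 * i for i in range(99)]   # 500 ... 990, all below 1000
areas.append(20000)                        # the single very large house

# Absolute maximum scaling: divide every value by the largest absolute value.
abs_max = max(abs(a) for a in areas)
scaled = [a / abs_max for a in areas]

print(max(scaled))        # 1.0 (the outlier itself)
print(max(scaled[:-1]))   # 0.0495 (largest ordinary house)
```

Every ordinary house ends up below 0.05, so almost the whole feature is squeezed into a narrow band near zero.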
Min Max Scaling

In min-max scaling, you subtract the minimum value of the dataset from every value
and then divide by the range of the dataset (maximum - minimum). In this case,
your dataset will always lie between 0 and 1, whereas in the previous case
it was between -1 and +1. Again, this technique is sensitive to outliers.
import numpy as np
import matplotlib.pyplot as plt
y1_new = (y1-min(y1))/(max(y1)-min(y1))
y2_new = (y2-min(y2))/(max(y2)-min(y2))
plt.plot(x,y1_new,'red')
plt.plot(x,y2_new,'blue')
[<matplotlib.lines.Line2D at 0x7f6e1bf8fd30>]
Normalization
Instead of subtracting the min() value as in the previous case, here we subtract
the average (mean) value, so the scaled data is centred on zero.

Like scaling, this is a linear transformation: it changes the centre and the range
of your data, but not the shape of its distribution.
y1_new = (y1-np.mean(y1))/(max(y1)-min(y1))
y2_new = (y2-np.mean(y2))/(max(y2)-min(y2))
plt.plot(x,y1_new,'red')
plt.plot(x,y2_new,'blue')
[<matplotlib.lines.Line2D at 0x7f6e1bfb5518>]
Standardization
In standardization, we calculate the z-value for each data point and
replace the data point with that value.

This makes sure that every feature is centred around a mean of 0 with
a standard deviation of 1. It works best if your feature is approximately
normally distributed, like salary or age.
y1_new = (y1-np.mean(y1))/np.std(y1)
y2_new = (y2-np.mean(y2))/np.std(y2)
plt.plot(x,y1_new,'red')
plt.plot(x,y2_new,'blue')
[<matplotlib.lines.Line2D at 0x7f6e25e66e10>]
Robust Scaling
In this method, you subtract the median value from every data point and
then divide by the interquartile range (IQR).

The IQR is the distance between the 25th percentile point and the 75th percentile
point.
This method centres the median value at zero and is robust to outliers.
from scipy import stats
IQR1 = stats.iqr(y1, interpolation = 'midpoint')
y1_new = (y1-np.median(y1))/IQR1
IQR2 = stats.iqr(y2, interpolation = 'midpoint')
y2_new = (y2-np.median(y2))/IQR2
plt.plot(x,y1_new,'red')
plt.plot(x,y2_new,'blue')
[<matplotlib.lines.Line2D at 0x7f6e25e19080>]
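
To see why robust scaling copes better with outliers than standardization, here is a small sketch on made-up data. It computes the IQR with np.percentile and its default linear interpolation rather than scipy's 'midpoint' option used above, so the exact quartiles may differ slightly from the scipy result.

```python
import numpy as np

# Hypothetical feature: nine ordinary values and one large outlier.
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 1000.0])

# Robust scaling: centre on the median, divide by IQR = Q3 - Q1.
q1, q3 = np.percentile(values, [25, 75])
robust = (values - np.median(values)) / (q3 - q1)

# Standardization for comparison: centre on the mean, divide by the std.
standard = (values - np.mean(values)) / np.std(values)

# Spread of the nine ordinary values under each method.
robust_spread = robust[:9].max() - robust[:9].min()
standard_spread = standard[:9].max() - standard[:9].min()

print(np.median(robust))   # 0.0
print(robust_spread)
print(standard_spread)
```

The ordinary values keep a spread of more than 1 under robust scaling, while standardization compresses them into a very narrow band because the outlier inflates both the mean and the standard deviation.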
Is Feature Scaling actually helpful?
Let’s look at an example of a College Admission dataset, in which your goal is to
predict the chance of admission for each student based on the other features given.
You can download the dataset from the link below.
https://www.kaggle.com/mohansacharya/graduate-admissions
import pandas as pd
df = pd.read_csv("Admission_Predict.csv")
df.head()

The dataset has a wide variety of features with different ranges. The first
column Serial No. is not important, so I am going to be deleting it. Then I am
splitting the dataset into training and test dataset.
df.drop("Serial No.",axis=1,inplace=True)
y = df['Chance of Admit ']
df.drop("Chance of Admit ",axis=1,inplace=True)
from sklearn.model_selection import train_test_split
Dealing with outliers using the Z-Score method
Outlier detection is widely used in data science projects, as the presence of
outliers can lead to a bad machine learning model. Let’s take a quick scenario
from a linear regression problem, where you have to predict a person’s weight
from their height. In general, a taller person also weighs more (a positive
linear trend). But suppose there are 3-4 people who weigh much less despite
being comparatively tall: those data points will be treated as bad data, or
outliers, which in the end would not be a good fit for our regression model.
There are various ways through which we can deal with the outliers, though, in
this article, we will have our complete focus on the Z-Score method. Here we will
talk about the limitations of this method, When to use and prefer other methods,
and of course, the complete

Data points with Z > +3 or Z < -3 are considered outliers, which
may be removed as follows using Python:-

Outlier Removal Using Z-Score


import pandas as pd
import seaborn as sn
df=pd.read_csv('c:/users/lenovo/desktop/height.csv')
df.head()
Gender Height
0 Male 73.847017
1 Male 68.781904
2 Male 74.110105
3 Male 71.730978
4 Male 69.881796
df.Height.describe()
count 10000.000000
mean 66.367560
std 3.847528
min 54.263133
25% 63.505620
50% 66.318070
75% 69.174262
max 78.998742
Name: Height, dtype: float64
sn.histplot(df.Height, kde=True)
<matplotlib.axes._subplots.AxesSubplot at 0x170b773e438>

mean=df.Height.mean()
mean
66.3675597548656
std_deviation=df.Height.std()
std_deviation
3.847528120795573
df['zscore']=(df.Height-df.Height.mean())/df.Height.std()
df.head()
Gender Height zscore
0 Male 73.847017 1.943964
1 Male 68.781904 0.627505
2 Male 74.110105 2.012343
3 Male 71.730978 1.393991
4 Male 69.881796 0.913375
new_df=df[(df.zscore<3) & (df.zscore>-3)]
outlier=df[(df.zscore<-3)|(df.zscore>3)]
new_df
Gender Height zscore
0 Male 73.847017 1.943964
1 Male 68.781904 0.627505
2 Male 74.110105 2.012343
3 Male 71.730978 1.393991
4 Male 69.881796 0.913375
... ... ... ...
9995 Female 66.172652 -0.050658
9996 Female 67.067155 0.181830
9997 Female 63.867992 -0.649655
9998 Female 69.034243 0.693090
9999 Female 61.944246 -1.149651

[9993 rows x 3 columns]


outlier
Gender Height zscore
994 Male 78.095867 3.048271
1317 Male 78.462053 3.143445
2014 Male 78.998742 3.282934
3285 Male 78.528210 3.160640
3757 Male 78.621374 3.184854
6624 Female 54.616858 -3.054091
9285 Female 54.263133 -3.146027
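
The same Z-score rule can be sketched without pandas on a small sample. The heights below are made up; statistics.stdev computes the sample standard deviation, which matches the default of the pandas .std() call used above.

```python
import statistics

# Hypothetical heights: twenty ordinary values plus one extreme reading.
heights = [64.0, 65.0, 66.0, 67.0, 68.0] * 4 + [100.0]

mean = statistics.mean(heights)
std = statistics.stdev(heights)   # sample standard deviation


def zscore(value):
    """Number of standard deviations between value and the mean."""
    return (value - mean) / std


# Same conditions as the DataFrame filters above.
kept = [h for h in heights if -3 < zscore(h) < 3]
outliers = [h for h in heights if zscore(h) < -3 or zscore(h) > 3]

print(outliers)    # [100.0]
print(len(kept))   # 20
```

Note that on very small samples no point can reach a Z-score of 3 at all, which is one limitation of the method.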
Outlier Removal Using Standard Deviation
Similarly, to remove outliers using the standard deviation, we choose the lower
cutoff as (mean - 3*std) and the upper cutoff as (mean + 3*std), since about 99.7%
of the values of a normally distributed feature lie within this range.
The figure below expresses the same:-
Python code for an example of removing outliers using the standard deviation is
provided below:-

import pandas as pd
import seaborn as sn
df=pd.read_csv('c:/users/lenovo/desktop/height.csv')
df.head()
Gender Height
0 Male 73.847017
1 Male 68.781904
2 Male 74.110105
3 Male 71.730978
4 Male 69.881796
df.Height.describe()
count 10000.000000
mean 66.367560
std 3.847528
min 54.263133
25% 63.505620
50% 66.318070
75% 69.174262
max 78.998742
Name: Height, dtype: float64
sn.histplot(df.Height, kde=True)
<matplotlib.axes._subplots.AxesSubplot at 0x170b773e438>
mean=df.Height.mean()
mean
66.3675597548656
std_deviation=df.Height.std()
std_deviation
3.847528120795573
mean-3*std_deviation
54.824975392478876
mean+3*std_deviation
77.91014411725232
df[df.Height<54.82]
Gender Height
6624 Female 54.616858
9285 Female 54.263133
df[df.Height>77.91]
Gender Height
994 Male 78.095867
1317 Male 78.462053
2014 Male 78.998742
3285 Male 78.528210
3757 Male 78.621374
new_df=df[df.Height<77.91]
new_df
Gender Height
0 Male 73.847017
1 Male 68.781904
2 Male 74.110105
3 Male 71.730978
4 Male 69.881796
... ... ...
9995 Female 66.172652
9996 Female 67.067155
9997 Female 63.867992
9998 Female 69.034243
9999 Female 61.944246

[9995 rows x 2 columns]


new_df1=new_df[new_df.Height>54.82]
new_df1
Gender Height
0 Male 73.847017
1 Male 68.781904
2 Male 74.110105
3 Male 71.730978
4 Male 69.881796
... ... ...
9995 Female 66.172652
9996 Female 67.067155
9997 Female 63.867992
9998 Female 69.034243
9999 Female 61.944246

[9993 rows x 2 columns]
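
The two filtering steps above can also be applied in a single pass. A minimal sketch, using the mean and standard deviation reported in the output above and a handful of heights copied from the tables; the list name heights and the one-pass filter are illustrative choices.

```python
# Mean and standard deviation as reported by the notebook output above.
mean = 66.3675597548656
std_deviation = 3.847528120795573

lower = mean - 3 * std_deviation
upper = mean + 3 * std_deviation

# A few heights copied from the tables above.
heights = [73.847017, 54.616858, 78.095867, 61.944246, 54.263133]

# Keep only the values strictly inside the two cutoffs.
kept = [h for h in heights if lower < h < upper]
removed = [h for h in heights if not (lower < h < upper)]

print(kept)      # [73.847017, 61.944246]
print(removed)   # [54.616858, 78.095867, 54.263133]
```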

x_train,x_test,y_train,y_test = train_test_split(df,y,test_size=0.2)
I am going to build a linear regression model, first without normalization
and then with normalization, to check whether there is any improvement in the
accuracy.
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(x_train,y_train)
pred = lr.predict(x_test)
from sklearn import metrics
rmse = np.sqrt(metrics.mean_squared_error(y_test,pred))
rmse
0.06845052747026953
See that without normalization the root mean squared error comes out to be
0.0684. The value is small because `y` is a chance of admission, which lies
between 0 and 1.
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
sc.fit(df)
df = sc.transform(df)
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(df,y,test_size=0.2)
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(x_train,y_train)
pred = lr.predict(x_test)
from sklearn import metrics
rmse = np.sqrt(metrics.mean_squared_error(y_test,pred))
rmse
0.05674870151306346
See that, we are able to get a significant reduction in the error when we used the
standardization technique.
Code / Output | Remark (continued)
my_list[0:4] | Accessing list (elements 0-3)
['a', 'b', 'c', 'd'] | Output
my_list[0:] | Accessing list (elements 0 onwards)
['a', 'b', 'c', 'd'] | Output
my_list[0]='IUR' | Replacing an element with other data
my_list | Accessing list
['IUR', 'b', 'c', 'd'] | Output
nest = [1, 2, [3, 4]] | Defining nested list
nest[2] | Accessing nested part of the list
[3, 4] | Output
nest[2][1] | Accessing (4) from list
4 | Output
nest=[1,2,3,[4,5, ['target', 'python', 'matlab']]] | Defining nested list
nest[3][2][2] | Accessing list element 'matlab'
'matlab' | Output
if 1==2: | Understanding if---else
    print('first') |
else: |
    print('last') |
last | Output
if 1!=2: | Understanding if---else
    print('first') |
else: |
    print('last') |
first | Output
if 1==2: | Understanding if---else ladder
    print('first') |
elif 3==3: |
    print('middle') |
else: |
    print('last') |
middle | Output
fruits = ["apple", "banana", "cherry"] | Understanding for loop
for x in fruits: |
    print(x) |
apple | Output
banana |
cherry |
seq=[1,2,3,4,5] | Defining a list
for i in seq: | Using for loop
    print(i) |
1 | Output
2 |
3 |
4 |
5 |
fruits = ["apple", "banana", "cherry"] | Printing using for loop (use of break)
for x in fruits: |
    if x == "banana": |
        break |
    print(x) |
apple | Output
fruits = ["apple", "banana", "cherry"] | Printing using for loop (use of continue)
for x in fruits: |
    if x == "banana": |
        continue |
    print(x) |
apple | Output
cherry |
for x in range(6): | Printing a range that starts from 0
    print(x) |
0 | Output
1 |
2 |
3 |
4 |
5 |
for x in range(2, 6): | For loop over a range with a start and a stop value
    print(x) |
2 | Output
3 |
4 |
5 |
i=2 | Printing using while loop
while i<5: |
    print('i is: {}'.format(i)) |
    i=i+1 |
i is: 2 | Output
i is: 3 |
i is: 4 |
nest=[1,2,[7,4],[5, ['Nisha', 'DSB'], [100,200,['Hello']], 23, 11],1,7] | Nested list for practice
nest[3][2][1] | Accessing 200 from the list
200 | Output
d= {'k1':[1,2,3,{'tricky':['oh', 'man', 'inception', {'target':[1,2,3,'hello']}]}]} | Defining a dictionary
d['k1'][3]['tricky'][3]['target'][0] | Accessing 1 from the list
1 | Output
