Python Cheatsheet_A Quick Reference Guide for Data Science.pdf
Python Cheatsheet
Contents
1. Syntax and whitespace
2. Comments
3. Numbers and operations
4. String manipulation
5. Lists, tuples, and dictionaries
6. JSON
7. Loops
8. File handling
9. Functions
10. Working with datetime
11. NumPy
12. Pandas
To run a cell, press Shift+Enter or click Run at the top of the page.
1. Syntax and whitespace
Python uses indentation to indicate the nesting level of statements. In the following cell, 'if' and 'else' are at the same level, while each 'print' is indented one level deeper. Statements at the same level must use the same indentation.
# input() returns a string, so compare against the string "0"
student_number = input("Enter your student number:")
if student_number != "0":
    print("Welcome student {}".format(student_number))
else:
    print("Try again!")
Enter your student number: 1
Welcome student 1
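As a further illustration of indentation levels, the sketch below nests one conditional inside another. The function name classify is hypothetical and is not part of the original cheatsheet.

```python
# Hypothetical helper (not from the original cheatsheet) showing nested indentation levels
def classify(number):
    # Level 1: body of the function
    if number < 0:
        # Level 2: body of the 'if'
        return "negative"
    else:
        # Level 2: body of the 'else'
        if number == 0:
            # Level 3: body of the nested 'if'
            return "zero"
        return "positive"


print(classify(-3))
print(classify(0))
print(classify(7))
```

Each deeper level adds one consistent step of indentation (four spaces here), which is how Python decides which statements belong to which block.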
2. Comments
In Python, comments start with a hash '#' and extend to the end of the line. The '#' can be at the beginning of the line or after code.
# This is code to print hello world!
print("Hello world!") # Print statement for hello world
print("# is not a comment in this case")
Hello world!
# is not a comment in this case
3. Numbers and operations
Python 3 has three built-in numeric types:
Integers (e.g., 1, 20, 45, 1000) indicated by int. Integers have arbitrary precision, so the separate long integer type from Python 2 no longer exists.
Floating point numbers (e.g., 1.25, 20.35, 1000.00) indicated by float
Complex numbers (e.g., 3+2j, where j is the imaginary unit) indicated by complex
Operation    Result
x + y        Sum of x and y
x - y        Difference of x and y
x * y        Product of x and y
x / y        Quotient of x and y
x // y       Quotient of x and y (floored)
x % y        Remainder of x / y
abs(x)       Absolute value of x
int(x)       x converted to integer
long(x)      x converted to long integer (Python 2 only; in Python 3, use int(x))
float(x)     x converted to floating point
pow(x, y)    x to the power y
x ** y       x to the power y
# Number examples
a = 5 + 8
print("Sum of int numbers: {} and number format is {}".format(a, type(a)))
b = 5 + 2.3
print("Sum of int and float numbers: {} and number format is {}".format(b, type(b)))
Sum of int numbers: 13 and number format is <class 'int'>
Sum of int and float numbers: 7.3 and number format is <class 'float'>
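The operations in the table can be tried out with a short sketch like the following. The values 7 and 2 are chosen only for illustration.

```python
# Illustrative values (assumed for this sketch, not from the original cheatsheet)
x, y = 7, 2

true_quotient = x / y      # True division always returns a float
floor_quotient = x // y    # Floor division rounds down toward negative infinity
remainder = x % y          # Remainder of the division
power = x ** y             # Same result as pow(x, y)
negative_floor = -7 // 2   # Floors to -4, not -3
truncated = int(3.9)       # int() drops the fractional part
as_float = float(3)        # Converts an int to a float

print(true_quotient, floor_quotient, remainder, power)
print(negative_floor, truncated, as_float, abs(-5))
```

Note that floor division and int() behave differently for negative numbers: the first rounds down, the second truncates toward zero.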
4. String manipulation
Like other programming languages, Python has rich features for string manipulation.
# Store strings in a variable
test_word = "hello world to everyone"
# Print the test_word value
print(test_word)
# Use [] to access the character of the string. The first character is indicated by '0'.
print(test_word[0])
# Use the len() function to find the length of the string
print(len(test_word))
# Some examples of finding in strings
print(test_word.count('l')) # Count number of times l repeats in the string
print(test_word.find("o")) # Find letter 'o' in the string. Returns the position of first match.
print(test_word.count(' ')) # Count number of spaces in the string
print(test_word.upper()) # Change the string to uppercase
print(test_word.lower()) # Change the string to lowercase
print(test_word.replace("everyone","you")) # Replace word "everyone" with "you"
print(test_word.title()) # Change string to title format
print(test_word + "!!!") # Concatenate strings
print(":".join(test_word)) # Add ":" between each character
print("".join(reversed(test_word))) # Reverse the string
hello world to everyone
h
23
3
4
3
HELLO WORLD TO EVERYONE
hello world to everyone
hello world to you
Hello World To Everyone
hello world to everyone!!!
h:e:l:l:o: :w:o:r:l:d: :t:o: :e:v:e:r:y:o:n:e
enoyreve ot dlrow olleh
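Strings can also be sliced, and because they are immutable, a change always produces a new string. The following is a minimal sketch that reuses the test_word value from the cell above.

```python
test_word = "hello world to everyone"

first_word = test_word[0:5]      # Characters at index 0 through 4
last_word = test_word[-8:]       # The last eight characters
reversed_word = test_word[::-1]  # A step of -1 walks the string backwards
words = test_word.split()        # Split on whitespace into a list of words

# Strings are immutable: assigning to an index raises TypeError
try:
    test_word[0] = "H"
except TypeError as error:
    print("Cannot modify a string in place:", error)

# Build a new string instead
capitalized = "H" + test_word[1:]

print(first_word, last_word)
print(reversed_word)
print(words)
print(capitalized)
```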
5. Lists, tuples, and dictionaries
Python has built-in support for lists, tuples, and dictionaries. Arrays are available through the standard array module and through NumPy.
Lists
A list is created by placing all the items (elements) inside square brackets [ ] separated by commas. A list can have any number of items, and they may be of different types (integer, float, strings, etc.).
# A Python list is similar to an array. You can create an empty list too.
my_list = []
first_list = [3, 5, 7, 10]
second_list = [1, 'python', 3]
# Nest multiple lists
nested_list = [first_list, second_list]
nested_list
[[3, 5, 7, 10], [1, 'python', 3]]
# Combine multiple lists
combined_list = first_list + second_list
combined_list
[3, 5, 7, 10, 1, 'python', 3]
# You can slice a list, just like strings
combined_list[0:3]
[3, 5, 7]
# Append a new entry to the list
combined_list.append(600)
combined_list
[3, 5, 7, 10, 1, 'python', 3, 600]
# Remove the last entry from the list
combined_list.pop()
600
# Iterate the list
for item in combined_list:
    print(item)
3
5
7
10
1
python
3
Tuples
A tuple is similar to a list, but it is written with parentheses ( ) instead of square brackets. The main difference is that a tuple is immutable, while a list is mutable.
my_tuple = (1, 2, 3, 4, 5)
my_tuple[1:4]
(2, 3, 4)
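To make the immutability of tuples concrete, here is a small sketch that reuses my_tuple. The variable names other than my_tuple are introduced only for this example.

```python
my_tuple = (1, 2, 3, 4, 5)

# Assigning to an index of a tuple raises TypeError
try:
    my_tuple[0] = 100
except TypeError as error:
    message = str(error)
    print("Tuples are immutable:", message)

# Unpacking: the first item goes to 'first', the remaining items to the list 'rest'
first, *rest = my_tuple

# Concatenation creates a new tuple; the original is unchanged
longer_tuple = my_tuple + (6,)

print(first, rest)
print(longer_tuple)
```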
Dictionaries
A dictionary is also known as an associative array. A dictionary consists of a collection of key-value pairs. Each key-value pair maps the key to its associated value.
desk_location = {'jack': 123, 'joe': 234, 'hary': 543}
desk_location['jack']
123
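Beyond looking up a single key, dictionaries are commonly updated and iterated. The sketch below extends the desk_location example; the keys 'mary' and 'bob' and the new desk numbers are hypothetical values added for illustration.

```python
desk_location = {'jack': 123, 'joe': 234, 'hary': 543}

desk_location['mary'] = 678   # Add a new key-value pair (hypothetical entry)
desk_location['joe'] = 235    # Overwrite the value of an existing key

# get() returns a default instead of raising KeyError for a missing key
missing = desk_location.get('bob', 'not assigned')

# pop() removes a key and returns its value
removed = desk_location.pop('hary')

# items() yields (key, value) pairs
for name, desk in desk_location.items():
    print(name, desk)

print('jack' in desk_location, missing, removed)
```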
6. JSON
JSON is text written in JavaScript Object Notation. Python has a built-in package called json that can be used to work with JSON data.
import json
# Sample JSON data
x = '{"first_name":"Jane", "last_name":"Doe", "age":25, "city":"Chicago"}'
# Read JSON data
y = json.loads(x)
# Print the output, which is similar to a dictionary
print("Employee name is "+ y["first_name"] + " " + y["last_name"])
Employee name is Jane Doe
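The reverse direction, turning a Python dictionary into JSON text, uses json.dumps. The following sketch reuses the sample employee data; the variable names are chosen for this example.

```python
import json

employee = {"first_name": "Jane", "last_name": "Doe", "age": 25, "city": "Chicago"}

# Serialize the dictionary to a JSON string
text = json.dumps(employee)

# indent and sort_keys produce a more readable layout
pretty = json.dumps(employee, indent=2, sort_keys=True)

# Parsing the string again gives back an equal dictionary
round_trip = json.loads(text)

print(text)
print(pretty)
print(round_trip == employee)
```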
7. Loops
If, elif, else: Python supports conditional statements like any other programming language. Python relies on indentation (whitespace at the beginning of the line) to define the scope of the code.
a = 22
b = 33
c = 100
# if ... else example
if a > b:
    print("a is greater than b")
else:
    print("b is greater than a")
# if ... elif ... else example
if a > b:
    print("a is greater than b")
elif b > c:
    print("b is greater than c")
else:
    print("b is greater than a and c is greater than b")
b is greater than a
b is greater than a and c is greater than b
While loop: Processes a set of statements as long as the condition is true
# Sample while example
i = 1
while i < 10:
    print("count is " + str(i))
    i += 1
print("="*10)
# Continue to next iteration if x is 2. Finally, print message once the condition is false.
x = 0
while x < 5:
    x += 1
    if x == 2:
        continue
    print(x)
else:
    print("x is no longer less than 5")
count is 1
count is 2
count is 3
count is 4
count is 5
count is 6
count is 7
count is 8
count is 9
==========
1
3
4
5
x is no longer less than 5
For loop: In Python, a for loop works like an iterator. It is used for iterating over a sequence (list, tuple, dictionary, set, string, or range).
# Sample for loop examples
fruits = ["orange", "banana", "apple", "grape", "cherry"]
for fruit in fruits:
    print(fruit)
print("\n")
print("="*10)
print("\n")
# Iterating range
for x in range(1, 10, 2):
    print(x)
else:
    print("task complete")
print("\n")
print("="*10)
print("\n")
# Iterating multiple lists with nested loops (every combination is printed)
traffic_lights = ["red", "yellow", "green"]
action = ["stop", "slow down", "go"]
for light in traffic_lights:
    for task in action:
        print(light, task)
orange
banana
apple
grape
cherry
==========
1
3
5
7
9
task complete
==========
red stop
red slow down
red go
yellow stop
yellow slow down
yellow go
green stop
green slow down
green go
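The nested loops above print every combination of light and action. When the goal is instead to pair items position by position, the built-in zip function is a common choice, and enumerate adds a counter. This is a minimal sketch using the same two lists.

```python
traffic_lights = ["red", "yellow", "green"]
action = ["stop", "slow down", "go"]

# zip pairs the first items together, then the second items, and so on
pairs = list(zip(traffic_lights, action))
for light, task in zip(traffic_lights, action):
    print(light, task)

# enumerate yields a running index alongside each item
numbered = list(enumerate(traffic_lights, start=1))
for index, light in numbered:
    print(index, light)
```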
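8. File handling
The key function for working with files in Python is the open() function. The open() function takes two parameters: filename and mode.
There are four different methods (modes) for opening a file:
"r" - Read (the default; raises an error if the file does not exist)
"a" - Append (creates the file if it does not exist)
"w" - Write (creates the file if it does not exist and overwrites existing content)
"x" - Create (raises an error if the file already exists)
In addition, you can specify if the file should be handled in binary or text mode.
"t" - Text (the default)
"b" - Binary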
# Let's create a test text file
!echo "This is a test file with text in it. This is the first line." > test.txt
!echo "This is the second line." >> test.txt
!echo "This is the third line." >> test.txt
# Read file
file = open('test.txt', 'r')
print(file.read())
file.close()
print("\n")
print("="*10)
print("\n")
# Read first 10 characters of the file
file = open('test.txt', 'r')
print(file.read(10))
file.close()
print("\n")
print("="*10)
print("\n")
# Read line from the file
file = open('test.txt', 'r')
print(file.readline())
file.close()
This is a test file with text in it. This is the first line.
This is the second line.
This is the third line.
==========
This is a
==========
This is a test file with text in it. This is the first line.
# Create new file
file = open('test2.txt', 'w')
file.write("This is content in the new test2 file.")
file.close()
# Read the content of the new file
file = open('test2.txt', 'r')
print(file.read())
file.close()
This is content in the new test2 file.
# Update file
file = open('test2.txt', 'a')
file.write("\nThis is additional content in the new file.")
file.close()
# Read the content of the new file
file = open('test2.txt', 'r')
print(file.read())
file.close()
This is content in the new test2 file.
This is additional content in the new file.
# Delete file
import os
file_names = ["test.txt", "test2.txt"]
for item in file_names:
    if os.path.exists(item):
        os.remove(item)
        print(f"File {item} removed successfully!")
    else:
        print(f"{item} file does not exist.")
File test.txt removed successfully!
File test2.txt removed successfully!
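The examples above call close() explicitly. A widely used alternative is the with statement, which closes the file automatically even if an error occurs. The sketch below is one way to write it; the file name test3.txt is hypothetical, and a temporary directory is used so that nothing is left behind.

```python
import os
import tempfile

# Work inside a temporary directory that is removed automatically at the end
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "test3.txt")  # Hypothetical file name

    # The file is closed automatically when the 'with' block ends
    with open(path, "w") as file:
        file.write("First line.\nSecond line.\n")

    # Iterating over a file object yields one line at a time
    with open(path, "r") as file:
        lines = [line.rstrip("\n") for line in file]

    file_was_closed = file.closed

print(lines)
print(file_was_closed)
```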
9. Functions
A function is a block of code that runs when it is called. You can pass data, or parameters, into the function. In Python, a function is defined with the def keyword.
# Defining a function
def new_funct():
    print("A simple function")
# Calling the function
new_funct()
A simple function
# Sample function with parameters
def param_funct(first_name):
    print(f"Employee name is {first_name}.")
param_funct("Harry")
param_funct("Larry")
param_funct("Shally")
Employee name is Harry.
Employee name is Larry.
Employee name is Shally.
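Functions can also return a value and declare default parameter values. The following sketch assumes a hypothetical function format_employee and a hypothetical title parameter, neither of which appears in the original cheatsheet.

```python
# Hypothetical function for illustration: returns a string instead of printing it
def format_employee(first_name, title="Engineer"):
    # 'title' falls back to "Engineer" when the caller does not supply it
    return f"{first_name} ({title})"


default_result = format_employee("Harry")
keyword_result = format_employee("Larry", title="Manager")

print(default_result)
print(keyword_result)
```

Returning a value rather than printing it lets the caller store, test, or combine the result.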
Anonymous functions (lambda): A lambda is a small anonymous function. A lambda function can take any number of arguments but can contain only one expression.
# Sample lambda example
x = lambda y: y + 100
print(x(15))
print("\n")
print("="*10)
print("\n")
x = lambda a, b: a*b/100
print(x(2,4))
115
==========
0.08
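Lambdas are most often passed directly to another function rather than assigned to a name. The sketch below shows that pattern with sorted and map; the employee ages are made-up values for illustration.

```python
# Hypothetical (name, age) records used only for this example
employees = [("Harry", 34), ("Larry", 28), ("Shally", 41)]

# Sort by the second element of each tuple (the age)
by_age = sorted(employees, key=lambda employee: employee[1])

# Apply a small expression to every item of a list
doubled = list(map(lambda n: n * 2, [1, 2, 3]))

print(by_age)
print(doubled)
```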
10. Working with datetime
The datetime module in Python can be used to work with date and time objects.
import datetime
x = datetime.datetime.now()
print(x)
print(x.year)
print(x.strftime("%A"))
print(x.strftime("%B"))
print(x.strftime("%d"))
# %I is the 12-hour clock, which pairs with %p (AM/PM); %H is the 24-hour clock
print(x.strftime("%I:%M:%S %p"))
2023-11-30 19:51:49.727931
2023
Thursday
November
30
07:51:49 PM
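Two other common tasks are date arithmetic and parsing a date from text. The sketch below uses a fixed timestamp matching the sample output so that the results are reproducible; the variable names are chosen for this example.

```python
from datetime import datetime, timedelta

# A fixed moment instead of now(), so the results do not change between runs
moment = datetime(2023, 11, 30, 19, 51, 49)

# timedelta represents a duration that can be added to or subtracted from a datetime
next_week = moment + timedelta(days=7)

# strptime parses text using the same format codes that strftime uses for output
parsed = datetime.strptime("2023-11-30 19:51:49", "%Y-%m-%d %H:%M:%S")

# Subtracting two datetimes gives a timedelta
elapsed = next_week - parsed

print(next_week.isoformat())
print(parsed == moment)
print(elapsed.days)
```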
11. NumPy
NumPy is the fundamental package for scientific computing with Python. Among other things, it contains:
Powerful N-dimensional array object
Sophisticated (broadcasting) functions
Tools for integrating C/C++ and Fortran code
Useful linear algebra, Fourier transform, and random number capabilities
# Install NumPy using pip
!pip install numpy
Requirement already satisfied: numpy in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (1.22.4)
# Import NumPy module
import numpy as np
Inspecting your array
# Create array
a = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b = np.zeros((3,5)) # Create array with zeroes
c = np.ones( (2,3,4), dtype=np.int16 ) # Create array with ones and define the data type
d = np.ones((3,5))
a.shape # Array dimension
(3, 5)
len(b) # Length of array (size of the first dimension)
3
c.ndim # Number of array dimensions
3
a.size # Number of array elements
15
b.dtype # Data type of array elements
dtype('float64')
c.dtype.name # Name of data type
'int16'
c.astype(float) # Convert an array type to a different type
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
Basic math operations
# Create array
a = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b = np.zeros((3,5)) # Create array with zeroes
c = np.ones((2, 3, 4), dtype=np.int16) # Create array of ones with an explicit data type
d = np.ones((3,5))
np.add(a,b) # Addition
array([[ 0., 1., 2., 3., 4.],
[ 5., 6., 7., 8., 9.],
[10., 11., 12., 13., 14.]])
np.subtract(a,b) # Subtraction
array([[ 0., 1., 2., 3., 4.],
[ 5., 6., 7., 8., 9.],
[10., 11., 12., 13., 14.]])
np.divide(a,d) # Division
array([[ 0., 1., 2., 3., 4.],
[ 5., 6., 7., 8., 9.],
[10., 11., 12., 13., 14.]])
np.multiply(a,d) # Multiplication
array([[ 0., 1., 2., 3., 4.],
[ 5., 6., 7., 8., 9.],
[10., 11., 12., 13., 14.]])
np.array_equal(a,b) # Array-wise comparison: True only if shapes and all elements match
False
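The operations above combine arrays of identical shape. The introduction also mentions broadcasting, which lets NumPy combine arrays of different but compatible shapes. The sketch below is an illustration only. The arrays row and col are made-up values.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)          # Same 3x5 array as in the notebook

row = np.array([10, 20, 30, 40, 50])     # Shape (5,): one value per column
col = np.array([[100], [200], [300]])    # Shape (3, 1): one value per row

scaled = a * 2       # A scalar is applied to every element
shifted = a + row    # The row vector is repeated for each of the 3 rows
offset = a + col     # The column vector is repeated across the 5 columns

print(scaled)
print(shifted)
print(offset)
```

Shapes are compatible when, compared from the last axis backwards, each pair of dimensions is equal or one of them is 1.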
Aggregate functions
# Create array
a = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b = np.zeros((3,5)) # Create array with zeroes
c = np.ones((2, 3, 4), dtype=np.int16) # Create array of ones with an explicit data type
d = np.ones((3,5))
a.sum() # Array-wise sum
105
a.min() # Array-wise min value
0
a.mean() # Array-wise mean
7.0
a.max(axis=0) # Max value of each column (the reduction runs down axis 0, the rows)
array([10, 11, 12, 13, 14])
np.std(a) # Standard deviation
4.320493798938574
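The axis argument is easy to misread: axis=0 collapses the rows and yields one result per column, while axis=1 collapses the columns and yields one result per row. A minimal sketch on the same 3x5 array:

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

col_max = a.max(axis=0)    # One value per column: [10 11 12 13 14]
row_max = a.max(axis=1)    # One value per row:    [ 4  9 14]
row_sum = a.sum(axis=1)    # Sum of each row:      [10 35 60]
col_mean = a.mean(axis=0)  # Mean of each column:  [5. 6. 7. 8. 9.]

print(col_max, row_max, row_sum, col_mean)
```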
Subsetting, slicing, and indexing
# Create array
a = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b = np.zeros((3,5)) # Create array with zeroes
c = np.ones((2, 3, 4), dtype=np.int16) # Create array of ones with an explicit data type
d = np.ones((3,5))
a[1,2] # Select element of row 1 and column 2
7
a[0:2] # Select the rows at index 0 and 1
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
a[:1] # Select all items at row 0
array([[0, 1, 2, 3, 4]])
a[-1:] # Select all items from last row
array([[10, 11, 12, 13, 14]])
a[a<2] # Select elements from 'a' that are less than 2
array([0, 1])
Array manipulation
# Create array
a = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b = np.zeros((3,5)) # Create array with zeroes
c = np.ones((2, 3, 4), dtype=np.int16) # Create array of ones with an explicit data type
d = np.ones((3,5))
np.transpose(a) # Transpose array 'a'
array([[ 0, 5, 10],
[ 1, 6, 11],
[ 2, 7, 12],
[ 3, 8, 13],
[ 4, 9, 14]])
a.ravel() # Flatten the array
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
a.reshape(5,-1) # Reshape to 5 rows without changing the data; -1 lets NumPy infer the number of columns
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
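In a reshape call, the documented placeholder for "infer this dimension" is -1, and only one dimension may be inferred. The sketch below is an illustration, not part of the original notebook. It also checks that the reshaped array shares memory with the original in this contiguous case.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

tall = a.reshape(5, -1)   # 15 elements / 5 rows -> 3 columns are inferred
flat = a.reshape(-1)      # A single -1 flattens to one dimension

print(tall.shape)
print(flat.shape)

# For this contiguous array, reshape returns a view rather than a copy.
shares = np.shares_memory(a, tall)
print(shares)
```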
np.append(a,b) # Append items to the array
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.,
13., 14., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0.])
np.concatenate((a,d), axis=0) # Concatenate arrays
array([[ 0., 1., 2., 3., 4.],
[ 5., 6., 7., 8., 9.],
[10., 11., 12., 13., 14.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.]])
np.vsplit(a,3) # Split array vertically (row-wise) into 3 equal parts
[array([[0, 1, 2, 3, 4]]),
array([[5, 6, 7, 8, 9]]),
array([[10, 11, 12, 13, 14]])]
np.hsplit(a,5) # Split array horizontally (column-wise) into 5 equal parts
[array([[ 0],
[ 5],
[10]]),
array([[ 1],
[ 6],
[11]]),
array([[ 2],
[ 7],
[12]]),
array([[ 3],
[ 8],
[13]]),
array([[ 4],
[ 9],
[14]])]
Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python
programming language.
Pandas DataFrames are the most widely used in-memory representation of complex data collections within Python.
Pandas
# Install pandas, xlrd, and openpyxl using pip
!pip install pandas
!pip install xlrd openpyxl
Requirement already satisfied: pandas in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (2.1.1)
Requirement already satisfied: numpy>=1.22.4 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (from
Requirement already satisfied: python-dateutil>=2.8.2 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packag
Requirement already satisfied: pytz>=2020.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (from p
Requirement already satisfied: tzdata>=2022.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (from
Requirement already satisfied: six>=1.5 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (from pytho
Collecting xlrd
Downloading xlrd-2.0.1-py2.py3-none-any.whl (96 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.5/96.5 kB 9.2 MB/s eta 0:00:00
Requirement already satisfied: openpyxl in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (3.1.2)
Requirement already satisfied: et-xmlfile in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (from ope
Installing collected packages: xlrd
Successfully installed xlrd-2.0.1
# Import NumPy and Pandas modules
import numpy as np
import pandas as pd
/home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages/pandas/core/computation/expressions.py:21: UserWarning
from pandas.core.computation.check import NUMEXPR_INSTALLED
# Sample dataframe df
df = pd.DataFrame({'num_legs': [2, 4, np.nan, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, np.nan, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])
df # Display dataframe df
        num_legs  num_wings  num_specimen_seen
falcon       2.0          2               10.0
dog          4.0          0                NaN
spider       NaN          0                1.0
fish         0.0          0                8.0
# Another sample dataframe df1 - using NumPy array with datetime index and labeled column
df1 = pd.date_range('20130101', periods=6)
df1 = pd.DataFrame(np.random.randn(6, 4), index=df1, columns=list('ABCD'))
df1 # Display dataframe df1
                   A         B         C         D
2013-01-01 -0.898850 -0.680102  0.193667  1.074850
2013-01-02  1.431951  0.793661  0.946500 -0.507993
2013-01-03  1.660753  1.023082 -0.578049 -1.202825
2013-01-04  1.876802  0.426981  0.371810 -0.219708
2013-01-05  0.178279 -0.040635 -0.346963  1.173570
2013-01-06 -1.077499  0.410345  0.880085 -1.340728
Viewing data
df1 = pd.date_range('20130101', periods=6)
df1 = pd.DataFrame(np.random.randn(6, 4), index=df1, columns=list('ABCD'))
df1.head(2) # View top data
                   A         B         C         D
2013-01-01  1.391132 -1.593587  1.801365  0.004086
2013-01-02 -0.431011  2.605599  0.384398 -0.417979
df1.tail(2) # View bottom data
                   A         B         C         D
2013-01-05 -1.074617 -0.854460 -0.017001 -0.761798
2013-01-06  0.199736 -0.022141 -2.377702  0.245258
df1.index # Display index column
DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
'2013-01-05', '2013-01-06'],
dtype='datetime64[ns]', freq='D')
df1.dtypes # Inspect datatypes
A float64
B float64
C float64
D float64
dtype: object
df1.describe() # Display quick statistics summary of data
              A         B         C         D
count  6.000000  6.000000  6.000000  6.000000
mean   0.575569 -0.045096  0.031565 -0.135568
std    1.207154  1.552915  1.357931  0.463769
min   -1.074617 -1.593587 -2.377702 -0.761798
25%   -0.273325 -1.102284 -0.022663 -0.405454
50%    0.752909 -0.438300  0.183699 -0.181896
75%    1.369869  0.578642  0.413259  0.184965
max    2.062092  2.605599  1.801365  0.484907
Subsetting, slicing, and indexing
df1 = pd.date_range('20130101', periods=6)
df1 = pd.DataFrame(np.random.randn(6, 4), index=df1, columns=list('ABCD'))
df1.T # Transpose data
   2013-01-01  2013-01-02  2013-01-03  2013-01-04  2013-01-05  2013-01-06
A    0.027030    0.976364   -0.479214   -1.732572   -0.847890   -1.241276
B    0.975635   -1.082700   -0.118557    0.245337   -0.230890   -0.372955
C   -1.287683   -0.097347    0.879278    0.694448   -0.977119    0.417494
D    0.522557    0.342539   -0.339455    0.999107    0.655293    0.081941
df1.sort_index(axis=1, ascending=False) # Sort by an axis
                   D         C         B         A
2013-01-01  0.522557 -1.287683  0.975635  0.027030
2013-01-02  0.342539 -0.097347 -1.082700  0.976364
2013-01-03 -0.339455  0.879278 -0.118557 -0.479214
2013-01-04  0.999107  0.694448  0.245337 -1.732572
2013-01-05  0.655293 -0.977119 -0.230890 -0.847890
2013-01-06  0.081941  0.417494 -0.372955 -1.241276
df1.sort_values(by='B') # Sort by values
                   A         B         C         D
2013-01-02  0.976364 -1.082700 -0.097347  0.342539
2013-01-06 -1.241276 -0.372955  0.417494  0.081941
2013-01-05 -0.847890 -0.230890 -0.977119  0.655293
2013-01-03 -0.479214 -0.118557  0.879278 -0.339455
2013-01-04 -1.732572  0.245337  0.694448  0.999107
2013-01-01  0.027030  0.975635 -1.287683  0.522557
df1['A'] # Select column A
2013-01-01 0.027030
2013-01-02 0.976364
2013-01-03 -0.479214
2013-01-04 -1.732572
2013-01-05 -0.847890
2013-01-06 -1.241276
Freq: D, Name: A, dtype: float64
df1[0:3] # Select rows at positions 0 to 2
                   A         B         C         D
2013-01-01  0.027030  0.975635 -1.287683  0.522557
2013-01-02  0.976364 -1.082700 -0.097347  0.342539
2013-01-03 -0.479214 -0.118557  0.879278 -0.339455
df1['20130102':'20130104'] # Select rows whose index labels fall in this range (both ends included)
                   A         B         C         D
2013-01-02  0.976364 -1.082700 -0.097347  0.342539
2013-01-03 -0.479214 -0.118557  0.879278 -0.339455
2013-01-04 -1.732572  0.245337  0.694448  0.999107
df1.loc[:, ['A', 'B']] # Select on a multi-axis by label
                   A         B
2013-01-01  0.027030  0.975635
2013-01-02  0.976364 -1.082700
2013-01-03 -0.479214 -0.118557
2013-01-04 -1.732572  0.245337
2013-01-05 -0.847890 -0.230890
2013-01-06 -1.241276 -0.372955
df1.iloc[3] # Select via the position of the passed integers
A -1.732572
B 0.245337
C 0.694448
D 0.999107
Name: 2013-01-04 00:00:00, dtype: float64
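The selections above mix label-based (loc) and position-based (iloc) access. One difference worth seeing side by side is that label slices include both endpoints, while position slices exclude the stop. The sketch below uses a deterministic frame built from np.arange (an assumption made for reproducibility; the notebook uses random values).

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
# Deterministic values 0..23 instead of random data, so results are repeatable.
frame = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

# Label-based: both '20130102' and '20130104' are included.
by_label = frame.loc['20130102':'20130104', ['A', 'B']]

# Position-based: rows 1, 2, 3 and columns 0, 1 (the stop is excluded).
by_position = frame.iloc[1:4, 0:2]

print(by_label.equals(by_position))

# A single cell by row label and column label.
single = frame.loc[dates[3], 'C']
print(single)
```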
df1[df1 > 0] # Select values from a DataFrame where a boolean condition is met
                   A         B         C         D
2013-01-01  0.027030  0.975635       NaN  0.522557
2013-01-02  0.976364       NaN       NaN  0.342539
2013-01-03       NaN       NaN  0.879278       NaN
2013-01-04       NaN  0.245337  0.694448  0.999107
2013-01-05       NaN       NaN       NaN  0.655293
2013-01-06       NaN       NaN  0.417494  0.081941
df2 = df1.copy() # Copy the df1 dataset to df2
df2['E'] = ['one', 'one', 'two', 'three', 'four', 'three'] # Add column E with values
df2[df2['E'].isin(['two', 'four'])] # Use isin method for filtering
                   A         B         C         D     E
2013-01-03 -0.479214 -0.118557  0.879278 -0.339455   two
2013-01-05 -0.847890 -0.230890 -0.977119  0.655293  four
Pandas primarily uses the value np.nan to represent missing data. By default, missing values are excluded from computations such as sums and means.
Missing data
df = pd.DataFrame({'num_legs': [2, 4, np.nan, 0],
'num_wings': [2, 0, 0, 0],
'num_specimen_seen': [10, np.nan, 1, 8]},
index=['falcon', 'dog', 'spider', 'fish'])
df.dropna(how='any') # Drop any rows that have missing data
        num_legs  num_wings  num_specimen_seen
falcon       2.0          2               10.0
fish         0.0          0                8.0
df.dropna(how='any', axis=1) # Drop any columns that have missing data
        num_wings
falcon          2
dog             0
spider          0
fish            0
df.fillna(value=5) # Fill missing data with value 5
        num_legs  num_wings  num_specimen_seen
falcon       2.0          2               10.0
dog          4.0          0                5.0
spider       5.0          0                1.0
fish         0.0          0                8.0
pd.isna(df) # To get boolean mask where data is missing
        num_legs  num_wings  num_specimen_seen
falcon     False      False              False
dog        False      False               True
spider      True      False              False
fish       False      False              False
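To make "not included in computations by default" concrete, the sketch below computes a column mean with and without skipping missing values, and counts the missing entries per column. It reuses the sample frame from this section. The skipna argument is a standard pandas parameter.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'num_legs': [2, 4, np.nan, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, np.nan, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])

# Default behaviour: NaN is skipped, so the mean is (2 + 4 + 0) / 3.
mean_skip = df['num_legs'].mean()
print(mean_skip)

# With skipna=False a single missing value makes the result NaN.
mean_strict = df['num_legs'].mean(skipna=False)
print(mean_strict)

# Count missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```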
File handling
df = pd.DataFrame({'num_legs': [2, 4, np.nan, 0],
'num_wings': [2, 0, 0, 0],
'num_specimen_seen': [10, np.nan, 1, 8]},
index=['falcon', 'dog', 'spider', 'fish'])
df.to_csv('foo.csv') # Write to CSV file
pd.read_csv('foo.csv') # Read from CSV file
  Unnamed: 0  num_legs  num_wings  num_specimen_seen
0     falcon       2.0          2               10.0
1        dog       4.0          0                NaN
2     spider       NaN          0                1.0
3       fish       0.0          0                8.0
df.to_excel('foo.xlsx', sheet_name='Sheet1') # Write to Microsoft Excel file
pd.read_excel('foo.xlsx', 'Sheet1', index_col=None, na_values=['NA']) # Read from Microsoft Excel file
  Unnamed: 0  num_legs  num_wings  num_specimen_seen
0     falcon       2.0          2               10.0
1        dog       4.0          0                NaN
2     spider       NaN          0                1.0
3       fish       0.0          0                8.0
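In both outputs the original row labels come back as an ordinary column named Unnamed: 0, because the index was written to the file without a header. One way to restore it is to pass index_col=0 when reading. The sketch below writes to an in-memory buffer instead of foo.csv. That is an assumption made so the example leaves no file behind.

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame({'num_legs': [2, 4, np.nan, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, np.nan, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])

# Write to an in-memory text buffer (stand-in for a file on disk).
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)

# index_col=0 tells pandas to use the first column as the row index.
restored = pd.read_csv(buffer, index_col=0)
print(restored)
```

The same index_col argument is accepted by pd.read_excel.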
Plotting
# Install Matplotlib using pip
!pip install matplotlib
Requirement already satisfied: matplotlib in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (3.8.0)
Requirement already satisfied: contourpy>=1.0.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (fr
Requirement already satisfied: cycler>=0.10 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (from m
Requirement already satisfied: fonttools>=4.22.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (f
Requirement already satisfied: kiwisolver>=1.0.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (f
Requirement already satisfied: numpy<2,>=1.21 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (from
Requirement already satisfied: packaging>=20.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (fro
Requirement already satisfied: pillow>=6.2.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (from
Requirement already satisfied: pyparsing>=2.3.1 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (fr
Requirement already satisfied: python-dateutil>=2.7 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages
Requirement already satisfied: six>=1.5 in /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages (from pytho
from matplotlib import pyplot as plt # Import Matplotlib module
Matplotlib is building the font cache; this may take a moment.
# Generate random time-series data
ts = pd.Series(np.random.randn(1000),index=pd.date_range('1/1/2000', periods=1000))
ts.head()
2000-01-01 -0.909929
2000-01-02 -0.713175
2000-01-03 0.256578
2000-01-04 1.887163
2000-01-05 0.156225
Freq: D, dtype: float64
ts = ts.cumsum()
ts.plot() # Plot graph
plt.show()
# On a DataFrame, the plot() method is convenient to plot all of the columns with labels
df4 = pd.DataFrame(np.random.randn(1000, 4), index=ts.index, columns=['A', 'B', 'C', 'D'])
df4 = df4.cumsum()
df4.head()
                   A         B         C         D
2000-01-01  0.634267 -2.033250 -1.226215  0.106784
2000-01-02  1.393185 -2.893325 -0.923199 -0.318161
2000-01-03  0.873873 -1.817906  0.310210 -0.615651
2000-01-04  2.295118 -3.427966  0.772764 -0.585540
2000-01-05  3.343442 -2.535185 -0.591843 -1.069885
df4.plot()
plt.show()