
Experiment 6.

Perform the following operations using pandas


a. Creating a DataFrame
b. concat()
c. Setting conditions
d. Adding a new column

Pandas: A Powerful Data Analysis Library in Python

Pandas is a Python library used for data manipulation, analysis, and cleaning. It provides two
primary data structures:

 Series (1D labeled array)


 DataFrame (2D table similar to an Excel spreadsheet)

Installing and Importing Pandas

First, install pandas using the command below:

pip install pandas

Now, import pandas:

import pandas as pd

1. import pandas → This imports the pandas library, which is used for data analysis and
manipulation.
2. as pd → This assigns a short alias (pd) to pandas, so we can refer to it as pd instead of
writing pandas every time.

a. Creating DataFrame

A DataFrame is the core structure in pandas, similar to a table in SQL or Excel.

First, we create a basic DataFrame from a dictionary:

import pandas as pd

# Creating a simple DataFrame


data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 60000, 70000, 80000]
}

df = pd.DataFrame(data)
print(df)

Output:
      Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   35   70000
3    David   40   80000
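A DataFrame can also be built from a list of row dictionaries, optionally with a
custom index. A minimal sketch (the 'emp1'/'emp2' index labels are made up for
illustration, not part of the experiment):

# Same idea, built row by row, with custom index labels
rows = [
    {'Name': 'Alice', 'Age': 25, 'Salary': 50000},
    {'Name': 'Bob', 'Age': 30, 'Salary': 60000},
]
df_rows = pd.DataFrame(rows, index=['emp1', 'emp2'])
print(df_rows)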

b. Using concat()

The concat() function is used to concatenate DataFrames along a particular axis (rows or
columns). Here, we concatenate two DataFrames along the rows (axis=0):

# Creating a simple DataFrame


data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 60000, 70000, 80000]
}

df = pd.DataFrame(data)

# Creating another DataFrame
data2 = {
    'Name': ['Eve', 'Frank'],
    'Age': [45, 50],
    'Salary': [90000, 100000]
}

df2 = pd.DataFrame(data2)

# Concatenating along rows (axis=0)


df_concat = pd.concat([df, df2], axis=0, ignore_index=True)
print(df_concat)

Output:

      Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   35   70000
3    David   40   80000
4      Eve   45   90000
5    Frank   50  100000
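concat() can also place DataFrames side by side with axis=1, aligning rows by
index. A hedged sketch, assuming the df from above and a made-up df_dept holding
one extra column:

# Hypothetical extra column to attach alongside df (same row index assumed)
df_dept = pd.DataFrame({'Department': ['HR', 'IT', 'Finance', 'Sales']})

# Concatenating along columns (axis=1) joins on the row index
df_wide = pd.concat([df, df_dept], axis=1)
print(df_wide)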

c. Setting Conditions

You can filter data in a DataFrame based on conditions. For example, selecting people who have
a salary greater than 70,000:

# Applying condition (Salary > 70000)


condition = df_concat[df_concat['Salary'] > 70000]
print(condition)

Output:

    Name  Age  Salary
3  David   40   80000
4    Eve   45   90000
5  Frank   50  100000
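Conditions can also be combined with & (and) and | (or); each condition must be
wrapped in parentheses. A short sketch, assuming the df_concat from above (the
threshold values are arbitrary):

# People older than 40 who also earn more than 70000
filtered = df_concat[(df_concat['Salary'] > 70000) & (df_concat['Age'] > 40)]
print(filtered)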

d. Adding a New Column

To add a new column based on a condition or calculation, we can add a 'Bonus'
column that is 10% of the salary:

# Adding a new column 'Bonus'


df_concat['Bonus'] = df_concat['Salary'] * 0.1
print(df_concat)

Output:

      Name  Age  Salary    Bonus
0    Alice   25   50000   5000.0
1      Bob   30   60000   6000.0
2  Charlie   35   70000   7000.0
3    David   40   80000   8000.0
4      Eve   45   90000   9000.0
5    Frank   50  100000  10000.0

Here, the Bonus column is calculated as 10% of the Salary column for each person.
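Since the new column can also depend on a condition, here is a sketch using
numpy.where() with made-up 'Senior'/'Junior' labels (not part of the original
experiment):

import numpy as np

# 'Senior' where Age >= 40, otherwise 'Junior'
df_concat['Level'] = np.where(df_concat['Age'] >= 40, 'Senior', 'Junior')
print(df_concat)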

Experiment 7.

Perform the following operations using pandas


a. Filling NaN with string
b. Sorting based on column values
c. groupby()

a. Filling NaN with String

You can fill missing values (NaN) in a DataFrame using the fillna() method. For
instance, if a column contains NaN and we want to replace it with a specific string
like 'Unknown':

import pandas as pd
import numpy as np

# Creating a DataFrame with NaN values


data = {
    'Name': ['Alice', 'Bob', 'Charlie', np.nan],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, np.nan, 70000, 80000]
}

df = pd.DataFrame(data)

# Filling NaN values with a string


df_filled = df.fillna('Unknown')
print(df_filled)
Output:

      Name  Age   Salary
0    Alice   25  50000.0
1      Bob   30  Unknown
2  Charlie   35  70000.0
3  Unknown   40  80000.0

The NaN values in both the Name and Salary columns are filled with the string 'Unknown'.
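fillna() also accepts a dict, so each column can get its own replacement value,
which lets numeric columns keep a numeric fill. A minimal sketch, assuming the df
above (the fill values are arbitrary choices):

# Per-column fills: 'Unknown' for names, 0 for missing salaries
df_filled2 = df.fillna({'Name': 'Unknown', 'Salary': 0})
print(df_filled2)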

b. Sorting Based on Column Values

You can sort a DataFrame based on the values of one or more columns using the sort_values()
method. Below is an example of sorting the DataFrame based on the Age column in ascending
order:

# Sorting by the 'Age' column in ascending order


df_sorted = df.sort_values(by='Age', ascending=True)
print(df_sorted)

Output:

      Name  Age   Salary
0    Alice   25  50000.0
1      Bob   30      NaN
2  Charlie   35  70000.0
3      NaN   40  80000.0

Here, the DataFrame is sorted by the Age column, starting from the smallest age
(the original df is used, so its NaN values are still present).
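sort_values() can also sort by several columns at once, with a separate direction
for each. A sketch on the same df:

# Age descending, then Name ascending as a tie-breaker
df_sorted2 = df.sort_values(by=['Age', 'Name'], ascending=[False, True])
print(df_sorted2)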

c. Using groupby()

The groupby() method in pandas allows you to group rows based on a column and perform
some aggregation. For example, let's group the DataFrame by the Age column and calculate the
average Salary for each age group:

# Grouping by 'Age' and calculating the average Salary

df_grouped = df.groupby('Age')['Salary'].mean().reset_index()
print(df_grouped)

Explanation:

 The groupby('Age') groups the data based on unique values in the Age column.
 We then use .mean() to compute the average salary for each age group.
 The reset_index() is used to convert the resulting grouped data back into a regular
DataFrame.

Output:
   Age   Salary
0   25  50000.0
1   30      NaN
2   35  70000.0
3   40  80000.0
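groupby() is not limited to a single aggregation; .agg() can compute several at
once. A sketch on the same df:

# Mean, minimum, and maximum salary per age group in one call
df_stats = df.groupby('Age')['Salary'].agg(['mean', 'min', 'max']).reset_index()
print(df_stats)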

Experiment 8.

Read the following file formats using pandas


a. Text files
b. CSV files
c. Excel files
d. JSON files

a. Reading Text Files

You can read plain text files in pandas using the read_csv() function by specifying
the delimiter (e.g., space, tab, etc.). Here is an example for reading a text file
that uses spaces or tabs as delimiters.

Example:

import pandas as pd

# Read a text file (assuming it has space or tab-separated data)


df_text = pd.read_csv('file.txt', delimiter=' ')  # or use '\t' for tab-delimited files
print(df_text)

Explanation:

 read_csv() is versatile and can read text files as long as we specify the correct delimiter.
 You can replace ' ' with the actual delimiter used in your text file.
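For files with irregular spacing, a regular-expression separator is more robust. A
hedged sketch ('file.txt' and the column names are assumptions for illustration):

# r'\s+' matches any run of whitespace; header=None plus names= supplies
# column names when the file has no header row
df_ws = pd.read_csv('file.txt', sep=r'\s+', header=None,
                    names=['Name', 'Age', 'Salary'])
print(df_ws)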
b. Reading CSV Files

CSV files are very common, and pandas makes it very easy to read them using the read_csv()
function.

Example:

# Read a CSV file


df_csv = pd.read_csv('file.csv')
print(df_csv)

Explanation:

 read_csv() reads the file and automatically handles comma-separated data.


 You can pass extra arguments like header, index_col, etc., if needed.

Output:

      Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   35   70000
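To illustrate the extra arguments mentioned above, a hedged sketch (the file name,
column names, and row limit are assumptions):

# index_col turns a column into the row index, usecols limits which columns
# are read, and nrows caps how many rows are loaded
df_part = pd.read_csv('file.csv', index_col='Name',
                      usecols=['Name', 'Salary'], nrows=100)
print(df_part)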

c. Reading Excel Files

For reading Excel files, you can use the read_excel() function. You may need to
install openpyxl (for .xlsx files) or xlrd (for legacy .xls files), depending on
the file format.

Example:

# Read an Excel file


df_excel = pd.read_excel('file.xlsx', sheet_name='Sheet1')
print(df_excel)

Explanation:

 read_excel() reads an Excel file.


 sheet_name allows you to specify which sheet to load (if there are multiple sheets).

Output:

      Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   35   70000
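Passing sheet_name=None loads every sheet at once into a dict of DataFrames keyed
by sheet name. A short sketch, assuming 'file.xlsx' exists:

# Read all sheets; iterate to see each sheet's name and shape
all_sheets = pd.read_excel('file.xlsx', sheet_name=None)
for name, sheet_df in all_sheets.items():
    print(name, sheet_df.shape)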

d. Reading JSON Files

For JSON files, you can use the read_json() function. JSON files are commonly used for
hierarchical or nested data.
Example:

# Read a JSON file


df_json = pd.read_json('file.json')
print(df_json)

Explanation:

 read_json() loads data from JSON files into a pandas DataFrame.
 If the JSON is a simple list of records, pandas converts it directly into tabular
format; deeply nested JSON may need pd.json_normalize() (see the sketch after the
output below).

Output (example):

Contents of file.json:

[
{"Name": "Alice", "Age": 25, "Salary": 50000},
{"Name": "Bob", "Age": 30, "Salary": 60000},
{"Name": "Charlie", "Age": 35, "Salary": 70000}
]

Resulting DataFrame:

      Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   35   70000
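For genuinely nested JSON, pd.json_normalize() flattens inner objects into dotted
column names. A minimal sketch with a made-up nested record:

# One record with a nested 'contact' object (illustrative data)
records = [{'Name': 'Alice', 'contact': {'city': 'Pune', 'phone': '123'}}]
df_nested = pd.json_normalize(records)
print(df_nested)  # columns: Name, contact.city, contact.phone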

Experiment 9.

Read the following file formats


a. Pickle files
b. Image files using PIL
c. Multiple files using Glob
d. Importing data from database

a. Reading Pickle Files

Pickle files are used to serialize Python objects. You can load them back into memory using the
read_pickle() function in pandas.

Example:

import pandas as pd

# Read a Pickle file


df_pickle = pd.read_pickle('file.pkl')
print(df_pickle)

Explanation:

 Pickle files save data in a binary format and pandas can read them directly with
read_pickle().
 This is useful when you want to save the state of a DataFrame (or other Python objects)
and load it back later.
Output:

# Example output (depending on the content of the pickle file)

      Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   35   70000
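A round trip shows the typical workflow: to_pickle() saves a DataFrame and
read_pickle() restores it exactly. A sketch (note that pickle can execute arbitrary
code, so only unpickle files you trust):

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df.to_pickle('file.pkl')              # serialize to disk
df_back = pd.read_pickle('file.pkl')  # restore the object
print(df.equals(df_back))             # True: identical contents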

b. Reading Image Files using PIL

To read and work with image files, you can use Pillow, the actively maintained fork
of PIL (the Python Imaging Library). It is installed as Pillow (pip install Pillow)
but imported under the name PIL.

Example:

from PIL import Image

# Open an image file


img = Image.open('image.jpg')
img.show() # This will display the image

Explanation:

 Image.open() opens the image file.


 The show() method displays the image (you can also save or manipulate the image as
needed).
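Beyond displaying, Pillow supports common manipulations. A hedged sketch
('image.jpg' is assumed to exist):

from PIL import Image

img = Image.open('image.jpg')
print(img.size, img.mode)        # e.g. (800, 600) and 'RGB'
small = img.resize((200, 150))   # resize to a new (width, height)
gray = img.convert('L')          # convert to grayscale
gray.save('image_gray.png')      # save under another name/format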

c. Reading Multiple Files using Glob

You can use the glob module to match file names against patterns (such as *.csv,
*.txt). This allows you to work with multiple files at once.

Example:

import glob
import pandas as pd

# Using glob to get all CSV files in the directory


files = glob.glob('*.csv')

# Reading all files into a list of DataFrames


dfs = [pd.read_csv(file) for file in files]

# Concatenating all DataFrames into one


df_combined = pd.concat(dfs, ignore_index=True)
print(df_combined)

Explanation:

 glob.glob('*.csv') retrieves all CSV files in the current directory.


 We loop through the file paths, read each one using pd.read_csv(), and store the
DataFrames in a list.
 pd.concat() is used to combine all DataFrames into a single DataFrame.
Output (example):

      Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   35   70000
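A useful variation on the loop above is to tag each row with the file it came from,
which helps when debugging the combined data. A sketch (the 'source_file' column
name is made up):

import glob
import pandas as pd

dfs = []
for file in glob.glob('*.csv'):
    part = pd.read_csv(file)
    part['source_file'] = file   # remember where each row originated
    dfs.append(part)
df_combined = pd.concat(dfs, ignore_index=True)
print(df_combined)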

d. Importing Data from a Database

To import data from a database (such as SQLite, MySQL, etc.), you can use pandas along with a
database connector. The example here uses SQLite with sqlite3.

Example:

import sqlite3
import pandas as pd

# Create a connection to the SQLite database


conn = sqlite3.connect('database.db')

# Query to select data


query = "SELECT * FROM employees"

# Import data into a pandas DataFrame


df_db = pd.read_sql_query(query, conn)
print(df_db)

# Close the database connection


conn.close()

Explanation:

 sqlite3.connect() establishes a connection to the SQLite database (for other
databases, you'd use the corresponding connector, like mysql.connector for MySQL).
 pd.read_sql_query() runs the SQL query and loads the results into a DataFrame.

Output (example):

   ID     Name  Age Department  Salary
0   1    Alice   25         HR   50000
1   2      Bob   30         IT   60000
2   3  Charlie   35    Finance   70000

For databases like MySQL, you can use pymysql.connect() or similar connectors and follow a
similar process.
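When the query depends on a user-supplied value, pass it through the params
argument instead of formatting it into the SQL string; this avoids SQL injection. A
sketch following the employees table above ('?' is SQLite's placeholder style):

import sqlite3
import pandas as pd

conn = sqlite3.connect('database.db')
# The value ('IT',) is bound safely to the '?' placeholder
df_it = pd.read_sql_query(
    "SELECT * FROM employees WHERE Department = ?",
    conn,
    params=('IT',),
)
conn.close()
print(df_it)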
Experiment 10.

Demonstrate web scraping using Python

Web scraping in Python can be done using libraries like requests for making HTTP requests,
and BeautifulSoup (from the bs4 library) for parsing and extracting data from HTML content.
Below is a simple demonstration of web scraping.

Steps:

1. Install the required libraries: If you don't have requests and beautifulsoup4
installed, you can install them using pip:

pip install requests beautifulsoup4

2. Web Scraping Process:
   o Send an HTTP request to the website.
   o Parse the HTML content using BeautifulSoup.
   o Extract the relevant data (e.g., text, links, tables).

Example: Scraping Quotes from a Website

Let's demonstrate web scraping by extracting quotes from the sample website
http://quotes.toscrape.com/ (a site built for scraping practice).

1. Sending a Request and Parsing the HTML

import requests
from bs4 import BeautifulSoup

# Step 1: Send a request to the website


url = 'http://quotes.toscrape.com/'
response = requests.get(url)

# Step 2: Parse the HTML content using BeautifulSoup


soup = BeautifulSoup(response.text, 'html.parser')

# Step 3: Extracting data: In this case, we extract all quotes from the page
quotes = soup.find_all('span', class_='text')

# Step 4: Print the quotes


for quote in quotes:
    print(quote.text)

Explanation:

 requests.get(url) sends an HTTP request to the specified URL and retrieves the
webpage's content.
 BeautifulSoup(response.text, 'html.parser') parses the HTML content.
 soup.find_all('span', class_='text') finds all <span> tags with the class 'text',
which contain the quotes.
 Finally, we loop through the quotes and print them.

Output:
“The world as we have created it is a process of our thinking. It cannot be
changed without changing our thinking.”
“Life is what happens when you're busy making other plans.”
“It is our choices that show what we truly are, far more than our abilities.”
“Never let the fear of striking out keep you from playing the game.”
“You have within you right now, everything you need to deal with whatever the
world can throw at you.”
“The person, be it gentleman or lady, who has not pleasure in a good novel,
must be intolerably stupid.”

2. Extracting Additional Information (e.g., Author and Tags)

You can also extract other information such as the author of each quote and tags associated with
it. Here’s how you can extend the previous example:

# Extracting authors
authors = soup.find_all('small', class_='author')

# Extracting tags
tags = soup.find_all('div', class_='tags')

# Step 4: Print quotes, authors, and tags


for quote, author, tag in zip(quotes, authors, tags):
    print(f'Quote: {quote.text}')
    print(f'Author: {author.text}')
    print(f'Tags: {[t.text for t in tag.find_all("a")]}\n')

Explanation:

 soup.find_all('small', class_='author') finds all <small> tags with the class
'author', which contain the authors of the quotes.
 soup.find_all('div', class_='tags') finds all <div> tags with the class 'tags',
which contain the tags related to each quote.
 We then loop through the quotes, authors, and tags, printing each set of
information.

Output:

Quote: “The world as we have created it is a process of our thinking. It
cannot be changed without changing our thinking.”
Author: Albert Einstein
Tags: ['change', 'deep-thoughts', 'thinking', 'world']

Quote: “Life is what happens when you're busy making other plans.”
Author: John Lennon
Tags: ['life', 'adulthood', 'quotes']

Quote: “It is our choices that show what we truly are, far more than our
abilities.”
Author: J.K. Rowling
Tags: ['abilities', 'choices']

Quote: “Never let the fear of striking out keep you from playing the game.”
Author: Babe Ruth
Tags: ['sports', 'fear', 'inspirational']

Quote: “You have within you right now, everything you need to deal with
whatever the world can throw at you.”
Author: Brian Tracy
Tags: ['self-confidence', 'inspirational', 'world']
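The page scraped above is only the first of several; the site exposes a "Next" link
that can be followed until it disappears. A hedged sketch of paginated scraping
(the 'li.next > a' selector matches quotes.toscrape.com's markup at the time of
writing):

import requests
from bs4 import BeautifulSoup

base = 'http://quotes.toscrape.com'
url = base + '/'
all_quotes = []
while url:
    soup = BeautifulSoup(requests.get(url).text, 'html.parser')
    all_quotes += [q.text for q in soup.find_all('span', class_='text')]
    next_link = soup.select_one('li.next > a')   # None on the last page
    url = base + next_link['href'] if next_link else None
print(len(all_quotes))   # total quotes collected across all pages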
