Pandas

Uploaded by

vanshvats4060

1) Question: Create a DataFrame with at least 5 rows and 3 columns, where some cells
contain missing values (NaN). Write a Python script to:

● Detect and print all the missing values in the DataFrame.
● Count the number of missing values in each column.

2) Dropping Missing Values

Question: Given the following DataFrame:

import pandas as pd

data = {'A': [1, 2, None, 4, 5],
        'B': [None, 2, 3, None, 5],
        'C': [1, None, 3, 4, None]}
df = pd.DataFrame(data)

a) Drop rows where any column has a missing value.
b) Drop columns where all the values are missing.
c) Drop rows with missing values, but keep those that have at least 3 non-missing values.

3) Fill the missing values in a DataFrame with the following strategy:

○ For numeric columns, replace missing values with the mean of the column.
○ For categorical columns (strings), replace missing values with the mode of the column.

Provide an example DataFrame and fill the missing values according to the above
strategy.

4) Question: Given the following DataFrame:

import pandas as pd

data = {'A': [None, 2, 3, 4, 5],
        'B': [1, None, 3, None, 5],
        'C': [1, 2, None, 4, None]}
df = pd.DataFrame(data)

● Replace all missing values in column 'A' with 0.
● Replace all missing values in column 'B' with the column's mean.
● Replace all missing values in column 'C' with the column's median.

1. Detecting Missing Values


Use isna() to detect missing values and notna() to detect non-missing values. isnull()
and notnull() are aliases that behave identically. Each returns a Boolean DataFrame with
the same shape as the original.

import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4]
})

# Detect missing values (True where a cell is NaN)
print(df.isna())

# Detect non-missing values (True where a cell holds a value)
print(df.notna())
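
The Boolean mask from isna() can also be summarized. The following is a minimal sketch, assuming a made-up DataFrame (the column names and values are hypothetical, not from the notes above), that counts the missing values per column and lists where they are.

```python
import pandas as pd

# Hypothetical sample data: 5 rows, 3 columns, some cells missing.
df = pd.DataFrame({
    "name": ["Ann", "Bob", None, "Dee", "Eve"],
    "age": [23, None, 31, None, 40],
    "city": ["Pune", "Delhi", "Agra", None, "Goa"],
})

# Boolean mask: True wherever a cell is missing.
mask = df.isna()

# Number of missing values in each column (True counts as 1).
missing_per_column = mask.sum()

# (row label, column name) pair for every missing cell.
missing_cells = [
    (row, col)
    for row in df.index
    for col in df.columns
    if mask.loc[row, col]
]

print(missing_per_column)
print(missing_cells)
```

Summing a Boolean mask works because True is treated as 1, so the same idea extends to mask.sum(axis=1) for a per-row count.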

2. Dropping Missing Values

You can remove rows or columns that contain missing values using dropna(). It returns a
new DataFrame and leaves the original unchanged.

● Drop rows with missing values:

df.dropna(axis=0)  # Drops every row that contains at least one NaN (the default)

● Drop columns with missing values:

df.dropna(axis=1)  # Drops every column that contains at least one NaN

You can also set a threshold with thresh. For example, to keep only the rows that have
at least 2 non-NaN values:

df.dropna(thresh=2)
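
To see how these options differ, here is a small sketch that applies them to the DataFrame from question 2. The variable names are illustrative only. Every row of that DataFrame has exactly one missing value, so the stricter options remove everything.

```python
import pandas as pd

data = {'A': [1, 2, None, 4, 5],
        'B': [None, 2, 3, None, 5],
        'C': [1, None, 3, 4, None]}
df = pd.DataFrame(data)

# Keep only rows with no missing values at all.
complete_rows = df.dropna()

# Drop a column only if every value in it is missing.
non_empty_cols = df.dropna(axis=1, how='all')

# Keep rows with at least 3 (or at least 2) non-missing values.
at_least_3 = df.dropna(thresh=3)
at_least_2 = df.dropna(thresh=2)

print(len(complete_rows))    # 0 rows: no row is fully populated
print(non_empty_cols.shape)  # (5, 3): no column is entirely NaN
print(len(at_least_3))       # 0 rows: each row has only 2 values
print(len(at_least_2))       # 5 rows: every row qualifies
```

Checking the shape before and after a dropna() call is a cheap way to notice when a condition removes far more data than intended.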

3. Filling Missing Values

If you don't want to drop missing values, you can replace them using fillna(). It returns
a new DataFrame rather than modifying df in place. There are several strategies for
filling missing data.

● Fill with a specific value (e.g., 0):

df.fillna(0)

● Fill with a different value for each column:

df.fillna({'A': 0, 'B': 5})
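
A short sketch of the per-column form, using the same small DataFrame as in section 1. It also shows that the original is left untouched unless the result is assigned back. The name filled is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4]
})

# Different fill value for each column; the result is a new DataFrame.
filled = df.fillna({'A': 0, 'B': 5})

print(filled)
print(df.isna().sum().sum())  # 2: df itself still has its missing values

# To keep the change, assign the result back.
df = filled
```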

4. Interpolating Missing Values

You can use interpolation to estimate missing values from the neighboring values. This
is useful for numerical data.

df.interpolate()  # Linear interpolation (the default)

You can also specify other interpolation methods, such as polynomial interpolation,
which requires SciPy to be installed and an order argument:

df.interpolate(method='polynomial', order=2)
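
The following sketch runs the default linear interpolation on the small DataFrame from section 1. One detail worth knowing: by default pandas fills in the forward direction, so a NaN at the very start of a column has no earlier value to start from and stays missing. Passing limit_direction='both' is one way to cover that case. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4]
})

# Default: linear interpolation, filling forward only.
forward = df.interpolate()
print(forward)  # A becomes 1, 2, 3, 4; the leading NaN in B remains

# Also fill NaNs that appear before the first valid value.
both = df.interpolate(limit_direction='both')
print(both.isna().sum())
```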

Replacing Missing Values with Statistical Measures

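You can replace missing values with a statistic of each column, such as the mean,
median, or mode.

● Fill with the mean:

df.fillna(df.mean(numeric_only=True))  # numeric_only skips non-numeric columns

● Fill with the median:

df.fillna(df.median(numeric_only=True))

● Fill with the mode:

df.fillna(df.mode().iloc[0])  # mode() can return several rows on ties; iloc[0] takes the first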
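
Different columns often need different statistics. As a sketch, the DataFrame from question 4 can be filled one column at a time, with a constant for 'A', the mean for 'B', and the median for 'C':

```python
import pandas as pd

data = {'A': [None, 2, 3, 4, 5],
        'B': [1, None, 3, None, 5],
        'C': [1, 2, None, 4, None]}
df = pd.DataFrame(data)

# Column A: constant fill.
df['A'] = df['A'].fillna(0)

# Column B: mean of the non-missing values (1, 3, 5), which is 3.
df['B'] = df['B'].fillna(df['B'].mean())

# Column C: median of the non-missing values (1, 2, 4), which is 2.
df['C'] = df['C'].fillna(df['C'].median())

print(df)
```

mean() and median() skip NaN by default, so the statistics are computed from the observed values only.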

Checking for Missing Values After Handling

After you've handled missing data, you can check if any values are still missing using:

df.isna().sum() # Check remaining missing values by column
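
Putting the steps together, here is a minimal end-to-end sketch of the strategy from question 3: the mean for numeric columns, the mode for string columns, then a final check. The DataFrame and its column names are hypothetical, and the loop assumes every column has at least one non-missing value.

```python
import pandas as pd

# Hypothetical example data with one numeric and one string column.
df = pd.DataFrame({
    "score": [10.0, None, 30.0, 20.0, None],
    "grade": ["A", "B", None, "B", "C"],
})

for col in df.columns:
    if pd.api.types.is_numeric_dtype(df[col]):
        # Numeric column: fill with the column mean (20.0 here).
        df[col] = df[col].fillna(df[col].mean())
    else:
        # String column: fill with the most frequent value ("B" here).
        # mode() returns a Series; iloc[0] fails if the column is all NaN.
        df[col] = df[col].fillna(df[col].mode().iloc[0])

# Confirm that nothing is left to handle.
remaining = df.isna().sum()
print(df)
print(remaining)
```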
