Python Codes Test 1

Uploaded by

Manish Mohapatra

Test 1

Load the data with the code below. In Jupyter, upload the file in CSV format and set file_path to the file name; the code then loads the data.

import pandas as pd

file_path = r'DEVP-II_TEST_DATASET.csv'

# Import & Read Dataset

df = pd.read_csv(file_path)

df.info()

A1. Display the customer surname (and the value) that has the highest repeat transactions or repeat purchases. (Tip: Use count or sum statistic)

surname_counts = df['surname'].value_counts()

# Get the surname with the highest count

most_common_surname = surname_counts.idxmax()
count_of_most_common_surname = surname_counts.max()

print(f"The customer surname with the highest repeat transactions is '{most_common_surname}' with {count_of_most_common_surname} occurrences.")

1. df['surname'] extracts the 'surname' column from the DataFrame df. value_counts() is a pandas method that counts the occurrences of each unique value in that column.
2. idxmax() is a pandas method that returns the index label (in this case, the surname) corresponding to the maximum value in the Series.
3. max() is a pandas method that returns the maximum value in the Series, which is the count of the most common surname.
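
The three steps above can be tried on a small table. This is only a sketch: the surnames are invented for illustration, and only the 'surname' column name comes from the test. It also shows one way to list ties, since idxmax() reports only the first label that reaches the maximum.

```python
import pandas as pd

# Toy data with invented values; only the column name matches the test dataset.
toy = pd.DataFrame({"surname": ["Smith", "Lee", "Smith", "Khan", "Lee", "Smith"]})

counts = toy["surname"].value_counts()  # occurrences of each surname
top_surname = counts.idxmax()           # label with the largest count
top_count = int(counts.max())           # the largest count itself

# idxmax() returns a single label, so list every surname that reaches the maximum.
tied = counts[counts == top_count].index.tolist()

print(top_surname, top_count, tied)
```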

A2. Display the country (and the value) that has: (1) the highest number of memberships, (2) the lowest number of memberships. (Tip: Use member = 1 | Use count or sum statistic)

# A2
import pandas as pd

# Assuming your DataFrame is loaded as 'df'

# Group by country and sum the membership values

country_memberships = df.groupby('country')['member'].sum()

# Display the country with the highest number of memberships

max_membership_country = country_memberships.idxmax()
max_membership_value = country_memberships.max()

print(f"The country with the highest number of memberships is {max_membership_country} with {max_membership_value} memberships.")

# Display the country with the lowest number of memberships
min_membership_country = country_memberships.idxmin()
min_membership_value = country_memberships.min()

print(f"The country with the lowest number of memberships is {min_membership_country} with {min_membership_value} memberships.")

1. df.groupby('country')['member'].sum() groups the DataFrame df by the 'country' column and sums the 'member' column within each group, giving the total number of memberships for each country.
2. idxmax() returns the index label (in this case, the country) corresponding to the maximum value in the Series country_memberships. max_membership_country therefore holds the country with the highest total number of memberships, and max_membership_value holds that maximum total.
3. idxmin() and min() work the same way for the country with the lowest number of memberships.
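
A minimal sketch of the same group-and-sum pattern on invented data (the country names and 0/1 values are made up; 'country' and 'member' are the column names used in the test). Because member is 0 or 1, the sum per country is the number of members.

```python
import pandas as pd

# Toy data with invented values.
toy = pd.DataFrame({
    "country": ["France", "France", "Spain", "Spain", "Germany"],
    "member":  [1, 1, 0, 1, 0],
})

# Sum of the 0/1 flag per country = number of members per country.
memberships = toy.groupby("country")["member"].sum()

highest = (memberships.idxmax(), int(memberships.max()))
lowest = (memberships.idxmin(), int(memberships.min()))

print(highest, lowest)
```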

A3. Display the gender (and the value) that is: (1) the youngest, (2) the oldest. (Tip: Use median statistic)

# A3
gender_age_median = df.groupby('gender')['age'].median().reset_index()

# Find the gender with the youngest median age

youngest_gender_row = gender_age_median.loc[gender_age_median['age'].idxmin()]

# Find the gender with the oldest median age

oldest_gender_row = gender_age_median.loc[gender_age_median['age'].idxmax()]

# Display the results

print(f"The gender with the youngest median age is: {youngest_gender_row['gender']}")
print(f"Median age: {youngest_gender_row['age']}")

print(f"\nThe gender with the oldest median age is: {oldest_gender_row['gender']}")
print(f"Median age: {oldest_gender_row['age']}")

• df.groupby('gender')['age'].median() groups the DataFrame df by the 'gender' column and calculates the median age for each gender. reset_index() then turns the result back into a DataFrame with 'gender' and 'age' columns.
• idxmin() returns the index where the minimum value occurs in the 'age' column of gender_age_median.
• loc[] is then used to select the row with the minimum median age.
• In the same way, idxmax() returns the index where the maximum value occurs in the 'age' column.
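
As an alternative sketch, the reset_index() and loc[] steps can be skipped: when the grouped medians stay in a Series indexed by gender, idxmin() and idxmax() return the gender labels directly. The ages below are invented for illustration.

```python
import pandas as pd

# Toy data with invented ages.
toy = pd.DataFrame({
    "gender": ["Male", "Male", "Male", "Female", "Female", "Female"],
    "age":    [30, 40, 50, 25, 35, 60],
})

# Series indexed by gender: Female -> 35.0, Male -> 40.0
median_age = toy.groupby("gender")["age"].median()

youngest = median_age.idxmin()  # gender label with the lowest median age
oldest = median_age.idxmax()    # gender label with the highest median age

print(youngest, median_age[youngest])
print(oldest, median_age[oldest])
```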

A4. Display the country (and the value) that has: (1) the richest customers, (2) the poorest customers. (Tip: Use salary | Use mean statistic)

import pandas as pd

# Assuming your DataFrame is loaded as 'df'

# Find the country with the richest customer (based on mean salary)
richest_country = df.groupby('country')['salary'].mean().idxmax()
richest_salary = df.groupby('country')['salary'].mean().max()

print(f"The country with the richest customer is {richest_country} with a mean salary of {richest_salary}.")

# Find the country with the poorest customers (based on mean salary)
poorest_country = df.groupby('country')['salary'].mean().idxmin()
poorest_salary = df.groupby('country')['salary'].mean().min()

print(f"The country with the poorest customers is {poorest_country} with a mean salary of {poorest_salary}.")

• df.groupby('country')['salary'].mean() groups the DataFrame df by the 'country' column and calculates the mean salary for each country.
• idxmax() and max() are used to find the index label (country) and the maximum mean salary, respectively.
• print() outputs a formatted string that includes the country with the richest customers and the corresponding mean salary.
• In the same way, idxmin() and min() are used to find the country and the minimum mean salary for the poorest customers.
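
The answer above recomputes the same groupby four times. A possible tidy-up, sketched here with invented salaries, is to compute the means once and read all four values from that Series. The helper name extremes is hypothetical and not part of pandas.

```python
import pandas as pd

def extremes(series):
    """Hypothetical helper: (label of max, max, label of min, min) of a Series."""
    return series.idxmax(), series.max(), series.idxmin(), series.min()

# Toy data with invented salaries.
toy = pd.DataFrame({
    "country": ["France", "France", "Spain", "Spain"],
    "salary":  [100.0, 200.0, 50.0, 70.0],
})

mean_salary = toy.groupby("country")["salary"].mean()  # computed once
rich_country, rich_value, poor_country, poor_value = extremes(mean_salary)

print(rich_country, rich_value, poor_country, poor_value)
```

The same helper would also fit the credit-score and standard-deviation questions, since they follow the identical idxmax/max/idxmin/min pattern.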

A5. Display the gender (and the value) that has: (1) the best credit score, (2) the worst credit score. (Tip: Use mean statistic)

import pandas as pd

# Assuming your DataFrame is loaded as 'df'

# Find the gender with the best credit score (based on mean credit score)
best_credit_gender = df.groupby('gender')['credit_score'].mean().idxmax()
best_credit_score = df.groupby('gender')['credit_score'].mean().max()

print(f"The gender with the best credit score is {best_credit_gender} with a mean credit score of {best_credit_score}.")

# Find the gender with the worst credit score (based on mean credit score)
worst_credit_gender = df.groupby('gender')['credit_score'].mean().idxmin()
worst_credit_score = df.groupby('gender')['credit_score'].mean().min()

print(f"The gender with the worst credit score is {worst_credit_gender} with a mean credit score of {worst_credit_score}.")
• df.groupby('gender')['credit_score'].mean() groups the DataFrame df by the 'gender' column and calculates the mean credit score for each gender.
• idxmax() and max() are used to find the index label (gender) and the maximum mean credit score, respectively.
• In the same way, idxmin() and min() are used to find the gender and the minimum mean credit score for the worst credit score.
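
Another way to read off best and worst, sketched on invented scores, is to sort the group means: the first entry is the best and the last is the worst, and the full ranking stays visible. The :.1f format is only a display choice.

```python
import pandas as pd

# Toy data with invented credit scores.
toy = pd.DataFrame({
    "gender":       ["Male", "Male", "Female", "Female"],
    "credit_score": [600, 700, 640, 680],
})

# Mean per gender, highest first: Female 660.0, Male 650.0
ranking = toy.groupby("gender")["credit_score"].mean().sort_values(ascending=False)

best, worst = ranking.index[0], ranking.index[-1]

print(f"Best: {best} ({ranking.iloc[0]:.1f}), worst: {worst} ({ranking.iloc[-1]:.1f})")
```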

A6. Display the country (and the value) with: (1) highest variation in salary (2) lowest variation in salary. (Tip: Use std. dev. statistic)

import pandas as pd

# Assuming your DataFrame is loaded as 'df'

# Find the country with the highest variation in salary (based on standard deviation)
highest_variation_country = df.groupby('country')['salary'].std().idxmax()
highest_salary_variation = df.groupby('country')['salary'].std().max()

print(f"The country with the highest variation in salary is {highest_variation_country} with a standard deviation of {highest_salary_variation}.")

# Find the country with the lowest variation in salary (based on standard deviation)
lowest_variation_country = df.groupby('country')['salary'].std().idxmin()
lowest_salary_variation = df.groupby('country')['salary'].std().min()

print(f"The country with the lowest variation in salary is {lowest_variation_country} with a standard deviation of {lowest_salary_variation}.")
• df.groupby('country')['salary'].std() groups the DataFrame df by the 'country' column and calculates the standard deviation of the 'salary' column for each country.
• idxmax() and max() are used to find the index label (country) and the maximum standard deviation of salary, respectively.
• In the same way, idxmin() and min() are used to find the country and the minimum standard deviation of salary for the lowest variation.
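
One detail worth knowing: pandas std() computes the sample standard deviation (ddof=1) by default, so a country with a single row gets NaN, which idxmax() and idxmin() skip by default. A small sketch on invented salaries:

```python
import pandas as pd

# Toy data with invented salaries; Germany has only one row on purpose.
toy = pd.DataFrame({
    "country": ["France", "France", "Spain", "Spain", "Germany"],
    "salary":  [100.0, 300.0, 200.0, 210.0, 500.0],
})

salary_std = toy.groupby("country")["salary"].std()  # sample std, ddof=1

most_varied = salary_std.idxmax()   # NaN entries are skipped
least_varied = salary_std.idxmin()

print(salary_std)
print(most_varied, least_varied)
```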

A7. Display the gender statistics with respect to country. (Tip: Use count or sum statistic)
1. Country with highest number of male customers
2. Country with lowest number of male customers
3. Country with highest number of female customers
4. Country with lowest number of female customers

import pandas as pd

# Assuming your DataFrame is loaded as 'df'

# Country with the highest number of male customers

highest_male_country = df[df['gender'] == 'Male'].groupby('country')['customer_id'].count().idxmax()
highest_male_count = df[df['gender'] == 'Male'].groupby('country')['customer_id'].count().max()

print(f"Country with the highest number of male customers: {highest_male_country} with {highest_male_count} male customers.")

# Country with the lowest number of male customers

lowest_male_country = df[df['gender'] == 'Male'].groupby('country')['customer_id'].count().idxmin()
lowest_male_count = df[df['gender'] == 'Male'].groupby('country')['customer_id'].count().min()

print(f"Country with the lowest number of male customers: {lowest_male_country} with {lowest_male_count} male customers.")

# Country with the highest number of female customers

highest_female_country = df[df['gender'] == 'Female'].groupby('country')['customer_id'].count().idxmax()
highest_female_count = df[df['gender'] == 'Female'].groupby('country')['customer_id'].count().max()

print(f"Country with the highest number of female customers: {highest_female_country} with {highest_female_count} female customers.")

# Country with the lowest number of female customers

lowest_female_country = df[df['gender'] == 'Female'].groupby('country')['customer_id'].count().idxmin()
lowest_female_count = df[df['gender'] == 'Female'].groupby('country')['customer_id'].count().min()

print(f"Country with the lowest number of female customers: {lowest_female_country} with {lowest_female_count} female customers.")
• df[df['gender'] == 'Male'] filters the DataFrame df to include only male customers.
• groupby('country')['customer_id'].count() then groups the filtered DataFrame by 'country' and counts the number of male customers in each country using the 'customer_id' column.
• idxmax() and max() are used to find the index label (country) and the maximum count of male customers, respectively.
• In the same way, idxmin() and min() are used to find the country and the minimum count of male customers.
• The female counts repeat the same steps with the filter df['gender'] == 'Female'.
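
As an alternative sketch, pd.crosstab builds the whole country-by-gender count table in one call. One difference from filtering first: a country with no male customers appears with a count of 0 here, whereas after the filter it would simply be missing from the grouped result. The rows below are invented.

```python
import pandas as pd

# Toy data with invented rows; Germany has no male customers on purpose.
toy = pd.DataFrame({
    "country": ["France", "France", "Spain", "Spain", "Spain", "Germany"],
    "gender":  ["Male", "Female", "Male", "Male", "Female", "Female"],
})

# Rows = country, columns = gender, cells = number of customers.
table = pd.crosstab(toy["country"], toy["gender"])

most_male = table["Male"].idxmax()
fewest_male = table["Male"].idxmin()

print(table)
print(most_male, fewest_male)
```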

A8. Display the member counts (both 0 & 1) with respect to both country & gender. (Tip: Use count or sum statistic)

import pandas as pd

# Assuming your DataFrame is loaded as 'df'

# Display member counts (0 and 1) with respect to both country and gender
member_counts = df.groupby(['country', 'gender', 'member'])['customer_id'].count().reset_index()

print("Member Counts:")
print(member_counts)

import pandas as pd
import matplotlib.pyplot as plt

# Assuming your DataFrame is loaded as 'df'

# Display member counts (0 and 1) with respect to both country and gender
member_counts = df.groupby(['country', 'gender', 'member'])['customer_id'].count().reset_index()

# Filter for member = 1 (assuming member is binary)
member_counts_1 = member_counts[member_counts['member'] == 1]

# Create a MultiIndex for better labeling
# (set_index returns a new DataFrame, so the filtered slice is not modified in place)

member_counts_1 = member_counts_1.set_index(['country', 'gender'])

# Create bar graph

fig, ax = plt.subplots(figsize=(12, 6))
member_counts_1['customer_id'].plot(kind='bar', ax=ax, legend=False)

# Title and labels

plt.title('Member Counts by Country and Gender (Member = 1)',
fontsize=14)
plt.xlabel('Country, Gender', fontsize=12)
plt.ylabel('Count', fontsize=12)

# Show the plot

plt.show()
• df.groupby(['country', 'gender', 'member']) groups the DataFrame df by the columns 'country', 'gender', and 'member'.
• ['customer_id'].count() then counts the number of customers for each combination of country, gender, and member status.
• reset_index() turns the grouped result back into a regular DataFrame, which gives the data a clearer structure.
• print(member_counts) prints the resulting DataFrame with the customer counts for each combination of country, gender, and member status.
• member_counts[member_counts['member'] == 1] filters the member_counts DataFrame to include only rows where 'member' is equal to 1.
• set_index(['country', 'gender']) sets a MultiIndex on the DataFrame using the columns 'country' and 'gender'. This improves the labeling of the bar graph.
• The plotting section uses Matplotlib to create a bar graph. The 'country' and 'gender' combinations are plotted on the x-axis, and the count of customers with 'member' equal to 1 is plotted on the y-axis.
• plt.title(), plt.xlabel(), and plt.ylabel() set the title, x-axis label, and y-axis label for the bar graph.
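
For reading the 0 and 1 counts side by side, the grouped counts can also be spread into columns. This is a sketch on invented rows: size() counts rows per group, and unstack moves the member level into columns, with fill_value=0 for combinations that never occur.

```python
import pandas as pd

# Toy data with invented rows.
toy = pd.DataFrame({
    "country": ["France", "France", "France", "Spain", "Spain"],
    "gender":  ["Male", "Male", "Female", "Female", "Female"],
    "member":  [1, 0, 1, 1, 1],
})

# One row per (country, gender); one column per member value (0 and 1).
wide = (
    toy.groupby(["country", "gender", "member"])
       .size()
       .unstack("member", fill_value=0)
)

print(wide)
```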

A9. Display the combination of country & gender (and the value) that has: (1) the best credit score, (2) the worst credit score. (Tip: Use mean statistic)

import pandas as pd
# Assuming your DataFrame is loaded as 'df'
# Find the combination of country & gender with the best credit score (based on mean credit score)
best_credit_combination = df.groupby(['country', 'gender'])['credit_score'].mean().idxmax()
best_credit_score = df.groupby(['country', 'gender'])['credit_score'].mean().max()

print(f"The combination of country & gender with the best credit score is {best_credit_combination} with a mean credit score of {best_credit_score}.")

# Find the combination of country & gender with the worst credit score (based on mean credit score)
worst_credit_combination = df.groupby(['country', 'gender'])['credit_score'].mean().idxmin()
worst_credit_score = df.groupby(['country', 'gender'])['credit_score'].mean().min()

print(f"The combination of country & gender with the worst credit score is {worst_credit_combination} with a mean credit score of {worst_credit_score}.")

import matplotlib.pyplot as plt
import seaborn as sns

# Calculating the mean credit score for each combination of country and gender
mean_credit_scores_by_country_gender = df.groupby(['country', 'gender'])['credit_score'].mean()

# Finding the combination with the best (highest) and worst (lowest) mean credit score
best_credit_combo = mean_credit_scores_by_country_gender.idxmax()
best_credit_score = mean_credit_scores_by_country_gender.max()

worst_credit_combo = mean_credit_scores_by_country_gender.idxmin()
worst_credit_score = mean_credit_scores_by_country_gender.min()

# Filtering the data for the highest and lowest combinations

filtered_credit_scores = mean_credit_scores_by_country_gender.loc[[best_credit_combo, worst_credit_combo]]

# Building "Country, Gender" labels from the (country, gender) index tuples
combo_labels = [f"{country}, {gender}" for country, gender in filtered_credit_scores.index]

# Plotting the mean credit scores for the combinations with the highest and lowest scores
plt.figure(figsize=(6, 4))
sns.barplot(x=combo_labels, y=filtered_credit_scores.values, palette="Blues")
plt.title("Best and Worst Credit Scores by Country and Gender")
plt.xlabel("Country, Gender")
plt.ylabel("Mean Credit Score")
plt.show()

First version (printed results):
• Groups the DataFrame df by the columns 'country' and 'gender'.
• Calculates the mean credit score for each combination.
• Uses idxmax() to find the index (combination of country and gender) with the maximum mean credit score, and max() to find the maximum mean credit score.
• Prints the combination and mean credit score for the best credit score.
• In the same way, uses idxmin() and min() to find the combination and the minimum mean credit score for the worst credit score.
• Prints the combination and mean credit score for the worst credit score.

Second version (with bar graph):
• Groups the DataFrame by the columns 'country' and 'gender' and calculates the mean credit score for each combination, this time only once.
• Uses idxmax() and idxmin() to find the indices (combinations of country and gender) with the maximum and minimum mean credit scores.
• Uses max() and min() to find the maximum and minimum mean credit scores.
• Filters the mean credit scores Series to include only the entries corresponding to the best and worst credit combinations.
• Uses Seaborn to create a bar graph with x-axis labels representing the combinations of country and gender, and the y-axis representing mean credit scores.
• Displays the bar graph.
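
When grouping by two columns, the result has a two-level index, so idxmax() and idxmin() return a (country, gender) tuple rather than a single label. A short sketch on invented scores shows how the tuple can be unpacked for a cleaner printout:

```python
import pandas as pd

# Toy data with invented credit scores.
toy = pd.DataFrame({
    "country":      ["France", "France", "Spain", "Spain"],
    "gender":       ["Male", "Female", "Male", "Female"],
    "credit_score": [650, 700, 600, 620],
})

means = toy.groupby(["country", "gender"])["credit_score"].mean()

best_country, best_gender = means.idxmax()    # tuple: (country, gender)
worst_country, worst_gender = means.idxmin()

print(f"Best: {best_country}, {best_gender} ({means.max()})")
print(f"Worst: {worst_country}, {worst_gender} ({means.min()})")
```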

A10. Display the top 3 & bottom 3 credit scores in the: (1) members category, (2) non-members category. (Tip: Use n-largest / n-smallest statistic)

# Assuming your DataFrame is loaded as 'df'

# Top 3 credit scores in the members category
top_members_credit_scores = df[df['member'] == 1].nlargest(3,
'credit_score')

print("Top 3 credit scores in the members category:")

print(top_members_credit_scores[['customer_id', 'credit_score']])

# Bottom 3 credit scores in the members category

bottom_members_credit_scores = df[df['member'] == 1].nsmallest(3,
'credit_score')

print("\nBottom 3 credit scores in the members category:")

print(bottom_members_credit_scores[['customer_id', 'credit_score']])

# Top 3 credit scores in the non-members category

top_non_members_credit_scores = df[df['member'] == 0].nlargest(3,
'credit_score')

print("\nTop 3 credit scores in the non-members category:")

print(top_non_members_credit_scores[['customer_id', 'credit_score']])

# Bottom 3 credit scores in the non-members category

bottom_non_members_credit_scores = df[df['member'] == 0].nsmallest(3,
'credit_score')

print("\nBottom 3 credit scores in the non-members category:")

print(bottom_non_members_credit_scores[['customer_id',
'credit_score']])

import matplotlib.pyplot as plt

# Finding the top 3 and bottom 3 credit scores in the members category
top_3_member_credit_scores = df[df['member'] == 1]['credit_score'].nlargest(3)
bottom_3_member_credit_scores = df[df['member'] == 1]['credit_score'].nsmallest(3)

# Finding the top 3 and bottom 3 credit scores in the non-members category
top_3_non_member_credit_scores = df[df['member'] == 0]['credit_score'].nlargest(3)
bottom_3_non_member_credit_scores = df[df['member'] == 0]['credit_score'].nsmallest(3)

# Combining these into a single dataframe for plotting
credit_scores_combined = pd.DataFrame({
    'Top 3 Members': top_3_member_credit_scores.values,
    'Bottom 3 Members': bottom_3_member_credit_scores.values,
    'Top 3 Non-Members': top_3_non_member_credit_scores.values,
    'Bottom 3 Non-Members': bottom_3_non_member_credit_scores.values
})

# Plotting the top 3 and bottom 3 credit scores for members and non-members
# (DataFrame.plot creates its own figure, so the size is passed via figsize)
credit_scores_combined.plot(kind='bar', figsize=(10, 6),
                            color=["lightblue", "lightcoral", "mediumseagreen", "peachpuff"])
plt.title("Top 3 and Bottom 3 Credit Scores in Members and Non-Members Categories")
plt.xlabel("Rank")
plt.ylabel("Credit Score")
plt.xticks(ticks=[0, 1, 2], labels=['1st', '2nd', '3rd'], rotation=0)
plt.legend(title='Category')
plt.show()

First version (printed results):
• For members (df['member'] == 1), nlargest(3, 'credit_score') is used to find the rows with the top 3 credit scores, and nsmallest(3, 'credit_score') the rows with the bottom 3 credit scores.
• These results are printed with the 'customer_id' and 'credit_score' columns.
• For non-members (df['member'] == 0), nlargest(3, 'credit_score') and nsmallest(3, 'credit_score') are used in the same way.
• These results are printed.

Second version (with bar graph):
• nlargest(3) and nsmallest(3) are applied directly to the 'credit_score' column for both members and non-members.
• A new DataFrame, credit_scores_combined, is created with one column each for the top 3 and bottom 3 credit scores of members and non-members.
• A bar graph is created from the combined DataFrame.
• The x-axis represents the rank (1st, 2nd, 3rd), and the y-axis represents the credit score.
• A different color is used for each of the four categories.
• The bar graph is displayed.
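
The member and non-member filters can also be folded into a single groupby, since grouped Series support nlargest() and nsmallest() as well. This is a sketch on invented scores; the result is indexed by member value first and original row number second.

```python
import pandas as pd

# Toy data with invented credit scores.
toy = pd.DataFrame({
    "member":       [1, 1, 1, 1, 0, 0, 0, 0],
    "credit_score": [700, 650, 800, 720, 500, 610, 590, 640],
})

grouped = toy.groupby("member")["credit_score"]
top3 = grouped.nlargest(3)      # top 3 per member value
bottom3 = grouped.nsmallest(3)  # bottom 3 per member value

print(top3.loc[1].tolist())     # members, highest first
print(bottom3.loc[0].tolist())  # non-members, lowest first
```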

A11. Display the top & bottom country (and the value) in terms of salary in the: (1) members category, (2) non-members category. (Tip:
Use median statistics)

import pandas as pd
# Assuming your DataFrame is loaded as 'df'
# Top country in terms of salary in the members category
top_members_salary_country = df[df['member'] == 1].groupby('country')['salary'].median().nlargest(1).reset_index()
print("Top country in terms of salary in the members category:")
print(top_members_salary_country)

# Bottom country in terms of salary in the members category

bottom_members_salary_country = df[df['member'] == 1].groupby('country')['salary'].median().nsmallest(1).reset_index()
print("\nBottom country in terms of salary in the members category:")
print(bottom_members_salary_country)

# Top country in terms of salary in the non-members category

top_non_members_salary_country = df[df['member'] == 0].groupby('country')['salary'].median().nlargest(1).reset_index()
print("\nTop country in terms of salary in the non-members category:")
print(top_non_members_salary_country)

# Bottom country in terms of salary in the non-members category

bottom_non_members_salary_country = df[df['member'] == 0].groupby('country')['salary'].median().nsmallest(1).reset_index()
print("\nBottom country in terms of salary in the non-members category:")
print(bottom_non_members_salary_country)

import matplotlib.pyplot as plt
import seaborn as sns

# Calculating the median salary for each country within members and non-members categories
median_salary_members = df[df['member'] == 1].groupby('country')['salary'].median()
median_salary_non_members = df[df['member'] == 0].groupby('country')['salary'].median()

# Finding the top & bottom country in terms of median salary for members and non-members
top_country_salary_members = median_salary_members.idxmax()
top_salary_members = median_salary_members.max()
bottom_country_salary_members = median_salary_members.idxmin()
bottom_salary_members = median_salary_members.min()

top_country_salary_non_members = median_salary_non_members.idxmax()
top_salary_non_members = median_salary_non_members.max()
bottom_country_salary_non_members = median_salary_non_members.idxmin()
bottom_salary_non_members = median_salary_non_members.min()

# Plotting the median salaries for the top & bottom countries in members and non-members categories
plt.figure(figsize=(10, 6))
sns.barplot(x=['Top Member', 'Bottom Member', 'Top Non-Member', 'Bottom Non-Member'],
y=[top_salary_members, bottom_salary_members, top_salary_non_members, bottom_salary_non_members],
hue=['Members', 'Members', 'Non-Members', 'Non-Members'],
palette='Set2')
plt.title("Top & Bottom Countries by Median Salary in Members and Non-Members Categories")
plt.ylabel("Median Salary")
plt.show()

{
"top_country_salary_members": (top_country_salary_members, top_salary_members),
"bottom_country_salary_members": (bottom_country_salary_members, bottom_salary_members),
"top_country_salary_non_members": (top_country_salary_non_members, top_salary_non_members),
"bottom_country_salary_non_members": (bottom_country_salary_non_members, bottom_salary_non_members)
}
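
The A11 answer follows the same filter, group, and pick-extremes pattern as the earlier questions, using the median. As a compact sketch on invented salaries, grouping by both member and country and unstacking member gives one column per category, so idxmax() and idxmin() return the top and bottom country for each column at once.

```python
import pandas as pd

# Toy data with invented salaries.
toy = pd.DataFrame({
    "member":  [1, 1, 1, 1, 0, 0, 0, 0],
    "country": ["France", "France", "Spain", "Spain"] * 2,
    "salary":  [100.0, 200.0, 300.0, 500.0, 90.0, 110.0, 40.0, 60.0],
})

# Rows = country, columns = member value (0 and 1), cells = median salary.
medians = toy.groupby(["member", "country"])["salary"].median().unstack("member")

top = medians.idxmax()     # top country per member value
bottom = medians.idxmin()  # bottom country per member value

print(medians)
print(top.loc[1], bottom.loc[1], top.loc[0], bottom.loc[0])
```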

A12. Display the top 2 & bottom 2 age groups in the (1) members category, (2) non-members category. (Tip: Use n-largest / n-smallest
statistic)

# Assuming your DataFrame is named 'df'
# Replace 'df' with the actual name if different

# Note: this first version counts customers per individual age value;
# the binned age groups are built further down.

# Top 2 ages by number of customers for members
top_members_age_groups = df[df['member'] == 1].groupby('age')['age'].count().nlargest(2)

# Bottom 2 ages by number of customers for members
bottom_members_age_groups = df[df['member'] == 1].groupby('age')['age'].count().nsmallest(2)

# Top 2 ages by number of customers for non-members
top_non_members_age_groups = df[df['member'] == 0].groupby('age')['age'].count().nlargest(2)

# Bottom 2 ages by number of customers for non-members
bottom_non_members_age_groups = df[df['member'] == 0].groupby('age')['age'].count().nsmallest(2)

print("Top 2 ages by number of customers for members:")
print(top_members_age_groups)

print("\nBottom 2 ages by number of customers for members:")
print(bottom_members_age_groups)

print("\nTop 2 ages by number of customers for non-members:")
print(top_non_members_age_groups)

print("\nBottom 2 ages by number of customers for non-members:")
print(bottom_non_members_age_groups)

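bins = [18, 30, 40, 50, 60, 70, 80]

# right=False makes each bin include its left edge and exclude its right edge,
# so the labels name the half-open ranges [18, 30), [30, 40), and so on
labels = ['18-29', '30-39', '40-49', '50-59', '60-69', '70-79']
df['age_group'] = pd.cut(df['age'], bins=bins, labels=labels, right=False)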

# Counting the number of customers in each age group for members and non-members
age_group_counts_members = df[df['member'] == 1]['age_group'].value_counts()
age_group_counts_non_members = df[df['member'] == 0]['age_group'].value_counts()

# Creating a DataFrame to combine the values for plotting

combined_age_groups = pd.DataFrame({
"Members": age_group_counts_members,
"Non-Members": age_group_counts_non_members
})

# Extracting the top 2 and bottom 2 age groups for members and non-members
top_2_members = combined_age_groups['Members'].nlargest(2)
bottom_2_members = combined_age_groups['Members'].nsmallest(2)
top_2_non_members = combined_age_groups['Non-Members'].nlargest(2)
bottom_2_non_members = combined_age_groups['Non-Members'].nsmallest(2)

import matplotlib.pyplot as plt

# Merging top 2 and bottom 2 age groups into a single DataFrame for plotting
# Ensuring all relevant age groups are included in the index
all_age_groups = top_2_members.index.union(bottom_2_members.index).union(top_2_non_members.index).union(bottom_2_non_members.index)
merged_age_groups = pd.DataFrame(index=all_age_groups)

# Adding the top 2 and bottom 2 age groups to the DataFrame
merged_age_groups['Top 2 Members'] = top_2_members.reindex(all_age_groups)
merged_age_groups['Bottom 2 Members'] = bottom_2_members.reindex(all_age_groups)
merged_age_groups['Top 2 Non-Members'] = top_2_non_members.reindex(all_age_groups)
merged_age_groups['Bottom 2 Non-Members'] = bottom_2_non_members.reindex(all_age_groups)

# Plotting the age group counts
# (DataFrame.plot creates its own figure, so the size is passed via figsize)

merged_age_groups.plot(kind='bar', figsize=(10, 6), color=["blue", "green", "red", "orange"])
plt.title("Top 2 & Bottom 2 Age Groups in Members and Non-Members Categories")
plt.xlabel("Age Group")
plt.ylabel("Number of Customers")
plt.legend(title='Category')
plt.show()
First version (printed results):
• For members (df['member'] == 1), customers are counted per individual age, and nlargest(2) and nsmallest(2) are used to find the 2 most common and 2 least common ages, respectively.
• These results are printed.
• For non-members (df['member'] == 0), nlargest(2) and nsmallest(2) are used in the same way.
• These results are printed.

Second version (binned age groups with bar graph):
• Creates age groups using the pd.cut function based on specified bins and labels.
• Counts the number of customers in each age group for members and non-members.
• Creates a DataFrame, combined_age_groups, to combine the counts for members and non-members for each age group.
• Uses nlargest and nsmallest to extract the top 2 and bottom 2 age groups for both members and non-members.
• Creates a DataFrame, merged_age_groups, with an index that includes all unique age groups from the top and bottom groups.
• Adds columns to the merged_age_groups DataFrame for the top 2 and bottom 2 age groups for both members and non-members.
• Creates a bar graph from the merged_age_groups DataFrame.
• A different color is used for each of the four categories.
• Displays the bar graph.
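
The behaviour of pd.cut at the bin edges is easy to get wrong, so here is a small check on invented ages. With right=False each bin includes its left edge and excludes its right edge, so 30 falls in the bin that starts at 30, and an age equal to the last edge (80) falls outside every bin and becomes NaN. The labels here are chosen to match those half-open ranges.

```python
import pandas as pd

# Invented ages placed on and next to the bin edges.
ages = pd.Series([18, 29, 30, 39, 40, 80])

bins = [18, 30, 40, 50, 60, 70, 80]
labels = ["18-29", "30-39", "40-49", "50-59", "60-69", "70-79"]

# right=False -> intervals [18, 30), [30, 40), ..., [70, 80)
groups = pd.cut(ages, bins=bins, labels=labels, right=False)

print(pd.DataFrame({"age": ages, "age_group": groups}))
```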

148 pages
Gujarat Technological University
No ratings yet
Gujarat Technological University
3 pages
Lecture 11 Decanters
No ratings yet
Lecture 11 Decanters
10 pages
General Declaration: (Outward / Inward)
No ratings yet
General Declaration: (Outward / Inward)
1 page
My Mother at Sixty Six
100% (1)
My Mother at Sixty Six
10 pages
BL 5 Audit, Business Processes
100% (1)
BL 5 Audit, Business Processes
188 pages
Client Privacy Notice en Difc
No ratings yet
Client Privacy Notice en Difc
9 pages
Project of Green Building
No ratings yet
Project of Green Building
53 pages
Detailed Lesson Plan in Science
No ratings yet
Detailed Lesson Plan in Science
4 pages
University of Botswana Department of Law Law 332: Evidence SEMESTER 2021/2022
No ratings yet
University of Botswana Department of Law Law 332: Evidence SEMESTER 2021/2022
8 pages
ICB 300 MPCS Ni - Br.Coin Blanks.
No ratings yet
ICB 300 MPCS Ni - Br.Coin Blanks.
89 pages
