Data Analytics

The document introduces data analytics, explaining its importance in decision-making through various types of analysis: descriptive, diagnostic, predictive, and prescriptive. It illustrates these concepts with a case study of BeanBrew Café, demonstrating how data-driven insights can improve sales and customer satisfaction. Additionally, it highlights Python's role in data analytics, showcasing libraries like NumPy, Pandas, Matplotlib, Seaborn, and SciPy for data manipulation and visualization.

Uploaded by meejanani

Data Analytics
From Data to Decisions
Introduction to Data Analytics
• What is data? Data refers to raw facts and figures collected from various sources.
• What is data analytics? Data analytics is the process of looking at data to find useful information, patterns, or trends that help us make better decisions.
• It involves:
1. Collecting the right data
2. Processing and organizing it
3. Analyzing it using statistical or computational techniques
4. Visualizing results using charts, graphs, dashboards, etc.
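The four steps above can be sketched in a few lines of pandas. This is only an illustration: the order records are made-up sample data, and a printed summary table stands in for a real chart.

```python
import pandas as pd

# 1. Collect: hypothetical order records (made-up sample data)
orders = pd.DataFrame({
    "item": ["espresso", "latte", "espresso", "muffin", "latte", None],
    "price": [3.0, 4.5, 3.0, 2.5, 4.5, 4.0],
})

# 2. Process and organize: drop rows with a missing item name
clean = orders.dropna(subset=["item"])

# 3. Analyze: total revenue and number of orders per item
summary = clean.groupby("item")["price"].agg(["sum", "count"])

# 4. Visualize: print the table (a chart would normally go here)
print(summary.sort_values("sum", ascending=False))
```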
Types of Data Analytics
The Tale of BeanBrew Café
How data saved the coffee empire
Once upon a time in a bustling city, a cozy little coffee shop called
BeanBrew Café was famous for its rich espresso and warm
pastries. For years, it was the go-to spot for students, professionals,
and tourists alike. But in early 2024, the owner, Maya, noticed
something troubling—sales were slipping.

Instead of panicking, Maya turned to data.


Descriptive Analysis: “What’s happening?”
Maya’s first step was to look at the numbers. She pulled sales data from the past year and used descriptive analysis:
• Sales had dropped 20% over 3 months.
• Afternoon sales were down the most.
• The number of returning customers was decreasing.
Maya visualized this with charts showing declining foot traffic and monthly revenue dips. Something was definitely off.
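A descriptive summary like this can be computed directly from a revenue series. The sketch below uses made-up monthly figures (not BeanBrew's actual numbers) chosen so that the overall drop comes to 20%.

```python
import pandas as pd

# Hypothetical monthly revenue figures, for illustration only
revenue = pd.Series(
    [10000, 9200, 8600, 8000],
    index=["Jan", "Feb", "Mar", "Apr"],
)

# Month-over-month change, in percent
monthly_change = revenue.pct_change() * 100

# Overall change from the first month to the last
total_change = (revenue.iloc[-1] - revenue.iloc[0]) / revenue.iloc[0] * 100

print(monthly_change.round(1))
print(f"Change over 3 months: {total_change:.0f}%")
```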
Diagnostic Analysis: “Why is it happening?”
Now that she knew what was happening, Maya asked: Why? She conducted a diagnostic analysis:
• She compared sales with weather patterns (rainy afternoons!).
• She looked at online reviews: people complained about long wait times in the afternoon.
• She found a correlation: as wait times went up, customer satisfaction went down.
She confirmed a strong link between staff shortage and lost sales.
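A correlation like the one between wait times and satisfaction can be measured with a Pearson coefficient. The numbers below are invented for illustration. A value near -1 means the two move in opposite directions, which suggests a link but does not by itself prove the cause.

```python
import numpy as np

# Hypothetical afternoon observations (made-up numbers)
wait_minutes = np.array([2, 4, 6, 8, 10, 12])
satisfaction = np.array([4.8, 4.5, 4.1, 3.6, 3.2, 2.9])  # rating on a 1-5 scale

# Pearson correlation coefficient between the two variables
r = np.corrcoef(wait_minutes, satisfaction)[0, 1]
print(f"Correlation: {r:.2f}")
```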
Predictive Analysis: “What could happen next?”
Worried the trend might continue, Maya used predictive analysis.
She fed her sales and staffing data into a machine learning model, which forecasted:
• If no changes were made, she'd lose 30% more sales over the next quarter.
• However, adding one extra barista in the afternoon could reverse the trend.
She also predicted hot drink sales would spike in the upcoming rainy season.
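The slides do not say which model Maya used. As a much simpler stand-in, the sketch below fits a straight line to made-up weekly sales with `np.polyfit` and extends it four weeks ahead. It illustrates the idea of forecasting from a trend, not a full machine learning workflow.

```python
import numpy as np

# Hypothetical weekly sales (made-up numbers)
weeks = np.arange(1, 9)  # weeks 1..8
sales = np.array([5000, 4900, 4850, 4700, 4650, 4500, 4450, 4300])

# Fit a straight line: sales ~ slope * week + intercept
slope, intercept = np.polyfit(weeks, sales, deg=1)

# Extrapolate the trend four weeks ahead
future_weeks = np.arange(9, 13)
forecast = slope * future_weeks + intercept

print(f"Trend: {slope:.0f} per week")
print("Forecast:", forecast.round(0))
```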
Prescriptive Analysis: “What should I do about it?”
Maya wanted a strategy, not just insights. Using prescriptive analysis, she:
• Simulated different staffing schedules.
• Modeled pricing strategies (e.g., happy hour deals at 3–5 p.m.).
• Found the optimal plan: hire a part-time barista, offer a rainy-day discount, and promote mobile orders.
The model showed this would increase profits by 15% and reduce customer complaints.
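At its simplest, prescriptive analysis means scoring each candidate plan and picking the best one. The sketch below does that for a few scenarios. The scenario names, costs, and revenues are all hypothetical, and a real model would also account for uncertainty and constraints.

```python
# Hypothetical scenarios: (extra daily cost, expected extra daily revenue).
# All figures are made up for illustration.
scenarios = {
    "no change": (0, 0),
    "part-time barista": (80, 200),
    "happy hour 3-5 p.m.": (50, 110),
    "barista + rainy-day discount": (110, 260),
}


def extra_profit(cost, revenue):
    """Net daily gain of a scenario relative to doing nothing."""
    return revenue - cost


# Evaluate every scenario and pick the one with the highest net gain
results = {name: extra_profit(*figures) for name, figures in scenarios.items()}
best = max(results, key=results.get)

for name, gain in results.items():
    print(f"{name}: {gain:+d}")
print("Best plan:", best)
```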
70-80% of people use Python for Data Analysis
Python is a leading choice for data analytics because it is easy to learn and
supported by a large community. Its rich ecosystem of libraries, such as pandas,
NumPy, Matplotlib, seaborn, and scikit-learn, makes data handling, visualization,
and machine learning straightforward. Python is versatile, free, and open-source,
and it is used across many industries. Python skills are also in high demand in
the job market, making it a valuable tool for data professionals.
Libraries used in Python for Data Analytics
NumPy
• It stands for Numerical Python.
• NumPy is a powerful library in Python used for:
1. Working with arrays (especially multi-dimensional arrays)
2. Performing mathematical and logical operations on data efficiently
3. Serving as the base for libraries like Pandas, Scikit-learn, etc.
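The second point, mathematical and logical operations, is easiest to see in a short example. The arrays below are made-up sample values. The operations run element by element without an explicit Python loop.

```python
import numpy as np

# Made-up sample values
prices = np.array([3.0, 4.5, 2.5, 5.0])
quantities = np.array([10, 4, 8, 2])

# Mathematical operations apply element by element, with no explicit loop
revenue = prices * quantities
print(revenue)         # [30. 18. 20. 10.]
print(revenue.sum())   # 78.0

# Logical operations produce a boolean array that can filter the data
mask = revenue > 15
print(mask)            # True for the first three entries
print(prices[mask])    # prices of the items whose revenue exceeds 15
```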
Creating arrays
import numpy as np

# 1D array
array_1d = np.array([10, 20, 30])
print("1D Array:", array_1d)

# 2D array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2D Array:", array_2d)

print("Dimensions:", array_2d.ndim)
print("Shape:", array_2d.shape)
print("Data Type:", array_2d.dtype)
Indexing and Slicing arrays
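# Create a 1D array
arr = np.array([10, 20, 30, 40, 50])
print(arr[0])         # first element: 10
print(arr[0:3])       # first three elements: [10 20 30]

# Create a 2D array
arr2d = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])
print(arr2d[0, 1])    # row 0, column 1: 2
print(arr2d[0])       # first row: [1 2 3]
print(arr2d[:, 0])    # first column: [1 4 7]
print(arr2d[1:, 1:])  # rows and columns from index 1 onward: 5, 6 / 8, 9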
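Flatten, Reshape and Transpose
arr = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

print("Original array: ", arr)

# flatten() returns a 1D copy: [1 2 3 4 5 6]
print("Flattened:", arr.flatten())

# reshape(3, 2) keeps the same six values in 3 rows and 2 columns
print("Reshaped :", arr.reshape(3, 2))

# transpose() swaps rows and columns, giving shape (3, 2)
print("Transposed:", arr.transpose())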
Pandas
• Pandas is an open-source Python library that provides powerful and easy-to-use data structures for data analysis and manipulation.
• At the core of Pandas are two primary data structures:
1. Series: A one-dimensional labeled array.
2. DataFrame: A two-dimensional, tabular data structure with labeled rows and columns.
Creating Series and DataFrames
import pandas as pd

# Series
data = [10, 20, 30, 40, 50]

series = pd.Series(data)
print(series)

# DataFrame
data = {
    'Name': ['A', 'B', 'C'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

df = pd.DataFrame(data)
print(df)
Summary methods
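data = {
    'Name': ['A', 'B', 'C', 'D', 'E', 'F'],
    'Age': [25, 30, 35, 40, 22, 28],
    'Salary': [50000, 60000, 70000, 80000, 45000, 52000]
}

df = pd.DataFrame(data)

print(df.head())      # first 5 rows

print(df.tail())      # last 5 rows

df.info()             # prints column types and non-null counts itself (returns None)

print(df.describe())  # summary statistics for the numeric columns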
loc[ ] and iloc[ ]
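data = {
    'Name': ['John', 'Emma', 'Liam'],
    'Age': [28, 24, 31]
}
df = pd.DataFrame(data)

print("Using loc: ")      # label-based selection

print(df.loc[0, 'Name'])  # row label 0, column 'Name'
print(df.loc[:, 'Age'])   # all rows, column 'Age'

print("Using iloc: ")     # position-based selection

print(df.iloc[0, 0])      # first row, first column
print(df.iloc[0:2, 0])    # first two rows, first column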
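With the default 0, 1, 2 index, labels and positions coincide, so `loc` and `iloc` can look interchangeable. The sketch below uses an assumed string index ("a", "b", "c") to make the difference visible, including how slices treat their end point.

```python
import pandas as pd

# Hypothetical data with a non-default index, to separate labels from positions
df = pd.DataFrame(
    {"Name": ["John", "Emma", "Liam"], "Age": [28, 24, 31]},
    index=["a", "b", "c"],
)

print(df.loc["b", "Name"])   # label-based: row "b" -> Emma
print(df.iloc[1, 0])         # position-based: second row, first column -> Emma

# Slices differ: loc includes the end label, iloc excludes the end position
print(len(df.loc["a":"b"]))  # 2 rows ("a" and "b")
print(len(df.iloc[0:1]))     # 1 row (position 0 only)
```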
Handling missing data
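data = {
    'Name': ['John', 'Emma', None, 'Liam'],
    'Age': [28, None, 22, 31],
    'City': ['New York', 'Los Angeles', 'Chicago', None]
}
df = pd.DataFrame(data)

print(df.isnull())  # True wherever a value is missing
print(df.dropna())  # copy without the rows that contain any missing value

# Fill missing names with a placeholder
df['Name'] = df['Name'].fillna('Unknown')

# Fill missing ages with the mean of the known ages
mean_age = df['Age'].mean()
df['Age'] = df['Age'].fillna(mean_age)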
Matplotlib
• Matplotlib is an open-source Python library used for creating a variety of charts and graphs.
• At the core of Matplotlib are two objects:
1. Figure: The overall window or page that holds the plot(s).
2. Axes: The individual plot or graph within the figure where data is visualized.
• Matplotlib makes it easy to generate common plots such as line graphs and scatter plots.
Creating a plot
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 25, 30])

plt.plot(x, y)

plt.title("Simple Line Plot with NumPy")


plt.xlabel("X-axis")
plt.ylabel("Y-axis")

plt.show()
Subplots
x = np.array([1, 2, 3, 4])
y1 = np.array([10, 20, 30, 40])
y2 = np.array([40, 30, 20, 10])
y3 = np.array([5, 15, 10, 25])

# Create 3 subplots (3 rows, 1 column)
fig, axs = plt.subplots(3, 1, figsize=(6, 9))

# First subplot
axs[0].plot(x, y1)

# Second subplot
axs[1].plot(x, y2)

# Third subplot
axs[2].plot(x, y3)

plt.show()
Seaborn
• Seaborn is an open-source Python library built on top of Matplotlib.
• It is designed for creating attractive and informative statistical graphics with ease.
• It provides a high-level interface for drawing visually appealing and complex plots using fewer lines of code.
• Seaborn makes it easy to visualize relationships in data, explore patterns, and enhance plots created with Matplotlib.
Creating a simple plot
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")

sns.scatterplot(data=tips, x="total_bill", y="tip", hue="time")

plt.title("Total Bill vs Tip")


plt.xlabel("Total Bill ($)")
plt.ylabel("Tip ($)")

plt.show()
SciPy
• SciPy stands for Scientific Python. It's an open-source Python library used for scientific and technical computing.
• It's built on top of NumPy and provides additional functionality.
• Think of SciPy as a powerful extension of NumPy: NumPy gives you arrays and basic math, while SciPy gives you advanced tools to analyze and solve complex problems.
Simple hypothesis testing
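from scipy.stats import ttest_rel

# Paired samples: the same five subjects measured before and after
before = [2.9, 3.0, 2.5, 2.6, 3.2]
after = [3.1, 3.2, 2.7, 2.8, 3.4]

# Paired (related-samples) t-test on the before/after differences
t_stat, p_value = ttest_rel(before, after)

# Every pair here differs by about 0.2, so the differences have almost no
# spread; expect an extreme t-statistic and a p-value close to 0.
print("t-statistic:", t_stat)
print("p-value:", p_value)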
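A p-value is usually read against a significance level chosen in advance. The sketch below assumes the common threshold of 0.05 and uses made-up paired values whose differences vary from pair to pair. Both the data and the threshold are illustrative assumptions.

```python
from scipy.stats import ttest_rel

# Hypothetical paired measurements (made-up values whose differences vary)
before = [2.9, 3.0, 2.5, 2.6, 3.2]
after = [3.1, 3.3, 2.6, 2.9, 3.3]

t_stat, p_value = ttest_rel(before, after)

alpha = 0.05  # assumed significance level
if p_value < alpha:
    print("Reject the null hypothesis: the mean difference is significant.")
else:
    print("Fail to reject the null hypothesis.")
```

A negative t-statistic here simply reflects the order of the arguments: the "before" values are lower on average than the "after" values.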
Conclusion
• Python transforms data analytics from a complex challenge into a powerful opportunity.
• With its intuitive libraries and dynamic tools, it not only simplifies data processing and visualization but also unlocks deeper insights.
• This fuels smarter decisions and sparks innovation, making Python a catalyst for success across industries.
Thank you very much!
Presented by Srivarshan
