Intro To Data Analytics Activity Templates - Marquina Alberto

The document provides guidance for completing activities over multiple weeks for an analytics course. It includes reminders to save the document, complete all activities, and remove certain sections before submitting. Details are given for week 2 on obtaining and scrubbing data, week 3 on exploring and modeling data, and week 4 on interpreting data.

Reminders…

● You’ll use this file for the entirety of this course. Save it in a place where you can easily
access it over the upcoming weeks.
○ You can edit and save this document in Google Drive
○ If you download this document, keep it in a place you can find it later
● The content you put into this document will be used for later lessons
○ It is recommended that you do not skip any activity in any of the lessons
○ It is recommended that you update this document after every week of content
and start with week 2
● Requirements:
○ Answer all the questions in this document
○ When complete, download this as a PDF document for submission in the peer
review assignment.
○ Don’t know how to download it as a PDF? You can find more information about
downloading this by clicking here.
○ Remove this section before submitting

Content

Week 2 Activity: Obtaining and Scrubbing Data
Week 3 Activity: Exploring and Modeling Data
Week 4 Activity: iNterpreting Data

Week 2 Activity: Obtaining and Scrubbing Data
Anna owns a clothing boutique in New York, called BrightThreads. She sells a mix of clothing
brands and chooses items for her store that she believes her clients will like. She also sells
online.

Anna is working on long-term planning for the upcoming year at BrightThreads. Business has
been going well, but she would really like to increase sales and potentially open up a second
location in a different neighborhood. Next year, Anna would like to increase her total sales by
10%. This would be a very good year for Anna and BrightThreads, but it seems doable based
on the last few quarters and with some hard work.

Using this information, answer the questions below regarding the obtain and scrub stages of the
OSEMN process. Add your answers to the template below.

In this scenario, what is a SMART goal that would benefit from data analysis?

Increase BrightThreads total sales by 10% by next year.

What is a Primary KPI that would be useful to analyze for this goal?

The primary KPI is total sales.

What relevant data would you gather in this scenario?

● Historic sales data
● Inventory data
● Marketing and promotion data
● Economic and market data

How do you imagine you could obtain this data? What sources would you gather data from?
Specifically, note what kind of data (first-party, third-party) and what methods you might use
(survey, web analytics).

● Historic sales data: Can be obtained internally from BrightThreads' sales records,
which would constitute first-party data.
● Inventory data: Comes from BrightThreads' internal records, so it is also first-party data.
● Marketing and promotion data: Generated by running advertising campaigns and
promotions, managed by someone who knows the target market.
● Economic and market data: Comes from industry reports and economic databases,
which are third-party data.

Anna at BrightThreads has begun the process of gathering data to help analyze current sales.

She has collected data on recent online sales directly from the online storefront.

Access this sample Customer Data and click on Use Template in the upper right corner. You will
need to be logged into a Google account to use this template.

Anna has isolated 4 different segments that each have issues that need to be fixed. You can
access each segment in the four sheets in this one spreadsheet. Click on each sheet for a
different segment of the dataset. You can click on the tabs at the bottom of the spreadsheet to
move between sheets. Review the image below for a preview:

The four sheets are accessible by clicking the tabs at the bottom of the spreadsheet.

Using what you know about data validity, do you think the data Anna has gathered is valid? Why
or why not?

No, I consider most of the data to be useless. Only the category and cost would be useful to
know more about the market and the audience she is looking for.

What issue did you identify in segment 1 of the data?

It contains unnecessary data, such as details on specific customers and product codes.

What issue did you identify in segment 2 of the data?

The variety of costs across different kinds of products is not the best way to understand the
market.

What issue did you identify in segment 3 of the data?

Some data is missing.

What issue did you identify in segment 4 of the data?

The cost of some types of products is irregular.
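
The missing and irregular values noted for segments 3 and 4 could be flagged with a short check like the one below. This is only a minimal sketch: the rows, the column names (`order_id`, `category`, `cost`), and the rule that an "irregular" cost means a non-numeric or non-positive value are all assumptions for illustration, not taken from Anna's spreadsheet.

```python
# Hypothetical rows standing in for the Customer Data sheet; the column
# names and values are assumptions, not the real dataset.
rows = [
    {"order_id": "A1", "category": "Dresses", "cost": "59.99"},
    {"order_id": "A2", "category": "", "cost": "39.99"},
    {"order_id": "A3", "category": "Tops", "cost": None},
    {"order_id": "A4", "category": "Tops", "cost": "-20.00"},
]


def find_issues(rows):
    """Return (row index, column, problem) tuples for missing or irregular values."""
    issues = []
    for i, row in enumerate(rows):
        # Flag every empty or absent cell.
        for column, value in row.items():
            if value is None or str(value).strip() == "":
                issues.append((i, column, "missing"))
        # Assumed rule: a cost must be a number greater than zero.
        cost = row.get("cost")
        if cost not in (None, ""):
            try:
                if float(cost) <= 0:
                    issues.append((i, "cost", "not positive"))
            except ValueError:
                issues.append((i, "cost", "not a number"))
    return issues


issues = find_issues(rows)
for issue in issues:
    print(issue)
```

Listing the problems first, rather than deleting rows right away, makes it easier to decide whether each one should be corrected, filled in, or dropped.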

Week 3 Activity: Exploring and Modeling Data


Anna from BrightThreads is exploring some data from last quarter's online sales.

The data was gathered from the BrightThreads online store.

Access BrightThreads' online sales data and click on Use Template in the upper right corner to
access the dataset. Please note you will need to be logged into a Google account.

Review the following data and charts, then share what you can learn in the exploration stage of
the OSEMN process.

Using this information, answer the questions below regarding the explore and model stages of
the OSEMN process. Add your answers to the template below.

What are some things you can tell about this dataset? For instance, what does the size of the
dataset tell you?

It shows the cost of clothing and what different customers spend, from which an average
cost can be calculated to help attract more customers.

What kind of data is in this dataset? (Numerical, categorical, etc.)

Numerical: the cost and the amount.

Reviewing this data, what is the minimum value in the order_total column? What is the maximum
value in the order_total column?

The minimum is 39.99
The maximum is 149.99
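
If the order_total column were exported to Python, the minimum and maximum could be checked as in the sketch below. The list is illustrative: only the smallest and largest values come from the answers above, and the values in between are made up.

```python
# Illustrative order totals; only the smallest and largest values match the
# answers above, the rest are made up for the example.
order_totals = [39.99, 74.50, 89.00, 112.25, 149.99]

minimum = min(order_totals)
maximum = max(order_totals)
print(f"min={minimum}, max={maximum}")
```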

What kind of chart would you use to help visualize this data?

Bar chart.

Based on what you have learned, would you add an additional column to this dataset using
feature engineering? For instance, using the sales dates, would it be helpful to add in the day of
the week data?

I consider it would be an option to add a column that shows the time it takes for the product to
sell.
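
The day-of-the-week feature suggested in the question could be derived roughly as follows. This is a sketch under assumptions: the `sale_date` column name, the ISO date format, and the two sample orders are hypothetical and not taken from the BrightThreads dataset.

```python
from datetime import date

# Weekday names indexed the same way as date.weekday() (Monday is 0).
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical orders; the sale_date column name and values are assumptions.
orders = [
    {"sale_date": "2024-01-01", "order_total": 59.99},
    {"sale_date": "2024-01-06", "order_total": 89.00},
]


def add_day_of_week(orders):
    """Return copies of the orders with a new day_of_week column."""
    enriched = []
    for order in orders:
        weekday_index = date.fromisoformat(order["sale_date"]).weekday()
        enriched.append({**order, "day_of_week": DAYS[weekday_index]})
    return enriched


enriched_orders = add_day_of_week(orders)
for order in enriched_orders:
    print(order)
```

The original rows are left unchanged, so the engineered column can be dropped again if it turns out not to help the analysis.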

Anna has created the following chart to explore the relationship between order totals and the
number of orders.
Based on the data in this chart, what would be a good title for this chart?

Order & Total.

What does this chart tell you about the number of orders in relation to the amount someone
spends per order?

It shows how orders are distributed across different price points.

What range do most of the orders tend to be in?

Between 55 and 115.

Anna has also been analyzing data on the amount of money she spends on social media ads
and how many clicks to the BrightThreads website they are generating.
Do you notice any correlations between the variables in this chart? If so, how would you
describe them?

Advertising expenses tend to be higher as the number of clicks per ad increases.
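
One way to put a number on this kind of relationship is the Pearson correlation coefficient, sketched below. The ad spend and click figures are made up for illustration and are not Anna's real data.

```python
from math import sqrt

# Made-up ad spend (dollars) and click counts; not Anna's real figures.
ad_spend = [50, 100, 150, 200]
clicks = [12, 19, 33, 41]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


r = pearson(ad_spend, clicks)
print(round(r, 3))
```

A value close to 1 indicates a strong positive correlation, a value close to -1 a strong negative one, and a value near 0 little linear relationship. Correlation alone does not show that the ad spend causes the clicks.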

Anna has learned a lot while exploring the data she has gathered. Now, it’s time to model some
of this data.
Reviewing this linear regression model, roughly how many site visits can be expected if the
marketing budget is increased to $250?

12 site visits.
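
For reference, a prediction like this is read off a fitted line, which can be sketched with ordinary least squares as below. The budgets and visit counts are made up and are not the values from Anna's chart, so the number this sketch prints is not expected to match the answer above.

```python
# Made-up marketing budgets (dollars) and site visits. These are NOT the
# values from Anna's chart, so the prediction below will not match it.
budgets = [50, 100, 150, 200]
visits = [5, 10, 15, 20]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(budgets, visits)
predicted_visits = slope * 250 + intercept  # extrapolate to a $250 budget
print(round(predicted_visits, 1))
```

Because $250 lies outside the range of the sample budgets, the result is an extrapolation and should be treated with more caution than a prediction inside the observed range.
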

Review this linear regression model, which shows the actual data values and the values
predicted by the model when given a test set. Do you think that this model is sufficient for
general use for this data? Why or why not?

This representation can be useful for showing the relationship between the two variables, but it
is not enough for general use.

Review this clustering model. A clustering algorithm has been used and identified two
groups. How would you describe the two different customer groups? Why?

Group A and Group B. There are two different kinds of customers with different needs,
depending on the quantity of products they buy.

You are trying to forecast BrightThreads sales in the coming quarter. What model might you
use? Why did you choose this?

I will use the previous figures and present an estimate in a bar graph depending on the need
for the product and the cost of production.
Week 4 Activity: iNterpreting Data
Anna has learned many things using data analysis. She has prepared a presentation to show to
BrightThreads stakeholders. As a reminder, her goal is to grow sales by 10% in the upcoming
year, and this presentation will cover what she’s learned and how she plans to accomplish this
goal.

Access Anna’s presentation.

Review the presentation, then share your thoughts on Anna’s interpretation of the data at the
end of the OSEMN process.

Using this information, answer the questions below regarding the interpret stage of the OSEMN
process. Add your answers to the template below.

What was the objective for this analysis?

Analyze current sales
Determine top-selling items
Forecast sales numbers
Adjust inventory if needed

How can Anna apply this in a business context?

The market research they do can be of great help to companies that want to understand
their customers.

What slides in the presentation covered the methods used in the project?

What slides in the presentation included visualization of the project?

7, 8, 9, 10, 11, 12

What slides in the presentation offered recommendations after the project?

16

In your opinion, what parts of the presentation were the setup, buildup, climax, and conclusion?
Why?

Setup - slides 1, 2
Buildup - slides 4, 5
Climax - slides 7, 8, 9, 10, 11, 12
Conclusion - slides 13, 14, 15
