Intro To Data Analytics Activity Templates - Marquina Alberto
● You’ll use this file for the entirety of this course. Save it in a place where you can easily
access it over the upcoming weeks.
○ You can edit and save this document in Google Drive
○ If you download this document, keep it in a place you can find it later
● The content you put into this document will be used for later lessons
○ It is recommended that you do not skip any activity in any of the lessons
○ It is recommended that you update this document after every week of content
and start with week 2
● Requirements:
○ Answer all the questions in this document
○ When complete, download this as a PDF document for submission in the peer
review assignment.
○ Don’t know how to download it as a PDF? You can find more information about
downloading this by clicking here.
○ Remove this section before submitting
Content
Anna is working on long-term planning for the upcoming year at BrightThreads. Business has
been going well, but she would really like to increase sales and potentially open up a second
location in a different neighborhood. Next year, Anna would like to increase her total sales by
10%. This would be a very good year for Anna and BrightThreads, but it seems doable based
on the last few quarters and with some hard work.
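The arithmetic behind the 10% goal can be sketched as below. This is only an illustration: the baseline figure is a hypothetical placeholder, not a value from BrightThreads' records, and `sales_target` is a helper name invented for this sketch.

```python
def sales_target(current_sales: float, growth_rate: float = 0.10) -> float:
    """Return the sales level needed to grow current_sales by growth_rate."""
    return current_sales * (1 + growth_rate)


# Hypothetical baseline; replace with the real total sales for the past year.
baseline = 200_000.0
target = sales_target(baseline)

# The gap between target and baseline is the extra revenue the goal requires.
print(f"Target: {target:,.2f} (increase of {target - baseline:,.2f})")
```

Stating the goal as a number like this makes it measurable, which is what a SMART goal needs.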
Using this information, answer the questions below regarding the obtain and scrub stages of the
OSEMN process. Add your answers to the template below.
In this scenario, what is a SMART goal that would benefit from data analysis?
What is a Primary KPI that would be useful to analyze for this goal?
How do you imagine you could obtain this data? What sources would you gather data from?
Specifically, note what kind of data (first-party, third-party) and what methods you might use
(survey, web analytics).
● Historic sales data: Can be obtained internally from BrightThreads sales records,
which would constitute first-party data.
● Inventory data: This data comes from BrightThreads' internal inventory records, so it is
also first-party data.
● Marketing and promotion data: This data is generated by running advertising
campaigns and promotions, ideally managed by someone who knows the target
market.
● Economic and market data: This data comes from industry reports and economic
databases.
Anna at BrightThreads has begun the process of gathering data to help analyze current sales.
She has collected data on recent online sales directly from the online storefront.
Access this sample Customer Data and click on Use Template in the upper right corner. You will
need to be logged into a Google account to use this template.
Anna has isolated 4 different segments that each have issues that need to be fixed. You can
access each segment in the four sheets in this one spreadsheet. Click on each sheet for a
different segment of the dataset. You can click on the tabs at the bottom of the spreadsheet to
move between sheets. Review the image below for a preview:
The four sheets are accessible by clicking the tabs at the bottom of the spreadsheet.
Using what you know about data validity, do you think the data Anna has gathered is valid? Why
or why not?
No. I consider most of the data not to be useful. Only the category and cost would help her
learn more about the market and the audience she is looking for.
The variety of costs across different kinds of products is not the best way to understand the market.
What issue did you identify in segment 3 of the data?
Access BrightThread’s online sales data and click on Use Template in the upper right corner to
access the dataset. Please note you will need to be logged into a Google account.
Review the following data and charts, then share what you can learn in the exploration stage of
the OSEMN process.
Using this information, answer the questions below regarding the explore and model stages of
the OSEMN process. Add your answers to the template below.
What are some things you can tell about this dataset? For instance, what does the size of the
dataset tell you?
It shows the cost of clothing and what different users spend, from which an average cost
can be calculated to help attract more customers.
Reviewing this data, what is the minimum value in the order_total column? What is the maximum
value in order_total column?
The minimum is 39.99
The maximum is 149.99
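One way to check the minimum and maximum, assuming the sheet has been exported as CSV, is sketched below. The rows are hypothetical stand-ins for the real export, and `order_id` is an assumed column name; only `order_total` comes from the activity.

```python
import csv
import io

# Hypothetical rows standing in for the exported sheet. Only the column name
# order_total is taken from the activity; order_id is an assumption.
sample_csv = """order_id,order_total
1,39.99
2,89.50
3,149.99
"""

# Parse the order_total column as numbers so min/max compare values, not text.
totals = [float(row["order_total"]) for row in csv.DictReader(io.StringIO(sample_csv))]

lowest = min(totals)
highest = max(totals)
print(f"min={lowest}, max={highest}")
```

With a real export, `io.StringIO(sample_csv)` would be replaced by an open file handle.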
What kind of chart would you use to help visualize this data?
A bar chart.
Based on what you have learned, would you add an additional column to this dataset using
feature engineering? For instance, using the sales dates, would it be helpful to add in the day of
the week data?
I think one option would be to add a column that shows the time it takes for the product to
sell.
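The day-of-the-week example from the question can be sketched as follows. This is a minimal illustration that assumes the sales dates are stored as ISO strings (YYYY-MM-DD); the dates shown are hypothetical, and `day_of_week` is a helper name invented here.

```python
from datetime import date

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def day_of_week(iso_date: str) -> str:
    """Return the weekday name for a date given as YYYY-MM-DD."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return DAY_NAMES[date.fromisoformat(iso_date).weekday()]


# Hypothetical sale dates; the new column is derived from the existing one.
sale_dates = ["2024-01-01", "2024-01-06"]
new_column = [day_of_week(d) for d in sale_dates]
print(new_column)
```

A derived column like this adds no new data; it only exposes a pattern already present in the dates.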
Anna has created the following chart to explore the relationship between order totals and the
number of orders.
Based on the data in this chart, what would be a good title for this chart?
What does this chart tell you about the number of orders in relation to the amount someone
spends per order?
Between 55 and 115.
Anna has also been analyzing data on the amount of money she spends on social media ads
and how many clicks to the BrightThreads website they are generating.
Do you notice any correlations between the variables in this chart? If so, how would you
describe them?
Yes, there is a positive correlation: higher advertising spending goes together with more clicks.
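A correlation like this can be quantified with the Pearson correlation coefficient. The sketch below uses made-up ad spend and click figures, not values read from Anna's chart, and `pearson_r` is a helper written for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only.
ad_spend = [10, 20, 30, 40, 50]
clicks = [12, 25, 31, 48, 55]

r = pearson_r(ad_spend, clicks)
print(f"r = {r:.3f}")
```

A value of r near 1 indicates a strong positive linear relationship, near -1 a strong negative one, and near 0 little linear relationship. Correlation alone does not show that the ad spending causes the clicks.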
Anna has learned a lot while exploring the data she has gathered. Now, it’s time to model some
of this data.
Reviewing this linear regression model, roughly how many site visits can be expected if the
marketing budget is increased to $250?
12 site visits.
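Reading a prediction off a regression line amounts to plugging the budget into the fitted equation. The sketch below fits an ordinary least-squares line to hypothetical budget and visit figures; they are placeholders rather than the data behind Anna's model, so the printed prediction is not the answer to the question above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical marketing budgets (in dollars) and site visits.
budgets = [50, 100, 150, 200]
visits = [2, 5, 6, 9]

slope, intercept = fit_line(budgets, visits)

# Extrapolate to a budget of 250, which lies outside the observed range.
predicted = slope * 250 + intercept
print(f"Predicted visits at $250: {predicted:.1f}")
```

Because 250 is beyond the largest observed budget in this sketch, the prediction is an extrapolation and should be treated with more caution than a value inside the observed range.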
Review this linear regression model which shows the actual data values and the values
predicted by the model when given a test set. Do you think that this model is sufficient for
general use for this data? Why or why not?
This model can be useful for showing the relationship between the two variables, but it is not
sufficient for general use.
Review this clustering model. A clustering algorithm has been used and identified two
groups. How would you describe the two different customer groups? Why?
Group A and Group B. There are two different kinds of customers with different needs,
and the difference depends on the quantity of products they buy.
You are trying to forecast BrightThreads sales in the coming quarter. What model might you
use? Why did you choose this?
I would use the previous sales figures and present an estimate in a bar chart, taking into
account demand for the product and the cost of production.
Week 4 Activity: iNterpreting Data
Anna has learned many things using data analysis. She has prepared a presentation to show to
BrightThreads stakeholders. As a reminder, her goal is to grow sales by 10% in the upcoming
year, and this presentation will cover what she’s learned and how she plans to accomplish this
goal.
Review the presentation, then share your thoughts on Anna’s interpretation of the data at the
end of the OSEMN process.
Using this information, answer the questions below regarding the interpret stage of the OSEMN
process. Add your answers to the template below.
The market research they do can be of great help to companies that want to understand
their customers.
What slides in the presentation covered the methods used in the project?
7, 8, 9, 10, 11, 12
What slides in the presentation offered recommendations after the project?
16
In your opinion, what parts of the presentation were the setup, buildup, climax, and conclusion?
Why?
Setup - slides 1, 2
Buildup - slides 4, 5
Climax - slides 7, 8, 9, 10, 11, 12
Conclusion - slides 13, 14, 15