Activity Template - Course 7 PACE Strategy Document
Instructions
Use this PACE strategy document to record your decisions and reflections as a data professional while you work
through the capstone project. As a reminder, this document is a resource guide that you can reference in the
future and a space to record your responses to the reflection questions posed at various points throughout the project.
● Show how data professionals leverage Python to load, explore, extract, and organize information
● Demonstrate understanding of how to organize and analyze a dataset to find the “story”
● Demonstrate the ability to use a notebook environment to create a series of machine learning models
Project proposal
GROW WITH GOOGLE CAREER CERTIFICATE
● How can you best prepare to understand and organize the provided information?
● What follow-along and self-review codebooks will help you perform this work?
● What are a couple of additional activities a resourceful learner would perform before starting to code?
● What are the data columns and variables and which ones are most relevant to your deliverable?
● What units are your variables in?
● What are your initial assumptions about the data that can inform your EDA, knowing you will need to
confirm or refute them with your later findings?
● Is there any missing or incomplete data?
● Are all pieces of this dataset in the same format?
● Which EDA practices will be required to begin this project?
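The questions above about missing data, formats, and first EDA steps can be approached with a few quick checks. The following is a minimal sketch, assuming pandas is available; the DataFrame and its column names (tenure_years, monthly_hours, department) are hypothetical stand-ins, not the capstone dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the project dataset.
df = pd.DataFrame(
    {
        "tenure_years": [2.0, 5.0, np.nan, 7.0],
        "monthly_hours": [160, 210, 185, 240],
        "department": ["sales", "it", None, "sales"],
    }
)

# Count missing values in each column.
missing_counts = df.isna().sum()

# Inspect each column's data type to spot format inconsistencies.
column_types = df.dtypes

# Summary statistics for the numeric columns only (the default for describe()).
numeric_summary = df.describe()

print(missing_counts)
print(column_types)
print(numeric_summary)
```

In a real project, the toy DataFrame would be replaced by the loaded dataset, and the results would feed your answers about incomplete data and inconsistent formats.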
● Will the available information be sufficient to achieve the goal based on your intuition and the analysis
of the variables?
● What steps need to be taken to perform EDA in the most effective way to achieve the project goal?
● Do you need to add more data using the EDA practice of joining? What type of structuring needs to
be done to this dataset, such as filtering, sorting, etc.?
● What initial assumptions do you have about the types of visualizations that might best be suited for
the intended audience?
● What are some purposes of EDA before constructing a multiple linear regression model?
● Do you have any ethical considerations in this stage?
● What am I trying to solve? Does it still work? Does the plan need revising?
● Does the data violate the assumptions of the model? Is that acceptable or not?
● Why did you select the X variables you did?
● What are some purposes of EDA before constructing a model?
● What has the EDA told you?
● What resources do you find yourself using as you complete this stage?
● Do you have any ethical considerations in this stage?
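One purpose of EDA before fitting a multiple linear regression model is to check whether predictor variables are strongly correlated with one another, which would strain the no-multicollinearity assumption. The sketch below illustrates one such check on synthetic data; the column names x1, x2, x3 and the 0.8 cutoff are assumptions chosen for illustration, not values from the project.

```python
import numpy as np
import pandas as pd

# Synthetic predictors: x2 is built to be nearly collinear with x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
df = pd.DataFrame(
    {
        "x1": x1,
        "x2": 2.0 * x1 + rng.normal(scale=0.1, size=200),
        "x3": rng.normal(size=200),
    }
)

# Pairwise Pearson correlations between predictors.
corr = df.corr()

# Assumed cutoff for flagging a pair as highly correlated.
threshold = 0.8

# Collect each unordered pair once (upper triangle of the matrix).
high_pairs = [
    (a, b, corr.loc[a, b])
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) > threshold
]

for a, b, r in high_pairs:
    print(f"{a} and {b} are highly correlated (r = {r:.3f})")
```

A flagged pair does not by itself settle which variable to drop; it is a prompt to revisit why each X variable was selected.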
● What data visualizations, machine learning algorithms, or other data outputs will need to be built in
order to complete the project goals?
● What processes need to be performed in order to build the necessary data visualizations?
● Which variables are most applicable for the visualizations in this data project?
● Going back to the Plan stage, how do you plan to deal with the missing data (if any)?
● How did you formulate your null hypothesis and alternative hypothesis?
● What conclusion can be drawn from the hypothesis test?
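To make the hypothesis questions above concrete, here is a minimal sketch of a two-sample permutation test for a difference in means. This technique is swapped in because it needs only the Python standard library; the project may call for a different test. The function name permutation_test, the sample values, and the 0.05 significance level are all hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Null hypothesis: both groups come from the same distribution.
    Alternative hypothesis: the group means differ.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical samples for two groups.
group_a = [10.1, 9.8, 10.4, 10.0, 9.9, 10.2]
group_b = [12.0, 11.7, 12.3, 11.9, 12.1, 12.2]

p_value = permutation_test(group_a, group_b)
alpha = 0.05  # assumed significance level
reject_null = p_value < alpha
print(f"p-value = {p_value:.4f}, reject null: {reject_null}")
```

The conclusion is stated in terms of the null hypothesis: a p-value below the chosen significance level means the null is rejected, and otherwise the test fails to reject it.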
● Given your current knowledge of the data, what would you initially recommend to your manager to
investigate further prior to performing an exploratory data analysis?
● Given what you know about the data and the visualizations you were using, what other questions
could you research for the team?
● Do you think your model could be improved? Why or why not? How?
● Were there any features that were not important at all? What if you take them out?
● Given what you know about the data and the models you were using, what other questions could you
address for the team?
● What resources do you find yourself using as you complete this stage?
● Is my model ethical?
● When my model makes a mistake, what is happening? How does that translate to my use case?
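When reflecting on what happens when a model makes a mistake, it can help to separate the kinds of errors it makes. The sketch below assumes a binary classification task with labels 0 and 1, which the project may or may not match; the function name error_breakdown and the label lists are hypothetical.

```python
from collections import Counter


def error_breakdown(y_true, y_pred):
    """Tally correct and incorrect predictions by type for 0/1 labels."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["true_positive"] += 1
        elif actual == 0 and predicted == 0:
            counts["true_negative"] += 1
        elif actual == 0 and predicted == 1:
            counts["false_positive"] += 1
        else:
            counts["false_negative"] += 1
    return counts


# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]

breakdown = error_breakdown(y_true, y_pred)
print(dict(breakdown))
```

Translating each error type into its real-world consequence (who is affected by a false positive versus a false negative) connects the counts back to the use case and to the ethics question.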