Dand Syllabus v7 Terms 1

This document outlines the syllabus for a Data Analyst Nanodegree program. The 6-month program is divided into two 3-month terms focused on data analysis with Python/SQL and practical statistics. Students will complete introductory and multi-week projects on topics like weather trends, US bikeshare data, investigating datasets, analyzing experiment results, and more. Supporting lessons introduce programming concepts in Python, data analysis processes, and tools like Pandas, NumPy, and SQL. The program aims to teach skills for organizing data, finding patterns/insights, communicating findings, and gaining skills for a data analyst job.

Uploaded by Ahsen Majid
Copyright: © All Rights Reserved
Data​ ​Analyst​ ​Nanodegree​ ​Syllabus 


Discover​ ​Insights​ ​from​ ​Data​ ​with​ ​Python,​ ​R,​ ​SQL,​ ​and​ ​Tableau  
 
 

Before​ ​You​ ​Start 


Prerequisites​:​ ​In​ ​order​ ​to​ ​succeed​ ​in​ ​this​ ​program,​ ​we​ ​recommend​ ​having​ ​experience​ ​working​ ​with​ ​data​ ​in 
SQL​ ​and/or​ ​a​ ​spreadsheet​ ​tool​ ​like​ ​Microsoft​ ​Excel.​ ​You​ ​should​ ​also​ ​have​ ​a​ ​good​ ​understanding​ ​of 
descriptive​ ​statistics,​ ​including​ ​how​ ​to​ ​calculate​ ​and​ ​interpret​ ​measures​ ​of​ ​center​ ​(mean,​ ​median,​ ​mode); 
measures​ ​of​ ​spread​ ​(standard​ ​deviation,​ ​5-number​ ​summary);​ ​and​ ​build​ ​bar​ ​charts,​ ​histograms,​ ​boxplots, 
and​ ​scatterplots.  
 
Educational​ ​Objectives​:​ ​Learn​ ​to​ ​organize​ ​data,​ ​uncover​ ​patterns​ ​and​ ​insights,​ ​draw​ ​meaningful 
conclusions,​ ​and​ ​clearly​ ​communicate​ ​critical​ ​findings.​ ​Learn​ ​to​ ​use​ ​Python,​ ​R,​ ​SQL​ ​and​ ​Tableau.​ ​Gain​ ​all​ ​the 
skills​ ​necessary​ ​to​ ​get​ ​a​ ​job​ ​as​ ​a​ ​data​ ​analyst. 
 
Program​ ​Design 
Length​ ​of​ ​Program​*:​ ​The​ ​program​ ​is​ ​divided​ ​into​ ​two​ ​terms​ ​of​ ​three​ ​months​ ​each​ ​(approx.​ ​13​ ​weeks).​ ​We 
expect​ ​students​ ​to​ ​work​ ​10​ ​hours/week​ ​on​ ​average.​ ​Estimated​ ​time​ ​commitment​ ​is​ ​130​ ​hours​ ​per​ ​term. 
Textbooks​ ​required​:​ ​None 
Instructional​ ​Tools​ ​Available​:​ ​Video​ ​lectures,​ ​personalized​ ​project​ ​reviews,​ ​live​ ​chat​ ​help,​ ​dedicated​ ​mentor 
 
*The​ ​length​ ​is​ ​an​ ​estimation​ ​of​ ​total​ ​hours​ ​the​ ​average​ ​student​ ​may​ ​take​ ​to​ ​complete​ ​all​ ​required 
coursework,​ ​including​ ​lecture​ ​and​ ​project​ ​time.​ ​Actual​ ​hours​ ​may​ ​vary. 
 

TERM​ ​1:​ ​DATA​ ​ANALYSIS​ ​WITH​ ​PYTHON​ ​AND​ ​SQL 

Intro​ ​Project:​ ​Explore​ ​Weather​ ​Trends​ ​(5​ ​hrs) 


This​ ​project​ ​will​ ​introduce​ ​you​ ​to​ ​the​ ​key​ ​steps​ ​of​ ​the​ ​data​ ​analysis​ ​process.​ ​You’ll​ ​do​ ​so​ ​by​ ​analyzing​ ​data 
from​ ​a​ ​bike​ ​share​ ​company​ ​found​ ​in​ ​the​ ​San​ ​Francisco​ ​Bay​ ​Area.​ ​You’ll​ ​submit​ ​this​ ​project​ ​in​ ​your​ ​first​ ​7 
days,​ ​and​ ​by​ ​the​ ​end​ ​you’ll​ ​be​ ​able​ ​to: 
 

➔ Use​ ​basic​ ​Python​ ​code​ ​to​ ​clean​ ​a​ ​dataset​ ​for​ ​analysis 
➔ Run​ ​code​ ​to​ ​create​ ​visualizations​ ​from​ ​the​ ​wrangled​ ​data 
➔ Analyze​ ​trends​ ​shown​ ​in​ ​the​ ​visualizations​ ​and​ ​report​ ​your​ ​conclusions 
➔ Determine​ ​if​ ​this​ ​program​ ​is​ ​a​ ​good​ ​fit​ ​for​ ​your​ ​time​ ​and​ ​talents 

 
 

Project:​ ​Explore​ ​US​ ​Bikeshare​ ​Data​ ​(40​ ​hrs) 


You​ ​will​ ​use​ ​Python​ ​to​ ​perform​ ​steps​ ​of​ ​the​ ​data​ ​analysis​ ​process​ ​on​ ​bikeshare​ ​trip​ ​data​ ​collected​ ​from 
three​ ​US​ ​cities.​ ​You​ ​will​ ​write​ ​code​ ​to​ ​clean​ ​the​ ​data,​ ​compute​ ​descriptive​ ​statistics,​ ​and​ ​create​ ​basic 
visualizations​ ​of​ ​the​ ​distribution​ ​of​ ​data. 
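
As a rough illustration of the cleaning and descriptive-statistics steps described above, the sketch below uses only Python's standard library. The trip durations are made-up values, and `clean_durations` and `describe` are hypothetical helper names, not part of the course materials.

```python
import statistics

# Hypothetical trip durations in minutes. None and the negative value stand in
# for the missing or invalid records a real export might contain.
raw_durations = [12.5, 7.0, None, 33.0, 7.0, -1.0, 18.5, 25.0, 7.0, 41.0]


def clean_durations(values):
    """Drop missing and non-positive durations."""
    return [v for v in values if v is not None and v > 0]


def describe(values):
    """Return measures of center and spread for a list of numbers."""
    # Quartiles computed with the "inclusive" method, which treats the data
    # as the whole population of interest rather than a sample.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "five_number": (min(values), q1, median, q3, max(values)),
    }


durations = clean_durations(raw_durations)
summary = describe(durations)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The five-number summary and the measures of center here are the same quantities listed under Prerequisites, so the output can be checked by hand on a small list like this one.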

Supporting​ ​Lesson​ ​Content:​ ​Introduction​ ​to​ ​Python​ ​Programming 

Lesson Title  Learning Outcomes

NUMBERS AND STRINGS
➔ Learn about Python's numeric and string data types
➔ Use variables to store data
➔ Use built-in functions and methods

FUNCTIONS, INSTALLATION, AND CONDITIONALS
➔ Install Python on your computer
➔ Organize your code into functions
➔ Use conditionals to make decisions

DATA STRUCTURES AND LOOPS
➔ Use collection data types: lists, sets, and dictionaries
➔ Write `for` and `while` loops to express repetition
➔ Practice refactoring and problem solving

FILES AND MODULES
➔ Use modules from the Python standard library and from third-party libraries
➔ Read data from files on disk
➔ Use online resources to help solve problems
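
The lesson outcomes above (functions, conditionals, collection types, loops, and reading files with standard-library modules) can be sketched together in one short script. Everything concrete here is an assumption made for illustration: the column names, the station names, the 20-minute cutoff, and the helper names `classify_trip` and `summarize`. An in-memory `io.StringIO` object stands in for a file on disk so the example runs as-is.

```python
import csv
import io
from collections import Counter

# Hypothetical trip records; io.StringIO stands in for a CSV file on disk.
SAMPLE_CSV = """start_station,user_type,duration_min
Market St,Subscriber,12
Harbor Rd,Customer,45
Market St,Subscriber,8
Market St,Customer,30
Harbor Rd,Subscriber,15
"""


def classify_trip(duration_min):
    """Label a trip using a conditional; the 20-minute cutoff is arbitrary."""
    if duration_min < 20:
        return "short"
    return "long"


def summarize(file_obj):
    """Count trips per start station and per length label."""
    stations = Counter()                 # dictionary-like counter
    lengths = {"short": 0, "long": 0}    # plain dictionary
    for row in csv.DictReader(file_obj):  # loop over the rows of the file
        stations[row["start_station"]] += 1
        lengths[classify_trip(int(row["duration_min"]))] += 1
    return stations, lengths


stations, lengths = summarize(io.StringIO(SAMPLE_CSV))
print(stations.most_common(1))
print(lengths)
```

To read a real file instead, the same `summarize` function could be passed the object returned by `open(path, newline="")`.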

Project:​ ​Investigate​ ​a​ ​Dataset​ ​(40​ ​hrs) 


In​ ​this​ ​project,​ ​you’ll​ ​choose​ ​one​ ​of​ ​Udacity's​ ​curated​ ​datasets​ ​and​ ​investigate​ ​it​ ​using​ ​NumPy​ ​and​ ​pandas. 
You’ll​ ​complete​ ​the​ ​entire​ ​data​ ​analysis​ ​process,​ ​starting​ ​by​ ​posing​ ​a​ ​question​ ​and​ ​finishing​ ​by​ ​sharing​ ​your 
findings. 

Supporting​ ​Lesson​ ​Content:​ ​Introduction​ ​to​ ​Data​ ​Analysis 

Lesson Title  Learning Outcomes

Data Analysis in Python

DATA ANALYSIS PROCESS
➔ Learn about the key steps of the data analysis process
➔ Investigate multiple datasets using Python and Pandas

PANDAS AND NUMPY: CASE STUDY 1
➔ Perform the entire data analysis process on a dataset
➔ Learn to use NumPy and Pandas to wrangle, explore, analyze, and visualize data

PANDAS AND NUMPY: CASE STUDY 2
➔ Perform the entire data analysis process on a dataset
➔ Learn more about NumPy and Pandas to wrangle, explore, analyze, and visualize data

Introduction to SQL

BASIC SQL
➔ Write common SQL commands including SELECT, FROM, and WHERE, as well as corresponding logical operators

SQL JOINS
➔ Write JOINs in SQL to combine data from multiple sources and answer more complex business questions

SQL AGGREGATIONS
➔ Write common aggregations in SQL including COUNT, SUM, MIN, and MAX
➔ Write CASE and DATE functions, as well as work with NULLs

ADVANCED SQL QUERIES
➔ Edit a database using CREATE TABLE, INSERT INTO, UPDATE, and other statements
➔ Use window functions and subqueries to add steps to a query
➔ Use documentation to learn new functions and complete complex tasks
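
To make the SQL outcomes above more concrete, here is a minimal sketch that runs a query from Python using the standard-library `sqlite3` module and an in-memory database. The `accounts` and `orders` tables, their columns, and the sample rows are hypothetical, and SQLite is an assumption of this example; the syllabus does not say which database engine the lessons use. The query combines SELECT, FROM, WHERE, a JOIN, aggregations, a CASE expression, and a NULL check.

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, total REAL);
INSERT INTO accounts VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 250.0), (3, 2, 75.0), (4, 2, NULL);
""")

# Join orders to accounts, skip orders with a NULL total, then aggregate
# per account and label each account with a CASE expression.
query = """
SELECT a.name,
       COUNT(o.total) AS n_orders,
       SUM(o.total)   AS revenue,
       MAX(o.total)   AS largest,
       CASE WHEN SUM(o.total) >= 200 THEN 'large' ELSE 'small' END AS tier
FROM accounts AS a
JOIN orders AS o ON o.account_id = a.id
WHERE o.total IS NOT NULL
GROUP BY a.name
ORDER BY revenue DESC;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

The 200 threshold in the CASE expression is arbitrary. In a real analysis it would come from the business question being asked.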

Project:​ ​Analyze​ ​Experiment​ ​Results​ ​(45​ ​hrs) 


In this project, you will be provided a dataset reflecting data collected from an experiment. You’ll use
statistical techniques to answer questions about the data and present your conclusions and
recommendations in a report.

Supporting​ ​Lesson​ ​Content:​ ​Practical​ ​Statistics 

Lesson Title  Learning Outcomes

STANDARDIZING
➔ Convert distributions into the standard normal distribution using the Z-score
➔ Compute proportions using standardized distributions

NORMAL DISTRIBUTION
➔ Use normal distributions to compute probabilities
➔ Use the Z-table to look up the proportions of observations above, below, or in between values

SAMPLING DISTRIBUTIONS
➔ Apply the concepts of probability and normalization to sample data sets

ESTIMATION
➔ Estimate population parameters from sample statistics using confidence intervals

HYPOTHESIS TESTING
➔ Use critical values to make decisions on whether or not a treatment has changed the value of a population parameter

T-TESTS
➔ Test the effect of a treatment or compare the difference in means for two groups when we have small sample sizes

REGRESSION
➔ Build a linear regression model to understand the relationship between independent and dependent variables
➔ Use linear regression results to make a prediction
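
Several of the outcomes above (Z-scores, proportions under the normal curve, and confidence intervals) can be sketched with Python's standard-library `statistics.NormalDist`, which plays the role of a Z-table here. The population parameters and the sample are made-up values, and `z_score` and `mean_confidence_interval` are hypothetical helper names. The interval uses a normal approximation, which is an assumption of this sketch. For small samples, a t-based interval would usually be more appropriate.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical population: scores with mean 70 and standard deviation 10.
population = NormalDist(mu=70, sigma=10)
standard = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_score(x, dist):
    """Number of standard deviations x lies above the mean of dist."""
    return (x - dist.mean) / dist.stdev


z = z_score(85, population)                   # (85 - 70) / 10
below = standard.cdf(z)                       # proportion of scores below 85
above = 1 - below                             # proportion above 85
between = standard.cdf(1) - standard.cdf(-1)  # within one standard deviation


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z_crit = standard.inv_cdf(0.5 + confidence / 2)
    center = mean(sample)
    margin = z_crit * stdev(sample) / math.sqrt(len(sample))
    return center - margin, center + margin


# Hypothetical sample of ten scores.
sample = [68, 72, 75, 71, 69, 74, 70, 73, 66, 72]
low, high = mean_confidence_interval(sample)

print(f"z = {z:.2f}, below = {below:.4f}, above = {above:.4f}")
print(f"within one standard deviation: {between:.4f}")
print(f"95% interval for the mean: ({low:.2f}, {high:.2f})")
```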

TERM​ ​2:​ ​ADVANCED​ ​DATA​ ​ANALYSIS  

Intro​ ​Project:​ ​Test​ ​a​ ​Perceptual​ ​Phenomenon​ ​(10​ ​hrs) 


In​ ​this​ ​project,​ ​you’ll​ ​use​ ​descriptive​ ​statistics​ ​and​ ​a​ ​statistical​ ​test​ ​to​ ​analyze​ ​the​ ​Stroop​ ​effect,​ ​a​ ​classic 
result​ ​of​ ​experimental​ ​psychology.​ ​Communicate​ ​your​ ​understanding​ ​of​ ​the​ ​data​ ​and​ ​use​ ​statistical 
inference​ ​to​ ​draw​ ​a​ ​conclusion​ ​based​ ​on​ ​the​ ​results. 
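
One common way to test an effect like this is a paired (dependent-samples) t-test, since each participant is timed under both conditions. The sketch below is an assumption about how the analysis might look, not the project's prescribed method. The completion times are invented, `paired_t_statistic` is a hypothetical helper, and the critical value is read from a standard t-table for a two-tailed test at alpha = 0.05 with 7 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Hypothetical completion times in seconds for eight participants,
# each measured under both conditions.
congruent = [12.1, 16.8, 9.6, 8.6, 14.7, 12.2, 11.3, 14.5]
incongruent = [19.3, 18.7, 21.2, 15.7, 22.8, 20.9, 17.4, 22.1]


def paired_t_statistic(first, second):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


t_stat, dof = paired_t_statistic(congruent, incongruent)

# Two-tailed critical value for alpha = 0.05 and 7 degrees of freedom,
# taken from a t-table (valid only for this sample size).
T_CRITICAL = 2.365
reject_null = abs(t_stat) > T_CRITICAL

print(f"t({dof}) = {t_stat:.2f}, reject null hypothesis: {reject_null}")
```

With a different number of participants, the degrees of freedom change and the critical value has to be looked up again.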

Supporting​ ​Lesson​ ​Content:​ ​Practical​ ​Statistics 

Project:​ ​Wrangle​ ​and​ ​Analyze​ ​Data​ ​(50​ ​hrs) 


Real-world​ ​data​ ​rarely​ ​comes​ ​clean.​ ​Using​ ​Python,​ ​you'll​ ​gather​ ​data​ ​from​ ​a​ ​variety​ ​of​ ​sources,​ ​assess​ ​its 
quality​ ​and​ ​tidiness,​ ​then​ ​clean​ ​it.​ ​You'll​ ​document​ ​your​ ​wrangling​ ​efforts​ ​in​ ​a​ ​Jupyter​ ​Notebook,​ ​plus 
showcase​ ​them​ ​through​ ​analyses​ ​and​ ​visualizations​ ​using​ ​Python​ ​and​ ​SQL. 

Supporting​ ​Lesson​ ​Content:​ ​Data​ ​Wrangling 

Lesson Title  Learning Outcomes

INTRO TO DATA WRANGLING
➔ Identify each step of the data wrangling process (gathering, assessing, and cleaning)
➔ Wrangle a CSV file downloaded from Kaggle using fundamental gathering, assessing, and cleaning code

GATHERING DATA
➔ Gather data from multiple sources, including gathering files, programmatically downloading files, web-scraping data, and accessing data from APIs
➔ Import data of various file formats into pandas, including flat files (e.g. TSV), HTML files, TXT files, and JSON files
➔ Store gathered data in a PostgreSQL database

ASSESSING DATA
➔ Assess data visually and programmatically using pandas
➔ Distinguish between dirty data (content or “quality” issues) and messy data (structural or “tidiness” issues)
➔ Identify data quality issues and categorize them using metrics: validity, accuracy, completeness, consistency, and uniformity

CLEANING DATA
➔ Identify each step of the data cleaning process (defining, coding, and testing)
➔ Clean data using Python and pandas
➔ Test cleaning code visually and programmatically using Python
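
The gather, assess, and clean steps listed above, along with the define, code, and test pattern for cleaning, might look roughly like the following. The lessons use pandas, but this sketch deliberately sticks to the standard library so it runs without third-party packages. The data, the column names, and the cleaning rules are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical raw data with quality issues: inconsistent capitalization,
# a missing age, an exact duplicate row, and an invalid (negative) age.
RAW = """name,city,age
Ana,Boston,34
ben,boston,
Cy,Chicago,29
Ana,Boston,34
Dee,CHICAGO,-5
"""


def gather(text):
    """Gather: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def assess(rows):
    """Assess: count quality issues programmatically."""
    seen = set()
    report = {"duplicates": 0, "missing_age": 0, "invalid_age": 0}
    for row in rows:
        key = tuple(row.values())
        if key in seen:
            report["duplicates"] += 1
        seen.add(key)
        if row["age"] == "":
            report["missing_age"] += 1    # completeness issue
        elif int(row["age"]) < 0:
            report["invalid_age"] += 1    # validity issue
    return report


def clean(rows):
    """Clean.

    Define: drop exact duplicates, title-case names and cities, and store
    age as an int, using None for missing or negative values.
    """
    cleaned, seen = [], set()
    for row in rows:
        key = tuple(row.values())
        if key in seen:
            continue
        seen.add(key)
        age = int(row["age"]) if row["age"] else None
        if age is not None and age < 0:
            age = None
        cleaned.append({
            "name": row["name"].title(),
            "city": row["city"].title(),
            "age": age,
        })
    return cleaned


rows = gather(RAW)
report = assess(rows)
tidy = clean(rows)

# Test: check programmatically that the cleaning rules held.
assert len(tidy) == len(rows) - report["duplicates"]
assert all(r["age"] is None or r["age"] >= 0 for r in tidy)
print(report)
print(tidy)
```

Whether to drop, flag, or impute missing and invalid values depends on the analysis. Replacing them with `None` is only one option.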

Project:​ ​Explore​ ​and​ ​Summarize​ ​Data​ ​(50​ ​hrs) 


In​ ​this​ ​project,​ ​you’ll​ ​use​ ​R​ ​and​ ​apply​ ​exploratory​ ​data​ ​analysis​ ​techniques​ ​to​ ​explore​ ​a​ ​selected​ ​data​ ​set​ ​for 
distributions,​ ​outliers,​ ​and​ ​anomalies. 

Supporting​ ​Lesson​ ​Content:​ ​Data​ ​Analysis​ ​with​ ​R 

Lesson Title  Learning Outcomes

WHAT IS EDA?
➔ Define and identify the importance of exploratory data analysis (EDA)

R BASICS
➔ Install RStudio and packages
➔ Write basic R scripts to inspect datasets

EXPLORE ONE VARIABLE
➔ Quantify and visualize individual variables within a dataset
➔ Create histograms and boxplots
➔ Transform variables
➔ Examine and identify tradeoffs in visualizations

EXPLORE TWO VARIABLES
➔ Properly apply relevant techniques for exploring the relationship between any two variables in a data set
➔ Create scatter plots
➔ Calculate correlations
➔ Investigate conditional means

EXPLORE MANY VARIABLES
➔ Reshape data frames and use aesthetics like color and shape to uncover information

DIAMONDS AND PRICE PREDICTIONS
➔ Use predictive modeling to determine a good price for a diamond
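
The two-variable outcomes above, calculating correlations and investigating conditional means, are language-independent ideas. The course teaches them in R. The sketch below uses Python only to stay consistent with the other examples in this document. The diamond records are invented, and `pearson_r` and `conditional_means` are hypothetical helper names.

```python
import math
from collections import defaultdict
from statistics import mean

# Hypothetical diamonds: (carat, cut, price in dollars).
diamonds = [
    (0.3, "Ideal", 600), (0.5, "Good", 1400), (0.7, "Ideal", 2900),
    (0.9, "Good", 3800), (1.0, "Ideal", 6000), (1.2, "Good", 6500),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - my) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def conditional_means(records):
    """Mean price conditional on cut."""
    groups = defaultdict(list)
    for _, cut, price in records:
        groups[cut].append(price)
    return {cut: mean(prices) for cut, prices in groups.items()}


carats = [carat for carat, _, _ in diamonds]
prices = [price for _, _, price in diamonds]

r = pearson_r(carats, prices)
by_cut = conditional_means(diamonds)

print(f"correlation between carat and price: {r:.3f}")
print(by_cut)
```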

 
 

Project:​ ​Create​ ​a​ ​Tableau​ ​Story​ ​(20​ ​hrs) 


In​ ​this​ ​project,​ ​you’ll​ ​create​ ​a​ ​data​ ​visualization,​ ​using​ ​Tableau,​ ​from​ ​a​ ​data​ ​set​ ​that​ ​tells​ ​a​ ​story​ ​or​ ​highlights 
trends​ ​or​ ​patterns​ ​in​ ​the​ ​data.​ ​Your​ ​work​ ​should​ ​be​ ​a​ ​reflection​ ​of​ ​the​ ​theory​ ​and​ ​practice​ ​of​ ​data 
visualization,​ ​harnessing​ ​visual​ ​encodings​ ​and​ ​design​ ​principles​ ​for​ ​effective​ ​communication. 

Supporting​ ​Lesson​ ​Content:​ ​Data​ ​Visualization​ ​with​ ​Tableau 

Lesson Title  Learning Outcomes

DATA VISUALIZATION FUNDAMENTALS
➔ Understand the importance of data visualization
➔ Know how different data types are encoded in visualizations

DESIGN PRINCIPLES
➔ Select the most effective chart or graph based on the data being displayed
➔ Use color, shape, size, and other elements effectively

CREATING VISUALIZATIONS WITH TABLEAU
➔ Become proficient in basic Tableau functionality, including charts, filters, hierarchies, etc.
➔ Create calculated fields in Tableau

TELLING STORIES WITH TABLEAU
➔ Create Tableau dashboards and stories to effectively communicate data