If You Want To Learn Data Science, Start With One

Uploaded by

ismail ghmiriss
Copyright
© All Rights Reserved

Published in We’ve moved to freeCodeCamp.org/news

David Venturi

Sep 26, 2016 · 14 min read

If you want to learn Data Science, start with one of these programming classes

A year ago, I was a numbers geek with no coding background. After trying an online programming course, I was so inspired that I enrolled in one of the best computer science programs in Canada.

Two weeks later, I realized that I could learn everything I needed through edX, Coursera, and Udacity instead. So I dropped out.

The decision was not difficult. I could learn the content I wanted faster, more efficiently, and for a fraction of the cost.

I already had a university degree and, perhaps more importantly, I already had the university experience. Paying $30K+ to go back to school seemed irresponsible.

Shortly afterwards, I started creating my own data science master’s degree using online courses, after realizing it was a better fit for me than computer science. I scoured the introduction to programming landscape. I’ve already taken several courses and audited portions of many others. I know the options, and what skills are needed if you’re targeting a data analyst or data scientist role.

For this guide, I spent 20+ hours trying to find every single online introduction to programming course offered as of August 2016, extracting key bits of information from their syllabi and reviews, and compiling their ratings. For this task, I turned to none other than the open source Class Central community and its database of thousands of course ratings and reviews.

Class Central’s homepage.

Since 2011, Class Central founder Dhawal Shah has kept a closer eye on online courses than arguably anyone else in the world. Dhawal personally helped me assemble this list of resources.

Quick interlude
Hey, it’s David. I wrote this guide back in
2016. Since then, I’ve become a professional
data analyst and created courses for multiple
industry-leading online education companies.

Do you want to become a data analyst, without spending 4 years and $41,762 to go to university? Follow my latest 27-day curriculum and learn alongside other aspiring data pros. My top programming course recommendation for 2023 is in there, too.

Data Maverick: Initiation
Learn analytics skills with friends using the… internet's best resources.
datamaverickhq.com

Okay, back to the guide.

How we picked courses to consider
Each course had to fit four criteria:

It introduces programming and, optionally, computer science. See “A note on Programming vs. Computer Science” below.

The language of instruction is Python or R. These are by far the two most popular programming languages used in data science.

It must be an interactive online course, so no books or text-based tutorials. Regarding the latter, Codecademy’s video-less and text editor-based courses would qualify, but strict text tutorials like the ones from R tutorial would not. Though books are viable ways to learn programming, Python, and R, this guide focuses on courses.

It must be a decent length: at least ten hours in total for estimated completion.

Python and R are the two most popular programming languages used in data science.

How we evaluated courses
We believe we covered every notable course that fits the above criteria. Since there are seemingly hundreds of Python and R courses on Udemy, we chose to consider only the most reviewed and highest rated ones. There is a chance we missed something, however. Please let us know if you think that is the case.

We compiled the average rating and number of reviews from Class Central and other review sites. We calculated a weighted average rating for each course. If a series had multiple courses (like Rice University’s Part 1 and Part 2), we calculated the weighted average rating across all courses. We also read text reviews and used this feedback to supplement the numerical ratings.
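
As a rough illustration of the weighting step, the sketch below assumes that each review site's average rating is weighted by its number of reviews. The guide does not spell out the exact formula, so treat this as one plausible reading. The function name and every number in the example are made up for illustration and do not describe any real course.

```python
def weighted_average_rating(site_ratings):
    """Return the review-count-weighted mean of (average_rating, review_count) pairs.

    Assumption: each site's average is weighted by its number of reviews.
    """
    total_reviews = sum(count for _, count in site_ratings)
    if total_reviews == 0:
        raise ValueError("at least one review is needed to compute a rating")
    weighted_sum = sum(rating * count for rating, count in site_ratings)
    return weighted_sum / total_reviews


# Hypothetical numbers: one course rated on two review sites.
part_1 = [(4.5, 100), (5.0, 300)]
course_rating = weighted_average_rating(part_1)

# For a multi-part series, pool the (rating, count) pairs of every part.
part_2 = [(4.0, 50)]
series_rating = weighted_average_rating(part_1 + part_2)

print(round(course_rating, 2), round(series_rating, 2))
```

Weighting by review count keeps a site with a handful of reviews from moving the result as much as a site with hundreds.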

We made subjective syllabus judgment calls based on three factors:

1. Coverage of the fundamentals of programming.

2. Coverage of more advanced, but useful, topics in programming. (E.g. several courses choose to not cover object-oriented programming. We believe this is a key topic, though not a deal-breaker, so these courses were docked marks rather than excluded from consideration.)

3. How much of the syllabus is relevant to data science?

A note on Programming vs. Computer Science
Programming is not computer science and vice versa. There is a difference of which beginners may not be acutely aware. Borrowing this answer from Programmers Stack Exchange:

“Computer science is the study of what computers [can] do; programming is the practice of making computers do things.”

The course we are looking for introduces programming and optionally touches on relevant aspects of computer science that would benefit a new programmer in terms of awareness. Many of the courses considered, you’ll notice, do indeed have a computer science portion.

None of the courses, however, are strictly computer science courses, which is why something like Harvard’s CS50x on edX is excluded.

Our pick for the best programming course for data scientists is…
University of Toronto’s “Learn to Program” series on Coursera. LTP1: The Fundamentals and LTP2: Crafting Quality Code have a near-perfect weighted average rating of 4.71 out of 5 stars over 284 reviews. They also have a great mix of content difficulty and scope for the beginner data scientist.

This free, Python-based introduction to programming sets itself apart from the other 20+ courses we considered.

Part 2 of the University of Toronto’s “Learn to Program” series.

Jennifer Campbell and Paul Gries, two associate professors in the University of Toronto’s department of computer science (which is regarded as one of the best in the world), teach the series. The self-paced, self-contained Coursera courses match the material in their book, “Practical Programming: An Introduction to Computer Science Using Python 3.” LTP1 covers 40–50% of the book and LTP2 covers another 40%. The 10–20% not covered is not particularly useful for data science, which helped their case for being our pick.

Your “Learn to Program” instructors: Jennifer Campbell and Paul Gries.

The professors kindly and promptly sent me detailed course syllabi upon request, which were difficult to find online prior to the course’s official restart in September 2016.

Learn to Program: The Fundamentals (LTP1)

Timeline: 7 weeks

Estimated time commitment: 6–8 hours per week

This course provides an introduction to computer programming intended for people with no programming experience. It covers the basics of programming in Python including elementary data types (numeric types, strings, lists, dictionaries, and files), control flow, functions, objects, methods, fields, and mutability.

Modules

1. Installing Python, IDLE, mathematical expressions, variables, assignment statements, calling and defining functions, and syntax and semantic errors.

2. Strings, input/output, function reuse, function design recipe, and docstrings.

3. Booleans, import, namespaces, and if statements.

4. For loops and fancy string manipulation.

5. While loops, lists, and mutability.

6. For loops over indices, parallel lists and strings, and files.

7. Tuples and dictionaries.
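
To give a flavor of the topics in this list, here is a minimal sketch of two small functions whose docstrings include example calls, in the spirit of a function design recipe. It is not taken from the course materials; the function names and examples are illustrative assumptions.

```python
def convert_to_celsius(fahrenheit: float) -> float:
    """Return the Celsius equivalent of a temperature in degrees Fahrenheit.

    >>> convert_to_celsius(32)
    0.0
    >>> convert_to_celsius(212)
    100.0
    """
    return (fahrenheit - 32) * 5 / 9


def count_words(text: str) -> dict:
    """Return a dictionary mapping each lowercase word in text to its count.

    >>> count_words("the cat and the hat")
    {'the': 2, 'cat': 1, 'and': 1, 'hat': 1}
    """
    counts = {}
    # A for loop over the words, updating a dictionary (a mutable type).
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts
```

If the code is saved to a file, the examples in the docstrings can be checked with Python's built-in doctest module, for instance by running `python -m doctest -v` on that file.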

Learn to Program: Crafting Quality Code (LTP2)

Timeline: 5 weeks

Estimated time commitment: 6–8 hours per week

You know the basics of programming in Python: elementary data types (numeric types, strings, lists, dictionaries, and files), control flow, functions, objects, methods, fields, and mutability. You need to be good at these in order to succeed in this course.

LTP: Crafting Quality Code covers the next steps: designing larger programs, testing your code so that you know it works, reading code in order to understand how efficient it is, and creating your own types.

Modules

1. Designing algorithms: how do you decide what to do in a function body? How do you figure out what functions to write in the first place?

2. Automated testing: doctest and unittest.

3. Analyzing code for speed: details of searching and sorting.

4. Creating new types: classes in Python.

5. Functions as arguments, default parameter values, and exceptions.
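
As a hedged illustration of a few of these module topics (automated testing with unittest, functions as arguments, default parameter values, and exceptions), here is a small sketch. It is not from the course; `apply_to_all` and `TestApplyToAll` are hypothetical names chosen for the example.

```python
import unittest


def apply_to_all(values, func=abs):
    """Return a new list with func applied to each item in values.

    func is a function passed as an argument; it defaults to abs.

    >>> apply_to_all([-1, 2, -3])
    [1, 2, 3]
    """
    if not callable(func):
        raise TypeError("func must be callable")
    return [func(value) for value in values]


class TestApplyToAll(unittest.TestCase):
    def test_default_function(self):
        self.assertEqual(apply_to_all([-1, 2, -3]), [1, 2, 3])

    def test_custom_function(self):
        self.assertEqual(apply_to_all([1, 2], lambda x: x * 2), [2, 4])

    def test_non_callable_raises(self):
        with self.assertRaises(TypeError):
            apply_to_all([1], func=3)
```

Saved to a file, the test class can be run with `python -m unittest`, which discovers and runs each `test_` method.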

Associate professor Gries also provided the following commentary on the course structure: “Each module has between about 45 minutes to a bit more than an hour of video. There are in-video quiz questions, which will bring the total time spent studying the videos to perhaps 2 hours.”

These videos are generally shorter than ten minutes each.

He continued: “In addition, we have one exercise (a dozen or two or so multiple choice and short-answer questions) per module, which should take an hour or two. There are three programming assignments in LTP1, each of which might take four to eight hours of work. There are two programming assignments in LTP2 of similar size.”

He emphasized that the estimate of 6–8 hours per week is a rough guess: “Estimating time spent is incredibly student-dependent, so please take my estimates in that context. For example, someone who knows a bit of programming, perhaps in another programming language, might take half the time of someone completely new to programming. Sometimes someone will get stuck on a concept for a couple of hours, while they might breeze through on other concepts … That’s one of the reasons the self-paced format is so appealing to us.”

In total, the University of Toronto’s Learn to Program series runs an estimated 12 weeks at 6–8 hours per week, which is about standard for most online courses created by universities. If you prefer to binge-study your MOOCs, that’s 72–96 hours, which could feasibly be completed in two to three weeks, especially if you have a bit of programming experience.
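
The arithmetic behind those totals can be sketched in a few lines. The week counts and the hourly range come from the course listings above; the weekly workload for a two- or three-week binge is derived here and is not a figure stated by the instructors.

```python
ltp1_weeks, ltp2_weeks = 7, 5
hours_per_week = (6, 8)  # low and high estimates

total_weeks = ltp1_weeks + ltp2_weeks
total_hours = tuple(total_weeks * hours for hours in hours_per_week)

# Weekly workload needed to finish the whole series in 2 or 3 weeks.
binge_load = {
    weeks: tuple(hours / weeks for hours in total_hours)
    for weeks in (2, 3)
}

print(total_weeks, total_hours)  # 12 (72, 96)
print(binge_load)                # {2: (36.0, 48.0), 3: (24.0, 32.0)}
```

In other words, finishing in two to three weeks means studying roughly 24 to 48 hours per week.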

Another great Python option
If you already have some familiarity with programming, and don’t mind a syllabus that has a notable skew towards games and interactive applications, I would also recommend Rice University’s An Introduction to Interactive Programming in Python (Part 1 and Part 2) on Coursera.

With 6,000+ reviews and the highest weighted average rating of 4.93/5 stars, this popular course is noted for its engaging videos, challenging quizzes, and enjoyable mini projects. Compared with our #1 pick, it’s slightly more difficult, focuses less on the fundamentals, and spends more time on topics that aren’t applicable in data science.

These courses are also part of the 7-course Principles in Computing Specialization on Coursera.

CodeSkulptor: Browser-based Python programming environment used for Rice University’s MOOCs.

The materials are self-paced and free, and a paid certificate is available. The course must be purchased for $79 (USD) for access to graded materials.

Rice University’s Coursera page.

The condensed course description and full syllabus are as follows:

“This two-part course is designed to help students with very little or no computing background learn the basics of building simple interactive applications … To make learning Python easy, we have developed a new browser-based programming environment that makes developing interactive applications in Python simple. These applications will involve windows whose contents are graphical and respond to buttons, the keyboard, and the mouse.

Recommended background: A knowledge of high school mathematics is required. While the class is designed for students with no prior programming experience, some beginning programmers have viewed the class as being fast-paced. For students interested in some light preparation prior to the start of class, we recommend a self-paced Python learning site such as codecademy.com.”

Part 1
Timeline: 5 weeks

Estimated time commitment: 7–10 hours per week

Week 0: statements, expressions, variables
Understand the structure of this class, and explore Python as a calculator.

Week 1: functions, logic, conditionals
Learn the basic constructs of Python programming, and create a program that plays a variant of Rock-Paper-Scissors.

Week 2: event-driven programming, local/global variables
Learn the basics of event-driven programming, understand the difference between local and global variables, and create an interactive program that plays a simple guessing game.

Week 3: canvas, drawing, timers
Create a canvas in Python, learn how to draw on the canvas, and create a digital stopwatch.

Week 4: lists, keyboard input, the basics of modeling motion
Learn the basics of lists in Python, model moving objects in Python, and recreate the classic arcade game “Pong.”

Part 2
Week 5: mouse input, list methods, dictionaries
Read mouse input, learn about list methods and dictionaries, and draw images.

Week 6: classes and object-oriented programming
Learn the basics of object-oriented programming in Python using classes, and work with tiled images.

Week 7: basic game physics, sprites
Understand the math of acceleration and friction, work with sprites, and add sound to your game.

Week 8: sets and animation
Learn about sets in Python, compute collisions between sprites, and animate sprites.
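
For a sense of what the Part 2 topics look like in code, here is a minimal sketch in plain Python that combines a class, simple friction, circle-based collision detection, and a set of sprites. It does not use CodeSkulptor's own GUI library and is not taken from the course; the `Sprite` class, `group_collide` function, and all numbers are illustrative assumptions.

```python
import math


class Sprite:
    """A circular game object with a position, a velocity, and a radius."""

    def __init__(self, pos, vel, radius):
        self.pos = list(pos)
        self.vel = list(vel)
        self.radius = radius

    def update(self, friction=0.0):
        """Move one time step, then slow down by the given friction fraction."""
        for i in range(2):
            self.pos[i] += self.vel[i]
            self.vel[i] *= 1 - friction

    def collides(self, other):
        """Return True if the two circles touch or overlap."""
        dx = self.pos[0] - other.pos[0]
        dy = self.pos[1] - other.pos[1]
        return math.hypot(dx, dy) <= self.radius + other.radius


def group_collide(group, other):
    """Remove every sprite in the set that collides with other; return how many."""
    hit = {sprite for sprite in group if sprite.collides(other)}
    group.difference_update(hit)
    return len(hit)


ship = Sprite((0, 0), (1, 0), 1)
rocks = {Sprite((1.5, 0), (0, 0), 1), Sprite((10, 10), (0, 0), 1)}
removed = group_collide(rocks, ship)
```

Storing the rocks in a set makes it straightforward to remove the ones that were hit in a single step with `difference_update`.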

If you are set on R
If you are set on an introduction to programming course in R, we recommend DataCamp’s series of R courses: Introduction to R, Intermediate R, Intermediate R - Practice, and Writing Functions in R. Though the latter three come at a price point of $25/month, DataCamp is best in category for covering the programming fundamentals and R-specific topics, which is reflected in its average rating of 4.29/5 stars.

The first three courses in DataCamp’s series of R courses.

We believe the best approach to learning programming for data science using online courses is to do it first through Python. Why? There is a lack of MOOC options that teach core programming principles and use R as the language of instruction. We found six such R courses that fit our testing criteria, compared to twenty-two Python-based courses. Most of the R courses didn’t receive great ratings and failed to meet most of our subjective testing criteria.

DataCamp’s website.

The series breakdown is as follows:

Introduction to R
Estimated time commitment: 4 hours

Chapters:

1. Intro to basics

2. Vectors

3. Matrices

4. Factors

5. Data frames

6. Lists

Intermediate R
Estimated time commitment: 6 hours

Chapters:

1. Conditionals and control flow

2. Loops

3. Functions

4. The apply family

5. Utilities

Intermediate R - Practice
Estimated time commitment: 4 hours

This follow-up course on intermediate R does not cover new programming concepts. Instead, you will strengthen your knowledge of the topics in intermediate R with a bunch of new and fun exercises.

Writing Functions in R
Estimated time commitment: 4 hours

Chapters:

1. A quick refresher

2. When and how you should write a function

3. Functional programming

4. Advanced inputs and outputs

5. Robust functions

Another option for R would be to take a Python-based introduction to programming course to cover the fundamentals of programming, and then pick up R syntax with an R basics course. This is what I did, using Udacity’s Data Analysis with R for the R part. It worked well for me.

You can also pick up R with our top recommendation for a statistics class, which teaches the basics of R through coding up stats problems.

The Competition
Our #1 and #2 picks had a 4.71 and 4.93 star
weighted average rating over 284 and 6,069
reviews, respectively. Let’s look at the other
alternatives.

Python courses (descending weighted average ratings)
Programming for Everybody (Getting Started with Python) and Python Data Structures (University of Michigan/Coursera): another great option. It has a great teacher (Dr. Charles “Chuck” Severance), as well. This series came close to usurping our #1 pick because it matched it in rating and in most of the subjective criteria. This course is more gentle, however, with reviewers noting that it might not prepare you as well as other options. Dr. Chuck himself noted that this course is a bridge to more advanced programming courses: “I would suggest that after students complete my Python course, if they are interested in more programming, that they would take the Rice course.” We also felt that the reviews for our #1 pick were more enthusiastic. It has a 4.8-star weighted average rating over 4,800+ reviews.

Python A-Z: Python For Data Science With Real Exercises (Udemy): it costs money, and has a 4.7-star weighted average rating over 52 reviews.

Automate the Boring Stuff with Python Programming (Udemy): it costs money, and has a 4.6-star weighted average rating over 2,000+ reviews.

Python for Beginners: From Noob to Expert in 22+ Hours (Udemy): it costs money, and has a 4.6-star weighted average rating over 240 reviews.

Introduction to Computer Science and Programming Using Python (MIT/edX): another good option. It has a 4.5-star weighted average rating over 240 reviews.

Complete Python Bootcamp (Udemy): it costs money, and has a 4.5-star weighted average rating over 4,700+ reviews.

Treehouse’s Python series (9 courses): it costs money. It’s a popular option, but there are not enough reviews to make a value judgment. It has a 4.5-star weighted average rating over 5 reviews.

Python (Codecademy): video-less, text editor-based, interactive course. It has a 4.5-star weighted average rating over 20 reviews.

Introduction to Python for Data Science (Microsoft/edX): it has a 4.47-star weighted average rating over 360 reviews.

Intro to Programming Nanodegree (Udacity): it has a notable focus on web development. It’s a great option for someone who doesn’t know what type of programming they want to do. It has a 4.4-star weighted average rating over 730 reviews. Note that it contains the first half of Udacity’s popular “Intro to Computer Science” course, which doesn’t fit our inclusion criteria.

CS For All: Introduction to Computer Science and Python Programming (Harvey Mudd College/edX): it has very few reviews, and a 4.33-star weighted average rating over 6 reviews.

Programming Foundations with Python (Udacity): doesn’t cover the fundamentals. It has a 4-star weighted average rating over 7 reviews.

Learn to Program Using Python (edX/University of Texas Arlington): it has a 4-star weighted average rating over 14 reviews.

Learn to Code for Data Analysis (The Open University/FutureLearn): it has a 3.5-star weighted average rating over 2 reviews.

DataCamp’s Python series (3 courses): it has no reviews on the two major course review sites, but DataCamp is a popular option.

SoloLearn’s Python 3 Tutorial: it has no reviews, but has a comprehensive curriculum and a dedicated fanbase.

Dataquest’s Python series (3 courses): it has no reviews, but has a comprehensive curriculum and an outspoken fanbase.

R courses (descending weighted average ratings)
R Programming A-Z™: R For Data Science With Real Exercises! (Udemy): costs money. It doesn’t offer as much bang for your buck as our #1 R offering. Ratings are similar, considering sample size. It has a 4.7-star weighted average rating over 785 reviews.

Introduction to R for Data Science (Microsoft/edX): not as much depth as DataCamp’s offering. It has a 4.48-star weighted average rating over 500 reviews.

R Programming (Johns Hopkins University/Coursera): doesn’t sufficiently cover the basics of programming. Reviewers note that it is difficult, and not in a good way. It has a 4.04-star weighted average rating over 900+ reviews, despite a 2.5-star rating over 212 reviews on Class Central.

TryR (CodeSchool): it’s not long enough to fit testing criteria, and doesn’t sufficiently cover programming fundamentals. It has a 4-star weighted average rating over 260 reviews.

Programming with R for Data Science (Microsoft/edX): more of an introduction to the R language than to programming. The course site states, “If you have some programming experience, and would like to learn more about R, then you’re at the right place.” It has a 3-star weighted average rating over 12 reviews.

Wrapping it Up
This is the first of a six-piece series that covers the best MOOCs for launching yourself into the data science field. The series will cover several other data science core competencies: statistics, the data science process, data visualization, and machine learning.

If you want to learn Data Science, take a few of… these statistics classes
A comprehensive guide to online statistics and… probability courses.
medium.freecodecamp.com

I ranked every Intro to Data Science course o… the internet, based on thousands of data points
A comprehensive guide to online intro to data… science courses.
medium.freecodecamp.com

The final piece will be a summary of those courses, and the best MOOCs for other key topics such as data wrangling, databases, and even software engineering.

If you’re looking for a complete list of Data Science MOOCs, you can find them on Class Central’s Data Science and Big Data subject page.

If you enjoyed reading this, check out some of Class Central’s other pieces:

Here are 250 Ivy League courses you can take… online right now for free
250 MOOCs from Brown, Columbia, Cornell,… Dartmouth, Harvard, Penn, Princeton, and Yale.
medium.freecodecamp.com

The 50 best free online university courses… according to data
When I launched Class Central back in… November 2011, there were around 18 or so free online courses, and almost all of…
medium.freecodecamp.com

If you have suggestions for courses I missed,
let me know in the responses!

If you found this helpful, click the recommend button so more people will see it here on Medium.

This is a condensed version of the original article published on Class Central, where course descriptions, syllabi, and multiple reviews are included.
