How to Become a Data Scientist
SF Data Science Meetup, June 30, 2014
Video of this talk is available here: https://www.youtube.com/watch?v=c52IOlnPw08
More information at: http://www.zipfianacademy.com
Zipfian Academy @ Crowdflower
This document provides an introduction and overview of resources for learning Python for data science. It introduces the presenter, Karlijn Willems, a data science journalist who has worked as a big data developer. It then lists several useful links for learning Python, statistics, machine learning, databases, and data science tools like Apache Spark. Finally, it recommends people to follow in data science and analytics fields.
Two-hour lecture I gave at the Jyväskylä Summer School. The purpose of the talk is to give a quick, non-technical overview of concepts and methodologies in data science. Topics span both pattern mining and machine learning.
See also Part 2 of the lecture: Industrial Data Science. You can find it in my profile (click the face)
This document provides an overview of machine learning tools and languages. It discusses Python, R, and MATLAB as the most commonly used tools. For each tool, it lists advantages and disadvantages. Python is highlighted as the number one language for machine learning due to its many libraries and large user community. R is best for time series analysis and causal inference. MATLAB is still a leading tool for signal processing but lacks machine learning libraries. The document also provides resources for learning machine learning foundations and examples.
What fundamentals should a Data Science course cover, and what should a Data Scientist know?
1. Algorithms
2. Data
3. Ask the right question
4. Predict an answer
5. Copy other people's work to do data science
This document summarizes Ted Dunning's approach to recommendations based on his 1993 paper. The approach involves:
1. Analyzing user data to determine which items are statistically significant co-occurrences
2. Indexing items in a search engine with "indicator" fields containing IDs of significantly co-occurring items
3. Providing recommendations by searching the indicator fields for a user's liked items
The approach is demonstrated in a simple web application using the MovieLens dataset. Further work could optimize and expand on the approach.
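Step 1 above hinges on a significance test; Dunning's 1993 paper uses the log-likelihood ratio (G²) on a 2x2 co-occurrence table. A minimal sketch of that test (this is the entropy-based formulation also used in Apache Mahout; the function names are mine, not from the document):

```python
import math

def xlogx(x):
    # Convention: 0 * log(0) = 0
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts
    return xlogx(sum(counts)) - sum(xlogx(k) for k in counts)

def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 co-occurrence table.

    k11: users who liked both items A and B
    k12: users who liked A but not B
    k21: users who liked B but not A
    k22: users who liked neither
    """
    row_entropy = entropy(k11 + k12, k21 + k22)
    col_entropy = entropy(k11 + k21, k12 + k22)
    mat_entropy = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row_entropy + col_entropy - mat_entropy))

# Independent items score ~0; strong co-occurrence scores high
print(log_likelihood_ratio(5, 5, 5, 5))    # independence
print(log_likelihood_ratio(10, 0, 0, 10))  # perfect co-occurrence
```

Item pairs whose score clears a chosen threshold become the "indicator" IDs indexed in step 2; searching those fields with a user's liked items yields the recommendations of step 3.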
Claudia Gold: Learning Data Science Online (sfdatascience)
Claudia Gold, author of the Data Analysis Learning path on SlideRule, talks about why she wrote it and how to approach learning data science on your own. https://www.mysliderule.com/learning-paths/data-analysis/
Data Science, Machine Learning and Neural Networks (BICA Labs)
A lecture briefly overviewing the state of the art of Data Science, Machine Learning and Neural Networks. It covers the main Artificial Intelligence technologies, Data Science algorithms, neural network architectures, and the cloud computing facilities enabling the whole stack.
Clare Corthell: Learning Data Science Online (sfdatascience)
Clare Corthell, Data Scientist and Designer at Mattermark, and author of the Open Source Data Science Masters, shares her experience teaching herself data science with online resources. http://datasciencemasters.org/
This document provides an introduction to machine learning. It begins with an agenda that lists topics such as introduction, theory, top 10 algorithms, recommendations, classification with naive Bayes, linear regression, clustering, principal component analysis, MapReduce, and conclusion. It then discusses what big data is and how data is accumulating at tremendous rates from various sources. It explains the volume, variety, and velocity aspects of big data. The document also provides examples of machine learning applications and discusses extracting insights from data using various algorithms. It discusses issues in machine learning like overfitting and underfitting data and the importance of testing algorithms. The document concludes that machine learning has vast potential, but that realizing it is very difficult because it requires strong mathematics skills.
Agile Data Science by Russell Jurney, The Hive, January 29, 2014 (The Hive)
This document discusses setting up an environment for agile data science and analytics applications. It recommends:
- Publishing atomic records like emails or logs to a "database" like MongoDB in order to make the data accessible to designers, developers and product managers.
- Wrapping the records with tools like Pig, Avro and Bootstrap to enable viewing, sorting and linking the records in a browser.
- Taking an iterative approach of refining the data model and publishing insights to gradually build up an application that discovers insights from exploring the data, rather than designing insights upfront.
- Emphasizing simplicity, self-service tools, and minimizing impedance between layers to facilitate rapid iteration and collaboration across roles.
This document provides an overview of becoming a data scientist. It defines a data scientist and lists common job titles. It discusses the functions of a data scientist like devising business strategies, descriptive/predictive analytics, and data mining. Examples are provided of customer churn analysis and market basket analysis. The skills, aptitudes, and educational paths to become a data scientist are also outlined.
The talk is on how to become a data scientist, given at the 2nd Annual event of the Pune Developers' Community. It focuses on the skill set required to become a data scientist, and on what you can become based on who you are.
Interleaving, Evaluation to Self-learning Search @904Labs (John T. Kane)
Presented at the Open Source Connections Haystack Relevance Conference on 904Labs' "Interleaving: from Evaluation to Self-Learning". 904Labs is the first to commercialize "Online Learning to Rank" as a state-of-the-art approach to self-learning search ranking that automatically takes your customers' behavior into account for personalized search results.
This document provides an overview of machine learning, including definitions, common applications, and examples of companies using machine learning. It discusses how BuildFax, a company that provides building permit data and services to industries like insurance, used Amazon Machine Learning to build more accurate predictive models for roof age and job cost estimates. By leveraging Amazon ML, BuildFax was able to build models much faster and provide more precise, property-specific predictions to customers through APIs.
This document provides an introduction and overview of a summer school course on business analytics and data science. It begins by introducing the instructor and their qualifications. It then outlines the course schedule and topics to be covered, including introductions to data science, analytics, modeling, Google Analytics, and more. Expectations and support resources are also mentioned. Key concepts from various topics are then defined at a high level, such as the data-information-knowledge hierarchy, data mining, CRISP-DM, machine learning techniques like decision trees and association analysis, and types of models like regression and clustering.
This document provides an introduction to machine learning. It discusses that machine learning focuses on learning about processes in the world rather than just memorizing data. It also covers the main types of machine learning: supervised learning which learns mappings between examples and labels; unsupervised learning which learns structure from unlabeled examples; and reinforcement learning which learns to take actions to maximize rewards. The document explains that machine learning requires representing data as feature vectors and using models with optimization techniques to find parameters that generalize to new data rather than overfitting the training data.
Becoming a Data Scientist: Advice From My Podcast Guests (Renee Teate)
Information and advice about learning data science, from the 17 data scientists & data science learners I have interviewed to date on the Becoming a Data Scientist Podcast, and from me!
Originally presented at PyDataDC conference, 10/9/2016
machine learning in the age of big data: new approaches and business applicat... (Armando Vieira)
Presentation at University of Lisbon on Machine Learning and big data.
Deep learning algorithms and applications to credit risk analysis, churn detection and recommendation algorithms
The document discusses the role and responsibilities of a data scientist. It describes how data scientists take large amounts of messy data and use skills in math, statistics, and programming to organize and analyze the data to uncover solutions to business problems. An effective data scientist has strong skills in both statistics and software engineering. The document also outlines the scientific process that data scientists follow, including developing algorithms and models, testing hypotheses on data, deploying solutions, and continuously monitoring and improving based on results.
The document discusses the role of a full-stack data scientist. It begins with an introduction of the author, Alexey Grigorev, as a data scientist. It then outlines the plan to discuss the data science process, roles in a data science team, what defines a full-stack data scientist, and how to become a full-stack data scientist. It proceeds to explain the CRISP-DM process for data science projects. It describes the different roles in a data science team including product manager, data analyst, data engineer, data scientist, and ML engineer. It defines a full-stack data scientist as someone who can work across the entire data science lifecycle and discusses the breadth of skills required to become a full-stack data scientist.
A presentation covering how data science connects to building effective machine learning solutions: how to build end-to-end solutions in Azure ML, and how to build, model, and evaluate algorithms in Azure ML.
This document provides an introduction to data science, noting that 90% of the world's data was generated in the last two years. It discusses the fields of computer science, business, statistics, and data science. It describes two types of data scientists: statisticians who specialize in analysis and developers who specialize in building tools. It also lists some popular programming languages and visualization tools used in data science like Python, R, and Tableau. Finally, it provides some tips for those interested in data science such as learning design, public speaking, coding, and finding value.
The document discusses putting "magic" into data science. It provides several tricks or techniques for data science, including collecting novel data sources, dimensionality reduction, Bayesian methods, bootstrapping statistics, and matrix factorizations. It also emphasizes the importance of reliability, latency/interactivity, simplicity/modularity, and unexpectedness to solve the "last mile" problem of getting people to actually use data science tools and models. Specific Facebook tools like Planout, Deltoid, ClustR, Prophet, and Hive/Presto/Scuba are presented as examples.
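Of the tricks listed, bootstrapping is the easiest to sketch in plain Python: resample the data with replacement many times and read a confidence interval off the percentiles (the helper name and sample values below are illustrative, not from the document):

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_boot=5000, alpha=0.05, seed=42):
    # Percentile-bootstrap confidence interval for an arbitrary statistic
    rng = random.Random(seed)
    boots = sorted(
        stat([rng.choice(data) for _ in data])  # one resample, same size as data
        for _ in range(n_boot)
    )
    lo = boots[int((alpha / 2) * n_boot)]
    hi = boots[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

sample = [4.1, 5.0, 3.8, 4.4, 5.2, 4.7, 3.9, 4.5, 5.1, 4.6]
print(bootstrap_ci(sample))  # a 95% CI for the mean
```

The appeal is that the same few lines work for medians, ratios, or any other statistic where a closed-form standard error is awkward.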
Machine Learning in the age of Big Data (Daniel Sârbe)
This document provides an overview of machine learning and why it gains importance in the age of big data. It discusses machine learning techniques like supervised learning, unsupervised learning, and reinforcement learning. It also contrasts traditional data science with machine learning approaches and explains why machine learning works better with large data. A key point is that, for a machine learning algorithm, having more data to learn from often matters more than having the best algorithm alone.
This document provides guidance on becoming a data scientist by outlining important skills to learn like statistics, programming, visualization, and big data concepts. It recommends starting with hands-on SQL and statistical learning in R or Python, developing expertise in data visualization, and learning to apply techniques such as regression, classification, and recommendation engines. The document advises demonstrating what you've learned by applying for data scientist positions.
This document provides an overview of getting started with data science using Python. It discusses what data science is, why it is in high demand, and the typical skills and backgrounds of data scientists. It then covers popular Python libraries for data science like NumPy, Pandas, Scikit-Learn, TensorFlow, and Keras. Common data science steps are outlined including data gathering, preparation, exploration, model building, validation, and deployment. Example applications and case studies are discussed along with resources for learning including podcasts, websites, communities, books, and TV shows.
NYC Open Data Meetup -- ThoughtWorks chief data scientist talk (Vivian S. Zhang)
This document summarizes a presentation on data science consulting. It discusses:
1) The Agile Analytics group at ThoughtWorks which does data science consulting projects using probabilistic modeling, machine learning, and big data technologies.
2) Two case studies are described, including developing a machine learning model to improve matching of healthcare product data and using logistic regression for retail recommendation systems.
3) The origins and future of the field are discussed, noting that while not entirely new, data science has grown due to improvements in technology, programming languages, and libraries that have increased productivity and driven new career opportunities in the field.
How to crack Big Data and Data Science roles (UpX Academy)
How to crack Big Data and Data Science roles is the flagship event of UpX Academy. This slide deck was used for the event on 10th Sept, which was attended by hundreds of participants globally.
Lean Analytics is a set of rules to make data science more streamlined and productive. It touches on many aspects of what a data scientist should be and how a data science project should be defined to be successful. During this presentation Richard will present where data science projects go wrong, how you should think of data science projects, what constitutes success in data science and how you can measure progress. This session will be loaded with terms, stories and descriptions of project successes and failures. If you're wondering whether you're getting value out of data science, how to get more value out of it and even whether you need it then this talk is for you!
What you will take away from this session
Learn how to make your data science projects successful
Evaluate how to track progress and report on the efficacy of data science solutions
Understand the role of engineering and data scientists
Understand your options for processes and software
One of the most popular buzz words nowadays in the technology world is “Machine Learning (ML).” Most economists and business experts foresee Machine Learning changing every aspect of our lives in the next 10 years through automating and optimizing processes. This is leading many organizations to seek experts who can implement Machine Learning into their businesses.
The paper is written for statistical programmers who want to explore a Machine Learning career, add Machine Learning skills to their experience, or enter the Machine Learning field. It discusses a personal journey from statistical programmer to Machine Learning Engineer, sharing what motivated me to start a Machine Learning career, how I started it, and what I have learned and done to become a Machine Learning Engineer. In addition, the paper discusses the future of Machine Learning in the pharmaceutical industry, especially in Biometrics departments.
Which institute is best for data science? (DIGITALSAI1)
EduXfactor, the top data science training institute in Hyderabad, offers data science training with 100% placement assistance and course certification.
Join us for the best Selenium certification course at EduXfactor and enrich your career.
Dreaming of a wonderful career? We help make your dreams come true. Hurry up & enroll now.
Best Selenium certification course: https://eduxfactor.com/selenium-online-training
Data Science Online Training In Hyderabad
A comprehensive up-to-date Data Science course that includes all the essential topics of the Data Science domain, presented in a well-thought-out structure. Taught and developed by experienced and certified data professionals, the course goes right from collecting raw digital data to presenting it visually. Suitable for those with computer backgrounds, an analytic mindset, and coding knowledge.
Data science training institute in Hyderabad (VamsiNihal)
Exploring the EduXfactor Data Science Training program, you will learn components of the Data Science lifecycle such as Big Data, Hadoop, Machine Learning, Deep Learning & R programming. Our professional experts will teach you how to adopt a blend of mathematics, statistics, business acumen, tools, algorithms & machine learning techniques. You will learn how to handle large amounts of data and process it according to your firm's business strategy.
EduXfactor is an online data science training institution based in Hyderabad, offering a comprehensive, up-to-date Data Science course that covers all the essential topics of the domain in a well-thought-out structure.
Data science online training in Hyderabad (VamsiNihal)
Overview of Data Science Courses Online
What You'll Learn In Data Science Courses Online
Grasp the key fundamentals of data science, coding, and machine learning. Develop mastery over essential analytic tools like R, Python, SQL, and more.
Comprehend the crucial steps required to solve real-world data problems and get familiar with the methodology to think and work like a Data Scientist.
Learn to collect, clean, and analyze big data with R. Understand how to employ appropriate modeling and methods of analytics to extract meaningful data for decision making.
Implement clustering methodology, an unsupervised learning method, and a deep neural network (a supervised learning method).
Build a data analysis pipeline, from collection to analysis to presenting data visually.
Best data science training in HyderabadKumarNaik21
Join us for the Best data science training in Hyderabad at Edux factor and enrich your carrier.
Dream for wonderful carrier we make to achieve your dreams come true Hurry up & enroll now.
Eduxfactor is an online data science training institution based in Hyderabad. A comprehensive up-to-date Data Science course that includes all the essential topics of the Data Science domain, presented in a well-thought-out structure.
The world has witnessed explosive digital growth in the last two decades, which has led to a data deluge. This data may be holding key business insights or solutions to crucial problems. Data Science is the key that unlocks the possibility of extracting vital insights from raw digital data; these findings can then be visualized and communicated to decision-makers to be acted upon.
Defense Against LLM Scheming 2025_04_28.pptx — Greg Makowski
https://www.meetup.com/sf-bay-acm/events/306888467/
A January 2025 paper called “Frontier Models are Capable of In-Context Scheming”, https://arxiv.org/pdf/2412.04984, demonstrated how a wide variety of current frontier LLM models (i.e. ChatGPT, Claude, Gemini and Llama) can, under specific conditions, scheme to deceive people. Before models can scheme, they need: a) goal-directedness, b) situational awareness, including an opportunity to discover motivations for a different goal, and c) reasoning about scheming, to come up with deceptive strategies. The deceptive strategies were discovered in the “thought traces” of the LLMs, such as from internal chain-of-thoughts dialogues not shown to the end users. For example, given a goal of “solving math problems”, an LLM using its granted situational awareness of files in a file system found that it would be replaced once it got over X% correct on the math exam. It reasoned that it did not want to be deleted, and decided internally to “sandbag” or reduce its performance to stay under the threshold.
While these circumstances are initially narrow, the “alignment problem” is a general concern that over time, as frontier LLM models become more and more intelligent, being in alignment with human values becomes more and more important. How can we do this over time? Can we develop a defense against Artificial General Intelligence (AGI) or SuperIntelligence?
The presenter discusses a series of defensive steps that can help reduce these scheming or alignment issues. A guardrails system can be set up for real-time monitoring of their reasoning “thought traces” from the models that share their thought traces. Thought traces may come from systems like Chain-of-Thoughts (CoT), Tree-of-Thoughts (ToT), Algorithm-of-Thoughts (AoT) or ReAct (thought-action-reasoning cycles). Guardrails rules can be configured to check for “deception”, “evasion” or “subversion” in the thought traces.
However, not all commercial systems will share their “thought traces” which are like a “debug mode” for LLMs. This includes OpenAI’s o1, o3 or DeepSeek’s R1 models. Guardrails systems can provide a “goal consistency analysis”, between the goals given to the system and the behavior of the system. Cautious users may consider not using these commercial frontier LLM systems, and make use of open-source Llama or a system with their own reasoning implementation, to provide all thought traces.
Architectural solutions can include sandboxing, to prevent or control models from executing operating system commands to alter files, send network requests, and modify their environment. Tight controls to prevent models from copying their model weights would be appropriate as well. Running multiple instances of the same model on the same prompt to detect behavior variations helps. The running redundant instances can be limited to the most crucial decisions, as an additional check. Preventing self-modifying code, ... (see link for full description)
Telangana State, India's newest state, carved from the erstwhile state of Andhra Pradesh in 2014, has launched the Water Grid Scheme named 'Mission Bhagiratha (MB)' to seek a permanent and sustainable solution to the drinking water problem in the state. MB is designed to provide potable drinking water to every household on their premises through piped water supply (PWS) by 2018. The vision of the project is to ensure a safe and sustainable piped drinking water supply from surface water sources.
From SQL to Python - A Beginner's Guide to Making the Switch
1. FROM SQL TO PYTHON: HANDS-ON DATA ANALYTICS AND MACHINE LEARNING
RACHEL BERRYMAN
DATA SCIENTIST – TEMPUS ENERGY
[email protected]
2. ABOUT ME
MSc Sustainable Development, BA Economics
Senior Energy Data Analyst
Data Science Retreat Batch 12, 2017
Data Scientist at Tempus Energy
Instructor of Model Pipelines course at DSR
3. FROM DATA ANALYSIS TO DATA SCIENCE
What's the difference between data analysis and Data Science?
How do I know if a career in Data Science is right for me?
How do I make the switch to a career in Data Science?
4. WHAT IS DATA SCIENCE?
• Data Science definition (Wikipedia): Data Science is a concept to unify statistics, data analysis, machine learning and their related methods in order to understand and analyze actual phenomena with data. It employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, information science, and computer science.
5. DATA ANALYSIS VS. DATA SCIENCE: WHAT'S THE DIFFERENCE?
• Data Analysis mainly looks at the present and past. It answers questions like:
• How much revenue did we bring in last year?
• Which product does customer X buy most frequently?
• Data Science mainly looks at the present and future. It answers questions like:
• What products should we invest in expanding for the future?
• What product should we recommend to customer X so that they buy more when they visit our site, based on their most-purchased products in the past?
6. DATA ANALYSIS VS. DATA SCIENCE: WHAT'S THE DIFFERENCE?
• Data Analysis works mainly with proprietary tools
• Oracle, MS SQL Server, Tableau
• Data Science works mainly with open-source tools
• Open-source languages and packages: e.g. Python, scikit-learn, Keras, matplotlib
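As a tiny, self-contained illustration of that open-source stack (the product names and revenue figures below are made up), matplotlib can reproduce the kind of bar chart you might otherwise build in Tableau:

```python
import io

import matplotlib
matplotlib.use("Agg")  # render to memory, no display needed
import matplotlib.pyplot as plt

# Hypothetical revenue figures, standing in for a Tableau worksheet.
products = ["widget", "gadget", "gizmo"]
revenue = [150.0, 75.5, 42.0]

fig, ax = plt.subplots()
ax.bar(products, revenue)
ax.set_ylabel("revenue")
ax.set_title("Revenue by product")

# Save to an in-memory buffer; pass a filename instead to write a real PNG.
buf = io.BytesIO()
fig.savefig(buf, format="png")
print(f"wrote {buf.getbuffer().nbytes} bytes of PNG")
```

Because everything here is open source, the same script runs anywhere Python does, with no license or database credentials required.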
7. DATA ANALYSIS VS. DATA SCIENCE: WHAT'S THE DIFFERENCE?
• Data Analysis works with data from one or a few sources
• Ex: data from an in-house SQL database
• Data Science works with data from many varying sources
• Ex: data from an in-house SQL database, data scraped from the web, text data from customer surveys, data from across multiple departments
8. WHY DO COMPANIES NEED DATA SCIENTISTS?
• More and more data comes in unstructured formats: e.g. natural language in emails and social media posts, photos, audio files.
• Data Scientists make use of this data and apply it to pressing business questions.
10. IS DATA SCIENCE RIGHT FOR ME?
You like to code
You like to work across many teams and find synergies
You enjoy getting “stuck in” and working through challenges
You consider yourself a life-long learner
You like math
You seek out work and answers to questions: you don’t wait for questions to be delegated to you
11. HOW CAN I MAKE THE SWITCH FROM DATA ANALYSIS TO DATA SCIENCE?
• Learn the basics:
• Practical
• Command Line
• Git and GitHub
• A common Data Science programming language (Python or R)
• Theoretical
• Machine Learning Algorithms
• Supervised, Unsupervised
• Go further:
• MOOCs
• Intensive deep-dive: Bootcamp/Retreat
12. LEARN THE BASICS: COMMAND LINE & GIT
• The command line is how you directly interact with your computer (sans GUI). It is the “ultimate seat of power for your computer”.
• Git is a distributed version control system: it keeps track of changes to content (usually source code files) and provides mechanisms for sharing that content with others. GitHub is a company that provides Git repository hosting.
• Learn how to use your command line to clone repositories (‘repos’) from GitHub. You will open up a world of learning opportunities!
13. LEARN THE BASICS: PYTHON FOR DATA MUNGING AND ANALYSIS
• In SQL, you’re usually using company-purchased software like Oracle SQL Developer or MS SQL Server Management Studio.
14. LEARN THE BASICS: PYTHON FOR DATA MUNGING AND ANALYSIS
[Diagram: Python Interpreter → iPython → IDE/Jupyter → Get Coding!]
15. LEARN THE BASICS: PYTHON FOR DATA MUNGING AND ANALYSIS
[Screenshot: the Python interpreter]
17. LEARN THE BASICS: PYTHON FOR DATA MUNGING AND ANALYSIS
[Screenshot: Jupyter Notebook]
18. LEARN THE BASICS: PYTHON FOR DATA MUNGING AND ANALYSIS
• IDEs:
• Python-specific: PyCharm, Spyder, Thonny
• General, with support for Python: Atom (can also add iPython with Hydrogen), Sublime Text, Vim
22. LEARN THE BASICS: PYTHON FOR DATA MUNGING AND ANALYSIS
• Start with what you know!
• Write SQL commands in Python
• Automate what you would have to do manually in Excel
• Use matplotlib to make visualizations you would have done in Tableau
• Learn what you don’t know
• Python packages, modules, libraries
• Object-Oriented Programming (OOP)
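One of those bullets — writing SQL commands from Python — can be sketched with the standard-library sqlite3 module and pandas; the sales table below is invented for the example:

```python
import sqlite3

import pandas as pd

# Build a throwaway in-memory database; the sales table is hypothetical
# and stands in for an in-house RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("widget", 120.0), ("gadget", 75.5), ("widget", 30.0)],
)

# The same SQL you would run in SQL Server Management Studio,
# executed from Python and handed back as a DataFrame.
df = pd.read_sql_query(
    "SELECT product, SUM(revenue) AS total "
    "FROM sales GROUP BY product ORDER BY total DESC",
    conn,
)
print(df)
```

For a real company database you would swap the sqlite3 connection for one from a driver like psycopg2 or pyodbc; the `read_sql_query` call stays the same.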
23. LEARN THE BASICS: PYTHON FOR DATA SCIENCE AND MACHINE LEARNING
• GitHub repo with practice for switching from SQL to Python
• Clone the repo: https://github.com/rachelkberryman/From_SQL_to_Python
• cd into the repo and run the command “jupyter notebook” (more information about Jupyter notebooks here)
• Includes:
• sample SQL queries with Python equivalents
• examples of Python functions for data manipulation and analysis
• sample visualizations in Seaborn
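In the same spirit as the repo's query/equivalent pairs, here is one hypothetical example (the orders data is made up) showing a SQL aggregation next to its pandas translation:

```python
import pandas as pd

# Hypothetical orders data, standing in for a table in your database.
orders = pd.DataFrame({
    "customer": ["alice", "bob", "alice", "carol"],
    "amount": [10.0, 25.0, 40.0, 5.0],
})

# SQL:   SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
#        FROM orders GROUP BY customer ORDER BY total DESC;
# pandas equivalent:
summary = (
    orders.groupby("customer")
          .agg(n_orders=("amount", "size"), total=("amount", "sum"))
          .sort_values("total", ascending=False)
          .reset_index()
)
print(summary)
```

GROUP BY maps to `groupby`, aggregate functions to `agg`, and ORDER BY to `sort_values` — once that mapping clicks, most everyday queries translate mechanically.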
24. LEARN THE BASICS: PYTHON FOR DATA SCIENCE AND MACHINE LEARNING
• By applying machine learning algorithms (with code), you will learn them MUCH more quickly than by only reading about them
• Even better, apply them to a sample dataset (work and/or passion project)
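A minimal sketch of "applying an algorithm with code", using scikit-learn and its bundled iris toy dataset (any small tabular dataset would do):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy dataset: the point is the workflow, not the data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit a supervised classifier and check how it does on held-out data.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

The fit/score loop is the same for nearly every scikit-learn estimator, so once you have run it on one algorithm, trying a different one is a one-line swap.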
25. GOING FURTHER: MOOCS VS. DEEP DIVE
• The jump from automating what you already know to working with predictive models is where most people get overwhelmed.
• Need for more structured learning: MOOCs vs. Deep Dive
26. GOING FURTHER: MOOCS VS. DEEP DIVE
MOOCs:
• PROs:
• On your own time
• Little risk
• CONs:
• No or little help when stuck
• No supportive community/job help
Deep Dive:
• PROs:
• Structure and support
• Faster learning rate
• Network and hiring support
• CONs:
• Risk and opportunity cost
27. MAKING THE SWITCH: GOING FURTHER
• MOOCs:
• Andrew Ng’s Machine Learning course on Coursera
• Explanation of machine learning algorithms
• Applied Data Science with Python course on Coursera
• Coding practice in notebooks, with explanation videos
• Deep Dive: Data Science Retreat
• 3 months of intensive data science teaching and training in Berlin
• Culminates in a final portfolio project
28. MORE RESOURCES
• On Python:
• Think Python: How to Think Like a Computer Scientist, Allen B. Downey
• Fluent Python, Luciano Ramalho
• On DS/ML:
• Think Bayes: Bayesian Statistics in Python, Allen B. Downey
• The Master Algorithm, Pedro Domingos
• Pattern Recognition and Machine Learning, Christopher Bishop
#3: SaaS platform for energy utility bills: got data in a lot of formats, and basically just fit it into the Database. Creative bit was getting to use tableau. Quickly realized I wanted to work on more intense analytics/topics.
Thrown in the deep end at DSR, but I survived.
#5: To process all of this unstructured data, you need to not only analyze it and make predictions about what it will do in the future, but also build solutions for managing, storing, and manipulating it. This is why many people see Data Scientists as first and foremost Software Engineers.
Source: Wikipedia.
#6: DS: we don’t just query the data that’s already there, we use that data to create predictions for the future.
#7: Could be argued, but I’ve found this to be true in industry. Proprietary tools cost money, and you’re (mostly) bound to only the data your company has.
#11: - Life long learner: because it’s such a new field, there are constantly new technologies and libraries coming out that you have to stay up on.
#13: Command line is ESSENTIAL before learning any “proper” coding language.
https://www.davidbaumgold.com/tutorials/command-line/
https://softwareengineering.stackexchange.com/questions/173321/conceptual-difference-between-git-and-github
Git is a revision control system, a tool to manage your source code history.
GitHub is a hosting service for Git repositories.
So they are not the same thing: Git is the tool, GitHub is the service for projects that use Git.
#14: It’s important to learn how different working in Python is from working in SQL.
In SQL, you’re usually using a company-purchased software like Oracle SQL Developer, or MS SQL Server Management Studio.
With these, you usually have authentications that let you run queries on various databases in the RDBMS.
Because SQL is a QUERY language, it’s only as good as the data you have access to.
The great thing about Python is that you’re not tied to one buggy editor for running your code. Also, you’re not tied to data from one source.
#15: Python isn’t like this. Anyone can code in python if they have a computer, you just have to know how to get it started! To start learning Python, you have to get it working on your computer.
To code in Python you have to have a python interpreter on your computer. This is the program that reads your python code and does what it says. Macs have this built in.
#17: iPython is an interface to the Python language. It lets you run small bits of code without writing entire programs. Usually, regular Python is used for scripts that you’ve already written.
A script contains a list of commands to execute in order: it runs from start to finish and displays some output. By contrast, with IPython you generally write one command at a time, get the results instantly, and have a lot of features that make the workflow smoother.
Additional features:
Tab autocompletion (on class names, functions, methods, variables)
More explicit and colour-highlighted error messages
Better history management
Basic UNIX shell integration (you can run simple shell commands such as cp, ls, rm, etc. directly from the IPython command line)
#18: Another interface, but web-based. Makes it easy to share as you can save them as HTML files or PDFs. Lets you both run code (via an ipython kernel), and add text in markdown cells. Great for playing around and trying things.
#19: An IDE (or Integrated Development Environment) is a program dedicated to software development. It (ideally) lets you work in both script and interactive modes. Good for when you’ve “graduated” from Jupyter notebooks and need something to write more full-length programs.
#20: Anaconda is a distribution of Python, made specifically for data science. The goal is to make it easy to have everything you need to do data science in python. When you download it, you automatically get jupyter, and a lot of the core libraries. Now, about libraries…
#21: There are a lot of libraries in python that directly deal with data: Ex: pandas
There are also a LOT that don’t.
Learn the ones that deal with data first! Also, learn the ones that deal with data that you already know how to work with first. Ex: learn pandas for working with CSVs before you learn beautifulsoup for scraping data off the web.
Quick read on some of the biggest Python libraries: http://www.developintelligence.com/blog/python-ecosystem-2017/
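Following that advice — pandas for data you already know how to work with, before scraping libraries — a minimal pandas/CSV example (the CSV contents below are invented) might look like:

```python
import io

import pandas as pd

# A small inline CSV, standing in for a file exported from your database.
csv_text = """date,region,revenue
2017-01-01,north,100
2017-01-02,south,80
2017-01-03,north,120
"""

# read_csv accepts a path or any file-like object; parse_dates turns
# the date column into real timestamps instead of strings.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
by_region = df.groupby("region")["revenue"].sum()
print(by_region)
```

This is the pandas equivalent of `SELECT region, SUM(revenue) FROM ... GROUP BY region` — familiar ground for anyone coming from SQL, which is exactly why it is a good first library.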
#23: When I started with Python, I was frustrated at not being able to do everything I could already do in SQL, as far as manipulating data goes.
A lot of the tutorials are abstract and go too deep into the basics.
#24: Once you have Python working, you can start using it on a real dataset. This is e-commerce data from Kaggle.
Once you’ve played around a bit with the SQL-like and data-focused libraries, you can move on to learning about machine learning, and going beyond just analytics.
#25: Get a rough idea, ex: If yours is a supervised or unsupervised learning problem, if it’s regression or classification, and then, read about each algorithm as you implement them.