Machine Learning for Hackers
is how we make sense of big data.
Adam Gibson
2-27-2014 SFHN
BIG DATA & STATISTICS
• Statistics – Group by, aggregate, count, average, mean, p-values,
mode, correlations, exploring, < 100 variables
• Machine Learning – Label this image, predict the next
event, pick out the anomalies. In other words, learn from the data
rather than just count it; group data by similarity, > 100 variables.
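A minimal sketch of the contrast above, assuming a made-up week of daily login counts (the data and the `fit_line` helper are illustrative, not from the talk). The first half only summarizes what happened; the second half learns a trend with a least-squares line and uses it to predict the next event.

```python
from statistics import mean

# Hypothetical data: logins per day over one week (made up for illustration).
logins = [10, 12, 14, 16, 18, 20, 22]

# Statistics: summarize what already happened.
summary = {"count": len(logins), "mean": mean(logins)}


def fit_line(ys):
    """Least-squares fit of y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    variance = sum((x - x_mean) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_mean - slope * x_mean


# Machine learning in miniature: learn a pattern, then predict the next event.
slope, intercept = fit_line(logins)
next_day = slope * len(logins) + intercept

print(summary)
print(round(next_day, 2))
```

A straight line is about the simplest model there is; the point is only that the model is fitted to the data and then asked about a day it has not seen.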
What is data?!
Many kinds of data. Wow.
Unstructured
• Text
• Video
• Images
• Time Series
Data Scientists
Structured
• SQL
• XML
• JSON
• CSV
We know this, and just process it.
WHAT do machines learn?
• Machine learning is a general tool that can work with various data types.
• Images = Machine vision
• Text = Natural-language processing
• Time-series = Prediction
• Facial recognition => Security
• Text => Customer profiles/Recommendation engines
• Time-series => stock-market trading platforms
• NLP => Customer service
WHAT IS A DATA SCIENTIST?
Analyst
Exploratory analysis of data,
typically on smaller data sets.
Understands the algorithms
and interprets data.

Distributed Systems Engineer
Implements production data crunching;
also known as the NoSQL person.
Handles distributed systems
and workloads,
APIs, and perhaps
even data collection and storage.
What kinds of Machine Learning are there?
Unsupervised – Clustering (group things that are similar)
Supervised – Label all the things (classification)!
Predict the future (regression; correlation != causation, ring a bell?)
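The two families can be sketched in a few lines, assuming hypothetical one-dimensional data (pet heights in cm, invented for this example). The unsupervised half is a tiny k-means with two clusters and gets no labels; the supervised half is a 1-nearest-neighbour classifier that copies the label of the closest labelled example. Both helpers are illustrative and are not taken from the talk.

```python
def kmeans_1d(values, iterations=10):
    """Unsupervised: split numbers into two groups without any labels."""
    centers = [min(values), max(values)]  # simple starting guess
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


def nearest_label(examples, x):
    """Supervised: return the label of the closest labelled example."""
    _, label = min(examples, key=lambda pair: abs(pair[0] - x))
    return label


heights_cm = [30, 32, 35, 60, 62, 65]  # hypothetical, unlabelled
centers, groups = kmeans_1d(heights_cm)

labelled = [(30, "cat"), (35, "cat"), (60, "dog"), (65, "dog")]  # hypothetical
prediction = nearest_label(labelled, 58)

print(groups)
print(prediction)
```

Clustering finds the two groups but cannot name them; naming requires the labelled examples that the supervised half starts from.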
How does this affect me?
I will leave it to your imagination who does this:
• Ad targeting
• Recommends movies to you
• Brings you search results
• Recognizes your face in the camera
• Drives your car
• Automatically disables your credit card when you leave the country
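As a toy illustration of the credit-card item, here is a sketch that learns which countries a cardholder normally buys from and flags purchases from rare or unseen ones. The purchase history, the function names, and the 5% threshold are all assumptions made for this example; real fraud systems use far more signals than this.

```python
from collections import Counter


def fit_country_profile(history):
    """Learn how often each country appears in a cardholder's past purchases."""
    counts = Counter(history)
    total = len(history)
    return {country: n / total for country, n in counts.items()}


def is_suspicious(profile, country, min_share=0.05):
    """Flag a purchase from a country that is rare or unseen for this cardholder."""
    return profile.get(country, 0.0) < min_share


# Made-up purchase history: 48 purchases in the US, 2 in Canada.
history = ["US"] * 48 + ["CA"] * 2
profile = fit_country_profile(history)

print(is_suspicious(profile, "US"))
print(is_suspicious(profile, "FR"))
```

Nothing here is hard-coded about any particular country: the profile comes from the data, which is what separates this from a hand-written rule.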
Can I do this?
The shortcut here is to start with the basics –
for example Google Analytics and understanding churn rate.
Pick up a more advanced understanding
after that if it still seems interesting.
If you are into backends, start with distributed systems, and
get your math basics up enough to understand
what the guy on the other side of the table
who's asking you to put the algorithm into production is saying.
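Churn rate is a good first metric because it is plain arithmetic: customers lost during a period divided by customers at the start of that period. A minimal sketch, with invented numbers and a hypothetical `churn_rate` helper:

```python
def churn_rate(customers_at_start, customers_lost):
    """Fraction of the starting customers who left during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


# Hypothetical month: 200 customers at the start, 10 cancelled.
monthly = churn_rate(200, 10)
print(f"{monthly:.1%}")
```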
Resources
Coursera Machine Learning
Reddit Machine Learning
DataTau (Hacker News for data scientists)
Stanford Machine Learning (more mathy)
Tools
Analysts
• scikit-learn (http://scikit-learn.org/stable/)
• Julia Lang
• R Lang
Data Engineers
• Spark
• Hadoop
• Storm
• Hadoop QuickStart VM

San Francisco Hacker News - Machine Learning for Hackers
