#datapopupseattle
Understanding Feature Space
in Machine Learning
Alice Zheng
Director of Data Science, Dato
@RainyData, @DatoInc
UNSTRUCTURED
Data Science POP-UP in Seattle
www.dominodatalab.com
Produced by Domino Data Lab
Domino’s enterprise data science platform is used
by leading analytical organizations to increase
productivity, enable collaboration, and publish
models into production faster.
Understanding Feature Space
in Machine Learning
Alice Zheng, Dato
October, 2015
My journey so far
Applied machine learning (Data science)
Build ML tools
Shortage of experts and good tools.
Why machine learning?
Model data.
Make predictions.
Build intelligent
applications.
The machine learning pipeline
I fell in love the instant I laid my eyes on that puppy. His big eyes and playful tail, his soft furry paws, …
Raw data → Features → Models → Predictions → Deploy in production
Three things to know about ML
• Feature = numeric representation of raw data
• Model = mathematical “summary” of features
• Making something that works = choose the right model
and features, given data and task
Feature = Numeric representation of raw data
Representing natural text
Raw text: It is a puppy and it is extremely cute.
What’s important? Phrases? Specific words? Ordering? Subject, object, verb?
Classify: puppy or not?
Bag of Words: {“it”: 2, “is”: 2, “a”: 1, “puppy”: 1, “and”: 1, “extremely”: 1, “cute”: 1}
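The bag-of-words counts on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not the speaker's code: the function name `bag_of_words` and the tokenization rule (lowercase, keep runs of letters and apostrophes) are assumptions made for illustration.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into word tokens, and count each token.

    The tokenizer is an illustrative assumption: it keeps runs of
    letters and apostrophes and drops punctuation.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


bow = bag_of_words("It is a puppy and it is extremely cute.")
print(dict(bow))
# {'it': 2, 'is': 2, 'a': 1, 'puppy': 1, 'and': 1, 'extremely': 1, 'cute': 1}
```

Word order is discarded: only the counts survive, which is exactly the trade-off the slide asks about.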
Representing natural text
Raw text: It is a puppy and it is extremely cute.
Classify: puppy or not?
Raw Text → Bag of Words (sparse vector representation):
it 2
they 0
I 0
am 0
how 0
puppy 1
and 1
cat 0
aardvark 0
cute 1
extremely 1
… …
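As a rough sketch of what "sparse vector representation" means here: fix a vocabulary, emit one count per vocabulary word, and then store only the non-zero entries. The vocabulary below is just the handful of words listed on the slide (lowercased); a real vocabulary would be far larger. The names `VOCABULARY`, `to_vector`, and `to_sparse` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny fixed vocabulary, taken from the words shown on the slide.
VOCABULARY = ["it", "they", "i", "am", "how", "puppy", "and", "cat",
              "aardvark", "cute", "extremely"]


def to_vector(text, vocabulary=VOCABULARY):
    """Return one count per vocabulary word, in vocabulary order."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return [counts[word] for word in vocabulary]


def to_sparse(vector):
    """Keep only the non-zero entries as {index: count}."""
    return {index: value for index, value in enumerate(vector) if value != 0}


dense = to_vector("It is a puppy and it is extremely cute.")
sparse = to_sparse(dense)
print(dense)   # [2, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1]
print(sparse)  # {0: 2, 5: 1, 6: 1, 9: 1, 10: 1}
```

Words outside the vocabulary ("is", "a") are simply dropped, and most coordinates are zero, which is why storing only the non-zero entries pays off.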
Representing images
Raw image: millions of RGB triplets, one for each pixel
Classify: person or animal?
Raw Image → Bag of Visual Words
Image source: “Recognizing and learning object categories,” Li Fei-Fei, Rob Fergus, Antonio Torralba, ICCV 2005–2009.
Representing images
Classify: person or animal?
Raw Image → Deep learning features
Dense vector representation:
3.29, -15, -5.24, 48.3, 1.36, 47.1, -1.92, 36.5, 2.83, 95.4, -19, -89, 5.09, 37.8
Feature space in machine learning
• Raw data → high-dimensional vectors
• Collection of data points → point cloud in feature space
• Feature engineering = creating features of the appropriate granularity for the task
Visualizing Feature Space
Crudely speaking, mathematicians fall into two categories:
the algebraists, who find it easiest to reduce all problems
to sets of numbers and variables, and the geometers, who
understand the world through shapes.



-- Masha Gessen, “Perfect Rigor”
Visualizing bag-of-words
Axes: puppy, cute (the document sits at puppy = 1, cute = 1)
Document: I have a puppy and it is extremely cute
it 1
they 0
I 1
am 0
how 0
puppy 1
and 1
cat 0
aardvark 0
zebra 0
cute 1
extremely 1
… …
Visualizing bag-of-words
Axes: puppy, cute, extremely
Documents plotted:
I have a puppy and it is extremely cute
I have an extremely cute cat
I have a cute puppy
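To make the picture concrete, the three documents can be turned into points on the puppy / cute / extremely axes. This is an illustrative sketch; `AXES` and `to_point` are hypothetical names, and Euclidean distance is used only as one simple way to compare points in the space.

```python
import math
import re

# The three axes drawn on the slide.
AXES = ["puppy", "cute", "extremely"]

documents = [
    "I have a puppy and it is extremely cute",
    "I have an extremely cute cat",
    "I have a cute puppy",
]


def to_point(text, axes=AXES):
    """Map a document to its coordinates: one word count per axis."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return tuple(tokens.count(word) for word in axes)


points = [to_point(doc) for doc in documents]
print(points)  # [(1, 1, 1), (0, 1, 1), (1, 1, 0)]

# Distances between documents in this 3-dimensional feature space.
print(math.dist(points[0], points[1]))  # 1.0
print(math.dist(points[1], points[2]))  # sqrt(2), about 1.414
```

Each document becomes one point; a whole collection of documents becomes a point cloud.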
Document point cloud
Axes: word 1, word 2
Model = Mathematical “summary” of features
What is a summary?
• Data → point cloud in feature space
• Model = a geometric shape that best “fits” the point cloud
Classification model
Axes: Feature 1, Feature 2
Decide between two classes
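One way to see a classifier as a geometric shape is a linear decision boundary: a line w·x + b = 0 that puts one class on each side. The slide does not name an algorithm, so the perceptron below is an assumption chosen for brevity, and the six data points are made up for illustration.

```python
def train_perceptron(points, labels, epochs=1000):
    """Learn a separating line w[0]*x1 + w[1]*x2 + b = 0 for labels in {-1, +1}.

    Stops early once a full pass makes no mistakes, which happens
    only if the data is linearly separable.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), y in zip(points, labels):
            score = w[0] * x1 + w[1] * x2 + b
            if y * score <= 0:  # wrong side of the line (or on it)
                w[0] += y * x1
                w[1] += y * x2
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def predict(w, b, point):
    """Return +1 or -1 depending on which side of the line the point falls."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1


# Hypothetical toy data: two well-separated groups in a 2-D feature space.
points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (4.0, 4.5), (5.0, 4.0), (4.5, 5.0)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_perceptron(points, labels)
print([predict(w, b, p) for p in points])
```

The learned `w` and `b` are the "summary": six points are replaced by three numbers that describe a line.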
Clustering model
Axes: Feature 1, Feature 2
Group data points tightly
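A clustering model can be sketched the same way: here the geometric summary is a set of centroids, with each point assigned to its nearest one. The slide does not specify a method; k-means is assumed below, and both the data and the starting centroids are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid  # keep an empty cluster's centroid in place
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical toy data: two tight groups in a 2-D feature space.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]

# Assumption: start from one point in each group to keep the run deterministic.
centroids, clusters = kmeans(points, centroids=[(1.0, 1.0), (9.0, 8.0)])
print(centroids)
print(clusters)
```

With these inputs each cluster ends up holding one of the two groups, and each centroid sits at the mean of its group.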
Regression model
Axes: Feature, Target
Fit the target values
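For regression, the simplest fitting shape is a straight line. The sketch below uses ordinary least squares on a single feature; the method and the five data points are assumptions for illustration, not taken from the talk.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)
    of the line that minimizes the squared vertical distance to the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: the target grows roughly linearly with the feature.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(xs, ys)
print(round(slope, 2), round(intercept, 2))  # 1.97 1.06
```

The fitted line passes through the middle of the point cloud, and predicting a new target is just reading a value off that line.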
Visualizing Feature Engineering

When does bag-of-words fail?
Axes: puppy, cat, have
Documents plotted:
I have a puppy
I have a cat
I have a kitten
I have a dog and I have a pen
Task: find a surface that separates documents about dogs vs. cats
Problem: the word “have” adds fluff instead of information
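The problem is easy to see once the four documents are written out as coordinates on the puppy / cat / have axes. This is a small illustrative sketch; `bow_point` is a hypothetical helper and whitespace tokenization is an assumption.

```python
AXES = ["puppy", "cat", "have"]

documents = [
    "I have a puppy",
    "I have a cat",
    "I have a kitten",
    "I have a dog and I have a pen",
]


def bow_point(text, axes=AXES):
    """Raw bag-of-words coordinates: one count per axis word."""
    tokens = text.lower().split()
    return tuple(tokens.count(word) for word in axes)


points = [bow_point(doc) for doc in documents]
print(points)  # [(1, 0, 1), (0, 1, 1), (0, 0, 1), (0, 0, 2)]

# Every document has a non-zero "have" coordinate, and the largest value
# belongs to a document that mentions neither "puppy" nor "cat".
have_counts = [p[AXES.index("have")] for p in points]
print(have_counts)  # [1, 1, 1, 2]
```

The "have" axis spreads the points out without saying anything about dogs versus cats.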
Improving on bag-of-words
• Idea: “normalize” word counts so that popular words are discounted
• Term frequency (tf) = number of times a term appears in a document
• Inverse document frequency of a word (idf) = log(N / number of documents that contain the word)
• N = total number of documents
• Tf-idf count = tf × idf
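The idf definition can be checked against the four example documents from the bag-of-words failure case. This is a minimal sketch assuming the natural logarithm and whitespace tokenization; the function name `idf` is illustrative. Libraries often use smoothed variants of this formula, so their numbers may differ.

```python
import math

documents = [
    "I have a puppy",
    "I have a cat",
    "I have a kitten",
    "I have a dog and I have a pen",
]
tokenized = [doc.lower().split() for doc in documents]


def idf(word, tokenized_docs):
    """log(N / number of documents that contain the word).

    Raises ZeroDivisionError for a word that appears in no document;
    this sketch makes no attempt to smooth that case.
    """
    n_docs = len(tokenized_docs)
    n_containing = sum(1 for tokens in tokenized_docs if word in tokens)
    return math.log(n_docs / n_containing)


print(idf("puppy", tokenized))  # log 4, about 1.386
print(idf("cat", tokenized))    # log 4, about 1.386
print(idf("have", tokenized))   # log 1 = 0.0
```

A word that appears in every document gets idf 0, so multiplying by idf wipes out its count no matter how often it occurs.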
From BOW to tf-idf
Axes: puppy, cat, have
Documents plotted:
I have a puppy
I have a cat
I have a kitten
I have a dog and I have a pen
idf(puppy) = log 4
idf(cat) = log 4
idf(have) = log 1 = 0
From BOW to tf-idf
Axes: puppy, cat, have
tfidf(puppy) = log 4
tfidf(cat) = log 4
tfidf(have) = 0
Documents plotted:
I have a puppy
I have a cat
I have a dog and I have a pen, I have a kitten
Decision surface
Tf-idf flattens uninformative dimensions in the BOW point cloud
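The flattening can be reproduced numerically by multiplying each raw count by the word's idf. As before, this is a sketch under assumptions (natural log, whitespace tokenization, unsmoothed idf), and `tfidf_point` is a hypothetical helper name.

```python
import math

AXES = ["puppy", "cat", "have"]

documents = [
    "I have a puppy",
    "I have a cat",
    "I have a kitten",
    "I have a dog and I have a pen",
]
tokenized = [doc.lower().split() for doc in documents]


def idf(word, tokenized_docs):
    """log(N / number of documents that contain the word)."""
    n_containing = sum(1 for tokens in tokenized_docs if word in tokens)
    return math.log(len(tokenized_docs) / n_containing)


def tfidf_point(tokens, tokenized_docs, axes=AXES):
    """Tf-idf coordinates: raw count on each axis, scaled by that word's idf."""
    return [tokens.count(word) * idf(word, tokenized_docs) for word in axes]


vectors = [tfidf_point(tokens, tokenized) for tokens in tokenized]
for doc, vec in zip(documents, vectors):
    print(doc, vec)
```

The "have" coordinate is 0 for every document, so the cloud collapses onto the puppy-cat plane: the puppy and cat documents sit at log 4 on their own axes, and the two remaining documents land on the origin.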
That’s not all, folks!
• Geometry is the key to understanding feature space and
machine learning
• Many other fun topics:
- Feature normalization
- Feature transformations
- Model regularization
• Dato is hiring! jobs@dato.com
@RainyData, @DatoInc
@datapopup
Thank You To Our Sponsors
Ad

More Related Content

What's hot (20)

Python standard 2022 Spring
Python standard 2022 SpringPython standard 2022 Spring
Python standard 2022 Spring
anyakichi
 
ゼロから始めるQ#
ゼロから始めるQ#ゼロから始めるQ#
ゼロから始めるQ#
Takayoshi Tanaka
 
2021 DMM Tech Vision
2021 DMM Tech Vision2021 DMM Tech Vision
2021 DMM Tech Vision
DMM.com
 
Emacs上のターミナルを最強に
Emacs上のターミナルを最強にEmacs上のターミナルを最強に
Emacs上のターミナルを最強に
Lintaro Ina
 
[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?
NAVER D2
 
オセロの終盤ソルバーを100倍以上高速化した話
オセロの終盤ソルバーを100倍以上高速化した話オセロの終盤ソルバーを100倍以上高速化した話
オセロの終盤ソルバーを100倍以上高速化した話
京大 マイコンクラブ
 
VPP事始め
VPP事始めVPP事始め
VPP事始め
npsg
 
[124]자율주행과 기계학습
[124]자율주행과 기계학습[124]자율주행과 기계학습
[124]자율주행과 기계학습
NAVER D2
 
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
Taehoon Kim
 
F#によるFunctional Programming入門
F#によるFunctional Programming入門F#によるFunctional Programming入門
F#によるFunctional Programming入門
bleis tift
 
SAT/SMTソルバの仕組み
SAT/SMTソルバの仕組みSAT/SMTソルバの仕組み
SAT/SMTソルバの仕組み
Masahiro Sakai
 
暗認本読書会10
暗認本読書会10暗認本読書会10
暗認本読書会10
MITSUNARI Shigeo
 
Pythonではじめる競技プログラミング
Pythonではじめる競技プログラミングPythonではじめる競技プログラミング
Pythonではじめる競技プログラミング
cocodrips
 
PostgreSQL失敗談
PostgreSQL失敗談PostgreSQL失敗談
PostgreSQL失敗談
Takashi Meguro
 
細かすぎて伝わらないD3 ver.4の話
細かすぎて伝わらないD3 ver.4の話細かすぎて伝わらないD3 ver.4の話
細かすぎて伝わらないD3 ver.4の話
清水 正行
 
GPUが100倍速いという神話をぶち殺せたらいいな ver.2013
GPUが100倍速いという神話をぶち殺せたらいいな ver.2013GPUが100倍速いという神話をぶち殺せたらいいな ver.2013
GPUが100倍速いという神話をぶち殺せたらいいな ver.2013
Ryo Sakamoto
 
ClojureではじめるSTM入門
ClojureではじめるSTM入門ClojureではじめるSTM入門
ClojureではじめるSTM入門
sohta
 
Freer Monads, More Extensible Effects
Freer Monads, More Extensible EffectsFreer Monads, More Extensible Effects
Freer Monads, More Extensible Effects
Hiromi Ishii
 
マルチコアとネットワークスタックの高速化技法
マルチコアとネットワークスタックの高速化技法マルチコアとネットワークスタックの高速化技法
マルチコアとネットワークスタックの高速化技法
Takuya ASADA
 
쫄지말자딥러닝2 - CNN RNN 포함버전
쫄지말자딥러닝2 - CNN RNN 포함버전쫄지말자딥러닝2 - CNN RNN 포함버전
쫄지말자딥러닝2 - CNN RNN 포함버전
Modulabs
 
Python standard 2022 Spring
Python standard 2022 SpringPython standard 2022 Spring
Python standard 2022 Spring
anyakichi
 
2021 DMM Tech Vision
2021 DMM Tech Vision2021 DMM Tech Vision
2021 DMM Tech Vision
DMM.com
 
Emacs上のターミナルを最強に
Emacs上のターミナルを最強にEmacs上のターミナルを最強に
Emacs上のターミナルを最強に
Lintaro Ina
 
[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?
NAVER D2
 
オセロの終盤ソルバーを100倍以上高速化した話
オセロの終盤ソルバーを100倍以上高速化した話オセロの終盤ソルバーを100倍以上高速化した話
オセロの終盤ソルバーを100倍以上高速化した話
京大 マイコンクラブ
 
VPP事始め
VPP事始めVPP事始め
VPP事始め
npsg
 
[124]자율주행과 기계학습
[124]자율주행과 기계학습[124]자율주행과 기계학습
[124]자율주행과 기계학습
NAVER D2
 
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
Taehoon Kim
 
F#によるFunctional Programming入門
F#によるFunctional Programming入門F#によるFunctional Programming入門
F#によるFunctional Programming入門
bleis tift
 
SAT/SMTソルバの仕組み
SAT/SMTソルバの仕組みSAT/SMTソルバの仕組み
SAT/SMTソルバの仕組み
Masahiro Sakai
 
Pythonではじめる競技プログラミング
Pythonではじめる競技プログラミングPythonではじめる競技プログラミング
Pythonではじめる競技プログラミング
cocodrips
 
細かすぎて伝わらないD3 ver.4の話
細かすぎて伝わらないD3 ver.4の話細かすぎて伝わらないD3 ver.4の話
細かすぎて伝わらないD3 ver.4の話
清水 正行
 
GPUが100倍速いという神話をぶち殺せたらいいな ver.2013
GPUが100倍速いという神話をぶち殺せたらいいな ver.2013GPUが100倍速いという神話をぶち殺せたらいいな ver.2013
GPUが100倍速いという神話をぶち殺せたらいいな ver.2013
Ryo Sakamoto
 
ClojureではじめるSTM入門
ClojureではじめるSTM入門ClojureではじめるSTM入門
ClojureではじめるSTM入門
sohta
 
Freer Monads, More Extensible Effects
Freer Monads, More Extensible EffectsFreer Monads, More Extensible Effects
Freer Monads, More Extensible Effects
Hiromi Ishii
 
マルチコアとネットワークスタックの高速化技法
マルチコアとネットワークスタックの高速化技法マルチコアとネットワークスタックの高速化技法
マルチコアとネットワークスタックの高速化技法
Takuya ASADA
 
쫄지말자딥러닝2 - CNN RNN 포함버전
쫄지말자딥러닝2 - CNN RNN 포함버전쫄지말자딥러닝2 - CNN RNN 포함버전
쫄지말자딥러닝2 - CNN RNN 포함버전
Modulabs
 

Viewers also liked (19)

EU Data Market study. Presentation at NESSI Summit 2014 IDC & Open Evidence
EU Data Market study. Presentation at NESSI Summit 2014 IDC & Open EvidenceEU Data Market study. Presentation at NESSI Summit 2014 IDC & Open Evidence
EU Data Market study. Presentation at NESSI Summit 2014 IDC & Open Evidence
Kasia Szkuta
 
Building Intelligent Data Products (Applied AI)
Building Intelligent Data Products (Applied AI)Building Intelligent Data Products (Applied AI)
Building Intelligent Data Products (Applied AI)
Stephen Whitworth
 
Adding machine learning to a web app
Adding machine learning to a web appAdding machine learning to a web app
Adding machine learning to a web app
Richard Dallaway
 
Building Intelligent Data Products
Building Intelligent Data ProductsBuilding Intelligent Data Products
Building Intelligent Data Products
Stephen Whitworth
 
Applied Computer Vision - a Deep Learning Approach
Applied Computer Vision - a Deep Learning ApproachApplied Computer Vision - a Deep Learning Approach
Applied Computer Vision - a Deep Learning Approach
Jose Berengueres
 
Launching Data Products for Fun and Profit
Launching Data Products for Fun and ProfitLaunching Data Products for Fun and Profit
Launching Data Products for Fun and Profit
Zach Gemignani
 
Video Marketing for Self-Storage: Get Real Online
Video Marketing for Self-Storage: Get Real OnlineVideo Marketing for Self-Storage: Get Real Online
Video Marketing for Self-Storage: Get Real Online
SpareFoot
 
EEON103 Хичээл 11
EEON103 Хичээл 11EEON103 Хичээл 11
EEON103 Хичээл 11
E-Gazarchin Online University
 
Developing Data Products
Developing Data ProductsDeveloping Data Products
Developing Data Products
Peter Skomoroch
 
The Most Expensive Cars in the World.
The Most Expensive Cars in the World.The Most Expensive Cars in the World.
The Most Expensive Cars in the World.
Severus Prime
 
Simformer. Инструкции по разработке бизнес курсов и тренингов
Simformer. Инструкции по разработке бизнес курсов и тренинговSimformer. Инструкции по разработке бизнес курсов и тренингов
Simformer. Инструкции по разработке бизнес курсов и тренингов
Sergey Menshikov
 
Suomalaiset yritykset lean managementin soveltajina
Suomalaiset yritykset lean managementin soveltajinaSuomalaiset yritykset lean managementin soveltajina
Suomalaiset yritykset lean managementin soveltajina
TechFinland
 
Unidad 4 grecia antigua
Unidad 4 grecia antiguaUnidad 4 grecia antigua
Unidad 4 grecia antigua
Lucas Chalub
 
The Shotfarm Product Information Report
The Shotfarm Product Information ReportThe Shotfarm Product Information Report
The Shotfarm Product Information Report
FrenchWeb.fr
 
Slideshare 8.12.15
Slideshare 8.12.15Slideshare 8.12.15
Slideshare 8.12.15
Melana Shah
 
Build Your Intranet With Office 365
Build Your Intranet With Office 365Build Your Intranet With Office 365
Build Your Intranet With Office 365
Richard Harbridge
 
How to Discover and Create Great Visual Content for Facebook
How to Discover and Create Great Visual Content for FacebookHow to Discover and Create Great Visual Content for Facebook
How to Discover and Create Great Visual Content for Facebook
Peg Fitzpatrick
 
今さら聞けない人のためのDocker超入門 CentOS 7.2対応版
今さら聞けない人のためのDocker超入門 CentOS 7.2対応版今さら聞けない人のためのDocker超入門 CentOS 7.2対応版
今さら聞けない人のためのDocker超入門 CentOS 7.2対応版
VirtualTech Japan Inc.
 
Scaling up Machine Learning Algorithms for Classification
Scaling up Machine Learning Algorithms for ClassificationScaling up Machine Learning Algorithms for Classification
Scaling up Machine Learning Algorithms for Classification
smatsus
 
EU Data Market study. Presentation at NESSI Summit 2014 IDC & Open Evidence
EU Data Market study. Presentation at NESSI Summit 2014 IDC & Open EvidenceEU Data Market study. Presentation at NESSI Summit 2014 IDC & Open Evidence
EU Data Market study. Presentation at NESSI Summit 2014 IDC & Open Evidence
Kasia Szkuta
 
Building Intelligent Data Products (Applied AI)
Building Intelligent Data Products (Applied AI)Building Intelligent Data Products (Applied AI)
Building Intelligent Data Products (Applied AI)
Stephen Whitworth
 
Adding machine learning to a web app
Adding machine learning to a web appAdding machine learning to a web app
Adding machine learning to a web app
Richard Dallaway
 
Building Intelligent Data Products
Building Intelligent Data ProductsBuilding Intelligent Data Products
Building Intelligent Data Products
Stephen Whitworth
 
Applied Computer Vision - a Deep Learning Approach
Applied Computer Vision - a Deep Learning ApproachApplied Computer Vision - a Deep Learning Approach
Applied Computer Vision - a Deep Learning Approach
Jose Berengueres
 
Launching Data Products for Fun and Profit
Launching Data Products for Fun and ProfitLaunching Data Products for Fun and Profit
Launching Data Products for Fun and Profit
Zach Gemignani
 
Video Marketing for Self-Storage: Get Real Online
Video Marketing for Self-Storage: Get Real OnlineVideo Marketing for Self-Storage: Get Real Online
Video Marketing for Self-Storage: Get Real Online
SpareFoot
 
Developing Data Products
Developing Data ProductsDeveloping Data Products
Developing Data Products
Peter Skomoroch
 
The Most Expensive Cars in the World.
The Most Expensive Cars in the World.The Most Expensive Cars in the World.
The Most Expensive Cars in the World.
Severus Prime
 
Simformer. Инструкции по разработке бизнес курсов и тренингов
Simformer. Инструкции по разработке бизнес курсов и тренинговSimformer. Инструкции по разработке бизнес курсов и тренингов
Simformer. Инструкции по разработке бизнес курсов и тренингов
Sergey Menshikov
 
Suomalaiset yritykset lean managementin soveltajina
Suomalaiset yritykset lean managementin soveltajinaSuomalaiset yritykset lean managementin soveltajina
Suomalaiset yritykset lean managementin soveltajina
TechFinland
 
Unidad 4 grecia antigua
Unidad 4 grecia antiguaUnidad 4 grecia antigua
Unidad 4 grecia antigua
Lucas Chalub
 
The Shotfarm Product Information Report
The Shotfarm Product Information ReportThe Shotfarm Product Information Report
The Shotfarm Product Information Report
FrenchWeb.fr
 
Slideshare 8.12.15
Slideshare 8.12.15Slideshare 8.12.15
Slideshare 8.12.15
Melana Shah
 
Build Your Intranet With Office 365
Build Your Intranet With Office 365Build Your Intranet With Office 365
Build Your Intranet With Office 365
Richard Harbridge
 
How to Discover and Create Great Visual Content for Facebook
How to Discover and Create Great Visual Content for FacebookHow to Discover and Create Great Visual Content for Facebook
How to Discover and Create Great Visual Content for Facebook
Peg Fitzpatrick
 
今さら聞けない人のためのDocker超入門 CentOS 7.2対応版
今さら聞けない人のためのDocker超入門 CentOS 7.2対応版今さら聞けない人のためのDocker超入門 CentOS 7.2対応版
今さら聞けない人のためのDocker超入門 CentOS 7.2対応版
VirtualTech Japan Inc.
 
Scaling up Machine Learning Algorithms for Classification
Scaling up Machine Learning Algorithms for ClassificationScaling up Machine Learning Algorithms for Classification
Scaling up Machine Learning Algorithms for Classification
smatsus
 
Ad

Similar to Understanding Feature Space in Machine Learning - Data Science Pop-up Seattle (20)

Get connected with python
Get connected with pythonGet connected with python
Get connected with python
Jan Kroon
 
AI Is Changing The Way We Look At Data Science
AI Is Changing The Way We Look At Data ScienceAI Is Changing The Way We Look At Data Science
AI Is Changing The Way We Look At Data Science
Abe
 
Ml3
Ml3Ml3
Ml3
poovarasu maniandan
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
Turi, Inc.
 
Understanding feature-space
Understanding feature-spaceUnderstanding feature-space
Understanding feature-space
Mihran Kalaydjian
 
Understanding Feature Space in Machine Learning
Understanding Feature Space in Machine LearningUnderstanding Feature Space in Machine Learning
Understanding Feature Space in Machine Learning
Alice Zheng
 
Design engineering with d3+react with-speaker-notes
Design engineering with d3+react with-speaker-notesDesign engineering with d3+react with-speaker-notes
Design engineering with d3+react with-speaker-notes
Lorraine Sawicki
 
The Road to Data Science - Joel Grus, June 2015
The Road to Data Science - Joel Grus, June 2015The Road to Data Science - Joel Grus, June 2015
The Road to Data Science - Joel Grus, June 2015
Seattle DAML meetup
 
Taking Your Website Mobile with TYPO3 (again)
Taking Your Website Mobile with TYPO3 (again)Taking Your Website Mobile with TYPO3 (again)
Taking Your Website Mobile with TYPO3 (again)
Jeremy Greenawalt
 
Integration-Monday-Logic-Apps-Tips-Tricks
Integration-Monday-Logic-Apps-Tips-TricksIntegration-Monday-Logic-Apps-Tips-Tricks
Integration-Monday-Logic-Apps-Tips-Tricks
BizTalk360
 
A Manager's Digitalization Agenda
A Manager's Digitalization Agenda A Manager's Digitalization Agenda
A Manager's Digitalization Agenda
Lindsey Parker
 
Recommender Trends 2014
Recommender Trends 2014Recommender Trends 2014
Recommender Trends 2014
Torben Brodt
 
Global Azure 2020 - Sandro Pereira - Logic apps: Best practices tips and tricks
Global Azure 2020 - Sandro Pereira - Logic apps: Best practices tips and tricksGlobal Azure 2020 - Sandro Pereira - Logic apps: Best practices tips and tricks
Global Azure 2020 - Sandro Pereira - Logic apps: Best practices tips and tricks
Sandro Pereira
 
Multimedia: Les 5
Multimedia: Les 5Multimedia: Les 5
Multimedia: Les 5
Erik Duval
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
Big Data Spain
 
The Tidyverse and the Future of the Monitoring Toolchain
The Tidyverse and the Future of the Monitoring ToolchainThe Tidyverse and the Future of the Monitoring Toolchain
The Tidyverse and the Future of the Monitoring Toolchain
John Rauser
 
Adapting Designers' tools, methodologies for the future
Adapting Designers' tools, methodologies for the futureAdapting Designers' tools, methodologies for the future
Adapting Designers' tools, methodologies for the future
Ariana Koblitz
 
What I tell myself before visualizing
What I tell myself before visualizingWhat I tell myself before visualizing
What I tell myself before visualizing
Krist Wongsuphasawat
 
Data Modeling Tricks for Neo4j
Data Modeling Tricks for Neo4jData Modeling Tricks for Neo4j
Data Modeling Tricks for Neo4j
Max De Marzi
 
The Why Of Ruby
The Why Of RubyThe Why Of Ruby
The Why Of Ruby
Brian Hogan
 
Get connected with python
Get connected with pythonGet connected with python
Get connected with python
Jan Kroon
 
AI Is Changing The Way We Look At Data Science
AI Is Changing The Way We Look At Data ScienceAI Is Changing The Way We Look At Data Science
AI Is Changing The Way We Look At Data Science
Abe
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
Turi, Inc.
 
Understanding Feature Space in Machine Learning
Understanding Feature Space in Machine LearningUnderstanding Feature Space in Machine Learning
Understanding Feature Space in Machine Learning
Alice Zheng
 
Design engineering with d3+react with-speaker-notes
Design engineering with d3+react with-speaker-notesDesign engineering with d3+react with-speaker-notes
Design engineering with d3+react with-speaker-notes
Lorraine Sawicki
 
The Road to Data Science - Joel Grus, June 2015
The Road to Data Science - Joel Grus, June 2015The Road to Data Science - Joel Grus, June 2015
The Road to Data Science - Joel Grus, June 2015
Seattle DAML meetup
 
Taking Your Website Mobile with TYPO3 (again)
Taking Your Website Mobile with TYPO3 (again)Taking Your Website Mobile with TYPO3 (again)
Taking Your Website Mobile with TYPO3 (again)
Jeremy Greenawalt
 
Integration-Monday-Logic-Apps-Tips-Tricks
Integration-Monday-Logic-Apps-Tips-TricksIntegration-Monday-Logic-Apps-Tips-Tricks
Integration-Monday-Logic-Apps-Tips-Tricks
BizTalk360
 
A Manager's Digitalization Agenda
A Manager's Digitalization Agenda A Manager's Digitalization Agenda
A Manager's Digitalization Agenda
Lindsey Parker
 
Recommender Trends 2014
Recommender Trends 2014Recommender Trends 2014
Recommender Trends 2014
Torben Brodt
 
Global Azure 2020 - Sandro Pereira - Logic apps: Best practices tips and tricks
Global Azure 2020 - Sandro Pereira - Logic apps: Best practices tips and tricksGlobal Azure 2020 - Sandro Pereira - Logic apps: Best practices tips and tricks
Global Azure 2020 - Sandro Pereira - Logic apps: Best practices tips and tricks
Sandro Pereira
 
Multimedia: Les 5
Multimedia: Les 5Multimedia: Les 5
Multimedia: Les 5
Erik Duval
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
Big Data Spain
 
The Tidyverse and the Future of the Monitoring Toolchain
The Tidyverse and the Future of the Monitoring ToolchainThe Tidyverse and the Future of the Monitoring Toolchain
The Tidyverse and the Future of the Monitoring Toolchain
John Rauser
 
Adapting Designers' tools, methodologies for the future
Adapting Designers' tools, methodologies for the futureAdapting Designers' tools, methodologies for the future
Adapting Designers' tools, methodologies for the future
Ariana Koblitz
 
What I tell myself before visualizing
What I tell myself before visualizingWhat I tell myself before visualizing
What I tell myself before visualizing
Krist Wongsuphasawat
 
Data Modeling Tricks for Neo4j
Data Modeling Tricks for Neo4jData Modeling Tricks for Neo4j
Data Modeling Tricks for Neo4j
Max De Marzi
 
Ad

More from Domino Data Lab (20)

What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...
Domino Data Lab
 
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...
Domino Data Lab
 
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops data
Domino Data Lab
 
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using it
Domino Data Lab
 
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentation
Domino Data Lab
 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive Industry
Domino Data Lab
 
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile Virus
Domino Data Lab
 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with Jupyter
Domino Data Lab
 
GeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceGeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data Science
Domino Data Lab
 
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field
Domino Data Lab
 
Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)
Domino Data Lab
 
Leveraged Analytics at Scale
Leveraged Analytics at ScaleLeveraged Analytics at Scale
Leveraged Analytics at Scale
Domino Data Lab
 
How I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataHow I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked Data
Domino Data Lab
 
Software Engineering for Data Scientists
Software Engineering for Data ScientistsSoftware Engineering for Data Scientists
Software Engineering for Data Scientists
Domino Data Lab
 
Making Big Data Smart
Making Big Data SmartMaking Big Data Smart
Making Big Data Smart
Domino Data Lab
 
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Domino Data Lab
 
Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technology
Domino Data Lab
 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science Tools
Domino Data Lab
 
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino Data Lab
 
The Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceThe Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data Science
Domino Data Lab
 
What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...
Domino Data Lab
 
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...
Domino Data Lab
 
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops data
Domino Data Lab
 
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using it
Domino Data Lab
 
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentation
Domino Data Lab
 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive Industry
Domino Data Lab
 
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile Virus
Domino Data Lab
 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Understanding Feature Space in Machine Learning - Data Science Pop-up Seattle