‘ANATOMY OF A DATA
SCIENCE PROJECT’
ADAM SROKA, SENIOR DATA SCIENTIST
WELCOME TO THE DECEMBER SCOTLAND DATA SCIENCE & TECHNOLOGY MEETUP
RICARDO ANTUNES, DATA SCIENTIST
INCREMENTAL GROUP
- with
&
MBN ACADEMY ARE THE OFFICIAL PARTNERS OF THE
DATA LAB MSC. PLACEMENT PROGRAMME 2018/19.
IF YOU ARE PART OF THE SCOTTISH BUSINESS
COMMUNITY AND INTERESTED IN BECOMING A
POTENTIAL HOST ORGANISATION, PLEASE EMAIL ROB
AT ACADEMY@MBNSOLUTIONS.COM
Anatomy of a Data Science
Project
Adam Sroka, Senior Data Scientist
adam.sroka@incrementalgroup.co.uk
@adzsroka
Why we exist
Digital technology
is changing
everything
Sustainable
success comes
from incremental
improvements
Our mission is to
enable
government and
industry to digitally
transform, one
step at a time
THE DATA SCIENCE
HIERARCHY OF NEEDS
COMPLEXITY
FROM SIMPLICITY
What makes a good project?
A few points to consider before you start
Ask yourself…
1. Will this be easy to deploy & use?
2. Will it be considerably better than existing solutions?
3. Can parts be automated, reproduced, and reused?
4. Is it easy to understand, explain, and test?
5. Is it technically interesting?
Data
Making your raw materials go further
Quality
• This is always the first step
• Build a reusable set of tools for measuring quality
• DataExplorer is a great start
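As a rough illustration of a reusable quality tool, the sketch below summarises missing values, distinct values, and observed types per column. It is not DataExplorer (an R package); the function name `quality_report` and the sample rows are hypothetical, and it assumes the data is a list of dictionaries.

```python
def quality_report(rows):
    """Summarise missingness, cardinality and types per column.

    `rows` is assumed to be a list of dicts; None and "" count as missing.
    """
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None and v != ""]
        missing = len(values) - len(present)
        report[col] = {
            "missing_fraction": missing / len(rows) if rows else 0.0,
            "distinct": len(set(present)),
            "types": sorted({type(v).__name__ for v in present}),
        }
    return report


# Hypothetical sample data with typical problems:
# a missing value, an empty string, a missing key, and mixed types.
sample = [
    {"id": 1, "age": 34, "city": "Glasgow"},
    {"id": 2, "age": None, "city": "Edinburgh"},
    {"id": 3, "age": "41", "city": ""},
    {"id": 4, "age": 29},
]

report = quality_report(sample)
for column, stats in report.items():
    print(column, stats)
```

Keeping checks like this in one function means the same report can be run on every new dataset before any modelling starts.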
Build pipelines
Models
Mastering the tools of the trade
“Never use a long word
where a short one will do.”
George Orwell
Complexity
• It’s easy to get excited about the new big thing
• It sometimes seems like expressing intelligence has taken priority over delivering value
• Marginal gains at the cost of understanding and interoperability aren’t gains at all
What works?
• What are other people doing?
• Are there similar problems to yours on Kaggle?
• Do you have any biases?
• Take a quick first pass with everything and review
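One possible reading of a "quick first pass" is to score a few very simple candidates on held-out data before investing in anything complex. The sketch below assumes a single numeric feature and mean absolute error as the metric; the candidate set, function names, and data are illustrative only.

```python
import statistics


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def first_pass(train_x, train_y, test_x, test_y):
    """Score each simple candidate and return (name, error), best first."""
    slope, intercept = fit_line(train_x, train_y)
    predictions = {
        "mean": [statistics.fmean(train_y)] * len(test_x),
        "median": [statistics.median(train_y)] * len(test_x),
        "linear": [slope * x + intercept for x in test_x],
    }
    scores = {
        name: mean_absolute_error(test_y, preds)
        for name, preds in predictions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical data with a roughly linear trend.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1]
test_x = [7, 8]
test_y = [14.0, 16.1]

ranking = first_pass(train_x, train_y, test_x, test_y)
for name, error in ranking:
    print(f"{name}: {error:.3f}")
```

If a more complex model cannot clearly beat the best of these, the extra complexity is hard to justify.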
Tools
A few things to make your life easier
Templates
• Figure out a template that works for you – then stick to it
• It makes moving between and sharing projects tolerable
• https://drivendata.github.io/cookiecutter-data-science/
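For a sense of what a template gives you, the sketch below creates a small, fixed project skeleton. The directory names are a simplified layout loosely inspired by the Cookiecutter Data Science template linked above, not its exact structure, and `create_skeleton` is a hypothetical helper rather than part of any tool.

```python
import tempfile
from pathlib import Path

# Assumed, simplified layout; adjust to whatever template works for you.
LAYOUT = [
    "data/raw",
    "data/processed",
    "notebooks",
    "src",
    "models",
    "reports/figures",
]


def create_skeleton(root, layout=LAYOUT):
    """Create the directories in `layout` under `root` and return them."""
    root = Path(root)
    created = []
    for relative in layout:
        directory = root / relative
        directory.mkdir(parents=True, exist_ok=True)
        # Placeholder file so empty directories survive version control.
        (directory / ".gitkeep").touch()
        created.append(directory)
    readme = root / "README.md"
    if not readme.exists():
        readme.write_text(f"# {root.name}\n")
    return created


# Demonstrate in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "demo-project"
    directories = create_skeleton(project)
    created_names = sorted(d.relative_to(project).as_posix() for d in directories)
    readme_text = (project / "README.md").read_text()

print(created_names)
```

In practice a tool such as Cookiecutter handles this, along with prompts and file contents; the point is only that every project starts from the same shape.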
Build Tools
• For longer-lasting projects, strongly consider build automation
• This will manage rebuilding what’s needed when you make a change
• Tools like Azure Pipelines, AWS CodeBuild, Luigi, or even Makefiles
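The core idea behind "rebuilding what's needed" can be sketched as a timestamp check in the style of Make: a target is rebuilt only if it is missing or older than one of its inputs. This is a simplified illustration, not how any of the tools above are implemented; `needs_rebuild`, `build`, and the file names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def needs_rebuild(target, dependencies):
    """True if `target` is missing or older than any dependency."""
    target = Path(target)
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(Path(dep).stat().st_mtime > target_mtime for dep in dependencies)


def build(target, dependencies, action):
    """Run `action` only when the target is stale; report whether it ran."""
    if needs_rebuild(target, dependencies):
        action()
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw.csv"
    clean = Path(tmp) / "clean.csv"
    raw.write_text("a,b\n1,2\n")

    def clean_data():
        # Stand-in for a real cleaning step.
        clean.write_text(raw.read_text().upper())

    first = build(clean, [raw], clean_data)   # target missing, so it runs

    # Set modification times explicitly so the demo does not depend on timing.
    os.utime(raw, (1000, 1000))
    os.utime(clean, (2000, 2000))
    second = build(clean, [raw], clean_data)  # target newer, so it is skipped

    os.utime(raw, (3000, 3000))               # simulate editing the raw data
    third = build(clean, [raw], clean_data)   # input newer, so it runs again

print(first, second, third)
```

Real build tools add dependency graphs, parallelism, and caching on top of this check, which is why they pay off on longer-lasting projects.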
Containers
• Package your entire workspace into easily manageable containers
• Makes reproducibility and sharing simple
• Many cloud platforms allow automatic distribution of containers to clusters
Resources
reddit.com/user/adzsroka/m/datascience
datatau.com
kagglenoobs.herokuapp.com
dataelixir.com
getrevue.co/profile/datamachina/
Thanks!
Adam Sroka, Senior Data Scientist
adam.sroka@incrementalgroup.co.uk
@adzsroka

Editor's Notes

  • #10: Ease of adoption – awful interfaces
  • #11: Is this an improvement https://medium.com/@EvanSinar/7-data-visualization-types-you-should-be-using-more-and-how-to-start-4015b5d4adf2
  • #12: https://aeon.co/essays/is-technology-making-the-world-indecipherable
  • #13: Reusability https://www.freeimageslive.co.uk/free_stock_image/building-concept-jpg
  • #16: https://gengo.ai/datasets/the-50-best-free-datasets-for-machine-learning/ https://digital.nhs.uk/ https://registry.opendata.aws/ https://data.europa.eu/euodp/en/data/ https://data.gov.uk/ https://www.data.gov/
  • #17: https://cran.r-project.org/web/packages/DataExplorer/vignettes/dataexplorer-intro.html
  • #18: https://icons8.com/icon/2297/gis