Store, Extract, Transform, Load, Visualize
Ani Lopez
@anilopez
linkedin.com/in/anilopez
What is this All About
In the beginning there was Data
Infrastructure &
Data Base Admins
BIs
Analysts
And everybody was fairly happy
Data got big and moved, requiring strong support
Which made Analysts’ work way harder
How do we solve that?
As long as you have
access to Sources,
and control over SETL,
you are ready to funk it up!
Go beyond the GA/AA interface. You have to
No need to be an engineer. You can do it
BigData is not scary anymore
This is about how you take over the process
with minimum or no technical knowledge
Analyze
Visualize
Store, Extract, Transform, Load
Automate!
Step 1. Storage
Typical sources
• Online traffic measuring tools like GA or AA
• Social media platforms
• Customer Relationship Management platforms
• Booking systems, Call centers, Retailing
• Telemetry
Data don't exist until they are stored somewhere
First challenge: get access
• Amount of sources: one, many, too many
• Access difficulty: simple, complicated, impossible
• Combinations of the above
Sources usually come with a Storing Solution
Yours
Why Our Own Storage?
Source
Source
Source
Source
Source
Safe
Why Our Own Storage?
Source
Source
Source
Types
• Internal
• Excel
• MSSQL / MySQL Server
• External or Cloud
• BigQuery, Cloud SQL, Bigtable, DataStorage
• Amazon Redshift
Build your Own Storage
If you are lucky
• All data in a decent storage. Nothing else to do!
• DB / Infrastructure Admins connect the pipes for you
If you aren’t
• Do it yourself, a little bit of coding comes in handy
• Cry for help
How?
Step 2. Extract
First
• From Sources to your Storage
• Minimum or no transformation at all
Second
• From your Storage to Intermediate tables
• Heavily transformed
Two moments of Extraction
Dirt cheap
• Next Analytics / BigQuery add-ins for Excel
• Supermetrics / OWOX BQ add-ins for Google Sheets
Careful
• They should be able to automate extraction
• If not, some scripting might be required
Tools for Extraction (I)
Data Integration Services
Not so cheap, no coding!
• Analytics Canvas
• Xplenty
• Alteryx
• Fivetran
• Mode
Tools for Extraction (II)
With a hand from DBAs and Engineers
• Google Cloud Dataflow
• Amazon Kinesis
Tools for Extraction (III)
Step 3. Transform
• Viz is important, transformation is key
• No good data = No SUCCESS
Transformation
First
• Data cleansing
• Data enrichment
• Consistency ensuring
Second
• Data modeling prior to analysis or visualization
Two moments of Transformation
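A minimal sketch of the first moment (cleansing, enrichment, consistency) in plain JavaScript. The row shape and the field names `city`, `date` and `delayMinutes` are assumptions made up for illustration; they do not come from the talk.

```javascript
// True when a value can safely be treated as a number.
function isNumeric(value) {
  return value !== null && value !== undefined &&
    String(value).trim() !== '' && isFinite(Number(value));
}

// Hypothetical raw rows: { city, date: 'YYYY-MM-DD', delayMinutes }.
function cleanAndEnrich(rows) {
  return rows
    // Cleansing: drop rows whose delay is missing or not numeric.
    .filter(function (row) {
      return isNumeric(row.delayMinutes);
    })
    .map(function (row) {
      var parsedDate = new Date(row.date + 'T00:00:00Z');
      return {
        // Consistency: one spelling per city (trimmed, upper case).
        city: String(row.city).trim().toUpperCase(),
        date: row.date,
        delayMinutes: Number(row.delayMinutes),
        // Enrichment: derive the weekday (0 = Sunday) from the date.
        weekday: parsedDate.getUTCDay()
      };
    });
}
```

In a real setup the same three steps would more likely live in the SQL that fills the intermediate tables; the JavaScript form is only meant to make each step visible.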
• SQL is the tool to answer complex business questions
• It can take you to the BI realm = more $$$ :-D
• A bit of code takes you further
• modeanalytics.com --> Resources
Learn SQL and some JS/Python
Store, Extract, Transform, Load, Visualize. Untagged Conference
Step 4. Load
Why not connect the Viz tool directly to Storage?
• They die when the volume of data is huge
• Limited options for transformation
Solution
• Automate materialization to intermediate tables
• Feed Viz tools from those tables
Feed the Viz
Rows: 3,706M
Total time: 180 secs
CPU time: 1.7 days
Rows: 2.3M
Total time: 18 secs
CPU time: 17 secs
Flight delays
1 year of data
Extract only November
10% sample of that
Quick guess
What city and day of November had highest delays?
And you need some
quick charts too
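The quick guess above can be sketched on a small in-memory sample. This is not the query from the talk: the row fields (`city`, `date`, `delayMinutes`) are hypothetical, and "highest delays" is read here as the highest average delay per city and day, which is an assumption.

```javascript
// Rows are hypothetical: { city, date: 'YYYY-MM-DD', delayMinutes }.
function worstCityDay(rows, month) {
  var totals = {};
  rows.forEach(function (row) {
    // Keep only the requested month (1-12), read from the date string.
    if (Number(row.date.slice(5, 7)) !== month) return;
    var key = row.city + '|' + row.date;
    if (!totals[key]) {
      totals[key] = { city: row.city, date: row.date, sum: 0, count: 0 };
    }
    totals[key].sum += row.delayMinutes;
    totals[key].count += 1;
  });

  // Pick the city/day group with the highest average delay.
  var worst = null;
  Object.keys(totals).forEach(function (key) {
    var group = totals[key];
    var avg = group.sum / group.count;
    if (worst === null || avg > worst.avgDelay) {
      worst = { city: group.city, date: group.date, avgDelay: avg };
    }
  });
  return worst; // null when no row falls in the requested month
}
```

On a table of real size the same logic belongs in a SQL `GROUP BY` run inside the storage layer, which is the point of the timings shown above.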
If you don’t know SQL
Xplenty
If you know SQL
Step 5. Visualize
• A dashboard is not the same as a visual analysis tool
• Insights don't come from either of those
• Insights are the outcome of the analyst’s work
Let’s get some stuff straight
• Objective of the visualization itself, representative or exploratory
• Interactivity requirements (on click drill down?)
• Maturity of client's Measurement Culture
• The data consumer's role: CEO, Analyst, Media planner
• Size of the audience and distribution needs
• Available infrastructure
• Data governance and its requirements
• Time to finish the project
• Budget
• Politics
Viz: Factors determining What & How to use
• All of them
• From humble Excel
• To big guys like Qlik and Tableau
• And the middle ones like Data Studio
• Desktop or online solutions
• Coding your own (D3.js)? Interesting but resource-intensive,
not agile for those just creating / distributing dashboards
Viz Tools?
• Lady Gaga KO
• Tron Legacy KO
• Minimal OK
3 Styles of Dashboards
• Those using Excel default charts deserve the worst
• Same with the new shiny thing: Data Studio
What dashboards made with default styles look like to me
• Never use Excel default charts or Data Studio templates
• Read about art
• Modern Art by Giulio Carlo Argan
• Focus on: Rationalism / Minimalism / Functionalism
• Follow Viz masters
• Edward Tufte, Stephen Few, Robert Kosara, Alberto Cairo
For Fuck's Sake, Educate your Aesthetics!
Examples
Viz
1. Franchise Based Business
SETLV all at once
Windows Task
Scheduler
Online Source
Internal Store
Offline Source
Server
Plotly + Shiny
2. Large Department Store Group. First Setup
Transform
& Viz
to Storage
Online Source
Internal Store
Offline Source
Server
2. Large Department Store Group. Second Setup
Transform
& Load
Viz
to Storage
Storage
Viz
to Storage
3. Sports Equipment Company
Transform
GA
Views
Load
.tde
Live Example
Automated ETL with BigQuery + Apps Script
$0.0, 30 lines of code, 10 minutes
Scheduled
Transformation
Small & Fast
BQ Table
Visualization Tool
of your choice
Huge
BQ Table
Source Table
Destination Table
SQL QUERY doing the Transformation
We want
• To run the transformation every day/week/month
• Append results to existing table feeding the visualization tool
We need
• Your Transforming Query + SQL minifier
• Google Sheets + Apps Script (JavaScript)
Destination Table
Process
• Open a new Google Sheet
• Go to Tools > Script Editor
In Script Editor go to Resources
• Advanced Google Services: Enable BigQuery API
• Developers Console Project: Project Number (of the project
where the tables live)
• Place the script and tweak accordingly. Save and schedule
Google Sheets
function saveQueryToTable() {
  // Read the filter value from cell B2 of the spreadsheet
  // (named previousDay here, but the query below uses it as the month number)
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Sheet1');
  var previousDay = sheet.getRange('B2').getValue();

  // Query (legacy SQL syntax: [project:dataset.table])
  var sql = 'SELECT date, COUNT(*) FROM [bigquery-146904:test_datasets.flights_MASTER] WHERE YEAR(date)=2012 AND MONTH(date)=' + previousDay + ' GROUP BY date';

  // Destination table details
  var projectId = 'bigquery-XXXXXX';
  var datasetId = 'test_datasets';
  var newTableId = 'flights_2012';

  // Job definition: run the query and append its rows to the destination table
  var job = {
    configuration: {
      query: {
        query: sql,
        writeDisposition: 'WRITE_APPEND',
        destinationTable: {
          projectId: projectId,
          datasetId: datasetId,
          tableId: newTableId
        }
      }
    }
  };

  // Job execution: submit the job and log its initial status
  var queryResults = BigQuery.Jobs.insert(job, projectId);
  Logger.log(queryResults.status);
}
JS Script
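The script above concatenates a spreadsheet cell straight into the query text. One possible hardening, sketched here, is to check the value before building the SQL and to keep the job definition in a small helper. The function names `monthlyCountSql` and `buildAppendJob` are made up for this sketch; the job shape is the one the script already passes to `BigQuery.Jobs.insert`.

```javascript
// Builds the query text, refusing anything that is not a plain year/month number.
function monthlyCountSql(table, year, month) {
  [year, month].forEach(function (value) {
    if (!Number.isInteger(value)) {
      throw new Error('year and month must be integers, got: ' + value);
    }
  });
  if (month < 1 || month > 12) {
    throw new Error('month must be between 1 and 12, got: ' + month);
  }
  return 'SELECT date, COUNT(*) FROM [' + table + '] ' +
    'WHERE YEAR(date)=' + year + ' AND MONTH(date)=' + month + ' GROUP BY date';
}

// Wraps a query in a job definition that appends to the destination table.
function buildAppendJob(sql, projectId, datasetId, tableId) {
  return {
    configuration: {
      query: {
        query: sql,
        writeDisposition: 'WRITE_APPEND', // add rows, keep the existing ones
        destinationTable: {
          projectId: projectId,
          datasetId: datasetId,
          tableId: tableId
        }
      }
    }
  };
}
```

Inside `saveQueryToTable` the two helpers would replace the hand-built `sql` and `job` variables; everything else stays the same.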
Schedule
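Scheduling can be done from the Script Editor's triggers menu, or in code. A minimal sketch of the code route, using the Apps Script `ScriptApp.newTrigger(...).timeBased()` builder. The helper name `scheduleDaily` and the choice of 6 o'clock are assumptions for illustration; `scriptApp` is passed in as a parameter only so the function can also be tried outside Apps Script.

```javascript
// scriptApp is injected; inside the Script Editor pass the built-in ScriptApp.
function scheduleDaily(scriptApp, handlerName, hour) {
  return scriptApp.newTrigger(handlerName)
    .timeBased()
    .everyDays(1)   // run once a day
    .atHour(hour)   // hour of day (0-23) in the script's time zone
    .create();
}

// In Apps Script, run once by hand:
//   scheduleDaily(ScriptApp, 'saveQueryToTable', 6);
```

Run it only once: each call creates a new trigger, so repeated calls would schedule the transformation several times a day and append duplicate rows.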
Almost there
• Don’t try to sell stakeholders the megaproject of your life
• Start small and simple, get buy-in, grow little by little
• Plan SETLV carefully according to circumstances
• Don’t just buy the first vendor solution presented
• Many solutions out there, ask for demos
• It tends to get messy, don’t panic
$0.02 more of advice
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
James Francis Paradigm Asset Management
 
Digilocker under workingProcess Flow.pptx
Digilocker  under workingProcess Flow.pptxDigilocker  under workingProcess Flow.pptx
Digilocker under workingProcess Flow.pptx
satnamsadguru491
 
FPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptxFPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptx
ssuser4ef83d
 
Calories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptxCalories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptx
TijiLMAHESHWARI
 
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Abodahab
 
Defense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptxDefense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptx
Greg Makowski
 
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
VKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptxVKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
VKS-Python-FIe Handling text CSV Binary.pptx
VKS-Python-FIe Handling text CSV Binary.pptxVKS-Python-FIe Handling text CSV Binary.pptx
VKS-Python-FIe Handling text CSV Binary.pptx
Vinod Srivastava
 
Medical Dataset including visualizations
Medical Dataset including visualizationsMedical Dataset including visualizations
Medical Dataset including visualizations
vishrut8750588758
 
computer organization and assembly language.docx
computer organization and assembly language.docxcomputer organization and assembly language.docx
computer organization and assembly language.docx
alisoftwareengineer1
 
Flip flop presenation-Presented By Mubahir khan.pptx
Flip flop presenation-Presented By Mubahir khan.pptxFlip flop presenation-Presented By Mubahir khan.pptx
Flip flop presenation-Presented By Mubahir khan.pptx
mubashirkhan45461
 
C++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptxC++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptx
aquibnoor22079
 
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjks
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjksPpt. Nikhil.pptxnshwuudgcudisisshvehsjks
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjks
panchariyasahil
 
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
Simran112433
 
Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
Developing Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response ApplicationsDeveloping Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response Applications
VICTOR MAESTRE RAMIREZ
 
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbbEDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
JessaMaeEvangelista2
 
Secure_File_Storage_Hybrid_Cryptography.pptx..
Secure_File_Storage_Hybrid_Cryptography.pptx..Secure_File_Storage_Hybrid_Cryptography.pptx..
Secure_File_Storage_Hybrid_Cryptography.pptx..
yuvarajreddy2002
 
Thingyan is now a global treasure! See how people around the world are search...
Thingyan is now a global treasure! See how people around the world are search...Thingyan is now a global treasure! See how people around the world are search...
Thingyan is now a global treasure! See how people around the world are search...
Pixellion
 

Store, Extract, Transform, Load, Visualize. Untagged Conference

  • 1. Store, Extract, Transform, Load, Visualize
  • 3. What is this All About
  • 4. In the beginning there was Data
  • 5. Infrastructure & Database Admins, BIs, Analysts. And everybody was fairly happy
  • 6. Data got big & moved in need of strong support
  • 7. What made Analysts’ work way harder
  • 8. How do we solve that?
  • 9. As long as you have access to Sources, and control over SETL, you are ready to funk it up!
  • 10. Go beyond the GA/AA interface. You have to. No need to be an engineer. You can do it. BigData is not scary anymore
  • 11. This is about how you take over the process with minimum or no technical knowledge: Store, Extract, Transform, Load, Analyze, Visualize. Automate!
  • 13. Typical sources • Online traffic measuring tools like GA or AA • Social media platforms • Customer Relationship Management platforms • Booking systems, Call centers, Retailing • Telemetry Data don't exist till fixed somewhere
  • 14. First challenge: get access • Amount of sources: one, many, too many • Access difficulty: simple, complicated, impossible • Combinations of the above Sources usually come with a Storing Solution
  • 15. Why Our Own Storage? Yours. [Diagram: five Source boxes]
  • 16. Why Our Own Storage? Safe. [Diagram: three Source boxes]
  • 17. Types • Internal • Excel • MSSQL / MySQL Server • External or Cloud • BigQuery, Cloud SQL, Big Table, DataStorage • Amazon Redshift Build your Own Storage
  • 18. If you are lucky • All data in a decent storage. Nothing else to do! • DB / Infrastructure Admins connect the pipes for you If you are not • Do it yourself, a little bit of coding comes in handy • Cry for help How?
  • 20. First • From Sources to your Storage • Minimum or no transformation at all Second • From your Storage to Intermediate tables • Heavily transformed Two moments of Extraction
  • 21. Dirt cheap • Next Analytics / BigQuery add-ins for Excel • Supermetrics / OWOX BQ add-ins for Google Sheets Careful • They should be able to automate extraction • If not, some scripting might be required Tools for Extraction (I)
  • 22. Data Integration Services Not so cheap, no coding! • Analytics Canvas • Xplenty • Alteryx • Fivetran • Mode Tools for Extraction (II)
  • 23. With a hand from DBAs and Engineers • Google Cloud Dataflow • Amazon Kinesis Tools for Extraction (III)
  • 25. • Viz is important, transformation is key • No good data = No SUCCESS Transformation
  • 26. First • Data cleansing • Data enrichment • Ensuring consistency Second • Data Modeling prior to analysis or visualization Two moments of Transformation
  • 27. • SQL is the tool to answer complex business questions • It can take you to the BI realm = more $$$ :-D • A bit of code takes you further • modeanalytics.com --> Resources Learn SQL and some JS/Python
  • 30. Why not connect the Viz tool directly to Storage? • Viz tools die when the volume of data is huge • Limited options for transformation Solution • Automate materialization to intermediate tables • Feed Viz tools from those tables Feed the Viz
  • 32. Rows: 3,706M, Total time: 180 secs, CPU time: 1.7 days vs. Rows: 2.3M, Total time: 18 secs, CPU time: 17 secs
  • 33. Flight delays: 1 year of data. Extract only November, 10% sample of that. Quick guess: what city and day of November had the highest delays?
  • 34. And you need some quick charts too
  • 35. If you don’t know SQL: Xplenty
  • 38. • A dashboard is not the same as a visual analysis tool • Insights don't come from either of those • Insights are the outcome of the analyst’s work Let’s get some stuff straight
  • 39. • Objective of the visualization itself, representative or exploratory • Interactivity requirements (on click drill down?) • Maturity of client's Measurement Culture • What's the data consumer's role: CEO, Analyst, Media planner • Size of the audience and distribution needs • Available infrastructure • Data governance and its requirements • Time to finish the project • Budget • Politics Viz: Factors determining What & How to use
  • 40. • All of them • From humble Excel • To big guys like Qlik and Tableau • And the middle ones like Data Studio • Desktop or online solutions • Coding your own (D3.js)? Interesting but resource-intensive, not agile for those just creating / distributing dashboards Viz Tools?
  • 41. • Lady Gaga KO • Tron Legacy KO • Minimal OK 3 Styles of Dashboards
  • 53. • Those using Excel default charts deserve the worst • Same with the new shiny thing: Data Studio
  • 54. What dashboards made with default styles look like to me
  • 55. • Never use Excel default charts or Data Studio templates • Read about art • Modern Art by Giulio Carlo Argan • Focus on: Rationalism / Minimalism / Functionalism • Follow Viz masters • Edward Tufte, Stephen Few, Robert Kosara, Alberto Cairo For Fuck’s Sake, Educate your Aesthetics!
  • 57. 1. Franchise Based Business. SETLV all at once, Windows Task Scheduler, Viz
  • 58. 2. Large Department Store Group. First Setup: Online Source, Offline Source, Internal Store, Server, Plotly + Shiny, Transform & Viz, to Storage
  • 59. 2. Large Department Store Group. Second Setup: Online Source, Offline Source, Internal Store, Server, Transform & Load, Viz, to Storage
  • 60. 3. Sports Equipment Company: GA Views, to Storage, Storage, Transform, Load .tde, Viz
  • 62. Automated ETL with BigQuery + Apps Script: $0.0, 30 lines of code, 10 minutes. Huge BQ Table --> Scheduled Transformation --> Small & Fast BQ Table --> Visualization Tool of your choice
  • 65. SQL QUERY doing the Transformation
  • 66. We want • To run the transformation every day/week/month • Append results to existing table feeding the visualization tool We need • Your Transforming Query + SQL minifier • Google Sheets + Apps Script (JavaScript) Destination Table
  • 67. Process • Open a new Google Sheet • Go to Tools > Script Editor • In Script Editor go to Resources • Advanced Google Services: Enable BigQuery API • Developers Console Project: Project Number (of the project where tables live) • Place the script and tweak accordingly. Save and schedule Google Sheets
  • 68. JS Script
        function saveQueryToTable() {
          // Get previous day from cell B2 in spreadsheet
          // (note: the value is compared against MONTH(date) in the query below)
          var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Sheet1');
          var previousDay = sheet.getRange("B2").getValue();

          // Query
          var sql = 'SELECT date, COUNT(*) FROM [bigquery-146904:test_datasets.flights_MASTER] WHERE YEAR(date)=2012 AND MONTH(date)=' + previousDay + ' GROUP BY date';

          // Table destination details
          var projectId = 'bigquery-XXXXXX';
          var datasetId = 'test_datasets';
          var newTableId = 'flights_2012';

          // Job definition: append the query results to the destination table
          var job = {
            configuration: {
              query: {
                query: sql,
                writeDisposition: 'WRITE_APPEND',
                destinationTable: {
                  projectId: projectId,
                  datasetId: datasetId,
                  tableId: newTableId
                }
              }
            }
          };

          // Job execution
          var queryResults = BigQuery.Jobs.insert(job, projectId);
          Logger.log(queryResults.status);
        }
  • 71. • Don’t try to sell stakeholders the megaproject of your life • Start small and simple, get buy-in, grow little by little • Plan SETLV carefully according to circumstances • Don’t just buy the first vendor solution presented • Many solutions out there, ask for demos • It tends to get messy, don’t panic $0.02 more of advice
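
The daily append described in slides 66 to 68 could be parameterized roughly as follows. This is a minimal sketch under stated assumptions, not the author's script: the helper names (`previousDayParts`, `buildDailySql`, `buildAppendJob`), the `flights` column alias and the added `DAY(date)` filter are hypothetical, and the project and table IDs are the placeholders from slide 68. It is plain JavaScript, so the date and job-building logic can be checked outside Apps Script.

```javascript
// Hypothetical helpers; names are illustrative and do not come from the slides.

// Return the calendar day before `today` as { year, month, day } (UTC).
function previousDayParts(today) {
  // Date.UTC normalizes day 0 to the last day of the previous month.
  var d = new Date(Date.UTC(
    today.getUTCFullYear(), today.getUTCMonth(), today.getUTCDate() - 1));
  return {
    year: d.getUTCFullYear(),
    month: d.getUTCMonth() + 1, // getUTCMonth() is zero-based
    day: d.getUTCDate()
  };
}

// Build a legacy-SQL query that aggregates a single day of the source table.
// Assumes `date` is a column that YEAR(), MONTH() and DAY() can be applied to.
function buildDailySql(sourceTable, parts) {
  return 'SELECT date, COUNT(*) AS flights FROM [' + sourceTable + '] ' +
    'WHERE YEAR(date)=' + parts.year +
    ' AND MONTH(date)=' + parts.month +
    ' AND DAY(date)=' + parts.day +
    ' GROUP BY date';
}

// Build the job body that appends the query results to a destination table.
function buildAppendJob(sql, projectId, datasetId, tableId) {
  return {
    configuration: {
      query: {
        query: sql,
        useLegacySql: true, // the [project:dataset.table] syntax is legacy SQL
        writeDisposition: 'WRITE_APPEND',
        destinationTable: {
          projectId: projectId,
          datasetId: datasetId,
          tableId: tableId
        }
      }
    }
  };
}

// Usage: build the job for the day before 1 March 2012.
var parts = previousDayParts(new Date(Date.UTC(2012, 2, 1)));
var sql = buildDailySql('bigquery-XXXXXX:test_datasets.flights_MASTER', parts);
var job = buildAppendJob(sql, 'bigquery-XXXXXX', 'test_datasets', 'flights_2012');
console.log(JSON.stringify(job, null, 2));
```

Inside Apps Script, with the BigQuery advanced service enabled as on slide 67, the resulting object would be handed to `BigQuery.Jobs.insert(job, projectId)` exactly as in slide 68. Computing the date in the script rather than reading it from cell B2 is a design choice of this sketch; it removes one manual step when the function runs on a schedule.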