Gary Orenstein
Chief Product Officer
April 2019
How Yellowbrick Integrates into Existing Environments
In today’s talk
Market Drivers
Yellowbrick Data Warehouse Overview and Landscape
Data Management Evolution
How to Introduce Yellowbrick to Existing Environments
Security Application Demonstration
Market Drivers
Demands on data organizations
Serve more users, internally and externally
Provide more analytic services simultaneously
• New analytic applications
• Ad hoc business intelligence
• Operational dashboards
• All driving need for fast loading and high performance
Save money with deployment flexibility and consolidation
The Yellowbrick Data Warehouse
Yellowbrick fits enterprise needs
Always on and available
Ad-hoc SQL queries
Correct answers on any schema
Terabytes to petabytes of data
Mixed real-time inserts, ETL, batch, interactive workloads
Thousands of concurrent users
Yellowbrick Data Warehouse attributes
Real-time Feeds: Ingest IoT or OLTP data; capture 100,000s of rows per second
Interactive Applications: Serve short queries in under 100 milliseconds
Periodic Bulk Loads: Capture terabytes of data, petabytes over time
Powerful Analytics: Respond to complex ad hoc BI queries in just a few seconds
Load and Transform: Use existing ETL tools including intensive push-down ELT
Business Critical Reporting: Workload management for prioritized responses
PostgreSQL compatible
The Yellowbrick Data Warehouse
PURPOSE-BUILT ALL FLASH SQL ENGINE
From tens of terabytes to 2.4 petabytes
MPP architecture
Start small
Grow compute and storage
DEPLOY IN THE DATA CENTER OR CLOUD
Data warehousing approaches
SCALE UP
- Go as far as a single server can go
- SQL Server, Oracle
SCALE OUT
- Add servers continuously for scale (parallel systems)
- IBM Netezza, Greenplum, Vertica, Teradata
- Oracle Exadata, SAP HANA
- Cloud-only: AWS Redshift, Snowflake
SCALE OUT with minimal resources
- Maximum users, most amount of data, minimum resources
- Couple with workload management
- Yellowbrick Data
Data Management Evolution
Simplified evolution of data management
Enterprise Data Warehouse model
• Consolidate one or multiple application data sets into a data warehouse
Desire to capture all Internet data led to adoption of a data lake
• However, MapReduce was challenging
SQL-as-a-Layer provides some relief
• But SQL on a file system IS NOT a data warehouse
Modern architecture for scalable SQL analytics
[Diagram: Incoming Data]
- Structured and semi-structured: The Yellowbrick Data Warehouse, 1000s of users (Applications, BI analysts, Data engineers)
- Unstructured data: Data Lake, Data science
- High value data moves to EDW
Integration Recommendations
“The [query] performance improvements we saw were from 3x to 10x improvement basically running them as is. We had six engineers touch the system and all of them found the system very easy to use because there was a lot of commonality with the existing systems that we already had.”
Retail analytics provider
yellowbrick.com/symphony
Integrating with Yellowbrick
Data Ingest
- Directly from SQL applications
- Via real-time streams from Kafka
- Transformation with Spark
Client interoperability with PostgreSQL
Interactive Applications: Build new analytical applications
Bulk Loading: Use YBLOAD to move flat files at 1 GB/s, and up to 6 TB/hr
Powerful BI Analytics: Respond to ad hoc BI queries from MicroStrategy, Tableau, Business Objects and more
Load and Transform
- Informatica, Attunity, Talend, Syncsort
- Spark ETL
Business Critical Reporting: Build prioritized responses and multi-department support with workload management
Yellowbrick Database Details
ANSI SQL, stored procedures, views, schemas, workload management
Data Mining
SAS, R, Python
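The bulk loading rates quoted on this slide (1 GB/s, and up to 6 TB/hr) can be turned into rough load-time estimates. The following is a minimal sketch, not a Yellowbrick tool: the function names and the 100 TB dataset size are hypothetical, and decimal units (1 TB = 1000 GB) are assumed.

```python
def gb_per_s_to_tb_per_hr(rate_gb_per_s: float) -> float:
    """Convert GB/s to TB/hr, assuming decimal units (1 TB = 1000 GB)."""
    return rate_gb_per_s * 3600 / 1000


def hours_to_load(dataset_tb: float, rate_tb_per_hr: float) -> float:
    """Hours needed to load dataset_tb terabytes at a sustained rate."""
    if rate_tb_per_hr <= 0:
        raise ValueError("rate_tb_per_hr must be positive")
    return dataset_tb / rate_tb_per_hr


if __name__ == "__main__":
    dataset_tb = 100.0  # hypothetical dataset size, for illustration only
    rates = [
        ("1 GB/s", gb_per_s_to_tb_per_hr(1.0)),  # rate quoted on the slide
        ("6 TB/hr", 6.0),                        # upper rate quoted on the slide
    ]
    for label, rate in rates:
        hours = hours_to_load(dataset_tb, rate)
        print(f"{label}: {rate:.1f} TB/hr, {dataset_tb:.0f} TB in {hours:.1f} h")
```

Under these assumptions 1 GB/s works out to 3.6 TB/hr, so the two figures on the slide describe different rates rather than the same rate in two units. Real load times will depend on file format, network, and system configuration.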
Data warehousing approaches
SCALE UP
- Go as far as a single server can go
- Single Server: SQL Server, Oracle
SCALE OUT
- Add servers continuously for scale (parallel systems)
- Traditional MPP: IBM Netezza, Greenplum, Vertica, Teradata
- Preconfigured: Oracle Exadata, SAP HANA
- Cloud-only: AWS Redshift, Snowflake
SCALE OUT with minimal resources
- Maximum users, most amount of data, minimum resources
- Couple with workload management
- Yellowbrick Data
Integrating (upgrading) single server databases
When should you move?
- Outgrown a single server
- Managing too many independent databases
Benefits with Yellowbrick
- More capacity (compute and historical data)
- Simplified operation
- Same well-known SQL models
GROWTH BEYOND A SINGLE SERVER
TOO MANY DISPARATE SYSTEMS
YELLOWBRICK DATA WAREHOUSE
Integrating (upgrading) traditional MPP systems
When should you move?
- Netezza
  - NOW! Prepare for end-of-support
- Vertica/Greenplum
  - When you need more active data warehouse development
- Teradata
  - When you are intent on saving millions of dollars and willing to make selective changes to your workflow
Benefits with Yellowbrick
- For Netezza, Greenplum, Vertica users
  - Same PostgreSQL!
- For all traditional MPP users
  - Massive savings in legacy footprint while achieving 3-100x query improvements
Integrating (upgrading) ‘pre-configured’ systems
When should you move?
- When there is a corporate objective to reduce infrastructure costs from Oracle or SAP
- When there is an understanding that core apps can remain on ‘pre-configured’ systems, and analytics can move to open, standards-based data warehouses
- When you want to save big $$$
Benefits with Yellowbrick
- Significantly lower costs due to a flash-centric architecture instead of a DRAM-centric architecture
- Ability to open analytic access to multiple departments and teams simultaneously without incurring excessive costs
PRECONFIGURED SYSTEM
Integrating (upgrading) cloud-only systems
When should you move?
• When your core data and analytics engine has become an unpredictable and out-of-control cost
• When you have reached a performance or concurrency limit with cloud-only solutions
• When you prefer the control of a hybrid environment, in the cloud with options for on-premises
Benefits with Yellowbrick
- Redshift is also a PostgreSQL-compatible datastore, making for easy migration
- Yellowbrick offers cloud and on-premises options with predictable costs
CLOUD-ONLY
Security Demonstration
DEMO Details
Netflow dataset from a Spanish ISP
Intrusion detection system testing
Network capture over a six-month period
Yellowbrick Showcase
- Perfect solution with Tableau, MicroStrategy or any BI
- Ideal solution for concurrent queries on fewer/smaller systems
How Yellowbrick Data Integrates to Existing Environments Webcast
Customer videos at www.yellowbrick.com
Questions and Answers
THANK YOU
yellowbrick.com
SEEING IS BELIEVING
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Ad

How Yellowbrick Data Integrates to Existing Environments Webcast

  • 1. Gary Orenstein Chief Product Officer April 2019 How Yellowbrick Integrates into Existing Environments
  • 2. In today’s talk Market Drivers Yellowbrick Data Warehouse Overview and Landscape Data Management Evolution How to Introduce Yellowbrick to Existing Environments Security Application Demonstration
  • 4. Demands on data organizations Serve more users, internally and externally Provide more analytic services simultaneously • New analytic applications • Ad hoc business intelligence • Operational dashboards • All driving need for fast loading and high performance Save money with deployment flexibility and consolidation
  • 6. Yellowbrick fits enterprise needs Always on and available Ad-hoc SQL queries Correct answers on any schema Terabytes to petabytes of data Mixed real-time inserts, ETL, batch, interactive workloads Thousands of concurrent users
  • 7. Yellowbrick Data Warehouse attributes Real-time Feeds Ingest IoT or OLTP data Capture 100,000s of rows per second Interactive Applications Serve short queries in under 100 milliseconds Periodic Bulk Loads Capture terabytes of data, petabytes over time Powerful Analytics Respond to complex ad hoc BI queries in just a few seconds Load and Transform Use existing ETL tools including intensive push-down ELT Business Critical Reporting Workload management for prioritized responses PostgreSQL compatible
  • 8. The Yellowbrick Data Warehouse PURPOSE-BUILT ALL FLASH SQL ENGINE From tens of terabytes to 2.4 petabytes MPP architecture Start small Grow compute and storage DEPLOY IN THE DATA CENTER OR CLOUD
  • 9. Data warehousing approaches SCALE UP SCALE OUT SCALE OUT with minimal resources Go as far as a single server can go Add servers continuously for scale (parallel systems) Maximum users, most amount of data, minimum resources Couple with workload management SQL Server Oracle IBM Netezza Greenplum Vertica Teradata Oracle Exadata SAP HANA Cloud-only AWS Redshift Snowflake Yellowbrick Data
  • 11. Simplified evolution of data management Enterprise Data Warehouse model • Consolidate one or multiple application data sets into a data warehouse Desire to capture all Internet data led to adoption of a data lake • However, MapReduce was challenging SQL-as-a-Layer provides some relief • But SQL on a file system IS NOT a data warehouse SQL as a Layer
  • 12. Incoming Data Structured and semi-structured The Yellowbrick Data Warehouse 1000s of users (Applications, BI analysts, Data engineers) High value data moves to EDW Unstructured data Data Lake Data science Modern architecture for scalable SQL analytics
  • 14. Retail analytics provider “The [query] performance improvements we saw were from 3x to 10x improvement basically running them as is. We had six engineers touch the system and all of them found the system very easy to use because there was a lot of commonality with the existing systems that we already had.” yellowbrick.com/symphony
  • 15. Integrating with Yellowbrick Data Ingest - Directly from SQL applications - Via real-time streams from Kafka - Transformation with Spark Client interoperability with PostgreSQL Interactive Applications Build new analytical applications Bulk Loading Use YBLOAD to move flat files at 1 GB/s, and up to 6 TB/hr Powerful BI Analytics Respond to ad hoc BI queries from MicroStrategy, Tableau, Business Objects and more Load and Transform - Informatica, Attunity, Talend, Syncsort - Spark ETL Business Critical Reporting Build prioritized responses and multi-department support with workload management Yellowbrick Database Details ANSI SQL, stored procedures, views, schemas, workload management Data Mining SAS, R, Python
  • 16. Data warehousing approaches SCALE UP SCALE OUT SCALE OUT with minimal resources Go as far as a single server can go Add servers continuously for scale (parallel systems) Maximum users, most amount of data, minimum resources Couple with workload management SQL Server Oracle IBM Netezza Greenplum Vertica Teradata Oracle Exadata SAP HANA Cloud-only AWS Redshift Snowflake Yellowbrick Data Traditional MPP Preconfigured Cloud-only Single Server
  • 17. Integrating (upgrading) single server databases When should you move? - Outgrown a single server - Managing too many independent databases Benefits with Yellowbrick - More capacity (compute and historical data) - Simplified operation - Same well-known SQL models GROWTH BEYOND A SINGLE SERVER TOO MANY DISPARATE SYSTEMS YELLOWBRICK DATA WAREHOUSE
  • 18. Integrating (upgrading) traditional MPP systems When should you move? - Netezza - NOW! Prepare for end of support - Vertica/Greenplum - When you need more active data warehouse development - Teradata - When you are intent on saving millions of dollars and willing to make selective changes to your workflow Benefits with Yellowbrick - For Netezza, Greenplum, Vertica users - Same PostgreSQL! - For all traditional MPP users - Massive savings in legacy footprint while achieving 3-100x query improvements
  • 19. Integrating (upgrading) ‘pre-configured’ systems When should you move? - When there is a corporate objective to reduce infrastructure costs from Oracle or SAP - When there is an understanding that core apps can remain on ‘pre-configured’ systems, and analytics can move to open, standards-based data warehouses - When you want to save big $$$ Benefits with Yellowbrick - Significantly lower costs due to a flash-centric architecture instead of a DRAM-centric architecture - Ability to open analytic access to multiple departments and teams simultaneously without incurring excessive costs PRECONFIGURED SYSTEM
  • 20. Integrating (upgrading) cloud-only systems When should you move? • When your core data and analytics engine has become an unpredictable and out-of-control cost • When you have reached a performance or concurrency limit with cloud-only solutions • When you prefer the control of a hybrid environment, in the cloud with options for on-premises Benefits with Yellowbrick - Redshift is also a PostgreSQL compatible datastore, making for easy migration - Yellowbrick offers cloud and on-premises options with predictable costs CLOUD-ONLY
  • 22. DEMO Details Netflow dataset from Spanish ISP Intrusion detection system testing Network capture over a six month period Yellowbrick Showcase - Perfect solution with Tableau, MicroStrategy or any BI - Ideal solution for concurrent queries on fewer/smaller systems
  • 29. Customer videos at www.yellowbrick.com
  • 31. THANK YOU yellowbrick.com SEEING IS BELIEVING
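
The bulk-loading figures on slide 15 (1 GB/s, and up to 6 TB/hr) can be related with a quick unit conversion. The sketch below is only arithmetic on the numbers quoted in the slide; it assumes decimal units (1 TB = 1000 GB), which the slide does not specify, and the function names and the 100 TB example dataset are hypothetical.

```python
# Unit conversions for the bulk-load rates quoted on slide 15.
# Assumption: decimal units (1 TB = 1000 GB); the slide does not say which it uses.

SECONDS_PER_HOUR = 3600
GB_PER_TB = 1000


def gb_per_s_to_tb_per_hr(gb_per_s: float) -> float:
    """Convert a sustained rate in GB/s to TB/hr."""
    return gb_per_s * SECONDS_PER_HOUR / GB_PER_TB


def tb_per_hr_to_gb_per_s(tb_per_hr: float) -> float:
    """Convert a sustained rate in TB/hr to GB/s."""
    return tb_per_hr * GB_PER_TB / SECONDS_PER_HOUR


def hours_to_load(dataset_tb: float, rate_tb_per_hr: float) -> float:
    """Hours needed to load dataset_tb at a constant rate_tb_per_hr."""
    return dataset_tb / rate_tb_per_hr


if __name__ == "__main__":
    # 1 GB/s sustained for an hour
    print(gb_per_s_to_tb_per_hr(1.0))             # 3.6 TB/hr
    # sustained rate implied by the "up to 6 TB/hr" figure
    print(round(tb_per_hr_to_gb_per_s(6.0), 2))   # 1.67 GB/s
    # hypothetical 100 TB dataset at the peak quoted rate
    print(round(hours_to_load(100.0, 6.0), 1))    # 16.7 hours
```

Under these assumptions, 1 GB/s corresponds to 3.6 TB/hr, so the "up to 6 TB/hr" figure implies a sustained rate of roughly 1.67 GB/s.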