IT & DATA MANAGEMENT RESEARCH, INDUSTRY
ANALYSIS & CONSULTING
How to Streamline DataOps on AWS
Modernizing Data Management in the Cloud
John Santaferraro, Research Director, EMA
Will Davis, Sr. Director of Product Marketing, Trifacta
Nikki Rouda, Principal Product Marketing Manager, Amazon Web Services
Watch the On-Demand Webinar
• The EMA "How to Streamline DataOps on AWS" on-demand webinar is available here: http://info.enterprisemanagement.com/how-to-streamline-dataops-on-aws-webinar-ws
• Check out upcoming webinars from EMA here: http://www.enterprisemanagement.com/freeResearch
Featured Speakers
John Santaferraro, Research Director, EMA
John is the research director for analytics, business intelligence, and data management at EMA. He has 23 years of
experience in data and analytics, from startups to executive positions at Fortune 50 companies. His deep
understanding of the industry comes from years of leadership in implementation, product and marketing organizations,
along with multiple big data imagineering efforts for finance, communications, retail, manufacturing, healthcare, events,
oil and gas, and utilities. John's coverage area also includes data integration, data discovery, metadata management,
artificial intelligence, machine learning, data science, digital marketing, and innovation.
Will Davis, Sr. Director of Product Marketing, Trifacta
Will drives go-to-market and product marketing efforts at Trifacta, having spent the past ten years managing the
marketing initiatives for several high-growth data companies. Prior to Trifacta, Will worked with a variety of companies
focused on data infrastructure, analytics and visualization, including GoodData, Greenplum and ClearStory Data. Will
leads Trifacta’s marketing strategy to rapidly expand business growth and brand awareness.
Nikki Rouda, Principal Product Marketing Manager, Amazon Web Services
Nikki is the principal product marketing manager for data lakes and big data at AWS. Nikki has spent 20+ years helping
enterprises in 40+ countries develop and implement solutions to their analytics and IT infrastructure challenges. Nikki
holds an MBA from the University of Cambridge and an ScB in geophysics and math from Brown University.
Logistics for Today’s Webinar
QUESTIONS
• Log questions in the chat panel located on the lower left-hand corner of your screen
• Questions will be addressed during the Q&A session of the event
EVENT RECORDING
• An archived version of the event recording will be available at www.enterprisemanagement.com
PDF SLIDES
• A PDF of the speaker slides will be distributed to all attendees
Agenda
• Top Cloud Challenges and Opportunities
• Modernization and DataOps in the Cloud
• Data Lakes and Analytics on AWS
• Trifacta DataOps on AWS
• Question and Answer
Slide 5 © 2018 Enterprise Management Associates, Inc.
Top Cloud Challenges and Opportunities
MORE PLATFORMS:
Almost 8 of 10 participants
indicated they have between 3
and 7 different platforms in
their big data environment.
FASTER SPEEDS:
Almost 3 of 4 participants
indicated they were adopting
real-time processing
strategies.
FASTER SPEEDS:
Streaming platforms take the
#1 spot for platforms
implemented in 2018.
MORE COMPLEXITY:
Almost 3 of 4 respondents
indicated they were adopting
complex workloads like data
science and machine learning.
MORE CLOUD:
More than 3 of every 4 big
data projects are using some
form of cloud implementation.
Modernization and DataOps in the Cloud
EMA Hybrid Data Ecosystems
Diagram nodes: H/S, AP, DW, DM, NS, SP, RS, OS, DP
AP - Analytic Platforms
DP - Discovery Platforms
H/S - Hadoop/Spark
DW - Data Warehouse
DM - Data Marts
NS - NoSQL
OS - Operational Systems
SP - Streaming Platforms
EMA Hybrid Data Ecosystems - in the cloud
Diagram nodes: H/S, AP, DW, DM, NS, SP, SS, OS, RDS, DP
AP - Analytic Platforms
DP - Discovery Platforms
H/S - Hadoop/Spark
DW - Data Warehouse
DM - Data Marts
NS - NoSQL
OS - Operational Systems
SP - Streaming Platforms
SS - Simple Storage
EMA Hybrid Data Ecosystems - Example: AWS
H/S - Hadoop/Spark: EMR
AP - Analytic Platforms: Redshift
DW - Data Warehouse: Redshift
DM - Data Marts: RDS
NS - NoSQL: DynamoDB
SP - Streaming Platforms: Kinesis
SS - Simple Storage: S3
OS - Operational Systems: RDS
DP - Discovery Platforms: Several
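The category-to-service pairing on this slide can be written down as a simple lookup table. The sketch below is only an illustration of that pairing: the dictionary name and the helper function are hypothetical and not part of any EMA or AWS API. Discovery Platforms (DP) are left out because the slide lists "Several" rather than a single service.

```python
# Pairings copied from the slide's AWS example. PLATFORM_TO_AWS and
# services_needed are illustrative names (assumptions), not a real API.
PLATFORM_TO_AWS = {
    "H/S": "EMR",       # Hadoop/Spark
    "AP": "Redshift",   # Analytic Platforms
    "DW": "Redshift",   # Data Warehouse
    "DM": "RDS",        # Data Marts
    "NS": "DynamoDB",   # NoSQL
    "SP": "Kinesis",    # Streaming Platforms
    "SS": "S3",         # Simple Storage
    "OS": "RDS",        # Operational Systems
}


def services_needed(platform_codes):
    """Return the distinct AWS services for the given codes, in first-seen order."""
    services = []
    for code in platform_codes:
        service = PLATFORM_TO_AWS[code]
        if service not in services:
            services.append(service)
    return services


print(services_needed(["DW", "DM", "AP", "SS"]))
```

Because several categories share a service, an ecosystem with a data warehouse, data marts, an analytic platform, and simple storage resolves to three services in this sketch rather than four.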
7 Principles of DataOps for Cloud Data Ecosystems
• Multi-model data access replaces single model
• Interoperability replaces integration
• Data preparation and pipelines replace data cleansing
• Automation replaces manual data everything
• Elasticity replaces enterprise scalability
• Multidimensional agility replaces extensibility
• Automated governance replaces simple metadata
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Confidential
Data lakes and analytics on AWS
Nikki Rouda, AWS
February 21, 2019
Driving business with analytics
• Fighting fraud
• Quoting car & truck prices
• Movies & TV on demand
• Delivering software quality
• Mitigating safety issues
• Targeted marketing
• Finding new revenue
• Improving health
• Serving retail customers
• Valuing real estate
• Reducing advertising costs
• Making music
What do all these business challenges have in common?
They are solved with AWS data lakes and analytics.
Companies want more value from their data
Data is:
• Growing exponentially
• From new sources
• Increasingly diverse
• Used by many people
• Analyzed by many applications
Complications:
• Siloed approaches don’t work anymore
• It’s too expensive and limiting to store data on-premises
Implication:
• A new approach is needed to extract insights and value
Cloud data lakes are the future
Customers want:
To eliminate siloes of data
To move to a single store, i.e. a data lake in the cloud
To store data securely in standard formats
To grow to any scale, with low costs
To analyze their data in a variety of ways
To have real-time analytics
To predict future outcomes
Data Lake
Why choose AWS for data lakes and analytics
• Most comprehensive
• Most secure
• Most scalable
• Most cost-effective
Our portfolio
Broadest and deepest portfolio, purpose-built for builders
Visualization & machine learning: Dashboards, Machine Learning
Analytics: Data Warehousing, Big Data Processing, Interactive Query, Operational Analytics, Real time Analytics, Serverless Data processing
Data lake: Infrastructure, Data Catalog & ETL, Security & Management
Data movement: Migration & Streaming Services
Our portfolio
Broadest and deepest portfolio, purpose-built for builders
Visualization & machine learning: QuickSight, SageMaker
Analytics: Redshift, EMR, Athena, Elasticsearch Service, Kinesis Data Analytics, Glue
Data lake: S3/Glacier, Glue, Lake Formation
Data movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Managed Streaming for Kafka
More data lakes and analytics than anywhere else
Fortnite
125+ million players
Create a constant feedback loop for game designers
Up-to-the-minute understanding of gamer satisfaction to guarantee gamers are engaged
Resulting in the most popular game played in the world
Migrated from on-premises data warehouse
Built a data warehouse with Amazon Redshift and data lake with Amazon S3
Analytics on data lake with Amazon Athena, Amazon Redshift Spectrum, and Amazon EMR
Report delivery went from months to days, at far lower cost
Needed to analyze data to find insights, identify opportunities and evaluate business performance
The Oracle data warehouse did not scale, was difficult to maintain and costly
Deployed a data lake with Amazon S3, and ran analytics with Amazon Redshift, Amazon Redshift Spectrum, and Amazon EMR
Result: They doubled the data stored (100PB), lowered costs, and were able to gain insights faster
50 PB of data
600,000 analytics jobs/day
Enabling all types of data-driven analytics
• Retrospective analysis and reporting
• Here-and-now real-time processing and dashboards
• Predictions to enable smart applications
Trifacta on AWS
How to Streamline DataOps on AWS: Modernizing Data Management in the Cloud
Will Davis, Sr. Director of Product Marketing, Trifacta
DataOps Definition
32 Proprietary & Confidential
“DataOps is the function within an organization that controls
the data journey from source to value.”
Jarah Euston, What is DataOps?, (Nexla, 2017), https://www.nexla.com/define-dataops/
Data Platforms
Databases
Log Files
Spreadsheets
IoT Sensors
Apps
80%
Usage Patterns
Data Onboarding
ML/AI
Analytics
“It’s impossible to overstress this:
80% of the work in any data project
is in cleaning the data.”
— DJ Patil, Former Chief Data Scientist of the United States
What is the Biggest Challenge in Streamlining DataOps?
Analysis
Enterprise Data Warehouse
AI
Business Intelligence
Data Analyst Data Engineer Data Scientist
Discovering, Structuring, Cleaning, Enriching, Validating, Deploying
Data Platforms
Databases
Log Files
Spreadsheets
IoT Sensors
Apps
And it Impacts Your Entire Data Team...
“Poor data quality is enemy number
one to the widespread, profitable use
of machine learning.”
—Harvard Business Review
“So, while there is a visible arms race as
companies bring on machine learning coders
and kick off AI initiatives, there is also a
behind-the-scenes, panicked race for new
and different data.”
—MIT Sloan Management Review
The Rise of Machine Learning & AI Compounds the Problem
“The hard part of AI is data wrangling.”
wrangles
For AI
—SWAMI SIVASUBRAMANIAN
VP – AMAZON MACHINE LEARNING
#reInvent2018
WHAT TO DO?
Data Platforms
Databases
Log Files
Spreadsheets
IoT Sensors
Apps
Analysis
Enterprise Data Warehouse
AI
Business Intelligence
DATA WRANGLING
Data Platforms
Databases
Log Files
Spreadsheets
IoT Sensors
Apps
Analysis
Enterprise Data Warehouse
AI
Business Intelligence
DATA WRANGLING
• Empower domain experts with intelligent visual interfaces that automate assessment and transformation of data
• Enable IT to collaboratively curate and operationalize data pipelines authored by domain experts
• Establish an enterprise-wide platform that refines data from a variety of sources, supporting a range of users and use cases
Predictive Modeling
Business Intelligence
Data Onboarding
Risk & Compliance
Audit, Testing & Validation
Data Migration
OPERATIONAL
Data Platforms
Databases
Log Files
Spreadsheets
IoT Sensors
Apps
Discovering, Structuring, Cleaning, Enriching, Validating, Deploying
ANALYTIC
Data Analyst Data Engineer Data Scientist
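To make the wrangling stages named on this slide concrete, here is a minimal sketch in plain Python. It is not Trifacta's product or API: the sample records, the Celsius-to-Fahrenheit enrichment, the validation thresholds, and every function name are assumptions chosen only for illustration. Deploying is left out, since it depends on the target platform.

```python
from collections import Counter

# Hypothetical raw sensor log lines (assumed format: date, sensor, reading).
RAW_LINES = [
    "2019-02-21, sensor-a, 21.5",
    "2019-02-21, sensor-b, n/a",
    "2019-02-21,SENSOR-A ,22.1",
    "",
]


def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def discover(lines):
    """Discovering: profile the raw lines before changing anything."""
    profile = Counter()
    for line in lines:
        if not line.strip():
            profile["blank"] += 1
        elif is_number(line.split(",")[-1].strip()):
            profile["numeric"] += 1
        else:
            profile["non-numeric"] += 1
    return profile


def structure(lines):
    """Structuring: split each non-blank line into named fields."""
    rows = []
    for line in lines:
        if line.strip():
            date, sensor, reading = [part.strip() for part in line.split(",")]
            rows.append({"date": date, "sensor": sensor, "reading": reading})
    return rows


def clean(rows):
    """Cleaning: normalize sensor names and drop unparseable readings."""
    return [
        {**row, "sensor": row["sensor"].lower(), "reading": float(row["reading"])}
        for row in rows
        if is_number(row["reading"])
    ]


def enrich(rows):
    """Enriching: add a derived column (readings assumed to be in Celsius)."""
    return [{**row, "reading_f": row["reading"] * 9 / 5 + 32} for row in rows]


def validate(rows):
    """Validating: check naming and a plausible value range (assumed limits)."""
    return all(
        row["sensor"].startswith("sensor-") and -50 <= row["reading"] <= 60
        for row in rows
    )


profile = discover(RAW_LINES)
prepared = enrich(clean(structure(RAW_LINES)))
print(profile)
print(prepared, validate(prepared))
```

Each stage is a small function that takes and returns plain records, so stages can be reordered, tested, or replaced independently.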
And We’re Natively Integrated into AWS
• Native storage
• Native processing
• Native security
50+ Trifacta Customers Deployed on AWS
Why Trifacta?
QUALITY
SPEED
EFFICIENCY
Empower the people who know the data best
While maintaining governance and lineage
Intuitive, visual interface
Self-documenting lineage
Faster to Design Preparation Workflows
Instant previews, continuous validation
ML-driven suggestions
Faster to Put Workflows into Production
Automate data pipelines
Share, test & version control
Retire legacy solutions
Utilize native cloud elasticity
Industry Analysts All Rank Trifacta #1
Self Service Data Preparation Wave
“Customer references can't say enough about Trifacta’s ease of use”
A perfect score in 14 of 17 categories.
#1 with Gartner
#1 with Ovum, Dresner, and Bloor
How to Get Started?
Different Editions & Deployment Options for Any Use Case
FOR INDIVIDUALS: Free
• Trifacta Managed Cloud
• Works with desktop files
• Functional, data volume, and processing limitations
• Community support
FOR TEAMS & DEPARTMENTS: Starts at $5K per user*
• Trifacta Managed Cloud
• Files, relational, cloud connectivity
• Job Scheduling & Collaboration
• Phone/email support
FOR ENTERPRISE DEPLOYMENT: Contact Us
• Customer Managed Cloud
• Unlimited volume & scalability
• Broad connectivity
• Advanced security, access controls, and governance
• Enterprise support & dedicated customer success manager
Available on the AWS Marketplace
Start Wrangling Today with Free Wrangler
Questions?

How to Streamline DataOps on AWS

  • 1. IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING John Santaferraro Research Director EMA How to Streamline DataOps on AWS Modernizing Data Management in the Cloud Will Davis Sr. Director of Product Marketing Trifacta Nikki Rouda Principal Product Marketing Manager, Amazon Web Services
  • 2. IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING Watch the On-Demand Webinar Slide 2 • EMA How to Streamline DataOps on AWS On-Demand webinar is available here: http://info.enterprisemanagement.com/how-to- streamline-dataops-on-aws-webinar-ws • Check out upcoming webinars from EMA here: http://www.enterprisemanagement.com/freeResearch
  • 3. IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING Featured Speakers John Santaferraro, Research Director, EMA John is the research director for analytics, business intelligence, and data management at EMA. He has 23 years of experience in data and analytics, from startups to executive positions at Fortune 50 companies. His deep understanding of the industry comes from years of leadership in implementation, product and marketing organizations, along with multiple big data imagineering efforts for finance, communications, retail, manufacturing, healthcare, events, oil and gas, and utilities. John's coverage area also includes data integration, data discovery, metadata management, artificial intelligence, machine learning, data science, digital marketing, and innovation. Will Davis, Sr. Director of Product Marketing, Trifacta Will drives go-to-market and product marketing efforts at Trifacta having spent the past ten years managing the marketing initiatives for several high-growth data companies. Prior to Trifacta, Will worked with a variety of companies focused on data infrastructure, analytics and visualization, including GoodData, Greenplum and ClearStory Data. Will leads Trifacta’s marketing strategy to rapidly expand business growth and brand awareness. Nikki Rouda, Principal Product Marketing Manager, Amazon Web Services Nikki is the principal product marketing manager for data lakes and big data at AWS. Nikki has spent 20+ years helping enterprises in 40+ countries develop and implement solutions to their analytics and IT infrastructure challenges. Nikki holds an MBA from the University of Cambridge and an ScB in geophysics and math from Brown University.
  • 4. IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING Logistics for Today’s Webinar An archived version of the event recording will be available at www.enterprisemanagement.com • Log questions in the chat panel located on the lower left-hand corner of your screen • Questions will be addressed during the Q&A session of the event QUESTIONS EVENT RECORDING A PDF of the speaker slides will be distributed to all attendees PDF SLIDES
  • 5. Agenda • Top Cloud Challenges and Opportunities • Modernization and DataOps in the Cloud • Data Lakes and Analytics on AWS • Trifacta DataOps on AWS • Question and Answer © 2018 Enterprise Management Associates, Inc.
  • 6. Top Cloud Challenges and Opportunities
  • 8. MORE PLATFORMS: Almost 8 of 10 participants indicated they have between 3 and 7 different platforms in their big data environment. © 2019 Enterprise Management Associates, Inc.
  • 9. FASTER SPEEDS: Almost 3 of 4 participants indicated they were adopting real-time processing strategies.
  • 10. FASTER SPEEDS: Streaming platforms take the #1 spot for platforms implemented in 2018.
  • 11. MORE COMPLEXITY: Almost 3 of 4 respondents indicated they were adopting complex workloads like data science and machine learning.
  • 13. MORE CLOUD: More than 3 of every 4 big data projects are using some form of cloud implementation.
  • 14. Modernization and DataOps in the Cloud
  • 15. EMA Hybrid Data Ecosystems. Diagram nodes: H/S, AP, DW, DM, NS, SP, RS, OS, DP. Legend: AP - Analytic Platforms; DP - Discovery Platforms; H/S - Hadoop/Spark; DW - Data Warehouse; DM - Data Marts; NS - NoSQL; OS - Operational Systems; SP - Streaming Platforms.
  • 16. EMA Hybrid Data Ecosystems - in the cloud. Diagram nodes: H/S, AP, DW, DM, NS, SP, SS, OS, RDS, DP. Legend: AP - Analytic Platforms; DP - Discovery Platforms; H/S - Hadoop/Spark; DW - Data Warehouse; DM - Data Marts; NS - NoSQL; OS - Operational Systems; SP - Streaming Platforms; SS - Simple Storage.
  • 17. EMA Hybrid Data Ecosystems - Example: AWS. Diagram nodes and example services: H/S - EMR; AP - Redshift; DW - Redshift; DM - RDS; NS - DynamoDB; SP - Kinesis; SS - S3; OS - RDS; DP - Several. Legend: AP - Analytic Platforms; DP - Discovery Platforms; H/S - Hadoop/Spark; DW - Data Warehouse; DM - Data Marts; NS - NoSQL; OS - Operational Systems; SP - Streaming Platforms; SS - Simple Storage.
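The category-to-service pairing on the AWS example slide can be written down as a small lookup table. This is only a sketch: the abbreviations, category names, and service names come from the slide (using the standard AWS spellings DynamoDB and Kinesis), while the helper name `categories_served_by` is an illustrative assumption.

```python
# Category abbreviations and names, as given in the slide legend.
CATEGORIES = {
    "H/S": "Hadoop/Spark",
    "AP": "Analytic Platforms",
    "DW": "Data Warehouse",
    "DM": "Data Marts",
    "NS": "NoSQL",
    "SP": "Streaming Platforms",
    "SS": "Simple Storage",
    "OS": "Operational Systems",
    "DP": "Discovery Platforms",
}

# Example AWS service for each category, as listed on the slide.
# "Several" is the slide's own wording for Discovery Platforms.
AWS_EXAMPLE = {
    "H/S": "EMR",
    "AP": "Redshift",
    "DW": "Redshift",
    "DM": "RDS",
    "NS": "DynamoDB",
    "SP": "Kinesis",
    "SS": "S3",
    "OS": "RDS",
    "DP": "Several",
}


def categories_served_by(service):
    """Return the category names that the slide maps to one service."""
    return [CATEGORIES[abbr] for abbr, name in AWS_EXAMPLE.items() if name == service]


# Print the mapping as a three-column listing.
for abbr, category in CATEGORIES.items():
    print(f"{abbr:>3}  {category:<20} {AWS_EXAMPLE[abbr]}")
```

Reading the table in reverse shows that a single service can cover more than one category: in this example, Redshift appears under both Analytic Platforms and Data Warehouse, and RDS under both Data Marts and Operational Systems.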
  • 18. 7 Principles of DataOps for Cloud Data Ecosystems • Multi-model data access replaces single model • Interoperability replaces integration • Data preparation and pipelines replace data cleansing • Automation replaces manual data everything • Elasticity replaces enterprise scalability • Multidimensional agility replaces extensibility • Automated governance replaces simple metadata
  • 19. Data lakes and analytics on AWS. Nikki Rouda, AWS. February 21, 2019. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Confidential
  • 20. Driving business with analytics • Fighting fraud • Quoting car & truck prices • Movies & TV on demand • Delivering software quality • Mitigating safety issues • Targeted marketing • Finding new revenue • Improving health • Serving retail customers • Valuing real estate • Reducing advertising costs • Making music. What do all these business challenges have in common? They are solved with AWS data lakes and analytics.
  • 21. Companies want more value from their data. Data is: growing exponentially; from new sources; increasingly diverse; used by many people; analyzed by many applications. Complications: Siloed approaches don’t work anymore; it’s too expensive and limiting to store data on-premises. Implication: A new approach is needed to extract insights and value.
  • 22. Cloud data lakes are the future. Customers want: to eliminate siloes of data; to move to a single store, i.e. a data lake in the cloud; to store data securely in standard formats; to grow to any scale, with low costs; to analyze their data in a variety of ways; to have real-time analytics; to predict future outcomes.
  • 23. Why choose AWS for data lakes and analytics: Most comprehensive; Most secure; Most scalable; Most cost-effective
  • 24. Our portfolio: Broadest and deepest portfolio, purpose-built for builders. Visualization & machine learning: Dashboards, Machine Learning. Analytics: Data Warehousing, Big Data Processing, Interactive Query, Operational Analytics, Real time Analytics, Serverless Data processing. Data lake: Infrastructure, Data Catalog & ETL, Security & Management. Data movement: Migration & Streaming Services.
  • 25. Our portfolio: Broadest and deepest portfolio, purpose-built for builders. Visualization & machine learning: QuickSight, SageMaker. Analytics: Redshift, EMR, Athena, Elasticsearch Service, Kinesis Data Analytics, Glue. Data lake: S3/Glacier, Glue, Lake Formation. Data movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Managed Streaming for Kafka
  • 26. More data lakes and analytics than anywhere else
  • 27. Fortnite: 125+ million players. Create a constant feedback loop for game designers. Up-to-the-minute understanding of gamer satisfaction to guarantee gamers are engaged. Resulting in the most popular game played in the world.
  • 28. Migrated from on-premises data warehouse. Built a data warehouse with Amazon Redshift and data lake with Amazon S3. Analytics on data lake with Amazon Athena, Amazon Redshift Spectrum, and Amazon EMR. Report delivery went from months to days, at far lower cost.
  • 29. 50 PB of data; 600,000 analytics jobs/day. Needed to analyze data to find insights, identify opportunities, and evaluate business performance. The Oracle data warehouse did not scale, was difficult to maintain, and was costly. Deployed a data lake with Amazon S3, and ran analytics with Amazon Redshift, Amazon Redshift Spectrum, and Amazon EMR. Result: They doubled the data stored (100 PB), lowered costs, and were able to gain insights faster.
  • 30. Enabling all types of data-driven analytics: Retrospective analysis and reporting; Here-and-now real-time processing and dashboards; Predictions to enable smart applications
  • 31. Trifacta on AWS How to Streamline DataOps on AWS: Modernizing Data Management in the Cloud Will Davis, Sr. Director of Product Marketing, Trifacta
  • 32. DataOps Definition: “DataOps is the function within an organization that controls the data journey from source to value.” Jarah Euston, What is DataOps? (Nexla, 2017), https://www.nexla.com/define-dataops/ Proprietary & Confidential
  • 33. What is the Biggest Challenge in Streamlining DataOps? “It’s impossible to overstress this: 80% of the work in any data project is in cleaning the data.” (DJ Patil, Former Chief Data Scientist of the United States) Diagram labels: Data Platforms, Databases, Log Files, Spreadsheets, IoT Sensors, Apps; 80%; Usage Patterns: Data Onboarding, ML/AI, Analytics.
  • 34. And it Impacts Your Entire Data Team... Roles: Data Analyst, Data Engineer, Data Scientist. Steps: Discovering, Structuring, Cleaning, Enriching, Validating, Deploying. Diagram labels: Data Platforms, Databases, Log Files, Spreadsheets, IoT Sensors, Apps; Analysis, Enterprise Data Warehouse, AI, Business Intelligence.
  • 35. The Rise of Machine Learning & AI Compounds the Problem. “Poor data quality is enemy number one to the widespread, profitable use of machine learning.” (Harvard Business Review) “So, while there is a visible arms race as companies bring on machine learning coders and kick off AI initiatives, there is also a behind-the-scenes, panicked race for new and different data.” (MIT Sloan Management Review)
  • 36. “The hard part of AI is data wrangling.” (SWAMI SIVASUBRAMANIAN, VP, AMAZON MACHINE LEARNING, #reInvent2018) wrangles For AI
  • 37. WHAT TO DO?
  • 38. DATA WRANGLING. Diagram labels: Data Platforms, Databases, Log Files, Spreadsheets, IoT Sensors, Apps; Analysis, Enterprise Data Warehouse, AI, Business Intelligence.
  • 39. DATA WRANGLING • Empower domain experts with intelligent visual interfaces that automate assessment and transformation of data • Enable IT to collaboratively curate and operationalize data pipelines authored by domain experts • Establish an enterprise-wide platform that refines data from a variety of sources, supporting a range of users and use cases. Diagram labels: Data Platforms, Databases, Log Files, Spreadsheets, IoT Sensors, Apps; Analysis, Enterprise Data Warehouse, AI, Business Intelligence.
  • 40. ANALYTIC: Predictive Modeling, Business Intelligence, Data Onboarding. OPERATIONAL: Risk & Compliance; Audit, Testing & Validation; Data Migration. Roles: Data Analyst, Data Engineer, Data Scientist. Steps: Discovering, Structuring, Cleaning, Enriching, Validating, Deploying. Diagram labels: Data Platforms, Databases, Log Files, Spreadsheets, IoT Sensors, Apps.
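The six wrangling steps named on this slide (discovering, structuring, cleaning, enriching, validating, deploying) can be illustrated with a minimal, standard-library sketch. This is not Trifacta's implementation or API: the sample data, the `SITES` lookup, the value range, and every function name are hypothetical and exist only to show what each step does to a small dataset.

```python
import csv
import io
import json

# Hypothetical raw export; column names and values are invented for illustration.
RAW = """device_id,reading,recorded_at
 s-01 ,21.5,2019-02-21
s-02,,2019-02-21
S-03,19.0,2019-02-21
s-01,21.5,2019-02-21
"""

# Hypothetical reference data used by the enriching step.
SITES = {"s-01": "Austin", "s-02": "Boston", "s-03": "Denver"}


def structure(text):
    """Structuring: parse flat text into one dict per record."""
    return list(csv.DictReader(io.StringIO(text)))


def discover(rows):
    """Discovering: profile the data (row count, missing values per column)."""
    missing = {}
    for row in rows:
        for column, value in row.items():
            if not value.strip():
                missing[column] = missing.get(column, 0) + 1
    return {"rows": len(rows), "missing": missing}


def clean(rows):
    """Cleaning: normalize ids, drop rows with no reading, remove duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        if not row["reading"].strip():
            continue
        device = row["device_id"].strip().lower()
        key = (device, row["recorded_at"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({
            "device_id": device,
            "reading": float(row["reading"]),
            "recorded_at": row["recorded_at"],
        })
    return cleaned


def enrich(rows):
    """Enriching: join each record to the reference data."""
    return [dict(row, site=SITES.get(row["device_id"], "unknown")) for row in rows]


def validate(rows):
    """Validating: fail loudly if a record breaks the (assumed) rules."""
    for row in rows:
        if not -50.0 <= row["reading"] <= 60.0:
            raise ValueError(f"reading out of range: {row}")
        if row["site"] == "unknown":
            raise ValueError(f"no site for device: {row}")
    return rows


raw_rows = structure(RAW)
profile = discover(raw_rows)
prepared = validate(enrich(clean(raw_rows)))

# Deploying: here simply serialized; a real pipeline would write to a target system.
published = json.dumps(prepared, indent=2)
print(profile)
print(published)
```

In this sketch structuring runs before discovering only because the raw text has to be parsed before it can be profiled; in an interactive tool the two steps typically alternate.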
  • 41. And We’re Natively Integrated into AWS • Native storage • Native processing • Native security
  • 42. 50+ Trifacta Customers Deployed on AWS
  • 44. QUALITY | SPEED | EFFICIENCY. Empower the people who know the data best while maintaining governance and lineage: intuitive, visual interface; self-documenting lineage
  • 45. QUALITY | SPEED | EFFICIENCY. Faster to Design Preparation Workflows: instant previews, continuous validation; ML-driven suggestions
  • 46. QUALITY | SPEED | EFFICIENCY. Faster to Put Workflows into Production: automate data pipelines; share, test & version control
  • 48. Industry Analysts All Rank Trifacta #1. Self Service Data Preparation Wave: “Customer references can't say enough about Trifacta’s ease of use.” A perfect score in 14 of 17 categories. #1 with Gartner. #1 with Ovum, Dresner, and Bloor.
  • 49. How to Get Started?
  • 50. Different Editions & Deployment Options for Any Use Case. FOR INDIVIDUALS (Free): Trifacta Managed Cloud; works with desktop files; functional, data volume, and processing limitations; community support. FOR TEAMS & DEPARTMENTS (Starts at $5K per user*): Trifacta Managed Cloud; files, relational, cloud connectivity; job scheduling & collaboration; phone/email support. FOR ENTERPRISE DEPLOYMENT (Contact Us): Customer Managed Cloud; unlimited volume & scalability; broad connectivity; advanced security, access controls, and governance; enterprise support & dedicated customer success manager.
  • 51. Available on the AWS Marketplace
  • 52. Start Wrangling Today with Free Wrangler