The Evolution of Big Data Platform @ Netflix
Eva Tse
July 22, 2015
The evolution of the big data platform @ Netflix (OSCON 2015)
Our biggest challenge is scale
Netflix Key Business Metrics
65+ million members
50 countries
1000+ devices supported
10 billion hours / quarter
Global Expansion
200 countries by end of 2016
Big Data Size
Total ~20 PB DW on S3
Read ~10% DW daily
Write ~10% of read data daily
~ 500 billion events daily
~ 350 active users
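The percentages above can be turned into rough absolute volumes. The sketch below is a back-of-the-envelope calculation only. It assumes decimal units (1 PB = 1000 TB), reads "~10% of read data" as 10% of the daily read volume, and spreads events evenly over a 24-hour day. The function and constant names are illustrative and do not come from the talk.

```python
# Back-of-the-envelope figures derived from the slide's approximate numbers.
# Assumptions (not from the talk): decimal units (1 PB = 1000 TB) and
# events spread uniformly over a 24-hour day.

TOTAL_DW_PB = 20          # ~20 PB data warehouse on S3
READ_FRACTION = 0.10      # ~10% of the warehouse is read daily
WRITE_FRACTION = 0.10     # daily writes are ~10% of the data read daily
EVENTS_PER_DAY = 500e9    # ~500 billion events daily
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volumes(total_pb, read_fraction, write_fraction):
    """Return (daily read in PB, daily write in TB)."""
    daily_read_pb = total_pb * read_fraction
    daily_write_tb = daily_read_pb * write_fraction * 1000  # PB -> TB
    return daily_read_pb, daily_write_tb


def average_events_per_second(events_per_day):
    """Average event rate, assuming a uniform load across the day."""
    return events_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    read_pb, write_tb = daily_volumes(TOTAL_DW_PB, READ_FRACTION, WRITE_FRACTION)
    rate_millions = average_events_per_second(EVENTS_PER_DAY) / 1e6
    print(f"Daily read:  ~{read_pb:.0f} PB")
    print(f"Daily write: ~{write_tb:.0f} TB")
    print(f"Average event rate: ~{rate_millions:.1f} million events/s")
```

Under these assumptions the slide's figures work out to roughly 2 PB read and 200 TB written per day, and an average of about 5.8 million events per second. Peak rates would be higher than this average.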
Our traditional BI stack is our competition
How do we meet the functionality bar and yet make it scale?
How do we make big data bite-size again?
Our North Star
[Diagram: data pipeline into the data warehouse. Labels: Cloud apps, Suro/Kafka, Ursula, Cassandra, Aegisthus, SSTables, AWS S3; Event Data, Dimension Data; 15 min, Daily]
[Diagram: platform layers. Labels: Storage (AWS S3), Compute, Service, Tools]
Analytics
ETL
Interactive data exploration
Interactive slice & dice
RT analytics & iterative/ML algo
Evolving Big Data Processing Needs
[Diagram labels: Service, Tools, Big Data Portal, API Portal, Big Data API]
Evolving Services/Tools Ecosystem
AWS S3 as our DW Storage
Evolution of Big Data Processing Systems
Parquet
[Diagram labels: Service, Tools, Big Data Portal, API Portal, Big Data API]
Evolution of Services/Tools Ecosystem
Metacat
Big Data API
Big Data Portal
Open source is an integral part of our strategy to achieve scale
Big Data Processing Systems
Services/Tools Ecosystem
Why use Open Source?
Why contribute back?
Why contribute our own tool?
Is open source right for you?
Measuring big data - understanding data by usage
By Charles Smith, Netflix
Tomorrow @ 1:40-2:20pm
Eva Tse
etse@netflix.com
jobs.netflix.com
