cloud Presentation

The document provides an overview of Netflix's backend architecture and cloud services, highlighting its use of AWS for computing and storage needs, and its microservice architecture. Key statistics include over 200 million subscribers and the processing of billions of events daily, with a focus on video transcoding and delivery through Open Connect servers. Additionally, it discusses the use of databases like MySQL and Cassandra, as well as technologies like Kafka and Apache Chukwa for event processing and monitoring.

Uploaded by

anas170p
Copyright
© © All Rights Reserved

2022

IUG

Understanding System Design of Netflix: Backend Architecture and Cloud Services
Team Members

1 - Hussein Abu Eliewa (120180600)
2 - Abdalmoumen Sharaf (120180963)
3 - Mahmoud Ayesh (120180555)
Introduction:

● Netflix is the world’s leading internet television network,
with more than 200 million members in more than 190
countries enjoying 125 million hours of TV shows and
movies each day. Netflix uses AWS for nearly all its
computing and storage needs, including databases,
analytics, recommendation engines, video transcoding,
and more: hundreds of functions that in total use more
than 100,000 server instances on AWS.

● Netflix has 200M+ subscribers worldwide in 200+
countries.
Statistics
Before diving into the architecture of Netflix, let us
first go through some of the important statistics
on which this system is designed.

● The Netflix application generates almost 400 billion events
per day, with around 8 million events and 17 GB per second
during peak.

● More than 200M subscribers, spread across more than
200 countries, on 2000+ supported devices.

● Average number of videos watched by a user per day =
5; average size of a video = 500 MB.

● Average number of videos uploaded per day from the
backend = 1,000.

● Total upload storage required per day = 1,000 * 500 MB
= 500 GB (approximately).
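The arithmetic behind these figures can be checked with a short back-of-envelope script. This is only a sketch: the constants are copied from the statistics above, decimal units (1 GB = 1,000 MB) are assumed to match the slide's rounding, and the function names are illustrative.

```python
# Back-of-envelope check of the statistics above (all inputs come from the slide).
EVENTS_PER_DAY = 400e9            # "almost 400 billion events per day"
PEAK_EVENTS_PER_SEC = 8e6         # "around 8 million events per second during peak"
VIDEOS_UPLOADED_PER_DAY = 1_000
AVG_VIDEO_SIZE_MB = 500

SECONDS_PER_DAY = 24 * 60 * 60
MB_PER_GB = 1_000                 # assumption: decimal units, as the slide rounds


def daily_upload_storage_gb(videos: int, size_mb: int) -> float:
    """Storage needed per day for newly uploaded source videos, in GB."""
    return videos * size_mb / MB_PER_GB


def average_events_per_sec(events_per_day: float) -> float:
    """Average event rate implied by the daily total."""
    return events_per_day / SECONDS_PER_DAY


upload_gb = daily_upload_storage_gb(VIDEOS_UPLOADED_PER_DAY, AVG_VIDEO_SIZE_MB)
avg_rate = average_events_per_sec(EVENTS_PER_DAY)
peak_to_avg = PEAK_EVENTS_PER_SEC / avg_rate

print(f"Daily upload storage: {upload_gb:.0f} GB")
print(f"Average event rate:   {avg_rate / 1e6:.2f} million events/s")
print(f"Peak-to-average:      {peak_to_avg:.2f}x")
```

The daily total implies an average of roughly 4.6 million events per second, so the quoted peak of 8 million is a little under twice the average load.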
Onboarding a Video on Netflix
Netflix receives high-quality videos and content from
production houses. However, Netflix supports 2000+
devices, and each requires different resolutions and
formats.

Netflix prepares multiple replicas of the same movie in
different resolutions so that it can serve each user a
video quality suited to their network speed and device.
To achieve this, Netflix breaks the original video into
smaller chunks and, using parallel workers in AWS,
converts these chunks into different formats and resolutions.
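The chunk-and-parallelize idea described above can be sketched as follows. This is a toy model, not Netflix's pipeline: the profile names, the 60-second chunk length, and the `transcode_chunk` stand-in (which only returns a label instead of encoding video) are all assumptions made for illustration, and a local thread pool stands in for the parallel AWS workers.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical output profiles; the real set of formats is not given in the slides.
PROFILES = ["1080p", "720p", "480p"]


def split_into_chunks(duration_s: int, chunk_s: int) -> list[tuple[int, int]]:
    """Cut a video of duration_s seconds into (start, end) chunks."""
    return [(start, min(start + chunk_s, duration_s))
            for start in range(0, duration_s, chunk_s)]


def transcode_chunk(chunk: tuple[int, int], profile: str) -> str:
    """Stand-in for real encoding work: returns a label for the encoded chunk."""
    start, end = chunk
    return f"{profile}:{start}-{end}"


def transcode_video(duration_s: int, chunk_s: int = 60, workers: int = 4) -> dict[str, list[str]]:
    """Encode every chunk for every profile in parallel, keeping chunk order."""
    chunks = split_into_chunks(duration_s, chunk_s)
    jobs = list(product(PROFILES, chunks))  # one job per (profile, chunk) pair
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the submitted jobs.
        outputs = list(pool.map(lambda job: transcode_chunk(job[1], job[0]), jobs))
    renditions: dict[str, list[str]] = {profile: [] for profile in PROFILES}
    for (profile, _), output in zip(jobs, outputs):
        renditions[profile].append(output)
    return renditions


renditions = transcode_video(duration_s=150)
print(renditions["720p"])
```

Because each (profile, chunk) pair is an independent job, adding workers shortens the wall-clock time without changing the result, which is the property the slide relies on.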
● Netflix uses close to 1,000 Amazon Kinesis shards in parallel
to process billions of traffic flows.

● Amazon Kinesis Data Streams is a scalable and durable
real-time data streaming service that can continuously capture
gigabytes of data per second from hundreds of thousands of
sources.
● Netflix’s Amazon Kinesis Streams-based solution has proven
to be highly scalable, each day processing billions of traffic
flows. Typically, about 1,000 Amazon Kinesis shards work in
parallel to process the data stream. “Amazon Kinesis
Streams processes multiple terabytes of log data each day,
yet events show up in our analytics in seconds,” says John
Bennett, senior software engineer at Netflix. “We can
discover and respond to issues in real time, ensuring high
availability and a great customer experience.”
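To make the role of shards more concrete, the sketch below imitates how a stream spreads records over shards. Kinesis Data Streams maps each record's partition key to a 128-bit integer with MD5 and routes it to the shard whose hash key range contains that value. The even split of the hash space and the `flow-N` key names are assumptions for illustration; this is a local simulation, not a call to the AWS API.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key


def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * num_shards // HASH_SPACE


NUM_SHARDS = 1_000  # "about 1,000 Amazon Kinesis shards work in parallel"

# Hypothetical partition keys, one per traffic flow.
shard_counts = Counter(shard_for_key(f"flow-{i}", NUM_SHARDS) for i in range(100_000))

print(f"shards that received records: {len(shard_counts)}")
print(f"busiest shard holds {max(shard_counts.values())} records")
```

Records with the same partition key always land on the same shard, while distinct keys spread roughly evenly, so each shard can be processed by its own consumer in parallel.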

● After transcoding, once we have multiple copies of the files
for the same movie, these copies are transferred to each
and every Open Connect server which are placed in
Cloud Services

Open Connect :-
Open Connect, Netflix's CDN (Content Delivery Network), is a
network of distributed servers spread across different
geographical locations. Replicas of the files for each movie
are transferred to every Open Connect server.
Open Connect is mainly responsible for video streaming in
Netflix. When you hit the play button, the video is served to you
from the nearest Open Connect server, which leads to a faster and
better experience. It also increases the scalability of the whole
system. These servers are called Open Connect Appliances (OCAs).
Netflix CDN (Content Delivery Network)
AWS:- Netflix uses AWS for almost everything except
video streaming. That includes online storage,
recommendation engine, video transcoding, databases,
and analytics.

When the user clicks the play button, Netflix analyzes the
network speed and connection stability, and then works
out the best Open Connect server near the user.
Depending on the device and screen size, the right video
format is streamed to the user’s device.
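The play-button flow above can be sketched as two small decisions: pick a server, then pick a rendition. Everything specific here is hypothetical: the server names, the use of measured latency as the notion of "best", and the bandwidth thresholds in the ladder are illustrative assumptions, not Netflix's actual steering logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenConnectServer:
    name: str
    latency_ms: float
    healthy: bool = True


# Hypothetical bitrate ladder, best first: (minimum bandwidth in Mbps, rendition).
LADDER = [(25.0, "2160p"), (5.0, "1080p"), (3.0, "720p"), (0.0, "480p")]


def pick_server(servers: list[OpenConnectServer]) -> OpenConnectServer:
    """Choose the healthy server with the lowest measured latency."""
    candidates = [s for s in servers if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy Open Connect server available")
    return min(candidates, key=lambda s: s.latency_ms)


def pick_rendition(bandwidth_mbps: float, device_max: str) -> str:
    """Choose the best rendition that both the network and the device can handle."""
    order = [rendition for _, rendition in LADDER]  # best to worst
    device_limit = order.index(device_max)
    for position, (min_bandwidth, rendition) in enumerate(LADDER):
        if bandwidth_mbps >= min_bandwidth and position >= device_limit:
            return rendition
    return order[-1]


servers = [
    OpenConnectServer("oca-far", latency_ms=80.0),
    OpenConnectServer("oca-near", latency_ms=12.0),
    OpenConnectServer("oca-nearest-but-down", latency_ms=5.0, healthy=False),
]
chosen = pick_server(servers)
rendition = pick_rendition(bandwidth_mbps=4.0, device_max="1080p")
print(chosen.name, rendition)
```

Separating server choice from rendition choice mirrors the slide: the first depends on the network path, the second on both the network and the device.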
Backend Architecture of Netflix
● Everything in Netflix except video streaming is
handled by its backend service including
onboarding new content, processing videos,
distributing them to servers located in different
parts of the world, and managing the network
traffic.

● The request from the client is sent to the AWS Elastic
Load Balancer (ELB), which uses a two-tier
architecture.

Tier 1:- The request from ELB first reaches this tier, which is responsible
for balancing load across different zones. DNS based round robin
● ELB forwards this request to an API gateway. Netflix uses
ZUUL as its API gateway, which runs on AWS EC2
instances. ZUUL is a library developed and used by
Netflix for dynamic routing, monitoring and security.
ZUUL provides routing based on query parameters, URL
and path.

● Netflix is built on collections of services. Building an
application from a collection of services is known as
microservice architecture. In a microservice architecture,
services are independent of each other.

● ZUUL is the front door for all requests from devices and
websites to the backend of the Netflix streaming application.
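Routing on query parameters, URL and path, as attributed to ZUUL above, can be illustrated with a small sketch. This is not ZUUL's API (ZUUL is a JVM project built around filters); the route table, the `canary` query parameter, and the backend names are hypothetical and exist only to show the routing idea.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical route table: (path prefix, backend service name).
ROUTES = [
    ("/api/playback", "playback-service"),
    ("/api/account", "account-service"),
]
DEFAULT_BACKEND = "web-service"
CANARY_BACKEND = "canary-cluster"


def route(url: str) -> str:
    """Pick a backend from the query parameters first, then from the path."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # Query-parameter rule: explicitly flagged requests go to the canary cluster.
    if query.get("canary") == ["true"]:
        return CANARY_BACKEND
    # Path rule: the first matching prefix wins.
    for prefix, backend in ROUTES:
        if parts.path.startswith(prefix):
            return backend
    return DEFAULT_BACKEND


for url in (
    "https://example.com/api/playback/start?title=123",
    "https://example.com/api/account/profile?canary=true",
    "https://example.com/browse",
):
    print(url, "->", route(url))
```

Keeping the rules in a table rather than in code paths is what makes routing "dynamic": the table can change without redeploying the services behind the gateway.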
● Netflix's architecture has a complex distributed structure. Alongside its
many advantages, it also introduces dependencies. For example, the work of
one server can depend on the output of another server.
Dependencies among these servers can add latency, and a server that stops
working can become a single point of failure.

● To address this problem Netflix uses Hystrix. Hystrix is a library
developed by Netflix that isolates microservices from one another to
limit the spread of failures. Hystrix does this by isolating the
points of access between the services. Hystrix is used to fail fast and
recover rapidly; to provide near real-time monitoring, alerting, and
operational control; to reduce latency and failures from dependencies
accessed (typically over the network) via third-party client libraries;
and to stop cascading failures in a complex distributed system.
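The fail-fast behaviour attributed to Hystrix above is commonly implemented with the circuit breaker pattern. The sketch below is a minimal Python illustration of that pattern, not Hystrix's API (Hystrix is a Java library); the class, its parameters, and the example service functions are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Stop calling a failing dependency for a while and use a fallback instead."""

    def __init__(self, failure_threshold: int = 3, reset_timeout_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        return self._opened_at is not None

    def call(self, func: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self._opened_at is not None:
            cooling_down = self._clock() - self._opened_at < self.reset_timeout_s
            if cooling_down:
                return fallback()  # fail fast: do not touch the dependency
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            return fallback()
        self._failures = 0
        self._opened_at = None  # success closes the circuit
        return result


# Demo with a fake clock so the timeout can be advanced without sleeping.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_timeout_s=10.0, clock=lambda: now[0])
attempts: list[str] = []


def failing_recommendations() -> list[str]:
    attempts.append("call")
    raise ConnectionError("recommendation service is down")


def popular_titles() -> list[str]:
    return ["fallback: popular titles"]


for _ in range(3):
    print(breaker.call(failing_recommendations, popular_titles))
print("calls that reached the service:", len(attempts))
```

After two failures the breaker opens, so the third request returns the fallback without calling the failing service at all. This is how a slow or broken dependency is kept from dragging down its callers.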

● User activity and history data are sent to a stream processing pipeline,
which is later used to generate movie recommendations.

● This data is also sent to big data processing tools like AWS, Hadoop ,
Netflix Database:-

Netflix uses 2 different databases:-

1. MySQL

2. Cassandra
MySQL:-
For data like billing information, user information, and transaction
information, Netflix uses MySQL because it needs ACID compliance.
Netflix has a master-master setup for MySQL, deployed on
large Amazon EC2 instances.

In the master-master setup, a write made to the primary master node
is also replicated to another master node. The acknowledgment is
sent only once both the primary and the remote master node have
confirmed the write. This ensures the high availability of data.

Netflix has set up a read replica for each and every node (local as
well as cross-region). This ensures high availability and scalability.
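The acknowledgment rule described above (confirm a write only after both masters have it) can be sketched in a few lines. This is a toy in-memory model under stated assumptions: `MasterNode` and `replicated_write` are hypothetical names, the nodes are plain dictionaries rather than MySQL servers, and the rollback or retry a real system needs after a partial failure is left out.

```python
class MasterNode:
    """In-memory stand-in for one MySQL master."""

    def __init__(self, name: str, available: bool = True) -> None:
        self.name = name
        self.available = available
        self.rows: dict[str, str] = {}

    def write(self, key: str, value: str) -> bool:
        """Apply the write and confirm it, or refuse if the node is unavailable."""
        if not self.available:
            return False
        self.rows[key] = value
        return True


def replicated_write(primary: MasterNode, remote: MasterNode, key: str, value: str) -> bool:
    """Acknowledge the write only if both masters confirm it.

    Simplification: a write confirmed by only one node is not rolled back here.
    """
    confirmations = [node.write(key, value) for node in (primary, remote)]
    return all(confirmations)


primary = MasterNode("primary")
remote = MasterNode("remote")
acked = replicated_write(primary, remote, "invoice:42", "paid")

remote.available = False
acked_during_outage = replicated_write(primary, remote, "invoice:43", "paid")

print(acked, acked_during_outage)
```

The client only treats a write as durable when it is acknowledged, so an acknowledged record is present on both masters and survives the loss of either one.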
Cassandra:-
Netflix uses Cassandra for its scalability and lack of single points of
failure and for cross-regional deployments. In effect, a single global
Cassandra cluster can simultaneously service applications and
asynchronously replicate data across multiple geographic locations.

Cassandra data model in Netflix:-

1. 50+ Cassandra clusters
2. 500+ nodes
3. 30TB daily backups
4. 250k Writes/s at each node
Use of Kafka and Apache Chukwa in Netflix
Netflix is built on a collection of microservices that
work together to provide a number of services to users.

In a microservice architecture, some percentage of failure is often
acceptable. However, some failures can lead to greater problems. A
failure in any one of the microservice calls could leave a plethora of
computations out of sync and could result in data being off by
millions of dollars. It would also lead to availability problems and
create blind spots when trying to track down what is causing the
mismatch in the data.
The solution to the above problem is to rethink our service
interactions as asynchronous event exchanges instead of a
sequence of synchronous requests. This leads to the following
benefits:

1. Our infrastructure becomes inherently asynchronous.

2. Our application becomes loosely coupled, and traceability of
errors is improved.

Netflix uses Apache Kafka for its eventing, messaging and stream
processing needs.
Apache Kafka works on a publish/subscribe model. Services in Netflix
publish their changes as events onto a message bus, and these events
are then consumed by any other service that needs to adjust its own
view of the world.

This allows us to track whether services are in sync with respect to
state changes and, if not, how long it will be before they are in sync.
These insights are extremely powerful when operating a large graph of
dependent services.

Event-based communication and decentralized consumption help us
overcome the issues we usually see in large synchronous call graphs (as
mentioned above).
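The publish/subscribe model and the idea of tracking how far each service is from being in sync can be sketched with an in-memory stand-in for a message bus. This is not the Kafka client API; `EventBus`, the topic name, and the consumer names are hypothetical. What it keeps from Kafka is an append-only log per topic and a separate read offset per consumer.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-memory model of a log-based publish/subscribe bus."""

    def __init__(self) -> None:
        self._log: dict[str, list[dict]] = defaultdict(list)          # topic -> events
        self._offsets: dict[tuple[str, str], int] = defaultdict(int)  # (topic, consumer) -> next offset

    def publish(self, topic: str, event: dict) -> None:
        """Producers only append; they do not know who will consume."""
        self._log[topic].append(event)

    def poll(self, topic: str, consumer: str, max_events: int = 10) -> list[dict]:
        """Return the next unread events for this consumer and advance its offset."""
        start = self._offsets[(topic, consumer)]
        batch = self._log[topic][start:start + max_events]
        self._offsets[(topic, consumer)] = start + len(batch)
        return batch

    def lag(self, topic: str, consumer: str) -> int:
        """Number of published events this consumer has not processed yet."""
        return len(self._log[topic]) - self._offsets[(topic, consumer)]


bus = EventBus()
for title in ("title-1", "title-2", "title-3"):
    bus.publish("viewing-history", {"user": "u1", "watched": title})

billing_batch = bus.poll("viewing-history", consumer="billing", max_events=2)

print("billing lag:", bus.lag("viewing-history", "billing"))
print("recommendations lag:", bus.lag("viewing-history", "recommendations"))
```

Each consumer reads at its own pace, and its lag is a direct measure of how far it is from being in sync with the producers. A chain of synchronous calls offers no such measure.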
Apache Chukwa :-

Apache Chukwa is an open source data collection
system used to monitor complex distributed
systems. Chukwa collects events from different
microservices and writes them in Hadoop sequence
file format. Chukwa also provides traffic to Kafka for
uploading events to various sinks, such as S3 and
Elasticsearch.
Elasticsearch:-

Elasticsearch is used in Netflix to provide customer
care support, data visualization and error detection. For
example, if a user is not able to play a video, the
playback team will go to Elasticsearch and find the
reason for the issue. It is also used to keep track of
resource usage and to detect signup or login problems.
RESOURCES

● https://medium.com/@nidhiupreti99/understanding-system-design-of-netflix-backend-architecture-and-cloud-services-b077162e45bc

● https://aws.amazon.com/solutions/case-studies/netflix/

● https://dev.to/gbengelebs/netflix-system-design-backend-architecture-10i3

● https://www.youtube.com/watch?v=DxSdSmzXIsU
