
Jamie Oconnor - Research Paper - Modern Distributed Databases

This research paper examines the distributed databases used by Google and Netflix to handle large amounts of user data. It discusses the challenges of building large-scale distributed databases, including scalability, security, reliability, and availability. It provides details on the architecture and implementation of Google's Spanner database and Netflix's use of the Apache Cassandra database. Both companies implemented replication and data sharding across multiple data centers to ensure reliability and continuous service availability.


Research Paper

Modern Distributed Databases


Bachelor of Science (Hons) in Software Design
Year 4
2015-2016

Student Name:

Jamie O'Connor

Student ID:

A00180489

Group:

Web

Word Count: 1499 (excluding references and citations)


Contents

Introduction

Challenges and Issues of Modern Distributed Databases

Google

General Description of the Google Services and Database

Architecture of the Google Distributed Database

Hardware & Software used by Google

Security of the Google Distributed Database

Reliability of the Google Distributed Database

Examples of Google's historical issues

Netflix

General Description of the Netflix Services and Database

Architecture of the Netflix Distributed Database

Hardware & Software used by Netflix

Security of the Netflix Distributed Database

Reliability of the Netflix Distributed Database

Examples of Netflix's historical issues

Introduction
This report examines the importance of distributed databases in modern computing and why
large organisations need them to handle vast amounts of data. I focus on the distributed
databases used by Google and Netflix, discuss the challenges and issues associated with
designing large-scale modern distributed databases, and show how both organisations have
overcome these challenges.
Issues & challenges faced
There are many challenges in designing a large-scale modern distributed database. Google
spent up to four and a half years designing the foundation for their own (Hsieh, 2012).
Here are some of the issues faced:
Scalability - Google and Netflix are constantly growing their services. To keep up with public
demand and to maintain performance on a global scale, the architecture must be stable and
able to grow.
Security - Storing data at multiple sites increases the probability of security lapses.
Reliability - Without a reliable system, a provider may not be able to deliver its services to
the expected standard. Google and Netflix could lose millions of dollars and many customers,
and tarnish their reputations.
Availability - Google and Netflix customers want continuous access to data. What happens if
a datacentre fails? Will customers be able to continue working?
The above is only a summary of the issues faced. Transaction management, concurrency
control, query optimization and data integrity must all be addressed while maintaining
transparency.

Google
General Description of the Google Services and Database
Google are a multinational organisation widely respected in the software industry. Their
services include Gmail, Chrome, YouTube, Android and Google AdWords. Their stated goal is "to
make it as easy as possible for you to find the information you need and get the things you
need to do done" (Google, 2015). They need advanced databases to store the data behind
these services.
Spanner is the system they created. Spanner uses replication for both global availability and
geographic locality. Spanner's main focus is managing cross-datacentre replicated data, but
its designers have also spent time designing and implementing important database features
on top of their distributed-systems infrastructure (Corbett, et al., 2012).
Architecture of the Google Distributed Database
A deployment of Spanner is called a universe. Each universe is organised into zones, which
are the locations across which data can be replicated. Zones can be added to or removed from
a running system. A datacentre may contain more than one zone, so applications can partition
data across different sets of servers within the same datacentre.

Diagram - Illustrates servers in a Spanner Universe.

A zone has one zonemaster and between one hundred and several thousand spanservers. The
zonemaster assigns data to spanservers, which serve that data to clients. Clients use the
location proxies to locate the spanservers that hold their data.
The universemaster displays status information about each zone, and the placement driver is
responsible for the automated movement of data across zones. Each spanserver manages
between 100 and 1000 tablets (data structures) and runs a Paxos state machine on top of each
tablet to support replication (Corbett, et al., 2012).
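The roles described above can be sketched in a few lines of Python. This is only an illustrative model, not Google's implementation: the class names mirror the terms in the Spanner paper, but every method name and the "least-loaded spanserver" allocation rule are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Spanserver:
    name: str
    tablets: list = field(default_factory=list)  # names of the tablets it serves


@dataclass
class Zone:
    name: str
    spanservers: list = field(default_factory=list)

    def assign(self, tablet: str) -> Spanserver:
        """Zonemaster role: hand the tablet to a spanserver.

        Choosing the least-loaded server is an assumption for this sketch.
        """
        target = min(self.spanservers, key=lambda s: len(s.tablets))
        target.tablets.append(tablet)
        return target

    def locate(self, tablet: str):
        """Location-proxy role: find which spanserver serves a tablet."""
        for server in self.spanservers:
            if tablet in server.tablets:
                return server
        return None


@dataclass
class Universe:
    zones: dict = field(default_factory=dict)

    def add_zone(self, zone: Zone) -> None:
        # Zones can be added to a running universe.
        self.zones[zone.name] = zone

    def remove_zone(self, name: str) -> None:
        # Zones can also be removed from a running universe.
        self.zones.pop(name, None)

    def status(self) -> dict:
        """Universemaster role: report the number of tablets per zone."""
        return {
            name: sum(len(s.tablets) for s in zone.spanservers)
            for name, zone in self.zones.items()
        }
```

For example, creating a Zone with two Spanserver objects, calling assign() twice and then calling Universe.status() reports two tablets for that zone, one on each server.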
Hardware & Software used by Google
Spanner makes use of hardware-assisted time synchronization, using GPS clocks and atomic
clocks, to ensure global consistency (Corbett, et al., 2012). Google had to install antennas on
the roofs of its datacentres and connect them to the hardware below. According to Andrew
Fikes, the GPS units they use are relatively inexpensive devices available from lots of
different vendors. The timekeepers are kept in racks alongside the servers and, again, they
need only connect to some of the machines in the datacentre (Metz, 2012). Spanner's time
API, TrueTime, is implemented by a set of time master machines per datacentre and does not
require specialized servers. Google also make use of Paxos state machines and spanserver
machines (Corbett, et al., 2012).
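To make the idea of clock-based consistency more concrete, the following is a minimal Python sketch of a TrueTime-style interface. The Spanner paper describes TrueTime as returning an interval that is guaranteed to contain the true time, and describes waiting out that uncertainty before a commit becomes visible ("commit wait"). Everything else here is an assumption for illustration: the class name TrueTimeSketch, the fixed uncertainty bound epsilon, and the use of the local system clock in place of GPS and atomic clock masters.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class TrueTimeSketch:
    def __init__(self, epsilon: float, clock=time.time):
        # epsilon: assumed fixed clock uncertainty in seconds (hypothetical;
        # the real system derives a varying bound from its time masters).
        self.epsilon = epsilon
        self.clock = clock

    def now(self) -> TTInterval:
        """Return an interval assumed to contain the true time."""
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t: float) -> bool:
        """True only once t has definitely passed."""
        return self.now().earliest > t

    def commit_wait(self, sleep=time.sleep) -> float:
        """Pick a commit timestamp, then wait until it is definitely in the past."""
        commit_ts = self.now().latest
        while not self.after(commit_ts):
            sleep(self.epsilon / 10)
        return commit_ts
```

The design point is that a smaller epsilon means a shorter wait, which is why accurate GPS and atomic clocks matter: the tighter the bound on clock uncertainty, the less time each commit spends waiting.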
Security of the Google Distributed Database
"At our data centres, we take security very seriously. We keep your data safe and secure by
using dozens of critical security features" (Google, 2015). Google build exclusive custom
servers containing only the necessary hardware and software. They also have emergency backup
generators. Data is automatically distributed, in randomly named chunks, across many
computers in different locations, which avoids single points of failure. The location and
status of each hard drive in their datacentres is tracked. Drives that have reached the end
of their lives are destroyed in a thorough, multi-step process. At their datacentres,
Google have access controls, guards on duty 24/7, video surveillance and perimeter fencing
to physically protect the sites at all times (Google, 2015).

Reliability of the Google Distributed Database
Spanner is used on a global scale, which makes reliability essential. Spanner automatically
shards data and synchronously replicates it across machines (even across datacentres) to
balance load and to respond to failures and to changes in the amount of data or the
number of servers (Corbett, et al., 2012). Spanner is designed to scale up to millions of
machines across hundreds of datacentres and trillions of database rows (Corbett, et al.,
2012, p. 1). Applications can use Spanner for high availability even in the face of wide-area
natural disasters (Corbett, et al., 2012). They do this by replicating their data within or
even across continents. Clients automatically fail over between replicas, which means that if
a server crashes the user can access the same data from a replica on a different server
(Corbett, et al., 2012).
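Client-side failover between replicas can be illustrated with a short Python sketch. It is not Spanner's client library: the Replica class, the alive flag and the read_with_failover function are hypothetical names, and trying replicas in a fixed order is an assumption made to keep the example small.

```python
class Replica:
    def __init__(self, name: str, data: dict):
        self.name = name
        self.data = data
        self.alive = True  # flipped to False to simulate a crashed server

    def read(self, key: str):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


def read_with_failover(replicas, key: str):
    """Try each replica in turn and return (replica name, value) from the first that answers."""
    for replica in replicas:
        try:
            return replica.name, replica.read(key)
        except ConnectionError:
            continue  # this replica is unavailable, so try the next one
    raise RuntimeError("no replica available")
```

Because every replica holds the same data, a crash of the first replica changes only which server answers, not the value the user sees. The read fails only when every replica is down.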
Examples of Google's historical issues
An early incarnation of Spanner supported multiple Paxos state machines per tablet, which
allowed for more flexible replication configurations. The complexity of that design led to it
being abandoned (Corbett, et al., 2012).

Netflix
General Description of the Netflix Services and Database
Netflix are international providers of on-demand internet streaming media. Netflix have new
movies and TV shows coming all the time, options for subtitles or dubbing, award-winning
original series and documentaries that you won't find anywhere else (Netflix, 2015). The
distributed database they use is Apache Cassandra, in its DataStax Enterprise edition
(Datastax, 2014). Cassandra is a distributed storage system for managing structured data that
is designed to scale to a very large size across many commodity servers, with no single point
of failure (Lakshman, 2008). Cassandra was originally developed by Facebook and is now
used by Netflix for 95% of their database needs. Subscriber data, video metadata, pause
locations and every user interaction are stored and processed to build recommendations for
individual users (Kalantzis, 2014).
Architecture of the Netflix Distributed Database
Cassandra doesn't support a full relational data model. Instead, it provides clients with a
simple data model that supports dynamic control over data layout and format (Lakshman &
Malik, 2010). An instance of Cassandra has one table made up of one or more column families,
as defined by the user. Each column family can consist of columns or supercolumns, which are
created dynamically. There is no limit on the number of these that can be stored within a
family. A column has a name, a value and a timestamp. A supercolumn has a name and an
unbounded number of columns. Every row has a unique key, which is a string of any size. Rows
need not share the same columns: key K4 could have 94 columns/supercolumns while key K5
could have 20 (Lakshman, 2008).
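The data model described above maps naturally onto nested dictionaries. The Python sketch below is an in-memory illustration only, not the Cassandra client API: the ColumnFamily class and its method names are hypothetical, supercolumns are left out for brevity, and using the timestamp to keep the newest write is an assumption about how conflicting writes are resolved.

```python
import time


class ColumnFamily:
    """Rows keyed by a string; each row maps column name -> (value, timestamp)."""

    def __init__(self, name: str):
        self.name = name
        self.rows = {}

    def insert(self, key: str, column: str, value, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        row = self.rows.setdefault(key, {})  # columns are created on demand
        current = row.get(column)
        # Keep the write with the newest timestamp (assumed conflict rule).
        if current is None or ts >= current[1]:
            row[column] = (value, ts)

    def get(self, key: str, column: str):
        return self.rows[key][column][0]

    def column_count(self, key: str) -> int:
        # Different keys can hold different numbers of columns.
        return len(self.rows.get(key, {}))
```

Inserting two columns under key "K4" and one under key "K5" gives rows of different widths within the same column family, which is the flexibility that the K4 and K5 example describes.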

Hardware & Software used by Netflix
Netflix has been shifting technology from in-house data centres to third-party facilities for
years now, and it says that the process is coming to its logical conclusion: the company
is shutting down the last of its data centres (Brodkin, 2015). Netflix use AWS (Amazon Web
Services), a cloud platform. Cassandra runs in the cloud, so Netflix do not need to keep
their own hardware in-house to run it. As the company put it, "We are fully reliant on
Amazon Web Services" (Brodkin, 2015). On the software side, they use DataStax's
version of Cassandra. It is Java based, and Netflix is primarily a Java shop (Kalantzis, 2014).
Security of the Netflix Distributed Database
"We believe we use reasonable administrative, logical, physical and managerial measures to
safeguard your personal information against loss, theft and unauthorized access, use and
modification. Unfortunately, no measures can be guaranteed to provide 100% security.
Accordingly, we cannot guarantee the security of your information" (Netflix, 2015). Netflix
first used open-source Cassandra but migrated to DataStax Enterprise for its enterprise-level
security and production capabilities (Datastax, 2014). DataStax Enterprise (DSE) builds on
the basic security feature set provided in open-source Apache Cassandra and adds features
including internal and external authentication, permission management, transparent data
encryption and data auditing (DataStax, 2015).
Reliability of the Netflix Distributed Database
Netflix use the Simian Army to ensure reliability. The army includes the following
members: Chaos Monkey, Chaos Gorilla and Chaos Kong. Netflix use these to test Cassandra.
Chaos Monkey kills individual nodes. Chaos Gorilla kills a whole availability zone (the
equivalent of a whole rack in a datacentre) to prove that the application is fault tolerant to
such an event. Chaos Kong turns off a whole Amazon Web Services region (the equivalent of a
whole data centre) to prove that Cassandra is resilient to that type of failure as well
(Kalantzis, 2014). Small and large components fail continuously; the way Cassandra manages
persistent state in the face of these failures drives the reliability and scalability of the
software systems relying on this service (Lakshman, 2008).
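The three failure scopes can be modelled with a small Python simulation. This is not the Simian Army tooling itself: the Node class, the kill and available functions and the zone and region names used with them are all hypothetical, and "data is available while at least one replica is alive" is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    zone: str
    region: str
    alive: bool = True


def kill(nodes, *, name=None, zone=None, region=None):
    """Mark as dead every live node matching the given scope; return how many died.

    name   -> one node            (Chaos Monkey style)
    zone   -> availability zone   (Chaos Gorilla style)
    region -> whole region        (Chaos Kong style)
    """
    killed = 0
    for node in nodes:
        hit = (
            (name is not None and node.name == name)
            or (zone is not None and node.zone == zone)
            or (region is not None and node.region == region)
        )
        if hit and node.alive:
            node.alive = False
            killed += 1
    return killed


def available(nodes, replica_names):
    """Data stays readable while at least one of its replicas is alive."""
    return any(n.alive for n in nodes if n.name in replica_names)
```

If the replicas of a piece of data are spread across two zones and two regions, the data survives the loss of a node, a zone or a region, and becomes unavailable only when the last region holding a replica is also lost.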
Examples of Netflix's historical issues
Netflix's previous Oracle database went down for more than 48 hours. The fault lay not with
the database but with the Storage Area Network that stored all the data. This outage was the
reason Netflix decided to look for an alternative. Netflix tried SimpleDB, but it wasn't
scalable enough for their requirements (Kalantzis, 2014).

References
Brodkin, J., 2015. Netflix shuts down its last data center, but it still runs a big IT operation.
[Online]
Available at: http://arstechnica.com/information-technology/2015/08/netflix-shuts-down-its-last-data-center-but-still-runs-a-big-it-operation/
[Accessed 23 October 2015].
Corbett, J. C. et al., 2012. Spanner: Google's Globally-Distributed Database. [Online]
Available at: http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf
[Accessed 12 October 2015].
Datastax, 2014. Netflix Personalizes Viewing for Over 50 Million Customers with DataStax.
[Online]
Available at: http://www.datastax.com/wp-content/uploads/2011/09/CS-Netflix.pdf?3
[Accessed 27 October 2015].
DataStax, 2015. DataStax Enterprise Advanced Security. [Online]
Available at: http://www.datastax.com/products/datastax-enterprise-security
[Accessed 26 October 2015].
Google, 2015. Google Datacenters. [Online]
Available at: https://www.google.com/about/datacenters/inside/data-security/
[Accessed 12 October 2015].
Google, 2015. Our products and services. [Online]
Available at: https://www.google.com/about/company/products/
[Accessed 12 October 2015].
Hsieh, W., 2012. Wilson Hsieh - Spanner: Google's Globally-Distributed Database - OSDI
2012. [Online]
Available at: https://www.youtube.com/watch?v=NthK17nbpYs
[Accessed 12 October 2015].

Kalantzis, C., 2014. Netflix: Cassandra @ Netflix Building a House of Cards on a Solid
Foundation. [Online]
Available at: https://www.youtube.com/watch?v=RMSNLP_ORg8
[Accessed 23 October 2015].
Lakshman, A., 2008. Cassandra - A structured storage system on a P2P Network. [Online]
Available at: https://www.facebook.com/notes/facebook-engineering/cassandra-a-structured-storage-system-on-a-p2p-network/24413138919
[Accessed 23 October 2015].
Lakshman, A. & Malik, P., 2010. Cassandra - A Decentralized Structured Storage System.
[Online]
Available at: https://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
[Accessed 23 October 2015].
Metz, C., 2012. Exclusive: Inside Google Spanner, the Largest Single Database on Earth.
[Online]
Available at: http://www.wired.com/2012/11/google-spanner-time/
[Accessed 23 October 2015].
Netflix, 2015. [Online]
Available at: https://www.netflix.com/ie/
[Accessed 23 October 2015].
Netflix, 2015. Privacy Statement. [Online]
Available at: https://www.netflix.com/privacy
[Accessed 23 October 2015].
