Google Bigtable
The magic behind Google’s
data management
Overview
● Introduction
● Challenges
● Data model
● Building blocks
● Conclusion
Intro
➔ Bigtable is Google’s cloud-based data storage service.
➔ It is built on a distributed, parallel architecture with clustering.
➔ It is self-managing, highly scalable, fault tolerant and flexible.
➔ Bigtable provides low-latency, real-time access and improved processing of
heavy workloads.
➔ It integrates with other products and services through APIs.
➔ Many Google services use Bigtable to store data, including Gmail, YouTube,
web indexing, Google Maps and Google Analytics.
Original Idea
Challenges
Jeffrey and Sanjay decided to build a datastore service that could scale linearly across thousands of
commodity servers.
● Cheap hardware may lead to system failures.
● How to retain performance at high scale?
-- Compromise on a few things:
> Abandon the traditional relational model (no joins)
> Replicate data
> Use a parallel and distributed architecture
Data Model
A Bigtable is a sparse, distributed, persistent multi-dimensional sorted map. The map is indexed by a
row key, column key, and a timestamp; each value in the map is an uninterpreted array of bytes.
Bigtable treats data as uninterpreted strings, whether the data is structured or unstructured.
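The definition above can be made concrete with a small in-memory sketch. It is not Bigtable's implementation or API: the class name `TinyTable`, its methods, and the sample row and column keys are all hypothetical, chosen only to show a sparse map indexed by (row key, column key, timestamp) whose row keys stay in lexicographic order.

```python
from bisect import bisect_left, insort


class TinyTable:
    """Toy stand-in for a (row, column, timestamp) -> bytes map (hypothetical)."""

    def __init__(self):
        self._rows = {}         # row key -> {column key -> {timestamp -> bytes}}
        self._sorted_keys = []  # row keys kept in lexicographic order

    def put(self, row, column, timestamp, value):
        # Sparse: only cells that are actually written take up space.
        if row not in self._rows:
            self._rows[row] = {}
            insort(self._sorted_keys, row)
        self._rows[row].setdefault(column, {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        # Return the requested version, or the newest one if none is given.
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def scan(self, start_row, end_row):
        # Row keys in [start_row, end_row), in sorted order.
        lo = bisect_left(self._sorted_keys, start_row)
        hi = bisect_left(self._sorted_keys, end_row)
        return self._sorted_keys[lo:hi]


table = TinyTable()
table.put("com.example.www", "contents:", 1, b"<html>v1</html>")
table.put("com.example.www", "contents:", 2, b"<html>v2</html>")
table.put("com.example.mail", "anchor:home", 1, b"Mail")
table.put("org.example", "contents:", 1, b"<html>org</html>")

newest = table.get("com.example.www", "contents:")
com_rows = table.scan("com.", "com/")
```

Because the rows are sorted, a range scan over a key prefix touches only neighbouring rows, which is why the choice of row key matters for locality.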
● Rows
➔ Row keys in a table are arbitrary strings.
➔ Data is maintained in lexicographic order by row key.
➔ The row range of a table is dynamically partitioned. Each row range is called a tablet, which is the unit of distribution and load balancing.
● Columns
➔ Column keys are grouped into sets called column families.
➔ Data stored in a column family is usually of the same type.
➔ A column key is named using the syntax family:qualifier.
➔ Column family names must be printable, but qualifiers may be arbitrary strings.
● Timestamps
➔ Each cell in a Bigtable can contain multiple versions of the same data.
➔ Versions are indexed by 64-bit integer timestamps.
➔ Timestamps can be assigned automatically by Bigtable or explicitly by client applications.
[Figure: data model diagram with labels Rows, Columns and Timestamps]
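The column-key syntax and the versioning rules above can be sketched as follows. The helper `parse_column_key` and the class `Cell` are hypothetical names, not part of any Bigtable client library, and the choice of microseconds for automatically assigned timestamps is an assumption made for the example.

```python
import time
from typing import Optional


def parse_column_key(column_key: str):
    """Split 'family:qualifier' at the first colon (hypothetical helper)."""
    family, sep, qualifier = column_key.partition(":")
    # The family name must be present and printable; the qualifier is arbitrary.
    if not sep or not family or not family.isprintable():
        raise ValueError(f"bad column key: {column_key!r}")
    return family, qualifier


class Cell:
    """One cell holding several timestamped versions of a value."""

    def __init__(self):
        self.versions = {}  # 64-bit integer timestamp -> bytes

    def write(self, value: bytes, timestamp: Optional[int] = None) -> int:
        if timestamp is None:
            # Assigned automatically; microseconds are an assumption here.
            timestamp = time.time_ns() // 1000
        if not -(2 ** 63) <= timestamp < 2 ** 63:
            raise ValueError("timestamp does not fit in 64 bits")
        self.versions[timestamp] = value
        return timestamp

    def latest(self) -> bytes:
        return self.versions[max(self.versions)]


cell = Cell()
cell.write(b"old", timestamp=1)   # explicit timestamp from the client
cell.write(b"new", timestamp=2)
latest_explicit = cell.latest()
auto_ts = cell.write(b"auto")     # timestamp assigned automatically
```

Splitting at the first colon keeps any later colons inside the qualifier, which fits the rule that qualifiers may be arbitrary strings.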
Building Blocks
Bigtable is built on several other pieces of Google infrastructure.
● Google File System (GFS)
● SSTable: Data structure for storage
● Chubby: Distributed lock service
Three major components
❖ Library linked into every client
❖ Single master server
▪ Assigning tablets to tablet servers
▪ Detecting addition and expiration of tablet servers
▪ Balancing tablet-server load
▪ Garbage collecting files in GFS
❖ Many tablet servers
▪ Each manages a set of tablets
▪ Handles read and write requests to the tablets it has loaded
▪ Splits tablets that have grown too large
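Two of the duties listed above, splitting an oversized tablet and assigning tablets to the least-loaded tablet server, can be illustrated with a short sketch. Everything here is a simplification under stated assumptions: a tablet is modelled as a sorted list of row keys, size is measured in rows rather than bytes, and the function names and server names are hypothetical.

```python
def split_if_too_large(tablet, max_rows):
    """Tablet-server side: split a sorted list of row keys at its midpoint."""
    if len(tablet) <= max_rows:
        return [tablet]
    mid = len(tablet) // 2
    return [tablet[:mid], tablet[mid:]]


def assign(tablets, servers):
    """Master side: give each tablet to the server holding the fewest tablets."""
    load = {server: [] for server in servers}
    for tablet in tablets:
        target = min(servers, key=lambda s: len(load[s]))
        load[target].append(tablet)
    return load


rows = ["a", "b", "c", "d", "e"]
tablets = split_if_too_large(rows, max_rows=4)
assignment = assign(tablets, ["ts1", "ts2"])
```

Each half of a split tablet still covers a contiguous row range, so the sorted order of the table is preserved across the split.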
Three level hierarchy
Level 1: Chubby file containing the location of the root tablet
Level 2: Root tablet contains the location of METADATA tablets
Level 3: Each METADATA tablet contains the location of user tablets
▪ The location of a tablet is stored under a row key that
encodes the tablet's table identifier and its end row
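A lookup through the three levels can be sketched with plain dictionaries and sorted lists. This is only an illustration: the file path, tablet names, server names and the `END` sentinel are hypothetical, and the (table identifier, end row) key is modelled as a Python tuple rather than an encoded string.

```python
from bisect import bisect_left

END = "\U0010ffff"  # hypothetical sentinel standing for "end of the table"

# Level 1: a Chubby file that names the location of the root tablet.
CHUBBY = {"/bigtable/root-location": "root@ts0"}

# Levels 2 and 3: each tablet is a sorted list of
# ((table identifier, end row), location) entries.
TABLETS = {
    "root@ts0": [
        (("users", "m"), "meta-1@ts1"),
        (("users", END), "meta-2@ts2"),
    ],
    "meta-1@ts1": [(("users", "f"), "ts3"), (("users", "m"), "ts4")],
    "meta-2@ts2": [(("users", "t"), "ts5"), (("users", END), "ts6")],
}


def find_entry(entries, key):
    """Return the location for the first entry whose end row is >= key."""
    keys = [entry_key for entry_key, _ in entries]
    return entries[bisect_left(keys, key)][1]


def locate(table_id, row):
    """Walk Chubby file -> root tablet -> METADATA tablet -> user tablet."""
    key = (table_id, row)
    root_location = CHUBBY["/bigtable/root-location"]
    metadata_location = find_entry(TABLETS[root_location], key)
    return find_entry(TABLETS[metadata_location], key)


server_for_alice = locate("users", "alice")
server_for_nina = locate("users", "nina")
```

Keying each entry by the tablet's end row means a single sorted search finds the tablet that covers a given row, with no need to store start rows.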
“All models are wrong. Some models are useful.”
- George Box, “one of the great statistical minds of the 20th century”
Distributed and parallel computing has paved the way for new technologies to flourish.
Conclusion
Bigtable provides low-latency, real-time access and improved processing of heavy workloads, with high scalability and
high throughput. Its robust, fault-tolerant architecture helps reduce the risk of data loss. Reliable cluster resizing makes
it possible to provision or de-provision a cluster with no downtime. Autonomous management frees users from managing
tasks and the assignment of data, because Bigtable does this automatically. Integration with other products and services
through APIs makes it a general-purpose data store, extending its capabilities and giving users a reliable interface to get
more out of less. Bigtable uses a parallel and distributed architecture to process data at lightning speed while reducing
the cost per computation; the back-end architecture is advanced and has proved itself in both performance and user
experience.
With demand for large-scale cloud data storage growing, Bigtable has emerged as one of the best available solutions,
offering lower cost, high performance, durability and flexibility. Since it already powers many of Google’s services, it has
proved its usability. It really is the magic behind Google’s data management and high-performance operation, giving
Google an edge over other giants in the field.
Thanks!
For giving your precious time.
