Dremel:
Interactive Analysis of Web-Scale Datasets
Carl Adler
IDSL - Dep. IM - NTUST
Outline
• About Dremel
• Main Features
• Record-Oriented vs. Column-Oriented
• Data Model
• Nested Columnar Storage
• Query Execution
• Experiments
• Conclusions
About Dremel
A scalable, interactive ad-hoc query system for analysis of read-only nested
data. By combining multi-level execution trees and columnar data layout, it is
capable of running aggregation queries over trillion-row tables in seconds.
Main Features
• Dremel is a large-scale system
• A complement to MapReduce for interactive queries
• The nested data model
• Builds on ideas from web search and parallel DBMSs
• Column-striped storage representation
Main Features
Dremel is a large-scale system:
• Reading 1TB of compressed data in 1 sec
-> Needs tens of thousands of disks, concurrently reading.
-> Fault tolerance is critical.
Main Features
A complement to MapReduce for interactive queries:
• Unlike traditional DBs, it is capable of operating on in situ nested data.
• Not a replacement for MR.
Main Features
The nested data model:
• Data used on the web is often non-relational.
• A flexible data model, such as JSON, is needed.
Main Features
Builds on ideas from web search and parallel DBMSs:
• Serving tree:
Divide a huge and complicated query into several small queries.
• SQL-like interface:
Like Hive and Pig.
Main Features
Column-striped storage representation:
• Reads less data from secondary storage and reduces CPU cost thanks to
cheaper compression.
• Column stores have been adopted for analyzing relational data but to
the best of our knowledge have not been extended to nested data
models.
Record-Oriented vs. Column-Oriented
(Figure: record-oriented layout beside column-oriented layout)
Record-Oriented vs. Column-Oriented
• With a column-oriented layout, we can retrieve A.B.C without
reading A.E, A.B.D, etc.
• Challenge: how to scan an arbitrary subset of fields efficiently
while running the analysis at the same time.
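The difference can be illustrated with a small sketch. The field paths A.B.C, A.B.D and A.E come from the slide; the sample values and the helper names (`scan_rows`, `to_columns`, `scan_column`) are made up for illustration, and "values touched" is only a rough stand-in for I/O cost.

```python
# Three toy records, flattened to field paths for simplicity (values are invented).
rows = [
    {"A.B.C": 1, "A.B.D": "x", "A.E": 0.1},
    {"A.B.C": 2, "A.B.D": "y", "A.E": 0.2},
    {"A.B.C": 3, "A.B.D": "z", "A.E": 0.3},
]

def scan_rows(rows, path):
    """Record-oriented scan: every field of every record is read."""
    touched, values = 0, []
    for row in rows:
        touched += len(row)          # the whole record comes off disk
        values.append(row[path])
    return values, touched

def to_columns(rows):
    """Re-arrange the records into one list of values per field path."""
    columns = {}
    for row in rows:
        for path, value in row.items():
            columns.setdefault(path, []).append(value)
    return columns

def scan_column(columns, path):
    """Column-oriented scan: only the requested column is read."""
    values = columns[path]
    return values, len(values)

row_values, row_touched = scan_rows(rows, "A.B.C")
col_values, col_touched = scan_column(to_columns(rows), "A.B.C")
```

Both scans return the same values for A.B.C, but the columnar scan touches a third as many values in this toy table.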
Data Model
• The data model originated in the context of distributed systems (Protocol
Buffers), is used widely at Google, and is available as an open source
implementation.
• The data model is based on strongly-typed nested records.
Its abstract syntax is given by:
𝝉 = dom | < A1 : 𝝉[∗|?], ..., An : 𝝉[∗|?] >
Data Model
𝝉 = dom | < A1 : 𝝉[∗|?], ..., An : 𝝉[∗|?] >
• 𝝉: An atomic type or a record type.
• Atomic type: Integers, floating-point numbers, strings, etc.
• Record: It consists of one or more fields.
• Repeated fields (*) may occur multiple times in a record.
• Optional fields (?) may be missing from the record.
• Otherwise, a field is required.
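As a rough illustration of this syntax, the sketch below encodes a nested schema in Python. The `Field` class and `leaf_paths` helper are hypothetical names introduced here. The `Document` schema is an assumption: it follows the running example of the Dremel paper, which the slide figures are not reproduced in this text to confirm.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Field:
    name: str
    label: str                      # "required", "optional" (?) or "repeated" (*)
    type: str = "record"            # atomic type name, or "record" for a nested record
    children: List["Field"] = field(default_factory=list)

# Assumed example schema (after the Dremel paper's Document example).
DOCUMENT = [
    Field("DocId", "required", "int64"),
    Field("Links", "optional", children=[
        Field("Backward", "repeated", "int64"),
        Field("Forward", "repeated", "int64"),
    ]),
    Field("Name", "repeated", children=[
        Field("Language", "repeated", children=[
            Field("Code", "required", "string"),
            Field("Country", "optional", "string"),
        ]),
        Field("Url", "optional", "string"),
    ]),
]

def leaf_paths(fields, prefix=""):
    """List the dotted paths of all atomic fields, i.e. the columns to store."""
    paths = []
    for f in fields:
        path = prefix + f.name
        if f.children:
            paths.extend(leaf_paths(f.children, path + "."))
        else:
            paths.append(path)
    return paths

columns = leaf_paths(DOCUMENT)
```

Each path returned by `leaf_paths` corresponds to one column in the column-striped representation.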
Data Model
Data Model
This type of data model is language-independent and platform-neutral, so an MR
program written in Java can consume records from a data source exposed via a
C++ library.
Nested Columnar Storage
• Values alone do not convey the structure of a record.
• Given two values of a repeated field, we do not know at what ‘level’ the value
repeated (e.g., whether these values are from two different records, or two
repeated values in the same record).
-> Repetition Levels
• Given a missing optional field, we do not know which enclosing records were
defined explicitly.
-> Definition Levels
Nested Columnar Storage
Nested Columnar Storage: Repetition Levels
• Repetition Levels:
A repetition level tells us at which repeated field in the field’s path the value has repeated.
• The field path Name.Language.Code contains two repeated fields, Name
and Language. Hence, the repetition level of Code ranges between 0 and 2;
level 0 denotes the start of a new record.
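The range 0 to 2 follows from counting the repeated fields on the path. A minimal sketch, assuming the path is given as (name, label) pairs and that Code itself is a required field (as in the Dremel paper's example; the slide text only states that Name and Language are repeated):

```python
# Path Name.Language.Code as (field name, label) pairs.
# The "required" label on Code is an assumption taken from the paper's example.
PATH = [("Name", "repeated"), ("Language", "repeated"), ("Code", "required")]

def max_repetition_level(path):
    """The maximum repetition level is the number of repeated fields on the path."""
    return sum(1 for _, label in path if label == "repeated")

max_r = max_repetition_level(PATH)
possible_levels = list(range(max_r + 1))   # 0 = new record, 1 = new Name, 2 = new Language
```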
Nested Columnar Storage: Repetition Levels
Nested Columnar Storage: Definition Levels
• Definition Levels:
Each value of a field with path p, esp. every NULL, has a definition level
specifying how many fields in p that could be undefined (because they are
optional or repeated) are actually present in the record.
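Both levels can be computed together while walking a record along one field path. The following is a minimal sketch, not the paper's column-striping algorithm verbatim: records are modelled as nested Python dicts (repeated fields as lists, missing optional fields as absent keys), and the sample records `r1` and `r2`, the `stripe` helper and the field labels are assumptions based on the Dremel paper's running example.

```python
def stripe(record, path):
    """Return the (value, repetition level, definition level) triples of one column."""
    out = []
    # rep_depth[i] = number of repeated fields among path[0..i]
    rep_depth, count = [], 0
    for _, label in path:
        count += label == "repeated"
        rep_depth.append(count)

    def walk(node, i, r, d):
        name, label = path[i]
        value = node.get(name)
        if label == "repeated":
            items = value or []
        else:
            items = [] if value is None else [value]
        if not items:
            # Field missing: emit NULL with the definition level reached so far.
            out.append((None, r, d))
            return
        if label != "required":
            d += 1                       # an optional/repeated field is present
        for k, item in enumerate(items):
            # The first item inherits r; later items repeat at this field's level.
            cur_r = r if k == 0 else rep_depth[i]
            if i == len(path) - 1:
                out.append((item, cur_r, d))
            else:
                walk(item, i + 1, cur_r, d)

    walk(record, 0, 0, 0)
    return out

# Sample records (assumed, after the paper's example).
r1 = {
    "DocId": 10,
    "Links": {"Forward": [20, 40, 60]},
    "Name": [
        {"Language": [{"Code": "en-us", "Country": "us"}, {"Code": "en"}],
         "Url": "http://A"},
        {"Url": "http://B"},
        {"Language": [{"Code": "en-gb", "Country": "gb"}]},
    ],
}
r2 = {
    "DocId": 20,
    "Links": {"Backward": [10, 30], "Forward": [80]},
    "Name": [{"Url": "http://C"}],
}

CODE = [("Name", "repeated"), ("Language", "repeated"), ("Code", "required")]
code_column = stripe(r1, CODE) + stripe(r2, CODE)
```

For these two records, `code_column` holds five entries: three real codes and two NULLs whose definition level 1 records that a Name was present but had no Language.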
Nested Columnar Storage: Definition Levels
Nested Columnar Storage
• Splitting Records into Columns:
With this data model, writing is straightforward: each record is split into its
columns. The harder part is reading. A query does not need to read entire records;
it reads only the columns it needs and assembles partial records from them.
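As a small illustration of reading a single column back, the sketch below regroups a striped column into per-record value lists, using only the rule that repetition level 0 starts a new record. It is far simpler than the full record assembly automaton; the helper name `split_by_record` and the sample triples (value, repetition level, definition level) are assumptions modelled on the paper's Name.Language.Code example.

```python
def split_by_record(column):
    """Group a column's non-NULL values by record; r == 0 marks a new record."""
    records = []
    for value, r, _d in column:
        if r == 0:
            records.append([])
        if value is not None:
            records[-1].append(value)
    return records

# Assumed sample column: (value, repetition level, definition level)
code_column = [
    ("en-us", 0, 2),
    ("en", 2, 2),
    (None, 1, 1),
    ("en-gb", 1, 2),
    (None, 0, 1),
]
codes_per_record = split_by_record(code_column)
```

No other column had to be read to recover which codes belong to which record.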
Nested Columnar Storage
Nested Columnar Storage
Complete record assembly automaton. Edges are labeled with repetition levels.
Query Execution
• Dremel’s query language is based on SQL and is designed to be efficiently
implementable on columnar nested storage.
• Each SQL statement takes as input one or multiple nested tables and their
schemas and produces a nested table and its output schema.
Query Execution
Sample query, its result, and output schema.
Query Execution
Architecture:
• Dremel uses a multi-level serving tree to execute queries.
• A root server receives incoming queries, reads metadata from the tables,
and routes the queries to the next level in the serving tree. The leaf servers
communicate with the storage layer or access the data on local disk.
Query Execution
System architecture and execution inside a server node.
Query Execution
• Consider the simple aggregation query below:
SELECT A, COUNT(B) FROM T GROUP BY A
• When the root server receives the above query, it determines all tablets, i.e.,
horizontal partitions of the table, that comprise T and rewrites the query as
follows:
SELECT A, SUM(c) FROM (R^1_1 UNION ALL ... R^1_n) GROUP BY A
Query Execution
• Tables R^1_1, …, R^1_n are the results of queries sent to the nodes 1, …, n at level
1 of the serving tree:
R^1_i = SELECT A, COUNT(B) AS c FROM T^1_i GROUP BY A
• T^1_i is a disjoint partition of tablets in T processed by server i at level 1.
• Each server therefore works on a dataset smaller than the original one, so
each partition can be processed faster.
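The rewrite can be mimicked in a few lines of Python: each level-1 server computes a partial COUNT per group over its own tablets, and the root sums the partial counts. This is a toy, single-process sketch; the function names `leaf_query` and `root_query` and the sample rows are invented for illustration.

```python
def leaf_query(tablet):
    """Partial result of one server: SELECT A, COUNT(B) AS c ... GROUP BY A."""
    counts = {}
    for row in tablet:
        counts.setdefault(row["A"], 0)
        if row.get("B") is not None:     # COUNT(B) ignores NULLs
            counts[row["A"]] += 1
    return counts

def root_query(partials):
    """Root: SELECT A, SUM(c) over the union of all partial results, GROUP BY A."""
    total = {}
    for partial in partials:
        for a, c in partial.items():
            total[a] = total.get(a, 0) + c
    return total

# Table T split into three disjoint partitions (invented data).
partitions = [
    [{"A": "x", "B": 1}, {"A": "y", "B": None}],
    [{"A": "x", "B": 2}, {"A": "x", "B": 3}],
    [{"A": "y", "B": 4}],
]

distributed = root_query(leaf_query(p) for p in partitions)
single_node = leaf_query([row for p in partitions for row in p])
```

The distributed result equals the single-node result because COUNT decomposes into a SUM of partial counts; the same trick does not apply directly to aggregates such as COUNT(DISTINCT ...).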
Query Execution
• Dremel is a multi-user system: several queries are usually executed
simultaneously.
• A query dispatcher schedules queries based on their priorities and balances
the load. Its other important role is to provide fault tolerance when one
server becomes much slower than others or a tablet replica becomes
unreachable.
Query Execution
• A system with 3000 leaf servers
• Each leaf server using 8 threads
• 3000 * 8 = 24000 (slots)
• A table spanning 100,000 tablets
• Assigning about 5 tablets / slot
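The arithmetic behind these numbers, as a short sketch (the variable names are ours; the figures are the ones on the slide):

```python
import math

leaf_servers = 3000
threads_per_server = 8
tablets = 100_000

slots = leaf_servers * threads_per_server        # tablets that can be scanned in parallel
tablets_per_slot = math.ceil(tablets / slots)    # 100,000 / 24,000 is about 4.2, rounded up
```

Rounding up gives the "about 5 tablets per slot" quoted above, since some slots must take a fifth tablet for all 100,000 to be covered.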
Experiments
• The basic data access characteristics on a single machine
• How columnar storage benefits MR execution
• Dremel’s performance
Experiments
Table name | Number of records | Size (unrepl., compressed) | Number of fields | Data center | Repl. factor
T1         | 85 billion        | 87 TB                      | 270              | A           | 3×
T2         | 24 billion        | 13 TB                      | 530              | A           | 3×
T3         | 4 billion         | 70 TB                      | 1200             | A           | 3×
T4         | 1+ trillion       | 105 TB                     | 50               | B           | 3×
T5         | 1+ trillion       | 20 TB                      | 30               | B           | 2×
Datasets used in the experimental study
Experiments – Single Machine
Performance breakdown when reading from a local disk
(300K-record fragment of Table T1)
T1 85 billion 87 TB 270 A 3×
Experiments – MR and Dremel
Q1: SELECT SUM( CountWords (txtField)) / COUNT(*) FROM T1
T1 85 billion 87 TB 270 A 3×
Experiments – Serving Tree Topology
Q2: SELECT country, SUM( item.amount ) FROM T2 GROUP BY country
Q3: SELECT domain, SUM( item.amount ) FROM T2 WHERE domain CONTAINS '.net' GROUP BY domain
T2 24 billion 13 TB 530 A 3×
Experiments – Per-tablet Histograms
The area under each histogram corresponds to 100%. As the figure
indicates, 99% of Q2 (or Q3) tablets are processed under one second
(or two seconds).
Experiments – Scalability
In each run, the total expended CPU time is nearly identical, at about
300K seconds, whereas the user-perceived time decreases near-linearly
with the growing size of the system.
Experiments – Stragglers
Q6: SELECT COUNT(DISTINCT a) FROM T5
In contrast to the other datasets, T5 is two-way replicated. Hence, the
likelihood of stragglers slowing the execution is higher since there are
fewer opportunities to reschedule the work.
T5 1+ trillion 20 TB 30 B 2×
Conclusions
• Dremel is a custom, scalable data management solution built from simpler
components. It complements the MR paradigm.
• We outlined the key aspects of Dremel, including its storage format, query
language, and execution.
• Multi-level execution trees & Columnar data layout
• In the future, it might be widely adopted in the world.
Reference
• Dremel: Interactive Analysis of Web-Scale Datasets
END