An introduction to column store indexes and batch mode
+ 
CPU 
DBA Level 300
About me
• An independent SQL consultant.
• A user of SQL Server from version 2000 onwards, with 12+ years of experience.
• I have a passion for understanding how the database engine works at a deep level.
A Brief History Of Column Store Technology
• The lineage of column store databases can be traced back to the MonetDB and VectorWise projects from the Netherlands, developed around the turn of the millennium.
• Data is stored column by column rather than row by row.
• Column store technology aims to exploit modern CPU architectures.
• Virtually all database vendors now have a column store database offering.
Many people predict a future in which all OLAP workloads will be serviced by column oriented databases.
Column Store Compression Schemes

Column values (Colour): Red, Red, Blue, Blue, Green, Green, Green

Dictionary
Lookup ID   Label
1           Red
2           Blue
3           Green

Segment
Lookup ID   Run Length
1           2
2           2
3           3

• Data is compressed going down the column using run length compression.
• Global and local dictionaries are used to store compression metadata.
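The Colour example can be reproduced with a few lines of Python. This is a minimal sketch of dictionary encoding followed by run length encoding in general, not SQL Server's actual storage format; the function names are hypothetical and chosen only for illustration.

```python
from itertools import groupby


def dictionary_encode(values):
    """Give each distinct value a lookup ID in order of first appearance."""
    dictionary = {}
    ids = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = len(dictionary) + 1  # IDs start at 1, as on the slide
        ids.append(dictionary[value])
    return dictionary, ids


def run_length_encode(ids):
    """Collapse consecutive repeats into (lookup ID, run length) pairs."""
    return [(lookup_id, len(list(run))) for lookup_id, run in groupby(ids)]


def decode(dictionary, segment):
    """Expand the runs and translate lookup IDs back into labels."""
    labels = {lookup_id: label for label, lookup_id in dictionary.items()}
    return [labels[lookup_id] for lookup_id, length in segment for _ in range(length)]


colours = ["Red", "Red", "Blue", "Blue", "Green", "Green", "Green"]
dictionary, ids = dictionary_encode(colours)
segment = run_length_encode(ids)

print(dictionary)  # {'Red': 1, 'Blue': 2, 'Green': 3}
print(segment)     # [(1, 2), (2, 2), (3, 3)]
```

Seven column values shrink to three dictionary entries and three runs, and `decode(dictionary, segment)` returns the original column. The longer the runs, the better this works, which is why sort order matters for compression.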
Column Store Index ‘Anatomy’
[Diagram labels: column store segments, global dictionary, local dictionary, deletion bitmap]
What Levels Of Compression Can Be Achieved ?
[Bar chart: size (axis 0 to 350) of the test data stored as a heap, with row compression, with page compression, as a clustered column store index and as a clustered column store index with archive compression. Percentage labels on the chart: 53 %, 59 %, 64 % and 72 %.]
* Posts tables from the four largest stack exchanges combined ( superuser, serverfault, maths and Ubuntu )
Demonstration 1: The Difference Batch Mode Makes 
Test Data Creation
Demonstration 1: The Difference Batch Mode Makes 
Test Queries
How Queries Are Executed
[Diagram labels: each iterator in the execution plan passes rows to the next row by row; control flow; data flow]
How do rows travel between iterators ?
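The difference between passing rows one at a time and passing them in batches can be sketched with a toy pull-based iterator model. This illustrates the general technique only and is not SQL Server's internal code; the `Scan`, `Filter` and `run` names are hypothetical.

```python
class Scan:
    """Leaf iterator: hands out rows from an in-memory list, batch_size rows per call."""

    def __init__(self, rows, batch_size):
        self.rows = rows
        self.batch_size = batch_size
        self.position = 0
        self.calls = 0  # how many times the parent asked for data

    def next(self):
        self.calls += 1
        if self.position >= len(self.rows):
            return None  # end of data
        batch = self.rows[self.position:self.position + self.batch_size]
        self.position += self.batch_size
        return batch


class Filter:
    """Pulls from its child and passes on only the rows that satisfy the predicate."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def next(self):
        while True:
            batch = self.child.next()  # control flows down to the child
            if batch is None:
                return None
            kept = [row for row in batch if self.predicate(row)]
            if kept:
                return kept  # data flows back up to the parent


def run(root):
    """Drain the plan by repeatedly asking the root iterator for data."""
    result = []
    while True:
        batch = root.next()
        if batch is None:
            return result
        result.extend(batch)


def is_even(row):
    return row % 2 == 0


rows = list(range(10))

row_scan = Scan(rows, batch_size=1)    # row mode: one row per call
row_result = run(Filter(row_scan, is_even))

batch_scan = Scan(rows, batch_size=5)  # batch mode: five rows per call
batch_result = run(Filter(batch_scan, is_even))

print(row_result == batch_result, row_scan.calls, batch_scan.calls)
```

Both plans return the same rows, but the scan is called 11 times in row mode and 3 times in batch mode. Fewer calls between iterators means less per-row overhead, which is the idea behind batch mode.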
Modern CPU Architecture
[Diagram labels: CPU die with multiple cores. Each core has an L0 UOP cache, a 32KB L1 instruction cache, a 32KB L1 data cache and a 256K unified L2 cache. The L3 cache and the cores are connected by a bi-directional ring bus. The un-core holds power and clock, QPI and the memory controller. Other labels: memory bus, TLB, IO.]
• System-on-chip ( SOC ) design with CPU cores as the basic building block.
• Utility services are provisioned by the ‘Un-core’ part of the CPU die.
• Four level cache hierarchy.
Memory Is The “New Disk”
[Bar chart: access latency in CPU clock cycles (axis 0 to 200) for sequential, in-page random and full random access to the L1, L2 and L3 caches, and for main memory. Data labels on the chart: 4, 4, 4, 18, 14, 11, 11, 11, 38 and 167.]
Batch mode is about working in the 4 ~ 38 clock cycle range and NOT the 167 cycle “CPU stall” range.
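To put those cycle counts into perspective, the short calculation below converts them to wall-clock time. It is a rough sketch that assumes the 2.0 GHz clock speed quoted in the test setup later in this deck; the function name is hypothetical.

```python
CLOCK_HZ = 2.0e9  # assumption: the 2.0 GHz clock of the test machine


def access_time_ns(cycles, clock_hz=CLOCK_HZ):
    """Convert a latency in clock cycles into nanoseconds."""
    return cycles * 1e9 / clock_hz


latencies = {
    "fastest cache access": 4,
    "slowest cache access": 38,
    "main memory access": 167,
}

for label, cycles in latencies.items():
    print(f"{label}: {cycles} cycles = {access_time_ns(cycles):.1f} ns")

# How many fastest-case cache accesses fit into one trip to main memory
stall_ratio = latencies["main memory access"] / latencies["fastest cache access"]
print(f"one main memory access costs about {stall_ratio:.0f} L1-speed accesses")
```

At this clock speed a trip to main memory costs 83.5 ns against 2 ns for the fastest cache access, so a query that keeps its working set in cache avoids stalls roughly 40 times longer than a cache hit.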
How Can A Column Store Index Fit Inside The CPU Cache ?
[Diagram labels: CPU, column store object pool, segment, batches]
The Column Store Object Pool
Batch Mode Pre-Requisites
Feature | SQL Server 2012 | SQL Server 2014
Presence of column store indexes | Yes | Yes
Parallel execution plan | Yes | Yes
No outer joins, NOT INs or UNION ALLs | Yes | No
Hash joins do not spill from memory | Yes | No
Scalar aggregates cannot be used | Yes | No
Feature | SQL Server 2012 | SQL Server 2014
Column store indexes | Yes | Yes
Clustered column store indexes | No | Yes
Updateable column store indexes | No | Yes
Column store archive compression | No | Yes
Columns in a column store index can be dropped | No | Yes
Support for GUID, binary, datetimeoffset precision > 2, numeric precision > 18 | No | Yes
Enhanced compression by storing short strings natively ( instead of 32 bit IDs ) | No | Yes
Bookmark support ( row_group_id:tuple_id ) | No | Yes
Mixed row / batch mode execution | No | Yes
Optimized hash build and join in a single iterator | No | Yes
Hash memory spills cause row mode execution | No | Yes
Iterators supported | Scan, filter, project, hash (inner) join and (local) hash aggregate | Yes
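How Column Store Index Updates Are Handled
[Diagram labels: row groups of columns A, B and C are encoded and compressed into segments, which are stored as blobs. Row groups of < 1,048,576 rows are held in delta stores; the tuple mover encodes and compresses them into segments.]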
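The flow in this diagram can be modelled in a few lines. The sketch below is a simplified illustration under stated assumptions, not the engine's implementation: the class and method names are hypothetical, a delta store is closed as soon as it reaches the row limit, and "compression" is reduced to pivoting rows into columns.

```python
ROW_GROUP_LIMIT = 1_048_576  # row group size shown on the slide


class ColumnStoreSketch:
    """Toy model: inserts land in a delta store, the tuple mover compresses full ones."""

    def __init__(self, row_group_limit=ROW_GROUP_LIMIT):
        self.row_group_limit = row_group_limit
        self.open_delta_store = []       # row-oriented, still accepting inserts
        self.closed_delta_stores = []    # full, waiting for the tuple mover
        self.compressed_row_groups = []  # column-oriented segments

    def insert(self, row):
        self.open_delta_store.append(row)
        if len(self.open_delta_store) == self.row_group_limit:
            # The delta store is full: close it and start a new one
            self.closed_delta_stores.append(self.open_delta_store)
            self.open_delta_store = []

    def tuple_mover(self):
        """Turn every closed delta store into one segment per column."""
        while self.closed_delta_stores:
            rows = self.closed_delta_stores.pop(0)
            segments = [list(column) for column in zip(*rows)]  # pivot rows to columns
            self.compressed_row_groups.append(segments)


# A tiny limit keeps the demonstration readable
store = ColumnStoreSketch(row_group_limit=4)
for i in range(10):
    store.insert((i, i * 10, i * 100))  # columns A, B and C

closed_before = len(store.closed_delta_stores)
store.tuple_mover()
print(closed_before, len(store.compressed_row_groups), len(store.open_delta_store))
```

With a limit of 4 and 10 inserted rows, two delta stores fill up and are compressed, while the last two rows stay in the open delta store until enough further rows arrive.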
Demonstration 2: Delta Stores In Action
Demonstration 3: Pre-sorting and Segment Elimination 
Test Data Creation
Demonstration 3: Pre-sorting and Segment Elimination 
Test Queries
Demonstration 4: Pre-sorting and Hash Aggregate Performance
Test Setup
• 2 x CPU, each 6 core 2.0 GHz (Sandy Bridge)
• 48 GB quad channel 1333 MHz DDR3 memory
• Warm large object cache used in all tests to remove storage as a factor.
• Hyper-threading enabled, unless specified otherwise.
A Typical Data Warehouse Query On Extra Large Non Sorted Data
1,095,500,000 rows
1,798MB in size
A Typical Data Warehouse Query On Extra Large Pre-Sorted Data
1,095,500,000 rows
8,555MB in size
Elapsed Time (ms) / Degree of Parallelism
[Line chart: elapsed time in ms (axis 0 to 80,000) against degree of parallelism (2 to 24, in steps of 2) for the non-sorted and the sorted column store.]
Lowering Clock Cycles Per Instruction By Leveraging SIMD
Scalar instruction: C = A + B
1 + 2 = 3
SIMD instruction: Vector C = Vector A + Vector B
[1 2 3 4] + [2 3 4 5] = [3 5 7 9]
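The saving can be expressed as an instruction count. Python does not issue SIMD instructions itself, so the sketch below only models the idea: a scalar add handles one pair of values per instruction, while a SIMD add handles a whole vector of them. The four-lane width is taken from the slide's example and the function names are hypothetical.

```python
LANES = 4  # assumption: four values per vector, as in the slide's example


def scalar_add(a, b):
    """Add element by element: one add instruction per pair of values."""
    result, instructions = [], 0
    for x, y in zip(a, b):
        result.append(x + y)
        instructions += 1
    return result, instructions


def simd_add(a, b, lanes=LANES):
    """Model a SIMD add: one instruction processes up to `lanes` pairs at once."""
    result, instructions = [], 0
    for start in range(0, len(a), lanes):
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1
    return result, instructions


vector_a = [1, 2, 3, 4]
vector_b = [2, 3, 4, 5]

scalar_result, scalar_instructions = scalar_add(vector_a, vector_b)
simd_result, simd_instructions = simd_add(vector_a, vector_b)

print(scalar_result, scalar_instructions)  # [3, 5, 7, 9] 4
print(simd_result, simd_instructions)      # [3, 5, 7, 9] 1
```

The results are identical; only the number of add instructions differs, four in the scalar case against one in the modelled SIMD case.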
Takeaways
• Column store indexes are only half the story; it is the combination of column store indexes and batch mode that makes the real difference to performance.
• Pre-sort data where applicable and possible to encourage segment elimination.
• Pre-sort data on the fact table key column subject to the heaviest hash join / aggregate activity.
• Column store indexes and batch mode are fast, but not scalable.
• Many other vendors leverage SIMD; Microsoft has yet to do so, and doing so could result in another step change in performance.
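Why pre-sorting encourages segment elimination can be shown with a small model. The sketch assumes each segment records the minimum and maximum value it holds, so a range query can skip any segment whose range cannot match. The segment size, helper names and data are hypothetical and chosen only for illustration.

```python
import random


def build_segments(values, rows_per_segment):
    """Split a column into segments and record min/max metadata for each."""
    segments = []
    for start in range(0, len(values), rows_per_segment):
        rows = values[start:start + rows_per_segment]
        segments.append({"min": min(rows), "max": max(rows), "rows": rows})
    return segments


def range_scan(segments, low, high):
    """Return (segments read, matching values), skipping segments that cannot match."""
    segments_read, matches = 0, []
    for segment in segments:
        if segment["max"] < low or segment["min"] > high:
            continue  # eliminated using metadata alone
        segments_read += 1
        matches.extend(v for v in segment["rows"] if low <= v <= high)
    return segments_read, matches


random.seed(42)
sorted_values = list(range(1000))
shuffled_values = sorted_values[:]
random.shuffle(shuffled_values)

sorted_segments = build_segments(sorted_values, rows_per_segment=100)
unsorted_segments = build_segments(shuffled_values, rows_per_segment=100)

sorted_read, sorted_matches = range_scan(sorted_segments, 250, 260)
unsorted_read, unsorted_matches = range_scan(unsorted_segments, 250, 260)

print(f"sorted: read {sorted_read} of {len(sorted_segments)} segments")
print(f"unsorted: read {unsorted_read} of {len(unsorted_segments)} segments")
```

On the sorted column only the segment covering 200 to 299 has to be read. On the shuffled column almost every segment spans nearly the whole value range, so few if any can be skipped.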
Questions ?
ChrisAdkin8 
chris1adkin@yahoo.co.uk 
http://uk.linkedin.com/in/wollatondba
taiwanesechetan
 
chapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.pptchapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.ppt
justinebandajbn
 
Minions Want to eat presentacion muy linda
Minions Want to eat presentacion muy lindaMinions Want to eat presentacion muy linda
Minions Want to eat presentacion muy linda
CarlaAndradesSoler1
 
Calories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptxCalories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptx
TijiLMAHESHWARI
 
Molecular methods diagnostic and monitoring of infection - Repaired.pptx
Molecular methods diagnostic and monitoring of infection  -  Repaired.pptxMolecular methods diagnostic and monitoring of infection  -  Repaired.pptx
Molecular methods diagnostic and monitoring of infection - Repaired.pptx
7tzn7x5kky
 
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Abodahab
 
Secure_File_Storage_Hybrid_Cryptography.pptx..
Secure_File_Storage_Hybrid_Cryptography.pptx..Secure_File_Storage_Hybrid_Cryptography.pptx..
Secure_File_Storage_Hybrid_Cryptography.pptx..
yuvarajreddy2002
 
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
James Francis Paradigm Asset Management
 
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
GenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.aiGenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.ai
Inspirient
 
chapter 4 Variability statistical research .pptx
chapter 4 Variability statistical research .pptxchapter 4 Variability statistical research .pptx
chapter 4 Variability statistical research .pptx
justinebandajbn
 
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnTemplate_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
cegiver630
 
Medical Dataset including visualizations
Medical Dataset including visualizationsMedical Dataset including visualizations
Medical Dataset including visualizations
vishrut8750588758
 
How iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost FundsHow iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost Funds
ireneschmid345
 
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbbEDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
JessaMaeEvangelista2
 
Geometry maths presentation for begginers
Geometry maths presentation for begginersGeometry maths presentation for begginers
Geometry maths presentation for begginers
zrjacob283
 
Cleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdfCleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdf
alcinialbob1234
 
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.pptJust-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
ssuser5f8f49
 
Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
chapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.pptchapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.ppt
justinebandajbn
 

An introduction to column store indexes and batch mode

  • 1. + CPU DBA Level 300
  • 2. About me
      • An independent SQL consultant.
      • A user of SQL Server from version 2000 onwards, with 12+ years' experience.
      • I have a passion for understanding how the database engine works at a deep level.
  • 3. A Brief History Of Column Store Technology
      • The lineage of column store databases can be traced back to the MonetDB and VectorWise projects from Holland, developed around the turn of the millennium.
      • Storage is column oriented.
      • Column store technology aims to exploit modern CPU architectures.
      • Virtually all database vendors now have a column store database offering.
      • Many people predict a future in which all OLAP workloads will be serviced by column-oriented databases.
  • 4. Column Store Compression Schemes
      • Example column (Colour): Red, Red, Blue, Blue, Green, Green, Green
      • Dictionary (lookup ID: label): 1: Red, 2: Blue, 3: Green
      • Segment (lookup ID, run length): (1, 2), (2, 2), (3, 3)
      • Data is compressed going down the column using run length compression.
      • Global and local dictionaries are used to store compression metadata.
  • 5. Column Store Index ‘Anatomy’ (diagram labels: column store segments, global dictionary, local dictionary, deletion bitmap)
  • 6. What Levels Of Compression Can Be Achieved?
      • Chart (axis 0 to 350) comparing: heap, row compression, page compression, clustered column store index, clustered column store index with archive compression.
      • Data labels, in category order after the heap baseline: 53 %, 59 %, 64 %, 72 %.
      • Data: Posts tables from the four largest Stack Exchanges combined (superuser, serverfault, maths and Ubuntu).
  • 7. Demonstration 1: The Difference Batch Mode Makes Test Data Creation
  • 8. Demonstration 1: The Difference Batch Mode Makes Test Queries
  • 9. How Queries Are Executed (How Plans Run)
      • Diagram labels: row by row (between each pair of iterators), control flow, data flow.
      • Question posed: how do rows travel between iterators?
  • 10. Modern CPU Architecture
      • System-on-chip (SoC) design with CPU cores as the basic building block.
      • Utility services are provisioned by the ‘un-core’ part of the CPU die.
      • Four-level cache hierarchy.
      • Diagram labels: per core, an L0 UOP cache, a 32KB L1 instruction cache, a 32KB L1 data cache and a 256K unified L2 cache; L3 cache; bi-directional ring bus; other labels: power and clock, un-core, QPI, memory controller, memory bus, TLB, IO.
  • 11. Memory Is The “New Disk”
      • Chart: access latency in clock cycles (axis 0 to 200) for main memory and for full random, in-page random and sequential access to the L3, L2 and L1 caches. Data labels as extracted: 4, 4, 4, 18, 14, 11, 11, 11, 38, 167.
      • Batch mode is about working in the 4 ~ 38 clock cycle range and NOT the 167 cycle “CPU stall” range.
  • 12. How Can A Column Store Index Fit Inside The CPU Cache? (diagram labels: CPU, column store object pool, segment, batches)
  • 13. The Column Store Object Pool
  • 14. Batch Mode Pre-Requisites (requirement: SQL Server 2012 / SQL Server 2014)
      • Presence of column store indexes: Yes / Yes
      • Parallel execution plan: Yes / Yes
      • No outer joins, NOT INs or UNION ALLs: Yes / No
      • Hash joins do not spill from memory: Yes / No
      • Scalar aggregates cannot be used: Yes / No
  • 15. Features by version (feature: SQL Server 2012 / SQL Server 2014)
      • Column store indexes: Yes / Yes
      • Clustered column store indexes: No / Yes
      • Updateable column store indexes: No / Yes
      • Column store archive compression: No / Yes
      • Columns in a column store index can be dropped: No / Yes
      • Support for GUID, binary, datetimeoffset precision > 2, numeric precision > 18: No / Yes
      • Enhanced compression by storing short strings natively (instead of 32 bit IDs): No / Yes
      • Bookmark support (row_group_id:tuple_id): No / Yes
      • Mixed row / batch mode execution: No / Yes
      • Optimized hash build and join in a single iterator: No / Yes
      • Hash memory spills cause row mode execution: No / Yes
      • Iterators supported: Scan, filter, project, hash (inner) join and (local) hash aggregate / Yes
  • 16. How Column Store Index Updates Are Handled (diagram labels: row groups; columns A, B, C; < 1,048,576 rows; delta stores; tuple mover; encode and compress; segments; store; blobs)
  • 17. Demonstration 2: Delta Stores In Action
  • 18. Demonstration 3: Pre-sorting and Segment Elimination Test Data Creation
  • 19. Demonstration 3: Pre-sorting and Segment Elimination Test Queries
  • 20. Demonstration 4: Pre-sorting and Hash Aggregate Performance
  • 21. Test Setup
      • CPU: 6 core 2.0 GHz (Sandy Bridge).
      • 48 GB quad channel 1333 MHz DDR3 memory.
      • Hyper-threading enabled, unless specified otherwise.
      • Warm large object cache used in all tests to remove storage as a factor.
  • 22. Atypical Data Warehouse Query On Extra Large Non Sorted Data: 1,095,500,000 rows, 1,798 MB in size
  • 23. Atypical Data Warehouse Query On Extra Large Pre-Sorted Data: 1,095,500,000 rows, 8,555 MB in size
  • 24. Elapsed Time (ms) / Degree of Parallelism (chart: time in ms, axis 0 to 80,000, against degree of parallelism from 2 to 24; series: non-sorted column store, sorted column store)
  • 25. Lowering Clock Cycles Per Instruction By Leveraging SIMD
      • Scalar instruction (C = A + B): 1 + 2 = 3
      • SIMD instruction (Vector C = Vector A + Vector B): (1, 2, 3, 4) + (2, 3, 4, 5) = (3, 5, 7, 9)
  • 26. Takeaways
      • Column store indexes are only half the story; it is column store indexes together with batch mode that make the real difference to performance.
      • Pre-sort data where applicable and possible to encourage segment elimination.
      • Pre-sort data on the fact table key column subject to the heaviest hash join / aggregate activity.
      • Column store indexes and batch mode are fast, but not scalable.
      • Many other vendors leverage SIMD; Microsoft are yet to do this, and it could result in another step change in performance.
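
The dictionary and run length example on slide 4 can be reproduced with a short sketch. This is an illustration of the general technique only, not SQL Server's actual encoder; the function names (`dictionary_encode`, `run_length_encode`, `decode`) are hypothetical and chosen for this example.

```python
from itertools import groupby


def dictionary_encode(values):
    """Assign each distinct value a lookup ID in order of first appearance."""
    ids = {}
    for value in values:
        if value not in ids:
            ids[value] = len(ids) + 1  # IDs start at 1, as on slide 4
    return ids


def run_length_encode(values, ids):
    """Collapse consecutive repeats into (lookup ID, run length) pairs."""
    return [(ids[value], len(list(run))) for value, run in groupby(values)]


def decode(segment, ids):
    """Expand (lookup ID, run length) pairs back into the original column."""
    labels = {lookup_id: label for label, lookup_id in ids.items()}
    return [labels[lookup_id] for lookup_id, length in segment for _ in range(length)]


colours = ["Red", "Red", "Blue", "Blue", "Green", "Green", "Green"]
dictionary = dictionary_encode(colours)
segment = run_length_encode(colours, dictionary)

print(dictionary)  # {'Red': 1, 'Blue': 2, 'Green': 3}
print(segment)     # [(1, 2), (2, 2), (3, 3)]
```

Seven stored values shrink to three pairs here because equal values sit next to each other; the same values in interleaved order would produce seven runs of length one, which is why long runs of repeated values compress best.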

Editor's Notes

  • #5: Run length compression is a generic term that pre-dates the column store functionality in SQL Server; it refers to compressing data by converting sequences of repeated values into encoded “run lengths”. The database engine scans down a column and stores each unique value it encounters in a dictionary. This dictionary can be local to a segment (the basic column store unit of storage, containing roughly 1 million rows) and / or global to the column store. Where sequences of values are found, these are stored as encoded run lengths. In the example above, the sequence of two red values is stored as 1, 2 and so on.
  • #10: Each operator has open(), next() and close() methods. An operator pulls data through the plan by calling the next() method on the next operator down. The ‘root’ operator drives the control flow for the whole plan. Data is moved row by row throughout the entire plan, which is inefficient in terms of CPU instructions per row and prone to expensive CPU cache misses.
  • #13: According to the Microsoft research paper on improvements made to column store indexes and batch mode in SQL Server 2014, the large object cache is new and stores blob data contiguously in memory without any page breaks. The reason for this is that sequential page access for a CPU cache gives twice the throughput of single page access. Refer to slide 73 of “Super Scaling SQL Server Diagnosing and Fixing Hard Problems” by Thomas Kejser.
  • #15: Not much difference; why? Column stores do not store data in sorted order; however, the encoding and compression process can reorder data to help achieve better levels of compression.
  • #17: Delta stores are new to SQL Server 2014 and provide the means by which rows can be inserted into existing column stores. SQL Server 2014 also introduces column store archive compression. Writes to blobs, which row groups are stored as, are sequential in nature; for trickle inserts, a delta store (b-tree) acts as a buffer to mitigate this. An update takes place by setting the deletion bitmap for the column store and inserting a new row into the column store via a delta store.
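
The open() / next() / close() protocol described in note #10 can be sketched as follows. This is a minimal model under stated assumptions, not the SQL Server implementation: the `Scan`, `Filter` and `run` names are hypothetical, and the batch size of 1000 is an arbitrary value picked for illustration. The only thing it shows is how many next() calls the leaf operator receives when rows travel one at a time compared with in batches.

```python
class Scan:
    """Leaf operator: hands out stored rows, `batch_size` at a time."""

    def __init__(self, rows, batch_size=1):
        self.rows, self.batch_size = rows, batch_size
        self.calls = 0  # number of next() calls received

    def open(self):
        self.pos = 0

    def next(self):
        self.calls += 1
        batch = self.rows[self.pos:self.pos + self.batch_size]
        self.pos += len(batch)
        return batch or None  # None signals end of data

    def close(self):
        pass


class Filter:
    """Pulls from its child and keeps rows that satisfy `predicate`."""

    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate

    def open(self):
        self.child.open()

    def next(self):
        while (batch := self.child.next()) is not None:
            kept = [row for row in batch if self.predicate(row)]
            if kept:
                return kept
        return None

    def close(self):
        self.child.close()


def run(root):
    """Play the part of the root operator: drive control flow until drained."""
    root.open()
    out = []
    while (batch := root.next()) is not None:
        out.extend(batch)
    root.close()
    return out


def is_even(row):
    return row % 2 == 0


rows = list(range(10_000))
row_scan = Scan(rows, batch_size=1)       # row mode: one row per call
batch_scan = Scan(rows, batch_size=1000)  # batch mode: assumed batch size

row_result = run(Filter(row_scan, is_even))
batch_result = run(Filter(batch_scan, is_even))

print(row_scan.calls, batch_scan.calls)  # 10001 11
```

Both plans return the same rows; the batched plan simply pays the per-call overhead far fewer times, which is the effect the note attributes to moving away from row-by-row execution.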