NewSQL Database Overview

민형기 (S-Core)
hg.min@samsung.com
2013. 2. 22.

Contents
I. Why NewSQL?
II. NewSQL Basic Concepts
III. Types of NewSQL
IV. NewSQL Summary

Thinking – Extreme Data

QCon London 2012

Thinking - Traffic Explosion

Source: Netflix in the Cloud (http://www.slideshare.net/adrianco/netflix-in-the-cloud-2011)

Organizations need deeper insights

QCon London 2012

Solutions

□Buy high-end technology
□Hire more developers
□Use NoSQL
□Use NewSQL

Solution – Buy High End Technology

Oracle, IBM

Solution – Hire More Developers

□Application Level Sharding
□Build your replication middleware
□…

http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_4_series/madone_4_5

Solutions – Use NoSQL
□New non-relational databases
□Distributed architecture
□Horizontal scalability
□No fixed table schema
□Typically no JOIN, UPDATE, or DELETE operations
□Typically no transactions
□Typically no SQL support

NoSQL Ecosystems

451 Group

MongoDB
□Document-oriented database
 JSON-style documents: Lists, Maps, primitives
 Schema-less
□Transaction = update of a single document
□Rich query language for dynamic queries
□Tunable writes: speed vs. reliability
□Highly scalable and available

MongoDB Use Cases
□Use cases
 High-volume writes
 Complex data
 Semi-structured data

□Major customers
 Foursquare
 Bit.ly, Intuit
 SourceForge, NY Times
 GILT Groupe, Evite
 SugarCRM

Apache Cassandra
□Column-oriented database/Extensible row store
 Think Row ~= java.util.SortedMap
□Transaction = update of a row
□Fast writes = append to a log
□Tunable reads/writes: consistency / availability
□Extremely scalable
 Transparent and dynamic clustering
 Rack and datacenter aware data replication
□CQL = “SQL”-like DDL and DML
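
The "Row ~= java.util.SortedMap" analogy above can be sketched in plain Java. This is only an illustration built on the JDK's TreeMap, not Cassandra's storage engine or client API; the class name RowStoreSketch, the row key and the column names are made up for the example.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustration only: models an extensible row store as
// row key -> (column name -> value), with columns kept sorted by name.
public class RowStoreSketch {
    // Hypothetical in-memory "column family".
    private final Map<String, SortedMap<String, String>> rows = new TreeMap<>();

    // "Transaction = update of a row": all columns of one row change together.
    public synchronized void putRow(String rowKey, Map<String, String> columns) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).putAll(columns);
    }

    // Column slice: columns whose names fall in [from, to), in sorted order.
    public synchronized SortedMap<String, String> slice(String rowKey, String from, String to) {
        SortedMap<String, String> row = rows.getOrDefault(rowKey, new TreeMap<>());
        return new TreeMap<>(row.subMap(from, to));
    }

    public static void main(String[] args) {
        RowStoreSketch store = new RowStoreSketch();
        store.putRow("user:42", Map.of("email", "a@example.com", "name", "Kim", "zip", "12345"));
        store.putRow("user:42", Map.of("city", "Seoul")); // rows are extensible: no fixed schema
        System.out.println(store.slice("user:42", "a", "n"));
    }
}
```

Because the columns of a row stay sorted, a range of columns can be read with one slice call, which is roughly what the SortedMap comparison is getting at.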

Apache Cassandra Use Cases
□Use cases
 Big data
 Multiple Data Center distributed database
 Persistent cache
 (Write intensive) Logging
 High-availability (writes)

□Major customers
 Digg, Facebook, Twitter, Reddit, Rackspace
 Cloudkick, Cisco, SimpleGeo, Ooyala, OpenX
 “The largest production cluster has over 100 TB of data in over 150 machines.” – Cassandra web site

Solutions – Use NewSQL
□New relational databases

□Retain SQL and ACID transactions
□New, improved distributed architectures
□Support high scalability and performance

□NewSQL vendors: ScaleDB, NimbusDB, ..., VoltDB

http://www.cs.brown.edu/courses/cs227/slides/newsql/newsql-intro.pdf

NewSQL Definition – Wikipedia

NewSQL is a class of modern relational
database management systems that seek
to provide the same scalable performance
of NoSQL systems for OLTP workloads while
still maintaining the ACID guarantees of a
traditional single-node database system

http://en.wikipedia.org/wiki/NewSQL

NewSQL Definition – 451 Group

A DBMS that delivers the scalability and
flexibility promised by NoSQL while retaining
the support for SQL queries and/or ACID, or
to improve performance for appropriate
workloads.


NewSQL Definition – Stonebraker

 SQL as the primary interface.

 ACID support for transactions

 Non-locking concurrency control.

 High per-node performance.

 Parallel, shared-nothing architecture.


NewSQL Category
 New Database
 New MySQL Storage Engines
 Transparent Clustering

The evolving database landscape

OSBC

NewSQL Ecosystem

New Database
□ Newly designed from scratch to achieve
scalability and performance
 A key consideration in improving performance is making memory (non-disk) or new kinds of disks (flash/SSD) the primary data store.
 Some (hopefully minor) code changes will be required, and data migration is still needed.

□Solutions
 Software-Only: VoltDB, NuoDB, Drizzle, Google Spanner
 Supported as an appliance: Clustrix, Translattice.

http://www.linuxforu.com/2012/01/newsql-handle-big-data/

New MySQL Storage Engines
□Highly optimized storage engines for MySQL
□Scale better than built-in engines, such as
InnoDB.
 Upside: applications keep using the MySQL interface
 Downside: data from other databases must be migrated

□Solutions
 TokuDB, MemSQL, Xeround, Akiban, NDB


Transparent Clustering
□Retain the OLTP databases in their original
format, but provide a pluggable feature
 Cluster transparently
 Ensure Scalability
□Avoid rewriting code or performing any data migration
□Solutions
 Cluster transparently: Schooner MySQL, Continuent
Tungsten, ScalArc
 Ensure Scalability: ScaleBase, dbShards


NewSQL Products
 VoltDB
 Google Spanner

VoltDB
□ VoltDB, 2010, GPL/VoltDB Proprietary License, Java/C++
□ Type: NewSQL, New Database
□ Main Point: In-memory Database, Java Stored Procedure, VoltDB
implements the design of the academic H-Store project
□ Protocol: SQL
□ Transaction: Yes
□ Data Storage: Memory
□ Features
□ in-memory relational database
□ Durability through replication, snapshots, logging
□ Transparent partitioning
□ ACID-level consistency
□ Synchronous multi-master replication
□ Database Replication

http://voltdb.com/products-services/products
http://www.slideshare.net/chris.e.richardson/polygot-persistenceforjavadevs-jfokus2012reorgpptx

VoltDB- Technical Overview
 “OLTP Through the Looking Glass”
http://cs-www.cs.yale.edu/homes/dna/papers/oltpperf-sigmod08.pdf

 VoltDB avoids the overhead of traditional databases
– K-safety for fault tolerance
• no logging
– In-memory operation for maximum throughput
• no buffer management
– Partitions operate autonomously and single-threaded
• no latching or locking
 Built to scale horizontally

VoltDB - Partitions (1/3)
 1 partition per physical CPU core
– Each physical server has multiple VoltDB partitions
 Data - Two types of tables
– Partitioned
• Single column serves as partitioning key
• Rows are spread across all VoltDB partitions by partition column
• Transactional data (high frequency of modification)
– Replicated
• All rows exist within all VoltDB partitions
• Relatively static data (low frequency of modification)
 Code - Two types of work – both ACID
– Single-Partition
• All insert/update/delete operations within single partition
• Majority of transactional workload
– Multi-Partition
• CRUD against partitioned tables across multiple partitions
• Insert/update/delete on replicated tables

 Single-partition vs. Multi-partition

select count(*) from orders where customer_id = 5
→ single-partition

select count(*) from orders where product_id = 3
→ multi-partition

insert into orders (customer_id, order_id, product_id) values (3,303,2)
→ single-partition

update products set product_name = 'spork' where product_id = 3
→ multi-partition

table orders (partitioned): customer_id (partition key), order_id, product_id
Partition 1    Partition 2    Partition 3
1 101 2        2 201 1        3 201 1
1 101 3        5 501 3        6 601 1
4 401 2        5 502 2        6 601 2

table products (replicated): product_id, product_name
Partition 1    Partition 2    Partition 3
1 knife        1 knife        1 knife
2 spoon        2 spoon        2 spoon
3 fork         3 fork         3 fork
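
The single-partition vs. multi-partition distinction above can be sketched as a small routing function. This is a plain-Java illustration, not VoltDB's planner or API. The class and method names are hypothetical, and the key-to-partition mapping is an assumption chosen only to reproduce the figure's layout (customers 1 and 4 in partition 1, 2 and 5 in partition 2, 3 and 6 in partition 3); the slides do not describe VoltDB's actual hashing.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: decides which partitions a statement must visit.
public class PartitionRoutingSketch {
    public static final int PARTITIONS = 3;

    // Assumed mapping that reproduces the figure; not VoltDB's real hash function.
    public static int partitionFor(int customerId) {
        return Math.floorMod(customerId - 1, PARTITIONS) + 1;
    }

    // Statements on the partitioned orders table: one partition when the
    // partition key (customer_id) is fixed, every partition otherwise.
    public static List<Integer> partitionsForOrders(Integer customerId) {
        List<Integer> targets = new ArrayList<>();
        if (customerId != null) {
            targets.add(partitionFor(customerId));
        } else {
            for (int p = 1; p <= PARTITIONS; p++) {
                targets.add(p);
            }
        }
        return targets;
    }

    // Writes to a replicated table (products) must reach every copy.
    public static List<Integer> partitionsForReplicatedWrite() {
        return partitionsForOrders(null);
    }

    public static void main(String[] args) {
        System.out.println(partitionsForOrders(5));         // ... where customer_id = 5
        System.out.println(partitionsForOrders(null));      // ... where product_id = 3
        System.out.println(partitionsForOrders(3));         // insert ... values (3,303,2)
        System.out.println(partitionsForReplicatedWrite()); // update products ...
    }
}
```

A statement that resolves to one partition can run without coordinating with the others, which is why the slides call single-partition work the majority of a well-designed transactional workload.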

 Looking inside a VoltDB partition…
– Each partition contains data and an execution engine.
– The execution engine contains a work queue for transaction requests.
– Requests are executed sequentially (single threaded).
– Each partition's data (table data and index data) holds:
• A complete copy of all replicated tables
• A portion of the rows (about 1/partitions) of all partitioned tables
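
The "work queue plus single-threaded execution engine" design above can be sketched with the JDK's single-thread executor. This is a minimal illustration under that assumption, not VoltDB code: PartitionSketch, its map-based "table data" and the lambda "procedures" are all hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Illustration only: one partition = private data + a work queue drained
// by exactly one thread, so procedures need no locks or latches.
public class PartitionSketch {
    // Touched only by the engine thread.
    private final Map<String, Integer> tableData = new HashMap<>();
    // FIFO queue with a single worker thread.
    private final ExecutorService engine = Executors.newSingleThreadExecutor();

    // Enqueue a "stored procedure"; it runs to completion before the next one starts.
    public <T> Future<T> submit(Function<Map<String, Integer>, T> procedure) {
        Callable<T> task = () -> procedure.apply(tableData);
        return engine.submit(task);
    }

    public void shutdown() {
        engine.shutdown();
    }

    public static void main(String[] args) throws Exception {
        PartitionSketch partition = new PartitionSketch();
        for (int i = 0; i < 1000; i++) {
            partition.submit(data -> data.merge("orders", 1, Integer::sum));
        }
        // Queued after the 1000 increments, so it observes all of them.
        int total = partition.submit(data -> data.get("orders")).get();
        System.out.println(total);
        partition.shutdown();
    }
}
```

Since requests never interleave inside a partition, the unsynchronized HashMap is safe here; the cost is that one slow procedure delays everything queued behind it.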

VoltDB - Compiling
 The database is constructed from
– The schema (DDL)
– The work load (Java stored procedures)
– The Project (users, groups, partitioning)
 VoltCompiler creates application catalog
– Copy to servers along with 1 .jar and 1 .so
– Start servers

Schema:
CREATE TABLE HELLOWORLD (
  HELLO CHAR(15),
  WORLD CHAR(15),
  DIALECT CHAR(15),
  PRIMARY KEY (DIALECT)
);

Stored Procedures (fragment, truncated in the slide):
import org.voltdb.*;
@ProcInfo(
  partitionInfo = "HELLOWORLD.DIA…
  singlePartition = true
)
public class Insert extends VoltPr…
  public final SQLStmt sql =
    new SQLStmt("INSERT INTO HELLO…
  public VoltTable[] run( String hel…

Project.xml (fragment, truncated in the slide):
<?xml version="1.0"?>
<project>
  <database name='data…
    <schema path='ddl.…
    <partition table='…
  </database>
</project>

VoltDB - Transactions

 All access to VoltDB is via Java stored procedures (Java + SQL)
 A single invocation of a stored procedure is a transaction (committed on success)
 Limits round trips between DBMS and application
 High performance client applications communicate asynchronously with VoltDB

VoltDB - Clusters/Durability
 Scalability
– Increase RAM in servers to add capacity
– Add servers to increase performance / capacity
– Each additional node consistently adds about 90% of single-node performance
 High availability
– K-safety for redundancy
 Snapshots
– Scheduled, continuous, on demand
 Spooling to data warehouse
 Disaster Recovery/WAN replication (Future)
– Asynchronous replication

Google Spanner
□ Google, 2012, Paper, C++
□ Type: NewSQL, New Database
□ Main Point: Google's scalable, multi-version, globally-distributed, and
synchronously-replicated database

□ Distributed multiversion database
 General-purpose transactions (ACID)
 SQL query language
 Schematized tables
 Semi-relational data model
□ Running in production
 Storage for Google’s ad data
 Replaced a sharded MySQL database

http://research.google.com/archive/spanner.html

Google Spanner Overview
□Feature: Lock-free distributed read transactions
□Property: External consistency of distributed transactions
 First system at global scale
□Implementation: Integration of concurrency control, replication, and 2PC
 Correctness and performance
□Enabling technology: TrueTime
 Interval-based global time

http://research.google.com/archive/spanner.html
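
To make "interval-based global time" concrete, the sketch below simulates a TrueTime-style clock and the commit-wait rule described in the Spanner paper (the slide itself does not spell that rule out). Everything here is an assumption for illustration: the class name, the 5 ms uncertainty bound, and the use of the local system clock in place of Google's GPS and atomic-clock infrastructure.

```java
// Illustration only: a simulated interval clock and the commit-wait rule.
public class TrueTimeSketch {
    // Assumed clock uncertainty for this example, in milliseconds.
    public static final long EPSILON_MS = 5;

    // An interval assumed to contain the true current time.
    public static final class Interval {
        public final long earliest;
        public final long latest;

        Interval(long earliest, long latest) {
            this.earliest = earliest;
            this.latest = latest;
        }
    }

    // Simulated clock reading: local time plus or minus the uncertainty bound.
    public static Interval now() {
        long t = System.currentTimeMillis();
        return new Interval(t - EPSILON_MS, t + EPSILON_MS);
    }

    // True once timestamp ts has definitely passed.
    public static boolean after(long ts) {
        return now().earliest > ts;
    }

    // Pick a commit timestamp, then wait out the uncertainty before
    // making the write visible.
    public static long commit() throws InterruptedException {
        long commitTs = now().latest;
        while (!after(commitTs)) {
            Thread.sleep(1);
        }
        return commitTs;
    }

    public static void main(String[] args) throws InterruptedException {
        long first = commit();
        long second = commit(); // starts only after the first commit has returned
        System.out.println(second > first);
    }
}
```

Waiting until the chosen timestamp is definitely in the past is what lets a transaction that starts after another has committed receive a larger timestamp, which is the external-consistency property the slide names. The wait costs roughly twice the uncertainty bound per commit in this simulation.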

Design Goals for Spanner

http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf

MySQL Cluster – NDB Architecture

http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-overview.html

Schooner MySQL Active Cluster

http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-overview.html

dbShards Architecture


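Three Trends in the Database Industry
□ NoSQL databases:
 Designed to meet the scalability requirements of distributed architectures and to fit schema-less data management requirements.
□ NewSQL databases:
 Designed to meet the scalability requirements of distributed architectures, or to improve performance where horizontal scaling is not required.
□ Data grid/cache products:
 Designed to store data in memory to increase application and database performance.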

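Conclusion
□ Many solutions exist for data storage
□ Drop the assumption that Oracle and MySQL are the only options
□ First identify the system's data characteristics and requirements (CAP, ACID/BASE)
□ Apply multiple solutions within one system
 Small-scale data with complex relationships: RDBMS
 Large-scale data processed in real time: NoSQL, NewSQL
 Large-scale data kept for storage: Hadoop, etc.
□ Choose the appropriate solution
□ Adopt only after verifying the issues that can arise in operation
□ Most NewSQL solutions are still in beta (a hasty choice can backfire)
□ Verification down to the solution's source-code level is needed
□ Ensure the stability of the NewSQL solution
□ The solution's own stability must be verified; it does not yet match the stability of current DBMSs
□ Apply only after securing a reliable way to store the data
□ Developers with operations and development experience are hard to find
□ Select a NewSQL product that fits the requirements
□ Pilot it first rather than applying it to critical systems from the start
□ Verify the selected solution and build in-house expertise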

Thank you.

Early – 2000s

□All the big players were heavyweight
and expensive.
 Oracle, DB2, Sybase, SQL Server, etc.

□Open-source databases were missing
important features.
 Postgres, mSQL, and MySQL.


Early – 2000s : eBay Architecture

http://highscalability.com/ebay-architecture

Early – 2000s : eBay Architecture

 Push functionality to application:
 Joins
 Referential integrity
 Sorting done

 No distributed transactions

http://highscalability.com/ebay-architecture

Mid– 2000s

□MySQL + InnoDB is widely adopted by
new web companies:
 Supported transactions, replication,
recovery.
 Still must use custom middleware to scale
out across multiple machines.
 Memcache for caching queries.


Mid – 2000s : Facebook Architecture

http://www.techthebest.com/2011/11/29/technology-used-in-facebook/

Mid – 2000s : Facebook Architecture

 Scale out using custom middleware.
 Store ~75% of database in Memcache.
 No distributed transactions.

http://www.techthebest.com/2011/11/29/technology-used-in-facebook/

Late – 2000s

□MySQL + InnoDB is widely adopted by
new web companies:
 Supported transactions, replication,
recovery.
 Still must use custom middleware to scale
out across multiple machines.
 Memcache for caching queries.


Late – 2000s : MongoDB Architecture

http://sett.ociweb.com/sett/settAug2011.html

Late – 2000s : MongoDB Architecture

 Easy to use.
 Becoming more like a DBMS over time.
 No transactions.

http://sett.ociweb.com/sett/settAug2011.html

Early – 2010s

□New DBMSs that can scale across
multiple machines natively and provide
ACID guarantees.
 MySQL Middleware
 Brand New Architectures


Database SPRAIN
□“An injury to ligaments... caused by being
stretched beyond normal capacity”

□Six key drivers for NoSQL/NewSQL/DDG
adoption
 Scalability
 Performance
 Relaxed consistency
 Agility
 Intricacy
 Necessity

Database SPRAIN - Scalability
□Associated sub-driver: Hardware
economics
 Scale-out across clusters of commodity servers
□Example project/service/vendor
 BigTable, HBase, Riak, MongoDB, Couchbase, Hadoop
 Amazon RDS, Xeround, SQL Azure, NimbusDB
 Data grid/cache
□Associated use case:
 Large-scale distributed data storage
 Analysis of continuously updated data
 Multi-tenant PaaS data layer

Database SPRAIN - Scalability
□User: StumbleUpon
□Problem:
 Scaling problems with recommendation engine on
MySQL
□Solution: HBase
 Started using Apache HBase to provide real-time
analytics on Su.pr
 MySQL lacked the performance headroom and scale
 Multiple benefits including avoiding declaring schema
 Enables the data to be used for multiple applications
and use cases

Database SPRAIN - Performance
□Associated sub-driver: MySQL limitations
 Inability to perform consistently at scale
□Example project/service/vendor
 Hypertable, Couchbase, Membrain, MongoDB, Redis
 Data grid/cache
 VoltDB, Clustrix
□Associated use case:
 Real-time data processing of mixed read/write workloads
 Data caching
 Large-scale data ingestion

Database SPRAIN - Performance
□User: AOL Advertising
□Problem:
 Real-time data processing to support targeted
advertising
□Solution: Membase Server
 Segmentation analysis runs in CDH, results passed into
Membase
 Makes use of its sub-millisecond data delivery
 Leaves more time for analysis within a 40 ms targeting and response time
 Also real-time log and event management

Database SPRAIN – Relaxed Consistency
□Associated sub-driver: CAP theorem
 The need to relax consistency in order to maintain availability
□Example project/service/vendor:
 Dynamo, Voldemort, Cassandra
 Amazon SimpleDB
□Associated use case:
 Multi-data center replication
 Service availability
 Non-transactional data off-load

Database SPRAIN – Relaxed Consistency
□User: Wordnik
□Problem:
 MySQL was too consistent: it blocked access to data during inserts and created numerous temp files to stay consistent.
□Solution: MongoDB
 Single word definition contains multiple data items
from various sources
 MongoDB stores data as a complete document
 Reduced the complexity of data storage

Database SPRAIN – Agility
□ Associated sub-driver: Polyglot persistence
 Choose most appropriate storage technology for app in development
□Example project/service/vendor
 MongoDB, CouchDB, Cassandra
 Google App Engine, SimpleDB, SQL Azure
□Associated use case:
 Mobile/remote device synchronization
 Agile development
 Data caching

Database SPRAIN – Agility
□ User: Dimagi BHOMA (Better Health
Outcomes through Mentoring and
Assessments) project
□Problem:
 Deliver patient information to clinics despite a lack of
reliable Internet connections
□Solution: Apache CouchDB
 Replicates data from regional to national database when Internet connection and power are available
 Uploads patient data from cell phones to local clinic

Database SPRAIN – Intricacy
□ Associated sub-driver: Big data, total data
 Rising data volume, variety and velocity
□Example project/service/vendor
 Neo4j GraphDB, InfiniteGraph
 Apache Cassandra, Hadoop
 VoltDB, Clustrix
□Associated use case:
 Social networking applications
 Geo-locational applications
 Configuration management database

Database SPRAIN – Intricacy
□ User: Evident Software
□Problem:
 Mapping infrastructure dependencies for application
performance management
□Solution: Neo4j
 Apache Cassandra stores performance data
 Neo4j used to map the correlations between different
elements
 Enables users to follow relationships between
resources while investigating issues

Database SPRAIN – Necessity
□ Associated sub-driver: Open source
 The failure of existing suppliers to address the
performance, scalability and flexibility requirements of
large-scale data processing
□ Example project/service/vendor
 BigTable, Dynamo, MapReduce, Memcached
 Hadoop HBase, Hypertable, Cassandra, Membase
 Voldemort, Riak, BigCouch
 MongoDB, Redis, CouchDB, Neo4J
□Associated use case:
 All of the above

Database SPRAIN – Necessity
□BigTable: Google
□Dynamo: Amazon
□Cassandra: Facebook
□HBase: Powerset
□Voldemort: LinkedIn
□Hypertable: Zvents
□Neo4j: Windh Technologies
 Yahoo: Apache Hadoop and Apache HBase
 Digg: Apache Cassandra
 Twitter: Apache Cassandra, Apache Hadoop and FlockDB
