Lecture 05 - RDBMS


Cloud Computing
Lecture 5

Database as a Service

Dan Amiga
Agenda

• Relational Database
• Replication
• Sharding
• Database as a Service
– AWS RDS
– SQL Azure
CAP Theorem

• Consistency (atomic data objects)
– Any read operation that begins after a write
operation completes must return that value, or
the result of a later write operation.
• Availability (available data objects)
– Even when severe failures (network? storage?)
occur, every request must terminate, with minimal
latency
• Partition Tolerance
– No set of failures less than total network failure
is allowed to cause the system to respond
incorrectly.
Relational Database

• Tables + Normalization + Integrity
• Triggers
• Views
• Stored Procedures
• Indexes
• Joins
• ACID Transactions
Some well-known issues with RDBMS

• Scale Up (expensive, hard to do in production)
• Scale Out (…)
• Locking
• Fast CUD vs Fast R
• Multithreading
• Data Fragmentation
– SSD; Index Rebuild; Design
• Replication issues
Relational Database Indexes

• Officially B-Tree variations
• How to search through 1,000,000 sorted items,
considering seek/rotate is 10 ms?
• 0.2 seconds? (binary search needs about 20 compares at 10 ms each)
• Last 6 compares are in the same block
• Index
– We'll build a 10,000-item index (14 searches, last 6
are in the same block)
– Index the index: 100 items point to the 10,000 (3
searches)
• Index fragmentation on disk (as they grow)
– Streaming B-Trees
• Clustered Index
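
The arithmetic behind the 0.2-second estimate can be checked with a short sketch. It assumes a plain binary search in which every compare costs one 10 ms seek; the function name is illustrative and not part of the lecture.

```python
import math

SEEK_MS = 10  # seek/rotate cost per disk access, taken from the slide


def binary_search_compares(n_items: int) -> int:
    """Worst-case number of compares for binary search over n sorted items."""
    return math.ceil(math.log2(n_items))


# 1,000,000 sorted items: about 20 compares.
table_compares = binary_search_compares(1_000_000)

# If every compare triggered a seek, the lookup would take 20 * 10 ms.
naive_seconds = table_compares * SEEK_MS / 1000

# A 10,000-entry index needs about 14 compares.
index_compares = binary_search_compares(10_000)

print(table_compares, naive_seconds, index_compares)
```

The point of the index levels is that the smaller upper levels need fewer seeks, and the final compares fall inside one block that is read once.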
Transactions

• The execution of a transaction must maintain
the relationship between the business state
and the database state
• Hence ACID
– Atomicity is an all-or-none proposition.
– Consistency guarantees that a transaction never
leaves your database in a half-finished state.
– Isolation keeps transactions separated from each
other until they're finished.
– Durability guarantees that the database will keep
track of pending changes in such a way that the
server can recover from an abnormal termination.
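
A minimal sketch of atomicity, using Python's built-in sqlite3 module as a stand-in for any RDBMS. The `accounts` table, the `transfer` and `balances` helpers, and the CHECK constraint are assumptions made for illustration; they do not come from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint when src has too little money.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts"))


failed = transfer(conn, 1, 2, 500)   # debit fails, so the credit is undone too
after_failed = balances(conn)
ok = transfer(conn, 1, 2, 30)
after_ok = balances(conn)
```

The credit is deliberately issued before the debit so that a failed transfer has something to roll back.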
Concurrency

• Optimistic
– Assume multiple transactions can complete
without affecting each other
– No locking
– Each transaction verifies that no other transaction
modified its data, otherwise it rolls back
• Pessimistic
– Blocks the resource; affects performance
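
One common way to implement the optimistic check is a version column: an update succeeds only if the row still carries the version the transaction read. This is a sketch under that assumption, using sqlite3; the `items` table and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, qty INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO items VALUES (1, 10, 0)")
conn.commit()


def read_item(conn, item_id):
    """Return (qty, version) as seen by the caller."""
    return conn.execute(
        "SELECT qty, version FROM items WHERE id = ?", (item_id,)
    ).fetchone()


def optimistic_update(conn, item_id, new_qty, seen_version):
    """Write only if nobody changed the row since we read it; no locks held."""
    with conn:
        cur = conn.execute(
            "UPDATE items SET qty = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_qty, item_id, seen_version),
        )
    return cur.rowcount == 1  # False means a conflict: caller retries or aborts


# Two writers read the same version, then both try to write.
_, version_a = read_item(conn, 1)
_, version_b = read_item(conn, 1)
first = optimistic_update(conn, 1, 9, version_a)
second = optimistic_update(conn, 1, 8, version_b)  # stale version, rejected
final = read_item(conn, 1)
```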
Scaling RDBMS

• Issues with scaling up when the dataset is just too
big
• RDBMS were not designed to be distributed
• Began to look at multi-node database solutions
• Known as ‘scaling out’ or ‘horizontal scaling’
• Different approaches include:
– Master-slave
– Sharding
– Geo replication (usually not part of the deal)
Master – Slave Replication
• Replication can be synchronous or asynchronous
• All writes go to the master. All reads are
performed against the replicated slave databases
• Critical reads may be incorrect as writes may not
have been propagated down
• Large data sets can pose problems as master
needs to duplicate data to slaves
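
The stale-read problem can be illustrated with a toy in-process model of asynchronous replication. `ReplicatedStore` and its methods are hypothetical, and sending critical reads to the master is one common workaround rather than something the slides prescribe.

```python
import itertools


class ReplicatedStore:
    """Toy master-slave store: writes hit the master, reads hit the slaves."""

    def __init__(self, n_replicas):
        self.master = {}
        self.replicas = [{} for _ in range(n_replicas)]
        self._pending = []  # writes not yet shipped to the slaves
        self._next = itertools.cycle(range(n_replicas))  # round-robin reads

    def write(self, key, value):
        self.master[key] = value
        self._pending.append((key, value))

    def read(self, key, critical=False):
        if critical:
            # Assumption: critical reads bypass the slaves to avoid staleness.
            return self.master.get(key)
        return self.replicas[next(self._next)].get(key)

    def replicate(self):
        """Ship pending writes to every slave (the asynchronous step)."""
        for key, value in self._pending:
            for replica in self.replicas:
                replica[key] = value
        self._pending.clear()


store = ReplicatedStore(n_replicas=2)
store.write("x", 1)
stale = store.read("x")                    # slave has not seen the write yet
critical = store.read("x", critical=True)  # served by the master
store.replicate()
fresh = store.read("x")
```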
General Replication Model (MySQL)

[Figure: MySQL replication diagram; image and source not preserved]
Sharding

• Scales well for both reads and writes
• Not transparent, application needs to be partition-aware
• Can no longer have relationships/joins across partitions
• Loss of referential integrity across shards
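
"Partition-aware" means the application itself decides which shard holds a row. A minimal sketch using hash-based routing follows; the shard names and the `shard_for` and `group_by_shard` helpers are hypothetical, and hashing is only one possible partitioning scheme.

```python
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard identifiers


def shard_for(key: str) -> str:
    """Map a partition key to a shard with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def group_by_shard(keys):
    """Split a batch of keys into per-shard work lists."""
    groups = {}
    for key in keys:
        groups.setdefault(shard_for(key), []).append(key)
    return groups


keys = [f"customer-{i}" for i in range(10)]
groups = group_by_shard(keys)
```

A query touching keys in more than one group has to be issued per shard and merged in the application, which is why cross-partition joins and referential integrity are lost.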
Database on the Cloud options

• Install your own (IaaS)
– Azure VM Role
– EC2 machine
– Pros: Full symmetry with on-premises
– Cons: Everything we mentioned minus a small delta
• Database as a Service (PaaS approach)
– SQL Azure
– Amazon RDS (MySQL; Oracle Standard One,
Standard, Enterprise)
– Pros: Fully managed by the cloud provider,
including backup, replication, failover, availability,
security
Amazon RDS

• Capabilities similar to MSSQL, Oracle, MySQL
• Cost-efficient and resizable capacity
• Offloads time-consuming database tasks
• Automatic patches and backups
• Scale compute/storage via API call
• Provisioned IOPS – fast and consistent, for I/O-intensive
workloads (3 TB & 30,000 IOPS)
• Monitor via CloudWatch APIs
Amazon RDS

• Read Replicas
• Multi-Availability Zone
– Production excellence
– Synchronous replication (across AZ)
– Automatic failover, endpoint stays.
IOPS vs MBps

IOPS = (MBps Throughput / KB per IO) * 1024
Or
MBps = (IOPS * KB per IO) / 1024
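
The two formulas are inverses of each other, as a short sketch shows. The function names and the 16 KB I/O size in the example are assumptions for illustration; 30,000 is the Provisioned IOPS figure quoted earlier.

```python
def mbps_from_iops(iops: float, kb_per_io: float) -> float:
    """MBps = (IOPS * KB per IO) / 1024"""
    return iops * kb_per_io / 1024


def iops_from_mbps(mbps: float, kb_per_io: float) -> float:
    """IOPS = (MBps throughput / KB per IO) * 1024"""
    return mbps / kb_per_io * 1024


# 30,000 IOPS at an assumed 16 KB per I/O.
throughput = mbps_from_iops(30_000, 16)
round_trip = iops_from_mbps(throughput, 16)
print(throughput, round_trip)
```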
Amazon RDS
• http://aws.amazon.com/rds/
• Read Replicas
• Standard vs Multi-AZ deployments
• Bring-your-own-license vs license-included pricing
SQL Azure

• Database as a service
• Not a full-blown SQL Server
– e.g. no SQLCLR, no Service Broker
SQL Azure Database

• Familiar SQL Server relational database model delivered as a service
– Support for existing APIs & tools
– Built for the cloud with high availability & fault tolerance
– Easily provision and manage databases across multiple datacenters
• SQL Azure provides a logical server
– Gateway server that understands the TDS protocol
– Looks like SQL Server to a TDS client
– Actual data stored on multiple backend data nodes
• Logical optimizations supported
– Indexes, query plans, etc.
• Physical optimizations not supported
– File groups, partitions, etc.
• Transparently manages physical storage
Database Choices

[Figure: database choices plotted by resources (dedicated to shared) against "friction"/control (low to high). Options shown: on-premises, hosted, and SQL Azure (RDBMS), each with its value props, including roll-your-own HA/DR/scale.]
SQL Azure Application Topologies

[Figure: two topologies side by side.
Code Near: SQL Azure access from within the MS datacenter (Azure compute – ADO.NET). Labels: application/browser, SOAP/REST and ADO.NET Data Svcs/REST - EF over HTTP/S, app code (ASP.NET) on Windows Azure, T-SQL (TDS), SQL Data Services.
Code Far: SQL Azure access from outside the MS datacenter (on-premises – ADO.NET). Labels: app code/tools, T-SQL (TDS), SQL Data Services in the MS datacenter.]
Architecture
• Shared infrastructure at SQL database and below
– Request routing, security and isolation
• Scalable HA technology provides the glue
– Automatic replication and failover
• Provisioning, metering and billing infrastructure
[Figure: layered architecture. Top layer: SDS provisioning (databases, accounts, roles, …), metering, and billing. Middle layer: Machines 4, 5 and 6, each running a SQL instance whose SQL DB hosts user databases DB1 to DB4. Bottom layer: scalability and availability (fabric, failover, replication, and load balancing).]
SQL Azure Federation

• Provides scale-out support in SQL Azure
– Partition data and load across many servers
– Bring computational resources of many to bear
• Take advantage of elastic provisioning of
databases
• Pay-as-you-go benefits
• Zero physical administration
• Federation includes
– Robust connection management
– Online repartition operations
– Split & merge databases
SQL Azure Federations: Concepts

• Federation
– Represents the data being partitioned
• Federation Key
– The value that determines the routing of a piece of data
• Atomic Unit (AU)
– All rows with the same federation key value: always together!
• Federation Member (aka Shard)
– A physical container for a range of atomic units
• Federation Root
– The database that houses the federation directory

[Figure: Root holding federation "CustData" (federation key: CustID) with three members:
Member PK [min, 100): AUs PK=5, PK=25, PK=35
Member PK [100, 488): AUs PK=105, PK=235, PK=365
Member PK [488, max): AUs PK=555, PK=2545, PK=3565]
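
Routing by federation key can be sketched as a range lookup over the member boundaries from the figure (100 and 488). This is an illustration of range partitioning in general, not the SQL Azure Federations API; `member_for` and the labels are hypothetical.

```python
from bisect import bisect_right

# Split points between members: [min, 100), [100, 488), [488, max)
BOUNDARIES = [100, 488]
MEMBERS = ["[min, 100)", "[100, 488)", "[488, max)"]


def member_for(cust_id: int) -> str:
    """Return the federation member whose range contains the key."""
    return MEMBERS[bisect_right(BOUNDARIES, cust_id)]


routed = {key: member_for(key) for key in (5, 35, 100, 365, 488, 3565)}
```

Because all rows with the same key value form one atomic unit, they always map to the same member. A split operation amounts to inserting a new boundary and moving the affected range.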
Distribution of Data: Concepts

• Partitioned
– Spread across member machines
– Each piece is on one machine (+HA)
– Most of the data!
• Centralized
– Only available in one place
– Read and write, but not too much
• Reference
– Copied to all member machines
– Can be read anywhere (reference)
– Should not be written to often

[Figure: a centralized Config store; partitioned data Data1 to Data5, one piece per member machine; a "ref" copy of the reference data on every member.]
In Memory Database

• Full symmetry of the SQL model
– All in memory
• Example: H2 Database
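
The slide names H2 (a Java database) as its example. The sketch below substitutes SQLite's in-memory mode via Python's sqlite3 module to show the same idea: tables, indexes, views and joins all work while nothing touches disk. The schema is hypothetical.

```python
import sqlite3

# ":memory:" creates a database that lives only in RAM for this connection.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total INTEGER
    );
    CREATE INDEX idx_orders_customer ON orders(customer_id);
    CREATE VIEW customer_totals AS
        SELECT c.name AS name, SUM(o.total) AS total
        FROM customers c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id;
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 30), (2, 1, 20), (3, 2, 5);
    """
)
rows = db.execute(
    "SELECT name, total FROM customer_totals ORDER BY name"
).fetchall()
```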
Quick Architecture Exercise

• Build a website with DB…
• Requirements?
– HA (Cloud)
– Scale (Cloud)
– Fast (Multi level Cache)
– Stateless (Patterns)
– Database (DaaS)
