Azure Cosmos DB: Technical Deep Dive
<Speaker>
<date>
AGENDA
• Reviewing Azure Cosmos DB
• Partitioning
• Querying
• Programming
• Troubleshooting
AZURE COSMOS DB
A FULLY-MANAGED GLOBALLY DISTRIBUTED DATABASE SERVICE BUILT TO GUARANTEE
EXTREMELY LOW LATENCY AND MASSIVE SCALE FOR MODERN APPS
TURNKEY GLOBAL DISTRIBUTION
APIs: SQL, MongoDB, Table API
Data models: Column-family, Document, Key-value, Graph
COMPREHENSIVE SLAs
Azure Cosmos DB is the only service with financially-backed SLAs for millisecond
latency at the 99th percentile, 99.999% HA and guaranteed throughput and
consistency
TRUST YOUR DATA TO INDUSTRY-LEADING SECURITY & COMPLIANCE
• The 1st and only database with global distribution turnkey capability
• Deliver massive storage/throughput scalability database
• Provides guaranteed single digit millisecond latency at 99th percentile worldwide
• Natively supports different types of data at massive scale
• Boasts 5 well-defined consistency models to pick the right consistency/latency/throughput tradeoff
• Enables mission critical intelligent applications
• Gives high flexibility to optimize for speed and cost
• Tackles big data workloads with high availability and reliability
• Provides multi-tenancy and enterprise-grade security
• Naturally analytics-ready and perfect for event-driven architectures
POWERING GLOBAL SOLUTIONS
Azure Cosmos DB was built to support modern app patterns and use cases.
It enables industry-leading organizations to unlock the value of data, and respond to
global customers and changing business dynamics in real-time.
• Data distributed and available globally: Puts data where your users are
• Build real-time customer experiences: Enable latency-sensitive personalization, bidding, and fraud detection.
• Ideal for gaming, IoT & eCommerce: Predictable and fast service, even during traffic spikes
• Simplified development with serverless architecture: Fully-managed event-driven micro-services with elastic computing power
• Run Spark analytics over operational data: Accelerate insights from fast, global data
• Lift and shift NoSQL data: Lift and shift MongoDB and Cassandra workloads
DATA DISTRIBUTED AND AVAILABLE GLOBALLY
Put your data where your users are to give real-time access and uninterrupted service to customers anywhere in the world.
[Diagram: Azure region A and Azure region C each reach Azure Cosmos DB (app + session state) in <10 ms]
BUILD REAL-TIME CUSTOMER EXPERIENCES
Offer latency-sensitive applications with personalization, bidding, and fraud-detection.
[Diagram: Online Recommendations Service, hot path: Azure Service Fabric (Personalization Decision Engine) and Azure Cosmos DB (distributed model store)]
SIMPLIFIED DEVELOPMENT WITH SERVERLESS ARCHITECTURE
Experience decreased time-to-market, enhanced …
[Diagram: Logic apps, Azure Functions (E-commerce Checkout, API) and Azure Cosmos DB (Order event score)]
• Lift and shift MongoDB apps
• Run Spark over operational data
• Build real-time customer experiences
• Ideal for IoT, gaming and eCommerce
RESOURCE MODEL
[Diagram: resource hierarchy of databases and items]
ACCOUNT URI AND CREDENTIALS
Account ********.azure.com
IGeAvVUp …
[Diagram: Account > Database > Container > Item]
CREATING ACCOUNT
[Diagram: Account > Database > Container > Item]
DATABASE REPRESENTATIONS
[Diagram: Account > Database (Database) > Container (Collection) > Item (Document)]
CONTAINER REPRESENTATIONS
[Diagram: Account > Database > Container = Collection, Graph, Table > Item]
CREATING COLLECTIONS – SQL API
[Diagram: Account > Database > Container > Item]
CONTAINER-LEVEL RESOURCES
[Diagram: Account > Database > Container > Item, Sproc, Trigger, UDF, Conflict]
SYSTEM TOPOLOGY (BEHIND THE SCENES)
CONTAINERS
Logical resources “surfaced” to APIs as tables, collections or graphs, which are made up of one or more physical partitions or servers.
RESOURCE PARTITIONS
• Consistent, highly available, and resource-governed coordination primitives
• Consist of replica sets, with each replica hosting an instance of the database engine
[Diagram: Tenants > Containers > Resource Partitions > Replica Set (Leader, Follower, Forwarder)]
PARTITIONING
[Diagram: a Cosmos DB container (e.g. collection) places items on partitions by hash(User ID); sample users: Dharma, Andrew, Shireesh, Karthik, Alice, Carol, Rimma]
Partition ranges can be dynamically sub-divided to seamlessly grow the database as the application grows while simultaneously maintaining high availability.
KEY MOTIVATIONS
• Distribute Requests
• Distribute Storage
• Intelligently Route Queries for Efficiency
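The idea of routing by hash(User ID) can be sketched as follows. This is only an illustration: the FNV-1a hash, the modulo mapping and the function names are assumptions made for the example, not the service's internal partitioning scheme.

```javascript
// Illustrative only: FNV-1a stands in for the service's internal hash function.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep an unsigned 32-bit value
  }
  return hash;
}

// Map a partition key value to one of `partitionCount` hypothetical partitions.
function partitionFor(partitionKeyValue, partitionCount) {
  return fnv1a(String(partitionKeyValue)) % partitionCount;
}

const users = ["Dharma", "Andrew", "Shireesh", "Karthik", "Alice", "Carol"];
const placement = users.map((user) => ({ user, partition: partitionFor(user, 3) }));
console.log(placement);
```

The same key value always lands on the same partition, which is what lets a query that names the key be routed to a single partition.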
PARTITION DESIGN
EXAMPLE SCENARIO
Contoso Connected Car is a vehicle telematics company. They are planning to store vehicle telemetry data from millions of vehicles every second in Azure Cosmos DB to power predictive maintenance, fleet management, and driver risk analysis.
The partition key we select will be the scope for multi-record transactions.
• Vehicle model: Most auto manufacturers only have a couple dozen models. This will create a fixed number of logical partition key values, and is potentially the least granular option. Depending on how uniform sales are across the various models, this introduces possibilities for hot partition keys on both storage and throughput.
• Current month: Auto manufacturers have transactions occurring throughout the year. This will create a more balanced distribution of storage across partition key values. However, most business transactions occur on recent data, creating the possibility of a hot partition key for the current month on throughput.
• Device ID: Each car would have a unique device ID. This creates a large number of partition key values and would have a significant amount of granularity. Depending on how many transactions occur per vehicle, it is possible to have a specific partition key that reaches the storage limit per partition key.
• Composite (device ID and current month): This composite option increases the granularity of partition key values by combining the current month and a device ID. Specific partition key values have less of a risk of hitting storage limitations as they only relate to a single month of data for a specific vehicle. Throughput in this example would be distributed more to logical partition key values for the current month.
Partitions should be based on your most often occurring query and transactional needs. The goal is to maximize granularity and minimize cross-partition requests.
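A composite key of this kind could be computed in the application before each write. The sketch below is one possible shape; the helper name, the property names and the `deviceId-yyyy-mm` format are assumptions for illustration.

```javascript
// Hypothetical helper: builds "<deviceId>-<yyyy>-<mm>" from a telemetry reading.
function compositePartitionKey(deviceId, timestamp) {
  const date = new Date(timestamp);
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `${deviceId}-${year}-${month}`;
}

// Sample reading; the device ID and timestamp are made-up values.
const reading = { deviceId: "C49E27EB", timestamp: "2016-09-25T04:15:00Z" };
reading.partitionKey = compositePartitionKey(reading.deviceId, reading.timestamp);
console.log(reading.partitionKey); // C49E27EB-2016-09
```

Storing the computed value as its own property keeps the partition key path simple while the original fields stay queryable.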
PARTITION GRANULARITY
Example – Contoso Connected Car
[Diagram: partition key RODEL subdivided into RODEL_2017 and RODEL_2018, then into RODEL_2017_L, RODEL_2017_E and RODEL_2018_L]
Don’t be afraid to have more partitions! More partition keys = more scalability
SELECT * FROM C
Example – Contoso Connected Car
[Figure: Airport partition key values of increasing granularity]
C49E27EB-2016
C49E27EB-2016-09
C49E27EB-2016-09-25
C49E27EB-2016-09-25-04
C49E27EB-2016-09-25-04-15
[Rotated caption, partially recovered: a higher-cardinality key lets Azure Cosmos DB distribute your growing data evenly, but may also impact query ease]
PARTITIONS
General Tips
• Build a POC to strengthen your understanding of the workload and iterate (avoid analysis paralysis)
• Don’t be afraid of having too many partition keys
• Partition keys are logical
• More partition keys = more scalability
PARTITION KEY STORAGE LIMITS
[Diagram: requests against a partition key at its storage limit return HTTP 403]
DESIGN PATTERNS FOR LARGE PARTITION KEYS
[Diagram: HTTP 403 on a large partition key; labels ARD and 2 to 15]
HOT/COLD PARTITIONS
Partitions that are approaching thresholds are referred to as hot. Partitions that are underutilized are referred to as cold.
[Diagram: one hot partition among several cold partitions]
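QUERY FAN-OUT
// Node.js SDK: opt in to cross-partition queries
var querySpec = {
    query: 'SELECT * FROM container c'
};
var feedOptions = {
    enableCrossPartitionQuery: true,
    maxDegreeOfParallelism: 10
};
// Java SDK: the same opt-in through FeedOptions
FeedOptions options = new FeedOptions();
options.setEnableCrossPartitionQuery(true);
Iterator<Document> it = client.queryDocuments(
    collectionLink,
    "SELECT * from r",
    options
).getQueryIterator();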
DEMO
Cross-Partition Query
QUERY FAN-OUT
If you have relevant data to return, creating a cross-partition query is a perfectly acceptable workload with a predictable throughput.
You are charged ~1 RU for each partition that doesn’t have any relevant data.
Multiple fan-out queries can quickly max out RU/s for each partition.
QUERY FAN-OUT
Example – Contoso Connected Car
PK = origin
SELECT * FROM car a WHERE a.year = "2015"
SELECT * FROM car a WHERE a.year = "2016"
>10,000 more queries per second
[Diagram: each query fans out to every partition; each partition shows 10,000 queries per second checking this partition]
QUERY FAN-OUT
PK = origin
SELECT * FROM car a
WHERE a.model = "TURLIC" AND a.year = "2015"
WHERE a.model = "COASH" AND a.year = "2016"
>10,000 more queries per second
[Diagram: each partition shows only relevant queries checking this partition]
MULTIPLE THINGS CAN IMPACT THE PERFORMANCE OF A QUERY RUNNING IN AZURE COSMOS DB. A FEW IMPORTANT
QUERY PERFORMANCE FACTORS INCLUDE:
Provisioned throughput
Measure RU per query, and ensure that you have the required provisioned throughput for your queries
Favor queries with the partition key value in the filter clause for low latency
Follow SDK best practices like direct connectivity, and tune client-side query execution options
QUERY TUNING
MANY THINGS CAN IMPACT THE PERFORMANCE OF A QUERY RUNNING IN AZURE COSMOS DB. IMPORTANT
PERFORMANCE FACTORS INCLUDE:
Network latency
Account for network overhead in measurement, and use multi-homing APIs to read from the nearest region
Indexing Policy
Ensure that you have the required indexing paths/policy for the query
Query Complexity
Analyze the query execution metrics to identify potential rewrites of query and data shapes
CLIENT QUERY PARALLELISM
Modern processors ship with both physical and virtual (hyper-threading) cores. For any given cross-partition query, the SDK can use concurrent threads to issue the query across the underlying partitions.
By default, the SDK uses a slow start algorithm for cross-partition queries, increasing the number of threads over time. This increase is exponential up to any physical or network limitations.
[Diagram: the primary thread sends the first request and receives the first response; concurrent threads then send concurrent requests and collect concurrent responses in the client buffer]
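The shape of an exponential slow start can be sketched as below. The SDK's real scheduling is internal; the function name, the doubling rule and the cap used here are assumptions chosen only to show how concurrency ramps up.

```javascript
// Sketch of an exponential slow start; not the SDK's actual implementation.
function slowStartSchedule(partitionCount, maxDegreeOfParallelism) {
  const schedule = [];
  let remaining = partitionCount;
  let concurrency = 1; // begin with a single request from the primary thread
  while (remaining > 0) {
    const batch = Math.min(concurrency, remaining, maxDegreeOfParallelism);
    schedule.push(batch);
    remaining -= batch;
    concurrency *= 2; // double the number of concurrent requests each round
  }
  return schedule;
}

console.log(slowStartSchedule(20, 10)); // [ 1, 2, 4, 8, 5 ]
```

Each array entry is the number of partitions queried concurrently in that round, so 20 partitions are covered in five rounds here.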
SDK QUERY OPTIONS
REQUEST UNITS
• Normalized across various access methods (GET, POST, Query, …)
• 1 RU = 1 read of 1 KB document
• Each request consumes fixed RUs
[Diagram: incoming requests plotted against Max RU/sec; labels: Metered Hourly, No rate limiting]
The complexity of a query impacts how many Request Units are consumed for an operation. The number of predicates, nature of the
predicates, number of system functions, and the number of index matches / query results all influence the cost of query operations.
Every write operation will require the indexer to run. The more indexed terms you have, the more indexing work each write performs, which directly increases the RU charge.
You can optimize for this by fine-tuning your index policy to include only fields and/or paths certain to be used in queries.
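As an illustration of such tuning, an indexing policy that indexes only the paths a workload filters on might look like the following. The two included paths are assumptions invented for this example; choose paths from your own queries.

```javascript
// Assumed example: index only the paths a telemetry workload filters on.
const indexingPolicy = {
  indexingMode: "consistent",
  includedPaths: [
    { path: "/deviceId/?" },
    { path: "/eventTime/?" }
  ],
  excludedPaths: [
    { path: "/*" } // everything else is left unindexed
  ]
};

console.log(JSON.stringify(indexingPolicy, null, 2));
```

Measure the RU charge of representative writes before and after a change like this, since the saving depends on document shape.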
MEASURING RU CHARGE
Azure Cosmos DB uses information about past runs to produce a stable logical charge for the majority of CRUD or query operations.
Since this stable charge exists, we can rely on our operations having a high degree of predictability with very little variation. We can use
the predictable RU charges for future capacity planning.
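client.createDocument(
    collectionLink,
    documentDefinition,
    function (err, document, headers) {
        if (err) {
            console.log(err);
            return;
        }
        // The RU charge of the operation is returned in a response header.
        var requestCharge = headers['x-ms-request-charge'];
        console.log('Request charge: ' + requestCharge + ' RUs');
    }
);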
RU CHARGE MEASUREMENT EXAMPLE
Storage Cost
Throughput Cost
Operation Type   Number of Requests per Second   Avg RU's per Request   RU's Needed
Create           100                             5                      500
Read             400                             1                      400
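The arithmetic behind the table can be written out as a small calculation. The object shape and function name are just for this sketch; the numbers are the ones from the table.

```javascript
// Rows from the table above: requests per second times average RUs per request.
const workload = [
  { operation: "Create", requestsPerSecond: 100, avgRuPerRequest: 5 },
  { operation: "Read", requestsPerSecond: 400, avgRuPerRequest: 1 }
];

// Sum the per-operation RU needs to estimate the throughput to provision.
function requiredThroughput(rows) {
  return rows.reduce((total, row) => total + row.requestsPerSecond * row.avgRuPerRequest, 0);
}

console.log(requiredThroughput(workload)); // 900
```

For this workload the two rows add up to 900 RU/s.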
Tuning a Query
VALIDATING THROUGHPUT LEVEL CHOICE
SELECT VALUE
COUNT(1)
FROM
telemetry t
WHERE
t.deviceId = "craft267_seat17"
GEOJSON SPECIFICATION
Azure Cosmos DB supports indexing and querying of geospatial point data that's represented using the GeoJSON specification.
{
    "type":"Point",
    "coordinates":[ 31.9, -4.8 ]
}
SEARCH BY DISTANCE FROM POINT
The ST_DISTANCE built-in function returns the distance between the two GeoJSON Point expressions.
SEARCH WITHIN BOUNDED POLYGON
The ST_WITHIN built-in function returns a Boolean indicating whether the first GeoJSON Point expression is within a GeoJSON Polygon expression.
{
    "type":"Polygon",
    "coordinates":[[
        [ 31.8, -5 ],
        [ 31.8, -4.7 ],
        [ 32, -4.7 ],
        [ 32, -5 ],
        [ 31.8, -5 ]
    ]]
}
DISTANCE FROM CENTER POINT SEARCH
ST_DISTANCE
ST_DISTANCE can be used to measure the distance between two points. Commonly this function is used to determine if a point is within a specified range (meters) of another point.
SELECT *
FROM flights f
WHERE ST_DISTANCE(f.origin.location, {
    "type": "Point",
    "coordinates": [-122.19, 47.36]
}) < 100 * 1000
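To get a feel for what a distance-in-meters check does, the comparison can be approximated on the client with the haversine formula. This is a rough sketch: the spherical Earth radius and the function name are assumptions, and the service's own computation may differ slightly.

```javascript
const EARTH_RADIUS_METERS = 6371000; // mean Earth radius, an approximation

// GeoJSON positions are [longitude, latitude] in degrees.
function haversineMeters(pointA, pointB) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const [lon1, lat1] = pointA.coordinates;
  const [lon2, lat2] = pointB.coordinates;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
}

const center = { type: "Point", coordinates: [-122.19, 47.36] };
const origin = { type: "Point", coordinates: [-122.19, 47.86] }; // made-up sample point
const withinRange = haversineMeters(origin, center) < 100 * 1000;
console.log(withinRange); // true
```

Half a degree of latitude is roughly 56 km, so the sample point falls inside the 100 km range used in the query.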
POLYGON SHAPE SEARCH
ST_WITHIN
ST_WITHIN can be used to check if a point lies within a Polygon. Commonly Polygons are used to represent boundaries like zip codes, state boundaries, or natural formations.
Polygon arguments in ST_WITHIN can contain only a single ring, that is, the Polygons must not contain holes in them.
SELECT *
FROM flights f
WHERE ST_WITHIN(f.destination.location,
{
    "type": "Polygon",
    "coordinates": [[
        [-124.63, 48.36],
        [-123.87, 46.14],
        [-122.23, 45.54],
        [-119.17, 45.95],
        [-116.92, 45.96],
        [-116.99, 49.00],
        [-123.05, 49.02],
        [-123.15, 48.31],
        [-124.63, 48.36]
    ]]
})
HANDLE ANY DATA WITH NO SCHEMA OR INDEXING REQUIRED
Azure Cosmos DB’s schema-less service automatically indexes all your data, regardless of the data model, to deliver blazing fast queries.
Item              Color      Microwave safe   Liquid capacity   CPU                                   Memory   Storage
Geek mug          Graphite   Yes              16oz              ???                                   ???      ???
Coffee Bean mug   Tan        No               12oz              ???                                   ???      ???
Surface book      Gray       ???              ???               3.4 GHz Intel Skylake Core i7-6600U   16GB     1 TB SSD
INDEXING JSON DOCUMENTS
{
    "locations": [
        {
            "country": "Germany",
            "city": "Berlin"
        },
        {
            "country": "France",
            "city": "Paris"
        }
    ],
    "headquarter": "Belgium",
    "exports": [
        { "city": "Moscow" },
        { "city": "Athens" }
    ]
}
[Diagram: the document as a tree. locations > 0 > country: Germany, city: Berlin; locations > 1 > country: France, city: Paris; headquarter: Belgium; exports > 0 > city: Moscow; exports > 1 > city: Athens]
INDEXING JSON DOCUMENTS
{
    "locations": [
        {
            "country": "Germany",
            "city": "Bonn",
            "revenue": 200
        }
    ],
    "headquarter": "Italy",
    "exports": [
        {
            "city": "Berlin",
            "dealers": [
                { "name": "Hans" }
            ]
        },
        { "city": "Athens" }
    ]
}
[Diagram: the document as a tree. locations > 0 > country: Germany, city: Bonn, revenue: 200; headquarter: Italy; exports > 0 > city: Berlin, dealers > 0 > name: Hans; exports > 1 > city: Athens]
INDEXING JSON DOCUMENTS
[Diagram: the trees of the two documents above are combined (+) into a single tree]
INVERTED INDEX
[Diagram: each node of the combined tree is annotated with the set of documents that contain it. Germany: {1, 2}; Berlin, France, Paris, Moscow: {1}; Bonn, 200, Berlin, 0, Athens, name, Hans: {2}]
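A toy version of such an inverted index can be built by turning every root-to-leaf path into a term and recording which documents contain it. This is a sketch of the general technique only; the term format and function names are assumptions, not the service's storage layout.

```javascript
// Walk a JSON value and collect every root-to-leaf path as a string term.
function collectTerms(value, prefix, terms) {
  if (value !== null && typeof value === "object") {
    for (const [key, child] of Object.entries(value)) {
      collectTerms(child, `${prefix}/${key}`, terms);
    }
  } else {
    terms.push(`${prefix}/${value}`);
  }
  return terms;
}

// Build a map from term to the set of ids of documents that contain it.
function buildInvertedIndex(documentsById) {
  const index = new Map();
  for (const [id, doc] of Object.entries(documentsById)) {
    for (const term of collectTerms(doc, "", [])) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(Number(id));
    }
  }
  return index;
}

const invertedIndex = buildInvertedIndex({
  1: {
    locations: [{ country: "Germany", city: "Berlin" }, { country: "France", city: "Paris" }],
    headquarter: "Belgium",
    exports: [{ city: "Moscow" }, { city: "Athens" }]
  },
  2: {
    locations: [{ country: "Germany", city: "Bonn", revenue: 200 }],
    headquarter: "Italy",
    exports: [{ city: "Berlin", dealers: [{ name: "Hans" }] }, { city: "Athens" }]
  }
});

console.log([...invertedIndex.get("/locations/0/country/Germany")]); // [ 1, 2 ]
```

A filter such as a match on the first location's country then becomes a lookup of one term instead of a scan of every document.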
INDEX POLICIES
[Diagram: a collection with one index policy at time t0 and another at time t1]
SQL SYNTAX
The SELECT & FROM keywords are the basic components of every query.
SELECT
    tickets.id,
    tickets.pricePaid
FROM tickets

SELECT
    t.id,
    t.pricePaid
FROM tickets t
SQL QUERY SYNTAX - WHERE
FILTERING
SELECT
tickets.id,
tickets.pricePaid
FROM tickets
WHERE
tickets.pricePaid > 500.00 AND
tickets.pricePaid <= 1000.00
SQL QUERY SYNTAX - PROJECTION
JSON PROJECTION
SELECT {
    "id": tickets.id,
    "flightNumber": tickets.assignedFlight.flightNumber,
    "purchase": {
        "cost": tickets.pricePaid
    },
    "stops": [
        tickets.assignedFlight.origin,
        tickets.assignedFlight.destination
    ]
} AS ticket
FROM tickets

[
    {
        "ticket": {
            "id": "6ebe1165836a",
            "purchase": {
                "cost": 575.5
            },
            "stops": [
                "SEA",
                "JFK"
            ]
        }
    }
]
SQL QUERY SYNTAX - PROJECTION
SELECT VALUE
The VALUE keyword can further flatten the result collection if needed for a specific application workload.
SELECT VALUE {
    "id": tickets.id,
    "flightNumber": tickets.assignedFlight.flightNumber,
    "purchase": {
        "cost": tickets.pricePaid
    },
    "stops": [
        tickets.assignedFlight.origin,
        tickets.assignedFlight.destination
    ]
}
FROM tickets

[
    {
        "id": "6ebe1165836a",
        "purchase": {
            "cost": 575.5
        },
        "stops": [
            "SEA",
            "JFK"
        ]
    }
]
INTRA-DOCUMENT JOIN
SELECT
    tickets.assignedFlight.number,
    tickets.seat,
    tickets.requests
FROM
    tickets
WHERE
    tickets.requests[1] = "aisle_seat"

[
    {
        "number":"F125","seat":"12A",
        "requests": [
            "kosher_meal",
            "aisle_seat"
        ]
    }
]
INTRA-DOCUMENT JOIN
JOIN allows us to merge embedded documents or arrays across multiple documents and return a flattened result set:
SELECT
    tickets.assignedFlight.number,
    tickets.seat,
    requests
FROM
    tickets
JOIN
    requests IN tickets.requests

[
    {
        "number":"F125","seat":"12A",
        "requests":"kosher_meal"
    },
    {
        "number":"F125","seat":"12A",
        "requests":"aisle_seat"
    },
    {
        "number":"F752","seat":"14C",
        "requests":"early_boarding"
    },
    {
        "number":"F752","seat":"14C",
        "requests":"window_seat"
    }
]
INTRA-DOCUMENT JOIN
Along with JOIN, we can also filter the cross products without knowing the array index position:
SELECT
    tickets.id, requests
FROM
    tickets
JOIN
    requests IN tickets.requests
WHERE
    requests
    IN ("aisle_seat", "window_seat")

[
    {
        "number":"F125","seat":"12A",
        "requests": "aisle_seat"
    },
    {
        "number":"F752","seat":"14C",
        "requests": "window_seat"
    }
]
PAGINATED QUERY RESULTS
while(documents.hasNext()) {
Document current = documents.next();
}
SQL QUERY PARAMETRIZATION
QUERYING THE DATABASE USING SQL
2. Running Intra-document Queries
3. Projecting Query Results
PROGRAMMING
OPTIMISTIC CONCURRENCY
• The SQL API supports optimistic concurrency control (OCC) through HTTP entity tags, or ETags
• Every SQL API resource has an ETag system property, and the ETag value is generated on the server every time a document is updated.
• If the ETag value stays constant – that means no other process has updated the document. If the ETag value unexpectedly mutates – then
another concurrent process has updated the document.
• ETags can be used with the If-Match HTTP request header to allow the server to decide whether a resource should be
updated:
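[Diagram: a request carrying If-Match goes through an ETag check; if the ETag matches, the update proceeds; if the ETag is stale, the server returns HTTP 412]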
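The If-Match check can be illustrated without the service by simulating it in memory. Everything in this sketch (the `FakeContainer` class, its methods and the ETag format) is hypothetical and exists only to show the control flow, not the SDK surface.

```javascript
// In-memory stand-in for the service, used only to illustrate the If-Match check.
class FakeContainer {
  constructor() {
    this.items = new Map();
    this.version = 0;
  }
  // Store the item and stamp it with a freshly generated ETag.
  upsert(item) {
    const stored = { ...item, _etag: `etag-${++this.version}` };
    this.items.set(item.id, stored);
    return { ...stored };
  }
  read(id) {
    return { ...this.items.get(id) };
  }
  // Replace only if the caller's ETag matches the stored one; otherwise HTTP 412.
  replace(item, ifMatchEtag) {
    const current = this.items.get(item.id);
    if (current._etag !== ifMatchEtag) return { statusCode: 412 };
    return { statusCode: 200, item: this.upsert(item) };
  }
}

const container = new FakeContainer();
container.upsert({ id: "ticket-1", seat: "12A" });
const readerA = container.read("ticket-1");
const readerB = container.read("ticket-1");
const first = container.replace({ ...readerA, seat: "14C" }, readerA._etag);
const second = container.replace({ ...readerB, seat: "15D" }, readerB._etag);
console.log(first.statusCode, second.statusCode); // 200 412
```

The second writer holds a stale ETag, so a real client would re-read the item, reapply its change and try again.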
CONTROL CONCURRENCY USING ETAGS
try
if (dce.StatusCode == HttpStatusCode.PreconditionFailed)
}
STORED PROCEDURES
BENEFITS
• Familiar programming language
• Atomic Transactions
• Built-in Optimizations
• Business Logic Encapsulation
SIMPLE STORED PROCEDURE
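function createSampleDocument(documentToCreate) {
    var context = getContext();
    var collection = context.getCollection();
    var accepted = collection.createDocument(
        collection.getSelfLink(),
        documentToCreate,
        function (error, documentCreated) {
            // Throwing aborts the stored procedure and rolls back its changes.
            if (error) throw error;
            context.getResponse().setBody(documentCreated.id);
        }
    );
    // false means the request was not queued, so the document was not created.
    if (!accepted) return;
}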
MULTI-DOCUMENT TRANSACTIONS
DATABASE TRANSACTIONS
In a typical database, a transaction can be defined as a
sequence of operations performed as a single logical unit of
work. Each transaction provides ACID guarantees.
In Azure Cosmos DB, JavaScript is hosted in the same
memory space as the database. Hence, requests made within
stored procedures and triggers execute in the same scope of a
database session.
Stored procedures utilize snapshot
isolation to guarantee all reads
within the transaction will see a
consistent snapshot of the data
All Azure Cosmos DB operations must complete within the server-specified request timeout duration. If an operation does not
complete within that time limit, the transaction is rolled back.
All functions under the collection object (for create, read, replace, and delete of documents and attachments) return a Boolean
value that represents whether that operation will complete:
• If true, the operation is expected to complete
• If false, the time limit will soon be reached and your function should end execution as soon as possible.
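One common way to act on that Boolean is to create documents one at a time and report progress when a request is no longer accepted, so the client can call again with the remainder. The sketch below assumes this resume-from-count convention and the response shape; both are illustrative choices rather than a prescribed pattern.

```javascript
// Sketch of a bulk-create stored procedure that stops when work is no longer accepted.
function bulkCreate(documents) {
  var context = getContext();
  var collection = context.getCollection();
  var created = 0;

  createNext();

  function createNext() {
    if (created >= documents.length) {
      context.getResponse().setBody({ created: created, done: true });
      return;
    }
    var accepted = collection.createDocument(
      collection.getSelfLink(),
      documents[created],
      function (error) {
        if (error) throw error; // rolls back the whole transaction
        created++;
        createNext();
      }
    );
    // Not accepted: the time limit is near, so report progress and let the client resume.
    if (!accepted) {
      context.getResponse().setBody({ created: created, done: false });
    }
  }
}
```

A client would call the procedure again with `documents.slice(created)` until the response reports `done: true`.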
TRANSACTION CONTINUATION MODEL
[Diagram: for each document: try create, observe return value, done]
Stored procedures allow you to naturally express control flow, variable scoping, assignment, and integration of exception handling primitives with database transactions directly in terms of the JavaScript programming language.
ES6 PROMISES
ES6 promises can be used in Azure Cosmos DB stored procedures. Unfortunately, promises “swallow” exceptions by default, so it is recommended to use callbacks instead of ES6 promises.
STORED PROCEDURE CONTROL FLOW
if (!bAccepted) return;
};
context.getResponse().setBody({
"firstDocId": created.id,
"secondDocId": created.id
});
};
}
STORED PROCEDURE CONTROL FLOW
context.getResponse().setBody({
});
});
if (!aAccepted) return;
});
if (!bAccepted) return;
}
ROLLING BACK TRANSACTIONS
TRANSACTION ROLL-BACK
Inside a JavaScript function, all operations are automatically wrapped under a single transaction:
• If the function completes without any exception, all data changes are committed
• If there is any exception that’s thrown from the script, Azure Cosmos DB’s JavaScript runtime will roll back the whole transaction.
[Diagram: implicit transaction scope. Implicit BEGIN TRANSACTION, then Create New Document, Query Collection, Update Existing Document, Delete Existing Document, then implicit COMMIT TRANSACTION. If exception, ROLLBACK TRANSACTION undoes the changes]
TRANSACTION ROLLBACK IN STORED PROCEDURE
collection.createDocument(
    collection.getSelfLink(),
    documentToCreate,
    function (error, documentCreated) {
        // Throwing rolls back every change made by this stored procedure.
        if (error) throw "Unable to create document, aborting...";
    }
);
collection.replaceDocument(
    documentToReplace._self,
    replacementDocument,
    function (error, documentReplaced) {
        if (error) throw "Unable to update document, aborting...";
    }
);
DEBUGGING STORED PROCEDURES
CONSOLE LOGGING
Much like with traditional JavaScript applications, you can use console.log() to capture various telemetry and data points for your running
code.
.NET
You must opt-in to viewing and capturing console output using the EnableScriptLogging boolean property available in the client
SDK. The SDK has a ScriptLog property on the StoredProcedureResponse class that contains the captured output of the JavaScript
console log.
DEBUGGING STORED PROCEDURES
UDF
• User-defined functions (UDFs) are used to extend the Azure Cosmos DB SQL API’s query language grammar and
implement custom business logic. UDFs can only be called from inside queries
• They do not have access to the context object and are meant to be used as compute-only code
USER-DEFINED FUNCTION DEFINITION
var taxUdf = {
id: "tax",
serverScript: function tax(income) {
if (income == undefined)
throw 'no input';
if (income < 1000)
return income * 0.1;
else if (income < 10000)
return income * 0.2;
else
return income * 0.4;
}
}
USER-DEFINED FUNCTION USAGE IN QUERIES
SELECT
*
FROM
TaxPayers t
WHERE
udf.tax(t.income) > 20000
SQL
DEMO
COMMON SCENARIOS
• Trigger notification for new items
• Perform real-time analytics on streamed data
• Synchronize data with a cache, search engine or data
warehouse.
CHANGE FEED
[Diagram: new events from the Azure Cosmos DB change feed trigger actions in microservices (… Microservice #N) and feed stream processing]
• Stream processing: perform real-time (stream) processing on updates to data (IoT processing, data science & analytics)
• Trigger action from change feed
MATERIALIZING VIEWS
[Diagram: the application writes CRUD data; replicated records flow to a secondary datastore (e.g. archive)]
Consumer parallelization
Change feed listens for any changes in an Azure Cosmos DB collection. It then outputs the sorted list of documents that were changed in the order in which they were modified.
The changes are persisted and can be processed asynchronously and incrementally. The change feed is available for each partition key range within the document collection, and thus can be distributed across one or more consumers for parallel processing.
[Diagram: event/stream processing app tier with Consumer 1, Consumer 2 and Consumer 3]
CHANGE FEED – RETRIEVING KEY RANGES
https://www.nuget.org/packages/Microsoft.Azure.DocumentDB.ChangeFeedProcessor/
CHANGE FEED PROCESSOR – BEHIND THE SCENES
CHANGE FEED PROCESSOR – INTERFACE IMPLEMENTATION
public class DocumentFeedObserver : IChangeFeedObserver
{
    ...
    // Called with each batch of changed documents for a partition key range.
    public Task ProcessChangesAsync(ChangeFeedObserverContext context, IReadOnlyList<Document> docs)
    {
        Console.WriteLine("Change feed: {0} documents", Interlocked.Add(ref totalDocs, docs.Count));
        foreach (Document doc in docs)
        {
            Console.WriteLine(doc.Id.ToString());
        }
        return Task.CompletedTask;
    }
}
CHANGE FEED PROCESSOR - REGISTRATION
RESPONSE STATUS CODES
Code                       Meaning
408 Request Timeout        Stored Procedure, Trigger or UDF exceeded maximum execution time
409 Conflict               The item Id for a PUT or POST operation conflicts with an existing item
412 Precondition Failure   The specified eTag is different from the version on the server (optimistic concurrency error)
413 Entity Too Large       The item size exceeds maximum allowable document size of 2MB
429 Too Many Requests      Container has exceeded provisioned throughput limit
RESPONSE HEADERS
Header                Value
etag                  The same value as the _etag property of the requested item
x-ms-continuation     Token returned if a query (or read-feed) has more results and is resubmitted by clients as a request header to resume execution
x-ms-session-token    Used to maintain session consistency. Clients must echo this as a request header in subsequent operations to the same container
x-ms-retry-after-ms   If rate limited, the number of milliseconds to wait before retrying the operation
DEMO
A rate-limited request will return an HTTP status code of 429 (Too Many Requests). This response indicates that the container has exceeded its provisioned throughput limit.
A rate-limited request will also have an x-ms-retry-after-ms header. This header gives the number of milliseconds your application should wait before retrying the current request.
The SDK automatically retries any throttled requests. This can potentially create
a long-running client-side method that is attempting to retry throttled requests.
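If you handle throttling yourself rather than relying on the SDK, the decision can be reduced to a small function like the one below. The function name, the response shape, the retry cap of 5 and the 1000 ms fallback are all assumptions made for this sketch.

```javascript
// Decide whether to retry a response and how long to wait first.
// `maxRetries` and the 1000 ms fallback are assumptions for this sketch.
function planRetry(response, attempt, maxRetries = 5) {
  if (response.statusCode !== 429 || attempt >= maxRetries) {
    return { retry: false, waitMs: 0 };
  }
  const header = response.headers["x-ms-retry-after-ms"];
  const waitMs = Number.isFinite(Number(header)) ? Number(header) : 1000;
  return { retry: true, waitMs };
}

const throttled = { statusCode: 429, headers: { "x-ms-retry-after-ms": "250" } };
console.log(planRetry(throttled, 0)); // { retry: true, waitMs: 250 }
```

Capping the number of attempts keeps a burst of throttling from turning into the long-running client-side method described above.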
LOGGING
ENABLE LOGGING
Diagnostic Logs for Azure Services are opt-in. You should first enable logging (using the Portal, CLI or PowerShell).
LOG ANALYTICS
If you selected the Send to Log Analytics option when you turned on diagnostic logging, diagnostic data from your collection is forwarded
to Log Analytics.
CONTAINERS
• Ability to query across multiple entity types with a single network request.
{
    "id": "Andrew",
    "type": "Person",
    "familyId": "Liu",
    "worksOn": "Azure Cosmos DB"
}
{
    "id": "Ralph",
    "type": "Cat",
    "familyId": "Liu",
    "fur": {
        "length": "short",
        "color": "brown"
    }
}
We can query both types of documents without needing a JOIN simply by running a query without a filter on type:
If we wanted to filter on type = "Person", we can simply add a filter on type to our query:
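The query text for these two cases is not shown here, so the strings below are a guess at what such queries could look like, assuming the container is aliased as `c` and the documents are selected by `familyId`. The in-memory filter mirrors the same logic for illustration.

```javascript
// Hypothetical query text, assuming a container aliased as "c".
const allFamilyMembers = 'SELECT * FROM c WHERE c.familyId = "Liu"';
const peopleOnly = 'SELECT * FROM c WHERE c.familyId = "Liu" AND c.type = "Person"';

// In-memory equivalent of the two filters, for illustration only.
const items = [
  { id: "Andrew", type: "Person", familyId: "Liu", worksOn: "Azure Cosmos DB" },
  { id: "Ralph", type: "Cat", familyId: "Liu", fur: { length: "short", color: "brown" } }
];
const family = items.filter((item) => item.familyId === "Liu");
const people = family.filter((item) => item.type === "Person");

console.log(allFamilyMembers, family.map((item) => item.id));
console.log(peopleOnly, people.map((item) => item.id));
```

Both entity types come back from the first query in one request; the second narrows the same request to a single type.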
UPDATES ARE ATOMIC
Update operations update the entire document, not specific fields or “parts” of the document.
DE-NORMALIZED DOCUMENTS CAN BE EXPENSIVE TO UPDATE
De-normalization has benefits for read operations, but you must weigh this against the costs in write operations.
De-normalization may require fanning out update operations.
Normalization may require chaining a series of requests to resolve relationships.
{
    "id": "08259",
    "ticketPrice": 255.00,
    "flightCode": "3754",
    "origin": {
        "airport": "SEA",
        "gate": "A13",
        "departure": "2014-09-15T23:14:25.7251173Z"
    },
    "destination": {
        "airport": "JFK",
        "gate": "D4",
        "arrival": "2014-09-16T02:10:10.2379581Z"
    },
    "pilot": [{
        "id": "EBAMAO",
        "name": "Hailey Nelson"
    }]
}
UPDATING NORMALIZED DATA
Normalized: Optimized for writes over reads
{
    "id": "08259",
    "pilot": [{ "id": "EBAMAO", "name": "Hailey Nelson" }]
},
{
    "id": "08259",
    "ticketPrice": 255.00,
    "flightCode": "3754"
},
{
    "id": "08259",
    "origin": {
        "airport": "SEA", "gate": "A13",
        "departure": "2014-09-15T23:14:25.7251173Z"
    },
    "destination": {
        "airport": "JFK", "gate": "D4",
        "arrival": "2014-09-16T02:10:10.2379581Z"
    }
}
De-normalized: Optimized for reads over writes
{
    "id": "08259",
    "ticketPrice": 255.00,
    "flightCode": "3754",
    "origin": {
        "airport": "SEA",
        "gate": "A13",
        "departure": "2014-09-15T23:14:25.7251173Z"
    },
    "destination": {
        "airport": "JFK",
        "gate": "D4",
        "arrival": "2014-09-16T02:10:10.2379581Z"
    },
    "pilot": [{
        "id": "EBAMAO",
        "name": "Hailey Nelson"
    }]
}
UPDATING NORMALIZED DATA
THE SOLUTION IS TYPICALLY A COMPROMISE BASED ON YOUR WORKLOAD
Examine your workload. Answer the following questions:
• Which fields are commonly updated together?
• What are the most common fields included in all queries?
Example: The ticketPrice, origin and destination fields are often updated together. The pilot field is only rarely updated. The flightCode field is included in almost all queries across the board.
{
    "id": "08259",
    "flightCode": "3754",
    "pilot": [{ "id": "EBAMAO", "name": "Hailey Nelson" }]
},
{
    "id": "08259",
    "flightCode": "3754",
    "ticketPrice": 255.00,
    "origin": {
        "airport": "SEA", "gate": "A13",
        "departure": "2014-09-15T23:14:25.7251173Z"
    },
    "destination": {
        "airport": "JFK", "gate": "D4",
        "arrival": "2014-09-16T02:10:10.2379581Z"
    }
}
SHORT-LIFETIME DATA
TTL BEHAVIOR
The TTL feature is controlled by TTL properties at two levels - the collection level and the document level.
• DefaultTTL for the collection
  • If missing (or set to null), documents are not deleted automatically.
  • If present and the value is "-1" = infinite – documents don’t expire by default
  • If present and the value is some number ("n") – documents expire "n" seconds after last modification
• TTL for the documents:
  • Property is applicable only if DefaultTTL is present for the parent collection.
  • Overrides the DefaultTTL value for the parent collection.
The values are set in seconds and are treated as a delta from the _ts that the document was last modified at.
[Diagram: a Collection carrying the Default TTL contains a Document carrying the Document TTL]
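The rules above can be expressed as a small client-side check. This is only a model of the described behavior: the function names are invented, and treating a document as expired exactly at `_ts + ttl` is an assumption about the boundary.

```javascript
// Returns the effective TTL in seconds, or null when the document never expires.
function effectiveTtl(defaultTtl, documentTtl) {
  if (defaultTtl === undefined || defaultTtl === null) return null; // TTL disabled
  const ttl = documentTtl !== undefined ? documentTtl : defaultTtl;
  return ttl === -1 ? null : ttl;
}

// `doc._ts` is the last-modified time in epoch seconds.
function isExpired(doc, defaultTtl, nowSeconds) {
  const ttl = effectiveTtl(defaultTtl, doc.ttl);
  return ttl !== null && nowSeconds >= doc._ts + ttl;
}

const sessionDoc = { id: "session-1", _ts: 1000, ttl: 60 };
console.log(
  isExpired(sessionDoc, 3600, 1059),
  isExpired(sessionDoc, 3600, 1060),
  isExpired(sessionDoc, null, 5000)
); // false true false
```

The document-level value of 60 seconds overrides the collection default of 3600, and with no collection default the document never expires.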
DEMO
High Availability
• Automatic and Manual Failover
Container
• Automatic and transparent replication worldwide
Partition-key = "airport"
• Each partition hosts a replica set per region
"airport" : "LAX", "airport" : "AMS", "airport" : "MEL"
DocumentClient client;
ConnectionPolicy policy = new ConnectionPolicy()
{
ConnectionMode = ConnectionMode.Direct,
ConnectionProtocol = Protocol.Tcp
};
policy.PreferredLocations.Add(LocationNames.CentralUS); //first preference
policy.PreferredLocations.Add(LocationNames.SoutheastAsia); //second preference
policy.PreferredLocations.Add(LocationNames.NorthEurope); //third preference
(West US)
(East US)
Value = 5 6
Value = 5 6
Update 5 => 6
(North Europe)
Value = 5
Latency: packet of information can travel as fast as speed of light. Replication between distant geographic regions can take 100’s of milliseconds
CONSISTENCY
Consistent Prefix: Reads will never see out-of-order writes (no gaps).
Eventual: Potential for out-of-order reads. Lowest cost for reads of all consistency levels.
DEMYSTIFYING CONSISTENCY MODELS
Strong consistency
Eventual consistency: Replicas are eventually consistent with any operations. There is a potential for out-of-order reads. Lowest cost and highest performance for reads of all consistency levels.
STRONG CONSISTENCY
• Strong consistency offers a linearizability guarantee with the reads guaranteed to return the most recent version of an item.
• Strong consistency guarantees that a write is only visible after it is committed durably by the majority quorum of replicas.
• A client is always guaranteed to read the latest acknowledged write.
• The cost of a read operation (in terms of request units consumed) with strong consistency is higher than session and eventual, but the same as bounded staleness.
EVENTUAL CONSISTENCY
• Eventual consistency guarantees that, in the absence of any further writes, the replicas within the group eventually converge.
• Eventual consistency is the weakest form of consistency: a client may get values that are older than the ones it has seen before.
• Eventual consistency provides the weakest read consistency but offers the lowest latency for both reads and writes.
• The cost of a read operation (in terms of RUs consumed) with the eventual consistency level is the lowest of all the Azure Cosmos DB consistency levels.
DEMYSTIFYING CONSISTENCY MODELS
Bounded-staleness
Session: Consistent prefix. Within a session, reads and writes are monotonic. This is referred to as "read-your-writes" and "write-follows-reads". Predictable consistency for a session. High read throughput and low latency outside of session.
Consistent Prefix
BOUNDED STALENESS CONSISTENCY
• Bounded staleness consistency guarantees that reads may lag behind writes by at most K versions (prefixes) of an item or by a time interval t.
• Bounded staleness offers total global order except within the "staleness window." The monotonic read guarantees exist within a region both inside and outside the "staleness window."
• Bounded staleness provides a stronger consistency guarantee than session, consistent-prefix, or eventual consistency.
• The cost of a read operation (in terms of RUs consumed) with bounded staleness is higher than session and eventual consistency, but the same as strong consistency.
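The staleness bound can be illustrated with a small check. This is a sketch under the assumption that both limits (K versions and a time interval t) must hold for a read to be acceptable; the function name and parameters are hypothetical and do not mirror any service API.

```python
def within_staleness_bound(latest_version: int,
                           served_version: int,
                           seconds_behind: float,
                           k: int,
                           t: float) -> bool:
    """Return True if a read lags the latest write by at most k versions and t seconds."""
    versions_behind = latest_version - served_version
    # A replica cannot be ahead of the latest write, so negative lag is rejected.
    return 0 <= versions_behind <= k and 0 <= seconds_behind <= t
```

With k = 100 and t = 5, a read that is 3 versions and 1.5 seconds behind passes, while a read that is 150 versions behind does not.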
SESSION CONSISTENCY
• Unlike the global consistency models offered by the strong and bounded staleness consistency levels, session consistency is scoped to a client session.
• Session consistency is ideal for all scenarios where a device or user session is involved, since it guarantees monotonic reads, monotonic writes, and read-your-own-writes (RYW).
• Session consistency provides predictable consistency for a session, and maximum read throughput while offering the lowest latency writes and reads.
• The cost of a read operation (in terms of RUs consumed) with the session consistency level is less than strong and bounded staleness, but more than eventual consistency.
CONSISTENT PREFIX CONSISTENCY
• Consistent prefix guarantees that, in the absence of any further writes, the replicas within the group eventually converge.
• Consistent prefix guarantees that reads never see out-of-order writes. If writes were performed in the order A, B, C, then a client sees either A; A, B; or A, B, C; but never an out-of-order sequence such as A, C or B, A, C.
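The A, B, C example above can be expressed as a prefix check. This is a minimal sketch of the property itself, not of how the service enforces it; the function name `is_consistent_prefix` is hypothetical.

```python
from typing import Sequence


def is_consistent_prefix(write_order: Sequence[str], observed: Sequence[str]) -> bool:
    """Return True if the observed writes are exactly a leading slice of the write order."""
    # Comparing against a slice of the same length rules out both gaps (A, C)
    # and reordering (B, A, C).
    return list(observed) == list(write_order[:len(observed)])


writes = ["A", "B", "C"]
```

Here `is_consistent_prefix(writes, ["A", "B"])` holds, while `["A", "C"]` and `["B", "A", "C"]` are rejected.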
BOUNDED STALENESS IN THE PORTAL
Bounded-staleness
Session
// Write a document and capture the session token from the response.
string sessionToken;
using (DocumentClient client = new DocumentClient(new Uri(""), ""))
{
    ResourceResponse<Document> response = client.CreateDocumentAsync(
        collectionLink,
        new { id = "an id", value = "some value" }
    ).Result;
    sessionToken = response.SessionToken;
}

// Read from a different client instance, passing the token to continue the same session.
using (DocumentClient client = new DocumentClient(new Uri(""), ""))
{
    ResourceResponse<Document> read = client.ReadDocumentAsync(
        documentLink,
        new RequestOptions { SessionToken = sessionToken }
    ).Result;
}
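Why passing a session token gives read-your-own-writes can be illustrated with a toy model. This is a simplified sketch that assumes the token is just a version number and that a replica may serve a session read only once it has caught up to that version; the `Replica` class and the helper functions are hypothetical and are not the service's actual mechanism or token format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    version: int = 0              # highest write version applied on this replica
    value: Optional[str] = None   # current value of the single item in this model


def write(primary: Replica, value: str) -> int:
    """Apply a write on the primary and return its version as the session token."""
    primary.version += 1
    primary.value = value
    return primary.version


def replicate(source: Replica, target: Replica) -> None:
    """Bring the target replica up to date with the source."""
    target.version = source.version
    target.value = source.value


def session_read(replica: Replica, session_token: int) -> Optional[str]:
    """Serve the read only if the replica has seen every write in this session."""
    if replica.version < session_token:
        raise LookupError("replica is behind this session's writes")
    return replica.value
```

In this model a lagging replica refuses the read rather than returning a value older than the session's own write, which is the behavior the token is meant to guarantee.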
RELAXING CONSISTENCY IN CODE
// Per-request override: read this document with eventual consistency.
client.ReadDocumentAsync(
    documentLink,
    new RequestOptions { ConsistencyLevel = ConsistencyLevel.Eventual }
);
© 2018 Microsoft Corporation. All rights reserved.