Building A Transactional Distributed Data Store With Erlang
Scalability matters
− > 10⁴ accesses per sec.
− many concurrent writes
Traditional Web 2.0 hosting
[Figure: clients accessing a traditional Web 2.0 hosting setup]
Now think big.
Really BIG.
− Application Layer
− Transaction Layer: crash recovery model, strong data consistency
− Replication Layer: improves availability at the cost of consistency
− P2P Layer: Key/Value Store (= simple database)
P2P LAYER
Key/Value Store
for storing “items” (= “key/value pairs”)
− synonyms: “key/value store”, “dictionary”, “map”, …
just 3 ops
− insert(key, value)
− delete(key)
− lookup(key)
Turing Award Winners
Key      Value
Backus   1977
Hoare    1980
Karp     1985
Knuth    1974
Wirth    1984
...      ...
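As a rough illustration, a single-node version of this three-operation interface might look as follows in Erlang. This is a minimal sketch only; the module name kv_sketch and the store-passing style are assumptions for illustration, not the actual implementation.

%% Minimal sketch of the three-operation key/value interface,
%% using a plain Erlang map as the store (assumption).
-module(kv_sketch).
-export([new/0, insert/3, delete/2, lookup/2]).

new() -> #{}.

%% insert(key, value): returns the updated store.
insert(Key, Value, Store) -> maps:put(Key, Value, Store).

%% delete(key): returns the updated store.
delete(Key, Store) -> maps:remove(Key, Store).

%% lookup(key): {ok, Value} or not_found.
lookup(Key, Store) ->
    case maps:find(Key, Store) of
        {ok, Value} -> {ok, Value};
        error       -> not_found
    end.

For example, inserting the pair ("Backus", 1977) and looking up "Backus" returns {ok, 1977}.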
Chord# - Distributed Key/Value Store
key space: total order on items (strings, numbers, …)
nodes have a random key as their position in the ring
items are stored on the successor node (clockwise)
[Figure: ring of nodes and keys; one node is responsible for the key range (Backus, …, Karp]]
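A minimal sketch of this successor rule in Erlang; the module and function names are assumptions, and node keys are simply compared with Erlang's term order.

%% Sketch: an item is stored on the first node key >= the item key
%% (clockwise successor); if none exists, responsibility wraps around
%% to the smallest node key. Names are assumptions for illustration.
-module(ring_sketch).
-export([responsible_node/2]).

responsible_node(ItemKey, NodeKeys) ->
    Sorted = lists:sort(NodeKeys),
    case [N || N <- Sorted, N >= ItemKey] of
        [Succ | _] -> Succ;          %% clockwise successor
        []         -> hd(Sorted)     %% wrap around the ring
    end.

For the table above, responsible_node("Hoare", ["Wirth", "Backus", "Karp"]) returns "Karp", the node covering the range (Backus, …, Karp].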
Routing Table and Data Lookup
Building the routing table
− log₂N pointers, exponentially spaced
Retrieving items
− ≤ log₂N hops
− Example: lookup(Hoare), started at one node in the figure; Hoare falls into the range (Backus – Karp]
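To illustrate the spacing of these pointers, here is a toy Erlang sketch that assumes global knowledge of the sorted node list (an assumption made only for this simulation; the real routing table is built without such global knowledge). It returns the nodes 1, 2, 4, … positions away, i.e. about log₂N entries.

%% Simulation-only sketch of an exponentially spaced routing table.
%% MyIndex is my 1-based position in the sorted list of all nodes.
-module(rt_sketch).
-export([routing_table/2]).

routing_table(MyIndex, Nodes) ->
    N = length(Nodes),
    Levels = trunc(math:log2(N)),
    [lists:nth(((MyIndex - 1 + (1 bsl I)) rem N) + 1, Nodes)
     || I <- lists:seq(0, Levels - 1)].

For 8 nodes, rt_sketch:routing_table(1, [n1,n2,n3,n4,n5,n6,n7,n8]) returns [n2, n3, n5], i.e. pointers 1, 2, and 4 positions away.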
Churn
Nodes join, leave, or crash at any time
[Figure: ring with nodes N1 … N4; N3 has crashed while a lookup for key k is under way]
Lookup Consistency
Violated lookup consistency, caused by imperfect failure detectors:
a lookup(k) issued at N1 ends at N3, but issued at N2 it ends at N4
[Figure: ring with nodes N1 … N4 and key k; an alive node is wrongly reported as crashed, so N1 and N2 resolve k differently]
How often does this occur?
Simulated nodes with imperfect failure detectors
(A node detects another alive node as dead probabilistically)
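A toy Erlang version of one round of such a simulation; the module name, the all-pairs check, and the false-positive probability P are assumptions for illustration only.

%% Sketch: N alive nodes check each other once; every check wrongly
%% reports the (alive) peer as dead with probability P. Returns the
%% number of false suspicions in this round.
-module(fd_sim_sketch).
-export([false_suspicions/2]).

false_suspicions(N, P) ->
    length([suspected || A <- lists:seq(1, N),
                         B <- lists:seq(1, N),
                         A =/= B,
                         rand:uniform() < P]).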
SUMMARY P2P LAYER
Chord# provides a key/value store
− scalable
− efficient: log₂N hops
REPLICATION LAYER
Replication
Many schemes
− symmetric replication
− succ. list replication
− …
Quorum algorithms
[Figure: an item replicated on r1 … r5; a majority of the replicas forms a quorum]
How to implement?
− Always read/write a majority, i.e. ⌊f/2⌋ + 1, of the f replicas.
− Any two majorities intersect, so the latest version is always in the read or write set.
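A minimal sketch of this majority-read rule in Erlang, assuming each replica answers with a {Version, Value} tuple; the module name and the reply representation are assumptions for illustration.

%% Sketch of a quorum read: once a majority of the F replicas has
%% answered, the highest version among the replies is the latest one.
-module(quorum_sketch).
-export([majority/1, read/2]).

%% Majority quorum size for F replicas.
majority(F) -> F div 2 + 1.

%% Replies: list of {Version, Value} answers received so far.
read(F, Replies) ->
    case length(Replies) >= majority(F) of
        true ->
            {_Version, Value} = lists:max(Replies),
            {ok, Value};
        false ->
            wait_for_more_replies
    end.

For example, with f = 3 replicas, quorum_sketch:read(3, [{7, "v7"}, {6, "v6"}]) already has a majority and returns {ok, "v7"}.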
How to implement?
− 2PC? Blocks if the transaction manager fails.
− 3PC? Too much latency.
− We use a variant of the Paxos Commit Protocol
non-blocking: votes of the transaction participants are sent to multiple “acceptors”
Adapted Paxos Commit
Optimistic concurrency control (CC) with fallback
Write
− 3 rounds
− non-blocking (fallback)
Reads are even faster
− read a majority of the replicas
− just 1 round
Succeeds when > f/2 nodes are alive
Adapted Paxos Commit
[Figure: steps 1–6 of the protocol between the leader, the transaction managers (TMs), and the transaction participants (TPs) that store the replicated items; step 1 reaches the TPs in O(log N) hops, and after a majority of replies the outcome is decided]
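A much-simplified sketch of the commit decision in such a protocol: each participant's vote is stored at several acceptors, and the transaction commits only if, for every participant, a majority of its acceptors stored a prepared vote. All names and the data layout here are assumptions for illustration, not the actual implementation.

%% Sketch of the decision rule: VotesPerTP holds, per transaction
%% participant, the list of votes (prepared | abort) its acceptors
%% stored; NumAcceptors is the number of acceptors per participant.
-module(paxos_commit_sketch).
-export([decide/2]).

decide(VotesPerTP, NumAcceptors) ->
    Majority = NumAcceptors div 2 + 1,
    Prepared = fun(AcceptorVotes) ->
                       length([V || V <- AcceptorVotes, V =:= prepared])
                           >= Majority
               end,
    case lists:all(Prepared, VotesPerTP) of
        true  -> commit;
        false -> abort
    end.

For instance, with 3 acceptors per participant, decide([[prepared, prepared, prepared], [prepared, abort, prepared]], 3) returns commit.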
Transactions have two purposes:
Consistency of replicas & consistency across items
On the items:
BOT
− debit (a, 100);
− deposit (b, 100);
EOT
On the replicas:
BOT
− debit (a1, 100);
− debit (a2, 100);
− debit (a3, 100);
− deposit (b1, 100);
− deposit (b2, 100);
− deposit (b3, 100);
EOT
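A small Erlang sketch of this expansion, assuming three replicas per item and a hypothetical replica_keys/2 helper that derives the replica keys a1..a3 and b1..b3; the module and tuple layout are assumptions for illustration.

%% Sketch: expand the logical "debit(A, Amount); deposit(B, Amount)"
%% transaction into the per-replica operations between BOT and EOT.
-module(txn_sketch).
-export([transfer_ops/3]).

%% Replica keys for an item, e.g. replica_keys(a, 3) -> [a1, a2, a3].
replica_keys(Item, R) ->
    [list_to_atom(atom_to_list(Item) ++ integer_to_list(I))
     || I <- lists:seq(1, R)].

transfer_ops(A, B, Amount) ->
    [{debit, K, Amount}   || K <- replica_keys(A, 3)] ++
    [{deposit, K, Amount} || K <- replica_keys(B, 3)].

Calling txn_sketch:transfer_ops(a, b, 100) yields exactly the six replica operations listed above.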
SUMMARY TRANSACTION LAYER
[Figure: the existing Wikipedia architecture with web servers, NFS, search servers, and other servers]
Our Wikipedia
− Renderer (Java): Tomcat, Plog4u
− Jinterface (Java): interface to Erlang
− Key/Value Store (Erlang): Chord# + Replication + Transactions
Mapping Wikipedia to Key/Value Store
Mapping
− page content: key = title, value = list of Wikitext for all versions
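A sketch of how this mapping could look on top of the kv_sketch module from the earlier example; the module and function names here are assumptions. The key is the page title, the value is the list of Wikitext versions, newest first.

%% Sketch of the page-content mapping: title -> list of Wikitext versions.
-module(wiki_kv_sketch).
-export([save_revision/3, current_text/2]).

%% Prepend a new revision to the stored version list.
save_revision(Title, Wikitext, Store) ->
    Versions = case kv_sketch:lookup(Title, Store) of
                   {ok, Old} -> Old;
                   not_found -> []
               end,
    kv_sketch:insert(Title, [Wikitext | Versions], Store).

%% Return the most recent revision of a page, if any.
current_text(Title, Store) ->
    case kv_sketch:lookup(Title, Store) of
        {ok, [Latest | _]} -> {ok, Latest};
        not_found          -> not_found
    end.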
Erlang Processes
− Chord#
− load balancing
− transaction framework
− supervision (OTP)
Erlang Processes (per node)
Failure Detector: supervises Chord# nodes and sends crash messages when a failure is detected.
Chord# Node: implements the main functionality of a node, e.g. maintaining the successor list and the routing table.
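A minimal OTP supervisor sketch for this per-node process structure; chordsharp_node and failure_detector are assumed callback module names and are not shown here.

%% Sketch of per-node supervision: restart the Chord# node process and
%% the failure detector if either of them crashes.
-module(node_sup_sketch).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    ChildSpecs =
        [#{id => chordsharp_node,
           start => {chordsharp_node, start_link, []}},   %% assumed module
         #{id => failure_detector,
           start => {failure_detector, start_link, []}}], %% assumed module
    {ok, {SupFlags, ChildSpecs}}.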
[Plots: throughput and CPU load with increasing access rate over time]
Distributed Erlang
− currently has weak security and limited scalability
⇒ we implemented our own transport layer on top of TCP
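A minimal sketch of such a TCP-based transport in Erlang, serializing terms with term_to_binary/1 and using a 4-byte length prefix; this only illustrates the idea and is not the project's actual transport layer.

%% Sketch: send and receive Erlang terms over a plain gen_tcp socket.
-module(tcp_transport_sketch).
-export([connect/2, send_term/2, recv_term/1]).

connect(Host, Port) ->
    gen_tcp:connect(Host, Port, [binary, {packet, 4}, {active, false}]).

send_term(Socket, Term) ->
    gen_tcp:send(Socket, term_to_binary(Term)).

recv_term(Socket) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Bin}       -> {ok, binary_to_term(Bin)};
        {error, Reason} -> {error, Reason}
    end.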
Numerous applications:
− Internet databases, transactional online services, …
Tradeoff: High availability vs. data consistency
Team
Thorsten Schütt
Florian Schintke
Monika Moser
Stefan Plantikow
Alexander Reinefeld
Nico Kruber
Christian von Prollius
Seif Haridi (SICS)
Ali Ghodsi (SICS)
Tallat Shafaat (SICS)
Publications
Chord#
− T. Schütt, F. Schintke, A. Reinefeld. A Structured Overlay for Multi-dimensional Range Queries. Euro-Par, August 2007.
Transactions
− M. Moser, S. Haridi. Atomic Commitment in Transactional DHTs. 1st CoreGRID Symposium, August 2007.
Wiki
− S. Plantikow, A. Reinefeld, F. Schintke. Transactions for Distributed Wikis on Structured Overlays. DSOM, October 2007.
Talks / Demos
− IEEE Scale Challenge, May 2008: 1st prize (live demo)