1-Introduction to Principles of Distributed Database Systems
Logically integrated but physically distributed
❑ History
◼ Delivery modes
❑ Pull-only
❑ Push-only
❑ Hybrid
◼ Frequency
❑ Periodic
❑ Conditional
❑ Ad-hoc or irregular
◼ Communication Methods
❑ Unicast
❑ One-to-many
◼ Note: not all combinations make sense
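The delivery modes above can be illustrated with a minimal sketch. The Server class and its request, subscribe, and update methods are hypothetical names chosen for this example, not part of any real system. The push path here fires on every update (a conditional frequency) and reaches all subscribers (one-to-many).

```python
from typing import Callable, Dict, List, Tuple


class Server:
    """Toy data source supporting both pull and push delivery."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}
        self._subscribers: List[Callable[[str, int], None]] = []

    def request(self, key: str) -> int:
        # Pull: the client asks and the server answers that one request.
        return self._data[key]

    def subscribe(self, callback: Callable[[str, int], None]) -> None:
        # Push: the client registers once and receives later updates unasked.
        self._subscribers.append(callback)

    def update(self, key: str, value: int) -> None:
        self._data[key] = value
        # One-to-many communication: every subscriber gets the same update.
        for notify in self._subscribers:
            notify(key, value)


server = Server()
received: List[Tuple[str, int]] = []
server.subscribe(lambda k, v: received.append((k, v)))  # push-only client
server.update("stock", 42)
pulled = server.request("stock")  # pull-only client
```

A hybrid client would combine both paths: subscribe for updates and fall back to request when it needs a value immediately.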
© 2020, M.T. Özsu & P. Valduriez
Outline
◼ Introduction
Improved performance
SELECT ENAME, SAL
FROM EMP, ASG, PAY
WHERE DUR > 12
AND EMP.ENO = ASG.ENO
AND PAY.TITLE = EMP.TITLE
[Figure: sites Tokyo, Boston, Paris, Montreal, and New York connected by a communication network. The Boston, Paris, Montreal, and New York sites store their own projects, employees, and assignments data; some sites also hold data from other sites, such as New York projects with budget > 200000.]
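To show what the query above computes, here is a single-node sketch using Python's built-in sqlite3 module. The column layouts EMP(ENO, ENAME, TITLE), ASG(ENO, PNO, DUR), and PAY(TITLE, SAL) and all sample rows are assumptions made for illustration. In the distributed setting these relations would be spread across sites rather than stored in one database.

```python
import sqlite3

# In-memory database standing in for the (normally distributed) relations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT);
    CREATE TABLE ASG (ENO TEXT, PNO TEXT, DUR INTEGER);
    CREATE TABLE PAY (TITLE TEXT PRIMARY KEY, SAL INTEGER);
    INSERT INTO EMP VALUES ('E1', 'Alice', 'Engineer'), ('E2', 'Bob', 'Analyst');
    INSERT INTO ASG VALUES ('E1', 'P1', 24), ('E2', 'P2', 6);
    INSERT INTO PAY VALUES ('Engineer', 50000), ('Analyst', 40000);
""")

# Names and salaries of employees assigned to a project for more than 12 months.
rows = conn.execute("""
    SELECT ENAME, SAL
    FROM EMP, ASG, PAY
    WHERE DUR > 12
      AND EMP.ENO = ASG.ENO
      AND PAY.TITLE = EMP.TITLE
""").fetchall()
print(rows)
```

Only Alice qualifies in this sample data, because Bob's assignment lasts 6 months.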
Distributed Database
[Figure: several sites, each running DBMS software, connected through a communication subsystem; user queries and user applications are submitted at the sites.]
◼ Data independence
◼ Network transparency (or distribution transparency)
❑ Location transparency
❑ Fragmentation transparency
◼ Fragmentation transparency
◼ Replication transparency
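A minimal sketch of how location and fragmentation transparency might look to a user: the caller names only the logical relation, and a catalog resolves where its fragments live. The SITES and CATALOG dictionaries, the scan function, and the two-column EMP schema are all hypothetical.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, str]  # (employee number, employee name); hypothetical schema

# Hypothetical sites, each holding one horizontal fragment of EMP.
SITES: Dict[str, List[Row]] = {
    "paris": [("E1", "Alice")],
    "boston": [("E2", "Bob")],
}

# Catalog: logical relation name -> sites that store its fragments.
CATALOG: Dict[str, List[str]] = {"EMP": ["paris", "boston"]}


def scan(relation: str) -> List[Row]:
    """Return all rows of a logical relation, hiding where fragments live."""
    rows: List[Row] = []
    for site in CATALOG[relation]:
        rows.extend(SITES[site])  # stands in for a remote fetch
    return sorted(rows)


emp_rows = scan("EMP")  # the caller never mentions "paris" or "boston"
```

Moving a fragment to another site would change only CATALOG and SITES, not the code that calls scan.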
❑ Failure atomicity
❑ Commit protocols
◼ Data replication
❑ Great for read-intensive workloads, problematic for updates
❑ Replication protocols
◼ Parallelism in execution
❑ Inter-query parallelism
❑ Intra-query parallelism
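The commit protocols bullet above does not name a specific protocol. As one well-known example, the sketch below shows the basic shape of two-phase commit, with failures, timeouts, and logging left out. The Participant class and two_phase_commit function are hypothetical names for illustration.

```python
from typing import List


class Participant:
    def __init__(self, name: str, can_commit: bool) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, commit: bool) -> None:
        # Phase 2: apply the coordinator's global decision.
        self.state = "committed" if commit else "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    # Collect every vote (no short-circuit) so each participant is asked.
    votes = [p.prepare() for p in participants]
    decision = all(votes)  # commit only if everyone voted yes
    for p in participants:
        p.finish(decision)
    return decision


ok = two_phase_commit([Participant("a", True), Participant("b", True)])
group = [Participant("a", True), Participant("b", False)]
failed = two_phase_commit(group)
```

Failure atomicity shows up in the second call: one "no" vote aborts the transaction at every participant, including the one that voted yes.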
❑ Design issues
◼ Reliability
❑ How to make the system resilient to failures
❑ Atomicity and durability
◼ Replication
❑ Mutual consistency
❑ Freshness of copies
❑ Eager vs lazy
❑ Centralized vs distributed
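The eager vs lazy distinction above can be sketched with a toy primary-copy (centralized) scheme. The ReplicatedItem class and its methods are hypothetical; real replication protocols must also handle concurrency and failures, which this sketch ignores.

```python
from typing import List


class ReplicatedItem:
    """One logical data item with a primary copy and secondary copies."""

    def __init__(self, n_secondaries: int, eager: bool) -> None:
        self.eager = eager
        self.primary = 0
        self.secondaries: List[int] = [0] * n_secondaries
        self.pending: List[int] = []  # updates not yet propagated (lazy only)

    def write(self, value: int) -> None:
        self.primary = value
        if self.eager:
            # Eager: all copies are updated before the write returns.
            self.secondaries = [value] * len(self.secondaries)
        else:
            # Lazy: remember the update and propagate it later.
            self.pending.append(value)

    def refresh(self) -> None:
        # Lazy propagation step: apply queued updates in order.
        for value in self.pending:
            self.secondaries = [value] * len(self.secondaries)
        self.pending.clear()


eager_item = ReplicatedItem(2, eager=True)
eager_item.write(7)

lazy_item = ReplicatedItem(2, eager=False)
lazy_item.write(7)
stale = list(lazy_item.secondaries)  # copies lag behind the primary
lazy_item.refresh()
```

With eager propagation the copies are mutually consistent as soon as the write returns; with lazy propagation the secondaries are stale until refresh runs, which is the freshness trade-off.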
◼ Parallel DBMS
❑ Objectives: high scalability and performance
❑ Not geo-distributed
❑ Cluster computing
◼ Distribution
❑ Whether the components of the system are located on the same machine or not
◼ Heterogeneity
❑ Various levels (hardware, communications, operating system)
❑ DBMS heterogeneity is the important one
◼ Data model, query language, transaction management algorithms
◼ Autonomy
❑ Not well understood and most troublesome
❑ Various versions
◼ Design autonomy: Ability of a component DBMS to decide on issues related to its own design.
◼ Communication autonomy: Ability of a component DBMS to decide whether and how to communicate with other DBMSs.
◼ Execution autonomy: Ability of a component DBMS to execute local operations in any manner it wants to.
◼ PaaS – Platform-as-a-Service
◼ SaaS – Software-as-a-Service
◼ DaaS – Database-as-a-Service