Distributed Computing Series 2 Important Topics
For more notes visit
https://rtpnotes.vercel.app
1. Illustrate the Ricart–Agrawala Algorithm for achieving mutual exclusion
How It Works (Step-by-Step)
🟢 Step 1: Asking for Permission
🟠 Step 2: Receiving a Request
🟡 Step 3: Using the Resource
🔴 Step 4: Releasing the Resource
2. Explain how a wait-for graph can be used in deadlock detection
What is a Wait-For Graph (WFG)?
How is WFG Used for Deadlock Detection?
Example: Detecting Deadlock Using WFG
Scenario:
WFG Representation:
The Ricart–Agrawala Algorithm helps computers take turns fairly and efficiently. Here's how
it works:
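The steps can be sketched in a minimal single-machine simulation. This is only an illustrative sketch: the process IDs and the `registry` dictionary (standing in for real message passing) are assumptions, not part of the notes.

```python
# Minimal single-machine sketch of the Ricart–Agrawala algorithm.
# A plain dict (`registry`) stands in for real network messages.

class Process:
    def __init__(self, pid, all_pids):
        self.pid = pid
        self.peers = [p for p in all_pids if p != pid]
        self.clock = 0                 # Lamport logical clock
        self.requesting = False
        self.request_ts = None         # (clock, pid) timestamp of our request
        self.replies = set()           # peers that granted permission
        self.deferred = []             # requests to answer after leaving the CS

    def request_cs(self, registry):
        """Step 1: timestamp the request and ask every other process."""
        self.clock += 1
        self.requesting = True
        self.request_ts = (self.clock, self.pid)
        self.replies.clear()
        for peer in self.peers:
            registry[peer].on_request(self.request_ts, self.pid, registry)

    def on_request(self, ts, sender, registry):
        """Step 2: reply immediately unless our own request is older."""
        self.clock = max(self.clock, ts[0]) + 1
        if self.requesting and self.request_ts < ts:
            self.deferred.append(sender)   # our request has priority; defer
        else:
            registry[sender].on_reply(self.pid)

    def on_reply(self, sender):
        self.replies.add(sender)

    def can_enter_cs(self):
        """Step 3: enter only after every peer has replied."""
        return self.requesting and self.replies == set(self.peers)

    def release_cs(self, registry):
        """Step 4: leave the CS and answer all deferred requests."""
        self.requesting = False
        for sender in self.deferred:
            registry[sender].on_reply(self.pid)
        self.deferred.clear()

procs = {p: Process(p, ["P1", "P2", "P3"]) for p in ["P1", "P2", "P3"]}
procs["P1"].request_cs(procs)   # P1 asks first, so it gets the lower timestamp
procs["P2"].request_cs(procs)   # P2 asks while P1 still holds priority
```

After both requests, P1 can enter the critical section and P2 is deferred; once P1 releases, the deferred reply lets P2 in.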
A Wait-For Graph (WFG) is a visual way to check for deadlocks in a system. It is a directed
graph where each node represents a process, and an edge from P1 to P2 means that P1 is
waiting for a resource held by P2.
Scenario:
WFG Representation:
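As an illustrative stand-in for the WFG representation, the graph can be encoded as an adjacency dict and checked for a cycle with a depth-first search (the process names below are assumptions, not from the notes):

```python
# Deadlock detection on a wait-for graph (WFG).
# Nodes are processes; an edge P1 -> P2 means "P1 waits for P2".
# A cycle in the graph signals a deadlock.

def has_deadlock(wfg):
    """Return True if the WFG contains a cycle (DFS with 3 colours)."""
    WHITE, GRAY, BLACK = 0, 1, 2       # unvisited, on DFS stack, finished
    color = {node: WHITE for node in wfg}

    def visit(node):
        color[node] = GRAY
        for nxt in wfg.get(node, []):
            if color.get(nxt, WHITE) == GRAY:    # back edge -> cycle
                return True
            if color.get(nxt, WHITE) == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in wfg)

# P1 waits for P2, P2 waits for P3, P3 waits for P1: a deadlock cycle.
deadlocked = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}
# Same chain, but P3 waits for nothing: no cycle, no deadlock.
healthy = {"P1": ["P2"], "P2": ["P3"], "P3": []}
```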
Checkpointing and Rollback Recovery are how distributed systems handle failures without
losing all of their work.
What is Checkpointing?
Checkpointing is the process of saving the state of a process at a certain point so that if a
failure occurs, it can restart from that point instead of starting over.
1. Detect Failure 🚨
The system notices that a process has failed.
2. Restore Checkpoint 🔄
The system reloads the last saved checkpoint for that process.
3. Re-execute ⏩
The process continues from where it left off, avoiding major data loss.
🛠 Example:
If an airline booking system crashes after booking 50 tickets, it restores the last saved
checkpoint (e.g., after booking 40 tickets) and reprocesses the last 10 tickets.
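The booking example can be sketched as a toy checkpoint/rollback loop. The `BookingSystem` class below is an illustrative assumption, not a real system:

```python
# Toy sketch of checkpointing and rollback recovery,
# mirroring the airline-booking example above.

import copy

class BookingSystem:
    def __init__(self):
        self.bookings = []
        self.checkpoint = None

    def book(self, ticket):
        self.bookings.append(ticket)

    def save_checkpoint(self):
        """Save a deep copy of the current state."""
        self.checkpoint = copy.deepcopy(self.bookings)

    def rollback(self):
        """Restore the last saved checkpoint after a failure."""
        if self.checkpoint is not None:
            self.bookings = copy.deepcopy(self.checkpoint)

system = BookingSystem()
for t in range(40):
    system.book(t)
system.save_checkpoint()          # checkpoint taken after 40 bookings
for t in range(40, 50):
    system.book(t)                # 10 more bookings... then a crash
system.rollback()                 # restore the last checkpoint (40 bookings)
for t in range(40, 50):
    system.book(t)                # re-execute the last 10 bookings
```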
🛠 Example:
If process P1 rolls back, but it has sent data to P2, then P2 must also roll back, and so on.
This cascading rollback is known as the domino effect, and it can cause the system to
restart from a very old state.
🛠 Example:
An online shopping system saves a checkpoint every hour across all servers. If one server
crashes, it restores all servers to the last saved state to maintain consistency.
🛠 Example:
A cloud storage service automatically saves checkpoints when many files are being
uploaded, ensuring smooth recovery if a failure occurs.
Types of Messages:
🛠 Example: Imagine you and your friend are writing in a shared online document, but updates
take 5 seconds to appear. It would be frustrating!
🛠 Example: Buying a ready-made suit vs. getting a tailor-made suit. DSM is like the ready-made suit—it works, but not always a perfect fit.
🔹 Example: Imagine a group of friends voting on where to eat. Everyone must answer within
10 seconds, or they are considered unavailable.
Consensus in the Presence of Crash Failures
In a synchronous system, crash failures happen when some processes stop working but
don’t send incorrect data. The goal of the consensus algorithm is to let the remaining
processes agree on a decision, even if some fail.
1. Each process proposes a value (e.g., "Let's eat pizza" or "Let's eat burgers").
2. Processes exchange values with each other within a fixed time.
3. If a process crashes, it stops responding, but the remaining processes continue.
4. Majority rule: If more than half of the processes agree on a value, that value is chosen.
5. Final Decision: All non-crashed processes adopt the agreed-upon value.
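The majority-rule decision step above can be sketched as a simple tally. This is only the final step: real crash-tolerant consensus in a synchronous system needs f+1 rounds of value exchange, which this sketch omits; the process names and proposals are illustrative.

```python
# Sketch of the majority-rule decision among non-crashed processes.

from collections import Counter

def decide(proposals, crashed):
    """Return the majority value among surviving processes, or None."""
    alive = {p: v for p, v in proposals.items() if p not in crashed}
    if not alive:
        return None
    value, votes = Counter(alive.values()).most_common(1)[0]
    # Majority rule: the value must be backed by more than half
    # of the processes that did not crash.
    return value if votes > len(alive) / 2 else None

proposals = {"P1": "pizza", "P2": "pizza", "P3": "burgers", "P4": "pizza"}
decision = decide(proposals, crashed={"P3"})   # P3 crashed mid-round
```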
When a user opens a file, the entire file is downloaded to their computer.
This file is then cached (stored temporarily) so future access is faster.
Even if the computer restarts, cached files remain available.
Whole-File Serving
AFS Components
Vice (Server-Side)
Venus (Client-Side)
Master Server
Chunk Servers
Clients
Advantages of DSM
✔ All computers see the same memory instead of handling multiple copies.
✔ No need to move data back and forth between systems.
🛠 Example: Instead of one cashier handling all customers, multiple cashiers serve
different people at once.
✔ DSM combines memory from multiple computers into one large memory space.
✔ Programs can run as if they have huge memory available.
🛠 Example: Instead of one water tank, DSM connects multiple tanks to store more water.
7. Portable Across Different Systems
🛠 Example: Just like Google Docs works on Windows, Mac, and Mobile, DSM works
across different computers without changes.
Instead of asking all sites for permission (like in Lamport or Ricart-Agrawala algorithms), a
process only asks a subset of sites called a quorum.
Quorums are designed to overlap, so at least one site knows about both requests and
ensures only one process enters the CS at a time.
A site locks its quorum members before executing the CS.
It must receive a RELEASE message before granting permission to another process.
🔹 Example:
Imagine you need approval from a group of teachers to submit an assignment. Instead of
asking all teachers, you ask only a small group, ensuring at least one teacher is in multiple
groups to avoid conflicts.
✅ Reduces Message Complexity – Instead of contacting all sites, only a subset is involved.
✅ Faster Execution – Since fewer messages are exchanged, CS is accessed quicker.
✅ Scalable – Works better in large distributed systems than Lamport's or the Ricart–Agrawala algorithm.
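A small sketch of the quorum idea: three hand-picked overlapping quorums, sites that grant one lock at a time, and a process that enters the CS only after locking its whole quorum. All names are illustrative; real schemes such as Maekawa's algorithm construct the quorums systematically.

```python
# Quorum-based mutual exclusion sketch: each process asks only its quorum,
# and every pair of quorums shares at least one site.

quorums = {
    "P1": {"S1", "S2"},
    "P2": {"S2", "S3"},
    "P3": {"S1", "S3"},
}

def quorums_overlap(quorums):
    """Every pair of quorums must share at least one site."""
    names = list(quorums)
    return all(
        quorums[a] & quorums[b]
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )

class Site:
    """A site locks itself for one process at a time."""
    def __init__(self):
        self.locked_by = None

    def request(self, pid):
        if self.locked_by is None:
            self.locked_by = pid
            return True
        return False               # already locked; requester must wait

    def release(self, pid):
        if self.locked_by == pid:
            self.locked_by = None

sites = {s: Site() for s in ("S1", "S2", "S3")}

def try_enter_cs(pid):
    """Enter the CS only after locking the entire quorum."""
    granted = [s for s in quorums[pid] if sites[s].request(pid)]
    if len(granted) == len(quorums[pid]):
        return True
    for s in granted:              # failed: give back partial locks
        sites[s].release(pid)
    return False
```

Because the quorums overlap, at least one site (here S2) sees both P1's and P2's requests and blocks the second one until the first releases.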
11. Explain different file system requirements
A distributed file system (DFS) allows users to access files from multiple computers as if they
were stored locally. To ensure efficiency and usability, the system must meet certain
requirements.
1. Access Transparency
Users and programs should not need to know whether a file is stored locally or remotely.
The same file operations should work for both.
Example: Accessing a file on Google Drive should feel the same as opening a file stored on
a personal computer.
2. Location Transparency
Files can be moved between servers, but their path remains unchanged.
Users do not need to know where the file is physically stored.
Example: A video on a streaming platform may move to different data centers, but users
can still access it using the same link.
3. Mobility Transparency
Files can be moved without requiring changes in client applications or system settings.
Ensures that files remain accessible even when they are relocated.
Example: A company might move employee files from one server to another without
employees noticing any change.
4. Performance Transparency
The system should maintain stable performance even when the load on servers varies.
Example: A cloud storage service should provide smooth access to files even when many
users are active.
5. Scaling Transparency
The system should be able to expand and handle an increasing number of users and data
without major changes.
Example: A cloud-based file service should function efficiently whether it serves ten users
or a million users.
7. File Replication
The system should maintain multiple copies of files across different locations.
Helps in load balancing and fault tolerance.
Example: A distributed file system storing data on multiple servers ensures availability even
if one server fails.
8. Hardware and Operating System Heterogeneity
The system should work across different operating systems and hardware.
Example: A file service should be accessible from Windows, Linux, and macOS without
compatibility issues.
9. Fault Tolerance
The system should continue functioning even if some servers or clients fail.
Example: If one storage server crashes, another should take over automatically to prevent
data loss.
10. Consistency
When files are updated, all copies should reflect the latest changes.
There might be delays in propagating updates across different sites.
Example: When an email is deleted from one device, it should also disappear from all other
devices.
11. Security
The system should protect data using authentication, access control, and encryption.
Example: Only authorized users should be able to access confidential company files, and
data should be encrypted to prevent unauthorized access.
12. Efficiency
The system should provide high performance comparable to traditional file systems.
Example: Opening and saving files in a distributed system should be as fast as working
with local files.
1. Single-Resource Model
Deadlock Detection:
In a Wait-For Graph (WFG), each node can have at most one outgoing edge.
If a cycle is present, a deadlock has occurred.
Example:
2. AND Model
A process requests multiple resources and needs all of them before it can proceed.
Deadlock Detection:
A cycle in the WFG is a sufficient condition for deadlock in this model.
Example:
3. OR Model
A process requests multiple resources, but it only needs any one of them to proceed.
If at least one resource is granted, the process continues execution.
Deadlock Detection:
A cycle in the WFG does not always mean a deadlock because a process may still
proceed if one of the requested resources is available.
Example:
4. AND-OR Model
A process may combine AND and OR conditions in a single request (e.g. it needs a AND (b
OR c)).
Example:
A cloud server needs any 2 out of 5 available CPU cores to process a request.
If at least 2 cores are available, execution proceeds.
If fewer than 2 cores are available, the process waits.
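The 2-out-of-5 example reduces to a simple count, sketched below; the list of core states is an illustrative assumption:

```python
# "Any k out of n" request check, as in the cloud-server example above.

def can_proceed(available, needed_any_k):
    """Proceed if at least k of the requested resources are free."""
    free = sum(1 for is_free in available if is_free)
    return free >= needed_any_k

# 2 of the 5 CPU cores are currently free.
cores = [True, False, True, False, False]
```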
1. In-Transit Messages
Messages that have been sent but not yet received by the destination process.
These messages do not cause inconsistency because they will eventually be delivered.
Example:
Process A sends a message to Process B, but before B receives it, A crashes.
When A recovers, the message is still in transit and will be delivered later.
2. Lost Messages
Messages that were sent but were never received due to a rollback.
The sender does not roll back, but the receiver rolls back to a state before the message
was received.
Example:
Process A sends a message to Process B.
Process B rolls back to an earlier state before it received the message.
The message is lost because B no longer remembers receiving it.
3. Delayed Messages
Messages that were sent, but their reception was not recorded because:
The receiver was down when the message arrived.
The receiver rolled back before processing the message.
Example:
Process A sends a message to Process B.
Before B receives it, B crashes and rolls back.
The message is now delayed and might be received later or lost.
4. Orphan Messages
Messages where the receive event is recorded, but the send event is not recorded.
This happens when a rollback undoes the send event, but the receive event remains.
Example:
5. Duplicate Messages
Messages that are delivered more than once, typically because a rollback causes the
sender to resend a message the receiver had already processed.
1. Consistent State
For every message whose receive event is recorded, the corresponding send event is also
recorded. Since both the send and receive events are recorded, this state is consistent.
2. Inconsistent State
A process shows that it received a message, but the corresponding send event is
missing.
This situation is impossible in a correct failure-free execution.
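The consistent/inconsistent distinction boils down to a set check on recorded events: a state is consistent exactly when every recorded receive has a matching recorded send. The message IDs below are illustrative:

```python
# Checking a recorded global state for orphan messages.

def is_consistent(sent, received):
    """Consistent iff every recorded receive has a recorded send."""
    return received <= sent          # set inclusion

# m1 sent and received; m3 sent but not yet received
# (an in-transit message) -- still consistent.
ok = is_consistent(sent={"m1", "m3"}, received={"m1"})

# m2's receive is recorded but its send was rolled back:
# an orphan message, so the state is inconsistent.
orphaned = not is_consistent(sent={"m1"}, received={"m1", "m2"})
```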