
Distributed Systems - Detailed Lecture Notes

1. Introduction to Distributed Systems
A distributed system is a collection of autonomous computing elements (nodes) that appear
to users as a single coherent system.

Key Characteristics:

 Autonomous Nodes: Each node (hardware/software) operates independently.

 No Global Clock: No single time reference → synchronization challenges.

 Independent Failures: Nodes can fail independently → need for fault tolerance.

 Message Passing: Nodes communicate via network messages.

 Shared Resource Access: Users access distributed resources as if they were local.

2. Goals of Distributed Systems


1. Resource Accessibility

 Enables shared access to resources like storage, databases, and services.

 Example: Cloud storage services (Google Drive, Dropbox).

 Challenges: Security risks (e.g., unauthorized access, DDoS attacks).

2. Transparency in Distributed Systems

A well-designed system hides the complexity of distribution.

Type of Transparency | Explanation | Example
Access Transparency | Users should not know how they access a resource (local or remote). | A file appears the same on different computers.
Location Transparency | Users should not know where the resource is physically located. | Google Docs allows access from any location.
Replication Transparency | Users should not know if multiple copies of a resource exist. | Content Delivery Networks (CDNs) store copies of a website worldwide.
Failure Transparency | Users should not notice if a failure occurs. | Automatic failover in cloud services.
Concurrency Transparency | Multiple users can access resources simultaneously without conflicts. | Online banking systems handling multiple transactions.

3. Openness

 A system should allow easy integration with other systems.

 Uses standardized protocols (e.g., HTTP, SOAP, REST).

 Example: Web browsers can access different websites due to HTTP standardization.

4. Scalability

A distributed system should be able to handle growth efficiently.

Types of Scalability

1. Size Scalability: The system supports adding more users/resources.

2. Geographical Scalability: The system operates efficiently across large distances.

3. Administrative Scalability: The system remains manageable as it grows.

3. Scaling Techniques
To make a system scalable, the following techniques are used:

1. Hiding Communication Latency

o Asynchronous communication: Requests are made without waiting for responses.

o Example: Web browsers pre-load content while a user scrolls.

2. Partitioning & Distribution

o Splitting data/workloads into smaller units distributed across nodes.

o Example: The Domain Name System (DNS) partitions the name space across many servers (see the sketch after this list).

3. Replication

o Data Replication: Storing copies of data on multiple nodes.

o Advantages: Reduces latency, improves availability.

o Challenges: Consistency issues (ensuring all copies remain up-to-date).

o Example: Google replicates search indexes across global data centers.
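
To make the partitioning idea concrete, here is a minimal Python sketch (not part of the original notes) of hash-based partitioning: each key is hashed to decide which node stores it. The node names and the modulo scheme are illustrative assumptions.

import hashlib

# Illustrative node names; a real deployment would use actual host addresses.
NODES = ["node-0", "node-1", "node-2"]

def node_for(key: str) -> str:
    # Hash the key and map it onto one of the nodes (simple modulo sharding).
    digest = hashlib.sha256(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

for key in ["alice", "bob", "carol"]:
    print(key, "->", node_for(key))

With plain modulo sharding, adding or removing a node remaps most keys; production systems therefore often use consistent hashing so that only a small fraction of the keys move.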

4. Types of Distributed Systems


1. High-Performance Distributed Computing

o Designed for complex computations.

o Examples: Cluster Computing, Grid Computing, Cloud Computing.

2. Distributed Information Systems

o Used for business applications and enterprise integration.

o Examples: Online banking, airline reservation systems.

3. Pervasive Computing

o Smart devices and IoT systems that integrate into daily life.

o Examples: Smart homes, wearable health monitors.

5. Cluster Computing Systems


 A cluster consists of multiple computers working together.

 Nodes: Computers in the cluster, connected via LAN.

 Master Node: Controls task allocation.


 Worker Nodes: Execute tasks.

Types of Clusters

1. High-Performance Clusters: Used for scientific computations (e.g., weather simulations).

2. High-Availability Clusters: Keep services available despite node failures (automatic failover).

3. Load-Balancing Clusters: Distribute workload evenly.

Example:

Beowulf Cluster → Used for parallel processing.

6. Grid Computing Systems


 Distributed computing across multiple organizations.

 Differences from Clusters:

o High degree of heterogeneity (different systems collaborate).

o Operates over wide-area networks (WAN).

o Uses Virtual Organizations (VOs) → Groups of users sharing resources.

Example:

 SETI@Home → Uses volunteer computers to analyze space signals.

Comparison Table: Grid vs. Cluster Computing


Feature | Cluster Computing | Grid Computing
Architecture | Centralized | Decentralized
Computers | Homogeneous (same OS, hardware) | Heterogeneous (different OS, hardware)
Location | Same physical location (LAN) | Different geographical locations (WAN)
Management | Single administrative control | Multiple organizations collaborate
Communication | Fast (low latency) | Slower (higher latency)
Use Case | Parallel processing, real-time applications | Large-scale computations across different institutions
Examples | Google Search Engine, NASA simulations | SETI@Home, CERN research projects

7. Cloud Computing
Provides on-demand computing services over the Internet.

Key Features:

 Virtualized resources.

 Pay-as-you-go model.

 Scalability.

Cloud Service Models:

Model | Description | Example
IaaS (Infrastructure as a Service) | Provides computing resources like VMs and storage. | AWS EC2, Google Compute Engine
PaaS (Platform as a Service) | Provides development platforms. | Google App Engine, Microsoft Azure
SaaS (Software as a Service) | Provides complete applications. | Gmail, Dropbox

Challenges:

 Security & privacy concerns.

 Dependency on service providers.

8. Distributed Information Systems


Enterprise Application Integration (EAI)

 Connects business applications (e.g., HR, finance).

 Uses middleware for communication.

Middleware Solutions
1. Transaction Processing Monitors → Coordinate distributed database transactions.

2. Remote Procedure Calls (RPCs) → Enable function calls between machines.

3. Message-Oriented Middleware (MOM) → Uses publish-subscribe models for asynchronous communication (a small sketch follows the example below).

Example:

 SAP ERP systems integrate finance, HR, and inventory management.
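
As a rough illustration of the publish-subscribe style behind message-oriented middleware, the following Python sketch keeps everything in memory: subscribers register a queue for a topic, and a publisher drops messages into those queues without waiting for anyone to read them. The Broker class and the "orders" topic are invented for the example; real MOM products add network transport, persistence, and delivery guarantees.

import queue

class Broker:
    """Toy in-memory message broker (illustrative only)."""
    def __init__(self):
        self.subscribers = {}          # topic -> list of subscriber queues

    def subscribe(self, topic):
        q = queue.Queue()
        self.subscribers.setdefault(topic, []).append(q)
        return q

    def publish(self, topic, message):
        # Asynchronous: the publisher does not wait for subscribers to process the message.
        for q in self.subscribers.get(topic, []):
            q.put(message)

broker = Broker()
inbox = broker.subscribe("orders")
broker.publish("orders", {"id": 1, "item": "book"})
print(inbox.get())                     # -> {'id': 1, 'item': 'book'}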

9. Distributed Pervasive Systems


 Smart devices operate without human intervention.

 Examples:

o IoT-based smart cities.

o Electronic health monitoring systems (e.g., wearable sensors).

10. Summary
Concept | Key Points
Distributed System | Collection of independent computers that appear as a single system.
Transparency | Hides complexity (location, access, failure, replication).
Scalability | Handles increased load via replication, partitioning, latency hiding.
Cluster Computing | Parallel computing using homogeneous machines connected via LAN.
Grid Computing | Heterogeneous distributed computing across organizations.
Cloud Computing | Virtualized services available over the internet.
Middleware | Software that integrates distributed applications.
Pervasive Computing | Smart devices interacting seamlessly.


Exam Preparation Tips
 Understand Key Concepts: Focus on transparency, scalability, and replication.

 Memorize Examples: Cluster vs. Grid Computing, Cloud Models, Middleware.

 Review Diagrams: Distributed system architectures, grid vs. cloud.

 Practice Questions: Solve past exam questions on scalability, failure handling.

Distributed Systems Exam Preparation Summary

1. Process in Distributed Systems
A process is a program in execution.

 A program is just a set of instructions, while a process is an active execution of that program.

 One program can create multiple processes.

Components of a Process

1. Program Code (Text Section): Contains the actual program instructions.

2. Program Counter: Keeps track of the next instruction to be executed.

3. Processor Registers: Store temporary data for execution.

4. Stack: Holds function parameters, return addresses, and local variables.

5. Data Section: Stores global variables.

6. Heap: Contains memory that is dynamically allocated at runtime.

2. Thread in Distributed Systems


A thread is the smallest unit of execution within a process.
 Threads belong to a process and share the same memory space.

 A process can have multiple threads, each executing independently.

Thread Components

1. Program Counter: Keeps track of the execution flow.

2. Register Set: Stores temporary computation values.

3. Stack: Each thread has its own stack for function calls.

Thread vs. Process

Feature | Thread | Process
Definition | A lightweight unit of execution within a process | An independent program in execution
Memory | Shares memory with other threads in the same process | Has its own memory space
Execution | Faster, since it does not require memory allocation | Slower due to memory allocation and OS involvement
Communication | Easy (shared memory) | Expensive (Inter-Process Communication needed)
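
The memory row of this table can be demonstrated with a short Python sketch (an illustration added here, not part of the original notes): the same function increments a global counter, once from a thread and once from a child process.

import threading
import multiprocessing

counter = 0

def bump():
    global counter
    counter += 1

if __name__ == "__main__":
    t = threading.Thread(target=bump)
    t.start(); t.join()
    print("after thread:", counter)    # -> 1: the thread shares this process's memory

    p = multiprocessing.Process(target=bump)
    p.start(); p.join()
    print("after process:", counter)   # -> still 1: the child changed only its own copy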

3. Context in Distributed Systems


A context is the set of information needed to execute a process or thread.

Types of Contexts

1. Processor Context:

o Registers needed for instruction execution (e.g., program counter, stack pointer).

2. Thread Context:

o Registers and memory values used for executing thread instructions.


3. Process Context:

o Includes thread context plus memory management unit (MMU) registers.

Observations

 Threads share the same address space within a process.

 Thread switching is cheaper than process switching (no need to involve the OS).

 Process switching is more expensive since it requires memory reallocation and OS intervention.

4. Multi-Threading Benefits
1. Avoid Blocking: A blocked thread does not block the entire process.

2. Parallelism: Takes advantage of multi-core processors for faster execution.

3. Efficient Communication: Inter-thread communication is faster than inter-process communication (IPC).

4. Improved Structure: Applications can be divided into multiple cooperating threads (e.g., a word processor uses separate threads for user input, spell checking, and saving files); see the sketch below.
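
Here is a small Python sketch of benefits 1 and 4 (the autosave function and the two-second delay are made up for illustration): a background thread performs a slow save while the main thread stays responsive.

import threading
import time

def autosave(document):
    time.sleep(2)                      # simulate slow disk or network I/O
    print("saved:", document)

doc = "lecture notes"
threading.Thread(target=autosave, args=(doc,)).start()
print("main thread keeps handling user input while the save runs...")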

5. Thread Implementation Models


Threads can be implemented in three ways:

1. User-Level Threads

 Threads are managed at the application level, without OS involvement.

 Pros:

o Fast thread creation and context switching.

o Efficient execution in user mode.

 Cons:

o If one thread is blocked, the entire process gets blocked.


2. Kernel-Level Threads

 Threads are managed by the OS kernel.

 Pros:

o The OS can schedule another thread when one gets blocked.

o Handles multiple external events efficiently.

 Cons:

o Expensive because every thread operation requires a system call.

3. Lightweight Process (LWP)

 A hybrid model that combines user-level and kernel-level threads.

 Pros:

o Allows efficient scheduling.

 Cons:

o Complex to manage and has been mostly abandoned.

6. Multi-Threaded Servers in Distributed Systems


Why Multi-Threading in Servers?

1. Improves Performance: Faster response time.

2. Hides Network Latency: The server can handle requests while waiting for data.

3. Better Use of Resources: Threads are more efficient than multiple processes.

4. Scalability: Supports multiple clients concurrently.

Dispatcher/Worker Model

 Dispatcher Thread: Accepts client requests.

 Worker Threads: Handle requests in parallel.

Example: A web server uses multiple threads to handle different HTTP requests
simultaneously.
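
The dispatcher/worker model can be sketched with Python's standard socket and threading modules. This is a minimal echo server for illustration; the port number (9000) is an arbitrary assumption, and the dispatcher loop simply hands each accepted connection to a fresh worker thread.

import socket
import threading

def worker(conn, addr):
    # Worker thread: handle a single client request, then close the connection.
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)

def dispatcher(host="localhost", port=9000):
    # Dispatcher loop: accept incoming requests and hand each one to a worker.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((host, port))
        srv.listen()
        while True:
            conn, addr = srv.accept()
            threading.Thread(target=worker, args=(conn, addr), daemon=True).start()

if __name__ == "__main__":
    dispatcher()

A production server would normally reuse a fixed pool of worker threads (for example via concurrent.futures.ThreadPoolExecutor) instead of creating a new thread per request.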
7. Virtualization in Distributed Systems
Virtualization allows the creation of virtual versions of hardware, operating systems, or
network resources.

Why Use Virtualization?

1. Resource Efficiency: Multiple virtual machines (VMs) can run on one physical
machine.

2. Portability: Applications can run on any virtualized environment.

3. Isolation: VMs are independent and do not interfere with each other.

Types of Virtualization

1. Application Virtualization: Runs applications independently from the OS (e.g., Java Virtual Machine).

2. Desktop Virtualization: Allows remote access to desktop environments.

3. Network Virtualization: Creates virtual networks (e.g., VPNs).

4. Server Virtualization: Allows multiple OS instances on a single machine (e.g., VMware, VirtualBox).

5. Storage Virtualization: Combines multiple storage devices into one logical unit.

Virtualization in Cloud Computing

Cloud Service Model | Description | Example
IaaS (Infrastructure-as-a-Service) | Provides virtual machines instead of physical servers | AWS EC2
PaaS (Platform-as-a-Service) | Offers tools for developing applications | Google App Engine
SaaS (Software-as-a-Service) | Provides ready-to-use applications | Google Docs

8. Code Migration in Distributed Systems


Code Migration refers to moving a program or process from one machine to another.

Why Migrate Code?

1. Load Balancing: Move processes from overloaded to underutilized machines.

2. Improved Performance: Reduce network latency by executing code closer to the data.

3. Flexibility: Clients do not need to pre-install all software.

4. Minimized Communication Costs: Instead of sending large data sets, send code to
process data remotely.

Types of Code Migration

1. Weak Mobility: Moves only the code segment (e.g., Java Applets); a minimal sketch follows the examples below.

2. Strong Mobility: Moves code + execution state (e.g., process migration).

Examples of Code Migration

 Sending Client Code to Server: A client sends code to execute database operations
on the server.

 Sending Server Code to Client: A web form validation script is executed on the
client instead of the server.

 Live Migration of Running Processes: Used in cloud computing to migrate virtual machines.
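
The following is a minimal (and deliberately insecure) Python sketch of weak mobility, added here for illustration: only the source text of a function is shipped to the node that holds the data, where it is loaded and started from the beginning. The run_on_node() helper and the word_count task are invented names, not part of any real migration framework.

def run_on_node(code_text, func_name, *args):
    # Simulate a worker node: load the migrated code, then run the named function.
    namespace = {}
    exec(code_text, namespace)           # trusted input only; real systems sandbox this
    return namespace[func_name](*args)   # execution starts from scratch (weak mobility)

# Source text the "client" ships to the node so the work happens near the data.
task_source = """
def word_count(lines):
    return sum(len(line.split()) for line in lines)
"""

if __name__ == "__main__":
    data_on_node = ["distributed systems move code to data", "not data to code"]
    print(run_on_node(task_source, "word_count", data_on_node))   # -> 10

Strong mobility, by contrast, would also have to capture and transfer the execution state, which is what live VM migration does at the machine level.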

9. Inter-Process Communication (IPC) in Distributed Systems
Processes in a distributed system communicate using IPC mechanisms.

Types of IPC

1. Message Passing:

o Remote Procedure Call (RPC)

o Remote Method Invocation (RMI)


2. Shared Memory:

o Processes communicate via a common memory space.
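
As a concrete example of message passing through an RPC, here is a minimal sketch using Python's standard xmlrpc modules; the port number (8000) and the add() procedure are arbitrary choices, and the server runs in a background thread only so the sketch fits in one file.

import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    # The remote procedure that the server exposes to clients.
    return a + b

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the call looks like a local function call, but the arguments are
# marshalled into a request message, sent over the network, executed remotely,
# and the result comes back in a reply message.
proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))   # -> 5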

Common Communication Protocols

 TCP/IP: Reliable data transmission.

 UDP: Fast but does not guarantee delivery.

 HTTP/WebSockets: Used for web-based communication.
