PDC 1.1

Distributed computing involves software components distributed across multiple computers, functioning as a cohesive system to enhance scalability, performance, and cost-effectiveness. It encompasses various architectures, including client-server and peer-to-peer models, and emphasizes the importance of security through robust authentication and authorization mechanisms. The document details the principles, architectures, and security challenges associated with distributed systems, highlighting the need for effective data protection and secure communication.

What Is Distributed Computing?


Distributed computing is a system of software components spread
over different computers but running as a single entity. A
distributed system can be an arrangement of different
configurations, such as mainframes, computers, workstations, and
minicomputers.

Distributed System
Sharing resources such as hardware, software, and data is one of the
principles of distributed computing. With different levels of openness and
concurrency in the software, data can be processed simultaneously
across multiple processors. The more fault-tolerant an application is,
the more quickly it can recover from a system failure.

Organizations have turned to distributed computing systems to handle
the explosion in data generation and growing application performance
needs. These distributed systems help businesses scale as data volume
grows. This is especially true because adding hardware to a distributed
system is simpler than upgrading and replacing an entire centralized
system made up of powerful servers.

Distributed systems consist of many nodes that work together toward a
single goal. These systems function in two general ways, and both of
them have the potential to make a huge difference in an organization.

• The first type is a cohesive system in which the organization
controls each machine, and results are routed from a single source.
• The second type allows each node to have an end-user with
their own needs, and the distributed system facilitates sharing
resources or communication.
Benefits of a multi-computer model
• Improved scalability: Distributed computing clusters are a
great way to scale your business. They use a ‘scale-out
architecture,’ which makes adding new hardware easier as load
increases.
• Enhanced performance: This model uses ‘parallelism’ for the
divide-and-conquer approach. In other words, all computers
in the cluster simultaneously handle a subset of the overall
task. Therefore, as the load increases, businesses can add more
computers and optimize overall performance.
• Cost-effectiveness: The cost-efficiency of a distributed
system depends on its latency, response time, bandwidth, and
throughput. Distributed systems work toward a common goal
of delivering high performance by minimizing latency and
improving response time and throughput. They achieve this
goal using low-cost commodity hardware, which keeps initial
deployments and cluster expansions inexpensive while
guarding against data loss.

Architecture of Distributed Systems

Cloud-based software, the backbone of distributed systems, is
a complicated network of servers that anyone with an internet
connection can access. In a distributed system, components and
connectors are arranged in a way that eases communication.
Components are modules with well-defined interfaces that can be
replaced or reused. Similarly, connectors are communication links
between modules that mediate coordination or cooperation among
components.

A distributed system is broadly divided into two essential concepts:
software architecture (further divided into layered architecture,
object-based architecture, data-centered architecture, and event-based
architecture) and system architecture (further divided into client-server
architecture and peer-to-peer architecture).

Let’s understand each of these architectures in detail:

1. Software architecture

Software architecture is the logical organization of software components
and their interaction with other structures. It is at a lower level than
system architecture and focuses entirely on components; e.g., the web
front end of an ecommerce system is a component. The four main
architectural styles of software components in distributed systems are:

i) Layered architecture

Layered architecture provides a modular approach to software. By
separating each component, the design becomes more efficient. For
example, the open systems interconnection (OSI) model uses a layered
architecture, contacting layers in sequence until a request is fulfilled.
In some instances, layered architecture is implemented with cross-layer
coordination, where an interaction can skip adjacent layers to fulfill
the request and deliver better performance.
Layered Architecture

Layered architecture is a style of software that separates components into
units. A request goes from the top down, and the response goes from the
bottom up. The advantage of layered architecture is that it keeps things
orderly, and each layer can be modified independently without affecting
the rest of the system.
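As a sketch of this flow, here is a toy three-layer system in Python; the layer names are illustrative, not from any particular framework. The request travels down through the layers, and the response bubbles back up.

```python
# Hypothetical three-layer sketch: a request travels top-down,
# the response travels bottom-up.

class Layer:
    def __init__(self, name, lower=None):
        self.name = name
        self.lower = lower          # the adjacent layer beneath this one

    def handle(self, request):
        request.append(f"{self.name}:down")
        # Delegate to the layer below, or turn around at the bottom.
        result = self.lower.handle(request) if self.lower else request
        result.append(f"{self.name}:up")
        return result

data_layer = Layer("data")
logic_layer = Layer("logic", lower=data_layer)
ui_layer = Layer("ui", lower=logic_layer)

trace = ui_layer.handle([])
print(trace)
# The request passes ui -> logic -> data, then the
# response returns data -> logic -> ui.
```

Because each layer only knows about the one directly beneath it, any layer can be swapped out without touching the rest of the stack.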

ii) Object-based architecture

Object-based architecture centers around an arrangement of loosely
coupled objects with no specific architecture like layers. Unlike layered
architecture, object-based architecture doesn’t have to follow any steps
in a sequence. Each component is an object, and all the objects can
interact through an interface (or connector). Under object-based
architecture, such interactions between components can happen through
a direct method call.

Object-based Architecture

At its core, communication between objects happens through method
invocations, often called remote procedure calls (RPC). Popular RPC
systems include Java RMI, web services, and REST API calls. The
primary design consideration of these architectures is that they are less
structured: here, component equals object, and connector equals RPC or
RMI.
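A minimal sketch of the style in Python, with hypothetical names: the proxy plays the role of the connector, forwarding method calls the way an RPC stub would forward them over a network (here the "remote" object is simply local).

```python
# Object-based style sketch: components are objects, the connector
# is a method call routed through a proxy, like an RPC stub.

class InventoryService:            # the "remote" object (server side)
    def stock(self, item):
        return {"widget": 42}.get(item, 0)

class RpcProxy:                    # the client-side stub/connector
    def __init__(self, target):
        self.target = target
    def __getattr__(self, method):
        def call(*args):
            # A real RPC stub would serialize the call and send it
            # over the network; here we invoke the object directly.
            return getattr(self.target, method)(*args)
        return call

inventory = RpcProxy(InventoryService())
result = inventory.stock("widget")
print(result)                      # -> 42
```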

iii) Data-centered architecture

Data-centered architecture works on a central data repository, either
active or passive. Like most producer-consumer scenarios, the producer
(business) writes items to the common data store, and the consumer
(individual) can request data from it. Sometimes, this central repository
can be just a simple database.
In a data-centered system, all communication between objects happens
through the data storage system. The components share a persistent
store, such as a SQL database, and the system keeps all its shared
state in this data storage.
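The producer-consumer pattern above can be sketched with an in-memory SQLite database standing in for the central repository; the table and item names are invented for illustration.

```python
# Producer and consumer never talk to each other directly:
# all communication flows through the shared data store.
import sqlite3

store = sqlite3.connect(":memory:")
store.execute("CREATE TABLE items (name TEXT, qty INTEGER)")

def producer(name, qty):           # writes into the common store
    store.execute("INSERT INTO items VALUES (?, ?)", (name, qty))

def consumer(name):                # reads from the common store
    row = store.execute("SELECT qty FROM items WHERE name = ?",
                        (name,)).fetchone()
    return row[0] if row else None

producer("widget", 5)
print(consumer("widget"))          # -> 5
```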

iv) Event-based architecture

In event-based architecture, the entire communication happens through
events. When an event occurs, the system notifies every component that
has registered for it, giving each one access to the information.
Sometimes these events carry data, and at other times they carry URLs
to resources. The receiver can then process the information it receives
and act accordingly.
Event-Based Architecture

One significant advantage of event-based architecture is that the
components are loosely coupled, which means it’s easy to add, remove,
and modify them. To better understand this, think of publisher-subscriber
systems, enterprise service buses, or akka.io. Another advantage of
event-based architecture is that it allows heterogeneous components to
communicate with the bus, regardless of their communication protocols.
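A toy publish-subscribe bus in Python makes the loose coupling concrete; the event name and payload are invented for illustration. Publisher and subscribers know only the bus, never each other.

```python
# Minimal event bus: components subscribe to an event type and
# are notified when it is published.
from collections import defaultdict

class EventBus:
    def __init__(self):
        self.subscribers = defaultdict(list)
    def subscribe(self, event_type, handler):
        self.subscribers[event_type].append(handler)
    def publish(self, event_type, data):
        for handler in self.subscribers[event_type]:
            handler(data)          # notify everyone who registered

received = []
bus = EventBus()
bus.subscribe("order_placed", lambda data: received.append(data))
bus.publish("order_placed", {"id": 1})
print(received)                    # -> [{'id': 1}]
```

Adding or removing a subscriber requires no change to the publisher, which is exactly the property the text describes.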

2. System architecture
System-level architecture focuses on the entire system and the
placement of components of a distributed system across multiple
machines. The client-server architecture and peer-to-peer architecture
are the two major system-level architectures that hold significance today.
An example would be an ecommerce system that contains a service layer,
a database, and a web front end.

i) Client-server architecture
As the name suggests, client-server architecture consists of a client and
a server. The server is where all the work processes run, while the client
is where the user interacts with the service and other resources (the
remote server). The client can then make requests of the server, and the
server responds accordingly. Typically, only one server handles the
remote side; however, using multiple servers improves reliability.

Client-server Architecture

Client-server architecture has one standard design feature: centralized
security. Data such as usernames and passwords are stored in a secure
database, and the server controls which users have access to this
information. This makes it more stable and secure than peer-to-peer
architecture, because the central security database can govern resource
usage in a meaningful way. The trade-offs are that the system isn’t as
fast as a peer-to-peer network, the server is a single point of failure,
and the architecture is not as scalable as peer-to-peer.
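A minimal request-response exchange using Python's standard library sketches the model: the server waits for a request and responds, and the client connects and asks. This is a bare TCP sketch over the loopback interface, not a production server.

```python
# Client-server sketch: one server thread, one client, over TCP.
import socket
import threading

def server(listener):
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024)           # wait for the client's request
        conn.sendall(b"hello " + request)   # respond accordingly

listener = socket.socket()
listener.bind(("127.0.0.1", 0))             # the OS picks a free port
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=server, args=(listener,)).start()

client = socket.socket()
client.connect(("127.0.0.1", port))
client.sendall(b"world")                    # the client's request
response = client.recv(1024)
print(response)                             # -> b'hello world'
client.close()
listener.close()
```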
ii) Peer-to-peer (P2P) architecture

A peer-to-peer (P2P) network works on the concept of no central control
in a distributed system. A node can act as either a client or a server at
any given time once it joins the network. A node that requests something
is called a client, and one that provides something is called a server. In
general, each node is called a peer.

Peer-to-Peer Architecture

If a new node wishes to provide services, it can do so in two ways. One
way is to register with a centralized lookup server, which will then direct
the node to the service provider. The other way is for the node to
broadcast its service request to every other node in the network, and
whichever node responds will provide the requested service.

P2P networks today fall into three categories:

• Structured P2P: The nodes in structured P2P follow a
predefined distributed data structure.
• Unstructured P2P: The nodes in unstructured P2P randomly
select their neighbors.
• Hybrid P2P: In a hybrid P2P, some nodes have unique
functions appointed to them in an orderly manner.
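The first discovery method, registering with a centralized lookup server, can be sketched as follows; the service and peer names are hypothetical.

```python
# Lookup-server sketch: peers register the services they provide,
# and a requesting peer asks the lookup server who provides one.

class LookupServer:
    def __init__(self):
        self.registry = {}                 # service name -> peer id
    def register(self, service, peer):
        self.registry[service] = peer
    def locate(self, service):
        return self.registry.get(service)

lookup = LookupServer()
lookup.register("file-share", "peer-7")    # a peer offers a service
print(lookup.locate("file-share"))         # -> peer-7
```

In the broadcast alternative, there is no registry at all: the requesting node asks every peer directly, trading lookup-server simplicity for network traffic.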
Security in Distributed System

Securing distributed systems is crucial for ensuring data integrity,
confidentiality, and availability across interconnected networks. Key
measures include implementing strong authentication mechanisms, like
multi-factor authentication (MFA), and robust authorization controls such
as role-based access control (RBAC). Encryption ensures data protection
during transmission and storage. Continuous monitoring and auditing help
detect and respond to security threats promptly.

Goals of Distributed System Security

Security in a distributed system poses unique challenges that must be
considered when designing and implementing systems. A compromised
computer or network may not be the only location where data is at risk;
other systems or segments may also become infected with malicious code.
Because these threats can occur anywhere, even across sparsely connected
networks, new research has been produced to help determine how well
distributed security architectures actually perform.
In the past, security was typically handled on an end-to-end basis. All the
work involved in ensuring safety occurred “within” a single system and was
controlled by one or two administrators. The rise of distributed systems has
created a new ecosystem that brings with it unique challenges to
security. Distributed systems are made up of multiple nodes working
together to achieve a common goal; these nodes are usually called peers.

Authentication Mechanisms in Distributed System


Authentication mechanisms in distributed systems ensure that users and
services are who they claim to be before granting access to resources. Here
are the key authentication mechanisms:
• Password-based Authentication: Users authenticate with a
username and password. Commonly used but vulnerable to
password breaches and phishing attacks.
• Multi-factor Authentication (MFA): Requires users to provide
two or more authentication factors (e.g., password + OTP,
fingerprint). Enhances security by adding an extra layer of
verification.
• Token-based Authentication: Uses tokens (e.g., JWT, OAuth
tokens) for authentication. Tokens are generated by an
authentication server and validated by services.
• Biometric Authentication: Uses unique biological traits (e.g.,
fingerprints, facial recognition) for authentication. Provides strong
authentication but may require specialized hardware.
• Certificate-based Authentication: Uses digital certificates to
authenticate clients and servers. Certificates are issued by trusted
Certificate Authorities (CAs) and verify identity.
• Single Sign-On (SSO): Allows users to authenticate once and gain
access to multiple systems. Improves user experience and reduces
password fatigue.
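As a sketch of token-based authentication, here is a simplified JWT-style token in Python: the authentication server signs a payload with a shared secret using HMAC, and any service holding the secret can validate the token. Real systems should use a vetted library and include expiry claims; the secret and usernames here are invented for illustration.

```python
# Simplified signed token: base64 payload + HMAC-SHA256 signature.
import base64, hashlib, hmac

SECRET = b"server-secret"                  # illustrative shared key

def issue_token(username):
    payload = base64.urlsafe_b64encode(username.encode()).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + sig

def validate_token(token):
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None                        # forged or tampered token
    return base64.urlsafe_b64decode(payload).decode()

token = issue_token("alice")
print(validate_token(token))               # -> alice

# Swapping in a different payload while keeping the old signature fails:
forged = (base64.urlsafe_b64encode(b"mallory").decode()
          + "." + token.rsplit(".", 1)[1])
print(validate_token(forged))              # -> None
```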

Authorization Controls in Distributed System


Authorization controls in distributed systems dictate what actions users and
services are permitted to perform on resources. Here are the key
authorization controls:
• Role-Based Access Control (RBAC): Assigns permissions based
on roles (e.g., admin, user, manager). Simplifies access
management by grouping users with similar responsibilities.
• Attribute-Based Access Control (ABAC): Grants access based on
attributes (e.g., user properties, resource properties,
environmental conditions). Provides fine-grained access control
tailored to specific contexts.
• Access Control Lists (ACLs): Lists of permissions associated with
users or groups. Applied directly to resources to specify who can
access them and what actions they can perform.
• Policy-Based Access Control: Uses policies to define access rules
based on conditions and attributes. Offers flexibility to adapt access
controls dynamically based on changing conditions.
• Mandatory Access Control (MAC): Assigns labels (e.g., sensitivity
levels) to resources and subjects. Access decisions are based on
predefined security policies set by administrators.
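A minimal RBAC check can be sketched in a few lines of Python; the roles, users, and permissions here are illustrative. The access decision walks user to roles to permissions, which is what makes grouping users by role simpler than listing permissions per user.

```python
# RBAC sketch: permissions attach to roles, users get roles.
ROLE_PERMISSIONS = {
    "admin":   {"read", "write", "delete"},
    "manager": {"read", "write"},
    "user":    {"read"},
}
USER_ROLES = {"alice": ["admin"], "bob": ["user"]}

def can(user, permission):
    # Grant access if any of the user's roles carries the permission.
    return any(permission in ROLE_PERMISSIONS[role]
               for role in USER_ROLES.get(user, []))

print(can("alice", "delete"))   # -> True
print(can("bob", "write"))      # -> False
```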

Data Protection and Encryption in Distributed System


Data protection and encryption in distributed systems are essential for
ensuring confidentiality, integrity, and availability of sensitive information
across interconnected networks. Here are the key aspects:
1. Importance of Data Protection:
• Safeguards sensitive data from unauthorized access,
modification, or theft.
• Ensures compliance with privacy regulations (e.g., GDPR,
HIPAA).
2. Encryption Basics:
• Converts plaintext data into ciphertext using algorithms
(e.g., AES, RSA).
• Ensures data confidentiality during transmission and
storage.
3. Transport Layer Security (TLS) / Secure Socket Layer (SSL):
• Protocols that encrypt data exchanged between clients
and servers.
• Establish secure communication channels to prevent
eavesdropping and tampering.
4. End-to-End Encryption (E2EE):
• Encrypts data at the source and decrypts it only at the
destination.
• Protects data from interception or surveillance during
transmission.
5. Data Masking and Tokenization:
• Techniques to obscure sensitive data (e.g., masking credit
card numbers, tokenizing data).
• Reduces exposure of sensitive information in non-
production environments.
6. Key Management:
• Manages encryption keys used to encrypt and decrypt
data.
• Ensures secure storage, rotation, and distribution of keys
to authorized entities.
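As a small key-management sketch using only the standard library, PBKDF2 derives an encryption key from a password and a random salt, so the raw password is never stored or used directly as a key. The iteration count and salt size here are illustrative choices, not a recommendation.

```python
# Key derivation with PBKDF2-HMAC-SHA256 from the standard library.
import hashlib
import os

password = b"correct horse battery staple"   # illustrative password
salt = os.urandom(16)                        # random per-user salt

key = hashlib.pbkdf2_hmac("sha256", password, salt, 200_000)
print(len(key))                              # -> 32 (a 256-bit key)

# The same password and salt reproduce the same key, which is how
# the key is later re-derived for decryption.
assert key == hashlib.pbkdf2_hmac("sha256", password, salt, 200_000)
```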

Secure Communication in Distributed System


Secure communication in distributed systems is essential for maintaining
the confidentiality, integrity, and authenticity of data exchanged between
various components across a network. It ensures that data is protected from
eavesdropping, tampering, and impersonation, thereby preserving the
privacy and security of sensitive information.
Key Aspects of Secure Communication:
1. Encryption Protocols:
• Transport Layer Security (TLS)/Secure Socket Layer
(SSL): TLS and SSL are cryptographic protocols designed
to provide secure communication over a computer
network.
• TLS, the successor to SSL, encrypts the data transmitted
between a client and a server, ensuring that it cannot be
intercepted or read by unauthorized parties.
• This encryption is achieved using a combination of
symmetric and asymmetric encryption, with certificates
issued by trusted Certificate Authorities (CAs) to
authenticate the server’s identity.
2. End-to-End Encryption (E2EE):
• End-to-end encryption ensures that data is encrypted on
the sender’s side and only decrypted on the receiver’s
side, preventing intermediaries from accessing the
plaintext data.
• This method is crucial for protecting data in transit,
especially in messaging applications and VoIP services,
where data security is paramount.
3. Secure Channels and Protocols:
• Internet Protocol Security (IPsec): IPsec is a protocol
suite that provides secure IP communications by
authenticating and encrypting each IP packet of a
communication session.
• It operates at the network layer, ensuring that all traffic
between two endpoints is secure, regardless of the
application or protocol being used.
4. Authentication Mechanisms:
• Public Key Infrastructure (PKI): PKI supports secure
communication by providing a framework for managing
digital certificates and public-key encryption. It ensures
that the parties involved in a communication session are
authenticated and that the integrity of the data is
maintained.
• HMAC (Hash-Based Message Authentication
Code): HMAC provides a way to verify the integrity and
authenticity of a message. It combines a cryptographic
hash function with a secret key, ensuring that the message
has not been altered in transit and confirming the identity
of the sender.
5. Transport Layer Security Enhancements:
• Perfect Forward Secrecy (PFS): PFS ensures that
session keys are not compromised even if the server’s
long-term keys are exposed. By generating unique session
keys for each communication session, PFS provides an
additional layer of security.
• Certificate Pinning: Certificate pinning involves
associating a host with a specific certificate or public key.
This practice helps protect against man-in-the-middle
(MITM) attacks by ensuring that the client connects only
to the server with the expected certificate.
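The HMAC mechanism described above can be demonstrated directly with Python's standard library; the secret and message are invented for illustration. The sender attaches a tag computed from the message and the shared secret, and the receiver recomputes it to check integrity and authenticity.

```python
# HMAC message authentication with the standard library.
import hashlib
import hmac

secret = b"shared-secret"                  # known to sender and receiver
message = b"transfer 100 to bob"

tag = hmac.new(secret, message, hashlib.sha256).hexdigest()

def verify(msg, received_tag):
    expected = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time to resist timing attacks.
    return hmac.compare_digest(expected, received_tag)

print(verify(message, tag))                # -> True
print(verify(b"transfer 900 to eve", tag)) # -> False (altered in transit)
```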
Challenges in Distributed System Security
Securing distributed systems poses several significant challenges due to
their complexity, scale, and dynamic nature. Here are the key challenges in
distributed system security:
• Network Complexity: Increases the attack surface and complexity
of managing security configurations and updates across diverse
network environments.
• Data Protection and Encryption: Vulnerabilities in encryption
implementations or weak key management practices can lead to
data breaches and unauthorized access.
• Authentication and Authorization: Misconfigurations or
vulnerabilities in authentication mechanisms can lead to
unauthorized access and compromise the integrity of the entire
system.
• Diverse Technologies and Platforms: Compatibility issues,
differing security postures, and varying levels of support for
security standards can introduce vulnerabilities and complexities
in maintaining a consistent security posture.
• Scalability and Performance: Security measures such as
encryption and authentication may introduce latency and
overhead, affecting system performance and responsiveness,
especially under high load conditions.

Real-world examples of security breaches in Distributed System


Here are a few notable real-world examples of security breaches in
distributed systems:
1. Equifax Data Breach (2017):
• Description: Equifax, one of the largest credit reporting agencies,
experienced a massive data breach that exposed sensitive personal
information of over 147 million consumers.
• Cause: Exploitation of a vulnerability in Apache Struts, a popular
open-source framework used in Equifax’s web applications.
• Impact: Stolen data included names, Social Security numbers, birth
dates, addresses, and in some cases, driver’s license numbers. The
breach led to legal and financial repercussions for Equifax,
including settlements with regulatory authorities and affected
consumers.
2. SolarWinds Supply Chain Attack (2020):
• Description: A sophisticated cyberattack compromised the
software supply chain of SolarWinds, a leading provider of network
management software.
• Cause: Attackers inserted malicious code, named Sunburst or
Solorigate, into SolarWinds’ Orion platform updates distributed to
its customers.
• Impact: The attack compromised numerous high-profile
organizations, including government agencies and major tech
companies, allowing the attackers to access sensitive information
and conduct espionage activities. It highlighted vulnerabilities in
software supply chain security and the challenges of defending
against supply chain attacks.
3. Yahoo Data Breaches (2013-2016):
• Description: Yahoo experienced two major data breaches
between 2013 and 2016 that compromised the personal
information of billions of users.
• Cause: Cybercriminals exploited vulnerabilities in Yahoo’s systems
to steal user account information, including names, email
addresses, hashed passwords, and security questions.
• Impact: The breaches not only compromised user privacy but also
affected Yahoo’s reputation and resulted in legal consequences and
a decreased valuation during Verizon’s acquisition of Yahoo.
4. Capital One Data Breach (2019):
• Description: Capital One experienced a data breach where a
former employee of Amazon Web Services (AWS) gained
unauthorized access to sensitive data of more than 100 million
customers and applicants.
• Cause: Exploitation of a misconfigured web application firewall
(WAF) in AWS, allowing the attacker to access data stored in Capital
One’s AWS S3 buckets.
• Impact: Stolen data included personal information such as names,
addresses, credit scores, and social security numbers. The incident
raised concerns about cloud security practices and the shared
responsibility model between cloud providers and their customers.
What is Pipelining?

Pipelining is the process of accumulating instructions from the processor
through a pipeline. It allows storing and executing instructions in an
orderly process. It is also known as pipeline processing.

Pipelining is a technique where multiple instructions are overlapped
during execution. The pipeline is divided into stages, and these stages
are connected with one another to form a pipe-like structure.
Instructions enter from one end and exit from the other end.

Pipelining increases the overall instruction throughput.

In a pipelined system, each segment consists of an input register followed
by a combinational circuit. The register is used to hold data, and the
combinational circuit performs operations on it. The output of the
combinational circuit is applied to the input register of the next segment.

A pipeline system is like a modern-day assembly line in a factory. For
example, in the car manufacturing industry, huge assembly lines are set
up with robotic arms performing a certain task at each point, after which
the car moves ahead to the next arm.

Types of Pipeline
It is divided into 2 categories:
1. Arithmetic Pipeline
2. Instruction Pipeline

Arithmetic Pipeline
Arithmetic pipelines are found in most computers. They are used for
floating-point operations, multiplication of fixed-point numbers, etc.
For example, the input to the floating-point adder pipeline is:
X = A*2^a
Y = B*2^b
Here A and B are mantissas (the significant digits of the floating-point
numbers), while a and b are exponents.
The floating-point addition and subtraction is done in 4 parts:
1. Compare the exponents.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Produce the result.
Registers are used for storing the intermediate results between the above
operations.
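The four steps can be sketched as pipeline segments in Python; in hardware, registers between the segments would hold each intermediate tuple. The numeric example is invented.

```python
# Floating-point adder pipeline sketch for X = A*2^a and Y = B*2^b.
# Each function is one pipeline segment.

def compare_exponents(A, a, B, b):
    # Order the operands so the larger exponent comes first.
    return (A, a, B, b) if a >= b else (B, b, A, a)

def align_mantissas(big, e_big, small, e_small):
    # Shift the smaller mantissa right by the exponent difference.
    return big, small / (2 ** (e_big - e_small)), e_big

def add_mantissas(m1, m2, e):
    return m1 + m2, e

def normalize(m, e):               # produce the result
    while abs(m) >= 2:
        m, e = m / 2, e + 1
    return m, e

# 1.5*2^3 + 1.25*2^1 = 12 + 2.5 = 14.5
stage1 = compare_exponents(1.5, 3, 1.25, 1)
stage2 = align_mantissas(*stage1)
stage3 = add_mantissas(*stage2)
m, e = normalize(*stage3)
print(m * 2 ** e)                  # -> 14.5
```

With real pipelining, while one pair of operands is in the add stage, the next pair is already being aligned, so a new result emerges every cycle once the pipe is full.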

Instruction Pipeline

In an instruction pipeline, a stream of instructions is executed by
overlapping the fetch, decode, and execute phases of the instruction
cycle. This technique is used to increase the throughput of the computer
system.
An instruction pipeline reads instructions from memory while
previous instructions are being executed in other segments of the
pipeline. Thus, multiple instructions can be executed simultaneously. The
pipeline is more efficient if the instruction cycle is divided into
segments of equal duration.
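The throughput gain can be quantified: with k stages of equal duration and n instructions, a pipelined execution takes k + (n - 1) cycles rather than the k * n cycles of strictly sequential execution, since one instruction completes per cycle once the pipeline is full. A quick sketch, with illustrative stage and instruction counts:

```python
# Cycle counts for sequential vs pipelined execution.

def sequential_cycles(k, n):
    return k * n                   # each instruction takes all k stages

def pipelined_cycles(k, n):
    return k + (n - 1)             # fill the pipe, then one per cycle

k, n = 4, 10                       # e.g. fetch/decode/execute/write
print(sequential_cycles(k, n))     # -> 40
print(pipelined_cycles(k, n))      # -> 13
```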
Data Parallelism
Data parallelism means concurrent execution of the same task on multiple
computing cores.

Let’s take an example: summing the contents of an array of size N. On a
single-core system, one thread would simply sum the elements [0] . . .
[N − 1]. On a dual-core system, however, thread A, running on core 0,
could sum the elements [0] . . . [N/2 − 1] while thread B, running on
core 1, sums the elements [N/2] . . . [N − 1]. The two threads would run
in parallel on separate computing cores.
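This decomposition can be sketched with Python's thread pool. Note that CPython threads share one interpreter lock, so true CPU parallelism would use processes instead, but the data decomposition is identical either way.

```python
# Data parallelism: the SAME task (partial_sum) runs on
# DIFFERENT subsets of the same array.
from concurrent.futures import ThreadPoolExecutor

data = list(range(100))            # N = 100
N = len(data)

def partial_sum(lo, hi):
    return sum(data[lo:hi])

with ThreadPoolExecutor(max_workers=2) as pool:
    a = pool.submit(partial_sum, 0, N // 2)    # thread A: [0 .. N/2-1]
    b = pool.submit(partial_sum, N // 2, N)    # thread B: [N/2 .. N-1]
    total = a.result() + b.result()

print(total)                       # -> 4950, same as sum(data)
```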

Task Parallelism

Task parallelism means concurrent execution of different tasks on multiple
computing cores.

Consider again our example above. An example of task parallelism might
involve two threads, each performing a unique statistical operation on the
array of elements. Again, the threads operate in parallel on separate
computing cores, but each performs a unique operation.
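The task-parallel variant of the same example, with each thread performing a different statistical operation on the same array (the same GIL caveat as before applies to CPython threads):

```python
# Task parallelism: DIFFERENT tasks (mean vs stdev) run
# concurrently on the SAME data.
from concurrent.futures import ThreadPoolExecutor
import statistics

data = list(range(1, 11))

with ThreadPoolExecutor(max_workers=2) as pool:
    mean_future = pool.submit(statistics.mean, data)     # task 1
    stdev_future = pool.submit(statistics.stdev, data)   # task 2
    mean_value = mean_future.result()
    stdev_value = stdev_future.result()

print(mean_value)                  # -> 5.5
print(round(stdev_value, 3))
```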

The key differences between data parallelism and task parallelism are:

1. In data parallelism, the same task is performed on different subsets of
the same data. In task parallelism, different tasks are performed on the
same or different data.

2. Data parallelism performs synchronous computation. Task parallelism
performs asynchronous computation.

3. In data parallelism, a single flow of execution operates on all sets of
data, so the speedup is greater. In task parallelism, each processor
executes a different thread or process on the same or a different set of
data, so the speedup is less.

4. In data parallelism, the amount of parallelization is proportional to
the input size. In task parallelism, it is proportional to the number of
independent tasks performed.

5. Data parallelism is designed for optimum load balance on a
multiprocessor system. In task parallelism, load balancing depends on
the availability of the hardware and on scheduling algorithms such as
static and dynamic scheduling.
Difference between SIMD and MIMD

1. SIMD stands for Single Instruction Multiple Data, while MIMD stands
for Multiple Instruction Multiple Data.

2. SIMD requires less memory, while MIMD requires more memory.

3. The cost of SIMD is less than that of MIMD.

4. SIMD has a single decoder, while MIMD has multiple decoders.

5. SIMD uses latent or tacit (implicit) synchronization, while MIMD uses
accurate or explicit synchronization.

6. SIMD is a synchronous programming model, while MIMD is an
asynchronous programming model.

7. SIMD is simpler in terms of complexity than MIMD.

8. SIMD is less efficient in terms of performance than MIMD.
Shared and Distributed Memory in Parallel Computing

In parallel and distributed computing, memory management becomes crucial
when dealing with multiple processors working together.
Two prominent approaches exist:
1. Shared memory and
2. Distributed memory.
We will delve into these concepts, highlighting their key differences, advantages,
disadvantages, and applications.

1. Shared Memory
Shared memory systems provide a single, unified memory space accessible by all
processors in a computer. Imagine a whiteboard where multiple people can write
and read simultaneously.
Physically, the memory resides in a central location, accessible by all processors
through a high-bandwidth connection like a memory bus. Hardware enforces data
consistency, ensuring all processors see the same value when accessing a shared
memory location.

Fig-Shared Memory
Hardware Mechanisms for Shared Memory

Memory Bus: The shared memory resides in a central location (DRAM) and is
connected to all processors via a high-bandwidth memory bus. This bus acts as a
critical communication channel, allowing processors to fetch and store data from
the shared memory. However, with multiple processors vying for access, the bus
can become a bottleneck, limiting scalability.

Cache Coherence: To ensure all processors see the same value when accessing a
shared memory location, cache coherence protocols are implemented. These
protocols maintain consistency between the central memory and the private caches
of each processor. There are various cache coherence protocols with varying
trade-offs between performance and complexity.

Synchronization and Coordination


Shared memory programming offers a simpler model compared to distributed
memory, but it’s not without its challenges. Since multiple processors can access
and modify shared data concurrently, ensuring data consistency and preventing
race conditions is crucial. Programmers need to employ synchronization
primitives like locks, semaphores, and monitors to control access to shared
resources and coordinate execution between threads. Choosing the appropriate
synchronization mechanism depends on the specific needs of the program.
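A classic illustration of why these primitives matter: several threads incrementing a shared counter. The lock makes each read-modify-write atomic; without it, concurrent updates could be lost to a race condition.

```python
# Shared-memory synchronization with a lock.
import threading

counter = 0
lock = threading.Lock()

def worker():
    global counter
    for _ in range(10_000):
        with lock:                 # only one thread in here at a time
            counter += 1           # read-modify-write on shared data

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)                     # -> 40000
```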

Complexities and Challenges


• Cache Line Size: The size of a cache line (the minimum unit of data
transferred between cache and memory) can impact performance. If the
cache line size is larger than the data being accessed, it can lead to
unnecessary cache invalidation and bus traffic.
• False Sharing: When unrelated data items are placed in the same cache
line, accessing one item can invalidate the entire line, even though the
other item wasn’t involved. This can degrade performance.
• Memory Bandwidth: The bandwidth of the memory bus is a critical
factor. If the memory access rate exceeds the bus bandwidth, processors
will experience stalls, hindering performance.
Advantages
• Simpler Programming: Processes can directly access and modify
shared data, leading to a more intuitive programming model.
• Faster Communication: Data exchange happens within the same
memory space, resulting in high-speed communication compared to
distributed memory.
• Cache Coherence: Hardware manages cache consistency, reducing
the need for explicit synchronization between processors.

Disadvantages
• Scalability: Adding processors becomes complex as the shared
memory bus becomes a bottleneck for communication.
• Limited Memory Size: The total memory capacity is restricted by the
central memory unit.
• Single Point of Failure: A hardware failure in the shared memory can
bring the entire system down.

Applications
• Multiprocessor systems designed for tight collaboration between
processes, like scientific simulations with frequent data sharing.
• Operating systems for efficient task management and resource sharing.
2. Distributed Memory

Distributed memory systems consist of independent processors, each with its local
private memory. There’s no single shared memory space. Communication
between processors happens explicitly by sending and receiving messages.

Fig-Distributed Memory

Processors communicate through a network such as Ethernet or a dedicated interconnection network. Software protocols manage data exchange and ensure consistency.

Hardware Mechanism for Distributed Computing


In distributed memory systems, there’s no single hardware mechanism for
memory management since each processor has its private memory. The hardware
focus shifts towards enabling communication and interaction between these
independent memory spaces. Here’s a breakdown of the key hardware
components involved:
Processors: Each node in the distributed system consists of a processor (CPU)
with its local memory (DRAM) for storing program instructions and data. These
processors are responsible for executing the distributed program and managing
their local memory.
Network Interface Controller (NIC): Each processor is equipped with a
Network Interface Controller (NIC). This hardware component acts as the
communication bridge between the processor and the network. It facilitates
sending and receiving messages containing data or instructions to and from other
processors in the system.
Interconnection Network: The processors are interconnected through a dedicated network that allows them to exchange messages with each other. Common topologies in distributed memory systems include meshes, tori, hypercubes, and fat trees.
Advantages
• Scalability: Adding more processors is simpler as each has its own
memory. This allows for building large-scale parallel and distributed
systems.
• Large Memory Capacity: The total memory capacity can be
significantly larger by aggregating the local memory of each processor.
• Fault Tolerance: Processor failures are isolated, and the system can
continue operating with remaining functional processors.
Disadvantages
• Complex Programming: Programmers need to explicitly manage data
communication between processors, making the development process
more intricate.
• Slower Communication: Communication through the network
introduces latency compared to direct access in shared memory.
• Cache Management: Software protocols are needed to maintain cache
coherence, adding complexity.
Applications:
• High-performance computing clusters tackling large-scale problems
like weather simulations and big data analytics.
• Distributed computing systems where tasks are spread across
geographically dispersed machines.

Q: Choosing Between Shared and Distributed Memory


The choice between shared and distributed memory depends on several factors:
• Problem Size and Data Sharing: For problems with frequent and
irregular data access patterns, shared memory might be better. For
problems with large and well-defined datasets, distributed memory is
often preferred.
• Scalability Needs: If scalability is a major concern, distributed memory
offers a more flexible solution.
• Programming Complexity: Shared memory is generally easier to program initially, while distributed memory requires more development effort to manage explicit communication between processors.
Multithreading?
Multithreading is a feature in operating systems that allows a program to do several
tasks at the same time. Think of it like having multiple hands working together to
complete different parts of a job faster. Each “hand” is called a thread, and they
help make programs run more efficiently. Multithreading makes your computer
work better by using its resources more effectively, leading to quicker and
smoother performance for applications like web browsers, games, and many other
programs you use every day.

How Does Multithreading Work?


Multithreading works by allowing a computer’s processor to handle multiple tasks
at the same time. Even though the processor can only do one thing at a time, it
switches between different threads from various programs so quickly that it looks
like everything is happening all at once.
Here is how it works, in brief:
• Processor Handling: The processor can execute only one instruction at a time, but it switches between different threads so fast that it gives the illusion of simultaneous execution.
• Thread Synchronization: Each thread is like a separate task within a program. Threads share resources and must coordinate with each other so that programs run correctly and efficiently.
• Efficient Execution: Threads in a program can run independently or wait for their turn to be processed, making programs faster and more responsive.
• Programming Considerations: Programmers need to manage threads carefully to avoid problems like race conditions, or deadlocks where threads get stuck waiting for each other.
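The benefit of overlapping work, especially for I/O-bound tasks, can be shown with a small Python `threading` sketch (illustrative; `time.sleep` stands in for a real network or disk wait):

```python
import threading
import time

def fetch(name, delay, results):
    """Simulate an I/O-bound task, e.g. waiting on a network request."""
    time.sleep(delay)              # the thread yields the CPU while 'waiting'
    results[name] = "done"

results = {}
start = time.perf_counter()
tasks = [threading.Thread(target=fetch, args=(f"task{i}", 0.2, results))
         for i in range(3)]
for t in tasks:
    t.start()
for t in tasks:
    t.join()
elapsed = time.perf_counter() - start

print(len(results))  # 3
# The three 0.2 s waits overlap, so total time is roughly 0.2 s, not 0.6 s.
```

Run sequentially, the three waits would take about 0.6 seconds; with threads they overlap, which is exactly the "smoother performance" the section describes for browsers and similar applications.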

What is a Process?
A process is a program that has been dispatched from the ready state and scheduled on the CPU for execution. The PCB (Process Control Block) holds the context of a process. A process can create other processes, which are known as child processes. A process takes more time to terminate than a thread, and it is isolated, meaning it does not share its memory with any other process. A process can be in one of the following states: new, ready, running, waiting, terminated, or suspended.
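Process isolation can be demonstrated with Python's `multiprocessing` module (a minimal sketch; the `child` function and `data` list are invented names):

```python
from multiprocessing import Process

data = ["parent"]

def child():
    # Runs in a separate address space: this change stays in the child's copy.
    data.append("child")

if __name__ == "__main__":
    p = Process(target=child)
    p.start()
    p.join()
    print(data)  # ['parent'] — the child's modification never reaches the parent
```

If `child` were run as a thread instead of a process, the append would be visible to the parent, because threads share the process's memory — which is the key contrast developed in the next section.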

Thread?
Threads are often called “lightweight processes” because they share some features
of processes but are smaller and faster. Each thread is always part of one specific
process. A thread has three states: Running, Ready and Blocked.
A thread takes less time to terminate than a process, but unlike processes, threads are not isolated from one another.
Examples:
• Imagine a word processor that works with two separate tasks happening
at the same time. One task focuses on interacting with the user, like
responding to typing or scrolling, while the other works in the
background to adjust the formatting of the entire document.
For example, if you delete a sentence on page 1, the user-focused task immediately tells the background task to reformat the entire document. While the background task is busy reformatting, the user-focused task continues to handle simple actions like letting you scroll through page 1 or click on things.
• When you use a web browser, threads are working behind the scenes to
handle different tasks simultaneously.
For example: One thread is loading the webpage content (text, images,
videos). Another thread is responding to your actions like scrolling,
clicking, or typing.
A separate thread might be running JavaScript to make the webpage
interactive.
This multitasking makes the browser smooth and responsive.
For instance, you can scroll through a page or type in a search bar while the rest of the page is still loading. If threads weren't used, the browser would freeze and wait for one task to finish before starting another. Threads ensure everything feels fast and seamless.

Process vs Thread
Difference Between Process and Thread
The table below represents the difference between process and thread.

| Process | Thread |
| --- | --- |
| A process is a program in execution. | A thread is a segment of a process. |
| A process takes more time to terminate. | A thread takes less time to terminate. |
| It takes more time for creation. | It takes less time for creation. |
| It takes more time for context switching. | It takes less time for context switching. |
| A process is less efficient in terms of communication. | A thread is more efficient in terms of communication. |
| Multiprogramming holds the concept of multiple processes. | Multiple programs are not needed for multiple threads, because a single process consists of multiple threads. |
| Every process runs in its own memory. | Threads share the memory of their process. |
| A process is heavyweight compared to a thread. | A thread is lightweight, as each thread in a process shares code, data, and resources. |
| Process switching requires the involvement of the operating system. | Thread switching may not require calling the operating system. |
| If one process is blocked, it does not affect the execution of other processes. | If a user-level thread is blocked, all other user-level threads of that process are blocked. |
| A process has its own Process Control Block, stack, and address space. | A thread has its parent's PCB, its own Thread Control Block and stack, and a common address space. |
| Changes to the parent process do not affect child processes. | Since all threads of the same process share the address space and other resources, changes to the main thread may affect the behavior of the other threads of the process. |
| A system call is involved in creating a process. | No system call is involved; a thread is created using APIs. |
| Processes do not share data with each other. | Threads share data with each other. |
Advantages of Process
• Processes work independently in their own memory, ensuring no
interference and better security.
• Resources like CPU and memory are allocated effectively to optimize
performance.
• Processes can be prioritized to ensure important tasks get the resources
they need.
Disadvantages of Process
• Frequent switching between processes can slow down the system and
reduce speed.
• Improper resource management can cause deadlocks where processes
stop working and block progress.
• Having too many processes can make the process table take up a lot of
memory. This can also make searching or updating the table slower,
which can reduce system performance.

Advantages of Thread
• When there is a lot of computing and input/output (I/O) work, threads
help tasks run at the same time, making the app faster.
• Because threads are lighter weight than processes, they are faster to create and destroy.
• Many apps need to handle different tasks at the same time. For
example, a web browser can load a webpage, play a video, and let you
scroll all at once. Threads make this possible by dividing these tasks
into smaller parts that can run together.
Disadvantages of Thread
• Threads in the same process are not completely independent like
separate processes. They share the same memory space including global
variables. This means one thread can accidentally change or even erase
another thread’s data as there is no protection between them.
• Threads also share resources like files. For example – if one thread
closes a file while another is still using it, it can cause errors or
unexpected behavior.
• If too many threads are created they can slow down the system or cause
it to run out of memory.
CAP Theorem?
The CAP theorem is a fundamental result in distributed systems theory that was first proposed by Eric Brewer in 2000 and subsequently proved by Seth Gilbert and Nancy Lynch in 2002. It asserts that no distributed data system can simultaneously guarantee all three of the following properties:

1. Consistency
Consistency means that all the nodes (databases) inside a network will have the
same copies of a replicated data item visible for various transactions. It
guarantees that every node in a distributed cluster returns the same, most recent,
and successful write. It refers to every client having the same view of the data.
There are various consistency models; consistency in CAP refers to linearizability (atomic consistency), a very strong form of consistency.
For example, a user checks his account balance and knows that he has 500
rupees. He spends 200 rupees on some products. Hence the amount of 200 must
be deducted changing his account balance to 300 rupees. This change must be
committed and communicated with all other databases that hold this user’s
details. Otherwise, there will be inconsistency, and the other database might
show his account balance as 500 rupees which is not true.

Consistency problem
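To make the account-balance scenario concrete, here is a toy sketch (in-memory dictionaries standing in for database nodes; this is not a real replication protocol) of a consistent write that commits to every replica before acknowledging:

```python
# Two in-memory "replicas" of the user's account record.
replicas = [{"balance": 500}, {"balance": 500}]

def consistent_write(key, value):
    """Commit the change to every replica before acknowledging the write."""
    for replica in replicas:
        replica[key] = value

# The user spends 200 rupees; the new balance must reach all nodes.
consistent_write("balance", 300)

# Every node now returns the same, most recent value.
print([r["balance"] for r in replicas])  # [300, 300]
```

If the write had been applied to only one replica, a read routed to the other node would still report 500 rupees — the inconsistency described above.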

2. Availability
Availability means that each read or write request for a data item will either be
processed successfully or will receive a message that the operation cannot be
completed. Every non-failing node returns a response for all the read and write
requests in a reasonable amount of time. The key word here is “every”. In
simple terms, every node (on either side of a network partition) must be able to
respond in a reasonable amount of time.
For example, user A is a content creator with 1000 users subscribed to his channel. Another user B, who is far away from user A, tries to subscribe to user A's channel. Since the distance between the two users is large, they are connected to different database nodes of the social media network. If the distributed system upholds availability, user B must be able to subscribe to user A's channel.

Availability problem

3. Partition Tolerance
Partition tolerance means that the system can continue operating even if the
network connecting the nodes has a fault that results in two or more partitions,
where the nodes in each partition can only communicate among each other. That
means, the system continues to function and upholds its consistency guarantees
in spite of network partitions. Network partitions are a fact of life. Distributed
systems guaranteeing partition tolerance can gracefully recover from partitions
once the partition heals.
For example, consider the same social media network, where two users are trying to find the subscriber count of a particular channel. Due to a technical fault, a network outage occurs and the second database node, used by user B, loses its connection to the first database. The subscriber count is still shown to user B using a replica of the data that was copied from database 1 before the outage. Hence the distributed system is partition tolerant.
Partition Tolerance

The CAP theorem states that a distributed database can have at most two of the three properties: consistency, availability, and partition tolerance. As a result, database systems prioritize only two properties at a time.

Venn diagram of CAP theorem


The Trade-Offs in the CAP Theorem
The CAP theorem implies that a distributed system can only provide two out
of three properties:
1. CA (Consistency and Availability)
These systems always accept requests to view or modify data and always respond with data that is consistent across all database nodes of the network.
However, such distributed systems are not realizable in the real world: when a network failure occurs, there are two options — either serve the old data that was replicated moments before the failure, or refuse to serve that stale data at all. Choosing the first option makes the system available; choosing the second makes it consistent.
The combination of consistency and availability is therefore not possible in a distributed system. To achieve CA, the system must be monolithic, so that when a user updates its state, all other users accessing it see the new changes (consistency is maintained), and since all users connect to a single system, it is also available. Such systems are generally not preferred where distributed computing is required, which can only be achieved by sacrificing consistency or availability for partition tolerance.
Example databases: MySQL, PostgreSQL (single-node deployments).

CAP diagram
2. AP (Availability and Partition Tolerance)
These systems are distributed in nature, ensuring that requests sent by users to view or modify data are not dropped and are processed even in the presence of a network partition.
The system prioritizes availability over consistency and may respond with stale data that was replicated from other nodes before a technical failure created the partition. This design choice is common for social media sites such as Facebook, Instagram, and Reddit, and for online content platforms such as YouTube, blogs, and news sites, where strict consistency is usually not required and unavailability is the bigger problem: users may shift to another platform, costing the company money. The system can be distributed across multiple nodes and is designed to operate reliably even in the face of network partitions.
Example databases: Amazon DynamoDB, Apache Cassandra.

3. CP (Consistency and Partition Tolerance)

These systems are distributed in nature, ensuring that, in the presence of a network partition, requests to view or modify data are dropped rather than answered with inconsistent data.
The system prioritizes consistency over availability and does not allow users to read crucial data from a stored replica that was backed up before the network partition occurred. Consistency is chosen over availability for critical applications where the latest data matters, such as stock market, ticket booking, and banking applications, where serving stale data to users would cause real problems.
For example, in a train ticket booking application, suppose one seat remains available. A replica of the database is created and sent to other nodes of the distributed system. A network outage then occurs, causing a user connected to the partitioned node to fetch details from this stale replica. Meanwhile, a user connected to the unpartitioned part of the network books the last remaining seat. The user on the partitioned node would still see one available seat, making the data inconsistent. It would be better to show that user an error, making the system unavailable for them, and maintain consistency. Hence consistency is chosen in such scenarios.
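The CP behavior in the booking example can be sketched with a toy node class (illustrative only: `Node` and `Unavailable` are invented names, and a real system would use quorum or leader checks rather than a simple flag):

```python
class Unavailable(Exception):
    """Raised instead of serving possibly stale data — the CP choice."""

class Node:
    def __init__(self, seats):
        self.seats = seats
        self.partitioned = False   # becomes True when this node loses contact

    def read_seats(self):
        if self.partitioned:
            # A CP system rejects the request rather than return stale data;
            # an AP system would answer from the old replica instead.
            raise Unavailable("cannot guarantee fresh data during a partition")
        return self.seats

node = Node(seats=1)
node.partitioned = True           # simulate the network outage
try:
    node.read_seats()
except Unavailable as err:
    print("booking blocked:", err)
```

The trade-off is visible in the last lines: the partitioned node stays consistent by refusing to answer, at the cost of availability for the user connected to it.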
Questions for Revision

Q1: Introduction to Parallel and Distributed Computing: characteristics, scope, goals, applications.

Q2: Differences between parallel and distributed systems.

Q3: Discuss issues and challenges of Parallel and Distributed Computing.

Q4: Explain architecture of distributed systems.

Q5: Discuss Types of distributed systems: client-server, peer-to-peer.

Q6: Why Security in distributed system is so important. Discuss in brief?

Q7: What are Types of parallelism? Explain Data parallelism, task parallelism with block
diagram and example.

Q8: Explain Flynn’s taxonomy for Parallel computing models. Also compare SIMD and MIMD in detail.

Q9: Shared memory vs. distributed memory. Give Differences.

Q10: Detail differences between Process and Thread. Also discuss multithreading
concept.

Q11: Explain CAP Theorem. Also, discuss Trade-Offs in the CAP Theorem.
