Lecture 09
Dr. Mohamed Ghetas
Cluster computing and grid computing
Topics
Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, 4th edn., Pearson Education, 2005
Cluster computing:
A cluster is a type of distributed computing system that consists of a
collection of interconnected, similar workstations or PCs working
together as a single computing resource.
- One master node (master-slave).
- Middleware is needed to support a single system image (distribution
transparency) and to act as a communication facilitator.
Cluster properties:
High performance at low cost.
High availability (due to redundancy).
Same hardware, software and applications on all
cluster nodes (homogeneous).
Load balancing between servers (which means
fast response to the user).
Parallel processing.
Scalability.
Fault tolerance.
Single owner.
Types of cluster:
Load balancing cluster
The nodes in the system share the workload to
provide better performance.
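Workload sharing can be illustrated with a minimal sketch. The slides do not name a scheduling policy, so round-robin is an assumption chosen for simplicity, and `RoundRobinBalancer`, `dispatch`, and the node names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next node in a fixed rotation."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("a cluster needs at least one node")
        self._rotation = cycle(nodes)
        # Count how many requests each node has received.
        self.handled = {node: 0 for node in nodes}

    def dispatch(self, request):
        """Return the node chosen to serve this request."""
        node = next(self._rotation)
        self.handled[node] += 1
        return node


balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
assignments = [balancer.dispatch(f"req-{i}") for i in range(6)]
print(assignments)
print(balancer.handled)
```

With six requests and three nodes, every node ends up serving two requests. Other policies (for example, least-loaded first) could weigh nodes by their current load instead.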
1. The GFS (Google File System) client passes a file name and chunk index to the master.
2. The client waits for the contact address of the chunk. The contact address
contains all the information needed to access the correct chunk server and
obtain the required file chunk.
3. The master periodically asks all chunk servers for their lists of stored data
blocks and reconstructs the location mapping table.
4. The chunk server looks up the file chunk using the chunk ID.
5. The master keeps track of where each chunk is located.
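The steps above can be sketched as a small in-memory model. This is an illustration under stated assumptions, not the actual GFS interface: the class names, method names, addresses, and the choice of always reading from the first listed replica are all hypothetical.

```python
class ChunkServer:
    """Stores chunks and serves them by chunk ID (step 4)."""

    def __init__(self, address):
        self.address = address
        self.chunks = {}  # chunk ID -> bytes

    def list_chunks(self):
        return sorted(self.chunks)

    def read(self, chunk_id):
        return self.chunks[chunk_id]


class Master:
    """Maps (file name, chunk index) to a chunk ID and its locations."""

    def __init__(self):
        self.file_chunks = {}  # (file name, chunk index) -> chunk ID
        self.locations = {}    # chunk ID -> list of chunk server addresses

    def poll(self, servers):
        # Step 3: rebuild the location table from the servers' reports.
        self.locations = {}
        for server in servers:
            for chunk_id in server.list_chunks():
                self.locations.setdefault(chunk_id, []).append(server.address)

    def lookup(self, file_name, chunk_index):
        # Steps 1, 2 and 5: answer with the chunk ID and where it lives.
        chunk_id = self.file_chunks[(file_name, chunk_index)]
        return chunk_id, self.locations[chunk_id]


def client_read(master, servers_by_address, file_name, chunk_index):
    chunk_id, addresses = master.lookup(file_name, chunk_index)
    # Assumption: read from the first replica the master lists.
    return servers_by_address[addresses[0]].read(chunk_id)


cs1, cs2 = ChunkServer("cs1"), ChunkServer("cs2")
cs1.chunks["c-001"] = b"hello"
cs2.chunks["c-001"] = b"hello"  # replica of the same chunk
master = Master()
master.file_chunks[("log.txt", 0)] = "c-001"
master.poll([cs1, cs2])
data = client_read(master, {"cs1": cs1, "cs2": cs2}, "log.txt", 0)
print(data)
```

Note that the master only hands out metadata; the file data itself travels directly between the client and a chunk server.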
Chunk servers store and manage chunks.
Chunks are replicated to handle failures.
The chunk server checks the file names of all the data blocks it has
stored (also checking whether each data block is corrupted) and sorts
the valid data block names into a list to send back to the master.
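The corruption check and sorted report might look like the sketch below. The slides do not say how corruption is detected, so comparing a SHA-256 digest against a recorded value is an assumption, and `checksum`, `valid_block_report`, and the block names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Digest used to detect corruption (assumed: SHA-256)."""
    return hashlib.sha256(data).hexdigest()


def valid_block_report(blocks, expected):
    """Return the sorted names of blocks whose content still matches
    the checksum recorded when the block was written."""
    return sorted(
        name for name, data in blocks.items()
        if checksum(data) == expected.get(name)
    )


blocks = {"blk-2": b"beta", "blk-1": b"alpha", "blk-3": b"gamma"}
expected = {name: checksum(data) for name, data in blocks.items()}

blocks["blk-3"] = b"gamm?"  # simulate a corrupted block on disk
report = valid_block_report(blocks, expected)
print(report)
```

The corrupted block is left out of the report, so the master would no longer list this server as a location for it.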
When the client performs an update operation:
It contacts the nearest chunk server holding that data.
It pushes its updates to that server.
That server pushes the update to the next closest server holding the data,
and so on.
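The forwarding chain can be sketched as a greedy nearest-neighbour walk. Distance is modelled here as the difference between positions on a line, which is purely an illustrative assumption; `propagation_order` and the replica names are hypothetical.

```python
def propagation_order(client_pos, replicas):
    """Return the order in which replicas receive an update.

    replicas maps a server name to its position. The update starts at
    the client, goes to the nearest replica, and each replica then
    forwards it to the closest replica that has not received it yet.
    """
    remaining = dict(replicas)
    order = []
    current = client_pos
    while remaining:
        nearest = min(remaining, key=lambda name: abs(remaining[name] - current))
        order.append(nearest)
        # The chosen replica becomes the sender for the next hop.
        current = remaining.pop(nearest)
    return order


replicas = {"cs-a": 10, "cs-b": 3, "cs-c": 7}
order = propagation_order(client_pos=1, replicas=replicas)
print(order)
```

Forwarding along a chain means the client sends the data only once, and each server passes it on to a single neighbour.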
Figure: Google data center locations
Criteria for a Grid
Coordinates resources that are not subject to centralized
control.
Uses standard, open, general-purpose protocols and
interfaces.
Delivers nontrivial qualities of service.
Benefits
Exploit underutilized resources
Resource load balancing
Virtualize resources across an enterprise
(data grids, compute grids)
Enable collaboration for virtual organizations
Grid Architecture
Working of layers
Fabric: The lowest layer provides a common interface to all
kinds of available resources. Higher layers are granted access
through standardized processes.
Resource and connectivity protocols: The connectivity layer defines
the basic communication and authentication protocols
needed by the grid. The communication protocols allow the
exchange of files between different resources connected by the first
layer, while the authentication protocols allow partners to communicate
confidentially and to verify each other's identity.
Collective services: The purpose of this layer is to coordinate
multiple resources. These resources are not accessed directly,
only through the underlying protocols and interfaces.
User applications: This layer contains all applications that
operate in the environment of a virtual organization. Applications
call the services of the lower layers and can use resources
transparently.
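The layering can be illustrated with a toy sketch in which each layer talks only to the one beneath it. All class names, method names, users, and resources are hypothetical, and real grid middleware is far richer than this.

```python
class Fabric:
    """Lowest layer: one common interface over different resources."""

    def __init__(self, resources):
        self.resources = resources  # resource name -> callable that runs a job

    def run(self, name, job):
        return self.resources[name](job)


class Connectivity:
    """Checks the caller's identity before any resource is touched."""

    def __init__(self, fabric, trusted_users):
        self.fabric = fabric
        self.trusted_users = set(trusted_users)

    def submit(self, user, name, job):
        if user not in self.trusted_users:
            raise PermissionError(f"{user} is not authenticated")
        return self.fabric.run(name, job)


class Collective:
    """Coordinates several resources, only via the layer below."""

    def __init__(self, connectivity, resource_names):
        self.connectivity = connectivity
        self.resource_names = list(resource_names)

    def run_everywhere(self, user, job):
        return {
            name: self.connectivity.submit(user, name, job)
            for name in self.resource_names
        }


# User application: sees one call, not the individual resources.
fabric = Fabric({
    "cluster-1": lambda job: f"cluster-1 ran {job}",
    "desktop-7": lambda job: f"desktop-7 ran {job}",
})
connectivity = Connectivity(fabric, trusted_users=["alice"])
collective = Collective(connectivity, ["cluster-1", "desktop-7"])
results = collective.run_everywhere("alice", "render")
print(results)
```

An unauthenticated user is stopped at the connectivity layer, before the fabric is ever reached.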
Advantages of Grid Computing
Summary of the advantages of Grid Computing
Disadvantages of Grid Computing
Methods of Grid Computing
Distributed Supercomputing
High-Throughput Computing
On-Demand Computing
Data-Intensive Computing
Collaborative Computing
Logistical Networking
Distributed Supercomputing
On-Demand Computing
Uses grid capabilities to meet short-term
requirements for resources that are not
locally accessible.
Models real-time computing demands.
Collaborative Computing
Concerned primarily with enabling and enhancing
human-to-human interactions.
Applications are often structured in terms of a
virtual shared space.
Data-Intensive Computing
The focus is on synthesizing new information
from data that is maintained in geographically
distributed repositories, digital libraries, and
databases.