Lecture 09

The document discusses cluster computing and grid computing. Cluster computing uses multiple low-cost computers together as a single system to solve large computational problems. Grid computing shares geographically distributed computer resources for resource sharing and computing. The document then provides examples of cluster and grid architectures and applications.

Uploaded by

Mohamed Ghetas
Cluster computing and grid computing
Dr. Mohamed Ghetas

Lecture 09

Cluster computing and grid computing

Topics

- What is Grid Computing?
- How Grid Computing works
- Reasons for using Grid Computing
- Grid Architecture
- Grid computing behavior
- Advantages and Disadvantages

Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts & Design, Edn. 4, Pearson Education 2005
Cluster computing:
Cluster: a type of distributed computing system consisting of a
collection of interconnected, similar workstations or PCs working
together as a single computing resource.

- It is difficult to handle big data on one node (a single computer or server).
- Clustering offers an alternative to a very expensive supercomputer or mainframe for solving big computational tasks.
- Cluster computing: many low-end computers (computers with normal performance) are used together to solve a big problem that needs massive computation.
- Each computer contributes part of the solution.
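The idea that each computer contributes part of the solution can be sketched as follows. This is only an illustration: the "nodes" are simulated with threads on one machine, and the names `split_range`, `partial_sum_of_squares` and `cluster_sum_of_squares` are hypothetical, not part of any cluster framework.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, stop, parts):
    """Split [start, stop) into `parts` contiguous sub-ranges of near-equal size."""
    size, extra = divmod(stop - start, parts)
    ranges = []
    lo = start
    for i in range(parts):
        # The first `extra` parts get one additional element each.
        hi = lo + size + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi
    return ranges


def partial_sum_of_squares(bounds):
    """Work done by one simulated node: its share of the total."""
    lo, hi = bounds
    return sum(n * n for n in range(lo, hi))


def cluster_sum_of_squares(stop, nodes=4):
    """Master logic: give one sub-range to each node, then combine the parts."""
    pieces = split_range(0, stop, nodes)
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(partial_sum_of_squares, pieces))
    return sum(partials)
```

In a real cluster the sub-ranges would be sent to separate machines over the network; the split and combine steps stay the same.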
A typical cluster architecture
[Figure: cluster architecture with a master node and a middleware layer]

- One master node (master-slave).
- Middleware is needed to support a single system image (distribution transparency) and acts as a communication facilitator.
Cluster properties:
- High performance at low cost.
- High availability (due to redundancy).
- Same hardware, software and applications on all cluster nodes (homogeneous).
- Load balancing between servers (which means fast response to the user).
- Parallel processing.
- Scalability.
- Fault tolerance.
- Single owner.
Types of cluster:
Load balancing cluster
The nodes in the system share the workload to provide better performance.

High availability cluster (failover cluster)
Uses extra nodes that are only brought into service if some of the system components fail.

High performance cluster (HPC)
Many computers work together to solve a big computational task or problem (they act like a supercomputer).
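A minimal sketch of the first two cluster types, assuming round-robin dispatch for load balancing and a list of idle standby nodes for failover. The class `LoadBalancer` and the node names are hypothetical; real clusters use dedicated software for both roles.

```python
class LoadBalancer:
    """Round-robin dispatcher with standby nodes for failover (illustrative only)."""

    def __init__(self, active, standby):
        self.active = list(active)    # nodes currently sharing the workload
        self.standby = list(standby)  # extra nodes, used only after a failure
        self._next = 0

    def pick(self):
        """Load balancing: hand each new request to the next active node in turn."""
        if not self.active:
            raise RuntimeError("no active nodes left")
        node = self.active[self._next % len(self.active)]
        self._next += 1
        return node

    def fail(self, node):
        """Failover: replace a failed node with a standby node if one exists."""
        idx = self.active.index(node)
        if self.standby:
            self.active[idx] = self.standby.pop(0)
        else:
            self.active.pop(idx)
```

For example, with active nodes `n1` and `n2` and one standby node `spare`, requests alternate between `n1` and `n2` until `fail("n2")` is called, after which `spare` takes over the position of `n2`.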
Case study application
Google File System
- Google offers services to Web clients, such as reads and updates to a huge number of files distributed across tens of thousands of computers.
- Google files are very large.
- Google has developed its own file system, the Google File System (GFS).
Because:
- Files are very large (gigabytes), and each one contains many smaller objects.
- Updates are made by appending data rather than overwriting parts of a file.
- Server failures have to be handled.
All of these led Google to construct clusters of servers.
Google file system cluster structure:
- A GFS cluster consists of a Master Server and a large number of Chunk Servers.
- GFS stores files in chunks for better efficiency.
- Each file is divided into 64 MB chunks, and the chunks are distributed and replicated across many servers.
- A Chunk Server stores files in the form of data blocks (chunks).
- Each data block (chunk) is identified by a unique ID.
- The Master Server stores all information except the data blocks themselves:
  - a (file name, chunk server) table, kept in main memory;
  - the file directories;
  - the list of data block IDs that make up each file;
  - the location of the Chunk Server where each data block is stored.
  The amount of data stored by the Master is therefore small.
- The client contacts the Master Server for a data block (chunk) read/write.
- Data transfers happen directly between clients and chunk servers.
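The chunking scheme and the master's table can be illustrated with a short sketch. It assumes that 64 MB means 64 * 1024 * 1024 bytes; the file name, chunk IDs and server names in `master_table` are made-up examples, and the helper functions are hypothetical rather than part of GFS itself.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB per chunk (assumed to mean 64 * 2**20 bytes)


def chunk_index(byte_offset):
    """Index of the chunk that contains the given byte offset of a file."""
    return byte_offset // CHUNK_SIZE


def chunks_needed(file_size):
    """Number of chunks required to store a file of `file_size` bytes."""
    return -(-file_size // CHUNK_SIZE)  # ceiling division


# Hypothetical master metadata: file name -> ordered list of
# (chunk ID, chunk servers holding a replica). No file data is stored here.
master_table = {
    "/logs/web.log": [
        ("chunk-0001", ["cs-a", "cs-b", "cs-c"]),
        ("chunk-0002", ["cs-b", "cs-d", "cs-e"]),
    ],
}


def locate(file_name, byte_offset):
    """Return the chunk ID and replica locations for a byte offset in a file."""
    return master_table[file_name][chunk_index(byte_offset)]
```

Because the table holds only names, IDs and locations, it stays small enough to keep in main memory, which matches the point made in the list above.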
Request procedure:

1. The GFS client passes a file name and chunk index to the master.
2. The client waits for the contact address of the chunk. The contact address contains all the information needed to access the correct chunk server and obtain the required file chunk.
3. The Master periodically asks all Chunk Servers for their lists of stored data blocks and reconstructs the location mapping table.
4. The chunk server looks up the file chunk using the chunk ID.
5. The master keeps track of where each chunk is located.
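The request procedure can be simulated in a few lines. This is a sketch under simplifying assumptions: everything runs in one process, the `network` dictionary stands in for real network connections, and the class and method names (`ChunkServer`, `Master`, `poll_chunk_servers`, `lookup`, `client_read`) are hypothetical.

```python
class ChunkServer:
    def __init__(self, address):
        self.address = address
        self.chunks = {}  # chunk ID -> chunk data

    def report_chunks(self):
        """Answer the master's periodic poll with the stored chunk IDs."""
        return sorted(self.chunks)

    def read(self, chunk_id):
        """Step 4: look up the chunk by its ID."""
        return self.chunks[chunk_id]


class Master:
    def __init__(self, servers):
        self.servers = {s.address: s for s in servers}
        self.files = {}      # file name -> ordered list of chunk IDs
        self.locations = {}  # chunk ID -> addresses of servers holding it

    def poll_chunk_servers(self):
        """Step 3: rebuild the location table from the servers' reports."""
        locations = {}
        for address, server in self.servers.items():
            for chunk_id in server.report_chunks():
                locations.setdefault(chunk_id, []).append(address)
        self.locations = locations

    def lookup(self, file_name, chunk_idx):
        """Steps 1 and 2: map (file name, chunk index) to a contact address."""
        chunk_id = self.files[file_name][chunk_idx]
        return chunk_id, self.locations[chunk_id]


def client_read(master, network, file_name, chunk_idx):
    """Ask the master where the chunk is, then read it directly from a server."""
    chunk_id, addresses = master.lookup(file_name, chunk_idx)
    return network[addresses[0]].read(chunk_id)
```

Note that the master only answers the lookup; the chunk data itself travels between the client and the chunk server.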
Continued
- Chunk Servers store and manage chunks.
- Chunks are replicated to handle failures.
- The Chunk Server checks the names of all the data blocks it has stored (also checking whether any data block is corrupted) and sorts the valid data block names into a list that it reports to the Master Server.
- When the client performs an update operation, it:
  - contacts the nearest chunk server holding that data;
  - pushes its updates to that server;
  - that server then pushes the update to the next closest server holding the data, and so on.
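The chained update can be sketched as a nearest-neighbour walk over the replicas. This is an illustration only: "closeness" is modelled as distance between made-up positions on a line, and `push_update` is a hypothetical helper, not the actual GFS protocol.

```python
def push_update(client_pos, replicas, chunk_id, data):
    """Push `data` along a chain of replicas, always moving to the closest one.

    `replicas` maps a server name to {"pos": number, "store": dict}.
    Returns the order in which the servers received the update.
    """
    remaining = dict(replicas)  # shallow copy: the stores are still shared
    current_pos = client_pos
    order = []
    while remaining:
        # Pick the replica closest to whoever currently holds the update.
        nearest = min(
            remaining,
            key=lambda name: abs(remaining[name]["pos"] - current_pos),
        )
        remaining[nearest]["store"][chunk_id] = data
        order.append(nearest)
        current_pos = remaining[nearest]["pos"]
        del remaining[nearest]
    return order
```

With a client at position 1 and replicas at positions 10, 2 and 5, the update reaches the server at 2 first, then the one at 5, then the one at 10.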
Google data centers locations
https://www.google.com/about/datacenters/
[Figures: Google data center cooling system; one of Google's data centers (Netherlands); Google cluster servers/racks]
Grid Computing
Using geographically distributed and interconnected computers together for computing and for resource sharing.
“The grid virtualizes heterogeneous geographically disperse resources” (from "Introduction to Grid Computing with Globus," IBM Redbooks).
Grid computing is a method of harnessing the power of many computers in a network to solve problems requiring a large number of processing cycles and involving huge amounts of data. Most organizations today deploy firewalls around their computer networks to protect their sensitive proprietary data. But the central idea of grid computing, to enable resource sharing, makes mechanisms such as firewalls difficult to use.
How Grid computing works
In general, a grid computing system requires:
- At least one computer, usually a server, which handles all the administrative duties for the system.
- A network of computers running special grid computing network software.
- A collection of computer software called middleware.
Virtual Organizations
Grid computing offers the potential of virtual organizations: groups of people, both geographically and organizationally distributed, working together on a problem, sharing computers AND other resources such as databases and experimental equipment.
Criteria for a Grid
- Coordinates resources that are not subject to centralized control.
- Uses standard, open, general-purpose protocols and interfaces.
- Delivers nontrivial qualities of service.

Benefits
- Exploit underutilized resources
- Resource load balancing
- Virtualize resources across an enterprise
- Data Grids, Compute Grids
- Enable collaboration for virtual organizations
Grid Architecture
Working of layers
- Fabric: The job of the lowest layer is to provide a common interface to all kinds of available resources. Access by higher layers is granted via standardized processes.
- Resource and connectivity protocols: The connectivity layer defines the basic communication and authentication protocols needed by the grid. The communication protocols allow the exchange of files between different resources connected by the first layer, while the authentication protocols make it possible to communicate confidentially and to verify the identity of the two partners.
- Collective services: The purpose of this layer is the coordination of multiple resources. Access to these resources does not happen directly but only via the underlying protocols and interfaces.
- User applications: All applications that operate in the environment of a virtual organization belong to this layer. Applications call the jobs of the lower layers and can use resources transparently.
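How the four layers stack on one another can be sketched with one small class per layer. This is a toy model under stated assumptions: authentication is reduced to a set of trusted user names, coordination is reduced to picking the resource with the most free CPUs, and all class, function and site names are hypothetical.

```python
class FabricResource:
    """Fabric layer: wraps one local resource behind a common interface."""

    def __init__(self, name, free_cpus):
        self.name = name
        self.free_cpus = free_cpus

    def run(self, job):
        return f"{job} ran on {self.name}"


class Connectivity:
    """Connectivity layer: authenticates a caller before granting access."""

    def __init__(self, trusted_users):
        self.trusted_users = set(trusted_users)

    def access(self, user, resource):
        if user not in self.trusted_users:
            raise PermissionError(f"{user} is not authenticated")
        return resource


class Collective:
    """Collective layer: coordinates several resources via the lower layers."""

    def __init__(self, connectivity, resources):
        self.connectivity = connectivity
        self.resources = list(resources)

    def submit(self, user, job):
        # Choose the resource with the most free CPUs, then go through
        # the connectivity layer instead of touching the resource directly.
        best = max(self.resources, key=lambda r: r.free_cpus)
        resource = self.connectivity.access(user, best)
        return resource.run(job)


def application(collective, user):
    """Application layer: uses the grid without naming a specific machine."""
    return collective.submit(user, "simulation")
```

The application never refers to a particular site; which resource runs the job is decided by the layers underneath, which is what "transparently" means here.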
Advantages of Grid Computing
Summary of advantages of Grid Computing
Disadvantages of Grid Computing
Methods of Grid Computing

- Distributed Supercomputing
- High-Throughput Computing
- On-Demand Computing
- Data-Intensive Computing
- Collaborative Computing
- Logistical Networking
Distributed Supercomputing

- Combines multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer.
- Tackles problems that cannot be solved on a single system.
High-Throughput Computing
- Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work.
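High-throughput scheduling of independent tasks can be sketched with a shared queue from which idle workers pull the next task. This is a single-machine illustration: threads stand in for grid nodes, and `run_high_throughput` is a hypothetical helper.

```python
import queue
import threading


def run_high_throughput(tasks, workers=3):
    """Run independent tasks on whichever worker is idle; results keep task order."""
    pending = queue.Queue()
    for index, task in enumerate(tasks):
        pending.put((index, task))
    results = [None] * len(tasks)

    def worker():
        # Each worker keeps taking tasks until the queue is empty,
        # so no worker sits idle while work remains.
        while True:
            try:
                index, task = pending.get_nowait()
            except queue.Empty:
                return
            results[index] = task()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because the tasks are independent, no worker ever waits on another; the only shared state is the queue of pending work.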

On-Demand Computing
- Uses grid capabilities to meet short-term requirements for resources that are not locally accessible.
- Models real-time computing demands.
Collaborative Computing
- Concerned primarily with enabling and enhancing human-to-human interactions.
- Applications are often structured in terms of a virtual shared space.
Data-Intensive Computing
- The focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases.
- Particularly useful for distributed data mining.


Logistical Networking

- Logistical networks focus on exposing storage resources inside networks by optimizing the global scheduling of data transport and data storage.
- Contrasts with traditional networking, which does not explicitly model storage resources in the network.
- Provides high-level services for Grid applications.
- Called "logistical" because of the analogy it bears with systems of warehouses, depots, and distribution channels.
Resource

From Chapter 1 of Distributed Systems: Concepts and Design, 4th Edition,
by G. Coulouris, J. Dollimore and T. Kindberg
