Unit 1 Cloud
Cloud computing is the on-demand availability of computer systems,
especially data storage (cloud storage) and computing power, without
direct active management by the user.
1.1.1 The Age of Internet Computing
1. The Linpack Benchmark for high-performance computing (HPC) applications is obsolete for measuring system performance.
2. The emergence of computing clouds instead demands high-throughput computing (HTC) systems built with parallel and distributed computing technologies.
3. Data centers need to be upgraded using fast servers, storage systems, and high-bandwidth networks.
4. From 1970 to 1990, personal computers built with VLSI microprocessors became widespread.
5. From 1980 to 2000, massive numbers of portable computers and pervasive devices appeared in both wired and wireless applications.
6. Since 1990, the use of both HPC and HTC systems, hidden in clusters, grids, or Internet clouds, has proliferated.
Figure 1.1 Evolution of HPC and HTC systems.
HPC: supercomputers (massively parallel processors) are gradually replaced by clusters of cooperative computers that share computing resources.
The cluster is often a collection of homogeneous compute nodes that are physically connected in close range to one another.
HTC: peer-to-peer (P2P) networks support distributed file sharing and content delivery applications.
Service-Oriented Architecture (SOA): applications make use of services available in the network.
High-Performance Computing (HPC)
Centralized computing: all computer resources (processors, memory, storage) are centralized in one physical system and fully shared under one operating system.
Parallel computing: processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory, communicating through shared memory or message passing.
Distributed computing: multiple autonomous computers, each with its own private memory, communicate through a network by message passing.
Cloud computing: an Internet cloud of resources, either centralized or distributed, built over data centers with physical or virtualized resources.
Ubiquitous computing: computing with pervasive devices at any place and time.
The Internet of Things (IoT): the networked interconnection of everyday objects, sensors, and computers.
Internet computing: all of the above paradigms, which leverage the Internet.
Design objectives
1. Efficiency: resource utilization under massive parallelism; for HTC, efficiency also covers job throughput, data access, storage, and power efficiency.
2. Dependability: reliability and self-management with Quality of Service (QoS) assurance, even under failure conditions.
3. Adaptation: the ability to support billions of job requests over massive data sets and virtualized cloud resources under various workload and service models.
4. Flexibility: the ability to deploy both HPC (science and engineering) and HTC (business) applications.
1.1.2 Scalable Computing Trends and New Paradigms
Scalable computing relies on resource distribution and concurrency, characterized by the degree of parallelism (DoP).
Bit-level parallelism (BLP) converts bit-serial processing to word-level processing gradually.
Instruction-level parallelism (ILP): the processor executes multiple instructions simultaneously rather than only one instruction at a time.
ILP requires branch prediction, dynamic scheduling, speculation, and compiler support to work efficiently.
Data-level parallelism (DLP) is exploited through SIMD (single instruction, multiple data) operations.
DLP requires hardware support and compiler assistance to work properly, as illustrated in the sketch below.
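As a rough illustration (not from the original slides), the following Python sketch contrasts a scalar element-by-element loop with a NumPy vectorized addition; NumPy is an assumed dependency, and the array size and variable names are made up for the example.

```python
# Illustrative sketch of data-level parallelism (DLP).
# The vectorized form expresses "single instruction, multiple data":
# one add operation applied to many elements, which NumPy can map to
# SIMD hardware instructions where the platform supports them.
import numpy as np

n = 100_000
a = np.arange(n, dtype=np.float64)
b = np.arange(n, dtype=np.float64)

# Scalar style: one element at a time.
c_scalar = np.empty(n)
for i in range(n):
    c_scalar[i] = a[i] + b[i]

# Data-parallel style: one "instruction" over all elements.
c_simd = a + b

assert np.allclose(c_scalar, c_simd)
```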
Innovative Applications
Trend toward Utility Computing
1. New and emerging computing and information technology may go through a hype cycle before it reaches mainstream adoption.
The Internet of Things and Cyber-Physical Systems
The IoT refers to the networked interconnection of everyday objects, tools, devices, or computers.
Each object is tagged with RFID or a related sensor or electronic technology such as GPS.
With the IPv6 protocol, 2^128 IP addresses are available to distinguish objects.
Every human being is expected to be surrounded by 1,000 to 5,000 objects.
The IoT is designed to track 100 trillion static or moving objects simultaneously (a quick numerical check follows below).
This demands universal addressability of all objects and devices.
All objects and devices are instrumented, interconnected, and interact with each other intelligently.
Communication can be H2H (human-to-human), H2T (human-to-thing), or T2T (thing-to-thing), where “things” include PCs and mobile phones, connected at low cost.
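A quick back-of-the-envelope check, added here only for illustration, compares the IPv6 address space quoted above with the 100-trillion-object tracking goal.

```python
# Compare the IPv6 address space with the IoT tracking goal
# mentioned above (100 trillion static or moving objects).
ipv6_addresses = 2 ** 128          # total IPv6 address space
iot_objects = 100 * 10 ** 12       # 100 trillion objects

print(f"IPv6 addresses       : {ipv6_addresses:.3e}")
print(f"IoT objects          : {iot_objects:.3e}")
print(f"Addresses per object : {ipv6_addresses // iot_objects:.3e}")
```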
Cyber-Physical Systems
A CPS is the result of interaction between computational processes and the physical world.
A CPS integrates the “cyber” (heterogeneous, asynchronous) with “physical” (concurrent and information-dense) objects.
A CPS merges the “3C” technologies of computation, communication, and control.
The IoT emphasizes various networking connections among physical objects, while a CPS emphasizes exploration of virtual reality (VR) applications in the physical world.
https://www.youtube.com/watch?v=eXsNX_2AzM8
Virtual Machines and Virtualization Middleware
Virtualization relies on software to simulate hardware functionality and create a virtual computer system, enabling greater scale and efficiency.
A virtual machine (VM) is a virtualization or emulation of a computer system.
Virtual machines are based on computer architectures and provide the functionality of a physical computer; their implementations may involve specialized hardware, software, or a combination of both.
3. Convergence of Technologies
Hardware virtualization and multi-core chips enable the existence of dynamic configurations in the cloud.
Utility and grid computing lay the necessary foundation, and SOA, Web 2.0, and data center automation also contribute to the rise of cloud computing.
System Models for Distributed and Cloud Computing
SOA applies to building grids, clouds, grids of clouds, clouds of grids, clouds of
clouds.
A large number of sensors provide data-collection services as SS (sensor
service).
A sensor can be a ZigBee device, a Bluetooth device, a WiFi access point, a personal computer, a GPS device, or a wireless phone, among other things.
Raw data is collected by sensor services.
All the SS devices interact with large or small computers, many forms of grids,
databases, the compute cloud, the storage cloud, the filter cloud, the discovery
cloud, and so on.
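The SS data-collection idea can be sketched as follows; the class and function names (SensorService, Reading, filter_service) are hypothetical illustrations of how raw sensor data might flow into a filter service, not a real SOA framework or API.

```python
# Hypothetical sketch of SS (sensor service) devices feeding a filter
# service, as described above. All names here are illustrative.
import random
from dataclasses import dataclass

@dataclass
class Reading:
    sensor_id: str
    value: float

class SensorService:
    """One SS device (ZigBee, Bluetooth, WiFi AP, PC, phone, ...)."""
    def __init__(self, sensor_id: str):
        self.sensor_id = sensor_id

    def collect(self) -> Reading:
        # Raw data collection; a real device would read hardware here.
        return Reading(self.sensor_id, random.uniform(0.0, 100.0))

def filter_service(readings, threshold=50.0):
    """Stand-in for a 'filter cloud': keep only significant readings."""
    return [r for r in readings if r.value >= threshold]

sensors = [SensorService(f"ss-{i}") for i in range(5)]
raw = [s.collect() for s in sensors]
filtered = filter_service(raw)
print(f"{len(raw)} raw readings -> {len(filtered)} passed the filter")
```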
The Evolution of SOA
5. NUMA machines are often made out of SMP nodes with distributed, shared memory.
6. A NUMA machine can run with multiple operating systems, and can scale to a few thousand processors communicating with the MPI library.
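As a minimal sketch of MPI-style message passing, assuming the mpi4py package (the slides do not prescribe a specific MPI binding), each process contributes a private value to a global sum.

```python
# Minimal MPI sketch with mpi4py (run e.g. with: mpirun -n 4 python global_sum.py).
# Each process holds a private value and all processes combine them with
# an all-reduce, the kind of message passing cluster/NUMA nodes rely on.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

local_value = rank + 1                      # private data on this process
total = comm.allreduce(local_value, op=MPI.SUM)

if rank == 0:
    print(f"{size} processes, global sum = {total}")
```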
Performance Metrics and Scalability Analysis
4. Amdahl’s Law
Consider the execution of a given program on a uniprocessor
workstation with a total execution time of T minutes. Now, let’s say
the program has been parallelized or partitioned for parallel
execution on a cluster of many processing nodes. Assume that a
fraction α of the code must be executed sequentially, called the
sequential bottleneck. Therefore, (1 − α) of the code can be
compiled for parallel execution by n processors. The total execution
time of the program is calculated by α T + (1− α)T/n, where the first
term is the sequential execution time on a single processor and the
second term is the parallel execution time on n processing nodes.
Amdahl’s Law states that the speedup factor of using the n-processor system over the use of a single processor is expressed by:
S = T / [αT + (1 − α)T/n] = 1 / [α + (1 − α)/n]
As n grows, the speedup is bounded above by 1/α, the limit set by the sequential bottleneck.
In Amdahl’s law, we have assumed the same amount of workload for both
sequential and parallel execution of the program with a fixed problem size or
dataset. To execute a fixed workload on n processors, parallel processing may
lead to a system efficiency defined as follows:
E = S/n = 1/[αn + 1 − α]
System efficiency is low when the cluster size is very large.
For example, with α = 0.25 on a cluster of n = 256 nodes, an extremely low efficiency E = 1/[0.25 × 256 + 0.75] = 1.5% is observed.
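The numbers above can be reproduced with a short Python calculation (added for illustration).

```python
# Amdahl's law: fixed workload, sequential fraction alpha, n processors.
def amdahl_speedup(alpha: float, n: int) -> float:
    return 1.0 / (alpha + (1.0 - alpha) / n)

alpha, n = 0.25, 256
s = amdahl_speedup(alpha, n)
e = s / n                     # efficiency E = S/n = 1/[alpha*n + 1 - alpha]
print(f"Speedup S = {s:.2f}, efficiency E = {e:.1%}")   # E is about 1.5%
```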
Performance Metrics and Scalability Analysis
6. Gustafson’s Law
To achieve higher efficiency when using a large cluster, consider scaling the problem size to match the cluster capability. This speedup law, proposed by John Gustafson, is referred to as scaled-workload speedup.
Let W be the workload in a given program.
When using an n-processor system, the user scales the workload to
W′ = αW + (1 − α)nW.
Scaled workload W′ is essentially the sequential execution time on a
single processor.
The scaled-workload speedup of running W′ on n processors, keeping the parallel execution time fixed, is defined as follows:
S′ = W′/W = [αW + (1 − α)nW]/W = α + (1 − α)n
This speedup is known as Gustafson’s law. By fixing the parallel execution time at level W, the efficiency is
E′ = S′/n = α/n + (1 − α)
which, for the 256-node cluster with α = 0.25, improves to E′ = 0.25/256 + 0.75 = 0.751.
To execute a fixed workload, users should apply Amdahl’s law. To solve scaled problems, users should apply Gustafson’s law.
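A short Python sketch, added for illustration, contrasts the two laws on the same 256-node cluster with α = 0.25.

```python
# Gustafson's law: scaled workload W' = alpha*W + (1 - alpha)*n*W.
def gustafson_speedup(alpha: float, n: int) -> float:
    return alpha + (1.0 - alpha) * n

def amdahl_efficiency(alpha: float, n: int) -> float:
    return 1.0 / (alpha * n + 1.0 - alpha)

alpha, n = 0.25, 256
e_scaled = gustafson_speedup(alpha, n) / n   # E' = alpha/n + (1 - alpha)
print(f"Amdahl (fixed workload):     E  = {amdahl_efficiency(alpha, n):.1%}")
print(f"Gustafson (scaled workload): E' = {e_scaled:.3f}")   # about 0.751
```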
Fault Tolerance and System Availability
1. System Availability
HA (high availability) is desired in all clusters, grids, P2P networks, and cloud
systems.
A system is highly available if it has a long mean time to failure (MTTF) and
a short mean time to repair (MTTR).
System availability is formally defined as follows:
System Availability = MTTF / (MTTF + MTTR)
All hardware, software, and network components may fail. Any failure that
will pull down the operation of the entire system is called a single point of
failure.
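As a small illustration of the availability formula, with hypothetical MTTF and MTTR values:

```python
# System Availability = MTTF / (MTTF + MTTR), both in the same time unit.
def availability(mttf_hours: float, mttr_hours: float) -> float:
    return mttf_hours / (mttf_hours + mttr_hours)

# Example (hypothetical numbers): a node that fails on average every
# 1,000 hours and takes 2 hours to repair is about 99.8% available.
print(f"{availability(1000.0, 2.0):.3%}")
```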
2. Security Responsibilities
The three basic security requirements are confidentiality, integrity, and availability.
SaaS relies on the cloud provider to perform all security functions.
IaaS expects users to assume almost all security functions, leaving only availability in the hands of the providers.
PaaS relies on the provider to maintain data integrity and availability, but burdens the user with confidentiality and privacy.
3. Copyright Protection
To run a server farm (data center), a company has to spend a huge amount of money on hardware, software, operational support, and energy every year.
Therefore, companies should thoroughly assess whether their installed server farm, in particular the volume of provisioned resources, is being utilized at an appropriate level.
Energy Efficiency in Distributed Computing
1. Energy Consumption of Unused Servers