
Abstraction and Virtualization

Introduction
• A pool is a collection of resources that are grouped together
for shared use.

• Without resource pooling, it is impossible to attain
efficient utilization, provide reasonable costs to
users, and proactively react to demand.
Virtualization
• When you use cloud computing, you are accessing
pooled resources using a technique called
virtualization.

• Virtualization assigns a logical name to a physical
resource and then provides a pointer to that physical
resource when a request is made.
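The logical-name-to-physical-resource mapping described above can be sketched in a few lines of Python. This is an illustration only; the class and method names (`VirtualResourcePool`, `attach`, `resolve`) are invented, not a real API.

```python
# Sketch: a pool maps a stable logical name to whichever physical
# resource currently backs it, so clients never see the physical name.

class VirtualResourcePool:
    def __init__(self):
        self._mapping = {}  # logical name -> physical resource identifier

    def attach(self, logical_name, physical_resource):
        # (Re)point the logical name at a physical resource.
        self._mapping[logical_name] = physical_resource

    def resolve(self, logical_name):
        # A request uses only the logical name; the pool returns a
        # pointer to the physical resource backing it right now.
        return self._mapping[logical_name]

pool = VirtualResourcePool()
pool.attach("db-volume", "/dev/sdb1 on host-07")
print(pool.resolve("db-volume"))
# The physical backing can change (e.g. migration) without the
# logical name ever changing:
pool.attach("db-volume", "/dev/sdc1 on host-12")
print(pool.resolve("db-volume"))
```

Because clients hold only the logical name, the provider is free to remap it to different hardware at any time.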
Types of Virtualization
• These are among the different types of virtualization that are
characteristic of cloud computing:

• Access: A client can request access to a cloud service from
any location.

• Application: A cloud has multiple application instances and
directs requests to an instance based on conditions.

• CPU: Computers can be partitioned into a set of virtual
machines with each machine being assigned a workload.

• Storage: Data is stored across storage devices and often
replicated for redundancy.
Services of Virtualization
• Service-based: A service-based architecture is
where clients are abstracted from service providers
through service interfaces.
• Scalable and elastic: Services can be altered to
affect capacity and performance on demand.
• Shared services: Resources are pooled in order to
create greater efficiencies.
• Metered usage: Services are billed on a usage basis.
• Internet delivery: The services provided by cloud
computing are based on Internet protocols and
formats.
Virtualization Technologies
• P2V
– Migration of an OS, application programs, and data from a
physical computer’s hard disk to a virtual machine. It is
also known as hardware virtualization.

• V2V
– Migration of an OS, application programs, and data from
one virtual machine to another.

• V2P
– Migration of an application from a virtual machine to a
computer’s physical resources.
Load Balancing and Virtualization
• One characteristic of cloud computing is virtualized network access to a
service.

• No matter where you access the service, you are directed to the available
resources.

• The technology used to distribute service requests to resources is referred to
as load balancing.

• Load balancing can be implemented in hardware, as is the case with F5’s
BigIP servers, or in software, such as the Apache mod_proxy_balancer
extension, the Pound load balancer and reverse proxy software, and the
Squid proxy and cache daemon.

• Load balancing is an optimization technique; it can be used to increase
utilization and throughput, lower latency, reduce response time, and avoid
system overload.
Load balancing types
• Centralized Approach
– A single node is responsible for managing the distribution
across the whole system.

• Decentralized Approach
– Each node takes part in load balancing and distributing
activities on different nodes.

• Mixed Approach
– Combination of the above two approaches.
Resources that can be balanced
• The following network resources can be load
balanced:
– Network interfaces and services such as DNS, FTP, and
HTTP
– Connections through intelligent switches
– Processing through computer system assignment
– Storage resources
– Access to application instances
Load Balancing
• Without load balancing, cloud computing would be
very difficult to manage.

• It also provides fault tolerance when coupled with a
failover mechanism.

• Load balancing is nearly always a feature of server
farms and computer clusters and of high-availability
applications.
Contd..
• In the simplest load-balancing mechanisms, the load
balancer listens to a network port for service requests.

• When a request from a client or service requester
arrives, the load balancer uses a scheduling algorithm to
assign where the request is sent.

• Typical scheduling algorithms in use today are round
robin and weighted round robin, fastest response time,
least connections and weighted least connections, and
custom assignments based on other factors.
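Two of the scheduling algorithms named above, round robin and least connections, can be sketched in a few lines. The server names and connection counts are invented for illustration.

```python
# Sketch of two load-balancer scheduling algorithms.
import itertools

servers = ["web-1", "web-2", "web-3"]

# Round robin: hand requests to servers in a fixed rotation.
rotation = itertools.cycle(servers)
round_robin_picks = [next(rotation) for _ in range(5)]
print(round_robin_picks)  # ['web-1', 'web-2', 'web-3', 'web-1', 'web-2']

# Least connections: route the request to the server with the
# fewest currently open connections (counts are made up here).
active_connections = {"web-1": 12, "web-2": 4, "web-3": 9}

def least_connections(conns):
    return min(conns, key=conns.get)

target = least_connections(active_connections)
print(target)  # 'web-2'
```

A weighted variant of either algorithm simply multiplies or divides by a per-server weight before choosing.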
Advanced load balancing
• The more sophisticated load balancers are workload managers.

• They determine the current utilization of the resources in their pool, the
response time, the work queue length, connection latency and capacity, and
other factors in order to assign tasks to each resource.

• An Application Delivery Controller (ADC) is a combination load balancer
and application server placed between a firewall or router and a server farm
providing Web services.

• ADCs include data compression, content caching, server health monitoring,
security, SSL offload, and advanced routing based on current conditions.

• An ADC is considered to be an application accelerator, and the current
products in this area are usually focused on two areas of technology:
network optimization, and application or framework optimization.
Google Cloud
• According to the Web site tracking firm Alexa, Google is the single
most heavily visited site on the Internet; that is, Google gets the
most hits.

• The investment Google has made in infrastructure is enormous,
and the Google cloud is one of the largest in use today.

• It is estimated that Google runs over a million servers worldwide,
processes a billion search requests, and generates twenty petabytes
of data per day.

• Google is understandably reticent to disclose much about its
network, because it believes that its infrastructure, system
response, and low latency are key to the company’s success.
Google never gives datacenter tours to journalists and doesn’t
disclose where its datacenters are located.
Contd..
• Google has many datacenters around the world. As of
March 2008, Rich Miller of DataCenterKnowledge.com
wrote that Google had at least 12 major installations in
the United States and many more around the world.

• Based on current locations and the company’s
statements, Google’s datacenters are sited based on the
following factors (roughly in order of importance):
• Availability of cheap and, if possible, renewable energy
• The relative locations of other Google datacenters such that the site
provides the lowest latency response between sites
• Location of nearby Internet hubs and peering sites
• A source of cooling water
• Tax concessions from municipalities that lower Google’s overhead
Contd..
• When you initiate a Google search, your query is sent to a DNS server,
which then queries Google’s DNS servers.

• The DNS assignment acts as a first level of IP virtualization: a pool of
network addresses that has been load balanced based on geography.

• When the query request arrives at its destination, it is sent to a load
balancer in a Google cluster, which forwards the request to a Google
Squid server. This is the second level of IP distribution, based on a
measure of the current system loading on proxy servers in the cluster.

• The Squid server checks its cache, and if it finds a match to the query, that
match is returned and the query has been satisfied. If there is no match in the
Squid cache, the query is sent to an individual Google Web server based on
current Web server utilizations, which is the third level of network load
balancing, again based on utilization rates.
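The cache-then-backend step described above can be sketched as a proxy that answers from its cache on a hit and forwards a miss to the least-loaded web server. The cache contents, server names, and utilization figures are invented.

```python
# Sketch of a Squid-style proxy tier: cache hit -> answer directly;
# cache miss -> route to the least-loaded web server and cache the result.

cache = {"weather london": "cached result page"}
web_server_utilization = {"gws-1": 0.81, "gws-2": 0.35, "gws-3": 0.60}

def handle_query(query):
    if query in cache:                       # proxy cache hit
        return ("cache", cache[query])
    # Cache miss: pick the web server with the lowest current utilization.
    server = min(web_server_utilization, key=web_server_utilization.get)
    result = f"result computed by {server}"
    cache[query] = result                    # populate the cache for next time
    return (server, result)

print(handle_query("weather london"))   # served from the cache
print(handle_query("weather paris"))    # miss: goes to the least-loaded server
```

This is only the last two of the three balancing levels; the geographic DNS step happens before any request reaches the cluster.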
Understanding Hypervisors
• Given a computer system with a certain set of resources, you
can set aside portions of those resources to create a virtual
machine.

• A system virtual machine (or a hardware virtual machine) has
its own address space in memory, its own processor resource
allocation, and its own device I/O using its own virtual device
drivers.

• From the standpoint of cloud computing, these features enable
VMMs to manage application provisioning, provide for
machine instance cloning and replication, allow for graceful
system failover, and provide several other desirable features.
Virtual machine types
• A low-level program is required to provide system resource access
to virtual machines, and this program is referred to as the
hypervisor or Virtual Machine Monitor (VMM).

• A hypervisor running on bare metal is a Type 1 VM or native VM.

• Examples of Type 1 Virtual Machine Monitors are LynxSecure,
RTS Hypervisor, Oracle VM, Sun xVM Server, etc.

• The operating system loaded into a virtual machine is referred to as
the guest operating system, and there is no constraint on running
the same guest on multiple VMs on a physical system.

• Type 1 VMs have no host operating system because they are
installed on a bare system.
Contd..
• Some hypervisors are installed over an operating system and are
referred to as Type 2 or hosted VMs.

• Examples of Type 2 Virtual Machine Monitors are Containers,
KVM, Microsoft Hyper-V, Parallels Desktop for Mac, Wind River
Simics, VMware Fusion, etc.

• Type 2 virtual machines are installed over a host operating system;
for Microsoft Hyper-V, that operating system would be Windows
Server.

• This abstraction is meant to place many I/O operations outside the
virtual environment, which makes it both programmatically easier
and more efficient to execute device I/O than it would be inside a
virtual environment. This type of virtualization is sometimes
referred to as paravirtualization.
Understanding Machine Imaging
• A system image makes a copy or a clone of the entire computer
system inside a single container such as a file.

• The system imaging program is used to make this image and can
be used later to restore a system image.

• A prominent example of a system image and how it can be used in
cloud computing architectures is the Amazon Machine Image
(AMI) used by Amazon Web Services to store copies of a virtual
machine.

• An AMI is a file system image that contains an operating system,
all appropriate device drivers, and any applications and state
information that the working virtual machine would have.
Porting Applications
• If you build an application on a platform such as
Microsoft Azure, porting that application to Amazon
Web Services or Google Apps may be difficult.

• In an effort to create an interoperability standard,
Zend Technologies has started an open source
initiative to create a common application program
interface that will allow applications to be portable.

• The initiative is called the Simple API for Cloud
Application Services.
Contd..
• Simple Cloud API has as its goal a set of common
interfaces for:
• File Storage Services: Currently Amazon S3,
Windows Azure Blob Storage, Nirvanix, and local
storage are supported by the Storage API.
• Document Storage Services: Amazon SimpleDB
and Windows Azure Table Storage are currently
supported. Local document storage is planned.
• Simple Queue Services: Amazon SQS, Windows
Azure Queue Storage, and Local queue services are
supported.
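The portability idea behind such a common API can be sketched as an adapter layer: application code talks to one interface, and per-provider adapters sit behind it. This is an illustration of the pattern only, not the actual Simple Cloud API; every class and method name here is invented.

```python
# Sketch: one common storage interface, many interchangeable backends.

class StorageAdapter:
    def put(self, key, data):
        raise NotImplementedError

    def get(self, key):
        raise NotImplementedError

class LocalStorage(StorageAdapter):
    # In-memory stand-in for the "local storage" backend mentioned above;
    # an S3- or Azure-backed adapter would implement the same two methods.
    def __init__(self):
        self._items = {}

    def put(self, key, data):
        self._items[key] = data

    def get(self, key):
        return self._items[key]

def save_report(storage):
    # Application code depends only on the common interface, so the
    # backend can be swapped without changing this function.
    storage.put("report.txt", b"quarterly numbers")
    return storage.get("report.txt")

print(save_report(LocalStorage()))
```

The application is portable precisely because nothing above `StorageAdapter` mentions a specific cloud provider.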
AppZero Virtual Application Appliance
• Applications that run in datacenters are captive to the
operating systems and hardware platforms that they run on.

• So moving an application from one platform to another is far
from simple.

• One company working on this problem is AppZero, and its
solution is called the Virtual Application Appliance (VAA).

• The AppZero solution creates a virtual application appliance
as an architectural layer between the Windows or UNIX
operating system and applications.

• The virtualization layer serves as the mediator for file I/O,
memory I/O, and application calls and responses to DLLs,
which has the effect of sandboxing the application.
PROVISIONING IN THE CLOUD CONTEXT
• Amazon EC2 is a widely known example.

• Eucalyptus and OpenNebula are two
complementary and enabling technologies for open
source cloud tools.

• Eucalyptus is a system for implementing on-premise
private and hybrid clouds using existing hardware
and software infrastructure.
Amazon Elastic Compute Cloud
• The Amazon EC2 (Elastic Compute Cloud) is a Web service
that allows users to provision new machines into Amazon’s
virtualized infrastructure in a matter of minutes, using a
publicly available API.

• Users get full root access and can install almost any OS or
application in their AMIs.

• An EC2 instance is typically a virtual machine with a certain
amount of RAM, CPU, and storage capacity.

• Setting up an EC2 instance is quite easy. Once you create your
AWS (Amazon Web Services) account, you can use the online
AWS console, or simply download the offline command-line
tools to start provisioning your instances.
Contd..
• Amazon EC2 provides its customers with three
flexible purchasing models to make cost
optimization easy:
– On-Demand instances, which allow you to pay a fixed
rate by the hour with no commitment.
– Reserved instances, which allow you to pay a low, one-
time fee and in turn receive a significant discount on the
hourly usage charge for that instance.
– Spot instances, which enable you to bid whatever price
you want for instance capacity, providing for even greater
savings, if your applications have flexible start and end
times.
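The on-demand versus reserved trade-off above comes down to a break-even calculation. The rates below are invented for the sake of the arithmetic (real EC2 prices vary by instance type and region): the reserved model wins once the hourly discount has repaid the one-time fee.

```python
# Worked comparison of on-demand vs. reserved pricing (rates invented).

hours = 6000                  # planned usage over the term
on_demand_rate = 0.10         # $/hour, no commitment
reserved_upfront = 200.0      # one-time reservation fee
reserved_rate = 0.04          # discounted $/hour

on_demand_cost = hours * on_demand_rate
reserved_cost = reserved_upfront + hours * reserved_rate

print(f"on-demand: ${on_demand_cost:.2f}")   # $600.00
print(f"reserved:  ${reserved_cost:.2f}")    # $440.00

# Break-even: the upfront fee divided by the per-hour saving.
break_even_hours = reserved_upfront / (on_demand_rate - reserved_rate)
print(f"reserved pays off after about {break_even_hours:.0f} hours")
```

Spot pricing does not fit this simple formula, since the hourly rate itself fluctuates with demand.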
Eucalyptus
• Eucalyptus is an open-source infrastructure for the
implementation of cloud computing on computer
clusters.
• Here are some of the Eucalyptus features:
– Interface compatibility with EC2, and S3 (both Web service
and Query/REST interfaces).
– Simple installation and deployment.
– Support for most Linux distributions (source and binary
packages).
– Support for running VMs that run atop the Xen hypervisor or
KVM.
– Cloud administrator tools for system management and
user accounting.
Architecture of Eucalyptus
Contd..
• Node controller (NC) controls the execution, inspection, and
termination of VM instances on the host where it runs.
• Cluster controller (CC) gathers information about and
schedules VM execution on specific node controllers, as well
as manages virtual instance network.
• Storage controller (SC) is a put/get storage service that
implements Amazon’s S3 interface and provides a way for
storing and accessing VM images and user data.
• Cloud controller (CLC) is the entry point into the cloud for
users and administrators. It queries node managers for
information about resources, makes high-level scheduling
decisions, and implements them by making requests to cluster
controllers.
• Walrus (W) is the controller component that manages access
to the storage services within Eucalyptus.
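How the components above cooperate can be sketched as a request flow: the cloud controller queries the cluster controllers, which gather capacity reports from their node controllers, and the CLC then makes the placement decision. The classes and numbers below are illustrative only, not Eucalyptus code.

```python
# Sketch of the Eucalyptus scheduling chain: CLC -> CCs -> NCs.

class NodeController:
    def __init__(self, name, free_cores):
        self.name, self.free_cores = name, free_cores

class ClusterController:
    def __init__(self, nodes):
        self.nodes = nodes

    def report(self):
        # Gather information about the node controllers in this cluster.
        return {n.name: n.free_cores for n in self.nodes}

class CloudController:
    def __init__(self, clusters):
        self.clusters = clusters

    def schedule_vm(self):
        # High-level decision: place the VM on the node with the
        # most free cores across all clusters.
        capacity = {}
        for cc in self.clusters:
            capacity.update(cc.report())
        return max(capacity, key=capacity.get)

clc = CloudController([
    ClusterController([NodeController("node-a", 2), NodeController("node-b", 6)]),
    ClusterController([NodeController("node-c", 4)]),
])
print(clc.schedule_vm())  # 'node-b'
```

The storage side (SC and Walrus) sits beside this chain, serving images and user data rather than placement decisions.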
OpenNebula
• OpenNebula is an open and flexible tool that fits into
existing data center environments to build any type of
cloud deployment.

• OpenNebula supports a hybrid cloud to combine local
infrastructure with public cloud-based infrastructure,
enabling highly scalable hosting environments.

• Haizea is an open-source virtual machine-based lease
management architecture which can be used as a
scheduling backend for OpenNebula.
Virtualization of CPU
• CPU virtualization is a cloud-computing technique
that allows a single physical CPU to act as multiple
machines working together.

• CPU virtualization emphasizes running programs
and instructions through a virtual machine, giving
the feeling of working on a physical workstation.

• All the operations are handled by an emulator that
controls how the software is allowed to run.
Types
• Software-Based CPU Virtualization
– With software-based CPU virtualization, the guest
application code runs directly on the processor, while the
guest privileged code is translated and the translated code
runs on the processor.

• Hardware-Assisted CPU Virtualization
– Certain processors provide hardware assistance for CPU
virtualization.
Memory Virtualization
• Virtual memory virtualization is similar to the
virtual memory support provided by modern
operating systems.
• All modern x86 CPUs include a memory
management unit (MMU) and a translation look
aside buffer (TLB) to optimize virtual memory
performance.
• However, in a virtual execution environment, virtual
memory virtualization involves sharing the physical
system memory in RAM and dynamically allocating
it to the physical memory of the VMs.
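The extra layer this slide describes can be sketched as two page tables stacked on top of each other: the guest OS maps guest-virtual to guest-physical pages, and the hypervisor maps guest-physical to host-physical pages. The page numbers below are invented for illustration.

```python
# Sketch of two-level memory translation under virtualization.

guest_page_table = {0: 5, 1: 7}         # guest virtual -> guest physical page
hypervisor_page_table = {5: 42, 7: 13}  # guest physical -> host physical page

def translate(guest_virtual_page):
    # First level: the guest OS's own page table.
    guest_physical = guest_page_table[guest_virtual_page]
    # Second level: the VMM maps the guest's "physical" page to real RAM.
    host_physical = hypervisor_page_table[guest_physical]
    return host_physical

print(translate(0))  # 42
print(translate(1))  # 13
```

Hardware features such as nested page tables exist precisely to avoid walking both tables in software on every access.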
I/O Virtualization
• I/O virtualization involves managing the routing of
I/O requests between virtual devices and the shared
physical hardware.

• At the time of this writing, there are three ways to
implement I/O virtualization:
• full device emulation
• para-virtualization
• direct I/O
Full device emulation
• All the functions of a device or bus infrastructure,
such as device enumeration, identification,
interrupts, and DMA, are replicated in software.
• This software is located in the VMM and acts as a
virtual device.
Para Virtualization
• The para-virtualization method of I/O virtualization
is typically used in Xen.
• It is also known as the split driver model consisting
of a frontend driver and a backend driver.
• They interact with each other via a block of shared
memory.
• The frontend driver manages the I/O requests of the
guest OS and the backend driver is responsible for
managing the real I/O devices and multiplexing the
I/O data of different VMs.
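The split driver model above can be sketched with a plain queue standing in for the block of shared memory: frontend drivers in the guests enqueue requests, and one backend driver drains and multiplexes them onto the real device. All names are invented, and the "device" is simulated.

```python
# Sketch of the Xen-style split driver model (frontend/backend).
from collections import deque

shared_ring = deque()   # stands in for the shared-memory ring

def frontend_submit(vm_name, request):
    # Guest side: the frontend driver places the I/O request
    # into the shared ring.
    shared_ring.append((vm_name, request))

def backend_drain():
    # Host side: the backend driver pulls requests from every VM
    # and issues them to the real device (simulated here).
    completed = []
    while shared_ring:
        vm, req = shared_ring.popleft()
        completed.append(f"{vm}:{req} done")
    return completed

frontend_submit("vm1", "read block 10")
frontend_submit("vm2", "write block 3")
print(backend_drain())
```

The key point is the multiplexing: one backend serializes I/O from many guests onto one physical device.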
Direct I/O
• Direct I/O virtualization lets the VM access devices
directly.
Virtual Clusters
• Virtual clusters are built with VMs installed at
distributed servers from one or more physical
clusters.
• The VMs in a virtual cluster are interconnected
logically by a virtual network across several
physical networks.
Virtual Cluster Management
• There are four ways to manage a virtual cluster.
• First, you can use a guest-based manager, by
which the cluster manager resides on a guest system.
• The host-based manager supervises the guest
systems and can restart the guest system on another
physical machine.
• A third way to manage a virtual cluster is to use
an independent cluster manager on both the host
and guest systems.
• Finally, you can use an integrated cluster manager on the
guest and host systems.
VIRTUALIZATION FOR DATA-CENTER AUTOMATION

• Data centers have grown rapidly in recent years, and
all major IT companies are pouring their resources
into building new data centers.

• Data-center automation means that huge volumes of
hardware, software, and database resources in these
data centers can be allocated dynamically to millions
of Internet users simultaneously, with guaranteed
QoS and cost-effectiveness.
Server Consolidation in Data Centers
• In data centers, a large number of heterogeneous
workloads can run on servers at various times.
• These heterogeneous workloads can be roughly divided
into two categories: chatty workloads and
non-interactive workloads.
• Chatty workloads may burst at some point and return to
a silent state at some other point. A web video service is
an example of this, whereby a lot of people use it at
night and few people use it during the day.
• Non-interactive workloads do not require people’s
efforts to make progress after they are submitted.
Contd..
• However, to guarantee that a workload will always be able to cope
with all demand levels, the workload is statically allocated enough
resources so that peak demand is satisfied.

• Therefore, it is common that most servers in data centers are
underutilized.

• A large amount of hardware, space, power, and management cost
of these servers is wasted.

• Server consolidation is an approach to improve the low utilization
of hardware resources by reducing the number of physical servers.

• Among several server consolidation techniques such as centralized
and physical consolidation, virtualization-based server
consolidation is the most powerful.
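The consolidation idea above is essentially a packing problem: place VM workloads onto as few physical servers as possible without exceeding capacity. Below is a sketch using the simple first-fit-decreasing heuristic; the VM names and load fractions are invented.

```python
# Sketch of server consolidation as first-fit-decreasing bin packing.

def consolidate(vm_loads, server_capacity):
    servers = []      # remaining capacity of each physical server in use
    placement = {}    # vm name -> server index
    # Place the largest workloads first (first-fit decreasing).
    for vm, load in sorted(vm_loads.items(), key=lambda kv: -kv[1]):
        for i, free in enumerate(servers):
            if load <= free:          # fits on an already-used server
                servers[i] -= load
                placement[vm] = i
                break
        else:                         # needs a new physical server
            servers.append(server_capacity - load)
            placement[vm] = len(servers) - 1
    return len(servers), placement

# Four VMs that would each otherwise occupy a dedicated server:
count, placement = consolidate(
    {"vm1": 0.5, "vm2": 0.3, "vm3": 0.4, "vm4": 0.2},
    server_capacity=1.0,
)
print(count)  # 2 physical servers instead of 4
```

Real consolidation engines add constraints this sketch ignores, such as memory limits, affinity rules, and headroom for load bursts.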
Trust Management in Virtualized Data Centers
• Once a hacker successfully enters the VMM or
management VM, the whole system is in danger.

• Intrusion detection is used to recognize
unauthorized access.

• An intrusion detection system (IDS) is built on
operating systems, and is based on the characteristics of
intrusion actions.

• A typical IDS can be classified as a host-based IDS
(HIDS) or a network-based IDS (NIDS), depending on
the data source.
Thank You
