Chapter 1
Cloud Computing Concepts
Upon completing this chapter, you will be able to understand the following:
This chapter introduces virtualization and cloud computing concepts. Virtualization and
cloud computing are dovetailed, and vendors and solution providers are increasingly using
virtualization to build clouds. This chapter discusses various types of virtualization and
cloud computing, and the benefits of moving from on-site computing to cloud computing. It
also describes the types of services that can be provided on top of clouds, such as
Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a
Service (IaaS). Cloud adoption and barriers, ROI for cloud computing, and cloud
benefits are also covered in this chapter.
Virtualization
Virtualization has become a technical necessity, and the trend continues for good reason:
when implemented, it provides many benefits, such as reduced physical space, power,
cooling, cabling, and capital expenses.
The sum of these savings can be huge, depending on the size of the enterprise.
It’s hard to define virtualization because there are many flavors of it. There’s usually a
one-to-many or many-to-one aspect to it. In a one-to-many approach, virtualization
enables you to create many virtualized resources from one physical resource. This form
of virtualization allows data centers to maximize resource utilization. Virtual resources
hosting individual applications are mapped to physical resources to provide more effi-
cient server utilization.
Virtualization Types
Virtualization can mean many things to many people. This chapter covers the following
virtualization types:
■ Server virtualization
■ Storage virtualization
■ Network virtualization
■ Service virtualization
Figure 1-1 shows server virtualization, network virtualization, storage virtualization, and
service virtualization, all of which can exist in a data center and be managed using
virtualization management. There are other types of virtualization, but these are a
starting point for virtualization technology in the data center.
Server Virtualization
Server virtualization (also referred to as hardware virtualization) is the best-known form
of virtualization today. Today’s powerful x86 computer hardware was
designed to run a single operating system and a single application. This leaves most
machines vastly underutilized. Virtualization lets you run multiple virtual machines on a
single physical machine, sharing the resources of that single computer across multiple
environments. Different virtual machines can run different operating systems and multiple
applications on the same physical computer. Figure 1-2 shows how a virtualized server
looks against a physical server without virtualization.
The hypervisor software enables the creation of a virtual machine (VM) that emulates a
physical computer by creating a separate OS environment that is logically isolated from
the host server. A hypervisor, also called a virtual machine manager (VMM), is a program
that allows multiple operating systems to share a single hardware host. A single physical
machine can be used to create several VMs that can run several operating systems inde-
pendently and simultaneously. VMs are stored as files, so restoring a failed system can be
as simple as copying its file onto a new machine.
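One way to make this concrete is with the libvirt Python bindings. The sketch below is illustrative only: it assumes a local QEMU/KVM hypervisor reachable at qemu:///system, and webserver-vm.xml is a placeholder domain definition file, not something defined in this chapter.

```python
# A minimal sketch using the libvirt Python bindings: one physical host,
# many logically isolated virtual machines (the one-to-many idea).
import libvirt

conn = libvirt.open("qemu:///system")   # connect to the local hypervisor

# Each domain is an independent OS environment sharing the same hardware.
for dom in conn.listAllDomains():
    state, _ = dom.state()
    status = "running" if state == libvirt.VIR_DOMAIN_RUNNING else "stopped"
    print(dom.name(), status)

# Define and boot a new VM from an XML description. Because the VM's disk
# is just a file, restoring a failed system can be a simple file copy.
with open("webserver-vm.xml") as f:      # hypothetical domain definition
    new_dom = conn.defineXML(f.read())
new_dom.create()

conn.close()
```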
Key properties of server virtualization include the following:
■ Partitioning: Run multiple operating systems on one physical machine.
■ Management
■ Flexibility
Server virtualization is a key driving force in reducing the number of physical servers and
hence the physical space, cooling, cabling, and capital expenses in any data center
consolidation project.
Storage Virtualization
Storage virtualization refers to providing a logical, abstracted view of physical storage
devices. It provides a way for many users or applications to access storage without being
concerned with where or how that storage is physically located or managed. It enables
physical storage in an environment to be shared across multiple application servers, and
physical devices behind the virtualization layer to be viewed and managed as if they were
one large storage pool with no physical boundaries. Storage virtualization hides the fact
that there are separate storage devices in an organization by making all the devices appear
as one device. It also hides the complexity of deciding where data is stored, retrieving it,
and presenting it to the user when it is required.
Typically, storage virtualization applies to larger storage-area network (SAN) arrays, but it
is just as accurately applied to the logical partitioning of a local desktop hard drive and
Redundant Array of Independent Disks (RAID). Large enterprises have long benefited
from SAN technologies, in which storage is uncoupled from servers and attached directly
to the network. By sharing storage on the network, SANs enable scalable and flexible
storage resource allocation, efficient backup solutions, and higher storage utilization.
Storage virtualization provides several benefits:
■ Resource optimization: Traditionally, the storage device is physically tied and dedicated
to servers and applications. If more capacity is required, more disks are purchased,
added to the server, and dedicated to the applications. This method of operation
results in a lot of storage going unused or wasted. Storage virtualization
enables you to obtain storage space on an as-needed basis without waste,
and it allows organizations to use existing storage assets more efficiently without the
need to purchase additional assets.
■ Cost of operation: Adding independent storage resources and configuring them for each
server and application is time-consuming and requires skilled personnel who are hard
to find, and this affects the total cost of operation (TCO). Storage virtualization
enables adding storage resources without regard to the application, and storage
resources can be added to the pool easily, for example, through a drag-and-drop
operation in a management console. A secure management console with a GUI
enhances security while still allowing operations staff to add storage resources easily.
■ Improved performance: Many systems working on a single task can overwhelm a sin-
gle storage system. If the workload is distributed over several storage devices through
virtualization, the performance can be improved. In addition, security monitoring can
be implemented in the storage such that only authorized applications or servers are al-
lowed to access the storage assets.
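The many-to-one pooling idea can be sketched in a few lines of plain Python. This is an illustrative model only, not tied to any real array, SAN product, or vendor API; device names and sizes are made up.

```python
# Several physical devices presented as one logical pool: callers request
# capacity without knowing (or caring) which device backs their volume.
from dataclasses import dataclass, field

@dataclass
class Device:
    name: str
    capacity_gb: int
    used_gb: int = 0

    @property
    def free_gb(self):
        return self.capacity_gb - self.used_gb

@dataclass
class StoragePool:
    devices: list = field(default_factory=list)
    volumes: dict = field(default_factory=dict)   # volume name -> backing device

    def total_free_gb(self):
        return sum(d.free_gb for d in self.devices)

    def create_volume(self, name, size_gb):
        # Place the volume on the device with the most free space;
        # the requesting application never sees this decision.
        device = max(self.devices, key=lambda d: d.free_gb)
        if device.free_gb < size_gb:
            raise RuntimeError("pool exhausted")
        device.used_gb += size_gb
        self.volumes[name] = device
        return name

pool = StoragePool(devices=[Device("array-1", 500), Device("array-2", 1000)])
pool.create_volume("app-data", 200)   # lands on array-2; the app only sees "app-data"
print(pool.total_free_gb())           # 1300 GB still free across the whole pool
```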
Network Virtualization
Network virtualization might be the most ambiguous virtualization of all virtualization
types. Several types of network virtualization exist, as briefly described here:
■ Virtual device contexts (VDC), a data center virtualization concept, can be used to
virtualize the device itself, presenting one physical switch as multiple logical devices.
Each VDC contains its own unique and independent set of VLANs and
VRFs, and each VDC can have physical ports assigned to it, thus allowing the hardware
data plane to be virtualized as well. Within each VDC, a separate management
domain can manage the VDC itself, thus allowing the management plane to
be virtualized as well. Each VDC appears as a unique device to the connected users.
■ When a virtual machine moves (for example, through live migration), the mobility
event is signaled to the data center network and SAN, and the appropriate network
profile and storage services move with the VM.
Figure 1-3 illustrates how virtualized network, compute, and storage interact with each
other in the infrastructure.
Figure 1-3 Clients access the infrastructure over the Internet through firewalls and load balancers; the virtualized network connects compute (with VMotion) and storage.
In a broad sense, network virtualization, when properly designed, is similar to server vir-
tualization or hypervisor, in that a common physical network infrastructure is securely
shared among groups of users, applications, and devices.
Service Virtualization
Service virtualization in data centers refers to services such as firewall services for
additional security or load-balancing services for additional performance and reliability.
A common example is a load balancer that front-ends a group of web servers.
The virtual interface—often referred to as a virtual IP (VIP)—is exposed to the outside
world, representing itself as the actual web server, and it manages the connections to and
from the web server as needed. This enables the load balancer to manage multiple web
servers or applications as a single instance, providing a more secure and robust topology
than one allowing users direct access to individual web servers. This is a one-to-many vir-
tualization representation. One server is presented to the world, hiding the availability of
multiple servers behind a reverse proxy appliance.
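A minimal sketch of the one-to-many VIP idea, using only the Python standard library, is shown below. The backend addresses and port are placeholders; a real load balancer or reverse proxy adds health checks, session persistence, and security policy on top of this.

```python
# One listening address (the "VIP") distributes TCP connections across
# several backend web servers in round-robin fashion.
import itertools
import socket
import threading

BACKENDS = [("10.0.0.11", 80), ("10.0.0.12", 80), ("10.0.0.13", 80)]  # hypothetical
next_backend = itertools.cycle(BACKENDS)

def pipe(src, dst):
    # Copy bytes one way until either side closes.
    try:
        while (data := src.recv(4096)):
            dst.sendall(data)
    except OSError:
        pass
    finally:
        dst.close()

def handle(client):
    backend = socket.create_connection(next(next_backend))
    # Relay traffic in both directions between the client and the chosen backend.
    threading.Thread(target=pipe, args=(client, backend), daemon=True).start()
    threading.Thread(target=pipe, args=(backend, client), daemon=True).start()

vip = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
vip.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
vip.bind(("0.0.0.0", 8080))   # the single address clients see (8080 avoids needing root)
vip.listen()
while True:
    conn, _ = vip.accept()
    handle(conn)
```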
Virtualization Management
Virtualization management refers to coordinated provisioning and orchestration of virtualized
resources, as well as the runtime coordination of resource pools and virtual instances. This
feature includes the static and dynamic mapping of virtual resources to physical resources,
and also overall management capabilities such as capacity, analytics, billing, and SLAs.
Figure 1-4 illustrates how network, compute, and storage interact with the
management/orchestration layer, so the services can be provisioned in near real time.
Typically, the services are abstracted to a customer portal layer where the customer
selects the service, and the service is automatically provisioned using various domain and
middleware management systems along with Configuration Management Database
(CMDB), service catalog, accounting, and chargeback systems; SLA management; service
management; and service portal.
Figure 1-4 Service orchestration: client requests flow through the portal and middleware tools to service orchestration, which drives domain tools for server, network, and storage configuration management.
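The provisioning flow just described can be sketched as a small workflow. Every function below is a hypothetical stub standing in for a domain manager, the CMDB, or the chargeback system; it is not a real product API, only an illustration of how orchestration decomposes one portal request into domain actions.

```python
def provision_vm(cpu, ram_gb):
    print(f"server domain: created VM with {cpu} vCPU / {ram_gb} GB RAM")
    return "vm-01"

def provision_vlan(tenant):
    print(f"network domain: allocated VLAN for tenant {tenant}")
    return 101

def provision_volume(size_gb):
    print(f"storage domain: carved {size_gb} GB volume from the pool")
    return "vol-01"

def update_cmdb(record):
    print("CMDB: recorded", record)

def start_chargeback(tenant, record):
    print(f"chargeback: metering started for {tenant}")

def orchestrate(request):
    """Fulfil one service-catalog request end to end."""
    record = {
        "tenant": request["tenant"],
        "vm": provision_vm(request["cpu"], request["ram_gb"]),
        "vlan": provision_vlan(request["tenant"]),
        "volume": provision_volume(request["disk_gb"]),
    }
    update_cmdb(record)
    start_chargeback(request["tenant"], record)
    return record

# A customer picks an item in the service portal; the orchestrator does the rest.
orchestrate({"tenant": "acme", "cpu": 2, "ram_gb": 8, "disk_gb": 100})
```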
Cloud Computing
Cloud is the most hyped word in the world, and everyone in the industry has his own
definition. In our opinion, the National Institute of Standards and Technology (NIST)
provides the simplest definition for a cloud:
Cloud computing is a model for enabling convenient, on-demand network access to a
shared pool of configurable computing resources (e.g., networks, servers, storage,
applications, and services) that can be rapidly provisioned and released with minimal
management effort or service provider interaction.1
Gartner offers a similar definition:
A style of computing where massively scalable IT-related capabilities are provided ‘as
a service’ using Internet technologies to multiple external customers.2
So, what is cloud computing? From a “utility” perspective, cloud could be considered as
the fourth utility (after water, electricity, and telephony), which, as we and many others
believe, is the ultimate goal of cloud computing. Consider electricity and telephony (utili-
ty) services. When we come home or go to the office, we plug into the electric outlet and
get electricity as much and as long as we want without knowing how it is generated or
who the supplier is (we only know that we have to pay the bill at the end of each month
for the consumption). Similarly for telephony, we plug in, dial, and talk as long as we want
without knowing what kind of networks or service providers the conversation is traversing
through. With cloud as the fourth utility, we could plug in a monitor and get unlimited
computing and storage resources for as long and as much as we want. In the next phase of the
Internet, called cloud computing, we will assign computing tasks to a “cloud”—a
combination of compute, storage, and application resources accessed over a network—and we
will no longer care where our data is physically stored or where servers are physically
located, because we will use them (and pay for them) only when we need them. Cloud
providers deliver applications through the Internet that are accessed from a web browser,
while the business software and data are stored on servers at a remote location. Most
cloud computing infrastructures consist of services delivered through shared data centers.
The cloud appears as a single point of access for consumers’ computing needs, and many
cloud service providers provide service offerings on the cloud with specified SLAs.
The cloud will offer all of us amazing flexibility as we can specify the exact amount of
computing power, data, or applications we need for each task we are working on. It will
be inexpensive because we won’t need to invest our own capital and, with a network
of proven data centers and a solid infrastructure, it will be reliable. We will be able to
literally “plug into” the cloud instead of installing software to run on our own hardware.
Table 1-1 highlights some of the key cloud characteristics/features.
Scalability and elasticity: Rapidly scale the computing capabilities up or down, always elastically, to maintain cost efficiencies.

Pay per use: Capabilities are charged using a metered, fee-for-service or advertising-based billing model to promote optimization of resource use.

Ubiquitous access: Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thick, thin, or mobile client platforms. Security must be everywhere in the cloud, and access to the cloud through Internet devices must be secured to ensure data integrity and authenticity.
Table 1-2 outlines the various cloud deployment models, their characteristics, and a brief
description of each.
Public cloud (cloud infrastructure made available to the general public): Public cloud or external cloud describes cloud computing in the traditional mainstream sense. Public clouds are open to the general public or a large industry group and are owned and managed by a cloud service provider.

Private cloud (cloud infrastructure operated solely for an organization): Private cloud and internal cloud have been described as offerings that emulate cloud computing on private networks. Private clouds are operated solely for one organization. They can be managed by the organization itself or by a third party, and they can exist on premises or off premises. They have been criticized on the basis that users “still have to buy, build, and manage them” and, as such, do not benefit from lower up-front capital costs and less hands-on management.

Hybrid cloud (cloud infrastructure comprised of two or more public and private clouds that interoperate through technology): Combines two or more private and public clouds through technology that enables data and application portability. The goal of a hybrid cloud is that as you run out of capacity in a private cloud, you can quickly reach out to the public cloud for additional resources to continue to operate your business.
Service Models
Figure 1-5 shows service models and delivery models. All the services can be delivered on
any of the cloud delivery models.
Software as a Service (SaaS): The customer accesses the provider’s application running on the provider’s servers. Examples: Salesforce.com, Google Apps.

Platform as a Service (PaaS): The customer runs its applications on the provider’s servers using the provider’s operating systems and tools. Examples: Google’s App Engine, Force.com, MS Azure.

Infrastructure as a Service (IaaS): The customer uses, administers, and controls its operating system and applications running on providers’ servers. It can also include operating systems and virtualization technology to manage the resources. Examples: Amazon AWS, Savvis Symphony, Terremark’s vCloud Express, and Enterprise Cloud.
Figure 1-6 shows the service models and IT foundation, along with the major players.
Additional descriptions of the services are given in the list that follows.
The following list provides a description of the SaaS, PaaS, and IaaS services shown in
Figure 1-6:
■ PaaS: Delivers platform stacks to users and facilitates the deployment of applications without the
cost and complexity of buying and managing the underlying hardware and software.
The PaaS offerings typically attempt to support the use of the application by many
concurrent users by providing concurrency management, scalability, failover, and
security. The consumer does not manage or control the underlying cloud infrastruc-
ture, including network, servers, operating systems, or storage, but has control over
the deployed applications and possibly application hosting environment configura-
tions. Some of the major players of PaaS include Cisco (WebEx connect), Amazon
Web Services, Google, and Windows Azure.
The IT foundational hardware and software resources include items such as networks
comprised of switches, routers, firewalls, load balancers, and so on; server and storage
farms; and the software. Typically, the IT foundation is comprised of multivendor devices
and software. Some of the major players that supply IT foundational hardware and soft-
ware include Cisco, HP, IBM, Dell, VMware, Red Hat, Microsoft, and others.
The data from various surveys shows that key factors in the minds of IT personnel for
cloud adoption are security and integration. Although security and integration issues are
clearly users’ biggest fears about cloud computing, these concerns have not stopped com-
panies from implementing cloud-based applications within their organizations. Seventy
percent of IT decision makers using cloud computing are planning to move additional
solutions to the cloud within the next 12 months, recognizing the benefits of cloud
computing: ease of implementation, security features, and cost savings.3
Based on many discussions with customers and surveys, the following security and inte-
gration issues seem to be on many customers’ minds:
■ How to orchestrate among many new cloud tools and existing legacy tools
Although most of the surveys show that most customers are concerned about security
and integration, most of the successful organizations are taking calculated risks and
implementing the cloud with appropriate security measures. As many of you know, noth-
ing can be 100 percent secure, but by knowing the current state, one can apply appropri-
ate security measures to mitigate the risk and grow the business. More details on security
and integration are discussed in later chapters.
Figure 1-7 shows an example capacity-versus-usage curve in a typical data center,
compared with on-demand cloud IaaS provisioning. There is excess capacity
because of unnecessary capital expenditure early in the life cycle, and there is a shortage
of resources later in the life cycle. Without cloud IaaS, the planned resources are either
being wasted because the actual usage is less than the planned resources or there are not
enough resources available to meet the customer demand, resulting in customer dissatis-
faction and lost customers.
Figure 1-7 is a clear indication of why cloud IaaS is beneficial in preventing either over-
provisioning or underprovisioning to improve cost, revenue, and margins and provide the
required resources to match the dynamic demands of the customer. With cloud IaaS, the
provisioning of resources follows the demand curve (see the curves illustrated in Figure
1-7), and there is no wastage or shortage of resources.
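A simple worked comparison makes the point. All figures below are hypothetical, and the same unit price is assumed for both models purely to isolate the effect of paying for planned capacity versus paying for actual usage.

```python
# Fixed provisioning pays for planned peak capacity all year; pay-per-use
# IaaS pays only for what is actually consumed, so waste and shortage disappear.
monthly_demand = [20, 25, 30, 60, 90, 120, 80, 70, 50, 40, 30, 25]  # server-months needed
planned_capacity = 80          # servers bought up front
cost_per_server_month = 100.0  # same unit price assumed for both models

fixed_cost = planned_capacity * 12 * cost_per_server_month
wasted = sum(max(planned_capacity - d, 0) for d in monthly_demand)     # idle capacity
shortfall = sum(max(d - planned_capacity, 0) for d in monthly_demand)  # unmet demand

on_demand_cost = sum(monthly_demand) * cost_per_server_month           # pay per use

print(f"fixed provisioning: {fixed_cost:,.0f} (idle server-months: {wasted}, unmet demand: {shortfall})")
print(f"on-demand IaaS    : {on_demand_cost:,.0f} (no idle capacity, no unmet demand)")
```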
Based on the capacity-versus-usage curve and the cloud IaaS technological merits, some
of the economic benefits of cloud IaaS are outlined as follows:
■ Pay per usage of the resources: The end user pays only for the duration of use,
with no up-front cost.
■ The abstraction of infrastructure devices is typically done by the cloud provider, and
the end user is not locked into any physical devices. The end user gets the infrastruc-
ture required at the time of usage, through on-demand provisioning.
■ The end user gets service on demand and will be able to scale up or down, with no
planning cost or physical equipment cost. The cloud vendor that will be providing
the infrastructure will also have the benefit of using the spare capacity from the
devices anywhere under its control.
■ The end user access to applications, compute, and storage is unlimited and can be
from anywhere.
■ The end user capacity is unlimited, and the performance remains the same and is only
dictated by the agreed-upon SLAs.
You can find additional detailed information on ROI analysis from the white paper
“Building Return on Investment from Cloud Computing,” by the Open Group.5
Figure 1-7 Capacity versus usage over time: planned resources result in unnecessary capital expenditure (excess capacity) early in the life cycle and a shortage of resources later, relative to actual demand.
Summary
Virtualization is already taking place in most of the enterprises and service provider envi-
ronments, and cloud computing in the form of IaaS is taking place to a limited extent in
large enterprises and some service provider environments. Virtualization allows creating
virtual (logical) resources from multiple physical resources. Virtualization can be done in
compute (server) networks, router and switching networks, storage networks, and firewall
and load-balancing services, and management of virtualized resources can be done using
management tools such as provisioning, orchestration, and middleware tools. The terms
cloud computing and virtualization are sometimes used interchangeably, but that is incorrect. For example,
server virtualization provides flexibility to enable cloud computing, but that does not
make virtualization the same as cloud computing. There are many technologies that
enable cloud computing, and virtualization is one of them.
Cloud computing is the abstraction of underlying applications, information, content, and
resources, which allows resources to be provided and consumed in a more elastic and
on-demand manner. This abstraction also makes the underlying resources easier to manage
and provides the basis for more effective management of the applications themselves.
Clouds can provide almost immediate access to hardware resources without incurring
any up-front capital costs. This alone will provide incentive for many enterprises and
service providers to move to clouds, because it provides a quick return on investment.
References
1. The NIST Definition of Cloud Computing; refer to http://csrc.nist.gov/groups/SNS/cloud-computing.
2. The Gartner Definition of Cloud Computing; refer to http://www.gartner.com/it/page.jsp?id=707508.
3. www.mimecast.com/News-and-views/Press-releases/Dates/2010/2/70-Percent-of-Companies-Using-Cloud-Based-Services-Plan-to-Move-Additional-Applications-to-the-Cloud-in-the-Next-12-Months.
4. Amazon Web Services, AWS Economic Center, at http://aws.amazon.com/economics.
5. Building Return on Investment from Cloud Computing, The Open Group, at www.opengroup.org/cloud/whitepapers/ccroi/index.htm.
Chapter 2
Cloud Design Patterns and Use Cases
Upon completing this chapter, you will be able to understand the following:
■ How IaaS can be used by SaaS and PaaS services to provide greater agility and
management consistency
■ How IaaS forms a foundation for the other cloud service models
This chapter provides an overview of the components that make up a cloud deployment,
with particular emphasis on the Infrastructure as a Service (IaaS) service model.
Think about an example from the perspective of the consumer. Consumer A wants to
deploy a service to host content, whatever that might be. He can buy it as part of a new
hosting contract where he is simply given a space to upload files and manage the content
and charged on a monthly basis for that space. Changes to that content can happen in
real time; however, if, for example, the consumer needs to add server-side processing or
a backend database to persist data, he might have to wait several days for his provider to
revise the contract to add this to his current service. In this example, the consumer is
charged regardless of the activity of the web server and typically enters into a contract
that requires payment for six months to a year.
Alternatively, you could purchase a virtual instance from Amazon (EC2) that provides a
basic machine with set CPU, RAM, and disk size; deploy a standard web server image;
and modify content as required. When you need additional functionality, you could
deploy additional software or instances. If this isn’t what you need, you can simply
destroy the instance and stop paying.
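The create/use/destroy flow described above can be sketched with the AWS SDK for Python (boto3). The AMI ID, region, and instance type below are placeholders; valid AWS credentials are assumed, and running this for real would incur charges.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a basic machine with a set CPU/RAM/disk profile (the instance type).
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder web server image
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
)
instance_id = resp["Instances"][0]["InstanceId"]
print("running:", instance_id)

# ... deploy content; launch more instances if demand grows ...

# When the service is no longer needed, destroy the instance and stop paying.
ec2.terminate_instances(InstanceIds=[instance_id])
```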
The major difference between the traditional hosting model and the cloud, therefore, is
the fact that the consumer will be charged only for what he uses rather than a fixed fee.
If the web-hosting service is not busy, the consumer will potentially pay less. So, cloud is
not really offering a new service in this case. The service is still hosting, but the offering
and consumption models are different, potentially allowing the consumer to do more
with his money. A more topical example is the content provider Netflix, which to scale
with the expected demand, uses Amazon to host its platform. This means Netflix can
increase its capacity pretty much in real time and with minimal disruption. Had it needed
to add more physical capacity or renegotiate its existing hosting contract, it might not
have been able to meet the demand in the time frames it needed to. So, cloud does not in
itself increase the range of what is possible to do, but it does change what is practical
to do.
So, what is an IT service? Many definitions exist, but for the purposes of this chapter, an
IT service is defined as a set of IT resources organized in a specific topology providing or
supporting a specific business function. The decision to move a particular service into
the cloud should first be assessed to ensure that the cloud can support the topology and
that the business can accept the risk of the function being hosted in the cloud.
Furthermore, you need to
assess what cloud deployment model best suits the use case for cloud adoption that is
being proposed.
Design Patterns
Figure 2-1 illustrates the typical design patterns found in the large enterprise today
across its application base.
Figure 2-1 Typical enterprise design patterns: load balancing, search/query scatter and gather, and caching applied across the application and data tiers (load balancers, caches, application servers, and databases).
■ Load balancer: Where many instances/workers do the same job and a request is dis-
tributed among them using a load balancer that allocates requests and responds to the
requestor. This design pattern is common across all three tiers and is often used to
implement websites and/or business applications.
■ Scatter and gather: Where a request can be broken down into a number of separate
requests, distributed among a number of workers, and the separate worker responses
then consolidated and returned to the requestor. Search engines often use this
pattern, and it is common across the application and database tiers (a small sketch
follows this list).
■ Caching: Where prior to a request being allocated using either load balancer or scat-
ter and gather patterns, a cache is consulted that has stored all previous queries. If no
match is found, a request is issued to the workers. This design pattern is common
across all three tiers.
■ Others: As technology advances, design patterns such as MapReduce, blackboarding,
and so on might become more prevalent. It is not within the scope of this book to try
to predict what will succeed in a cloud. What is clear is that where horizontal scaling
based on load is required, IaaS is a good platform to host this type of design pattern.
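The scatter-and-gather pattern referenced above can be illustrated with a minimal plain-Python sketch. The worker "shards" and their contents are entirely hypothetical stand-ins for the separate workers a search engine would query in parallel.

```python
# Break one query into parallel sub-requests, then consolidate the results.
from concurrent.futures import ThreadPoolExecutor

SHARDS = ["shard-a", "shard-b", "shard-c"]   # stand-ins for worker nodes

def search_shard(shard, query):
    # In a real system this would be a network call to one worker/index shard.
    fake_index = {"shard-a": ["doc1", "doc4"],
                  "shard-b": ["doc2"],
                  "shard-c": ["doc3", "doc5"]}
    return [doc for doc in fake_index[shard] if query]  # trivially "matches" everything

def scatter_gather(query):
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(lambda s: search_shard(s, query), SHARDS))  # scatter
    return sorted(doc for part in partials for doc in part)                  # gather

print(scatter_gather("cloud"))   # ['doc1', 'doc2', 'doc3', 'doc4', 'doc5']
```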
A good example of these other design patterns is the use of “infrastructure containers.”
For example, VMware describes a virtual data center as follows: “A vCloud virtual data-
center (vDC) is an allocation mechanism for resources such as networks, storage, CPU,
and memory. In a vDC, computing resources are fully virtualized and can be allocated
based on demand, service-level requirements, or a combination of the two. There are two
kinds of vDCs:
■ Provider vDCs: These vDCs contain all the resources available from the vCloud serv-
ice provider. Provider vDCs are created and managed by vCloud system administrators.
■ Organization vDCs: These vDCs provide an environment where virtual systems can
be stored, deployed, and operated. They also provide storage for virtual media, such
as floppy disks and CD-ROMs.
An organization administrator specifies how resources from a provider vDC are distrib-
uted to the vDCs in an organization.”
VMware vDCs offer a good way to abstract out the complexities of a tenant-specific
topology, such as Layer 2 separation and firewalls, as well as provide a way to manage
resources. Another variation on this theme is Cisco Network Containers, which have been
submitted to OpenStack as the basis for Network as a Service, a further abstraction of the
IaaS service model that allows complex network topologies to be hidden from the end
user and consumed as part of a larger design pattern. In Figure 2-2, you can see that the
three load balancer design patterns have been instantiated. However, the load-balancing
capability, Layer 2 isolation and security, and Layer 3 isolation have all been instantiated
in a network container that runs on a set of physical network devices. This allows the
application developer to concentrate on the application capabilities and not worry about
the network topology implementation specifics. The application developer simply connects
his virtual machines to a specific zone in a specific network container, and load balancing,
firewalling, addressing, and so on are all handled by the network container.
Figure 2-2 A network container built from a Cisco ACE load balancer, Cisco ASA firewall, and Cisco Nexus 7000 switch, providing server VLANs/zones (Zone2 on VLAN 2 for application VMs, Zone3 on VLAN 3 for database VMs) connected by a transit VLAN.
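To show what the developer-facing abstraction might look like, the sketch below expresses a container like the one in Figure 2-2 as plain data. It is illustrative only and does not represent the vCloud, OpenStack, or Cisco container formats; all names and addresses are made up.

```python
# A network container as data: the developer attaches VMs to a zone, and the
# container supplies VLANs, firewalling, addressing, and load balancing.
network_container = {
    "name": "tenant-blue",
    "load_balancer_vip": "192.0.2.10",          # documentation-range address
    "firewall_policy": "zone2-to-zone3-db-only",
    "zones": {
        "zone2": {"vlan": 2, "role": "application servers"},
        "zone3": {"vlan": 3, "role": "database servers"},
    },
}

def attach_vm(container, vm_name, zone):
    """The developer specifies only the VM and the zone; everything else
    (VLAN, firewall rules, load balancing) comes from the container definition."""
    vlan = container["zones"][zone]["vlan"]
    print(f"{vm_name} attached to {container['name']}/{zone} (VLAN {vlan})")

attach_vm(network_container, "app-vm-1", "zone2")
attach_vm(network_container, "db-vm-1", "zone3")
```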
Understanding what design patterns are being used in the enterprise will allow you to
understand the application topology and the basic building blocks required to support
those topologies in the cloud. A broader question, however, is do I want to risk moving
my application and its data into the cloud?
Gartner defines three classifications of applications:
■ Systems of record that record data about the business and business interactions.
■ Systems of differentiation that differentiate how a business is perceived and operates.
■ Systems of innovation, which can offer real value to businesses. Systems of innovation
can benefit from the elasticity and rapid provisioning capabilities provided by cloud,
but the dynamic and unfiltered nature of their data can make it unwise to host these
applications outside the corporate boundary.
Development and Test: High capital outlay, cyclical usage—IaaS, PaaS.
The setup and maintenance of corporate development and test environments is both
labor- and cost-intensive, and given the cyclical nature of application development and
hardware refreshes, these environments are often underused for long periods and out of
date when they are required. Virtualization has helped in
reducing the hardware refresh requirements as virtual machines can have memory and
CPU added on the fly and ported relatively seamlessly among hardware platforms; how-
ever, hardware is still required and will be underutilized. In addition, how does the enter-
prise cope with a pizza- and Coke-fueled developer who has suddenly had a eureka
moment and understands how he can make a company application run three times faster
and wants to start working on it at 3 a.m. but needs a new database server? Cloud (IaaS
and PaaS) offers the enterprise the ability to meet these requirements of flexibility, on-
demand self-service, and utilization. (The question about which deployment model to
use—private or public—will be discussed in the next section.)
Business continuity and disaster recovery are two key areas that are critical to any
enterprise. Business continuity can be seen as the processes and tools that allow critical
business functions to be accessed by customers, employees, and staff at any time and
encompass, among other things, service desk, change management, backup, and disaster
recovery. In the same way the development and test environments can be underutilized,
the systems that support the business continuity processes can also be underutilized
when a business is operating normally; however, in the case of a failure, for example, the
help desk might experience much greater demand. In the case of a major outage or “disas-
ter,” moving or switching these applications to a secondary site to ensure that users can
still access applications, raise cases, or access backup data is critical. Cloud can clearly
assist with enhanced utilization through its use of virtualization technology or reduce
cost by using a SaaS “pay as you use” model for help desk or change management soft-
ware. The horizontal scaling of applications by creating new instances of a particular
application to cope with demand is something that IaaS will support, along with the
failover to a backup or secondary application hosted on a second public or private cloud,
or even splitting loads across both in a hybrid cloud model.
Desktop management is a well-known issue within large enterprises with a large amount
of operational resources being deployed to troubleshoot, patch, modify, and secure a
variety of desktop configurations. The introduction and benefits of Virtual Desktop
Infrastructure (VDI), which allows a user to connect to a centrally managed desktop, are
well known and will not be discussed here, but cloud (IaaS) can assist with the self-man-
agement of that desktop, the cloning of new desktops, and the management of the master
image. In addition, charging and billing of the use of VDI resources can also be provided
by utilizing a cloud-based VDI solution.
Pure storage-based services, such as file backup, image backup, and .ISO storage, are
typically required in any large enterprise and consume many terabytes of capacity. When
funded and allocated at the line of business (LOB) or business unit (BU) level, this again
can lead to a large capital outlay and underutilization. A cloud (subset of IaaS) solution
can help drive higher utilization but still provide the same level of flexibility and self-
service that would be available if the resources were allocated to an individual LOB or
BU. In addition, the storage cloud can utilize other cloud use cases such as disaster recov-
ery and chargeback to provide a more comprehensive service.
Compute-on-demand services are the foundation for any IaaS cloud, regardless of the
use case. The consumer simply wants to increase business agility by being able to either
supplement existing services to cope with demand or to support new ventures quickly for
a period of time. All the IaaS use cases depend on the ability of the cloud provider to
support compute on demand; however, the solution to this should not simply be seen as
deploying server virtualization technologies, such as VMware ESX or Microsoft HyperV.
Many applications cannot or will not be virtualized; therefore, the need to include physi-
cal compute-on-demand services in any cloud offering is often overlooked. Although the
provisioning of physical servers will not be as common as that of virtual ones, it is a
necessary capability, and it requires supporting capabilities to provision the physical
server, its operating system and applications, and the physical storage and networking
that are no longer encapsulated at the hypervisor level.
Deployment Models
So far, this chapter has focused on use cases and design patterns that could exist within a
private cloud hosted in an enterprise data center or on a public cloud accessed publicly
through the Internet or privately through a corporate IP VPN service provided by a
telecommunications (telco) company. Chapter 1 discussed these models; Table 2-2
describes the most prevalent models.
Public cloud, or virtual private cloud as it’s sometimes offered by service providers, sup-
ports a set of standard design patterns and use cases. For example, Amazon Web Services
(AWS) EC2 supports a load balancing design pattern within, arguably, a single tier. With
its use of availability zones and autoscaling, it would support compute-on-demand, usage
monitoring, and business continuity use cases, but the specific requirements for a given
use case would need to be evaluated. A key principle of a public cloud is that it uses a
shared infrastructure to host multiple tenants within the provider’s own data center and uses
a common security model to provide isolation among these tenants. All management of
the cloud is performed by the provider, and the consumer is simply billed on usage.
A private cloud is built and operated by IT operations, so it will support any use cases
that the internal consumer wants to describe as part of the overall cloud reference model.
Although multiple LOBs or BUs might use the same infrastructure, the security model
might not need to be as complex as a public cloud as the cloud infrastructure is hosted
on-premises and data is typically stored within the enterprise. Building a private cloud
does require the enterprise to build or already have significant technical capabilities in
virtualization, networking, storage, and management (to name a few), but it will mean that
that enterprise can fully leverage the cloud and potentially develop new revenue opportu-
nities or business models. In addition, the enterprise must invest in the initial infrastruc-
ture to support the cloud services and sustain the investment as capacity is exhausted and
new infrastructure is required.
The third option is to allow a service provider to build and run a private cloud hosted
either in its data center or on-premises. A hosted cloud is an ideal solution to those enter-
prises that want to leverage cloud services with low investment and meet any security or
compliance concerns. Unlike in a public cloud, the infrastructure in a hosted
cloud is dedicated to a specific tenant during normal operational hours. There is an
option to lease back resources during off-peak hours to the service provider, which might
allow the application of discounts to the overall bill; however, some tenants might feel
uncomfortable with different workloads running in their data center. Hosted cloud does
mean that the enterprise doesn’t have to build or invest in a cloud competency center, and
if the provider does offer public cloud services, it might be possible to migrate workloads
between a low-cost public cloud and a higher-cost private cloud as required and vice
versa. Figure 2-4 illustrates this concept.
Figure 2-4 Workload (application and operating system) migration and mobility between the current on-premises state, a private cloud, and off-premises clouds.
IaaS as a Foundation
So far, we’ve looked at the components that make up a cloud service from the consumer
viewpoint—the use case, design patterns, and deployment models they can consider
when using or deploying cloud. Chapter 1 also looked at the different cloud service mod-
els: Infrastructure, Platform, and Software as a Service. This section describes why,
for the service provider, IaaS is a foundation for the other two service models and
describes the components that are required in IaaS to support the typical use cases and
design patterns described. Review the SaaS and PaaS definitions1 from Chapter 1:
■ SaaS: The capability provided to the consumer to use the provider’s applications
running on a cloud infrastructure
■ PaaS: The capability provided to the consumer to deploy onto the cloud infrastruc-
ture consumer-created or -acquired applications created using programming lan-
guages and tools supported by the provider
Both SaaS and PaaS increase the provider’s responsibility for a service over and above
that of IaaS. Given that the provider still needs to provide the same type of characteristics
of self-service and elasticity to fulfill basic cloud requirements, it makes sense to deploy
SaaS and PaaS on top of an IaaS solution to leverage the capabilities, systems, and
processes that are fundamental to IaaS. This doesn’t mean that it is obligatory for a SaaS
or PaaS provider to deploy IaaS first; however, it will allow a SaaS and PaaS solution to
scale more easily. The rest of this section takes a high-level look at what components
make up the different service models; Figure 2-5 describes the components from both a
consumer and provider perspective.
Figure 2-5 Cloud service model components: facilities, infrastructure, basic resources, operating system, data, content, and presentation layers, exposed through a portal and catalogue and a production API, with fulfillment, assurance, and billing functions; the provider’s responsibility increases from IaaS through PaaS to SaaS.
One of the major points to note about Figure 2-5 is that the application stack is included
in all service models. From a standard definition perspective, IaaS is often seen as pure
infrastructure, so no operating system or applications are provided on the servers, but in
practice, most cloud providers will include applications as a service option. However, the
nature of the applications within each service model is different as discussed in the list
that follows:
■ SaaS: The consumer is only really interested in using the finished application, so con-
tent and data are paramount. Presentation might also be a significant contributing
factor if multiple devices (smartphones, laptops, and so on) are used to access the ap-
plication. Metadata might also be relevant at this level if different tenants are given
different application “skins.”
Both SaaS and PaaS require servers, operating systems, and so on to support the applica-
tions that are consumed by the service users. As the service provider’s level of responsi-
bility for those applications increases, so does the need to manage them in an effective
manner. Looking at SaaS, specifically where the application might not be inherently mul-
titenant (for example, where a managed service provider wants to offer some form of
monitoring application on top of an IP telephony solution), you can see that a new
instance of an application must be created per tenant. Doing this manually will be time-
consuming and costly for the service provider, so using an IaaS solution to quickly
deploy new instances makes a lot of business sense. It also means that there is a process
available to support other new single-tenant applications, adding new instances to cope
with load (horizontal scaling) or changing the vRAM or vCPU of an existing application
to cope with load (vertical scaling) in an automated and consistent manner. IaaS/PaaS
solution users will often need access to OS libraries, patches, and code samples as well as
have the ability to back up or snapshot the machine, which are key capabilities of any
IaaS solution. Rather than building this functionality specifically for SaaS or PaaS,
building a foundational IaaS layer, even if IaaS services are not being offered directly,
makes the architecture more flexible and agile. Building the infra-
structure in a consistent manner also means that it is possible to manage, bill, or charge in
a consistent manner, regardless of whether the infrastructure is being used for IaaS, PaaS,
SaaS, or a combination of all of them.
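The two scaling approaches mentioned above can be sketched very simply. The thresholds, load figures, and resource sizes below are illustrative assumptions, not values from any particular product.

```python
# Horizontal scaling: add instances to spread load. Vertical scaling: grow an
# existing instance's vCPU/vRAM when it runs hot.
def scale_horizontally(instances, load_per_instance, max_load=0.7):
    """Add application instances until per-instance load drops to the threshold."""
    while load_per_instance(instances) > max_load:
        instances += 1
    return instances

def scale_vertically(vcpu, vram_gb, cpu_util, mem_util, threshold=0.8):
    """Double the hot resource of a single instance."""
    if cpu_util > threshold:
        vcpu *= 2
    if mem_util > threshold:
        vram_gb *= 2
    return vcpu, vram_gb

# Example: total demand of 6.3 "load units" spread over N identical instances.
print(scale_horizontally(4, lambda n: 6.3 / n))              # -> 9 instances
print(scale_vertically(2, 8, cpu_util=0.9, mem_util=0.5))    # -> (4, 8)
```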
A cloud consumer will typically not care how a service is fulfilled or managed and will
look for the provider to maintain responsibility for the service levels, governance, change,
and configuration management throughout the lifetime of his service instance, regardless
of whether it’s just for the infrastructure, the platform, or the software. Therefore, the
service provider needs to manage the fulfillment, assurance, and optionally billing/charge-
back of that service consistently. In addition, if the provider offers more than one service
model (for example, SaaS + PaaS), these functions ideally will be offered by an integrated
management stack rather than different silos.
Regardless of the application design pattern, use case, or deployment model required by
the cloud consumer, the consumer operating model chosen for the cloud is critical and
forms the basis for the cloud reference architecture discussed in the next chapter.
Figure 2-6 Cloud operating model: organization, service portfolio, governance, SLA management, and technology architecture.
Central to all operating models is an understanding of the use cases and design patterns
that will be hosted in the cloud solution. After these two things are understood and
agreed upon, the following areas must be addressed:
■ Organization: How the consumer structures and organizes his IT capabilities to sup-
port a more on-demand, utility model when resources can be hosted off-site or by a
third party.
■ Service portfolio: What services will be offered internally and to whom, and how
new services will be created.
■ Processes: What processes are changed by the move to utility computing and what
those changes are.
■ SLA management: What service-level agreement (SLA) will be provided to the end
user and therefore needs to be provided between the organization and the cloud
provider (which can be the IT department in the case of private cloud).
■ Supplier management: What cloud provider or tool vendor will be selected, the
selection criteria, the type of licensing, and contractual models that will be used.
■ Governance: How the consumer makes, prioritizes, and manages decisions about
cloud and mitigates risk as a utility computing model is introduced.
Those familiar with The Open Group Architecture Framework (TOGAF) Architecture
Development Method (ADM) will see some similarities between Figure 2-6 and the
ADM. This is intentional because cloud is a business transformation more than a technol-
ogy transformation, so the operating model and management reference architecture
described in the subsequent chapters are critical components of a cloud strategy.
Summary
The adoption of cloud is a business transformation first and a technology one second.
When looking at adopting a cloud consumption model, the following aspects should be
well understood:
■ What design patterns and use cases need to be supported in the cloud?
■ What deployment model provides the best business advantage while still conforming
to any regulatory or security requirements?
The way to answer these questions is through the development of a cloud operating
model that encompasses the organizational, process, and technology changes required to
adapt to utility computing.
References
1. The NIST Definition of Cloud Computing, at http://csrc.nist.gov/groups/SNS/cloud-computing.
2. The Open Group TOGAF ADM, at www.opengroup.org/architecture/togaf8-doc/arch/chap03.html.
3. VMware vCloud API, at www.vmware.com/pdf/vcd_10_api_guide.pdf.
4. Cisco Network Containers @ OpenStack, at http://wiki.openstack.org/NetworkContainers.
Chapter 3
Data Center Architecture and Technologies
Upon completing this chapter, you will be able to understand the following:
■ Describe the cloud Infrastructure as a Service (IaaS) solution and its functional
components
Architecture
Architecture is a borrowed term that is often overused in technology forums. The Oxford
English Dictionary defines architecture as “the art or practice of designing and construct-
ing buildings” and further, “the conceptual structure and logical organization of a com-
puter or computer-based system.”
In general, outside the world of civil engineering, the term architecture is a poorly under-
stood concept. Although we can understand the concrete concept of a building and the
Even though architecture involves some well-defined activities, our first attempt at a defi-
nition uses the words art along with science. Unfortunately, for practical purposes, this
definition is much too vague. But, one thing the definition does indirectly tell us is that
architecture is simply part of the process of building things. For example, when building
a new services platform, it is being built for a purpose and, when complete, is expected to
have certain required principles.
Figure: Architectures, designs, and implementations. An architecture is a finite list of components and protocols that are common to all designs and implementations; one architecture can be realized by multiple designs, and each design by multiple implementations.
With regard to cloud services, architecture must extend beyond on-premises (private
cloud) deployments to support hybrid cloud models (hosted cloud, public cloud, commu-
nity cloud, virtual private cloud, and so on). Architecture must also take into considera-
tion Web 2.0 technologies (consumer social media services) and data access ubiquity
(mobility).
Architectural principles that are required for a services platform today would most likely
include but not be limited to efficiency, scalability, reliability, interoperability, flexibility,
robustness, and modularity. How these principles are designed and implemented into a
solution changes all the time as technology evolves.
■ Make data and resources available in real time to provide flexibility and alignment
with current and future business agility needs.
■ Reduce power and cooling consumption to cut operational costs and align with
“green” business practices.
■ Maintain information assurance through consistent and robust security posture and
processes.
From this set of challenges, you can derive a set of architectural principles that a new
services platform would need to exhibit (as outlined in Table 3-1) to address the afore-
mentioned challenges. Those architectural principles can in turn be matched to a set of
underpinning technological requirements.
Scalability: Platform scalability can be achieved through explicit protocol choice (for example, TRILL) and hardware selection, and also through implicit system design and implementation.

Reliability: Disaster recovery (BCP) planning, testing, and operational tools (for example, VMware’s Site Recovery Manager, SNAP, or Clone backup capabilities).

Interoperability: Web-based (XML) APIs, for example, WSDL (W3C) using SOAP or the conceptually simpler RESTful protocol, with standards-compliance semantics, for example, RFC 4741 NETCONF or TMForum’s Multi-Technology Operations Systems Interface (MTOSI) with message binding to “concrete” endpoint protocols.

Modularity: Commonality of the underlying building blocks that can support scale-out and scale-up heterogeneous workload requirements with common integration points (web-based APIs), that is, integrated compute stacks or infrastructure packages (for example, a Vblock or a FlexPod). Programmatic workflows versus script-based workflows (discussed later in this chapter), along with the aforementioned software abstraction, help deliver modularity of software tools.
Over the last few years, there have been iterative developments to the virtual infrastruc-
ture. Basic hypervisor technology with relatively simple virtual switches embedded in the
hypervisor/VMM kernel have given way to far more sophisticated third-party distributed
virtual switches (DVS) (for example, the Cisco Nexus 1000V) that bring together the
operational domains of virtual server and the network, delivering consistent and integrat-
ed policy deployments. Other use cases, such as live migration of a VM, require orches-
tration of (physical and virtual) server, network, storage, and other dependencies to
enable uninterrupted service continuity. Placement of capability and function needs to be
carefully considered. Not every capability and function will have an optimal substantia-
tion as a virtual entity; some might require physical substantiation because of perform-
ance or compliance reasons. So going forward, we see a hybrid model taking shape, with
each capability and function being assessed for optimal placement with the architecture
and design.
Although data center performance requirements are growing, IT managers are seeking
ways to limit physical expansion by increasing the utilization of current resources. Server
consolidation by means of server virtualization has become an appealing option. The use
of multiple virtual machines takes full advantage of a physical server’s computing poten-
tial and enables a rapid response to shifting data center demands. This rapid increase in
computing power, coupled with the increased use of VM environments, is increasing the
demand for higher bandwidth and at the same time creating additional challenges for the
supporting networks.
Power consumption and efficiency continue to be some of the top concerns facing data
center operators and designers. Data center facilities are designed with a specific power
budget, in kilowatts per rack (or watts per square foot). Per-rack power consumption and
cooling capacity have steadily increased over the past several years. Growth in the num-
ber of servers and advancement in electronic components continue to consume power at
an exponentially increasing rate. Per-rack power requirements constrain the number of
racks a data center can support, resulting in data centers that are out of capacity even
though there is plenty of unused space.
Several metrics exist today that can help determine how efficient a data center operation
is. These metrics apply differently to different types of systems, for example, facilities,
network, server, and storage systems. For example, Cisco IT uses a measure of power per
work unit performed instead of a measure of power per port because the latter approach
does not account for certain use cases—the availability, power capacity, and density pro-
file of mail, file, and print services will be very different from those of mission-critical
web and security services. Furthermore, Cisco IT recognizes that just a measure of the
network is not indicative of the entire data center operation. This is one of several rea-
sons why Cisco has joined The Green Grid (www.thegreengrid.org), which focuses on
developing data center–wide metrics for power efficiency. The power usage effectiveness
(PUE) and data center efficiency (DCE) metrics detailed in the document “The Green
Grid Metrics: Describing Data Center Power Efficiency” are ways to start addressing this
challenge. Typically, the largest consumer of power and the most inefficient system in the
data center is the Computer Room Air Conditioning (CRAC). At the time of this writing,
state-of-the-art data centers have PUE values in the region of 1.2/1.1, whereas typical val-
ues would be in the range of 1.8–2.5. (For further reading on data center facilities, check
out the book Build the Best Data Center Facility for Your Business, by Douglas Alger
from Cisco Press.)
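Based on The Green Grid definitions referenced above, PUE relates total facility power to the power delivered to the IT equipment, and DCE is its reciprocal; the example figures that follow are illustrative only.

```latex
\mathrm{PUE} = \frac{\text{Total facility power}}{\text{IT equipment power}},
\qquad
\mathrm{DCE} = \frac{\text{IT equipment power}}{\text{Total facility power}} = \frac{1}{\mathrm{PUE}}
```

For example, a facility drawing 1.8 MW in total while delivering 1.0 MW to the IT equipment has PUE = 1.8 and DCE ≈ 0.56; most of the remaining 0.8 MW is consumed by cooling (CRAC) and power distribution, which is why PUE values near 1.1–1.2 are considered state of the art.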
Cabling also represents a significant portion of a typical data center budget. Cable sprawl
can limit data center deployments by obstructing airflows and requiring complex cooling
system solutions. IT departments around the world are looking for innovative solutions
that will enable them to keep up with this rapid growth with increased efficiency and low
cost. We will discuss Unified Fabric (enabled by virtualization of network I/O) later in
this chapter.
Trust in the cloud, Cisco believes, centers on five core concepts. These challenges keep
business leaders and IT professionals alike up at night, and Cisco is working to address
them with our partners:
■ Security: Are there sufficient information assurance (IA) processes and tools to en-
force confidentiality, integrity, and availability of the corporate data assets? Fears
around multitenancy, the ability to monitor and record effectively, and the trans-
parency of security events are foremost in customers’ minds.
■ Control: Can IT maintain direct control to decide how and where data and soft-
ware are deployed, used, and destroyed in a multitenant and virtual, morphing
infrastructure?
■ Service-level management: Is it reliable? That is, can the appropriate Resource Usage
Records (RUR) be obtained and measured appropriately for accurate billing? What if
there’s an outage? Can each application get the necessary resources and priority needed
to run predictably in the cloud (capacity planning and business continuance planning)?
■ Compliance: Will my cloud environment conform with mandated regulatory, legal, and
general industry requirements (for example, PCI DSS, HIPAA, and Sarbanes-Oxley)?
■ Interoperability: Will there be a vendor lock-in given the proprietary nature of to-
day’s public clouds? The Internet today has proven popular to enterprise businesses
in part because of the ability to reduce risk through “multihoming” network connec-
tivity to multiple Internet service providers that have diverse and distinct physical
infrastructures.
For cloud solutions to be truly secure and trusted, Cisco believes they need an underly-
ing network that can be relied upon to support cloud workloads.
To solve some of these fundamental challenges in the data center, many organizations are
undertaking a journey. Figure 3-4 represents the general direction in which the IT indus-
try is heading. The figure maps the operational phases (Consolidation, Virtualization,
Automation, and so on) to enabling technology phases (Unified Fabric, Unified
Computing, and so on).
Figure 3-4 Operational Phases, Governance, and IT Elements Mapped to Technology Evolution
Organizations that are moving toward the adoption and utilization of cloud services tend
to follow these technological phases:
1. Adoption of a broad IP WAN that is highly available (either through an ISP or self-
built over dark fiber) enables centralization and consolidation of IT services.
Application-aware services are layered on top of the WAN to intelligently manage ap-
plication performance.
4. Utility computing model includes the ability to meter, charge back, and bill customers on a pay-as-you-use (PAYU) basis. Showback is also a popular service: the ability to show current, real-time service and quota usage/consumption, including future trending. This allows customers to understand and control their IT consumption. Showback is a fundamental requirement of service transparency.
■ Performance monitoring: Both in the network (transactions) and in the data center
(application processing).
■ Application visibility and control: Application control gives service providers dy-
namic and adaptive tools to monitor and assure application performance.
However, there is a price to pay for all this virtualization: management complexity. As virtual resources become abstracted from physical resources, existing management tools and methodologies start to break down in terms of their control effectiveness, particularly as scale is added to the equation. New management capabilities, either implicit within infrastructure components or explicit in external management tools, are required to provide the visibility and control that service operations teams need to manage risk to the business.
Unified Fabric, based on IEEE Data Center Bridging (DCB) standards (more on this later), is another form of abstraction, this time virtualizing Ethernet. This technology unifies the way that servers and storage resources are connected, how application delivery and core data center services are provisioned, how servers and data center resources are interconnected to scale, and how server and network virtualization is orchestrated.
To complement the usage of VMs, virtual applications (vApp) have also been brought
into the data center architecture to provide policy enforcement within the new virtual
infrastructure, again to help manage risk. Virtual machine-aware network services such as
VMware’s vShield and Virtual Network Services from Cisco allow administrators to pro-
vide services that are aware of tenant ownership of VMs and enforce service domain iso-
lation (that is, the DMZ). The Cisco Virtual Network Services solution is also aware of
the location of VMs. Ultimately, this technology allows the administrator to tie together
service policy to location and ownership of an application residing with a VM container.
The Cisco Nexus 1000V vPath technology allows policy-based traffic steering to "invoke" vApp services (also known as policy enforcement points [PEP]), even if they reside on a separate physical ESX host. This is the start of Intelligent Service Fabrics (ISF), where the traditional IP or MAC-based forwarding behavior is "policy hijacked" to instantiate service chain–based forwarding behavior.
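To make policy-based traffic steering concrete, the following minimal sketch (not vPath code; the class names, policy fields, and service names are invented for illustration) shows how a classification lookup can redirect matching flows through an ordered chain of policy enforcement points before normal destination-based forwarding resumes.

# Minimal illustration of policy-based traffic steering into a service chain.
# This is NOT the Nexus 1000V vPath implementation; all names are hypothetical.
from dataclasses import dataclass
from typing import List

@dataclass
class Flow:
    src_ip: str
    dst_ip: str
    dst_port: int

@dataclass
class SteeringPolicy:
    dst_port: int                 # classification key (illustrative)
    service_chain: List[str]      # ordered policy enforcement points (PEPs)

POLICIES = [SteeringPolicy(dst_port=443, service_chain=["vFirewall", "vSLB"])]

def forward(flow: Flow) -> List[str]:
    """Return the ordered list of hops the flow traverses."""
    for policy in POLICIES:
        if flow.dst_port == policy.dst_port:
            # Policy match: "hijack" forwarding through the service chain first.
            return policy.service_chain + [f"deliver->{flow.dst_ip}"]
    return [f"deliver->{flow.dst_ip}"]          # traditional destination forwarding

print(forward(Flow("10.1.1.10", "10.2.2.20", 443)))
# ['vFirewall', 'vSLB', 'deliver->10.2.2.20']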
Server and network virtualization have been driven primarily by the economic benefits of
consolidation and higher utilization of physical server and network assets. vApps and ISF
change the economics through efficiency gains of providing network-residing services
that can be invoked on demand and dimensioned to need rather than to the design con-
straints of the traditional traffic steering methods.
Virtualization, or rather the act of abstraction from the underlying physical infrastruc-
ture, provides the basis of new types of IT services that potentially can be more dynamic
in nature, as illustrated in Figure 3-5.
Figure 3-5 Service Evolution from Static, Dedicated Resources to On-Demand, Virtualized Applications, Servers, Storage, and Networks (spanning the application, compute, storage, network, and facilities layers)
assets. In other words, if an architect wants or needs to change an IT asset (for example, a
server type/supplier) or change the workflow or process execution logic within a work-
flow step/node in response to a business need, a lot of new scripting is required. It’s like
building a LEGO brick wall with all the bricks glued together. More often than not, a new
wall is cheaper and easier to develop than trying to replace or change individual blocks.
Two main developments have now made service automation a more economically viable
option:
■ Standards-based web APIs and protocols (for example, SOAP and RESTful) have
helped reduce integration complexity and costs through the ability to reuse.
In many IT environments today, dedicated physical servers and their associated applica-
tions, as well as maintenance and licensing costs, can be mapped to the department using
them, making the billing relatively straightforward for such resources. In a shared virtual
environment, however, the task of calculating the IT operational cost for each consumer
in real time is a challenging problem to solve.
Pay for use, where the end customers are charged based on their usage and consumption
of a service, has long been used by such businesses as utilities and wireless phone
providers. Increasingly, pay-per-use has gained acceptance in enterprise computing as IT
works in parallel to lower costs across infrastructures, applications, and services.
One of the top concerns of IT leadership teams implementing a utility platform is this: If
the promise of pay-per-use is driving service adoption in a cloud, how do the providers
of the service track the service usage and bill for it accordingly?
IT providers have typically struggled with billing solution metrics that do not adequately
represent all the resources consumed as part of a given service. The primary goal of any
chargeback solution requires consistent visibility into the infrastructure to meter resource
usage per customer and the cost to serve for a given service. Today, this often requires
cobbling together multiple solutions or even developing custom solutions for metering.
This creates not only up-front costs, but longer-term inefficiencies. IT providers quickly
become overwhelmed building new functionality into the metering system every time
they add a service or infrastructure component.
The dynamic nature of a virtual converged infrastructure and its associated layers of abstraction, while a benefit to the IT operation, conversely increases the metering complexity. An optimal chargeback solution provides businesses with the true allocation breakdown of costs and services delivered in a converged infrastructure.
The business goals for metering and chargeback typically include the following:
Step 2. Chargeback mediation (correlating and aggregating data collected from the various system components into a billing record for the service owner customer)
Step 3. Billing and reporting (applying the pricing model to the collected data and generating a periodic billing report), as sketched in the example that follows
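A minimal sketch of these mediation and billing steps is shown below; the tenants, usage records, and rates are invented purely to illustrate the flow from raw collected data to a periodic bill.

# Hypothetical sketch of chargeback mediation and billing.
from collections import defaultdict

# Raw usage records from various system components (output of the collection step).
usage_records = [
    {"tenant": "HR",  "resource": "vcpu_hours",    "amount": 120},
    {"tenant": "HR",  "resource": "storage_gb_mo", "amount": 500},
    {"tenant": "Eng", "resource": "vcpu_hours",    "amount": 900},
]

# Illustrative price book (pay-as-you-use rates).
rates = {"vcpu_hours": 0.05, "storage_gb_mo": 0.10}

def mediate(records):
    """Mediation: correlate and aggregate records into one billing record per tenant."""
    billing = defaultdict(lambda: defaultdict(float))
    for rec in records:
        billing[rec["tenant"]][rec["resource"]] += rec["amount"]
    return billing

def bill(billing, rates):
    """Billing and reporting: apply the pricing model to the aggregated data."""
    return {tenant: round(sum(rates[r] * amt for r, amt in usage.items()), 2)
            for tenant, usage in billing.items()}

print(bill(mediate(usage_records), rates))   # {'HR': 56.0, 'Eng': 45.0}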
Phase 5: Market
In mainstream economics, the concept of a market is any structure that allows buyers and
sellers to exchange any type of goods, services, and information. The exchange of goods
or services for money (an agreed-upon medium of exchange) is a transaction.
■ Business: The business aspect is required for the marketplace, and a technical aspect is required for exchange and delivery. The business part needs product definitions, relationships (ontology), collateral, pricing, and so on.
■ Technical: The technical aspect needs fulfillment, assurance, and governance aspects.
In the marketplace, various players/participants will take on a variety or combination of roles: exchange providers (also known as service aggregators or cloud service brokers), service developers, product manufacturers, service providers, service resellers, service integrators, and finally consumers (or even prosumers).
First, we will look at Layer 2 physical and logical topology evolution. Figure 3-6 shows
the design evolution of an OSI Layer 2 topology in the data center. Moving from left to
right, you can see the physical topology changing in the number of active interfaces
between the functional layers of the data center. This evolution is necessary to support
the current and future service use cases.
Virtualization technologies such as VMware ESX Server and clustering solutions such as
Microsoft Cluster Service currently require Layer 2 Ethernet connectivity to function
properly. With the increased use of these types of technologies in data centers and now
even across data center locations, organizations are shifting from a highly scalable Layer
3 network model to a highly scalable Layer 2 model. This shift is causing changes in the
technologies used to manage large Layer 2 network environments, including migration
away from Spanning Tree Protocol (STP) as a primary loop management technology
toward new technologies, such as vPC and IETF TRILL (Transparent Interconnection of
Lots of Links).
Figure 3-6 Layer 2 Topology Design Evolution Between the Aggregation and Access Layers: STP, vPC/VSS, and L2MP/TRILL (up to 16 active links)
In early Layer 2 Ethernet network environments, it was necessary to develop protocol and
control mechanisms that limited the disastrous effects of a topology loop in the network.
STP was the primary solution to this problem, providing a loop detection and loop man-
agement capability for Layer 2 Ethernet networks. This protocol has gone through a num-
ber of enhancements and extensions, and although it scales to very large network envi-
ronments, it still has one suboptimal principle: To break loops in a network, only one
active path is allowed from one device to another, regardless of how many actual connec-
tions might exist in the network. Although STP is a robust and scalable solution to redun-
dancy in a Layer 2 network, the single logical link creates two problems:
■ Half (or more) of the available system bandwidth is off limits to data traffic.
■ A failure of the active link tends to cause multiple seconds of system-wide data loss
while the network reevaluates the new “best” solution for network forwarding in the
Layer 2 network.
Although enhancements to STP reduce the overhead of the rediscovery process and allow
a Layer 2 network to reconverge much faster, the delay can still be too great for some net-
works. In addition, no efficient dynamic mechanism exists for using all the available
bandwidth in a robust network with STP loop management.
An early enhancement to Layer 2 Ethernet networks was PortChannel technology (now
standardized as IEEE 802.3ad PortChannel technology), in which multiple links between
two participating devices can use all the links between the devices to forward traffic by
using a load-balancing algorithm that equally balances traffic across the available Inter-
Switch Links (ISL) while also managing the loop problem by bundling the links as one
logical link. This logical construct keeps the remote device from forwarding broadcast
and unicast frames back to the logical link, thereby breaking the loop that actually exists
in the network. PortChannel technology has one other primary benefit: It can potentially
deal with a link loss in the bundle in less than a second, with little loss of traffic and no
effect on the active STP topology.
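The load-balancing algorithm referred to above is typically hash based: address fields of each frame are hashed to select one member link, so all frames of a flow stay on the same link and arrive in order while different flows spread across the bundle. The sketch below is a generic illustration, not a specific Cisco hashing algorithm.

# Generic illustration of PortChannel-style hash load balancing.
# Not a vendor-specific algorithm; field choices and values are illustrative.
import zlib

MEMBER_LINKS = ["Eth1/1", "Eth1/2", "Eth1/3", "Eth1/4"]

def select_link(src_mac: str, dst_mac: str, src_ip: str, dst_ip: str) -> str:
    """Hash the flow's address fields and map the result onto one member link."""
    key = f"{src_mac}{dst_mac}{src_ip}{dst_ip}".encode()
    return MEMBER_LINKS[zlib.crc32(key) % len(MEMBER_LINKS)]

# Frames of the same flow always hash to the same link (preserving order),
# while different flows spread across all available links in the bundle.
print(select_link("00:aa", "00:bb", "10.0.0.1", "10.0.0.2"))
print(select_link("00:cc", "00:dd", "10.0.1.1", "10.0.1.2"))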
Although vPC is not the only technology that provides this solution, other solutions tend
to have a number of deficiencies that limit their practical implementation, especially
when deployed at the core or distribution layer of a dense high-speed network. All multi-
chassis PortChannel technologies still need a direct link between the two devices acting
as the PortChannel endpoints. This link is often much smaller than the aggregate band-
width of the vPCs connected to the endpoint pair. Cisco technologies such as vPC are
designed specifically to limit the use of this ISL to switch management traffic
and the occasional traffic flow from a failed network port. Technologies from other ven-
dors are not designed with this goal in mind, and in fact, are dramatically limited in scale
especially because they require the use of the ISL for control traffic and approximately
half the data throughput of the peer devices. For a small environment, this approach
might be adequate, but it will not suffice for an environment in which many terabits of
data traffic might be present.
before the ratification of the IETF TRILL standard. (For the Nexus 7000 switch, a simple software upgrade path is planned for the migration from Cisco FabricPath to the IETF TRILL protocol; in other words, no hardware upgrades are required.) Generically, we will refer to TRILL and FabricPath as "Layer 2 Multi-Pathing (L2MP)."
■ Enables Layer 2 multipathing in the Layer 2 DC network (up to 16 links). This pro-
vides much greater cross-sectional bandwidth for both client-to-server (North-to-
South) and server-to-server (West-to-East) traffic.
■ Provides built-in loop prevention and mitigation with no need to use STP. This significantly reduces the operational risk associated with the day-to-day management and troubleshooting of a non-topology-based protocol such as STP.
■ Provides a single control plane for unknown unicast, unicast, broadcast, and multi-
cast traffic.
■ Enhances mobility and virtualization in the FabricPath network with a larger OSI Layer 2 domain. It also helps simplify service automation workflows by leaving fewer service dependencies to configure and manage.
What follows is an amusing poem by Ray Perlner, found in the IETF TRILL draft, which captures the benefits of building a topology free of STP:
I hope that we shall one day see,
Figure 3-7 illustrates the two evolution trends happening in the data center.
Figure 3-7 Data Center Evolution at the Access Layer: Classical Ethernet (with vPC) and Fibre Channel with Non-Unified I/O Evolving Toward IEEE Data Center Bridging (DCB) with FCoE, Unified I/O, and Virtual Services Delivered by an "Intelligent Services Fabric"
From the demand side, multicore CPUs spawning the development of virtualized compute infrastructures have placed increased demands for I/O bandwidth on the access layer of the data center. In addition to bandwidth, virtual machine mobility also requires flexibility in service dependencies such as storage. A unified I/O infrastructure fabric enables the abstraction of the overlay service (for example, file [IP] or block-based [FC] storage), which supports the architectural principle of flexibility: "Wire once, any protocol, any time."
Abstraction between the virtual network infrastructure and the physical network creates its own challenge in regard to maintaining end-to-end control of service traffic from a policy enforcement perspective. Virtual Network Link (VN-Link) is a set of standards-based solutions from Cisco that enables policy-based network abstraction to recouple the virtual and physical network policy domains.
Cisco and other major industry vendors have made standardization proposals in the IEEE
to address networking challenges in virtualized environments. The resulting standards
tracks are IEEE 802.1Qbg Edge Virtual Bridging and IEEE 802.1Qbh Bridge Port
Extension.
The Data Center Bridging (DCB) architecture is based on a collection of open-standard
Ethernet extensions developed through the IEEE 802.1 working group to improve and
expand Ethernet networking and management capabilities in the data center. It helps
ensure delivery over lossless fabrics and I/O convergence onto a unified fabric. Each ele-
ment of this architecture enhances the DCB implementation and creates a robust Ethernet
infrastructure to meet data center requirements now and in the future. Table 3-2 lists the
main features and benefits of the DCB architecture.
IEEE DCB builds on classical Ethernet’s strengths, adds several crucial extensions to pro-
vide the next-generation infrastructure for data center networks, and delivers unified fab-
ric. We will now describe how each of the main features of the DCB architecture con-
tributes to a robust Ethernet network capable of meeting today’s growing application
requirements and responding to future data center network needs.
Priority-based Flow Control (PFC) enables link sharing that is critical to I/O consolida-
tion. For link sharing to succeed, large bursts from one traffic type must not affect other
traffic types, large queues of traffic from one traffic type must not starve other traffic
types’ resources, and optimization for one traffic type must not create high latency for
small messages of other traffic types. The Ethernet pause mechanism can be used to
control the effects of one traffic type on another. PFC is an enhancement to the pause
mechanism. PFC enables pause based on user priorities or classes of service. With a physical link divided into eight virtual links, PFC provides the capability to send a pause frame for a single virtual link without affecting traffic on the other virtual links (the classical
Ethernet pause option stops all traffic on a link). Enabling pause based on user priority
allows administrators to create lossless links for traffic requiring no-drop service, such as
Fibre Channel over Ethernet (FCoE), while retaining packet-drop congestion management
for IP traffic.
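A rough mental model of PFC is eight independent pause states on a single physical link, one per user priority. The following sketch is illustrative only (it is not an implementation of the IEEE 802.1Qbb standard): a priority that has received a per-priority pause is held, while traffic on the other virtual links keeps flowing.

# Illustrative model of Priority-based Flow Control (PFC): pause per priority,
# not per physical link. Not an implementation of the IEEE 802.1Qbb standard.

class PfcLink:
    NUM_PRIORITIES = 8

    def __init__(self):
        self.paused = [False] * self.NUM_PRIORITIES

    def receive_pfc_frame(self, priority: int, pause: bool) -> None:
        """Peer asks us to pause (or resume) a single priority/virtual link."""
        self.paused[priority] = pause

    def can_transmit(self, priority: int) -> bool:
        return not self.paused[priority]

link = PfcLink()
link.receive_pfc_frame(priority=3, pause=True)   # e.g., lossless FCoE class backs up
print(link.can_transmit(3))   # False -> FCoE traffic held (no drop)
print(link.can_transmit(0))   # True  -> best-effort IP traffic keeps flowing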
Traffic within the same PFC class can be grouped together and yet treated differently
within each group. ETS provides prioritized processing based on bandwidth allocation,
low latency, or best effort, resulting in per-group traffic class allocation. Extending the
virtual link concept, the network interface controller (NIC) provides virtual interface
queues, one for each traffic class. Each virtual interface queue is accountable for manag-
ing its allotted bandwidth for its traffic group, but has flexibility within the group to
dynamically manage the traffic. For example, virtual link 3 (of 8) for the IP class of traffic
might have a high-priority designation and a best effort within that same class, with the
virtual link 3 class sharing a defined percentage of the overall link with other traffic class-
es. ETS allows differentiation among traffic of the same priority class, thus creating prior-
ity groups.
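Conceptually, ETS behaves like a weighted bandwidth guarantee per priority group, with unused capacity available to other groups. The sketch below uses assumed group names, weights, and demands (not values mandated by the standard) to show how a 10-Gbps link might be divided and how spare bandwidth is redistributed.

# Illustrative Enhanced Transmission Selection (ETS) bandwidth sharing.
# Group names, weights, and demands are assumptions for the example.

LINK_GBPS = 10.0
weights = {"FCoE": 0.40, "IP-high": 0.40, "IP-best-effort": 0.20}    # ETS weights
demand_gbps = {"FCoE": 3.0, "IP-high": 6.0, "IP-best-effort": 4.0}   # offered load

def ets_allocate(link, weights, demand):
    """Give each group its guaranteed share, then redistribute unused bandwidth."""
    grant = {g: min(demand[g], link * w) for g, w in weights.items()}
    spare = link - sum(grant.values())
    for g in weights:                       # simple single pass over unmet demand
        extra = min(spare, demand[g] - grant[g])
        grant[g] += extra
        spare -= extra
    return grant

print(ets_allocate(LINK_GBPS, weights, demand_gbps))
# {'FCoE': 3.0, 'IP-high': 5.0, 'IP-best-effort': 2.0}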
In addition to IEEE DCB standards, Cisco Nexus data center switches include enhance-
ments such as FCoE multihop capabilities and lossless fabric to enable construction of a
Unified Fabric.
At this point to avoid any confusion, note that the term Converged Enhanced Ethernet
(CEE) was defined by “CEE Authors,” an ad hoc group that consisted of over 50 develop-
ers from a broad range of networking companies that made prestandard proposals to the
IEEE prior to the IEEE 802.1 Working Group completing DCB standards.
FCoE is the next evolution of the Fibre Channel networking and Small Computer System
Interface (SCSI) block storage connectivity model. FCoE maps Fibre Channel onto Layer
2 Ethernet, allowing the combination of LAN and SAN traffic onto a link and enabling
SAN users to take advantage of the economy of scale, robust vendor community, and
road map of Ethernet. The combination of LAN and SAN traffic on a link is called uni-
fied fabric. Unified fabric eliminates adapters, cables, and devices, resulting in savings
that can extend the life of the data center. FCoE enhances server virtualization initiatives
with the availability of standard server I/O, which supports the LAN and all forms of
Ethernet-based storage networking, eliminating specialized networks from the data cen-
ter. FCoE is an industry standard developed by the same standards body that creates and
maintains all Fibre Channel standards. FCoE is specified under INCITS as FC-BB-5.
FCoE is evolutionary in that it is compatible with the installed base of Fibre Channel as
well as being the next step in capability. FCoE can be implemented in stages nondisrup-
tively on installed SANs. FCoE simply tunnels a full Fibre Channel frame onto Ethernet.
With the strategy of frame encapsulation and deencapsulation, frames are moved, with-
out overhead, between FCoE and Fibre Channel ports to allow connection to installed
Fibre Channel.
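Because FCoE simply carries a complete Fibre Channel frame inside an Ethernet frame identified by the FCoE EtherType (0x8906), the encapsulation step is conceptually a header prepend and strip. The sketch below is a simplified illustration only; it omits the real FC-BB-5 framing details such as the FCoE header fields, SOF/EOF encoding, and padding.

# Simplified illustration of FCoE encapsulation: a full FC frame is tunneled
# inside an Ethernet frame. Real FC-BB-5 framing details are omitted for brevity.
import struct

FCOE_ETHERTYPE = 0x8906

def encapsulate(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    """Prepend an Ethernet header to an (opaque) Fibre Channel frame."""
    eth_header = dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE)
    return eth_header + fc_frame

def decapsulate(eth_frame: bytes) -> bytes:
    """Strip the Ethernet header, returning the original FC frame unchanged."""
    assert struct.unpack("!H", eth_frame[12:14])[0] == FCOE_ETHERTYPE
    return eth_frame[14:]

fc_frame = b"\x22\x00" + b"\x00" * 34          # placeholder FC frame bytes
wire = encapsulate(b"\x0e\xfc\x00\x00\x00\x01", b"\x00\x25\xb5\x00\x00\x0a", fc_frame)
assert decapsulate(wire) == fc_frame           # frame recovered without modification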
For a comprehensive and detailed review of DCB, TRILL, FCoE and other emerging pro-
tocols, refer to the book I/O Consolidation in the Data Center, by Silvano Gai and
Claudio DeSanti from Cisco Press.
Deploying Layer 4 through 7 services in virtual data centers has, however, been extreme-
ly challenging. Traditional service deployments are completely at odds with highly scala-
ble virtual data center designs, with mobile workloads, dynamic networks, and strict
SLAs. Security, as aforementioned, is just one required service that is frequently cited as
the biggest challenge to enterprises adopting cost-saving virtualization and cloud-com-
puting architectures.
As illustrated in Figure 3-8, Cisco Nexus 7000 Series switches can be segmented into vir-
tual devices based on business need. These segmented virtual switches are referred to as
virtual device contexts (VDC). Each configured VDC presents itself as a unique device
to connected users within the framework of that physical switch. VDCs therefore deliver
true segmentation of network traffic, context-level fault isolation, and management
through the creation of independent hardware and software partitions. The VDC runs as
a separate logical entity within the switch, maintaining its own unique set of running
software processes, having its own configuration, and being managed by a separate
administrator.
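As a mental model (this is not NX-OS configuration; the switch, VDC, and administrator names are invented), a VDC can be pictured as a hard partition of the physical switch that owns a disjoint set of interfaces and carries its own configuration and administrator:

# Conceptual model of Virtual Device Contexts (VDC) as hard partitions of one
# physical switch. Hypothetical names; this is not NX-OS configuration.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Vdc:
    name: str
    admin: str
    interfaces: List[str] = field(default_factory=list)
    config: Dict[str, str] = field(default_factory=dict)   # independent configuration

class PhysicalSwitch:
    def __init__(self, interfaces: List[str]):
        self.unallocated = set(interfaces)
        self.vdcs: Dict[str, Vdc] = {}

    def create_vdc(self, name: str, admin: str, interfaces: List[str]) -> Vdc:
        # An interface belongs to exactly one VDC: true segmentation of traffic.
        if not set(interfaces) <= self.unallocated:
            raise ValueError("interface already allocated to another VDC")
        self.unallocated -= set(interfaces)
        vdc = Vdc(name, admin, interfaces)
        self.vdcs[name] = vdc
        return vdc

switch = PhysicalSwitch([f"Eth1/{i}" for i in range(1, 9)])
switch.create_vdc("core", admin="alice", interfaces=["Eth1/1", "Eth1/2"])
switch.create_vdc("agg",  admin="bob",   interfaces=["Eth1/3", "Eth1/4"])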
The possible use cases for VDCs include the following:
■ Offer a secure network partition for the traffic of multiple departments, enabling de-
partments to administer and maintain their own configurations independently
■ Facilitate the collapsing of multiple tiers within a data center for total cost reduction
in both capital and operational expenses, with greater asset utilization
■ Test new configuration or connectivity options on isolated VDCs on the production
network, which can dramatically improve the time to deploy services
Figure 3-8 Collapsing of the Vertical Hierarchy with Nexus 7000 Virtual Device Contexts (VDC): the DC core layer can be implemented as a VDC for one or a few aggregation blocks; Nexus 7000 core and aggregation VDCs provide virtualized Layer 3 with L2MP and FCoE toward Nexus 5500 leaf switches running DCB, which connect the virtual machine access layer and legacy storage
Figure (secure multitenancy design): tenants Red, Green, and Blue gain secured access over MPLS L2/L3 VPNs and SSL/IPsec VPNs across the core IP network; each tenant is contained in a unique Layer 3 VRF (IP subnet), with VRF-lite mapping to unique VLANs; the Layer 2 domain scales with L2MP/TRILL; the UCS system's Fabric Interconnects act as unified access switches, terminating VLANs on vEth and vFC interfaces, with VN-Tags applying tenant-separation policies, vApps supported by tenant-aware vShield and the Nexus 1000V with vPath ("Intelligent Services Fabric"), tenant VLANs containing the virtual servers (web, application, database), VLAN-to-VSAN mapping for FC, and isolated VLANs for NFS
In addition to multitenancy, architects need to think about how to provide multitier applications and their associated network and service design, including, from a security posture perspective, a multizone overlay capability. In other words, to build a functional and secure service, one needs to take into account multiple functional demands, as illustrated in Figure 3-10.
Figure 3-10 Pods and Security Zones
The challenge is being able to "stitch" together the required service components (each with its own operational-level agreement [OLA]; OLAs underpin an SLA) to form a service chain that delivers the end-to-end service attributes (legally formalized by a service-level agreement [SLA]) that the end customer desires. This has to be done within the context of the application tier design and security zoning requirements.
Real-time capacity and capability posture reporting of a given infrastructure is only just beginning to be delivered to the market. Traditional ITIL Configuration Management
Systems (CMS) have not been designed to run in real-time environments. The conse-
quence is that to deploy a service chain with known quantitative and qualitative attrib-
utes, one must take a structured approach to service deployment/service activation. This
structured approach requires a predefined infrastructure modeling of the capacity and
capability of service elements and their proximity and adjacency to each other. A prede-
fined service chain, known more colloquially as a network container, can therefore be
activated on the infrastructure as a known unit of consumption. A service chain is a
group of technical topology building blocks, as illustrated in Figure 3-11.
Figure 3-11 Network Container Building Blocks of Increasing Capability: No Network Services; SLB and SSL Offload; Firewall and VPN Offload
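A predefined service chain, or network container, can be thought of as a declarative unit of consumption that the orchestration layer activates as a whole. The sketch below is hypothetical (the container names, building blocks, and OLA figures are assumptions) and simply echoes the idea of catalog entries of increasing capability shown in Figure 3-11.

# Hypothetical catalog of predefined service chains ("network containers") as
# known units of consumption. Names, blocks, and OLA figures are illustrative.
from dataclasses import dataclass
from typing import List

@dataclass
class NetworkContainer:
    name: str
    services: List[str]          # ordered building blocks in the chain
    max_vms: int
    ola_availability: float      # operational-level agreement for this block

CATALOG = [
    NetworkContainer("bronze", ["vlan"], max_vms=3, ola_availability=0.995),
    NetworkContainer("silver", ["vlan", "slb", "ssl-offload"], max_vms=6,
                     ola_availability=0.999),
    NetworkContainer("gold", ["vlan", "slb", "ssl-offload", "firewall",
                              "vpn-offload"], max_vms=9, ola_availability=0.9995),
]

def activate(container: NetworkContainer) -> str:
    """Activating a container deploys every block in the chain as one unit."""
    return f"activated {container.name}: " + " -> ".join(container.services)

print(activate(CATALOG[2]))
# activated gold: vlan -> slb -> ssl-offload -> firewall -> vpn-offload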
Service Assurance
As illustrated in Figure 3-12, SLAs have evolved through necessity from those based only
on general network performance in Layers 1 through 3 (measuring metrics such as jitter
and availability), to SLAs increasingly focused on network performance for specific appli-
cations (as managed by technologies such as a WAN optimization controller), to SLAs
based on specific application metrics and business process SLAs based on key perform-
ance indicators (KPI) such as cycle time or productivity rate. Examples of KPIs are the
number of airline passengers who check in per hour or the number of new customer
accounts provisioned.
Figure 3-12 The Expanding SLA Boundary: from the Managed Service Edge (CPE and WAN) Through the Data Center Core, Aggregation, and Access Layers (Cat6k, Nexus 7k, Nexus 5k, Nexus 1k, UCS, MDS, and storage) to the Hosted Applications Themselves
Customers expect that their critical business processes (such as payroll and order fulfill-
ment) will always be available and that sufficient resources are provided by the service
provider to ensure application performance, even in the event that a server fails or if a
data center becomes unavailable. This requires cloud providers to be able to scale up data
center resources, ensure the mobility of virtual machines within the data center and
across data centers, and provide supplemental computer resources in another data center,
if needed.
With their combined data center and Cisco IP NGN assets, service providers can attract
relationships with independent software vendors with SaaS offerings, where end cus-
tomers purchase services from the SaaS provider while the service provider delivers an
assured end-to-end application experience.
In addition to SLAs for performance over the WAN and SLAs for application availability,
customers expect that their hosted applications will have security protection in an exter-
nal hosting environment. In many cases, they want the cloud service provider to improve
the performance of applications in the data center and over the WAN, minimizing appli-
cation response times and mitigating the effects of latency and congestion.
With their private IP/MPLS networks, cloud service providers can enhance application
performance and availability in the cloud and deliver the visibility, monitoring, and
reporting that customers require for assurance. As cloud service providers engineer their
solutions, they should consider how they can continue to improve on their service offer-
ings to support not only network and application SLAs but also SLAs for application
transactions and business processes.
Service assurance solutions today need to cope with rapidly changing infrastructure con-
figurations as well as understand the status of a service with the backdrop of ever-chang-
ing customer ownership of a service. The solution also needs to understand the context
of a service that can span traditionally separate IT domains, such as the IP WAN and the
Data Center Network (DCN).
Ideally, such a solution should be based on a single platform and code base design that eliminates some of the complexities of understanding a service in a dynamic environment. This makes it easier to understand and support the cloud services platform and also eliminates costly and time-consuming product integration work. However, the single-platform design should not detract from the scalability and performance required in a large virtual public cloud environment, and it must obviously support a high-availability (HA) deployment model.
Northbound and southbound integration with third-party tools, with well-defined and documented message formats and workflows that allow direct message interaction and web integration APIs, is a basic requirement for building a functional system.
An IaaS assurance deployment requires a real-time and extensible data model that can
support the following:
The ability to define logical relationships among service elements to represent the technical definition of a service is a critical step in providing service-oriented impact analysis. In addition to understanding the service components, the service element relationships, both fixed and dynamic, need to be tracked. Fixed relationships capture definitions, such as the fact that a particular web application belongs to a particular service. Dynamic relationships are managed by the model, such as identifying, for example, which Cisco UCS chassis is hosting the ESX server where a virtual machine supporting this service is currently running.
Service policies evaluate the state of, and relationships among, elements and provide impact roll-up so that the services affected by a low-level device failure are known. They assist in root cause identification so that a failure several levels deep in the infrastructure can be traced from the service view, providing up, down, and degraded service states. (For example, if a single web server in a load-balanced group is down, the service might be degraded.) Finally, service policies provide event storm filtering, roll-up, and windowing functions.
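The impact roll-up just described can be pictured as a simple policy evaluated over element states. The sketch below is a minimal illustration with invented element names and an assumed rule: the service is down only if every member of the load-balanced group is down, and degraded if only some are.

# Minimal sketch of service-impact roll-up for a load-balanced web tier.
# Element names and the roll-up rule are illustrative assumptions.

def rollup_service_state(members_up: int, members_total: int) -> str:
    """Derive the service state from the low-level element states."""
    if members_up == 0:
        return "down"
    if members_up < members_total:
        return "degraded"     # e.g., one server in the load-balanced group failed
    return "up"

# Dynamic relationship tracked by the model: which hosts currently back the service.
service = {"name": "web-shop", "members": {"web1": "up", "web2": "down", "web3": "up"}}
up = sum(1 for state in service["members"].values() if state == "up")
print(rollup_service_state(up, len(service["members"])))   # degraded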
All this information, the service elements, relationships, and service policies, provides service visualization that allows operations teams to quickly determine the current state of a service, its service elements, and the underlying dynamic network and infrastructure resources, and in addition allows service definition and tuning. A good example of a service assurance tool that supports these attributes and capabilities can be found at www.zenoss.com.
Figure (cloud transition stages over time): workloads progress from dev/test, R&D, application testing, and preproduction through production to critical applications, segmented by function, department, and application type; business drivers include cost/ROI, IT agility, IT competitiveness, and IT as a Service, with demand-driven use cases such as web farms, DR/BC, overflow/burst capacity, and hybrid cloud; supporting requirements include a service culture, security, compliance, acceptable SLAs, metering/billing, BU chargeback, a portal and service catalog, third-party integration (ITSM workflow, CMDB), infrastructure services, scale, availability, predictability, and service assurance
The phasing closely maps to the IT industry evolution we discussed earlier in this chapter:
Migration of existing applications onto the new services platform requires extensive research and planning, not only in regard to technical feasibility but also in regard to current operational and governance constraints, which, in the authors' experience to date, prove to be the most challenging aspects to get right. It is essential that technical and business stakeholders work together to ensure success.
Building a virtualization management strategy and tool set is key to success for the first phase. The benefits gained through virtualization can be lost without an effective virtualization management strategy. Virtual server management requires changes in policies surrounding operations, naming conventions, chargeback, and security. Although many
server and desktop virtualization technologies come with their own sets of management
capabilities, businesses should also evaluate third-party tools to plug any gaps in manage-
ment. These tools should answer questions such as, “How much infrastructure capacity
and capability do I have?” or “What are the service dependencies?” in real time.
Technology to support the fourth phase of this journey is only just starting to appear in
the marketplace at the time of this writing. The ability to migrate workloads and service
chains over large distances between (cloud) service providers requires an entire range of
technological and service-related constraints that are being addressed. Chapter 5, “The
Cisco Cloud Strategy,” will discuss some of these constraints in detail.
Figure (evolution of management tools over time): from the command-line interface (for example, "ip address", "interface", and "switchport access" configuration), to graphical user interfaces, to programmatic modeling, to analytics, and beyond
Summary
Cisco believes that the network platform is a foundational component of a utility service
platform as it is critical to providing intelligent connectivity within and beyond the data
center. With the right built-in and external tools, the network is ideally placed to provide
a secure, trusted, and robust services platform.
The network is the natural home for management and enforcement of policies relating to
risk, performance, and cost. Only the network sees all data, connected resources, and
user interactions within and between clouds. The network is thus uniquely positioned to
monitor and meter usage and performance of distributed services and infrastructure. An
analogy for the network in this context would be the human body's autonomic nervous system (ANS), which functions largely below the level of consciousness to control visceral (inner organ) functions. The ANS is usually divided into sensory (afferent) and motor (efferent) subsystems, which are analogous to the visibility and control capabilities we need from a services platform to derive a desired outcome. Indeed, at the time of this writing, there is a lot of academic research into managing complex network systems, be they biological, social, or traditional IT networks. Management tools for the
data center and wider networks have moved from a user-centric focus (for example, GUI
design) to today’s process-centric programmatic capabilities. In the future, the focus will
most likely shift toward behavioral- and then cognitive-based capabilities.
The network also has a pivotal role to play in promoting resilience and reliability. For
example, the network, with its unique end-to-end visibility, helps support dynamic
orchestration and redirection of workloads through embedded policy-based control
capabilities. The network is inherently aware of the physical location of resources and
users. Context-aware services can anticipate the needs of users and deploy resources
appropriately, balancing end-user experience, risk management, and the cost of service.
Chapter 4
IT Services
Upon completing this chapter, you should be able to understand the following:
This chapter describes the classification of Information Technology (IT) services from
both business-centric and technology-centric perspectives. This chapter looks at the
underpinning economics of Infrastructure as a Service (IaaS) and the contextual aspects
of making a “workload” placement in the cloud, that is, risk versus cost.
We will first discuss the concept of risk (defined in this context as the exposure to the chance of financial loss) and how it can be classified and quantified (qualitatively and quantitatively), and then we will discuss cost within the context of cloud economic fundamentals.
Put another way, the purpose of classification is, ostensibly, to protect information and information services so that leakage or failure cannot degrade or endanger the function of the enterprise.
As a first step, an organization needs a reference model to identify the different types of
data that might exist. (Data is the lowest level of abstraction; by itself, it carries no mean-
ing, whereas information does convey meaning. Knowledge is the highest form of
abstraction, carrying meaning as well as conveying application value.) Customer data is an
example that is heavily protected by law in many countries. Customer information and
knowledge—when, where, and what he purchased, for example—would be of great value
to a corporation for analytics and marketing purposes.
■ BIL6, Top Secret (TS): The highest level of classification of material on a national
level. Such material would cause “exceptionally grave damage” to national security if
made publicly available.
■ BIL5, Secret: Such material would cause “grave damage” to national security if it
were publicly available.
■ BIL3, Restricted: Such material would cause “undesirable effects” if made publicly
available. Some countries do not have such a classification.
■ BIL2, Protect: Information or material that if compromised, would likely cause sub-
stantial distress to individuals or, for example, breach statutory restrictions on the
disclosure of information.
■ BIL1, Protect: Technically not a classification level, but is used for government docu-
ments that do not have a classification listed previously. Such documents can some-
times be viewed by those without security clearance.
■ BIL0, Unclassified: Information that is available to the general public and would not
cause any harm or infringe any law were it to be intentionally or accidentally dis-
closed to the general public.
This document provides useful contextual guidance through subcategories (use cases for
public sector bodies). We will discuss BILs in more detail later in this chapter.
Similarly, for the U.S. government, information classification levels are set out in Executive Order 13526 (issued in 2009). Under Section 1.2, "Classification Levels," information can be classified at one of three levels:
Governance
Corporate governance consists of the set of internal processes, policies, and external laws
and institutions that influence the way people administer a corporation. Corporate gover-
nance also includes the relationships among the many actors involved (the stakeholders)
and the corporate goals. The main actors include the shareholders, management, and the
board of directors, as well as external actors such as regulatory bodies.
The purpose of IT governance is to ensure that investments in IT generate real business value and that the risks associated with IT projects are mitigated.
The sections that follow examine some of the governance frameworks in more detail.
As a side note, for all frameworks and methodologies, there are champions and detractors
of each. What you and your corporation will utilize or how it is implemented is down to
individuals and the corporate culture that exists.
The ITGI has also developed and built into the COBIT framework a set of management
guidelines for COBIT that consist of maturity models, critical success factors (CSF), key
goal indicators (KGI), and key performance indicators (KPI).
Thus, COBIT addresses the full spectrum of IT governance processes, but only from a
high-level business perspective, emphasizing audit and control. Other frameworks
address a subset of processes in more detail, including ITIL for IT service management
and delivery and ISO 17799 for IT security.
6. Security Management
7. ICT Infrastructure Management
8. Application Management
TOGAF (v9 is the latest) has at its core the Architecture Development Method (ADM).
The purpose of the preliminary phase of the ADM is to define how architecture is devel-
oped within an enterprise, that is, the actual framework to work within and architectural
principles. It is in this phase that IT governance is incorporated and adhered to. In other
words, develop an IT architectural vision that aligns with the business architecture, driv-
ers, and goals.
Risk
In an enterprise, risk management requires a thorough understanding of business require-
ments, potential threats, and vulnerabilities that might be exploited, along with an evalua-
tion of the likelihood and impact of a risk being realized. Some corporations might issue a risk appetite statement that communicates the overall level of risk they are prepared to tolerate to achieve their business aims.
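In quantitative terms, risk exposure is commonly approximated as likelihood multiplied by financial impact and then compared against the stated risk appetite. The figures in the sketch below are invented purely to illustrate the arithmetic.

# Illustrative quantitative risk calculation; all figures are invented examples.

def risk_exposure(annual_likelihood: float, impact_cost: float) -> float:
    """Expected annual loss = probability of the event per year x financial impact."""
    return annual_likelihood * impact_cost

RISK_APPETITE = 100_000          # maximum tolerable expected annual loss (example)

threats = {
    "data-center outage": risk_exposure(0.05, 2_000_000),   # 100,000
    "laptop data leak":   risk_exposure(0.30,   250_000),   #  75,000
}
for name, exposure in threats.items():
    verdict = "within appetite" if exposure <= RISK_APPETITE else "mitigation required"
    print(f"{name}: expected loss {exposure:,.0f} -> {verdict}")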
A Business Impact Analysis (BIA) report will identify the corporation’s most crucial sys-
tems and processes and the effect an outage would have on the organization and classify
them within Business Impact Levels (BIL). The greater the potential impact, the more
money a company should spend to restore a system or process quickly. For example, a
large hedge fund might decide to pay for completely redundant IT systems that would
allow it to immediately start processing trades at another location if the primary location
fails. A BIA will help companies set a restoration sequence to determine which parts of
the business should be restored first.
Larger organizations often have a dedicated Risk Assessment Team (RAT) to undertake the risk assessment process on a regular basis. During this process, three types of qualitative and quantitative tests can typically be employed:
■ Penetration testing: Tests for the depth of vulnerabilities in specific applications, hosts, or systems. This type of testing poses the greatest risk to the resource and should be used sparingly.
ISO 27001
According to www.27001-online.com, “the basic objective of the [ISO 27001] standard is
to help establish and maintain an effective information management system, using a con-
tinual improvement approach [‘Plan-Do-Check-Act’]. It implements OECD (Organization
for Economic Cooperation and Development) principles, governing security of informa-
tion and network systems.” It includes an information classification taxonomy that is sim-
ilar to the national government information classification models we discussed at the
beginning of this chapter. However, this taxonomy is more focused for use in the corpo-
rate enterprise and is shown in Table 4-1.
Table 4-1 Company Confidential Data
Description: Information collected and used by the company in the conduct of its business to employ people, to log and fulfill client orders, and to manage all aspects of corporate finance. Access to this information is very restricted within the company. The highest possible levels of integrity, confidentiality, and restricted availability are vital.
Examples: Salaries and other personnel data; accounting data and internal financial reports; confidential customer business data and confidential contracts; nondisclosure agreements with clients/vendors; company business plans and long-range planning.
Compliance
From the Society of Corporate Compliance and Ethics (SCCE), compliance is defined
simply as “the process of meeting the expectations of others.” (Source:
www.corporatecompliance.org.)
Furthermore, end customers will potentially be evaluating service offers from different
providers. To enable efficient arbitrage between competing offers within any market-
place, a common ontology is required. Simply put, the industry needs standards to pro-
vide service transparency, not only from the perspective of the offer (ontology describing
the SLA) but also in terms of the operational status of active services (that is, if an infra-
structure component fails, causing a breach of the SLA, the incident needs to be reported
and, if applicable, penalties or reparations made).
Service providers have been looking at ways to standardize information regarding the cloud service SLA that incorporates capabilities related to risk management.
An example of an industry body defining standards would be the Cloud Security
Alliance, which has developed the GRC Stack (www.cloudsecurityalliance.org/
grcstack.zip). The GRC Stack provides a tool kit for enterprises, cloud service providers,
security solution providers, IT auditors, and other key stakeholders to instrument and
assess both private and public clouds against industry-established best practices, stan-
dards, and critical compliance requirements.
The Cloud Security Alliance GRC Stack is an integrated suite of three CSA initiatives:
■ CSA CloudAudit: Allows cloud service providers to automate the process of audit,
assertion, assessment, and assurance.
■ CSA Controls Matrix: Provides fundamental security principles to guide cloud ven-
dors and to assist prospective cloud customers in assessing the overall security risk
of a cloud provider—categorized into 11 control areas from compliance to security
architecture.
Figure 4-2 Core/Non-Core Versus Mission-Critical/Non-Mission-Critical Workloads, with Hot DR and Warm DR Options
Referring to Figure 4-2 again, on the horizontal axis, mission-critical provides a tempo-
ral contextualization, that is, the criticality of the application or service in relation to the
operational execution of the business. Put simply, how long can the enterprise cope with-
out a particular IT service being operational? What is the maximum permissible
downtime of a service?
This directly relates to both the high-availability (HA) capabilities built into each service component and its dependencies. Further, this relates to the disaster recovery (DR) investments required to manage the Recovery Point Objective (RPO), that is, how frequently data is backed up, as well as the Recovery Time Objective (RTO), that is, how long it will take to get the service back up and running after an unplanned outage, for a particular IT service or services. In other words, how often is a system backed up or the state of the system replicated? How quickly can the IT system be brought back into operation after failure? As a practical example, Cisco IT uses a criticality index to classify its business services, which are made up of the application(s) and supporting dependencies (for example, network, network services, database, storage, and so on), as outlined in Table 4-3.
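The RPO and RTO terms lend themselves to a simple feasibility check, sketched below with invented numbers: a DR design meets its objectives only if the backup or replication interval does not exceed the RPO and the measured recovery time does not exceed the RTO.

# Illustrative RPO/RTO feasibility check; all values are invented examples.

def meets_objectives(backup_interval_min: float, recovery_time_min: float,
                     rpo_min: float, rto_min: float) -> bool:
    """True if worst-case data loss <= RPO and time to restore service <= RTO."""
    return backup_interval_min <= rpo_min and recovery_time_min <= rto_min

# A highly critical service (example figures): near-zero data loss, fast restore.
print(meets_objectives(backup_interval_min=5, recovery_time_min=30,
                       rpo_min=15, rto_min=60))      # True
# A design that replicates only nightly cannot meet a 15-minute RPO.
print(meets_objectives(backup_interval_min=1440, recovery_time_min=30,
                       rpo_min=15, rto_min=60))      # False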
These classifications might also aid in determining whether an enterprise wants to con-
sume a “black box” solution from a third-party vendor. In-house applications that contain
intellectual property of the enterprise, normally viewed as a core application (a system of
differentiation or innovation), would require characterization (base lining, benchmark-
ing) and interoperability testing for a third-party organization to develop an SLA to
manage the in-house application or service either on-prem or off-prem (private or hosted
cloud—also known as a hybrid cloud model).
*Primary function: A function (for example, a business process) that is directly aligned with the organization’s top-level
objective. For Cisco, the top-level objective is to generate revenue, be profitable, and grow.
When thinking about consuming a cloud service, in this case IaaS, you need to take into
account an application’s or a service’s criticality level to the business, how core or
context it is, and also what information assurance level it requires in addition to IT-cen-
tric metrics like performance and scalability.
If you look at Figure 4-4, you will see a three-dimensional model with the three axes rep-
resenting criticality, role, and information assurance. The sphere represents a service or an
application (let’s call it Service A). So from this model, any workload that is mapped in the
near upper left would be “business imperative” (taking the Cisco IT definition as an exam-
ple) as well as being core to the business (that is, enabling the business to differentiate
itself in the marketplace). Finally, the data the workload contains is company confidential
and so requires a high security posture level (referring to confidentiality and integrity).
Figure (workload mapping): axes span performance, size, and capacity (from a single workload to a converged infrastructure and small DC scale-out capacity) and security posture level ("SP1" through "SP4"), with example workloads such as Oracle/SAP and SharePoint ranging from "system of record" to "non-core"
A workload with such a classification cannot have any downtime, and the confidentiality and integrity of the information must be maintained at all times, because the financial impact of an information leak or an outage would be critical, if not terminal, to the corporation.
These service attributes help determine the correlation of the service mapping to the service offering attributes (in the case of IaaS), which can be defined by the following (a sketch of this mapping follows the list):
■ Interconnect containers: Defines the type (OSI Layer 2 or 3), connectivity type, and
method protocol (for example, LISP, VPLS, OTV, and so on) to remote data centers.
■ Security zones: Defines the security posture, policy enforcement, and decision
point types, as well as their implementation and operation.
■ Disaster recovery link activators: Defines the RTO, RPO, and priority class and initi-
ates DR service between two network containers. Uses interconnect containers for
WAN connectivity to remote disaster recovery data center sites.
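A minimal sketch of this correlation is shown below. The attribute names, thresholds, and offering choices are entirely hypothetical; the point is only that criticality, core/context role, and information assurance level together drive the selection of interconnect container, security zone, DR class, and placement.

# Hypothetical mapping of workload attributes (criticality, role, information
# assurance) to IaaS service-offering attributes. Names and rules are invented.

def map_to_offering(criticality: str, role: str, assurance: str) -> dict:
    offering = {
        # Interconnect container: type and method protocol to remote data centers.
        "interconnect": "layer2-otv" if criticality == "business-imperative" else "layer3-vpn",
        # Security zone: posture derived from the information assurance level.
        "security_zone": {"high": "restricted", "medium": "protected"}.get(assurance, "standard"),
        # DR link activator: RTO/RPO class between two network containers.
        "dr_class": "hot" if criticality == "business-imperative" else "warm",
    }
    if role == "core":
        offering["placement"] = "private-cloud"      # keep differentiating IP in-house
    else:
        offering["placement"] = "hosted-cloud"
    return offering

print(map_to_offering("business-imperative", "core", "high"))
# {'interconnect': 'layer2-otv', 'security_zone': 'restricted',
#  'dr_class': 'hot', 'placement': 'private-cloud'}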
Figure 4-5 illustrates the aforementioned service capabilities that form the services archi-
tecture building blocks.
Figure 4-5 Services Architecture Building Blocks: Interconnect Containers Provide Inter-Data Center Network Connectivity Services, and Pods Represent Physically Assigned Resources
The trick is being able to translate and correlate the service offering (service [level]
description) to the desired service requirements for a particular workload. As previously
mentioned, nonstandard definitions and a lack of normative references make this an odi-
ous task today. Recently, the Tele-Management Forum (TMF) has been doing some work
on standardized service definitions. We will discuss the development of the marketplace
in Chapter 5, “The Cisco Cloud Strategy.”
These required service attributes might also differ during the life cycle of a particular
service or application. This process can be described as the service gestation period. As
an example, Cisco IT would typically move an application through multiple proving
stages called alpha, ACE, and production. Each of these stages has certain governance
rules associated with them. More typical naming conventions would be research and
development (R&D), alpha and beta (trial), and then finally general availability (GA) or
production.
All this complexity is down to the fact that we are mapping traditional enterprise-class
Service Oriented Architecture (SOA) onto the new web/cloud horizontal service plat-
forms. This legacy will be with us for some time just as many mainframe applications are
still with us today. The laws of “Cloudonomics” and its associated scale-out architectural
principles such as designed to fail will take some time to realize as a new wave of appli-
cations that are built for the cloud model slowly supersede existing application work-
loads. In the intervening period, we have to operationalize these existing applications into
our new cloud operational paradigm, achieving the bipolar goals of standardization and
flexibility.
Figure 4-6 depicts the first two of the four cornerstones: unit service costs and financing the variability of workload (peak, trough, frequency).
Let’s look at the costs associated with Infrastructure as a Service (IaaS). It is assumed that, because of scale and the complexity of building automated workflows, public cloud providers can offer IT services at a lower cost per unit than an enterprise offering the service in-house can achieve. However, a number of studies have shown this not to be true in all cases. Economies of scale start to show diminishing returns after an enterprise has reached a certain consumption volume, at which point the enterprise itself gains comparable buying power for IT resources.
Counterintuitively, a pure “pay as you use” solution also makes sense, even if unit costs
are higher than can be achieved by in-house IT. This is because you have to take into
account the cost of financing the workload variability, both from a peak-to-average ratio
workload perspective and from a temporal/frequency perspective (that is, how often the
workload peaks are reached).
The reason for this is straightforward. The fixed-capacity dedicated solution must be
built to the peak workload, whereas the cloud service provider needs to build a capacity
to the average of the peaks (this obviously assumes a good level of distribution of the
total workloads placed on the infrastructure to reduce the variability).
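The peak-versus-average argument can be made concrete with a small calculation; all workload and price figures below are invented. A dedicated estate must be sized and paid for at the peak, whereas a pay-as-you-use service is billed on the area under the demand curve, so a spiky profile can favor pay-per-use even at a higher unit price.

# Illustrative comparison of a dedicated (built-to-peak) estate versus pay-as-you-use.
# All workload and price figures are invented for the example.

demand_samples = [10, 10, 12, 15, 40, 90, 100, 60, 20, 12, 10, 10]   # units, every 2 hours
hours_per_sample = 2
dedicated_unit_cost_per_hour = 0.8      # cheaper per unit (owned capacity)
payu_unit_cost_per_hour = 1.0           # premium per unit (utility service)

peak_units = max(demand_samples)
dedicated_cost = peak_units * dedicated_unit_cost_per_hour * hours_per_sample * len(demand_samples)
payu_cost = sum(units * payu_unit_cost_per_hour * hours_per_sample for units in demand_samples)

print(f"dedicated, built to a peak of {peak_units} units: {dedicated_cost:.0f}")   # 1920
print(f"pay as you use, area under the demand curve: {payu_cost:.0f}")             # 778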
However, some cloud services are required to be metered and billed in fractional units of time so that consumers can benefit from the utility nature of the service (near-real-time service activation and teardown). Typically, utility services would be billed on
an hourly basis or even by the minute. This utility service example directly relates to our third area of cloud economics, the time value of money.
If an enterprise can leverage extra processing capacity when needed, faster than it can
deliver it itself, the enterprise would be willing to pay a premium for that capability.
Fundamentally, it is about reaching business decisions quicker with greater accuracy. This
concept is depicted in Figure 4-7 along with the concept of innovation.
The fourth and final cornerstone of cloud economics is probably the least realized. The
cloud is essentially predicated on web applications that have undergone a major shift in
thinking regarding scaling. Scale-up (vertical scaling, that is, adding more resources to a
single node in a system) enterprise-type architectures are limited in growth capability ver-
sus scale-out (a.k.a. horizontal scaling, that is, adding more nodes). Horizontal scaling
coupled with the utility delivery model of IT services (the theoretical capability to burst
for additional capacity in real time) somewhat changes the landscape of resource availability for developers.
For example, for the same cost, rather than having ten CPUs available for 10,000 hours, a developer could procure 10,000 CPUs for ten hours. This paradigm shift in IT resource
consumption benefits a subset of use cases such as large scale-out “big data” application
research and development projects that require lots of computational power if only for a
short period of time (for example, Monte Carlo trading algorithms, geological analysis
programs, and so on). The real-time consumption model removes the economic barrier to
such application development requirements.
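The arithmetic behind this example is simply that both options consume the same number of CPU-hours, so at a pure pay-per-use rate they cost the same while differing enormously in elapsed time.

# Same spend, radically different time to result (illustrative rate).
rate_per_cpu_hour = 0.10                # hypothetical pay-per-use price

option_a_cpu_hours = 10 * 10_000        # 10 CPUs for 10,000 hours = 100,000 CPU-hours
option_b_cpu_hours = 10_000 * 10        # 10,000 CPUs for 10 hours = 100,000 CPU-hours

assert option_a_cpu_hours == option_b_cpu_hours
print(option_a_cpu_hours * rate_per_cpu_hour)   # identical cost: 10000.0
# Option A takes roughly 417 days of elapsed time; option B finishes in 10 hours.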
Summary
In the end, it boils down to the simple yet well-known trade-off: cost versus risk.
Understanding what risks are associated with the consumption of a particular service is
not an easy task, as we have discussed throughout this chapter. To avoid too much cost
associated with classification of data and the search for the right service offer, the indus-
try needs clear, easy-to-implement models and service descriptions within an agreed-
upon service framework and ontology. Service providers will continue to develop risk
profiles for customer workloads as the industry matures.
Chapter 5
The Cisco Cloud Strategy
Upon completing this chapter, you should be able to understand the following:
■ How Cisco is helping partners and customers deliver on the promises of the cloud
This chapter describes the Cisco Systems strategy regarding technological, system, and
service developments related to the cloud. The chapter also briefly covers the technology
evolution toward the cloud to understand how we got to where we are today.
We are reminded of an alleged quote (or more likely a misquote) by Thomas J. Watson of
IBM fame in 1943: “I think there is a world market for maybe five computers.” This
apparent misquote might be closer to the truth than anyone realized until now!
Nicholas Carr published a book in 2008 titled The Big Switch: Rewiring the World from Edison to Google (published by W.W. Norton & Co.). In the book, Carr compares the transition now under way in the IT industry with the evolution of the electrical power generation industry toward a utility model. The idea is simple—pay only for what you use, just as we do with electricity. The efficiency and cost savings of such a model are easy to understand, as we discussed in Chapter 4, “IT Services,” in the section, “Four
Cornerstones of Cloud Economics.” However, efficiency and cost savings are not the
only reasons why cloud computing is proving popular.
Figure 5-1 depicts an analogy of two boxers representing seemingly opposing approaches
to the delivery and provision of information services to end users. The boxer on the left
represents the disruption of the cloud model on more traditional enterprise information
services delivery.
[Figure content: the Web Model is characterized by massive scale (“scale-out”), open source software, and an information-centric approach; the Enterprise Model by modest scale (“scale-up”), commercial software, and an application-centric approach. The Web Model represents a disruption to the market.]
Figure 5-1 Enterprise Versus the Web—The Web Wins Today’s Battle
If you compare and contrast the typical enterprise IT model of the last decade versus the
recent emergence of the Web 2.0 model, you can see a number of bipolar approaches.
The enterprise model was based on specific applications or service-oriented infrastruc-
tures featuring a vertical scale model (also known as scale-up) with per-application-based
high availability (N+1) failover capabilities. This architectural approach achieved only
modest scalability.
Contrast this with the web model’s architectural principles and proven design capabilities:
horizontal scaling (also known as scale-out), multilayer caching, eventual consistency,
information-centric, shared infrastructure, and open APIs.
The web model is no accident. It has evolved to solve technological problems of the day;
a good example is “big data.” Large organizations increasingly face the need to maintain
large amounts of structured and unstructured data to comply with government regula-
tions. Recent court cases have legally mandated corporations in certain industry verticals
(for example, the healthcare industry) to retain large masses of documents, email mes-
sages, and other forms of electronic communication that might be required in the event
of possible litigation.
“Big data” is a (marketing) term applied to data sets whose size is beyond the capability of commonly used enterprise software tools to capture, manage, and process within a tolerable elapsed time. Big data sizes are a constantly moving target, currently ranging from a few dozen terabytes to many petabytes of data in a single data set. Recently, new technologies being applied to big data include massively parallel processing (MPP) databases, compute grids, and MapReduce frameworks and distributed file systems (which inspired implementations such as Apache Hadoop), along with cloud computing platforms coupled with the power of the Internet to help us manage the “big data” challenge.
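To give a flavor of the programming model behind tools such as Hadoop, here is a minimal, single-process word-count sketch of the map and reduce phases; real big-data jobs distribute the same two phases across many nodes.

```python
from collections import defaultdict

documents = ["cloud computing scales out", "big data needs scale out computing"]

# Map phase: emit a (word, 1) pair for every word in every record.
mapped = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle/reduce phase: group the pairs by key and sum the values.
counts = defaultdict(int)
for word, n in mapped:
    counts[word] += n

print(dict(counts))   # for example, {'computing': 2, 'out': 2, ...}
```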
In short, enterprises are now adopting the web model, first pioneered mainly in the consumer social-media market, to solve problems and enhance their own capabilities, for example, to foster collaboration between employees as well as to speed data processing and improve result accuracy. Essentially, enterprises need a model that provides capabilities that deliver IT value to the business in today’s global marketplace.
Note that the new cloud paradigm does not negate the need for enterprise requirements
such as regulatory adherence. As an example, corporations that are subject to the
Sarbanes-Oxley (SOX) regulation are required to keep data archives (active or inactive)
for up to seven years. Other than the commoditization of storage media prices, there is
little room to maneuver to reduce costs.
In some cases, the nature of where the data is held in the cloud might require the cus-
tomer to implement extra regulatory compliance.
The World Economic Forum publishes a “Global Risks Network” report on a yearly basis (www.weforum.org/issues/global-risks). The more interconnected the world becomes, the greater the impact of a megashock will be.
By 2013, the global flow of information is estimated to increase dramatically again, to the equivalent of roughly 13 billion DVDs crossing the Internet every month.
If you look at this picture as a whole, it’s not a story about slow, predictable growth. We
are talking about truly exponential growth in the amount of information that’s flowing
around the world. This presents both a challenge and an opportunity. In the midst of this
deluge of data, the advantage goes to the organizations that can help individuals and
businesses find the most relevant information. Speed and accuracy are king.
The makeup of Internet traffic itself is also changing. Historically, we mostly had FTP,
HTTP, and peer-to-peer traffic. Today, video is already dominating the mix, and by 2013, it
is estimated that video will represent more than 90 percent of all consumer Internet traffic.
This shift has broad implications. In the past, the IT department’s job was to build a data
and voice network that carried some video. But from now on, the IT department’s job is
to build a video network that might carry some data and some voice traffic. Safe to say, video not only changes the shape and behavior of traffic on networks, but it is also forcing us to change the way we think about, design, and operate networks.
To serve and manage the number of users searching through enormous amounts of data,
we are now building vast data centers—industrial-scale machines. In fact, you could argue
that cloud computing started with the needs of the largest Internet companies having to
manage tens of thousands of servers. At this scale, you can begin to view large data cen-
ters or even multiple data centers as a single operational system, with many parts, running
a highly distributed scale-out application. Because of the scale and complexity, the
process of IT industrialization could only be accomplished with almost fully automated
management systems. Given the numbers involved, failure of individual components
becomes an everyday occurrence, and applications need to be designed in such a way as
to be able to survive despite such failures. System management is still being improved and
honed today. Having said this, aspects of cloud computing have actually been around for some time; they have just been called different things. We have been talking about the concept of utility computing for at least 20 years, and virtualization goes back to the mainframe era.
Added to the large number of Internet users and this growth in information, approximately 35 billion devices were connected to the Internet in 2010 (Source: Cisco IBSG). It is important to note that these 35 billion devices go far beyond the devices that we hold in our hands. They include device-to-device and machine-to-machine communications, in what is often referred to as the “Internet of Things.”
These trends present significant challenges to information service management and delivery. In this section, we discuss the Cisco Systems strategy in response to those challenges, helping partners and customers build solutions (see Figure 5-2).
1. Essential infrastructure for building clouds: Cisco develops technology, both software and hardware, to create products and solutions that address customer needs and that, for example, help cloud builders deliver innovative and flexible service platforms.
This technology applies to an enterprise building a private cloud or a service provider
building a virtual private cloud or a public cloud infrastructure. Cisco is investing
heavily to advance the market for the cloud by driving technology innovation that is
based on open standards (addressing vendor and service provider lock-in concerns)
alongside ecosystem development.
2. Solutions for deploying cloud services: Cisco is delivering systems and service plat-
forms to organizations so that they can deliver their own secure cloud services capa-
ble of supporting enterprise-class service-level agreements (SLA), of which security
is a core component. Cisco is enabling service providers and enterprises to deliver
secure cloud solutions with end-to-end, top-to-bottom validated designs coupled
with world-class professional services that are aligned to partner and end customer
business goals. Cisco partners with “best of breed” technology vendors, both at the engineering/technology research and development level and at the system build level, to provide partners and customers with a choice in how they consume and deliver IT services.
3. Innovation to accelerate the use of clouds: Cisco is investing in services and solutions that can be delivered in a multitenant and utility-based cloud form, offered by Cisco Service Provider partners, that combine collaboration and mobility capabilities with security. It is important to note that a central tenet of Cisco Systems’ cloud strategy is its support for an indirect “go-to-market” model. In other words, Cisco Systems does not intend to offer Infrastructure as a Service (IaaS) directly from its own data centers to end customers. Rather, Cisco Systems would prefer that its service provider partners provide this service (and others) bundled with their own value-added services (mainly network [WAN] derived).
The following sections look at some examples of the technologies, products, systems,
and solutions that Cisco can deliver today that help deliver the IT utility model, ITaaS.
Cisco Systems, unlike its main competitors, supports an indirect, partnership-based go-
to-market (GTM) model for IaaS. Cisco Systems fundamentally believes that service
providers can leverage their network assets as the underlying platform for cloud-based
services, whereas systems integration partners have the more traditional IT skill sets that
can be augmented through the addition of an intelligent network.
To reinforce this approach to the market, Cisco Systems has developed the Cloud Partner
Program (CPP) that is specifically designed to assist the enablement and acceleration of
cloud services for
■ Cloud builders: Partners that sell infrastructure and professional service capabilities
■ Cloud providers: Partners that build, operate, and run cloud service platforms
■ Cloud resellers: Partners that OEM or resell the cloud services of cloud providers (and typically do not own the infrastructure assets)
Each of these defined partner roles can leverage the benefits of certification and go-to-
market programs designed specifically around cloud-based services.
More information on the CPP framework and program kits can be found at
www.cisco.com/web/partners/partner_with_cisco/cloud_partner.html.
Deploying Layer 4 through 7 services in virtual data centers has, however, been extremely challenging. Traditional service deployments are completely at odds with highly scalable virtual data center designs that have mobile workloads, dynamic networks, and strict SLAs. Security is just one required service, and it is frequently cited as the biggest challenge for enterprises adopting cost-saving virtualization and cloud-computing architectures.
When services could be deployed effectively, they frequently undermined the benefits of
virtualization by adding cost and complexity to the infrastructure build, thus reducing
flexibility. Network-based services also tended to conflict with each other, with poor
integration and completely separate policy management platforms, further increasing
costs and management overhead.
One of the greatest design challenges is how to “steer” application traffic flows through
the physical and virtual infrastructure so that network services can be applied.
Traditionally, engineers have used utilities such as Policy-Based Routing (PBR) and Web
Cache Communications Protocol (WCCP) in conjunction with Network Address
Translation (NAT). Although all these technologies have been valuable in given use cases,
in general, they do not enable the efficient (flexible) invocation of network services to
build, in effect, a dynamic service chain that is policy based in its construction.
In other words, this strategy works well in traditional data centers but not in the cloud
computing world, where (nearly) everything is virtual. To adapt to this change, Cisco
developed the UNS strategy.
Cisco Unified Network Services (UNS) brings together an intelligent solution set that
secures and optimizes the delivery of applications across the distributed enterprise. UNS
addresses the requirements of dynamic, on-demand service delivery with consistently
managed, policy-based provisioning, bringing integrated application delivery and security
services to highly scalable, virtualized data centers and cloud environments.
UNS enables customers to
A new concept of steering the traffic had to be developed for the virtual environment.
Cisco developed a protocol called vPath. The Cisco Nexus 1000V switch’s vPath func-
tionality provides a single architecture supporting multiple Layer 4–7 network services in
virtualized environments with service policy–based traffic steering capabilities. In other
words, it provides an “intelligent services fabric” that allows a service designer to place virtual service nodes (VSN) adjacent to the application VMs they are serving (that is, on the same ESX host or physical server) or to place the VSNs on a separate and dedicated “service” ESX host (physical server). In effect, this provides a greater degree of flexibility in
where and when to place VSNs. VSNs can support a variety of network services, such as
virtual firewalls or WAN acceleration. Figure 5-3 illustrates the Nexus 1000V distributed
virtual switch running across three VMware ESX servers (each a separate physical server),
supporting the VSNs with the vPath policy-based traffic steering capability.
[Figure content: a Virtual Network Management Center overseeing virtual service nodes and application VMs distributed across the three ESX hosts.]
Figure 5-3 Policy-Based Traffic Steering with the Nexus 1000V’s vPath
At press time, the Nexus 1000V distributed virtual switch supports VMware ESX, KVM, and Microsoft’s Hyper-V hypervisor/virtual machine monitor (VMM) software.
Today, Cisco offers a virtual form factor of a network security service running in a VM.
The Cisco Virtual Security Gateway (VSG) for the Cisco Nexus 1000V virtual switch is
an example of a VSN, enabling service policy creation and enforcement at the VM level
(Cisco Systems also supports virtual wide-area application services [vWAAS] as a virtual
appliance at the time of this writing, with other virtual appliances to be announced short-
ly). Integrated with the Cisco Nexus 1000V in a virtual host, a VSN provides
■ VM-level granularity: Apply, manage, and audit policy at the VM context level
■ Support for vMotion: Policy stays intact across manual and automated vMotion events
In addition, Cisco UNS also consists of physical services, that is, traditional network
service appliances and modules. The combination of VSNs and dedicated appliances or
servers provides customers with the flexibility to take full advantage of existing resource
investments and to scale easily as new capacity is required while leveraging a common
operational model and management toolset. Cisco UNS thus provides the capability to
integrate and manage both virtual and physical services into a common, consistent frame-
work. Integration of policy-based, traffic-steering capabilities for both physical and virtu-
al network services is seen as the next logical step forward, but it is not supported today.
The (802.1Q) VLAN has been the traditional mechanism for providing logical network
isolation. Because of the ubiquity of the IEEE 802.1Q standard, numerous switches and
tools provide robust network troubleshooting and monitoring capabilities, enabling mis-
sion-critical applications to depend on the network. Unfortunately, the IEEE 802.1Q standard specifies a 12-bit VLAN identifier, which limits cloud networks to roughly 4000 (4K) VLAN segments.
Some vendors in the industry have proposed incorporating a longer logical network identifier in a MAC-in-MAC or MAC in Generic Route Encapsulation (MAC-in-GRE) encapsulation as a way to scale. Unfortunately, these techniques cannot make use of all the links in a port channel between switches, which is often found in the data center network, and in some cases they do not behave well with Network Address Translation (NAT). In addition, because of the encapsulation, monitoring capabilities are lost, preventing troubleshooting and monitoring. Hence, customers are not confident in deploying Tier 1 applications or applications requiring regulatory compliance in the cloud.
VXLAN solves these challenges with a MAC in User Datagram Protocol (MAC-in-UDP)
encapsulation technique. VXLAN uses a 24-bit segment identifier to scale. In addition,
the UDP encapsulation enables the logical network to be extended to different subnets
and helps ensure high utilization of port channel links.
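The encapsulation itself is compact. The following sketch packs the 8-byte VXLAN header defined in the IETF draft (an "I" flag marking the VNI as valid, reserved fields, and the 24-bit VXLAN Network Identifier); the original Ethernet frame is then carried behind it as the UDP payload, which is what lets the segment cross subnets and hash across port channel links.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags, reserved, 24-bit VNI, reserved."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!B3s3sB",
                       VXLAN_FLAG_VNI_VALID,
                       b"\x00\x00\x00",
                       vni.to_bytes(3, "big"),
                       0)

# A 24-bit VNI yields ~16 million segments versus ~4000 with 12-bit VLAN IDs.
print(vxlan_header(5000).hex())   # 0800000000138800
```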
Cisco and VMware, along with other networking and server virtualization companies, pro-
posed and submitted a new LAN segmentation technology proposal, “VXLAN,” to the
Internet Engineering Task Force (IETF) in August 2011 for standardization. The IETF draft
submission is available at http://datatracker.ietf.org/doc/draft-mahalingam-dutt-dcops-vxlan.
Layer 2 extensions between data centers have traditionally been driven by high-availability service use cases, for example, stateful replication between active/standby firewalls, session load balancers, or synchronous data storage replication. This means relatively short geographic distances between paired data centers. With the adoption of virtual machines, the ability to execute a live migration from one physical server to another, situated in either the same or a geographically dispersed data center, for workload balancing, driving up asset utilization, or disaster recovery (DR) purposes is now a viable technical use case. The latter case is driving up the bandwidth requirements (today, a recommended minimum of 622 Mbps) for VMware's vMotion/Site Recovery Manager (SRM) service. Technologies like Cisco virtual Port Channel (vPC) are ideal for Layer 2 deployments over dark fiber. Layer 2 extension technologies need to guarantee the basic operational principles of Ethernet: loop-free forwarding, no packet duplication, and MAC address table stability. In addition, the solution should provide the following:
In this challenging environment, Layer 3 overlay solutions that enable fast, reliable, high-
capacity, and highly scalable DCI are also essential. Such a solution is available with vir-
tual private LAN service (VPLS), a technology that provides Ethernet connectivity over
packet-switched WANs. VPLS supports the connection of multiple sites in a single
bridged domain over a managed IP or IP and MPLS (IP/MPLS) network. VPLS presents
an Ethernet interface, simplifying the LAN and WAN boundary for enterprise customers
and helping enable rapid and flexible service provisioning. Data centers, each having their
own Ethernet LAN, can be united in a VLAN over a WAN by using VPLS.
The Advanced VPLS (A-VPLS) feature introduces the following enhancements to VPLS:
One of the most recent innovations for Layer 2 extension over IP (or MPLS) is Overlay
Transport Virtualization (OTV). OTV provides Layer 2 connectivity between remote net-
work sites by using MAC address–based routing and dynamic IP-encapsulated forward-
ing across a Layer 3 transport network to provide support for applications that require
Layer 2 adjacency, such as clusters and virtualization. OTV is deployed on the edge
devices in each site. OTV requires no other changes to the sites or the transport network.
OTV builds Layer 2 reachability information by communicating between edge devices
with the overlay protocol. The overlay protocol forms adjacencies with all edge devices.
After each edge device is adjacent with all its peers on the overlay, the edge devices share
MAC address reachability information with other edge devices that participate in the
same overlay network.
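Conceptually, the forwarding decision at an OTV edge device resembles the following sketch (not Cisco's implementation): MAC addresses learned locally map to local interfaces and are switched normally, whereas MAC addresses learned through the overlay map to the IP address of the remote edge device, toward which the frame is IP-encapsulated.

```python
# Illustrative OTV-style MAC table: local MACs map to interfaces, remote MACs
# map to the IP address of the remote site's edge device.
mac_table = {
    "0000.1111.0001": ("local", "Eth1"),        # host in this site
    "0000.2222.0002": ("overlay", "10.2.2.2"),  # host behind the remote edge device
}

def forward_frame(dst_mac: str, frame: bytes):
    kind, target = mac_table[dst_mac]
    if kind == "local":
        return ("switch", target, frame)         # normal Layer 2 switching
    return ("encapsulate", target, frame)        # MAC-in-IP toward the remote edge

print(forward_frame("0000.2222.0002", b"...")[:2])   # ('encapsulate', '10.2.2.2')
```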
OTV discovers edge devices through dynamic neighbor discovery that can leverage the
multicast support of the core. This means efficient multisite Layer 2 extensions, which
are ideal for the VM live migration use case. It is important to note that OTV is aimed at private cloud scenarios, as the protocol does not explicitly support per-tenant semantics. In other words, one can dedicate an overlay to a customer (a Nexus 7000 supports a maximum of three overlays today), but OTV cannot provide per-tenant isolation (for example, VPN). Figure 5-4 illustrates a simplified view of the OTV dynamic encapsulation of Layer 2 frames in a Layer 3 (pseudo-GRE) header for transport across a Layer 3 (IP) network.
[Figure 5-4 content: OTV edge devices in the West and East sites encapsulate and decapsulate Layer 2 frames for communication between MAC1 (West) and MAC2 (East). Each edge device’s MAC table maps locally attached MAC addresses to local interfaces (for example, MAC1 to Eth1 in the West site) and remote MAC addresses to the IP address of the remote site’s edge device (for example, MAC2 and MAC3 to IP B in the West site’s table).]
Today, if administrators want to spin up VMs in a cloud, they would use an API to manage the life cycle of those VMs. Plenty of cloud OS/cloud stacks have been built to support such APIs. Examples of cloud APIs (normally RESTful, carried over HTTP) include but are not limited to the following:
■ OCCI: www.ogf.org/gf/group_info/view.php?group=occi-wg
■ VMware vCD API: www.vmware.com/pdf/vcd_10_api_guide.pdf
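To give a flavor of what driving a VM's life cycle through such a RESTful API looks like, here is a generic sketch; the endpoint, token, and JSON fields are invented placeholders rather than the actual OCCI or vCloud Director API.

```python
import requests

API = "https://cloud.example.com/api/v1"       # hypothetical endpoint
HEADERS = {"Authorization": "Bearer <token>",   # auth scheme varies by provider
           "Content-Type": "application/json"}

# Create (instantiate) a VM from a machine image.
resp = requests.post(f"{API}/vms", headers=HEADERS, json={
    "name": "web-01",
    "image": "linux-base-image",   # each cloud requires its own image format
    "flavor": "2vcpu-4gb",
})
vm = resp.json()

# Power operations and teardown follow the same request/response pattern.
requests.post(f"{API}/vms/{vm['id']}/actions", headers=HEADERS,
              json={"action": "power-off"})
requests.delete(f"{API}/vms/{vm['id']}", headers=HEADERS)
```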
Use of these APIs is suitable for cold VM migration from one (for example, private) cloud to another (for example, public) cloud managed and operated separately. Because each cloud has its own administrative environment and methods, this is one of the many challenges today that restrict live VM migration between clouds. Each cloud requires unique machine images (for example, Amazon Machine Image [AMI]). There are companies that specialize in converting machine images, such as CohesiveFT. (AWS also has its own conversion service called VM Import.)
The Distributed Management Task Force (DMTF) is working on something called the
Open Virtualization Format (OVF). From www.dmtf.org/standards/ovf:
Note that OVF v1.1.0 supports both standard single VM packages (VirtualSystem ele-
ment) and packages containing complex, multitier services consisting of multiple interde-
pendent VMs (VirtualSystemCollection element). OVF v1.1.0 supports virtual hardware
descriptions based on the Common Information Model (CIM) classes to request the
infrastructure to support the running of the virtual machine(s) or appliance(s). The XML
representation of the CIM model is based on the WS-CIM mapping.
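As a small sketch of what consuming an OVF descriptor looks like from a tool's point of view, the following lists the VirtualSystem elements in a package; the file name is a placeholder and error handling is omitted.

```python
import xml.etree.ElementTree as ET

OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"   # OVF 1.x envelope namespace

envelope = ET.parse("appliance.ovf").getroot()       # placeholder descriptor file

# A package may contain a single VirtualSystem or a VirtualSystemCollection
# of interdependent VMs describing a multitier service.
for vs in envelope.iter(f"{{{OVF_NS}}}VirtualSystem"):
    print("VirtualSystem:", vs.get(f"{{{OVF_NS}}}id"))
```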
Some cloud service providers offer their own CloudOS as Software as a Service (SaaS) to
manage the private cloud on the end customer’s premises or expose APIs, as previously
mentioned.
From a “live” machine migration point of view, solving this problem at the management
plane is only the first step before moving onto another challenge at the data plane (net-
work and storage). Let’s briefly discuss the challenge of OSI Layer 3 boundaries related to
this goal and how Cisco is innovating with new technology, Locator/Identifier Separation
Protocol (LISP), to address, at least in part, the challenge.
[Figure content: LISP provides Internet-scale open connectivity with standards-based mobility, segmentation, and security. Providers host multiple tenants securely, and workloads move securely across organizations, between enterprise intranets and private cloud intranets hosting compute, services, storage, and database services.]
Layer 3 routing has been developed over time to incorporate interadministrative domain
interface points. Exterior gateway protocols like Border Gateway Protocol (BGP) have
been specifically developed for this purpose. For Layer 2 connectivity between administrative domains, this is more problematic and as such is not considered a viable option. So we need to look at Layer 3 options that can support the live VM migration use case.
A basic observation, made during early network research and development work, is that
the use of a single address field for both device identification and routing is problematic.
To effectively identify a device as a network session endpoint, an address should not
change, even if the device moves, such as from a home to a work location, or if the
organization with which the device is associated changes its network connectivity, per-
haps from one cloud service provider to another. However, it is not feasible for the rout-
ing system to track billions of devices with such flexibly assigned addresses, so a device
needs an address that is tightly coupled to its topological location to enable routing to
operate efficiently.
To provide improved routing scalability while also facilitating flexible address assignment
for multihoming, provider independence, and mobility, LISP was created. LISP describes a
change to the Internet architecture in which IP addresses are replaced by routing locators
(RLOC) for routing through the global Internet and by endpoint identifiers (EID) for iden-
tifying network sessions between devices. Essentially, LISP introduces a new hierarchy (also known as “jack up”) to the forwarding plane, allowing the separation of location and identity.
Note You can find more information about LISP capabilities, use cases, and deployments
at www.cisco.com/go/lisp, http://lisp4.cisco.com, and http://lisp6.cisco.com.
Cisco Systems Nexus 7000 now supports LISP VM-Mobility mode (see www.cisco.com/
en/US/docs/switches/datacenter/sw/5_x/nx-os/lisp/configuration/guide/NX-OS_LISP_
Configuration_Guide_chapter2.html).
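The separation can be pictured with a tiny mapping sketch (a real deployment uses LISP map servers/resolvers and tunnel routers, not a Python dictionary): the EID identifies the workload and never changes, while the RLOC it maps to is updated when the workload moves.

```python
# Illustrative EID-to-RLOC mapping: the EID stays with the VM, the RLOC is the
# routable address of whichever site currently hosts it.
eid_to_rloc = {"10.1.1.10": "192.0.2.1"}     # VM's EID -> RLOC of its current site

def forward(eid_dst: str, payload: bytes):
    """Encapsulate traffic toward the RLOC that currently locates the EID."""
    return eid_to_rloc[eid_dst], payload     # outer header is addressed to the RLOC

# After a live migration, only the mapping changes; sessions keyed on the EID
# survive because the identifier itself never changed.
eid_to_rloc["10.1.1.10"] = "198.51.100.7"    # RLOC of the new site
```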
What is needed is a way to abstract from the low-level “concrete” configuration tasks to
more policy-based, high-level (“abstract”) system change management tasks.
The Cisco Network Hypervisor product has been designed specifically for highly virtual-
ized environments and cloud delivery models and does for the network infrastructure
what server virtualization has done for the data center—provide efficiency, elasticity,
automation, and control. The virtualization capabilities provided by Network Hypervisor
facilitate the transformation of static, rigid networks into a dynamic infrastructure that
responds automatically to the demands of virtual and cloud environments based on the
rules and business policies defined by administrators.
The network services orchestration capabilities of Network Hypervisor allow physical or
virtualized computing/storage resources to be combined with network access and securi-
ty models into a single service chain—a cloud service—that is fully automated and can
be deployed, on demand, to selected end users. Network Hypervisor business policies
define and capture the discrete elements of a cloud service and translate those elements
into actual device services and configuration syntax that is automatically disseminated to
the appropriate devices across the network to initiate the requested service.
From the activation of a business policy that defines a new cloud service, Network
Hypervisor automatically initiates the creation of the required VMs. As the VMs are
coming online, Network Hypervisor defines and deploys the network access and security
models across all required infrastructure devices (routers, switches, and firewalls) as need-
ed to deliver the cloud service to the defined end users. The entire process is completed
in seconds and can include the setup and deployment of network routes, Virtual Private
Networks (VPN), VLANs, and access control lists (ACL); the deployment of security cer-
tificates; and the configuring of firewall rules and DNS entries, all of which are defined
through the business policy and deployed automatically without any chance of com-
mand-line mistakes.
Cisco Network Hypervisor virtualizes network services by creating or abstracting a logi-
cal network in concordance with the physical network that it also manages. It controls the
physical network by virtualizing hardware switches and routers to create subnets of net-
work addressing space, typically VPNs, that also enable and orchestrate clouds and VMs.
The logical network is driven by policies that control network access for individuals to
resources. The policies specify high-level resource sharing. They can be created external-
ly or using the Network Hypervisor Command Center as documents that can be import-
ed, exported, and edited within the center. At the XML level, they comprise elements that model all the specifications (and more) that can be expressed using a grammar of variable and parameter substitutions that lets administrators and network configurators easily specify individual and multiple models that Network Hypervisor can express. For this reason, they are called metamodel files.
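To illustrate the variable-and-parameter-substitution idea in miniature (the real metamodel files are XML documents; the template text and names below are invented for this sketch):

```python
from string import Template

# An abstract policy expressed once, with variables for the deployment-specific
# details; expanding it yields concrete, device-level settings.
policy_template = Template(
    "service $service_name: vlan $vlan_id, "
    "acl permit tcp any host $app_ip eq $app_port"
)

print(policy_template.substitute(
    service_name="web-tier", vlan_id=210, app_ip="10.10.20.5", app_port=443))
```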
Figure 5-6 depicts the functional subcomponents of the Network Hypervisor and shows
how it logically connects to the northbound ITSM (Information Technology Service
Management) tools shown on the left and to the southbound underlying infrastructure
shown on the right.
[Figure 5-6 content: physical and logical topology models drive service provisioning via ITSM service requests. Interaction between Network Hypervisor components takes place over a secure out-of-band control plane (JMS), and devices are managed via existing mechanisms (SNMP, CLI, SSH, XML). The functional subcomponents include a real-time topology repository, a provisioning engine, an iterative processing engine, a device service controller (DSC), a service catalog, a service repository, and a monitoring engine, exposed northbound through a REST JSON/XML API to external ITSM providers and connected southbound to routers, switches, firewalls, and storage and compute infrastructure.]
In other words, Network Hypervisor provides the services and configurations that each
business policy needs.
A key component of the NSVE is a process called the provisioner or provisioning engine.
This determines which sites are affected by the new policy. For each one, it constructs a
set of abstract directives to tell the site’s device service controller (DSC) which policies it
needs to implement. Depending on which services are enabled at the sites, when the serv-
ice controller receives the directives, it converts them into device-specific instructions
and configures the devices accordingly.
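A hypothetical sketch of that flow, with invented site names, services, and commands, purely to show the division of labor between the provisioner and a site's DSC:

```python
def provision(policy: dict, sites: dict) -> dict:
    """Provisioner: work out which sites a policy affects and build abstract directives."""
    directives = {}
    for site, services in sites.items():
        if policy["service"] in services:                  # site is affected
            directives[site] = {"action": "enable",
                                "service": policy["service"],
                                "vlan": policy["vlan"]}
    return directives

def dsc_apply(directive: dict) -> list:
    """DSC: convert abstract directives into device-specific instructions."""
    return [f"vlan {directive['vlan']}",
            f"service {directive['service']} enable"]

sites = {"west-dc": ["firewall", "lb"], "east-dc": ["lb"]}
for site, d in provision({"service": "firewall", "vlan": 210}, sites).items():
    print(site, dsc_apply(d))
```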
■ Bring IaaS to market more quickly. Virtualization-aware intelligence is built into the
key elements of the Unified Service Delivery solution (for example, Unified
Computing and Virtual Network Link).
■ Reduce the cost of deploying IaaS and other services. The Unified Service Delivery
solution incorporates consolidation and scalability features that are unique to Cisco
products.
■ Meet customers’ service-level and security requirements in IaaS offers. QoS, encryp-
tion, and secure partitioning technologies work in concert across the components of
the Unified Service Delivery solution.
This “general purpose” concept, where an organization can utilize the scale-out or even
the more traditional scale-up capabilities of the underlying infrastructure, provides maxi-
mum flexibility in regard to IT infrastructure supporting the agile business. Integrated
Compute Stacks (also known as infrastructure packages) provide the modular building
blocks of the Unified Service Delivery platform. Productized examples of ICSs include VCE’s Vblock, a product of the Cisco, EMC, VMware, and Intel joint venture, and the FlexPod building block from the Cisco and NetApp alliance. Figure 5-8 depicts an abstracted view of CUSDP in relation to NIST’s taxonomy of cloud service models, in addition to some of the infrastructure building blocks for CUSDP.
[Figure 5-8 content: Cisco’s Unified Service Delivery Platform (CUSDP), combining the data center network (DCN) and the IP NGN, underpins the Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) layers, flanked by BSS and OSS. Infrastructure building blocks shown include FlexPod, the CRS-1, the ASR 1000, and the ASR 9000, along with APM and NMS tooling.]
Cisco VMDC is a validated architecture that delivers a highly available, secure, flexible, and efficient data center infrastructure. It provides the following benefits:
■ Reduced time to deployment: Provides a fully tested and validated architecture that
accelerates technology adoption and rapid deployment
■ Reduced risk: Enables enterprises and service providers to deploy new architectures
and technologies with confidence
Cisco VMDC 2.0 provides a scalable solution that can address the needs of smaller, as
well as larger, enterprise and service provider data centers. This architectural consistency
enables providers to select the design that best suits their immediate needs, while provid-
ing a solution that can scale to meet future needs without retooling or retraining staff.
Chapter 5: The Cisco Cloud Strategy 109
This scalability with a hierarchical design is based on two modular building blocks: the point of delivery (PoD) and the Integrated Compute Stack (ICS).
[Figure content: two PoD designs, each comprising services, compute, and SAN layers backed by EMC V-Max storage arrays.]
Figure 5-9 VMDC 2.0 PoD Designs Showing ICS as a Key Subcomponent Building Block
The modular design starts with a basic infrastructure module called a PoD. A PoD allows providers to add network, compute, storage, and service resources incrementally, providing all the infrastructure components needed to satisfy the service use cases offered within the service catalogue. As an example, the Cisco VMDC 2.0 architecture specifies two PoD designs: compact and large. Essentially, the difference between the PoDs is mainly one of capacity rather than capability.
The PoD concept offers a number of benefits:
■ Fault isolation
■ Consistent and efficient operation
The Cisco VMDC 2.0 architecture includes an open management framework that enables
provisioning of resources through service orchestration. A provider can deploy orchestra-
tion tools that provide a portal-based configuration model where a tenant can select from
a defined number of service options. Service orchestration offers a number of benefits:
■ Significantly reduces the operating expenses associated with administering and moni-
toring virtualized resources
■ The service orchestrator used in the VMDC 2.0 architecture is BMC Atrium
Orchestrator.
Cisco Intelligent Automation for Cloud (CIAC) forms the “backbone” of the service automation capabilities required for an IaaS platform. Added to this backbone would be back office capabilities (for example, service desk,
change management, and CMDB) as well as service assurance tools (for example, Zenoss).
Cisco Systems plans to qualify CIAC with its VMDC infrastructure best-practice refer-
ence design, with a focus on private cloud installations.
[Figure content: the CIAC architecture, with the Cisco Tidal Enterprise Orchestrator providing global orchestration above an adapter framework that connects to hardware managers (for example, UCS Manager), virtualization managers (for example, VMware vCenter and vCloud Director), and OS/software provisioning (for example, Cisco Tidal Server Provisioner), alongside integrations for IT service management tools, CMDB, billing/chargeback, and monitoring and governance.]
At the OpenStack Design Summit that took place in April 2011 in Santa Clara,
California, Cisco joined the OpenStack community in shaping new technologies for next-
generation cloud computing.
Cisco recognizes that no one company on its own can shape the future of cloud-computing architectures. Innovating in collaboration with others is essential. The Cisco strength in networking technologies and Unified Service Delivery puts it in a knowledgeable position to help shape how the network is utilized in cloud-computing services.
Prior to the aforementioned summit, Cisco submitted a blueprint (one of four in total
submitted by various contributors) that focused on network and network services provi-
sioning. This was called Network as a Service, or NaaS. Figure 5-11 illustrates the NaaS
concept and the network container service fulfillment operational methodology.
[Figure 5-11 content: the NaaS concept, with the OpenStack dashboard and tenant APIs driving compute and cloud applications/services on top of a network infrastructure optimized for cloud services.]
Take, for example, the ability to spin up a workload in a geographic location that is com-
pliant in regard to legal, taxation, and regulatory demands. We need mechanisms and information/metadata that allow automatic geoplacement of workloads. Where better to derive such information than from the network?
How about making it easier to manage infrastructure through programmatic modeling
(abstraction) of the underlying infrastructure resources to deliver the required service chain?
How about linking the WAN with the services running within the data center? That is,
the WAN becomes “service aware” and provides end-to-end visibility and control, that is,
vNIC to vNIC (a vNIC being a virtual machine interface).
How do we make it easy to signal between different autonomous domains or clouds like
we do today at scale with IP services using well-established protocols like BGP?
These are just a few of the use cases and questions being raised by end users, architects, and engineers alike.
■ The ICP concept focuses on the value of network and network-based services that
adapt to dynamic, virtualized cloud-computing environments.
The idea is that the service provider delivers the NPS service as a generic service to the Content Delivery Network (CDN)/application overlay. The service provider leverages its routing-layer information to deliver intelligence to the application layer in terms of location and preference/ranking based on distance.
In its generic form, the NPS service is implemented in a server accessible to applications
and invoked through a request in the form of “Which of several candidate nodes are clos-
est to some point of interest?”
The NPS server leverages different information sources to accurately compute the dis-
tance between endpoints. Sources can be
■ Policy database (to represent network policies as deployed by the service provider)
The NPS service is a ranking service allowing any client to obtain a ranked list of
addresses. The NPS client submits a request to the NPS server with a list of addresses to
be ranked.
The NPS server has precomputed a set of algorithms and maintains a topology database
that allows ranking the list of addresses received in the request. It then generates a reply
with the ranked list and sends it to the requester.
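Functionally, the exchange reduces to a ranking call like the following sketch, where the cost values stand in for whatever topology and policy data the provider actually maintains:

```python
# Illustrative precomputed "distance" metrics; lower means closer/preferred
# from the requester's vantage point.
precomputed_cost = {
    "203.0.113.10": 12,
    "198.51.100.20": 4,
    "192.0.2.30": 27,
}

def rank(candidates: list) -> list:
    """Return the candidate addresses best-first, as an NPS server reply would."""
    return sorted(candidates, key=lambda addr: precomputed_cost.get(addr, float("inf")))

print(rank(["192.0.2.30", "203.0.113.10", "198.51.100.20"]))
# ['198.51.100.20', '203.0.113.10', '192.0.2.30']
```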
NPS is a win-win for both the end user and the service provider, providing a better Quality of Experience (QoE) to the customer and minimizing transport costs for the service provider.
In summary, a service product captures the business aspects and a service widget captures the technical aspects. These capabilities, within a common service framework, are the baseline requirements for building a marketplace and ensuring fulfillment of service.
Summary
This chapter discussed many of the drivers (megatrends) that are forcing businesses to rethink how they architect and manage their information services portfolios. We have seen a huge rise in unstructured data sets for various regulatory and commercial reasons, in addition to the increasing mobility of data and information access and the “consumerization” of the IT devices used by employees.
As a result, new models of consuming information services are emerging, built on innovative technologies and service platforms that allow corporations not only to reduce the cost of delivery but also to maintain a level of risk exposure that is acceptable to the business, while tapping into the economic benefits of the consumerization of IT endpoints and the mobility of their workforce.
Cisco Systems is continuing to develop and deliver technology, systems, service platforms, and programs that address the needs of its customers and partners, making it easier for them to focus on their own core business.