Lecture 8 ICT723
Cloud infrastructure
• Cloud service providers exploit the latest computing, communication, and
software technologies to offer a highly available, easy to use, and efficient
cloud computing infrastructure.
• Cloud infrastructure is built with inexpensive off-the-shelf components to
deliver cheap computing cycles.
• Virtual machines (VMs) and containers are key components of the cloud
infrastructure.
• Typical cloud workloads include
• Coarse-grained, batch applications,
• Fine-grained, long-running applications with strict timing constraints.
Virtualization
• Virtualization is a critical element of the cloud infrastructure.
• Virtualization simulates the interface to a physical object by:
• Multiplexing: creates multiple virtual objects from one instance of a
physical object. Example - a processor is multiplexed among a number
of processes or threads.
• Aggregation: creates one virtual object from multiple physical objects.
Example - a number of physical disks are aggregated into a RAID disk.
• Emulation: constructs a virtual object from a different type of a physical
object. Example - a physical disk emulates a Random Access Memory
(RAM).
• Multiplexing and emulation. Examples - virtual memory with paging
multiplexes real memory and disk; a virtual address emulates a real
address.
Virtual machines
• Processor virtualization by multiplexing is beneficial for users & CSPs.
• Users appreciate virtualization because it allows better isolation of applications from
one another than the traditional process-sharing model. An application developer can
choose to develop the application in a familiar environment and under the OS of her
choice.
• CSPs enjoy larger profits due to the lower cost of providing cloud services.
• Running multiple VMs on the same server allows applications to better
share the server resources and leads to higher processor utilization.
• Virtualization also provides more freedom for the system resource
management because VMs can be easily migrated.
• VM migration steps: a VM is stopped, its state is saved as a file, the file is
transported to another server, and the VM is restarted.
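A minimal sketch of these migration steps, assuming the libvirt Python bindings on a QEMU/KVM host; the VM name, connection URIs, and file paths are illustrative only, not a production procedure:

import shutil
import libvirt

STATE_FILE = "/shared/vm1.state"     # hypothetical path for the saved VM state

# 1. Stop the VM and save its state as a file on the source host.
src = libvirt.open("qemu:///system")
dom = src.lookupByName("vm1")        # hypothetical VM name
dom.save(STATE_FILE)                 # suspends the domain and writes its state to disk

# 2. Transport the file to the other server (a shared mount stands in for an explicit copy).
shutil.copy(STATE_FILE, "/mnt/target-host/vm1.state")

# 3. Restart the VM from the saved state on the destination host.
dst = libvirt.open("qemu+ssh://target-host/system")
dst.restore("/mnt/target-host/vm1.state")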
Containers
• Containers
• Are based on OS-level virtualization rather than hardware virtualization
• Isolate applications running inside a container from applications running in a different
container
• Isolate applications from the physical system where they run
• Resources used by a container can be limited (see the sketch after this list)
• Benefits
• Ease creation and deployment of applications.
• Application container images are created at build time, rather than deployment time.
• Support portability; containers run independently of the environment.
• Benefit from an application-centric management.
• Encourage a deployment philosophy in which applications are broken into smaller, independent pieces
that can be managed dynamically.
• Support higher resource utilization.
• Lead to predictable application performance.
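As a concrete illustration of the isolation and resource limits above, the following sketch uses the Docker Python SDK (an assumption; any container runtime exposes equivalent controls) to cap a container's memory and CPU; the image and limits are illustrative:

import docker

client = docker.from_env()

# Run an isolated container capped at 256 MB of RAM and half a CPU core.
container = client.containers.run(
    "python:3.12-slim",
    command="python -c 'print(sum(range(10**6)))'",
    mem_limit="256m",               # memory cap
    nano_cpus=500_000_000,          # 0.5 CPU, expressed in units of 1e-9 CPUs
    detach=True,
)
container.wait()
print(container.logs().decode())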
Warehouse Scale Computers
• WSCs form the backbone of the cloud infrastructure.
• A WSC has 50,000 - 100,000 processors.
• A hierarchy of networks connects servers, racks, and cells/arrays.
• A rack consists of 48 servers interconnected by a 48-port, 10 Gbps
Ethernet (GE) switch. In addition to the 48 ports, the GE switch has two
to eight uplink ports connecting a rack to a cell (the resulting oversubscription is estimated below).
• A cell/array consists of a number of racks. The racks in a cell are
connected by an array switch.
• The cost of a WSC is of the order of $150 million.
• Cost-performance is what makes WSCs appealing.
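A back-of-the-envelope estimate of the rack's network oversubscription, assuming the uplink ports are also 10 Gbps (the figures above do not state the uplink speed):

servers, link_gbps = 48, 10
for uplinks in (2, 8):
    # aggregate server bandwidth versus aggregate uplink bandwidth
    ratio = (servers * link_gbps) / (uplinks * link_gbps)
    print(f"{uplinks} uplinks -> {ratio:.0f}:1 oversubscription")
# prints 24:1 for 2 uplinks and 6:1 for 8 uplinks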
WSC processors
• Two basic groups of multicore processors:
• brawny - single-core performance is impressive, but so is the
power dissipation.
• wimpy - less powerful but consume less power.
• When running on wimpy cores, a task needs to spawn a larger number of
threads. Major implications:
• It complicates the software development process as it requires explicit parallelization of
the application, thus increasing the cost of application development.
• Running a larger number of threads increases the response time. Very often all threads have
to finish before the next step of an algorithm, the well-known problem posed by barrier
synchronization.
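A tiny synthetic model of this barrier effect: a step ends only when the slowest thread reaches the barrier, so spawning more threads raises the chance that a straggler sets the pace. The work split, straggler probability, and penalty below are made up purely for illustration:

import random

random.seed(0)

def step_time(num_threads, work=100.0, straggler_penalty=5.0, p_straggle=0.05):
    per_thread = work / num_threads                 # each thread's share of the work
    times = []
    for _ in range(num_threads):
        t = per_thread
        if random.random() < p_straggle:            # occasional slow thread
            t += straggler_penalty
        times.append(t)
    return max(times)                               # the barrier waits for the slowest thread

for n in (4, 16, 64, 256):
    mean = sum(step_time(n) for _ in range(1000)) / 1000
    print(f"{n:4d} threads -> mean step time {mean:.2f}")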
WSC Storage – latency, bandwidth, and capacity
• The memory hierarchy of a WSC, with latency given in
microseconds, bandwidth in MB/sec, and capacity in GB.
WSC performance
• WSC workloads are diverse; there are no "killer" applications that would drive
the design decisions and, at the same time, guarantee optimal performance
for such workloads.
• Solution - profile realistic workloads and analyze data collected during
production runs.
• Google-Wide-Profiling (GWP) - a low-overhead monitoring tool used to gather
the data through random sampling.
• Only data for C++ codes was analyzed because C++ codes dominate
CPU cycle consumption, even though the majority of codes are written in Java,
Python, and Go.
• Data was collected from some 20,000 servers built with Intel Ivy Bridge
processors.
Insights from WSC performance analysis
• Cloud workloads display access patterns involving bursts of
computations intermixed with bursts of stall cycles.
• Processors supporting a higher level of simultaneous multithreading
(SMT) are better equipped than 2-wide SMT processors to hide the
latency by overlapping stall cycles.
• Large working sets of the codes are responsible for the high rate of
instruction cache misses.
• MPKI (misses per kilo-instructions) is particularly high for L2 caches (see the example after this list).
• Larger caches would alleviate this problem, but at the cost of higher
cache latency.
• Separate cache policies which give priority to instructions over data or
separate L2 caches for instructions and data could help.
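For reference, MPKI simply normalizes cache misses per 1,000 retired instructions; the counts below are placeholders chosen only to show the formula:

def mpki(cache_misses, instructions):
    # misses per kilo-instructions
    return cache_misses / (instructions / 1_000)

# e.g., 12 million L2 misses over 2 billion instructions -> 6.0 MPKI
print(mpki(cache_misses=12_000_000, instructions=2_000_000_000))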
Virtualization → user benefits versus concerns
• Users operate in environments they are familiar with, rather than being forced
into idiosyncratic ones.
• Applications can migrate from one platform to another.
• Support performance isolation, which is important for application optimization and
QoS (Quality of Service) assurance.
• Adds overhead and increases the execution time. The hypervisor is
invoked by the OS when applications make system calls.
Virtualization → user benefits versus concerns
• Simplifies the development and management of services offered
by a CSP.
• Allows isolation of services running on the same hardware.
• Important for load balancing. The state of a virtual machine (VM) running
under a hypervisor can be saved and migrated to another server to balance
the load.
• Increases the size of the software stack.
• Complicates software maintenance. Saved VMs are not updated when OS
and other system software patches are applied.
Hypervisors – CPU and memory virtualization
• A hypervisor:
• Traps the privileged instructions executed by a guest OS and
enforces the correctness and safety of the operation.
• Traps interrupts and dispatches them to the individual guest
operating systems.
• Controls the virtual memory management.
Hypervisors – CPU and memory virtualization
• A hypervisor:
• Maintains a shadow page table for each guest OS and replicates
any modification made by the guest OS in its own shadow page
table. This shadow page table points to the actual page frame
and is used by the Memory Management Unit (MMU) for
dynamic address translation (a simplified sketch follows this list).
• Monitors the system performance and takes corrective actions to
avoid performance degradation. For example, the VMM may
swap out a Virtual Machine to avoid thrashing.
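A highly simplified sketch of the shadow-page-table idea: the hypervisor mirrors each guest mapping into its own table that translates guest-virtual pages directly to host-physical frames, and that table is the one the MMU actually walks. All classes and page numbers are illustrative, not how any particular hypervisor is implemented:

class ShadowPager:
    def __init__(self, guest_to_host):
        self.guest_to_host = guest_to_host   # guest-physical frame -> host-physical frame
        self.shadow = {}                     # guest-virtual page -> host-physical frame (used by the MMU)

    def on_guest_map(self, gva, gpa):
        # Trap on a guest page-table update and replicate it in the shadow table.
        self.shadow[gva] = self.guest_to_host[gpa]

    def translate(self, gva):
        # What the hardware MMU would resolve via the shadow table.
        return self.shadow[gva]

pager = ShadowPager(guest_to_host={0x10: 0x9A, 0x11: 0x9B})
pager.on_guest_map(gva=0x1000, gpa=0x10)
print(hex(pager.translate(0x1000)))          # -> 0x9a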
Cluster resource management
• There are two sides of cluster management:
• One reflects the views of application developers who need simple means to locate
resources for an application and then to control the use of resources;
• The other is the view of service providers concerned with system availability,
reliability, and resource utilization.
• New concepts
• Framework ➔ a large consumer of CPU cycles; a widely used software system such
as Hadoop or MPI (Message Passing Interface), a standardized and portable
message-passing system used by the parallel computing community since the 1990s.
• Resource offer ➔ an abstraction for a bundle of resources a framework can allocate on a
cluster node to run its tasks (a sketch follows this list).
• Scheduling large clusters is a hard problem due to the system scale
combined with the workload diversity.
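A minimal sketch of the resource-offer abstraction (in the spirit of Mesos-style two-level scheduling): the cluster manager offers a bundle of resources available on a node, and a framework picks which of its tasks fit the offer. The classes and numbers are illustrative, not a real cluster-manager API:

from dataclasses import dataclass

@dataclass
class ResourceOffer:
    node: str
    cpus: float
    mem_gb: float

@dataclass
class Task:
    name: str
    cpus: float
    mem_gb: float

def accept(offer, tasks):
    # Greedily pick tasks that fit inside the offered bundle.
    chosen, cpus, mem = [], offer.cpus, offer.mem_gb
    for t in tasks:
        if t.cpus <= cpus and t.mem_gb <= mem:
            chosen.append(t)
            cpus -= t.cpus
            mem -= t.mem_gb
    return chosen

offer = ResourceOffer(node="node-17", cpus=4.0, mem_gb=8.0)
tasks = [Task("map-0", 2.0, 3.0), Task("map-1", 2.0, 3.0), Task("reduce-0", 2.0, 4.0)]
print([t.name for t in accept(offer, tasks)])   # ['map-0', 'map-1']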
Cluster Management with Borg
• Borg is a cluster management software developed at Google.
• Design goals:
• Effectively manage workloads distributed across a large number of machines while being highly
reliable and available.
• Hide the details of resource management and failure handling, thus allowing users to focus on
application development. This is important as the machines of a cluster differ in terms of
processor type and performance, number of cores per processor, RAM, secondary storage,
network interface, and other capabilities.
• Support a range of long-running, highly-dependable production jobs and non-production,
batch jobs.
Borg organization
• Borg cluster ➔ tens of thousands of machines co-located and interconnected
by a data center-scale network fabric.
• Borg architecture:
• BorgMaster ➔ a logically centralized controller.
• Borglets ➔ processes running on each machine in the cell.
Borg organization
• The BorgMaster is replicated (five replicas).
• Each replica maintains an in-memory copy of the state of the cell.
• The state of a cell is also recorded in a Paxos-based store on local disks of each replica.
• An elected master serves as Paxos leader and handles operations that change the state of
a cell, e.g., submit a job or terminate a task.
• Borglets start, stop, and restart failing tasks, manipulate the OS kernel setting
to manage local resources, and report the local state to the BorgMaster.
Borg organization
[Figure: a Borg cell - configuration files (BorgConfig), command-line tools, and a web browser interact with the BorgMaster, which contains the scheduler, a communication manager, and a persistent store.]
• A quota system for job scheduling uses a vector including the quantity of
resources such as CPU, RAM, and disk for specified periods of time (a sketch follows this list).
• Production jobs are allocated about 70% of CPU resources and 55% of
the total memory.
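A small sketch of a quota check over such a resource vector; the resource names and quantities are illustrative only:

quota     = {"cpu": 1000, "ram_gb": 4096, "disk_tb": 50}   # granted for the period
consumed  = {"cpu":  700, "ram_gb": 3000, "disk_tb": 30}   # already used by the job's owner
requested = {"cpu":  200, "ram_gb":  512, "disk_tb": 10}   # new job's request

# Admit the job only if every resource stays within its quota.
admitted = all(consumed[r] + requested[r] <= quota[r] for r in quota)
print("job admitted" if admitted else "job rejected")       # -> job admitted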
Borg scheduler
• A Borg job could have multiple tasks and runs in a single cell. The majority
of jobs do not run inside a VM.
• Only 200 lines of Spark code implement the HaLoop model for
MapReduce applications.
Questions: