Xen Hypervisor
Introduction to the Open Source Xen Hypervisor
Todd Deshane and Patrick Wilbur (Clarkson University)
Stephen Spector (Citrix)
June 22, 2008

- Unit 1: Virtualization and Xen Overview
- Morning Break
- Unit 2: Installing, Configuring, and Basic Usage
- Lunch
- Unit 3: Devices and Advanced Configuration
- Afternoon Break
- Unit 4: Security, Admin Tools, and Performance
Unit 1
- Why Virtualize?
- Potential Pitfalls
- Types of Virtualization
- Xen Background
- Hardware-Assisted Virtualization Overview
- Xen Architecture
- Xen LiveCD Demonstration
Why Virtualize?
Advantages
- Consolidation of servers
- Support heterogeneous and legacy OSes
- Rapid deployment and provisioning
- Testing/debugging before going into production
- Recovery and backup
- Load balancing
Potential Pitfalls
- Lack of IT staff with virtualization skills
- Security challenges
- Loss of deterministic hardware performance
Types of Virtualization
No Virtualization
Emulation
Full virtualization
- Simulate the base hardware architecture
- Unmodified guest OSes
- Examples: VMware, VirtualBox, QEMU + kqemu, MS Virtual PC, Parallels, Xen/KVM (hardware-assisted)
Paravirtualization
- Abstracted base architecture
- Modified guest OSes
- Examples: Xen, UML, Lguest
OS-level virtualization
- Shared kernel (and architecture), separate user spaces
- Homogeneous guest OSes
- Examples: OpenVZ, Linux-VServer, Solaris Containers, FreeBSD Jails
Xen Background
Paravirtualization (PV)
- High performance (claim to fame)
Full Virtualization
- With hardware support
- Hardware Virtual Machine (HVM)
Advantages
- Open source
- Standalone hypervisor
Xen-based products
- Citrix XenServer
- Virtual Iron
- Solaris xVM
- Oracle VM
Hardware-Assisted Virtualization
- Co-evolution of hardware and software
x86 AMD/Intel
- AMD: AMD-V, Nested Page Tables (NPT)
- Intel: VT-x, VT-d, Extended Page Tables (EPT)
Processor
- AMD-V, VT-x
Memory
- NPT, EPT
I/O
- Input/Output Memory Management Unit (IOMMU)
- Smart devices
Xen Architecture
[Diagram: Domain0 (management VM) and the guests Guest1, Guest2, ..., GuestN run on top of the Xen hypervisor, which runs on the physical hardware.]
Scheduling
Domain0 Role
- Creates and manages guest VMs
- Interacts with the Xen hypervisor
Xenstore Overview
- Database of configuration information
- Used by Domain0 to access guest state
- Guest domain drivers can write to xenstore
- Reference: http://wiki.xensource.com/xenwiki/XenStoreReference
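
One way to picture Xenstore is as a hierarchical key-value store addressed by paths. The toy model below is only an in-memory stand-in in Python: the class `ToyXenstore` and its methods are hypothetical and are not the real Xenstore API. The `/local/domain/<domid>/...` layout follows the usual Xenstore convention, and the values are made up for illustration.

```python
class ToyXenstore:
    """In-memory stand-in for a hierarchical key-value store (not real Xenstore)."""

    def __init__(self):
        self._data = {}

    def write(self, path, value):
        # Values are stored as strings, keyed by their full path.
        self._data[path.rstrip("/")] = str(value)

    def read(self, path):
        # Returns None when the path has never been written.
        return self._data.get(path.rstrip("/"))

    def ls(self, path):
        """Return the sorted names of the direct children of path."""
        prefix = path.rstrip("/") + "/"
        children = set()
        for key in self._data:
            if key.startswith(prefix):
                children.add(key[len(prefix):].split("/", 1)[0])
        return sorted(children)


store = ToyXenstore()
store.write("/local/domain/1/name", "guest1")           # illustrative values
store.write("/local/domain/1/memory/target", 524288)
store.write("/local/domain/2/name", "guest2")
```

Listing `/local/domain` in this model yields one entry per domain ID, which mirrors how Domain0 can walk the tree to find guest state.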
Questions?
Morning Break
Break Time
Unit 2
- Hands-on Installation Demos
- Guest Installation
- Distro-specific Guest Installation Tools
- Interacting with Guests
Mercurial overview
- Peer to peer version control
- hg serve, hg pull (-u), hg update, hg commit, etc.
- Supports tags, hg fetch extension in .hgrc
- As a user, hg clone is usually enough

- Integrates better with distribution
- Security and bug fixes from distribution maintainers
OpenSUSE
Debian/Ubuntu
Ubuntu
- ubuntu-xen-server
- ubuntu-xen-desktop
Gentoo
OpenSolaris
NetBSD
Commercial Distributions
- Red Hat Enterprise Linux (RHEL)
- SUSE Linux Enterprise Server (SLES)
- Virtual Iron
- Oracle VM
- XenServer
Guest Installation
- Pre-built Guest Images
- Converting VMware Images
- Distribution-specific Guest Installation Tools
- Interacting with Guests
disk
- Array of 3-tuples: backend device, frontend (guest) device, access mode
disk = ['phy:sda,sda,w', 'phy:/dev/cdrom,hdc:cdrom,r']
disk = ['tap:aio:hdb1,hdb1,w', 'phy:/dev/LV/disk1,sda1,w']
disk = ['phy:sda,xvda,w', 'phy:/dev/cdrom,hdc:cdrom,r']
disk = ['tap:aio:hdb1,xvdb1,w', 'phy:/dev/LV/disk1,xvda1,w']
memory
- Guest memory in MB
memory = 512
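
As a rough illustration of the 3-tuple layout, the Python sketch below splits each disk entry into its three comma-separated fields. The helper `parse_disk` and the `DiskSpec` type are hypothetical names for this example, not part of Xen, and the sketch accepts only the two access modes shown on the slide (r and w).

```python
from collections import namedtuple

# Hypothetical container for the three fields of one disk entry.
DiskSpec = namedtuple("DiskSpec", "backend frontend mode")


def parse_disk(entry):
    """Split 'backend,frontend,mode' into a DiskSpec, tolerating spaces."""
    parts = [p.strip() for p in entry.split(",")]
    if len(parts) != 3:
        raise ValueError("expected 3 comma-separated fields: %r" % entry)
    backend, frontend, mode = parts
    if mode not in ("r", "w"):  # only the modes used in the examples above
        raise ValueError("unexpected access mode: %r" % mode)
    return DiskSpec(backend, frontend, mode)


disk = ['phy:sda,xvda,w', 'tap:aio:hdb1,xvdb1,w']
specs = [parse_disk(d) for d in disk]
```

Note that the backend field may itself contain colons (as in tap:aio:hdb1), which is why the split is on commas only.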
vif
- Array of virtual interface specifications
- Zero or more name=value entries for each interface
Single interface examples
vif=[' ']
vif=['mac=00:16:3e:51:c2:b1']
vif=['mac=00:16:3e:36:a1:e9, bridge=eth0']
Multiple interface examples
vif=[' ', ' ']
vif=['mac=00:16:3e:36:a1:e9, bridge=eth1', 'bridge=eth0']
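
To make the "zero or more name=value entries" rule concrete, here is a small Python sketch that turns each vif string into a dictionary. The function `parse_vif` is a hypothetical helper for illustration and does not reproduce how xend parses these entries.

```python
def parse_vif(entry):
    """Turn 'name=value, name=value' into a dict; a blank entry gives {}."""
    options = {}
    for item in entry.split(","):
        item = item.strip()
        if not item:
            continue  # blank entry: all defaults
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError("expected name=value, got %r" % item)
        options[name.strip()] = value.strip()
    return options


vif = ['mac=00:16:3e:36:a1:e9, bridge=eth1', 'bridge=eth0', ' ']
interfaces = [parse_vif(v) for v in vif]
```

Each list element describes one virtual interface, so the length of the list is the number of interfaces the guest receives.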
kernel
- Use Domain0 kernel (external to guest)
ramdisk
- Use Domain0 ramdisk (external to guest)
bootloader
- pygrub
  - Use Xen-compatible kernel and initrd (internal to guest)
  - Security issues
root
- Root file system device
extra
- Append to kernel command line
kernel
builder
device_model
boot
sdl
- Simple DirectMedia Layer
- Built-in to Xen virtual frame buffer
vnc
- Virtual Network Computing
Deprecated parameters
cdrom
- Replaced by disk=['....hdX:cdrom...']
file:/ in disk=['file:/...']
- Replaced with disk=['tap:aio:/...']
nics
- Better to simply use the vif array
PV Guest Example
Demo
HVM guest example
kernel="/usr/lib64/xen/boot/hvmloader"
builder="hvm"
device_model = "/usr/lib64/xen/bin/qemu-dm"
disk=['phy:/xen/images/hvm.disk,hda,w', 'phy:/dev/cdrom,hdc:cdrom,r']
sdl=1
boot="dc"
memory=512
vif=['type=ioemu,bridge=eth0']
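
Because xm guest configuration files use Python syntax, a config like the one above can be inspected with a few lines of Python. The sketch below is only an illustration: `load_guest_config`, `describe`, and `boot_order` are hypothetical helpers, the boot-letter table reflects the usual QEMU convention (c for hard disk, d for CD-ROM), and `exec` should be used only on files you trust.

```python
HVM_CONFIG = '''
kernel = "/usr/lib64/xen/boot/hvmloader"
builder = "hvm"
device_model = "/usr/lib64/xen/bin/qemu-dm"
disk = ['phy:/xen/images/hvm.disk,hda,w', 'phy:/dev/cdrom,hdc:cdrom,r']
sdl = 1
boot = "dc"
memory = 512
vif = ['type=ioemu,bridge=eth0']
'''

# Assumed mapping of boot letters to devices (QEMU convention).
BOOT_DEVICES = {"a": "floppy", "c": "hard disk", "d": "CD-ROM", "n": "network"}


def load_guest_config(text):
    """Evaluate config text and return its variables as a dict."""
    namespace = {}
    # Emptying builtins is NOT a sandbox; run this on trusted input only.
    exec(text, {"__builtins__": {}}, namespace)
    return namespace


def describe(config):
    kind = "HVM" if config.get("builder") == "hvm" else "PV"
    return "%s guest, %d MB, %d disk(s), %d interface(s)" % (
        kind,
        config["memory"],
        len(config.get("disk", [])),
        len(config.get("vif", [])),
    )


def boot_order(config):
    """Translate the boot string into device names, in order."""
    return [BOOT_DEVICES[letter] for letter in config.get("boot", "")]


config = load_guest_config(HVM_CONFIG)
summary = describe(config)
order = boot_order(config)
```

With boot="dc" the guest tries the CD-ROM first and then the hard disk, which suits an installation from install media.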
Demo
Distribution-specific Tools
CentOS/Fedora
- Virtual Machine Manager (virt-manager) and rpmstrap
OpenSUSE
- YaST Virtual Machine Management
Debian/Ubuntu
- debootstrap and xen-tools
Gentoo
- quickpkg and domi
Start Installation
Interacting with Guests
- SSH
- VNC
- FreeNX
- Xen console
- Xen virtual frame buffer (sdl)
- Remote Desktop (demo)
Questions?
Lunch
Food Time
Unit 3
- Quick LVM Refresher
- Network Storage
- Guest Image Files from Scratch
- Guest Save, Restore, and Live Migration
- Xen Device Models
- Network Configurations
Logical Volume Manager (LVM)
- Provides abstraction above block devices
- Allows the system administrator to:
  - Span logical volumes across physical volumes
  - Grow and shrink logical volumes
  - Use various RAID levels
  - Use copy-on-write
Why LVM?
Flexible
- Resize partitions (guest images)
Feature-rich
- Density of logical volumes (guests)
- Copy-on-write snapshots
Key terms
LVM commands
# pvcreate /dev/sda3 /dev/sda4 /dev/hda1
# vgcreate xen_vg /dev/sda3 /dev/sda4 /dev/hda1
# lvcreate -L 4G -n guest_partition xen_vg
# lvextend -L 5G /dev/xen_vg/guest_partition
# lvreduce -L 3G /dev/xen_vg/guest_partition
LVM caveats
- Resize the underlying file system after the lvextend
- Resize the underlying file system before the lvreduce
ext2/3
- Resize larger or smaller
- Resize online (while the file system is mounted)
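
The ordering rule in these caveats (grow the volume before the file system, shrink the file system before the volume) can be captured in a short Python sketch. The function `resize_steps` is a hypothetical helper that only builds command lines and never runs them. Using resize2fs for ext2/3 is an assumption of this example, and shrinking typically also requires an unmounted, freshly checked file system.

```python
def resize_steps(lv_path, new_size, grow):
    """Return commands (as argv lists) in a safe order; nothing is executed."""
    if grow:
        return [
            ["lvextend", "-L", new_size, lv_path],  # 1. enlarge the volume
            ["resize2fs", lv_path],                 # 2. grow the fs to fill it
        ]
    return [
        ["resize2fs", lv_path, new_size],           # 1. shrink the fs first
        ["lvreduce", "-L", new_size, lv_path],      # 2. then shrink the volume
    ]


grow_plan = resize_steps("/dev/xen_vg/guest_partition", "5G", grow=True)
shrink_plan = resize_steps("/dev/xen_vg/guest_partition", "3G", grow=False)
```

Reversing the shrink order would cut the volume out from under a file system that still expects the old size, which is how data gets lost.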
Network Storage

- Exports block devices over the network
- Lightweight Ethernet layer protocol
- No built-in security

- Exports block devices over the network
- Scales with network bandwidth
- Network layer protocol
- Client and user-level security

- Exports file system over the network
- Well-known and widely used
- Network layer protocol
- Known performance issues as root file system

- Exports block device over the network
- Scales with network bandwidth
- Network layer protocol
- Not recommended as root file system

- Advantages of block devices and file servers
- More difficult to set up and configure
- Examples:
  - Global Network Block Device (GNBD)
  - Distributed Replicated Block Device (DRBD)
Guest Image Files from Scratch
Compressed tar file
- Smallest guest image file
- Best for backup and sharing
- More difficult to set up guest
Disk image file
- Single file containing root and swap partitions
- Largest guest image file
- More commands needed
Partition image files
- Separate files for root and swap partitions
- Easiest to work with (standard commands)
- Slightly more image files to work with (separate swap)
Creation (compressed tar file)
tar -czpf /linux-root.tgz \
    --exclude /proc \
    --exclude /linux-root.tgz \
    /
Using
- Set up a partition, logical volume, disk or partition image
- Extract with tar command
- Customize partition setup
- Configure network-specific details
  - /etc/hosts and /etc/hostname
  - IP address setup
Creation (disk image file)
- Use dd to make a sparse or pre-allocated file
- Partition the image (fdisk)
- Make partitions available to the system (kpartx)
- Format the file systems (mkfs/mkswap)
- Populate root file system
Using
- If no customization is needed, use directly
- If customization is needed
  - Associate disk image with block device (losetup)
  - Make partitions available to system (kpartx)
  - Mount partitions in /dev/mapper (mount)
  - Do customizations
  - Unmount (umount), release partitions (kpartx -d), un-associate with block device (losetup -d)
Creation (partition image file)
- Use dd to make a sparse or pre-allocated file
- Format each file system (mkfs/mkswap)
- Populate file system (if root partition)
Using
- If no customization is needed, use directly
- If customization is needed
  - Mount with loop option (mount -o loop)
  - Do customizations
  - Unmount (umount)
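
The difference between a sparse and a pre-allocated image file can be shown without dd. The Python sketch below is an assumed equivalent, not the dd invocation from the slides: a sparse file is created by setting the file length without writing data, while a pre-allocated file has every byte written. The function names and file names are hypothetical.

```python
import os
import tempfile


def create_sparse_image(path, size_bytes):
    """Set the file length without writing data (sparse on most file systems)."""
    with open(path, "wb") as image:
        image.truncate(size_bytes)


def create_preallocated_image(path, size_bytes, chunk=1024 * 1024):
    """Write zeros for the full length so every block is actually allocated."""
    zeros = b"\0" * chunk
    with open(path, "wb") as image:
        remaining = size_bytes
        while remaining > 0:
            n = min(chunk, remaining)
            image.write(zeros[:n])
            remaining -= n


workdir = tempfile.mkdtemp()
sparse_path = os.path.join(workdir, "sparse.img")
full_path = os.path.join(workdir, "full.img")
create_sparse_image(sparse_path, 64 * 1024 * 1024)
create_preallocated_image(full_path, 4 * 1024 * 1024)
```

Both files report their full length, but the sparse one consumes disk blocks only as the guest writes to it, at the cost of possible fragmentation and of running out of space later.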
Guest Save, Restore, and Live Migration
- Save and restore are similar to hibernate and resume of a physical PC
- Built-in Xen functionality and xm commands
- Cold static relocation
- Warm static (regular) migration
- Live migration
xm save
- Pauses and suspends (hibernates) a guest
- Saves guest state to a file on disk
xm restore
- Restores a guest's state from disk
- Resumes execution of guest
Guest Relocation
- Image and config files need to be manually copied from source to target Domain0
Benefits
- Hardware maintenance with less downtime
- Backup of guest images
- Shared storage not required
Limitations
- More manual process
- Guests should be shut down during copy
Guest Migration
xm migrate
- Pauses a guest
- Transfers guest state across network to a new Domain0
- Resumes guest on destination host
- Network connections to and from guest are interrupted and will probably time out
xm migrate --live (live migration)
- Copies a guest's state to a new Domain0
- Repeatedly copies dirtied memory until transfer is complete
- Re-routes network connections
Benefits
- Load balancing
- Hardware maintenance with little or no downtime
- Relocation for various reasons
Limitations
- Shared storage required
- Guests on same layer 2 network
- Sufficient resources needed on target machine
- CPU architectures need to match
- Some constraints on hypervisor version
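
The "repeatedly copies dirtied memory" step of live migration can be modeled with a toy simulation. The Python sketch below is a simplified model under stated assumptions, not Xen's actual algorithm: each round re-sends the pages dirtied during the previous round, the guest dirties a fixed percentage of what was just sent, and the guest is paused for a final copy once the remaining set is small. The function name and all parameters are hypothetical.

```python
def simulate_precopy(total_pages, dirty_percent, stop_threshold, max_rounds=30):
    """Return (pages sent in each live round, pages sent during the final pause)."""
    to_send = total_pages  # the first round copies all guest memory
    rounds = []
    while to_send > stop_threshold and len(rounds) < max_rounds:
        rounds.append(to_send)
        # Pages dirtied while this round was in flight must be sent again.
        to_send = to_send * dirty_percent // 100
    # Whatever is left is copied while the guest is briefly paused.
    return rounds, to_send


rounds, final_pages = simulate_precopy(100000, 10, 50)
```

In this model a guest that dirties memory as fast as it can be copied (dirty_percent of 100) never converges, so the round limit forces the pause; that is why write-heavy guests see longer downtime.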
Xen Device Models
Generic backends
- Loaded in DriverDomain (often Domain0)
Generic frontends
- Loaded in guest domain
- Connects to corresponding backend driver
- Guests use standard Xen virtual device drivers
- Real device-specific drivers are in DriverDomain
- Devices are multiplexed to the guests

- Provides emulation of devices
- Provides exclusive access illusion to the guests
- Used primarily for HVM guests

- Guests granted full access to specific PCI devices
- The actual device driver runs in the guest
Benefits
- Highest performance for a device
- Useful when virtualization doesn't support a device
- System stability with buggy driver
Limitations
- Not (yet) well-tested
- DriverDomain as backend can be tricky
- HVM guest support still limited
- Security considerations
  - Without an IOMMU, guests can use direct memory access (DMA) to reach main memory
PCI Passthrough
Demo
Network Configurations
- Network Bridge Configuration
- Network Route Configuration
- Network NAT Configuration
- Host-only Networking
- Handling Multiple Interfaces
- Virtual Private Network (VPN)
Network Bridge Configuration
- A bridge relays traffic based on MAC address
Behavior of guests in bridge mode
- Appear transparently on the DriverDomain's network
- Access the network directly (through software bridge)
- Obtain IP address on the local Ethernet
Configuration details
- Set network-bridge and vif-bridge in xend config
- Bridge is default guest configuration
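
Relaying traffic based on MAC address can be sketched as a toy learning bridge in Python. This is a generic illustration, not the Linux bridge code that Xen relies on: the class `ToyBridge` is hypothetical, and the port names (peth0, vif1.0, vif2.0) and MAC addresses are chosen only for the example.

```python
class ToyBridge:
    """Minimal learning bridge: remembers which port each MAC was seen on."""

    def __init__(self):
        self.mac_table = {}

    def receive(self, in_port, src_mac, dst_mac, ports):
        """Learn the sender's port, then return the ports to forward on."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the sender's.
            return [p for p in ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the same port; nothing to relay
        return [out_port]


ports = ["peth0", "vif1.0", "vif2.0"]
guest1, guest2 = "00:16:3e:00:00:01", "00:16:3e:00:00:02"
bridge = ToyBridge()
first = bridge.receive("vif1.0", guest1, guest2, ports)   # guest2 not yet known
second = bridge.receive("vif2.0", guest2, guest1, ports)  # guest1 was learned
third = bridge.receive("vif1.0", guest1, guest2, ports)   # guest2 now known
```

Because forwarding is decided at the Ethernet layer, guests on the bridge look like ordinary hosts on the local network, which matches the bridge-mode behavior listed above.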
Network Route Configuration
Behavior of guests in route mode
- Get routed to the Ethernet through the DriverDomain
- Access the network via DriverDomain
- Obtain IP address from DriverDomain
Configuration details
- Set network-route and vif-route in xend config
- Set IP, netmask, and gateway in guest config
- Enable ip_forwarding in DriverDomain
Troubleshoot at IP layer
- Network traces and iptables configuration
Network NAT Configuration
Behavior of guests in NAT mode
- Get NATed through the DriverDomain
- Access outgoing traffic transparently through NAT
- Are not visible from the outside (behind NAT router)
- Obtain internal IP from software NAT router
Configuration details
- Set network-nat and vif-nat in xend config
- Set IP, netmask, and gateway in guest config
- Uses the iptables MASQUERADE target
Host-only Networking
- Share data on private network
  - Guests to other guests only
  - Host (DriverDomain) to guests only
Configuration details
- Set up dummy bridge in DriverDomain
- Set bridge to dummy bridge in guest config
- Secure with ebtables (if needed)
- Troubleshoot with brctl and iptables
Handling Multiple Interfaces
- Configure the purpose of NICs in xend config
- Specify bridge device in guest config
- Create custom scripts that call network-* scripts
  - Set virtual device number (vifnum)
  - Set network device (netdev)
  - Set bridge device (bridge)
- Add custom network script in xend config
- Add vif entries and specify bridge in guest config
Virtual Private Network (VPN)
- Illusion of VPN client on VPN server's local network
- VPN implementation based on bridged network
- Installation methods
  - Kernel module
  - Userspace daemon in DriverDomain
- Not well-tested (yet)
Questions?
Afternoon Break
Break Time
Unit 4
- Security Basics
- Advanced Security: sHype and XSM
- Management APIs and Utilities
- Resources and Performance
- Future Directions
Security Basics
- Standard system security practices apply
- Secure Domain0 and Xen
- Secure guest domains normally
Security Basics
Secure Hypervisor (sHype) Xen Security Modules (XSM) Minimize software packages Minimize running services and open ports Use firewall and intrusion detection systems Similar to Domain0 software, services, and ports IOMMU for direct PCI device pass-through
Secure Domain0
Secure guests
98
Advanced Security: sHype and XSM
- Labeled objects
- Privilege groups given access to specific object labels
- Enables remote attestation by digitally signing cryptographic hashes of software components
- Attestation means affirming that some software or hardware is genuine or correct
Demo
APIs
- libvirt
- Xen Application Programming Interface (API)
- Xen Common Information Model (CIM)
Management utilities
- virt-manager
- virsh
- XenMan (ConVirt)
- Enomalism
- Citrix XenServer
- Virtual Iron
- IBM Director (IBM Virtualization Manager extension)
Demo
Gauging performance
Memory management
Xen scheduler
- Credit scheduler
I/O schedulers
- Noop scheduler
- Deadline scheduler
- Anticipatory scheduler (as)
- Completely Fair Queuing scheduler (cfq)
Macro benchmarks
Disk I/O
Network
CPU
Future Directions
- Smart devices
- IOMMUs
- Trusted Platform Module (TPM) support
- Xen Domain0 inclusion in mainline Linux
- PV drivers for HVM guests
Further Information
Useful resources
- Xen Community
- XenWiki
- Xen Mailing Lists
- Xen Bugzilla
- Xen Summit
- Xen source code
- Academic papers and conferences
- The Definitive Guide to the Xen Hypervisor (book)
- Running Xen: A Hands-On Guide to the Art of Virtualization (book)
Questions?
Acknowledgments
References
http://runningxen.com
http://runningxen.com/mailman/listinfo/readers_runningxen.com
http://www.usenix.org/publications/login/2007-02/pdfs/hand.pdf
http://wiki.xensource.com/xenwiki/XenStoreReference
http://portal.acm.org/citation.cfm?id=1281700.1281706&coll=&dl=ACM
http://www.usenix.org/publications/login/2007-02/pdfs/griffin.pdf
http://passat.crhc.uiuc.edu/dasCMP/papers/dasCMP07/paper01.pdf
http://ieeexplore.ieee.org/iel5/4299339/4299340/04299347.pdf?isnumber=4299340&prod=CNF&arnumber=4299347&arSt=2&ared=2&arAuthor=Apparao%2C+Padma%3B+Makineni%2C+Srihari%3B+Newell%2C+Don
http://workspace.globus.org/vtdc06/VTDC_files/programdraft.htm
http://xen.org/files/xensummit_fall07/28_PadmaApparao.pdf
http://opensolaris.org/os/project/libmicro/
http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=Storage&articleId=9081798&taxonomyId=19
http://weblog.infoworld.com/daily/archives/2008/04/top_3_gotchas_o.html
http://news.zdnet.com/2100-3513_22-6191965.html
http://www.cl.cam.ac.uk/research/srg/netos/xen/readmes/hg-cheatsheet.txt
http://www.cuddletech.com/blog/pivot/entry.php?id=469