
Boot from SAN in Windows Server 2003 and Windows 2000 Server

Microsoft Corporation December 2003

Abstract
Booting from a storage area network (SAN), rather than from local disks on individual servers, can enable organizations to maximize consolidation of their IT resources, minimize their equipment costs, and realize the considerable management benefits of centralizing the boot process. This white paper describes boot from SAN technology in a Windows environment, the advantages and complexities of the technology, and a number of key SAN boot deployment scenarios. Boot from SAN technologies are supported on Microsoft Windows Server 2003 and Microsoft Windows 2000 Server platforms.

Contents
Introduction
The Boot Process: An Overview
    Local Boot
    Remote Boot
    Boot from SAN
Boot from SAN: Pros and Cons
    Advantages
    Disadvantages
Key Solution Components
    Hardware Requirements
    Boot Specific Requirements
SAN Boot Scenarios
    Basic SAN
    Multipath Configurations
    Clustered Servers
    Directly Attached Paging Disk
    iSCSI Boot from SAN
Troubleshooting Boot from SAN
    General Boot Problems
    Potential Difficulties with SAN Boot
    Current Limitations to Windows Boot from SAN
Summary
Additional Resources
Appendix
    The Boot Process: Details
        Pre-boot
        Boot Sequence
        Intel IA-64 Architecture Differences


Introduction
One of the ways in which organizations with large-scale server deployments (consisting of hundreds to tens of thousands of servers, such as those found in enterprise data centers) are dramatically cutting costs is to replace large servers with highly compact, rack-mountable forms. These smaller form factors dispense with costly individually attached hardware (such as power supplies and device interfaces) in favor of resources shared among the servers in a rack. The densest of these forms is the blade server, which began shipping in 2002. In addition to reducing hardware, electrical, and square-footage costs, blade servers are more manageable than traditional servers: they are hot pluggable, they simplify cable configurations (incorrect configurations can be a major source of downtime), and a single server can be used to manage all other servers in the same shared chassis.

While some blade servers have internal disks, those disks tend to have lower performance and capacity than SCSI disks, a fact which is helping to drive adoption of diskless blade servers used in combination with storage networks (NAS and SAN). The development of diskless blade servers introduced a challenge for versions of the Windows operating system prior to Windows Server 2003, since Windows boot procedures were originally developed with the requirements that the boot disk be directly attached to the server and that the operating system have access to the boot volume at all times.

With the release of Windows Server 2003 (and updates to Windows 2000), the Windows platform supports boot from SAN capabilities without requiring that the dedicated boot disk be local to the server. These capabilities, and the steps necessary for successful deployment in a variety of environments, including clustering, are explained in the sections that follow. Because boot from SAN configurations are fundamentally dependent on hardware configurations and capabilities, it is important to note that support for boot from SAN in the Windows environment comes from the hardware vendors, not from Microsoft.


The Boot Process: An Overview


The boot process, variously known as booting or bootstrapping, is the iterative process of loading the installed operating system code from the storage device into computer memory when the computer is powered on. Since BIOS (Basic Input/Output System) is the most basic code, it is loaded first. It serves to initialize the computer hardware and read in the code (from a storage device or network) necessary to begin the next stage of booting. This code loads the operating system, completes hardware setup and produces a fully functional operating system residing in memory. For a more detailed description of the boot process, see the Appendix of this white paper. The boot process can occur from a direct attached disk, over a local area network, or from a storage area network. In all cases, the critical step to a successful boot is locating the boot disk. The device controlling that process varies, depending on the boot type.

Local Boot
The most common booting approach is to boot from a direct-attached disk. The server BIOS locates the SCSI adapter BIOS, which contains the instructions allowing the server to determine which of the internal disks is the boot disk necessary to load the operating system.

Remote Boot
Remote booting (also called network boot) is the ability of a computer system to boot over a local area network (LAN) from a remote boot server. Critical to remote boot is the network adapter card, which contains the instructions necessary for booting. Remote boot is not a new concept; UNIX systems have had remote boot capabilities for about 30 years. Remote booting confers a number of advantages, including remote administration of client workstations, greater control over distribution of software, and cost reduction through eliminating the need for local hard drives. However, the downside of remote boot is that booting over the LAN is a considerable security risk. While Microsoft Windows NT Server 4.0 enables remote boot, Microsoft has not made this capability widely available in its other operating system products because of the inherent security risks. Windows does, however, support a much more secure remote boot technology: boot from SAN.

Windows NT Server 4.0 uses the Remoteboot Service, which requires a special chip on each client network interface card to enable remote booting of MS-DOS, Windows 3.1, Windows 95, and Windows 98. This boot programmable read-only memory (PROM) chip redirects the standard startup process from the client to the network adapter card and establishes the network connection to the server. The client can then obtain a boot image (the minimum information necessary for startup and configuration) directly from the server. The Remoteboot Service requires installation of the NetBEUI and DLC protocols on the server.


Boot from SAN


Boot from SAN is a remote boot technology; however, in this case, the source of the boot disk is on the storage area network (SAN), not on the LAN. The server communicates with the SAN through the host bus adapter (HBA), and it is the HBA BIOS that contains the instructions that enable the server to find the boot disk on the SAN. Boot from SAN, like LAN-based remote boot, offers the advantages of reduced equipment costs. It also offers a number of other advantages, including reduced server maintenance, improved security, and better performance. These factors are addressed in greater detail in the next section.


Boot from SAN: Pros and Cons


Booting from a SAN can offer organizations a number of storage management advantages. However, while boot from SAN is conceptually straightforward (and is the same process whether the boot is local to the server or from the SAN), configuration of the various hardware components to guarantee a successful SAN boot is both difficult and inadequately documented. Given this complexity, any organization interested in boot from SAN capabilities should weigh the increased complexity against the advantages boot from SAN can confer.

Advantages
Boot from SAN technologies help businesses continue the trend toward consolidated and effective management of storage resources, decoupled from the server.

Server Consolidation. Boot from SAN alleviates the necessity for each server to boot from its own direct-attached disk, since each server can now boot from an image of the operating system on the SAN. Thin diskless servers take up less facility space, require less power to operate, and, because they have fewer hardware components, are generally less expensive. They also avoid the internal disk failures that are very common in large datacenters.

Centralized Management. Since operating system images can be stored to disks on the SAN, all upgrades and fixes can be managed at a centralized location. This eliminates the need to manually install upgrades on each system. Changes made to the disks in the storage array are readily accessible by each server.

Simplified Recovery from Server Failures. Recovery from server failures is simplified in a SAN environment. Rather than a lengthy process of re-installing the operating system and a backup copy of the data from tape to a spare server, the spare can simply be booted from the SAN and then access the data stored on the SAN, returning to production with maximum efficiency.

Rapid Disaster Recovery. All the boot information and production data stored on a local SAN can be replicated to a SAN at a remote disaster recovery site. If a disaster destroys functionality of the servers at the primary site, the remote site can take over with minimal downtime.

Rapid Redeployment for Temporary Server Loads. Businesses that experience temporary but high production workloads can take advantage of SAN technologies to clone the boot image and distribute it to multiple servers for rapid deployment. Such servers may only need to be in production for hours or days, and can be readily removed when the production need has been met; the highly efficient deployment of the boot image makes such temporary deployment cost effective.


Disadvantages
For all its advantages, booting from a SAN is not a technology for the storage administrator who is unfamiliar with the complexity of deploying a SAN.

Hardware Deployment is Complex. Solution components, including HBAs and logical unit number (LUN) management, must all be configured correctly for a server to successfully boot from the SAN. These challenges increase in a multivendor hardware environment.

Boot Process is Complex. The details of the operating system boot process, and the dependencies of the process on operating system functionality, are conceptually challenging (see the Appendix of this white paper) and need to be generally understood to make troubleshooting more effective.

A LUN is a logical disk. A LUN may map onto a single physical disk or multiple physical disks, and may constitute a whole or only part of any given disk or disks.


Key Solution Components


Boot from SAN, while conceptually straightforward, can be problematic to deploy correctly. Key to effective deployment, whether the most basic topology or a complex enterprise configuration, is ensuring that both software and hardware components are installed according to specified vendor requirements. This section outlines the key solution components. The actual sequence of steps depends on the boot from SAN scenario deployed.

Hardware Requirements
The sections that follow outline the basic hardware components necessary for correctly deploying a boot from SAN solution. It is recommended that key components (HBAs, switches, etc.) be duplicated for redundancy in the event of hardware failure.

Servers

Each new server designated to be connected to the SAN storage array should be installed as per vendor instructions. If the server has already been in production, ensure that all disks are backed up before connecting it to the SAN. Ensure that the operating system supports boot from SAN. Windows Server 2003, Windows Storage Server 2003, Windows 2000 Server, and Windows NT 4.0 are capable of booting from the SAN.

Host Bus Adapters

For each server to be connected to the SAN, record the world wide name (WWN) of each HBA prior to installation, or obtain this information from the setup utility resident on the HBA. The WWN is a unique address assigned by the manufacturer, and it is used during the configuration process. It may be necessary to obtain both the world wide port name and the world wide node name.

Install the HBA according to vendor instructions. Ensure that the HBA supports booting (some do not), and that the HBA BIOS has the correct version of the firmware installed.

Obtain the correct HBA driver. It is this driver that allows the server to communicate with the disks in the SAN as if they were locally attached SCSI disks; the driver also provides the bootstrap program. In certain configurations, the Microsoft Storport driver is the recommended driver for boot from SAN.

Ensure that the HBA settings are configured to match all components of the particular solution deployed, including the server, operating system version, the SAN fabric, and the storage array. Vendors will include instructions on any necessary changes to the default configuration.


SAN Fabric

The SAN fabric consists of the switches and cabling that connect the servers to the storage array. The HBA for each server is connected to a switch, and from there through to the port on the storage array. Assign the new HBA devices to zones (groups) on the SAN fabric. Communication is restricted to members of the same zone. WWNs or the physical switch ports are used to identify members of the zone.

Storage Array

Storage controllers control communication between the disks on the array and the ports to which the servers connect. Storage arrays should have at least two controllers for redundancy.

Create the RAID sets and LUNs for each server on the storage array. The logical units are either numbered automatically by the storage array, or they can be assigned by the user. Many storage arrays can manage disk security so that servers can only access the LUNs that belong to them. Disks and LUNs are assigned to ports; hence a single port connects to multiple disks or LUNs. These storage resources must be shared among the multiple servers, so LUN management through masking is critical to prevent multiple hosts from having access to the same LUN at the same time. Microsoft only supports boot from SAN when it is used with LUN masking.
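To make the masking requirement concrete, the following minimal Python sketch models an array-side masking table keyed by initiator WWPN and checks that no boot LUN is exposed to more than one server. The WWPNs and LUN numbers are illustrative placeholders, not values from any particular array or management interface.

    # Minimal model of a storage-array LUN masking table (illustrative values only).
    # Each entry maps an initiator HBA WWPN to the set of LUNs it may see.

    masking_table = {
        "50:06:0b:00:00:c2:62:00": {1},   # server 1 HBA -> boot LUN 1
        "50:06:0b:00:00:c2:62:02": {2},   # server 2 HBA -> boot LUN 2
        # LUN 3 (data) stays masked from both servers until the OS installs are done.
    }

    def visible_luns(wwpn):
        """Return the LUNs the array presents to the given initiator WWPN."""
        return masking_table.get(wwpn, set())

    def shared_boot_luns(table):
        """Find LUNs exposed to more than one initiator. Boot LUNs must never be
        shared; Microsoft supports boot from SAN only with LUN masking in place."""
        owners = {}
        for wwpn, luns in table.items():
            for lun in luns:
                owners.setdefault(lun, []).append(wwpn)
        return {lun: w for lun, w in owners.items() if len(w) > 1}

    if __name__ == "__main__":
        print(visible_luns("50:06:0b:00:00:c2:62:00"))   # {1}
        print("shared LUNs:", shared_boot_luns(masking_table) or "none")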

Boot Specific Requirements


A number of boot-specific factors must be considered in order to correctly deploy a boot from SAN solution.

Boot BIOS

Ensure that the correct boot BIOS is on the HBA; without it, the HBA may not detect any disks on the SAN. The default setting for the HBA boot BIOS is typically disabled; it must be enabled on only one adapter per server in order to boot from SAN.

HBA Driver

Ensure that the HBA driver is appropriate for the boot configuration design. The SCSIport driver, which was designed for parallel SCSI solutions, is not the appropriate driver for high performance SAN configurations; instead, use a Storport miniport driver with Windows Server 2003, as it is specifically designed for such solutions. The Storport driver features improved reset handling, which makes it possible to boot from SAN and have clusters running on the same adapters. Storport also allows for queue management, which is critical in a SAN fabric where fabric events such as adding or removing devices are common. Further details on the Storport driver can be found in the white paper Storport in Windows Server 2003: Improving Manageability and Performance in Hardware RAID and Storage Area Networks.


Designate Boot Disk

Each server must have access to its own boot drive. Designate as many boot disks (or LUNs) in the storage array as there are servers accessing the storage in that array. For each server to access the correct boot disk, a setting in the HBA boot BIOS must be changed to reflect the address of the disk or LUN on the SAN.


SAN Boot Scenarios


SAN deployment configurations range from quite simple to enormously complex. This section guides the reader from the most basic configuration through some of the more complex configurations common to enterprise storage environments.

Basic SAN
The simplest SAN configuration, shown in Figure 1, is to deploy two diskless servers, each with a single HBA, connected to Fibre Channel storage. (For simplicity's sake, this configuration does not employ redundant components, even though deploying them is recommended.) For Windows boot from SAN solutions to work correctly, each server must have access to its own dedicated boot device.

Figure 1. Basic boot from SAN Configuration

One of the simplest SAN configurations is a Fibre Channel arbitrated loop (FC-AL) configuration, in which up to 126 devices are connected. However, this configuration is not supported for Windows boot from SAN, since the addition or removal of devices from an FC-AL configuration may result in all the devices acquiring a new network address. Moreover, the interruptions caused by loop events can cause I/O to fail, which can prevent the whole system from booting or cause it to crash.


Follow these steps to set up the system so that the BIOS for each server correctly locates its boot LUN:

1. Configure the storage array with LUNs. (These LUNs are empty until the operating system is loaded and the file structure is added to allow population with data.) The storage array returns the LUN numbers automatically, or the administrator can set them. (LUN numbers either remain as assigned, or are remapped by the HBA, as discussed later in these steps.) In this case, the intention is to install boot files on LUN 1 and LUN 2; LUN 3 can be used for data. The array is assumed to support a single LUN 0 instance, which is not a disk device. This logical unit is used to obtain discovery information from the array through the SCSI-3 REPORT LUNS command. Devices that only comply with earlier specifications are not recommended for use with Fibre Channel, clustering, or when booting from SAN. (The array must also return the HiSup bit set in the LUN 0 INQUIRY data unless a Storport miniport is available.)

2. Determine the manufacturer-set world wide node name (WWNN) for each HBA. (This number can be read from the HBA label prior to installation, or it may be displayed using the BIOS setup program.)

3. Determine the world wide port name (WWPN) of the controller on the storage array.

4. Ensure that each server only has access to the LUNs allocated to it. This is done by unmasking the appropriate LUN in the storage array to the appropriate server, so that a path is traced from the LUN to the controller, across the fabric, and into the HBA. The LUN number, node, and port addresses are all required. In this example, LUN 1 will be unmasked to server 1 and LUN 2 to server 2. It is advisable to keep LUN 3 masked at this point so that it does not appear as a choice during installation of the operating system.

5. Begin installation of the operating system on server 1 from a bootable CD. If steps 1-4 have been correctly carried out, installation should proceed without error.

6. When prompted during setup for third-party storage drivers, press the F6 key and make sure that the miniport driver for the HBA is available (typically on a floppy disk or CD). While the inbox Fibre Channel drivers can be used for Windows Server 2003, check for the correct driver required by the storage vendor. For Windows 2000 or earlier, ensure that the appropriate driver is available.

Depending on the vendor, the default state of the LUNs is either masked or unmasked to the server. Thus, whether the administrator unmasks or masks depends on the default state of the LUNs on the storage array.


7. Setup searches for and lists all available LUNs. Since only LUN 1 has been unmasked to the server, the LUN on which to install will be clear. Installation proceeds in two stages:

Text mode: The target LUN is formatted and partitioned, and the operating system files are copied to the boot LUN. Once this stage is complete, the system automatically restarts and then begins the second phase of setup.

Graphical user interface mode: Setup discovers and enumerates server hardware, installs drivers, and finishes installation of the operating system files. The system restarts, and the user can now log on to server 1, which is running Windows from the SAN.

8. Repeat steps 5-7 for server 2, using LUN 2. Again, if steps 1-4 have been correctly carried out, installation of the operating system will be successful. Subsequent boots from the SAN should proceed without problem.

Adding redundancy to this basic configuration introduces a further layer of complexity. This is discussed in the next section, Multipath Configurations.
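Before turning to multipath configurations, the following minimal Python sketch pulls together the per-server information gathered in steps 2-4 of this two-server example (initiator WWPN, array controller WWPN, boot LUN) into the unmasking requests handed to the array's masking utility. All names and WWNs are made-up placeholders; a real array exposes this through its own vendor-specific tools.

    # Assemble the per-server data needed to unmask a boot LUN (steps 2-4).
    # All server names and WWNs below are illustrative placeholders.

    servers = {
        "server1": {"hba_wwnn": "20:00:00:e0:8b:01:aa:01",
                    "hba_wwpn": "21:00:00:e0:8b:01:aa:01",
                    "boot_lun": 1},
        "server2": {"hba_wwnn": "20:00:00:e0:8b:01:aa:02",
                    "hba_wwpn": "21:00:00:e0:8b:01:aa:02",
                    "boot_lun": 2},
    }

    ARRAY_CONTROLLER_WWPN = "50:00:1f:e1:00:0a:bc:d9"   # step 3: controller port name

    def unmask_requests(servers, controller_wwpn):
        """Yield the (initiator, target, LUN) triples used in step 4.
        LUN 3 (data) is deliberately left masked so it does not appear
        as a choice during operating system installation."""
        for name, info in servers.items():
            yield {"server": name,
                   "initiator_wwpn": info["hba_wwpn"],
                   "target_wwpn": controller_wwpn,
                   "lun": info["boot_lun"]}

    for request in unmask_requests(servers, ARRAY_CONTROLLER_WWPN):
        print(request)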

Multipath Configurations
Using the same two-server configuration introduced in the previous section, the administrator can add redundant HBAs, cabling, switches, controllers, and ports on each controller. This configuration, which confers high availability and high performance to the deployment, is shown in Figure 2.

Figure 2. A Fully Redundant Boot from SAN Configuration


In order for multipath solutions to work correctly, path management software that works with Windows is necessary. Follow steps 1-3 as listed above for the basic SAN configuration, obtaining the WWNN and WWPN for each HBA and controller. Then:

1. Configure storage, creating LUNs 1-3.

2. On each server, unmask the server's boot LUN to both of its HBAs, through both controllers:

Server 1
    HBA A: controller 1 LUN 1, controller 2 LUN 1
    HBA B: controller 1 LUN 1, controller 2 LUN 1

Server 2
    HBA A: controller 1 LUN 2, controller 2 LUN 2
    HBA B: controller 1 LUN 2, controller 2 LUN 2

3. Make sure that only one HBA has its boot BIOS enabled. Only one LUN can be the boot LUN for each server.

Continue with all installation activities, as outlined in the basic SAN configuration. Note that, since only one boot device can be exposed to the BIOS as the boot LUN, the BIOS requires a manual reset to boot from HBA B if HBA A fails.

Crash Dump File Creation

In the event of a system or kernel software component failure, a crash dump file is created and used to aid with diagnosis of the failure. To be created, the file must be written to the system drive (C:). The crash dump stack (created at boot and the precursor to creation of the crash dump file) is specific to the HBA path from which the system is booted. This creates a difficulty in multipathing solutions, since the crash dump stack does not have multipath drivers available. Using the example given in Figure 2, if the boot path is through HBA A and that adapter fails, the system is no longer able to write the crash dump file, since the crash dump driver cannot use the alternate path through HBA B. However, if the failure is transient, the system administrator might not be aware of the problem.

Clustered Servers
When implementing a Microsoft Cluster Service (MSCS) solution in a SAN boot environment, the MSCS servers must keep the boot LUNs separate from the shared cluster LUNs. Whether or not dedicated HBAs must be used to accomplish this separation depends on whether the loaded Windows driver is SCSIport or Storport.

Vendors can use the Microsoft MPIO driver package to develop effective path management solutions that work with Windows. See the white paper, Highly Available Storage: Multipathing and the Microsoft MPIO Driver Architecture, available at the storage website.


SCSIport Driver

If SCSIport is installed as the HBA driver for the servers, each server requires two HBAs: one to expose the boot disks and one to expose the shared cluster disks. This is because clustering requires that shared cluster resources be connected to a separate bus, which is necessary since bus-level resets are used within the cluster software. A reset on the port attached to the boot LUN has the potential of disrupting paging I/O, and the resulting timeout can result in a system crash. Functionally, this does not prevent using boot from SAN capabilities in a clustered environment, although it does require careful deployment.

Note that this basic cluster deployment, shown in Figure 3, does not provide HBA redundancy. HBA A within server 1 accesses one of the boot LUNs and HBA B accesses the shared cluster LUNs. Server 2 accesses the other boot LUN through HBA C; HBA D also accesses the shared cluster LUNs. The shared cluster design allows for high application availability and service-level failover in case a hardware component fails.

To set up the cluster servers so that each server can correctly locate its boot LUN and the shared cluster LUNs, use either a combination of zoning and masking, or masking alone. Once zoning and/or masking are complete, install or configure the clustering software.

Zoning + Masking. This is a two-step process. First, apply zoning to the ports. In the case where the LUNs are presented on different storage ports, this step separates the shared cluster LUNs from the boot LUNs. Both HBA A and HBA C are zoned to share controller 1 and access boot LUNs only; HBA B and HBA D are zoned to share controller 2 and access the shared cluster LUNs. The second step is to use masking to ensure that each server only has access to the appropriate boot LUN. Since both servers share the cluster LUNs, those LUNs must be unmasked to both nodes of the cluster. If the storage array contains additional shared clusters, they will require zoning and masking to ensure that only the appropriate servers access the appropriate cluster resources.

Masking Only. This method does not employ zoning techniques. While it can be successfully adopted for clustering deployments, correct deployment is difficult unless very high quality masking implementations are used.
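The zoning-plus-masking scheme for Figure 3 can be summarized in a short Python sketch. It encodes the zone and masking assignments described above and checks which LUNs each HBA can reach; the HBA and controller labels follow the text, and the check is a conceptual model rather than how a real fabric or array evaluates access.

    # Sketch of the zoning + masking scheme in Figure 3 (SCSIport, two-node cluster).
    # Labels follow the text; WWPNs are omitted for brevity.

    zones = {
        "boot_zone":    {"HBA_A", "HBA_C", "controller1"},
        "cluster_zone": {"HBA_B", "HBA_D", "controller2"},
    }

    masking = {                              # which initiators may see which LUNs
        "boot_lun_1":   {"HBA_A"},           # server 1 boot disk
        "boot_lun_2":   {"HBA_C"},           # server 2 boot disk
        "cluster_luns": {"HBA_B", "HBA_D"},  # shared cluster disks, both nodes
    }

    lun_port = {                             # which controller presents each LUN
        "boot_lun_1": "controller1",
        "boot_lun_2": "controller1",
        "cluster_luns": "controller2",
    }

    def can_access(hba, lun):
        """An HBA reaches a LUN only if it shares a zone with the presenting
        controller AND the LUN is unmasked to that HBA."""
        zoned = any({hba, lun_port[lun]} <= members for members in zones.values())
        return zoned and hba in masking[lun]

    for hba in ("HBA_A", "HBA_B", "HBA_C", "HBA_D"):
        print(hba, "->", [lun for lun in masking if can_access(hba, lun)])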


Figure 3. Boot from SAN and Clustering, Using the SCSIport Driver

Storport Driver

The most significant limitation to deploying a clustering solution in a SAN boot environment using the SCSIport driver is the HBA slot limit. Since separate HBAs must be used to access boot LUNs and shared cluster LUNs, a fully redundant solution requires four HBAs (or two dual-channel HBAs) in each server. If the server cannot accommodate four HBA cards, or if the cost of obtaining those cards is too great, a high availability multipathing solution is not possible.

The Storport driver overcomes this limitation. With Storport, given its hierarchical reset capabilities, bus-level resets are rare events, eliminating the necessity for multiple HBAs to separate boot and cluster LUNs. The basic configuration (without multipathing redundancy) is shown in Figure 4.

See the Storport white paper, Storport in Windows Server 2003, for further details.


This solution is much less expensive and simpler to configure. Since only a single controller and port is used, all LUNs are visible on the port and zoning cannot be used to completely isolate cluster LUNs from boot LUNs. Masking must be used to ensure that each server has access to the correct boot LUN (and no access to the boot LUN of another server). Both servers will share access to the cluster LUNs. One final step is required to enable this configuration for clustering. (See the Microsoft Knowledge Base article 304415 for details.)

Figure 4. Boot from SAN and Clustering, Using the Storport Driver

Storport also allows miniport-controllable queue management, which allows HBA vendors to build drivers that can survive SAN transients (such as Fibre Channel fabric events) without crashing the system. This is of considerable importance in cluster configurations.


Directly Attached Paging Disk


A pagefile is a reserved portion of the hard disk that is used to expand the amount of virtual memory available to applications. Paging is the process of temporarily swapping the inactive contents of system physical memory out to hard disk until those contents are needed again. Since the operating system must have unrestricted access to the pagefile, the pagefile is commonly placed on the same drive as the system files; thus, the C: drive normally includes boot, system, and paging files.

While there is negligible contention between boot reads and paging writes, there can be considerable resource contention between systems on the SAN when they are all trying to do paging I/O, or when many systems attempt to boot simultaneously from the same storage port. One way to lessen this problem is to separate non-data I/O (such as paging, registry updates, and other boot-related information) from data I/O (created by such sources as SQL or Exchange). The different ways to store the files are shown in Table 1.

Table 1. Possible Locations of the Pagefile

Case 1: SAN (C:) holds the boot, system, and paging files.
Case 2: SAN (C:) holds the boot and system files; a local disk (e.g. D:) holds the pagefile.
Case 3: SAN (e.g. D:) holds the boot files; a local disk (C:) holds the system files and the pagefile.
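As a quick way to confirm which of the layouts in Table 1 a given server is actually using, the configured pagefile locations can be read from the PagingFiles registry value. The following Python sketch assumes it is run on the Windows server itself (the winreg module is Windows-only) and simply lists the configured entries.

    # List where the pagefile(s) currently live on a Windows system, e.g. to
    # confirm a Case 2 or Case 3 layout from Table 1.

    import winreg

    KEY = r"SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management"

    def configured_pagefiles():
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY) as key:
            values, _value_type = winreg.QueryValueEx(key, "PagingFiles")
        # Each entry is typically "<path> <initial size MB> <maximum size MB>".
        return list(values)

    if __name__ == "__main__":
        for entry in configured_pagefiles():
            print(entry)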

iSCSI Boot from SAN


Thus far, this paper has only discussed booting from SAN in a Fibre Channel interconnect environment. Windows also supports boot from SAN using iSCSI interconnects to the SAN, provided iSCSI HBAs are used to enable the boot process. As in Fibre Channel environments, the HBAs must support the INT13 BIOS extensions that enable the boot process. Boot from SAN is not supported using the Microsoft iSCSI software initiator. See the paper Microsoft Support for iSCSI for further details.

The boot files are the files required to run the Windows operating system. The system files are the files required to boot Windows. These files include boot.ini, Ntldr and Ntdetect. The paging file is typically called pagefile.sys.

INT 13 refers to the device service routines (DSRs) that communicate with hard drives (or diskettes) before other system drivers are loaded. The INT13 extensions enable systems to see partitions up to 2 TB, well beyond the 7.8 GB limitation of the original INT13 functionality.


Troubleshooting Boot from SAN


A number of problems can arise during configuration that can result in a failure to load the operating system. It is important to distinguish between those problems that are shared by all types of boot, and those that are specific to boot from SAN environments. Because correct deployment of boot from SAN depends on the user undertaking the exact vendor steps for HBA and SAN configuration, hardware vendors must be the primary point of contact for issues related to booting.

General Boot Problems


The most common cause of boot problems is a failure to locate the boot partition and boot files. This can happen for a multitude of reasons, ranging from boot sector viruses to hardware failure to configuration problems. While failure to locate the boot device can occur in any boot environment (not simply boot from SAN), this issue is more problematic in complex storage configurations where new devices are added and removed frequently.

Variable Device Enumeration Order

As new storage targets (such as disks or LUNs within the storage array) become available on the SAN, the HBA controller assigns each a target ID. Although each device already has a unique WWN assigned by the manufacturer, the Windows operating system requires that devices be numbered, using target IDs, according to the SCSI device convention. The target IDs are assigned to storage devices as they appear in the fabric. When a LUN is created within a target, the fabric does not register its presence as an event; instead, the HBA is responsible for notifying Plug and Play (PnP) of its presence, or a manual rescan of disks is required (using either diskpart or the Disk Management snap-in).

Some Fibre Channel arbitrated loop (FC-AL) configurations do not work well with boot from SAN. A single server on one loop, accessing a disk on a SAN with a single port, works well. This is because, with no other servers or port targets, the target ID is 0, the desired state. If a second device (such as another disk) is added, it is given target ID 1, which also works effectively. However, with a power off/on sequence (or with reinitialization of the loop following a fabric event), the devices might not be enumerated in the same order. One unintended consequence of such a change is that the boot device may not be addressed as expected, and the operating system cannot load. Although this problem can be circumvented by using HBA persistent binding (which prevents the SCSI target ID from changing even after a system reboot), this FC-AL solution is not supported by Microsoft.

Multiple Adapter Complexity with PnP

Although adapter devices can be successfully hot plugged and manually enumerated so that the attached systems behave as expected, when the power cycles off and on, the devices can be re-enumerated, possibly causing all the HBA port addresses to change. The fact that different system vendors assign their PCI slots differently can introduce configuration problems.
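The enumeration problem is easy to see in a toy model. The following Python sketch assigns target IDs in discovery order, shows how a loop re-initialization can move the boot disk away from the target ID recorded at install time, and shows how a persistent WWN-to-target-ID binding would avoid the mismatch. The WWNs are placeholders, and the sketch illustrates the mechanism only; as noted above, the FC-AL persistent-binding approach is not supported by Microsoft.

    # Toy illustration of why FC-AL re-enumeration can break booting, and how
    # persistent binding pins a WWN to a target ID. WWNs are placeholders.

    BOOT_WWN = "21:00:00:20:37:65:aa:01"      # the disk the HBA BIOS boots from
    BOOT_TARGET_ID = 0                        # target ID recorded at install time

    def assign_target_ids(discovery_order):
        """Assign SCSI target IDs in the order devices appear on the loop."""
        return {wwn: tid for tid, wwn in enumerate(discovery_order)}

    def assign_with_persistent_binding(discovery_order, bindings):
        """Honour a WWN -> target ID binding table regardless of discovery order."""
        return {wwn: bindings[wwn] for wwn in discovery_order}

    before = [BOOT_WWN, "21:00:00:20:37:65:aa:02"]    # first power-on
    after  = ["21:00:00:20:37:65:aa:02", BOOT_WWN]    # after a loop re-initialization

    print(assign_target_ids(before)[BOOT_WWN] == BOOT_TARGET_ID)   # True: boot succeeds
    print(assign_target_ids(after)[BOOT_WWN] == BOOT_TARGET_ID)    # False: boot device moved

    bindings = {BOOT_WWN: 0, "21:00:00:20:37:65:aa:02": 1}
    print(assign_with_persistent_binding(after, bindings)[BOOT_WWN] == BOOT_TARGET_ID)  # True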


Potential Difficulties with SAN Boot


Boot from SAN introduces a number of specific challenges that the administrator must be aware of to ensure that the solution works as intended.

Lack of Standardized Assignment of LUN 0 to the Controller

Some vendors' storage arrays automatically assign logical unit numbers (LUNs); others require that the storage administrator explicitly define the numbers. With parallel SCSI, the boot LUN is LUN 0 by default. Fibre Channel configurations must adhere to SCSI-3 storage standards. In correctly configured arrays, LUN 0 is assigned to the controller (not to a disk device), and is accessible to all servers. This LUN 0 assignment is part of the SCSI-3 standard, and many operating systems do not boot unless the controller is correctly assigned as LUN 0. Correctly assigning LUN 0 to the controller allows it to assume the critical role of discovering and reporting a list of all other LUNs available through that adapter. In Windows, these LUNs are reported back to the kernel in response to the SCSI REPORT LUNS command.

Unfortunately, not all vendor storage arrays comply with the standard of assigning LUN 0 to the controller. Failure to comply with that standard means the boot process may not proceed correctly. In some cases, even with LUN 0 correctly assigned, the boot LUN cannot be found, and the operating system fails to load. In the following cases (without HBA LUN remapping), the kernel finds LUN 0, but may not be successful in enumerating the remaining LUNs correctly.

Without HBA LUN Remapping

1. The kernel finds LUN 0 (the controller) and sends it a REPORT LUNS command.

2. The storage array controller either:

a) Correctly interprets REPORT LUNS, and returns a LUN list for each HBA (HBA A: LUN 1, LUN 3; HBA B: LUN 2). Each server can boot from its assigned boot LUN.

b) Does NOT correctly interpret this command (most likely because SCSI-3 standards were not followed), so a LUN list is not produced for each HBA. The Windows kernel attempts further LUN discovery using a sequential discovery algorithm, which starts searching at LUN 0 and increments sequentially (that is, the next LUN probed is LUN 1). LUN 1 is not found (because it is masked), and no further discovery attempts are made, since the algorithm essentially returns no more LUNs. LUNs 2 and 3 are NOT found; neither server 1 nor server 2 can boot.
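The two discovery outcomes described in cases (a) and (b) can be sketched as follows. The LUN visibility sets are illustrative and follow the running example, and the final lines preview the HBA LUN remapping fix described in the next topic.

    # Sketch of the two discovery behaviours described above. LUN numbers follow
    # the running example: LUN 1 and LUN 3 unmasked to HBA A, LUN 2 to HBA B.

    def discover_with_report_luns(lun_list_from_controller):
        """Case (a): the controller answers REPORT LUNS correctly and simply
        returns the list of LUNs unmasked to this initiator."""
        return sorted(lun_list_from_controller)

    def discover_sequentially(visible_luns):
        """Case (b): REPORT LUNS is not interpreted correctly, so the kernel falls
        back to probing LUN 0, 1, 2, ... and stops at the first LUN that does not
        respond. A masked LUN therefore hides every higher-numbered LUN as well."""
        found, lun = [], 0
        while lun in visible_luns:
            found.append(lun)
            lun += 1
        return found

    visible_to_hba_a = {0, 1, 3}    # LUN 0 = controller; LUN 2 masked from HBA A
    visible_to_hba_b = {0, 2}       # LUN 1 and LUN 3 masked from HBA B

    print(discover_with_report_luns({1, 3}))         # [1, 3] -> boot and data LUNs found
    print(discover_sequentially(visible_to_hba_a))   # [0, 1] -> LUN 3 never seen
    print(discover_sequentially(visible_to_hba_b))   # [0]    -> boot LUN 2 never seen

    # With HBA LUN remapping (next topic) the HBA presents the boot LUN to the
    # host as LUN 0, so even the sequential fallback finds it immediately:
    remapped_view_hba_b = {0}                        # array LUN 2 presented as LUN 0
    print(discover_sequentially(remapped_view_hba_b))   # [0]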

These problems can be solved by implementing HBA-based LUN mapping, described in the next topic. HBA mapping must be available on the HBA controller.


With HBA LUN Mapping

HBA LUN mapping can correct the problems that arise when LUN 0 is not assigned to the storage controller. Suppose the boot disk for server 1 is LUN 1. The HBA can remap this disk, changing its identity from LUN 1 to LUN 0, which makes it a bootable disk. Hence, in the prior example, for HBA A, LUN 1 is remapped to LUN 0, and for HBA B, LUN 2 is also remapped to LUN 0. (Because each LUN is masked from the other server, the fact that the numbers are the same does not matter.)

HBA Configuration is Complex

The setup routine of each HBA boot ROM must be individually configured for boot from SAN solutions to work. For each adapter, the correct disks must be allocated, and the boot disk or LUN must be correctly identified.

Too Many Systems Booting from the Array

The number of servers that can reliably boot from a single fabric connection is limited. If too many servers send I/Os at the same time, the link can become saturated, delaying the boot for the server that is attempting to boot. If this condition persists for too long, the requests will time out and the system can crash. The actual number of systems that can boot from a single fabric connection is vendor specific.

Current Limitations to Windows Boot from SAN


There are a number of advanced scenarios that are not currently possible in Windows boot from SAN environments.

No Shared Boot Images

Windows servers cannot currently share a boot image; each server requires its own dedicated LUN to boot.

Mass Deployment of Boot Images Requires ADS

Windows does not currently support en masse distribution of boot images. While cloning of boot images could help here, Windows does not have the tools for distribution of these images. In enterprise configurations, however, Automated Deployment Services (ADS) can help.

Switch and storage controllers both have limitations.


Summary
This paper introduces boot from SAN technology in the Windows environment. Boot from SAN simplifies the adoption of diskless server technologies, and simplifies storage management by facilitating a centralized approach to operating system installation and booting. This paper describes a number of boot from SAN scenarios supported in the Windows environment, including multipathing and clustering, and offers critical troubleshooting information to help ensure successful deployment. An appendix describing the boot process is included to aid understanding of the SAN boot process.


Additional Resources
The following white papers are all available through the Microsoft Windows storage portal (http://go.microsoft.com/fwlink/?LinkId=18974):

Microsoft Support for iSCSI, August 2003.

Highly Available Storage: Multipathing and the Microsoft MPIO Driver Architecture, October 2003.

Storport in Windows Server 2003: Improving Manageability and Performance in Hardware RAID and Storage Area Networks, December 2003.

See also the following Knowledge Base articles:

Support for Booting from a Storage Area Network (SAN) (http://go.microsoft.com/fwlink/?LinkId=22265)

Support for Multiple Clusters Attached to the Same SAN Device (http://go.microsoft.com/fwlink/?LinkId=22266)


Appendix
The Boot Process: Details
The boot (or bootstrapping) process starts with the execution of the shortest and simplest code necessary to begin the boot process, and successively accesses and executes more complex code. The following sections outline the high-level details of the boot process for the 32-bit (x86) architecture (differences with the Intel IA-64 architecture are discussed at the end of this appendix). The boot process is the same whether the boot occurs from a direct-attached disk or from a disk on a SAN.

Pre-boot
The pre-boot process is common to all operating systems. During this stage of the process, the following steps occur:

POST. The system BIOS (stored on read-only memory chips) performs a power-on self test to ensure that there are no problems with the hardware, such as voltage irregularities or hard disk failure. If the hardware is working correctly, the CPU can begin operations.

The BIOS locates and initializes all bootable devices. The BIOS first locates all add-in devices (such as network interface cards and host bus adapters), as well as the local system hard and floppy drives, then it determines which devices are bootable.

The BIOS sets the boot device. Although multiple devices are potentially able to supply the boot files (including multiple hard drives if the BIOS provides multi-boot support), the boot device actually used is either the first bootable device found (the default), or is set by the user in the case of multi-boot capable systems. The BIOS gives this device the drive address 80h, making it the boot drive. (Note that, in configurations with multiple adapters, the order in which the devices are enumerated becomes critical to this determination, because the BIOS assigns, by default, the first bootable drive it finds as the boot device.)

Load boot sector. Having assigned the device from which the system will boot, the system then searches that device for the boot sector. All x86 systems require that the first sector of the primary disk contain the Master Boot Record (MBR). The MBR's partition table identifies the system partition, which contains the code and configuration files (Ntldr, boot.ini, and Ntdetect.com) necessary to boot Windows. This partition must be set as active (bootable) in order to proceed. Once the boot sector is loaded into memory, it can execute the next steps of the boot process.
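For readers who want to see these pre-boot structures directly, the following Python sketch reads the first sector of a disk image and checks the items the BIOS relies on: the 0x55AA boot signature and the four partition table entries, including the active (bootable) flag. The image path is a placeholder; reading a live physical disk requires appropriate privileges.

    # Inspect the first sector of a disk image for the MBR boot signature and
    # the four partition table entries used during pre-boot.

    import struct

    DISK = "disk.img"    # illustrative path; substitute a real image or raw device

    def read_mbr(path):
        with open(path, "rb") as f:
            sector0 = f.read(512)
        if sector0[510:512] != b"\x55\xaa":
            raise ValueError("no MBR boot signature; the BIOS would not boot this disk")
        partitions = []
        for i in range(4):                   # four 16-byte entries start at offset 446
            entry = sector0[446 + i * 16: 446 + (i + 1) * 16]
            boot_flag, part_type = entry[0], entry[4]
            start_lba, sector_count = struct.unpack("<II", entry[8:16])
            partitions.append({"active": boot_flag == 0x80,
                               "type": hex(part_type),
                               "start_lba": start_lba,
                               "sectors": sector_count})
        return partitions

    if __name__ == "__main__":
        for partition in read_mbr(DISK):
            print(partition)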

Boot Sequence
The boot sequence described in this section is Windows specific. The file Ntldr controls much of this process; once control is passed to Ntoskrnl.exe, the boot process is nearly complete.

1. Initial boot loader phase. The boot sector loads the Ntldr file, which begins loading the operating system in a series of phases, the first of which is the initial boot loader phase. During this phase, the Ntldr code enables the system to access all physical memory (protected mode). Prior to this, only the first 1 MB of system memory was available (real mode). At this point Ntldr also enables paging, which is the normal mode of Windows operation.


2. Selection of the operating system. In the next phase, Ntldr loads the boot.ini file, which tells Ntldr where the operating system kernel, registry, and device drivers reside. The boot.ini file locates the boot files using either an ARC (Advanced RISC Computing) path or a disk signature.

ARC Path. The ARC path, used to locate system files, may need to be modified if the system is shut down and more hardware is added. The format of the ARC path is either:

    multi(n)disk(n)rdisk(n)partition(n)\systemroot

or

    scsi(n)disk(n)rdisk(n)partition(n)\systemroot

where:

multi(n): indicates a multifunction adapter (such as IDE) or a SCSI adapter (such as an HBA) with an onboard BIOS; n is the ordinal of the adapter.

scsi(n): indicates a legacy SCSI adapter with no onboard BIOS; n is the ordinal of the adapter.

disk(n): on scsi() paths, indicates the target address of the disk on the controller (0-7); on multi() paths, this value is always 0.

rdisk(n): on multi() paths, indicates the ordinal of the disk on the adapter (the boot disk is normally rdisk(0)); on scsi() paths, indicates the logical unit number (LUN) of the disk.

partition(n): indicates the partition upon which the boot information resides.

By default, the system root (\systemroot) in Windows 2000 and later is \Windows; for Windows NT 4.0 it is \WINNT. (An example boot.ini using ARC paths appears after step 6, below.)

3. Disk signature. Rather than using the ARC path to locate the disk on which the boot files reside, the disk signature, a unique 32-bit number, can be used to identify each disk. The format for signatures is signature(xxxxxxxx), where xxxxxxxx is the disk signature written as eight hexadecimal digits.

4. Hardware detection. Ntldr loads Ntdetect.com, which uses the system BIOS to query the computer for additional information, such as the machine ID, bus/adapter type, the number and sizes of disk drives, and ports. This information is later recorded in the registry.

5. Kernel initialization. Ntldr loads the files from the boot partition necessary for kernel initialization. Included among these are the kernel (typically Ntoskrnl.exe), the Hardware Abstraction Layer (typically Hal.dll), file system drivers, and any device drivers necessary to boot the system. Control then passes to Ntoskrnl.exe, which must also successfully locate the boot disk to update the registry with driver changes.

6. User mode. Once these phases are complete and the remaining system driver files are loaded, the Session Manager Subsystem (SMSS) is loaded and, in turn, loads the files necessary to create the user mode interface. If the boot has been successful, the user can log in.
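As an illustration of the boot.ini structures described in steps 2 and 3, the following Python sketch embeds a sample boot.ini and pulls the components out of each ARC path. The sample values (adapter, disk, and partition numbers, and the /fastdetect switch) are illustrative, not a recommendation for any particular system.

    # Compose and parse the ARC paths used in boot.ini. Sample values are illustrative.

    import re

    SAMPLE_BOOT_INI = """\
    [boot loader]
    timeout=30
    default=multi(0)disk(0)rdisk(0)partition(1)\\WINDOWS
    [operating systems]
    multi(0)disk(0)rdisk(0)partition(1)\\WINDOWS="Windows Server 2003" /fastdetect
    """

    ARC_RE = re.compile(
        r"(?P<adapter>multi|scsi|signature)\((?P<a>[0-9a-fA-F]+)\)"
        r"disk\((?P<disk>\d+)\)rdisk\((?P<rdisk>\d+)\)partition\((?P<part>\d+)\)"
        r"(?P<root>\\[\w\\]*)")

    def parse_arc_paths(boot_ini_text):
        """Return the components of every ARC path found in a boot.ini body."""
        return [m.groupdict() for m in ARC_RE.finditer(boot_ini_text)]

    for path in parse_arc_paths(SAMPLE_BOOT_INI):
        print(path)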


Intel IA-64 Architecture Differences


Windows Server 2003 on IA-64 uses the Extensible Firmware Interface (EFI) in place of a traditional BIOS and supports 64-bit addressing, enabling hardware with 64-bit capabilities to realize full performance improvements. Bootable HBAs must be tested with IA-64 hardware. In certain configurations (see the scenarios section), the Storport driver is the recommended HBA driver.

In contrast to the IA-32 design, the EFI firmware can boot from any device, and the drive address does not use the INT13 mechanisms previously described. Drives are partitioned with GUID Partition Tables (GPT). An EFI loader, rather than Ntldr, is used to load the information gathered in steps 4-5, and hardware detection is accomplished by the firmware rather than by the software-based Ntdetect. Hardware detection is actually easier, since a device path (WWN and real eight-bit LUN numbers) and disk signatures are used, ensuring that the solution works correctly.
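As a companion to the MBR sketch in the Pre-boot section, the following Python sketch checks a disk image for the two markers of a GPT-partitioned (EFI-style) disk: a protective MBR partition of type 0xEE and the "EFI PART" signature at the start of the second sector. The image path and the 512-byte sector size are assumptions.

    # Check whether a disk image carries a GPT layout (protective MBR + GPT header).

    def looks_like_gpt(path, sector_size=512):
        with open(path, "rb") as f:
            mbr = f.read(sector_size)          # LBA 0: protective MBR
            gpt_header = f.read(sector_size)   # LBA 1: GPT header
        # Partition type 0xEE in any of the four MBR entries marks a protective MBR.
        protective = any(mbr[446 + i * 16 + 4] == 0xEE for i in range(4))
        return protective and gpt_header[:8] == b"EFI PART"

    if __name__ == "__main__":
        print(looks_like_gpt("disk.img"))       # illustrative image path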

Windows Server System is the comprehensive, integrated server software that simplifies the development, deployment, and operation of agile business solutions. www.microsoft.com/windowsserversystem

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication. This white paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in, or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation. Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property. 2001 Microsoft Corporation. All rights reserved. Microsoft, Windows 2000 Server, and Windows Server 2003 are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

