Network Balance - 2
to look into building a Web Farm that uses multiple machines on the network acting as a single server. In this article Rick looks at the Windows Load Balancing Service and the new interface it sports in Windows Server 2003, which makes creating a Web Farm quick, easy and (gasp) even affordable.

With the release of Windows Server 2003, Network Load Balancing has become a much more visible part of the operating system, with a very usable and relatively easy to configure interface for building a Web Farm. The Network Load Balancing Service has been around in one incarnation or another since Windows NT SP4, but Windows Server 2003 is the first operating system that brings this service to the forefront as a main component of the OS. A new Network Load Balancing Manager application is now directly available from the Administrative Tools menu, and it's powerful enough to let you configure the entire cluster from a single console. The service is now available in all products in the Windows Server family, including the lower end Web Edition, which means that you now have a much more affordable way to create Web Farms at your disposal. Just add servers, please. In this article I'll review the basics of a load balancing service and then show you how to set up and configure a basic installation using two machines.
Because a Web Farm is made up of essentially identically configured servers, a failure on a single server will not bring down the entire Web site. Other servers in the pool can continue to process requests and pick up the slack. For many companies this feature of load balancing is important for peace of mind: it avoids a single point of failure on the Web server, and it provides an in-place mechanism to grow the application should the need arise later.
A network load balancing cluster routes requests made to a single virtual IP to the available servers in the cluster. Note that each machine is self-sufficient and runs independently of the others, duplicating all of the resources on each server. The database sits on one or more separate boxes accessible by all servers.

Although a Web Farm is a common scenario for this service, keep in mind that any IP based service can be run off it. For example, a mail server that is under heavy load and uses a central datastore could be spread across multiple machines in a cluster.

Network Load Balancing facilitates the process of creating a Web Server Farm. A Web Server Farm is a redundant cluster of several Web servers serving a single IP address. The most common scenario is that each of the servers is identically configured, running the Web server and whatever local Web applications it hosts, as shown in Figure 1. Each machine has its own copy of everything it needs to run the Web application: the HTML files, any script pages (ASP, ASP.Net), any binary files (such as compiled .Net assemblies, COM objects or DLLs loaded from the Web app) and any support files such as configuration and local data files (if any). In short, the application should be fully self-contained on a single machine, except for the data, which is shared in a central location. Data typically resides in a SQL backend of some sort somewhere on the network, but could also be files in a shared directory for a file based database engine such as Visual FoxPro or Access.

Each server in the cluster is fully self-contained, which means it should be able to function without any other server in the cluster, with the exception of the database (which is not part of the NLB cluster). This means each server must be configured separately and must run the Web server as well as any Web server applications.
If you're running a static site, all HTML files and images must be replicated across servers. If you're using ASP or ASP.Net, those pages and all associated binaries and support files must also be replicated. Source control programs like Visual SourceSafe can make this process relatively painless by allowing you to deploy updated files of a project (in Visual Studio.Net or FrontPage for example) to multiple locations simultaneously.
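The replication step described above can also be scripted. The following is only a minimal sketch, assuming each node exposes its Web root as a directory (for example a network share) that the deployment machine can write to. The function name `replicate` and the directory layout are hypothetical and not part of any Microsoft tool.

```python
import shutil
from pathlib import Path


def replicate(source: Path, targets: list[Path]) -> dict[Path, int]:
    """Copy every file under `source` into each target directory.

    Returns the number of files copied per target. `source` and `targets`
    are hypothetical paths, e.g. a staging folder and one share per node.
    """
    copied = {}
    for target in targets:
        count = 0
        for src_file in source.rglob("*"):
            if not src_file.is_file():
                continue
            # Rebuild the same relative path below the target root.
            dest = target / src_file.relative_to(source)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, dest)  # copy2 also preserves timestamps
            count += 1
        copied[target] = count
    return copied
```

Calling `replicate(Path("staging"), [node1_root, node2_root])` would push the same set of files to both nodes. A sketch like this only covers plain file copies; it does not register COM components or restart the Web server.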
Short of the data, everything else runs on all of the machines in the NLB cluster. The key is redundancy in addition to load balancing: if any machine in the cluster goes down, NLB will re-balance the incoming requests to the servers that are still running. The servers in the cluster need to be able to communicate with each other to exchange information about their current processor and network load, and to perform even more basic checks such as whether a server went down.

If you have COM components as part of your Web application, things get more complicated, since the COM objects must be installed and configured on each of the servers. This isn't as simple as copying the file: it also requires re-registering the components, plus potentially moving any additional support files (DLLs, configuration files if needed, non-SQL data files etc.). In addition, if you're using In-Process components you'll have to shut down the Web server to unload the components. You'll likely want to set up some scripts or batch files to perform these tasks in an automated fashion, pulling update files from a central deployment server. You can use the Windows Scripting Host (.vbs or .js files) along with the IIS Admin objects to automate much of this process. This is often tricky and can be a major job, especially if you have a large number of cluster nodes and updates are frequent. Strict operational rules are often required to make this process reliable. Luckily, if you're building applications with pure ASP.Net you won't have these issues, since ASP.Net can update .Net binary files without any shutdowns by detecting changes to the source files and shadow copying binary files to a different directory for execution.
Efficiency
Network Load Balancing is very efficient and can provide reasonably close to a 1:1 performance improvement for each machine added to the cluster. There is some overhead involved, but I didn't notice much in my performance tests with the VS.Net Application Center Test tool: each machine added 90-95% of its standalone performance to the cluster, even in the non-optimized network setup I used to conduct the tests. You may notice that with this level of redundancy, increasing your load balancing capability becomes simply a matter of adding machines to the cluster, which gives you practically unlimited application scalability (database allowing) if you need it.
Setting up NLB
In order to use the Windows Server Network Load Balancing features you will need two machines running Windows Server 2003. Each machine needs to have at least one network card and at least one fixed IP address. Although running with one adapter works well, for best performance it's recommended that you have two adapters in each machine: one mapped to the real IP address (Microsoft calls this the Dedicated IP) and one mapped to the virtual IP address (Microsoft calls this the Cluster IP). Be aware that NLB uses some advanced networking features of network adapters, so it's possible that some low end adapters (especially those for non-server machines) may not support the required NDIS protocols.

In addition you will need one more machine for testing (3 machines total). The test machine should be external, as you can't use a machine from the pool to test: when you hit the virtual IP address from a cluster node, the requests don't travel over the network, so they are only ever handled by the local machine.

I'm going to use two servers here to demonstrate how to set up and run NLB. Assume the IP addresses for these machines are 111.111.111.1 and 111.111.111.2. To create a virtual IP address (Cluster IP) you need to pick an available IP address on the same Class C network segment. In my example here I'll use 111.111.111.10.

Unlike previous versions of NLB, the new version has a central manager application that you can use to create a cluster from a single machine. Gone are the hassles of having to manually configure each machine: you can do it all from a single machine over the network, which is a welcome change. To start setting up this cluster, bring up the Network Load Balancing Manager from the Administrative Tools menu. Figure 1 shows what the cluster manager looks like.
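As a quick sanity check on the addressing rules above, the following sketch verifies that a candidate Cluster IP is not already used as a Dedicated IP and sits on the same network segment as every node. It assumes a Class C segment means a /24 prefix; the function name `same_subnet` is made up for this illustration.

```python
import ipaddress


def same_subnet(cluster_ip: str, dedicated_ips: list[str], prefix_len: int = 24) -> bool:
    """True if the cluster IP is free and shares a subnet with every dedicated IP.

    prefix_len=24 is an assumption standing in for a Class C network segment.
    """
    if cluster_ip in dedicated_ips:
        return False  # the virtual IP must not collide with a node's own address
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{cluster_ip}/{prefix_len}", strict=False)
    return all(ipaddress.ip_address(ip) in network for ip in dedicated_ips)
```

With the example addresses from this article, `same_subnet("111.111.111.10", ["111.111.111.1", "111.111.111.2"])` evaluates to `True`. The check cannot tell whether some other device on the network already uses the address, so you still need to confirm that the IP is actually available.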
Figure 1 To set up a new NLB cluster, bring up the Network Load Balancing Manager and right-click to create a new cluster.

Right-click on the root node to add a new cluster. Next configure the basic cluster configuration, which consists of assigning the Cluster or virtual IP address. Figure 2 shows what this dialog looks like filled out for our test network.
Figure 2 Configuring the Cluster IP. This is the virtual IP address that will service all servers in the cluster. Note that you should set the operation mode to Multicast if you are using a single adapter.

The IP Address is the virtual IP address that will be used to address this cluster. NLB will actually create a new IP address on each machine in the cluster and bind it to the specified network adapter (in the next step). Choose a subnet mask and make sure you use the same one for all servers in the cluster. The Full Internet name is only for reference and is used here primarily for displaying the name of the server, but if you have a domain configured for the server you should use that domain name.

Cluster operation mode is very important. Unicast mode means that NLB takes over the network card it is bound to and doesn't allow any additional network traffic through it. This is the reason why two adapters are a good idea: one that NLB can take over and one that can still handle all other network traffic directed at the dedicated IP address of the server. If you're using a single adapter you should probably select Multicast, which allows both the NLB traffic and the native IP traffic to move through the same network adapter. Multicast is slower than Unicast, as both kinds of traffic need to be handled by the same network adapter, but with a single adapter it's the only way to remotely configure all machines centrally. You can run a single adapter in Unicast mode, but the cluster manager will not be able to communicate with the server after it's configured. As a general rule, use Unicast for two adapters and Multicast for a single adapter. With my network cards I had to use IGMP mode in order to get the cards to converge properly. You may have to experiment with both modes to see what works best for you.
Leave the Allow Remote Control option unchecked. This option allows you to reconfigure the nodes and port rules remotely, although I found little need to do so. Any changes made to the cluster are automatically propagated down to the nodes anyway, so there's little need for it, with the exception of changing the processing priority. If you do want this functionality, I suggest you enable it after you have the cluster up and running.

The next dialog, called Cluster IP Addresses, allows you to add additional virtual IP addresses. This might be useful if you have a Web server that is hosting multiple Web sites, each of which is tied to a specific IP address. For our example here we don't need any and can just click Next, as shown in Figure 3.
Figure 3 If you need to add additional IP addresses to be load balanced you can add them here. This is needed only if you host multiple sites on separate IP addresses and you need separate IPs for these.

Next we need to configure port rules. Port rules determine which TCP/IP port is handled and how. Figure 4 shows the Port Rules dialog with two port rules defined, for port 80 (HTTP) and port 443 (SSL). The default port configuration set up by NLB handles all ports, but in this case that rule is too broad. Port rules can't overlap, so if you create specific rules you either have to create them for each port specifically or create ranges that fit your specific ports.
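To make the no-overlap constraint concrete, here is a small sketch that checks a planned set of port rules before you enter them. It models each rule as an inclusive (start, end) port range, which is an assumption made for illustration; `find_overlaps` is a hypothetical helper, not something NLB provides.

```python
def find_overlaps(rules: list[tuple[int, int]]) -> list[tuple[tuple[int, int], tuple[int, int]]]:
    """Return every pair of inclusive (start, end) port ranges that overlap."""
    ordered = sorted(rules)
    overlaps = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second[0] > first[1]:
                break  # sorted by start port, so no later range can overlap `first`
            overlaps.append((first, second))
    return overlaps
```

Two single-port rules such as `[(80, 80), (443, 443)]` produce an empty list, while keeping the catch-all rule next to a specific one, as in `[(0, 65535), (80, 80)]`, is reported as a conflict.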
Figure 4 The Port Rules dialog shows all of the port rules defined for the cluster. By default a rule for all ports 0-65535 is defined. Here I've created two specific port rules for ports 80 and 443.

To add a new port rule click on the Add button, which brings up the dialog shown in Figure 5. Here you can configure how the specific port is handled. The key property is the Filtering Mode, which determines the affinity of requests. Affinity refers to how requests are routed to a specific server. None means any server can service the incoming request. Single means that a specific server has to handle every request from a given IP address. Generally None is the preferred mode, as it scales better in stateless applications. There's less overhead in NLB, as in many cases it doesn't have to route requests to a specific server. Single mode is useful for server connections that do require state, such as SSL connections for HTTPS. Secure server certificates perform much better with a persistent connection than with new connections having to be created on each of the servers in the pool. Figure 5 shows the configuration for the standard Web server port, port 80.
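The difference between the affinity settings is easier to see in code. The sketch below is purely illustrative: it hashes a different part of the client address depending on the affinity mode. It is not the algorithm NLB actually uses, and the function name, the mode strings and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib


def pick_node(client_ip: str, client_port: int, affinity: str, node_count: int) -> int:
    """Map a request to a 0-based node index under a given affinity mode.

    Illustration only: this is NOT the hashing NLB really performs.
    """
    if affinity == "none":
        key = f"{client_ip}:{client_port}"        # each connection may land anywhere
    elif affinity == "single":
        key = client_ip                            # same client IP -> same node
    elif affinity == "class_c":
        key = ".".join(client_ip.split(".")[:3])   # same Class C range -> same node
    else:
        raise ValueError(f"unknown affinity: {affinity}")
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % node_count
```

With `"single"`, a client keeps hitting the same node no matter which source port it connects from, which is the behavior you want for SSL. With `"none"`, the source port is part of the key, so consecutive connections from one client can be spread across nodes.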
Figure 5 Setting port rules lets you configure how the cluster responds to client requests. Affinity in particular determines whether the same server must handle all requests from a specific IP address (Single) or Class C IP address range (Class C).

To set up the second rule for the SSL port, I added another rule, changed the port to 443 and changed the affinity to Single. Although you can't do it from here, another important setting is the priority, or load weight, of each machine for each port rule. You can set up Machine 1 to take 80% of the traffic and the second machine 20%, for example. Each rule can be individually configured. We'll see a little later why this is important for our SSL scenario.

The rules set in this dialog are propagated to all the cluster servers, which is significant because the port rules must be configured identically on each of the cluster nodes. The configuration tool manages this by remotely pushing the settings to each cluster node's Network Connections IP configuration settings. This is a big improvement over previous versions, where you had to manually make sure each machine's port rules matched and stayed matching.

Up to this point we have configured the cluster and the common parameters for each node. Now we need to add individual nodes to the cluster. Figure 6 shows the dialog that handles this step for the first node as part of the configuration process.
Figure 6 Adding a node by selecting the IP address and picking a specific network adapter.

When you click Next you get to another dialog that lets you configure the cluster node. The main feature to configure on this dialog is the Priority, which is a unique ID that identifies each node in the cluster. Each node must have a unique ID, and the lower the number the higher the priority. Node 1 is the master, which means that it typically receives requests and acts as the routing manager, although when load is high other machines will take over.
Figure 7 Setting the node parameters involves setting a priority for the machine, which is a unique ID you select. The lower the number the higher the priority; the machine with the lowest number acts as the master host.

Click Finish and now we have one node in our cluster. Actually, not quite so fast. Be patient, this process isn't instant. When you click Finish the NLB Manager actually goes out and configures your network adapter for you. It creates a new IP address in your network connections, enables the Network Load Balancing service on the network adapter(s) you chose during setup, and configures the settings we assigned on the NLB property sheet. You'll see your network connection flash on and off a few times during this configuration process on the machine you are configuring to be a host. This is normal, but be patient until you see your network connection back up and running.

If all goes well you should see a new node in the NLB Manager sitting below the cluster (see Figure 8, which shows both nodes). If everything is OK the Status should say Converged. If it does, node 1 is ready.

But we're not quite done yet: we still need to add the second node. To do so right-click on the cluster, after which you go through the steps shown in Figures 6 and 7 one more time. Again be patient, this process is not super fast. It takes about 20 seconds or so to get a response back from a remote machine. Once you click Finish the process of converging can take a minute or more.
Figure 8 The final cluster with both nodes converged and ready to process requests.
Troubleshooting Tips
I've had a few problems getting convergence to happen for the first time. It helps to follow the steps here closely from start to finish, and if for whatever reason you end up removing nodes, make sure you double check your network settings before re-adding them. You can check what NLB did in the Network Connections for your machine (Figure 9). Click on the Load Balancing section to see the settings made there. Remember that the settings should match between machines, with the exception of the IP addresses assigned to each machine. You should also see the new IP address added on the Advanced page of the Internet Protocol settings.
Figure 9 All of the settings that NLB makes are made to the network adapter that the virtual IP is bound to. You can click on the Network Load Balancing item to configure the node settings as described earlier. The virtual IP also has been added in the Internet Protocol | Advanced dialog.

If things look OK, make sure that the machines can ping each other with their dedicated IPs. Figure 10 shows what you should see for one of the machines, and you should run this test on both of them:
Figure 10 Checking whether the machines can see each other. Use IPCONFIG to see adapter information; you should see both your physical adapter and the virtual IP configured.

Make sure that you don't get any errors that say there's a network IP address conflict. If you do, it means that the virtual IP is not virtual, i.e. it's entered but it's not bound to the NLB service. In that case remove the IP, configure NLB first, and then re-add the IP address. Alternately, remove everything and try adding it one more time through the NLB Manager.

I've also found that it helps to configure remote machines first, then configure the machine running the NLB Manager (if you are using it in the cluster) last. This avoids network issues on the manager machine, since plain network access gets a little weird once you have NLB configured on a machine. Again, this is a great reason to use two adapters rather than one.
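If you want to script the reachability check instead of running ping by hand, something like the following sketch may help. Note that it is not an ICMP ping: it simply attempts a TCP connection, by default to port 80, on the assumption that the Web server on each node is already listening there. The function name `can_reach` and the timeout value are choices made for this example.

```python
import socket


def can_reach(host: str, port: int = 80, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within `timeout` seconds.

    This is a TCP connect test, not an ICMP ping.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Running `can_reach("111.111.111.2")` from the first node, and `can_reach("111.111.111.1")` from the second, mirrors the ping test described above using the dedicated IPs of the example network.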
Figure 11 Using Application Center Test to stress test a simple page. The result here is from the combined machines, which were running at around 275 rps. Machines 1 and 2 individually were running at 136 and 158 rps respectively. The script hits only the ASPX page; no images or other static content was hit.

I tested each of the machines individually, changing the IP addresses to their dedicated IPs in the ACT script first, and then together by changing the script to use the virtual IP. The results for this short 5 minute test are as follows:

Web Store Single Read Page Test

Test Mode                                     Requests per second
Office Server 111.111.111.2                   162
Laptop 111.111.111.1                          141
Both of them, Load Balanced 111.111.111.10    276
This is a ratio of 91% for the load balanced cluster vs. the sum of the individual machines, which is excellent given that we are running with a single adapter here. The second test is a bit more realistic in that it runs through the entire Web Store application site and uses a shared SQL Server on a third machine.

Web Store Full Order Test

Test Mode                                     Requests per second
Office Server 111.111.111.2                   91
Laptop 111.111.111.1                          85
Both of them, Load Balanced 111.111.111.10    135

Here the ratio is a bit worse, 77%, but the reason for this drop-off has little to do with load balancing and more to do with limits being hit on the SQL Server. Looking at the lock count with Performance Monitor reveals that the site is hitting the SQL box pretty heavily, and the locking thresholds are causing requests to slow down significantly. This application is not heavily SQL optimized, and performance could be improved to make these numbers higher for both the individual and the combined tests. However, this test shows that while load balancing can help the performance of an app, there may still be other limits that slow down the application as a whole. In short, beware of load issues beyond the Web front ends that can bite you in terms of performance. Still, even in this test, where an external limit was being approached, we got a significant gain from using load balancing.
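The ratios quoted above are simply the load balanced throughput divided by the sum of the standalone throughputs. A short sketch of that arithmetic, using the figures from the two tables (the function name is made up for this example):

```python
def scaling_efficiency(combined_rps: float, standalone_rps: list[float]) -> float:
    """Combined throughput as a fraction of the summed standalone throughputs."""
    return combined_rps / sum(standalone_rps)


# Single Read Page test: 276 rps combined vs. 162 + 141 rps standalone.
single_read = scaling_efficiency(276, [162, 141])
# Full Order test: 135 rps combined vs. 91 + 85 rps standalone.
full_order = scaling_efficiency(135, [91, 85])

print(f"{single_read:.0%}, {full_order:.0%}")  # prints "91%, 77%"
```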
Figure 12 When editing the Port Rules in Network Connections you can configure the load weight for each server in percentages. This effectively drives all SSL traffic to the machine that has the certificate installed.
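To illustrate how per-server load weights can steer traffic, here is a simplified sketch in which each client is assigned to a node in proportion to the configured weights. It is an illustration only, not NLB's real distribution algorithm; the function name, the node names and the use of a SHA-256 hash are assumptions made for this example.

```python
import hashlib


def pick_weighted_node(client_ip: str, weights: dict[str, int]) -> str:
    """Pick a node for a client so that traffic splits roughly by load weight.

    `weights` maps a (hypothetical) node name to a non-negative load weight.
    Illustration only, not NLB's actual algorithm.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one node needs a positive load weight")
    digest = hashlib.sha256(client_ip.encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % total  # 0 .. total-1
    upper = 0
    for node, weight in weights.items():
        upper += weight
        if bucket < upper:
            return node
    raise AssertionError("unreachable: bucket is always below the total weight")
```

With weights of `{"machine1": 100, "machine2": 0}` every client lands on `machine1`, which corresponds to sending all SSL traffic to the one server that holds the certificate. Weights of 80 and 20 would instead split clients roughly four to one.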
Finally, load balancing can allow you to scale applications across multiple machines relatively easily. To add more load handling capability, just add more machines. But remember that when you build applications this way, your weakest link can bring down the entire load balancing scheme. If your SQL backend, which all of your cluster nodes are accessing, is maxed out, no amount of additional machines in the load balancing cluster will improve performance. The SQL backend is your weakest link, and the only way to wring better performance out of it is to upgrade hardware or start splitting databases onto separate servers.