Link Level Load Balancing and Fault Tolerance in NetWare 6
Senior Software Engineer
Novell Bangalore
ssudarshan@novell.com
Piyush Rai
Senior Software Engineer
Novell Bangalore
rpiyush@novell.com
01 Mar 2002
Load Balancing and Fault Tolerance are provided in NetWare 6 by extending the TCP/IP stack's multihoming capabilities. This AppNote discusses how the NetWare 6 platform provides infrastructure and support to host a variety of net services. We also discuss how the multihoming, load balancing, and fault tolerance features available in NetWare 6 can be optimized in different customer scenarios.
Introduction
High Availability at Low Cost
Novell's Solution
Usage Scenarios
Conclusion
Topics: NetWare 6, Load Balancing, Fault Tolerance
Products: NetWare 6
Audience: network administrators, consultants, integrators
Level: intermediate
Prerequisite Skills: familiarity with high end servers
Operating System: Windows 95/98/NT/2000, NetWare
Tools: none
Sample Code: no
Introduction
The role of a network adapter can become critical when a server is connected to several clients through a single adapter. In such a scenario, the network adapter can become both a bottleneck and a single point of failure. As the processing power of the server increases, this bottleneck becomes more visible. The earlier NetWare TCP/IP stack already provides a fault tolerance solution for default gateways. With NetWare 6, the TCP/IP stack adds two link level solutions, load balancing and fault tolerance, for users who need high availability and an uninterrupted flow of traffic.
This solution is especially important to ISPs and others using high end servers.
This AppNote describes market solutions to customer problems in this area, Novell's solution, the implementation details of Novell's solution, and possible scenarios in which this solution could be implemented.
High Availability at Low Cost
A number of solutions are currently available, but none of them matches what NetWare offers in the two areas that matter most to users: availability of services and cost. NetWare 6 addresses both with a single-server solution that ships with the operating system.
Before getting into the details of the NetWare solution, let's take a quick look at the solutions currently available in the market. These solutions are based on four strategies. Each is briefly assessed here against the customer concerns mentioned above.
Go for higher bandwidth. While this is a logical solution, it is expensive and does not address the primary problem: a single point of failure. Higher bandwidth does not distribute the load across multiple adapters. If anything, the stress on the adapter increases because of the higher bandwidth.
Go for clustering with higher bandwidth. This partly addresses the problem of the first strategy but is again extremely expensive. Clustering also introduces greater complexity and is harder to manage.
Load sharing across multiple servers. This solution involves redirecting the traffic to different servers in a round robin fashion. A DNS load balancing server is one good example. In this solution the load carried by different servers can be uneven, and failure detection and recovery take a considerable amount of time.
Deploy an array of network adapters and aggregate all the links to represent a single logical link to the network hosts. This strategy does address the problem directly. However, most link aggregation technologies are NIC vendor dependent: the NIC drivers and switches communicate using a proprietary protocol. This poses many interoperability issues with other devices.
The Institute of Electrical and Electronics Engineers (IEEE) has approved the 802.3ad Port Aggregation standard for LAN, but it requires all NIC vendors and switch vendors to adopt the 802.3ad standard in order to ensure multi vendor interoperability of link aggregation technology.
The Novell Solution
In NetWare 6, Load Balancing is provided across multiple NICs. This is a single server solution in which multiple NICs share the server load, which makes it highly cost effective. The NetWare solution also includes Fault Tolerance, which returns failed NICs to normal service once they have recovered. The NetWare solution does not require any extra software; it comes with the operating system. To a user this means:
- Increased throughput
- High availability
- Hardware independence
- Cost effectiveness
- Simplicity of configuration
- Multiple IP address support
Novell's Solution
Load Balancing and Fault Tolerance are provided in NetWare 6 by extending the stack's multihoming capabilities.
Multihoming
Multihoming is the feature that enables a system to have more than one network interface and also allows an interface to assume multiple IP addresses on the same network.
It is typically used for all IP networks bound to a router, irrespective of whether the networks are bound to the same interface or to different interfaces. Novell's stack supports different multihoming combinations: single or multiple NICs, each with single or multiple IP addresses.
Grouping
Multiple NICs can be grouped together by assigning the same IP address on all the NICs. This group will work as a single logical interface for all the applications running on the server as well as the hosts connected to the server using this IP address. Load balancing and fault tolerance can be enabled for this group to balance the server load on all the NICs and provide uninterrupted service in case of NIC failure.
One of the interfaces in the group is selected as primary and the configuration of the group is controlled using this primary interface. The primary interface handles the broadcast load on the server. The NetWare configuration utility allows the administrator to change the primary interface of the group.
Even though all the interfaces are grouped to form a logical group, their MAC addresses are exposed to the network elements. This enables the server to control both incoming and outgoing traffic.
If the administrator wants to associate more than one address with a group, it is possible to use a secondary IP address configuration. In this case the set of IP addresses represents a logical group. This is useful if a different IP address is required for each service hosted on the server.
Load Sharing
NetWare 6 configuration provides an option to enable or disable the load balancing feature for a group. When the load balancing option is not enabled for a group, the traffic will be shared across the NICs based on the remote host's IP address. Every packet that is going out of the system to a given destination address will always be mapped to one of the NICs in the group. The traffic will be fairly distributed across the NICs when multiple clients are connected to the server. Similarly incoming traffic for a given remote IP address will be mapped to the same NIC.
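The address-based mapping described above can be illustrated with a short Python sketch. The article does not say how NetWare derives the NIC from the remote address, so the modulo mapping, the `pick_nic` function, and the NIC names here are assumptions made purely for illustration.

```python
import ipaddress


def pick_nic(remote_ip: str, nics: list[str]) -> str:
    """Map a remote host's IPv4 address to one NIC in the group.

    Assumed mapping (not NetWare's actual algorithm): the integer value
    of the address modulo the number of NICs in the group.
    """
    index = int(ipaddress.IPv4Address(remote_ip)) % len(nics)
    return nics[index]


group = ["NIC1", "NIC2", "NIC3"]  # hypothetical group members
for client in ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.1"]:
    # The same remote address always lands on the same NIC.
    print(client, "->", pick_nic(client, group))
```

Because the mapping depends only on the remote address, a single busy client cannot be spread over several NICs; the distribution becomes fair only when many clients are connected, as noted above.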
Load Balancing
In this configuration the system tries to balance the load across multiple NICs in the group. NICs with different speeds can be grouped together for load balancing, and the load balancing algorithm takes individual NIC capacity into consideration while distributing the load.
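As a rough illustration of capacity-aware distribution, the following Python sketch assigns each new flow to the NIC whose utilization (load divided by capacity) would be lowest after taking the flow. The article does not publish the NetWare algorithm; the `Nic` class, the `assign_flow` function, and the utilization rule are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Nic:
    name: str
    capacity_mbps: int       # link speed of the adapter
    load_mbps: float = 0.0   # traffic currently assigned to it


def assign_flow(nics: list[Nic], flow_mbps: float) -> Nic:
    """Give the flow to the NIC with the lowest projected utilization.

    Assumed policy for illustration only; not NetWare's actual algorithm.
    """
    best = min(nics, key=lambda n: (n.load_mbps + flow_mbps) / n.capacity_mbps)
    best.load_mbps += flow_mbps
    return best


# A mixed-speed group: one gigabit NIC and one 100 Mbps NIC (hypothetical).
group = [Nic("NIC1", 1000), Nic("NIC2", 100)]
for _ in range(11):
    assign_flow(group, 10)

for nic in group:
    print(nic.name, nic.load_mbps, "Mbps of", nic.capacity_mbps)
```

With this rule the faster adapter carries proportionally more traffic, so a slow NIC in the group is not overloaded simply because it is a group member.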
Fault Tolerance
With fault tolerance you can monitor the health of the grouped interfaces and detect instances of faults such as link failure, NIC failure, and switch failure. Once such a fault is detected the load on that interface is diverted to another healthy interface. Fault tolerance works along with load balancing to ensure uninterrupted connection between hosts and the server.
If load balancing is enabled and fault tolerance detects a fault in any interface, it diverts the traffic to the least loaded interface in the group. If load balancing is not enabled and fault tolerance detects a fault, it diverts the load to a randomly chosen healthy interface in the group. Once the failed interface recovers, it is put back into the healthy set and the load is again redistributed across the set. The distribution of load, failover, and redistribution of load (once the failed interface has recovered) take place in such a way that the flow of data is smooth and the TCP/IP connections stay intact throughout. In case of a NIC failure, the connected hosts update their IP address to MAC address mapping by picking up the broadcast messages sent by the server, and continue to work without any problems.
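The failover rules above can be summarized in a small Python model. This is only a sketch: the `NicGroup` class, the per-interface flow counters, and the even redistribution on recovery are assumptions, since the article states the behavior but not the implementation.

```python
import random


class NicGroup:
    """Toy model of a NIC group with fault tolerance (illustrative only)."""

    def __init__(self, names, load_balancing):
        self.load = {name: 0 for name in names}  # flows per interface
        self.healthy = set(names)
        self.load_balancing = load_balancing

    def fail(self, name, rng=random):
        """Mark an interface as failed and divert its flows elsewhere."""
        self.healthy.discard(name)
        if not self.healthy:
            raise RuntimeError("no healthy interface left in the group")
        candidates = sorted(self.healthy)
        if self.load_balancing:
            # Load balancing enabled: pick the least loaded healthy interface.
            target = min(candidates, key=lambda n: self.load[n])
        else:
            # Load balancing disabled: pick any healthy interface at random.
            target = rng.choice(candidates)
        self.load[target] += self.load[name]
        self.load[name] = 0
        return target

    def recover(self, name):
        """Return an interface to the healthy set and redistribute the load.

        Even redistribution is an assumption made for this sketch.
        """
        self.healthy.add(name)
        members = sorted(self.healthy)
        share, extra = divmod(sum(self.load.values()), len(members))
        for i, member in enumerate(members):
            self.load[member] = share + (1 if i < extra else 0)


group = NicGroup(["NIC1", "NIC2", "NIC3"], load_balancing=True)
group.load.update({"NIC1": 5, "NIC2": 2, "NIC3": 7})
print("diverted to", group.fail("NIC1"))
group.recover("NIC1")
print(group.load)
```

The model moves flow counts only; in the real stack the connections themselves stay intact because the clients relearn the MAC address for the shared IP address.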
Usage Scenarios
NetWare 6 is a full-fledged platform that provides infrastructure and support to host a variety of net services. In this section we discuss how the multihoming, load balancing and fault tolerance features available in NetWare 6 can be optimized in different customer scenarios.
Case 1: Load Sharing using Physical Segments
Figure 1 shows a scenario of a multihomed server with each NIC connected to a different physical segment. In this case, each NIC should be configured with a different IP address, from the same subnet or from different subnets, and clients can be distributed among these NICs by putting them in different segments.
Figure 1: Load sharing using physical segments.
This is a typical multihomed network without fault tolerance support. This can be used where different policies have to be defined for different departments and when there is a need for physical separation.
Case 2: Manual Load Sharing using Multiple IP Addresses
Figure 2 shows a scenario of a multihomed server where all the clients have equal access through the network to any of the NICs in the server. In this case, configure each NIC in the server with a unique IP address and distribute the clients among those IP addresses.
Figure 2: Manual load sharing using multiple IP addresses.
Since IP traffic from the clients will be sent to the NIC which is associated with that particular IP address, the load balancing achieved here is by virtue of physically distributing the clients across multiple IP addresses.
When one of the NICs fails, the other NIC takes over that address and clients continue to work without any problems.
Case 3: Auto Load Sharing using Multiple IP Addresses
The setup is similar to Case 2; however, in this scenario clients can be distributed among the different IP addresses using a DNS load balancer. The DNS load balancer should be aware of the multiple IP addresses assigned to the server, and each time it receives a client request it returns one of those IP addresses (see Figure 3).
Figure 3: Auto load sharing using multiple IP addresses.
When one of the NICs fails, the other NIC will take over that address and clients will continue to work without any problems.
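The rotation performed by a DNS load balancer in this case can be sketched in a few lines of Python. The `RoundRobinResolver` class, the host name, and the addresses (taken from the documentation range 192.0.2.0/24) are hypothetical; a real DNS load balancer is a separate product and is not modeled here.

```python
from itertools import cycle


class RoundRobinResolver:
    """Hands out a server's IP addresses in rotation (illustrative only)."""

    def __init__(self, records):
        # records: host name -> list of IP addresses bound to the server's NICs
        self._cycles = {host: cycle(addrs) for host, addrs in records.items()}

    def resolve(self, hostname):
        """Return the next address for the host, wrapping around at the end."""
        return next(self._cycles[hostname])


resolver = RoundRobinResolver(
    {"server.example.com": ["192.0.2.10", "192.0.2.11"]}
)
for client in range(3):
    print("client", client, "->", resolver.resolve("server.example.com"))
```

Because a surviving NIC takes over the address of a failed one, the resolver in this sketch does not need to remove addresses when a NIC goes down.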
Case 4: Transparent Load Sharing using Single IP Address
Here is a scenario of a multihomed server where all the clients have equal access through the network to any of the NICs in the server. In this case, configure all the NICs in the server with the same IP address to form a single group.
Now all the clients will be able to communicate with the server using the same IP address and the load balancing feature in the server takes care of distributing the load across all the NICs.
Figure 4: Transparent load sharing using single IP address.
When one of the NICs fails, the load gets shifted to other NICs in the group.
Case 5: Application Load Balancing using Virtual IP Address
In this scenario, consider a cluster of proxy servers with multiple NICs connected to the same network. In this case, configure all the systems with a virtual IP address. Clients can use this virtual IP address to reach the proxy service running on these servers. Also configure an L4 switch to intercept all the requests from the clients and redirect each request to one of the proxy servers using its load balancing feature.
The proxy server can process the request and return the data directly to the client instead of passing it back through the L4 switch. Usually, the outgoing load on the server is high compared to the incoming load. When that is the case, the server can use multiple NICs to send the data to the clients. When one of the NICs in the proxy server fails, the server automatically switches the load to the other NICs (see Figure 5).
Figure 5: Application load balancing using virtual IP address.
When the server itself fails, the L4 switch detects the failure and can switch the load to other servers. Thus, the feature provides two levels of fault tolerance: one at the server level and another at the NIC level.
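The two levels of fault tolerance can be pictured with the following Python sketch. Everything in it is hypothetical: the `ProxyServer` and `L4Switch` classes, the round-robin policy, and the simplification that a server counts as failed only when all of its NICs are down.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyServer:
    name: str
    nics_up: dict = field(default_factory=dict)  # NIC name -> link is up

    def is_alive(self) -> bool:
        # NIC level: the server keeps serving while any NIC is still up.
        return any(self.nics_up.values())


class L4Switch:
    """Round-robin dispatcher that skips failed servers (assumed policy)."""

    def __init__(self, servers):
        self.servers = servers
        self._next = 0

    def route(self) -> ProxyServer:
        # Server level: try each server once, starting after the last choice.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server.is_alive():
                return server
        raise RuntimeError("no proxy server available")


p1 = ProxyServer("proxy1", {"NIC1": True, "NIC2": True})
p2 = ProxyServer("proxy2", {"NIC1": True})
switch = L4Switch([p1, p2])

p1.nics_up["NIC1"] = False   # one NIC fails: proxy1 still serves
print(switch.route().name)
p2.nics_up["NIC1"] = False   # proxy2 loses its only NIC: switch skips it
print(switch.route().name)
```

A NIC failure is absorbed inside the server and is invisible to the switch; only the loss of the whole server causes the switch to redirect clients.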
Conclusion
Even though several solutions are available in the market, Novell offers a cost-effective and vendor-independent solution to the problem. The new Multihoming, Load Balancing, and Fault Tolerance features are available in NetWare 6 and NetWare 5.1 Consolidated Support Pack 7.
* Originally published in Novell AppNotes