An Introduction to Novell Cluster Services
01 May 1999
This AppNote is adapted from "Overview and Installation" and other documents found on the Novell Cluster Services Web Site
Get a sneak preview of Novell's latest high availability product offering, which allows up to 32 servers to be joined together in a cluster where resources can be dynamically switched or moved to another server in the event of a server failure.
Novell Cluster Services is a server clustering system that ensures high availability and manageability of critical network resources including data (volumes), applications, server licenses, and services. It is a multinode, directory-enabled clustering product for NetWare 5 that supports failover, failback, and migration (load balancing) of individually managed cluster resources.
Novell Cluster Services includes several important features to help you ensure and manage the availability of your network resources. These include:
Support for shared disk configurations or local disk configurations
Multinode all-active cluster (up to 32 nodes)—any NetWare server in the cluster can restart resources (applications, services, IP addresses, and volumes) from a failed server in the cluster.
Single point of administration through a Java-based ConsoleOne cluster configuration and monitoring GUI
The ability to tailor a cluster to the specific applications and hardware infrastructure that fit your organization
Dynamic assignment and reassignment of server storage on an as-needed basis
Novell Cluster Services allows you to configure up to 32 NetWare 5 servers into a high-availability cluster, where resources can be dynamically switched or moved to any server in the cluster. Resources can be configured to automatically switch or be moved in the event of a server failure, or can be moved manually to troubleshoot hardware or enable load balancing.
This AppNote provides an overview of Novell Cluster Services benefits, installation, and management. For more information on this and other high availability solutions from Novell, visit the Web site at:
Benefits of Novell Cluster Services
Novell Cluster Services provides high availability from commodity components. Consolidating applications and operations onto a cluster lowers costs. The ability to manage a cluster from a single point of control and to adjust resources to meet changing workload requirements (that is, to manually "load balance" the cluster) is also an important benefit of Novell Cluster Services.
An equally important benefit of implementing Novell Cluster Services is that you can reduce unplanned service outages and reduce planned outages for software and hardware maintenance and upgrades.
Novell Cluster Services directly leverages Novell Directory Services (NDS) by storing all cluster configuration information within the Directory. Instead of using scripts to determine the response to a server failure, cluster services uses NDS to store and distribute failover scenario information, which is available through the Directory to all servers in the cluster.
Reasons you would want to implement Novell Cluster Services include:
Low cost of operation
Shared disk fault tolerance can be obtained by implementing RAID Level 5 on the shared disk subsystem.
Example Cluster Services Scenario
The benefits of Novell Cluster Services can be better understood through the following scenario. Suppose you have configured a three-server cluster, with Web server software installed on each of the three servers. Each server in the cluster hosts two Web sites. All the data, graphics, and e-mail messages for each Web site are stored on a shared disk subsystem connected to each of the servers in the cluster. Figure 1 depicts how this setup might look.
Figure 1: A sample three-server cluster.
During normal cluster operation, each server is in constant communication with the other servers in the cluster. Novell Cluster Services software performs periodic polling of all registered resources to detect failure.
Suppose Web Server 1 experiences hardware or software problems and the users depending on Web Server 1 for Internet access, e-mail, and information lose their connections. Figure 2 shows how resources are moved when Web Server 1 fails.
Figure 2: When Web Server 1 fails, its services are moved to the other servers in the cluster.
As shown in this figure, Web site A moves to Web Server 2 and Web site B moves to Web Server 3. IP addresses and applicable licenses also move to Web Server 2 and Web Server 3.
When you configured the cluster, you decided where the Web sites hosted on each Web server would go should a failure occur. In the previous example, you configured Web site A to move to Web Server 2 and Web site B to move to Web Server 3. This way, the workload once handled by Web Server 1 is evenly distributed.
When Web Server 1 failed, Novell Cluster Services software took the following actions:
Detected a failure
Started any Web site-specific applications (that were running on Web Server 1) on Web Server 2 and Web Server 3 as specified
Transferred IP addresses to Web Server 2 and Web Server 3 as specified
Remounted the data volumes on the shared disk system (that were formerly mounted on Web Server 1) on Web Server 2 and Web Server 3 as specified
In this example, the failover process happened quickly and users regained access to the Internet, Web site information, and e-mail within seconds, and in most cases, without having to log in again.
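The failover sequence above can be sketched as a small Python model. This is an illustration only, not Novell Cluster Services code: the `location` and `failover_target` tables, the `fail_over` function, and the names of Web sites C through F are hypothetical, chosen to mirror the three-server scenario.

```python
# Hypothetical model of the three-server scenario; not Novell code.

# Where each Web site runs during normal operation (two sites per server).
# Sites C-F are invented names; the text only names sites A and B.
location = {
    "Web site A": "Web Server 1",
    "Web site B": "Web Server 1",
    "Web site C": "Web Server 2",
    "Web site D": "Web Server 2",
    "Web site E": "Web Server 3",
    "Web site F": "Web Server 3",
}

# Failover targets chosen by the administrator when configuring the cluster.
failover_target = {
    "Web site A": "Web Server 2",
    "Web site B": "Web Server 3",
}


def fail_over(failed_server, location, failover_target):
    """Return a new location map after `failed_server` goes down.

    Each resource on the failed server moves to its configured target; its
    applications, IP address, and volumes are assumed to move with it.
    Resources with no configured target are marked "unassigned".
    """
    new_location = dict(location)
    for resource, server in location.items():
        if server == failed_server:
            new_location[resource] = failover_target.get(resource, "unassigned")
    return new_location


after = fail_over("Web Server 1", location, failover_target)
for resource in sorted(after):
    print(resource, "->", after[resource])
```

Because the two sites from the failed server go to different targets, each surviving server ends up hosting three sites, which is the even distribution the scenario describes.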
Now suppose the problems with Web Server 1 are resolved and Web Server 1 is returned to a normal operating state. Web site A and Web site B will automatically fail back (that is, be moved back) to Web Server 1, and Web server operation will return to the way it was before Web Server 1 failed.
Novell Cluster Services also provides resource migration capabilities. You can move applications, Web sites, and so on to other servers in your cluster without waiting for a server to fail.
For example, you could have manually moved Web Site A or Web Site B from Web Server 1 to either of the other servers in the cluster. You might want to do this to upgrade or perform scheduled maintenance on Web Server 1, or just to increase performance or accessibility of the Web sites.
Typical cluster configurations normally include a shared disk subsystem connected to all servers in the cluster via high-speed fiber channel cards, cables, and switches. If a server fails, another designated server in the cluster automatically mounts the shared subsystem volumes previously mounted on the failed server. This gives network users continuous access to the volumes on the shared disk subsystem.
Typical resources might include data (volumes), applications, server licenses, and services. Figure 3 shows how a typical cluster configuration might look.
Figure 3: A typical cluster configuration.
The following components comprise a Novell Cluster Services cluster:
From 2 to 32 NetWare 5 servers configured to use IP, each containing at least one disk device (used for a local SYS volume)
Novell Cluster Services software running on each NetWare 5 server in the cluster
A shared disk subsystem connected to all servers in the cluster (optional, but recommended for most configurations)
High-speed fiber channel cards, cables, and switch used to connect the servers to the shared disk subsystem
This section provides an overview of how Novell Cluster Services is installed.
Hardware and Software Requirements
You must have the NetWare 5 Support Pack 2 or later installed and running on each server you intend to add to your cluster. Support Pack 2 for NetWare 5 is currently in beta; you can download it from http://support.novell.com/beta/public.
Before installing, ensure that the following requirements are met:
You have a minimum of two NetWare 5 servers. (A maximum of 6 servers per cluster is recommended for this beta release.)
All servers in the cluster are configured with the IP protocol and are on the same IP subnet.
At least 64 MB of memory is in all servers in the cluster (128 MB is recommended for server application failover).
All servers in each cluster are in the same NDS tree.
At least one local disk device (not shared) is available for a SYS volume on each server.
Novell Client version 184.108.40.206 for Windows 95 or version 4.50.819 for Windows NT is installed on the workstations used to manage and access your cluster.
ConsoleOne is installed (from the Novell Cluster Services product CD) on the workstation used to manage your cluster.
The shared disk system is properly set up and functional according to the manufacturer's instructions.
All shared disk system volumes are configured to use Novell Storage Services.
The disks contained in the shared storage subsystem are configured in a mirroring or RAID 5 configuration to add fault tolerance to the shared disk subsystem.
If the disks in the shared disk system are not configured to use mirroring or RAID 5, a single device error can cause a system failure. Novell Cluster Services software will not protect against such faults.
The workstation used to manage the cluster should have a processor of at least 300 MHz and at least 90 MB of memory. Although slower machines with less memory will work, a faster processor greatly improves the performance of ConsoleOne, so choose the highest-performance workstation available.
A shared disk system is strongly recommended for each cluster.
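Some of the requirements above lend themselves to a quick pre-installation check. The following Python sketch is an assumption-laden illustration, not a Novell tool: the `Server` record, the `check_cluster` function, the server names, the addresses, and the tree names are all hypothetical. It covers only the server count, IP subnet, memory, and NDS tree requirements.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical description of one candidate cluster server."""
    name: str
    ip: str          # IP address the cluster would use
    memory_mb: int   # installed memory in MB
    nds_tree: str    # NDS tree the server belongs to


def check_cluster(servers, subnet):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if len(servers) < 2:
        problems.append("a cluster needs at least two servers")
    network = ipaddress.ip_network(subnet)
    for s in servers:
        if ipaddress.ip_address(s.ip) not in network:
            problems.append(f"{s.name}: {s.ip} is not on subnet {subnet}")
        if s.memory_mb < 64:
            problems.append(f"{s.name}: {s.memory_mb} MB is below the 64 MB minimum")
    if len({s.nds_tree for s in servers}) > 1:
        problems.append("servers are not all in the same NDS tree")
    return problems


# Example data (invented for illustration).
servers = [
    Server("NODE1", "10.1.1.11", 128, "EXAMPLE_TREE"),
    Server("NODE2", "10.1.1.12", 64, "EXAMPLE_TREE"),
    Server("NODE3", "10.1.2.13", 32, "OTHER_TREE"),
]
for problem in check_cluster(servers, "10.1.1.0/24"):
    print(problem)
```

In this example NODE3 is reported three times: it is on a different subnet, it has too little memory, and it belongs to a different tree.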
Installing Novell Cluster Services Software
The Novell Cluster Services Installation creates a new cluster object and installs Novell Cluster Services software on the servers you specify to be part of your cluster. After installation, you will need to run the Novell Cluster Services Installation again each time you want to add servers to your cluster.
From the initial splash screen, launch the Novell Cluster Services installation.
Continue through the installation screens until you get to the screen that prompts you to create a new cluster or edit an existing cluster.
Select "Create a New Cluster" or "Edit an Existing Cluster". Then click Next.
Do one of the following:
(If creating) Enter the name for the new cluster object you are creating and specify the directory tree and context where you want it created. Then click Next.
(If editing) Specify the directory tree, context, and name of the cluster you will add servers to. If you don't know a cluster name, browse and select one from the list. Then click Next.
Enter the name of the server you want to add to the cluster, or browse and select one from the list, and then click Add to Cluster. Repeat this step for every server you want to add to the cluster.
You can also remove servers you just added to the cluster by selecting them from the NetWare Servers in Cluster list and clicking Remove. When you add a server to a cluster, Novell Cluster Services automatically detects the server's IP address. If the server you are adding has more than one IP address, you will be prompted to select the IP address you want Novell Cluster Services to use.
(Conditional) If you are creating a new cluster, specify whether your cluster has a shared disk system and if so, select the drive where you want the small cluster partition created.
Novell Cluster Services requires a small cluster partition on the shared disk system. You are also given the option of mirroring the partition for greater fault tolerance.
You must have a small amount of free space on one of the shared disk drives to create the cluster partition. If no free space is available, the shared disk drives will be unusable by Novell Cluster Services.
Continue through the final installation screen. The Novell Cluster Services installation program will then create a new cluster for you and add the servers you specified in the install to the cluster. If you are editing an existing cluster, the install will just add the specified servers to the cluster.
Any servers in your cluster that don't have NDS replicas must be given all object rights to the cluster object. If you created a new cluster, you now need to create cluster resources and configure those resources. You also should consider adding NetWare volumes to the cluster and creating cluster resource templates.
Adding Volumes to the Cluster
If you have a shared disk system that is part of your cluster and you want the volumes on the shared disk system to be highly available to NetWare clients, you will need to add those volumes to your cluster. Adding the volumes on the shared disk system to your cluster enables them to be moved or mounted on different servers in the cluster during failures, or when migration is necessary. Some server applications don't require NetWare client access to volumes, so adding volumes to your cluster might not be necessary.
In ConsoleOne, browse and select the cluster object you want to add cluster volumes to.
Select File | New | Cluster Volume.
Browse and select a volume on the shared disk system to add to the cluster.
Enter an IP address for the volume. Each volume you add to your cluster requires its own separate IP address.
To complete the process for adding a volume to the cluster, you now need to set failover and failback modes (see "Setting Failover and Failback Modes" later in this AppNote). If necessary, you can also change the node assignments for the volume (see "Assigning Nodes to a Resource" later in this AppNote).
Creating Cluster Resource Templates
The creation and use of templates simplifies the process of creating similar or identical cluster resources. You can create templates for any server application or resource you want to add to your cluster. IP SERVICE is the only template currently provided with Novell Cluster Services software, and it can be used when configuring server applications to run on your cluster. You can edit and customize the IP SERVICE template for specific server applications.
In ConsoleOne, browse and select the cluster object where you want to create a cluster resource template.
From the menu bar, select File | New | Cluster Resource.
Enter a name for the new cluster resource template.
Check the Create Cluster Resource Template check box. This option lets you create a cluster resource template instead of a cluster resource.
Check the Define Additional Properties check box.
To complete the process for creating a cluster resource template, you now need to configure load and unload scripts (see "Configuring Load Scripts" later in this AppNote). You also need to set failover and failback modes and, if necessary, change the node assignments for the resource template.
Creating Cluster Resources
Cluster resources must be created for every resource or application you run on servers in your cluster. Cluster resources can include web sites, e-mail servers, databases, and any other server-based applications or services you want to make available to users at all times.
In ConsoleOne, browse and select the cluster object which you want to create resources for.
Select File | New | Cluster Resource.
Enter a name for the new cluster resource.
If a template exists for the resource you are creating, enter the template name or browse and select it from the list. If a template does not exist, check the Define Additional Properties check box.
(Conditional) If you are using a template for this resource, additional resource configuration is performed automatically by the template.
If you are not using a template, you must now complete the process for creating the cluster resource by configuring load and unload scripts, setting failover and failback modes, and, if necessary, changing the node assignments for the resource.
Configuring Load Scripts
A load script is required by Novell Cluster Services for each resource, service, or volume in your cluster. The load script specifies the commands to start the resource or service on a server, or to mount the volume on a server. You can use any commands in the load script that would be used in a .NCF file run from the server console. If you don't know which commands to add to your load script, consult the documentation for the application or resource.
Select the Load Script property page tab on the property page book.
Edit or add the necessary commands to the script to load the intended resource on the server. As you edit or add to the script, you will notice a CRMACK command already in the script. Novell Cluster Services places the CRMACK command in the script automatically and uses it to determine when the end of the script has been reached. The CRMACK command must always be the last command in the script.
Specify a timeout value. The default is 600 seconds, or 10 minutes. The timeout value determines how much time the script is given to complete. If the script does not complete within the specified time, the resource becomes "comatose."
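The two rules described above (CRMACK must be the last command, and a script that overruns its timeout leaves the resource comatose) can be sketched in Python. This is a hypothetical model, not part of the product: `ends_with_crmack` and `state_after_script` are invented helper names, the placeholder command text stands in for real server console commands, and "running" is an assumed label for the successful case (the text names only the "comatose" state).

```python
DEFAULT_TIMEOUT_SECONDS = 600  # default from the text: 600 seconds (10 minutes)


def ends_with_crmack(commands):
    """Return True if the last non-blank line of the script is CRMACK."""
    lines = [line.strip() for line in commands if line.strip()]
    return bool(lines) and lines[-1].upper() == "CRMACK"


def state_after_script(elapsed_seconds, timeout_seconds=DEFAULT_TIMEOUT_SECONDS):
    """Return the resource state once the script has run.

    A script that does not complete within the timeout leaves the resource
    "comatose"; "running" is an assumed name for the normal outcome.
    """
    if elapsed_seconds > timeout_seconds:
        return "comatose"
    return "running"


# Placeholder script lines; real scripts contain server console commands.
good_script = ["<command that starts the application>", "CRMACK"]
bad_script = ["CRMACK", "<command that starts the application>"]

print(ends_with_crmack(good_script))   # True
print(ends_with_crmack(bad_script))    # False
print(state_after_script(45))          # running
print(state_after_script(601))         # comatose
```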
Configuring Unload Scripts
Depending on your cluster application or resource, you can add an unload script to specify how the application or resource should terminate. An unload script may not be required by all resources or applications, but it can ensure that during a failback or manual migration, a resource unloads before it loads on another node. Consult your application vendor or documentation to determine if you should add commands to unload the resource.
Select the Unload Script property page tab on the property page book.
Edit or add the necessary commands to the script to unload the intended resource on the server.
You can use any commands used in a .NCF file run from the server console. If you don't know which commands to add, consult the documentation for the application or resource you want to unload. The CRMACK command should be the last command in the script.
Specify a timeout value. The default is 600 seconds, or 10 minutes. The timeout value determines how much time the script is given to complete. If the script does not complete within the specified time, the resource becomes "comatose."
Setting Failover and Failback Modes
Failover and failback of cluster resources can be configured to happen manually or automatically. If you want applications or resources to automatically move to specified nodes in the event of hardware and software failures, set the failover mode to automatic.
Setting the failback mode to automatic ensures that applications or resources automatically move back to their preferred node when hardware or software problems are resolved and the server assigned as the preferred node for the resource is brought back online. The preferred node is the first server in the nodes list on the Nodes property page. The resource will not automatically fail back to any server in the cluster other than the preferred node.
Set the failover mode to manual if you want to intervene after a failure occurs and before the resource is moved to another node. Setting the failover mode to manual allows you time to bring up failed nodes or migrate resources on other nodes before allowing the resource to move.
Manual failback works much the same as manual failover. Manual failback is useful for preventing a resource from moving back to its preferred node after the preferred node is brought back online.
Select the Policies property page tab on the property page book.
Select the Ignore Quorum check box if you don't want the cluster-wide timeout period and node number limit enforced.
The quorum default values were set when you installed Novell Cluster Services. You can change the quorum default values by accessing the properties page for the cluster object.
Choose the failover and failback modes for this resource. The default for both failover and failback is Auto.
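The interaction between the two modes can be sketched as follows. This Python model is illustrative only: the function names, the `AUTO` and `MANUAL` constants, and the server names are hypothetical. It also assumes that an automatic failover picks the first online server in the resource's node list, and it ignores the quorum settings entirely.

```python
AUTO, MANUAL = "auto", "manual"


def node_after_failure(assigned_nodes, online, failover_mode):
    """Return the node a resource moves to when its current node fails.

    Returns None when the resource must wait: either the failover mode is
    manual (the administrator decides) or no assigned node is online.
    Assumption: automatic failover takes the first online node in list order.
    """
    if failover_mode == MANUAL:
        return None
    for node in assigned_nodes:
        if node in online:
            return node
    return None


def node_after_return(assigned_nodes, current_node, returned_node, failback_mode):
    """Return where the resource runs after `returned_node` comes back online.

    Automatic failback applies only to the preferred node, which is the
    first server in the resource's node list.
    """
    preferred = assigned_nodes[0]
    if failback_mode == AUTO and returned_node == preferred:
        return preferred
    return current_node


nodes = ["SERVER1", "SERVER2", "SERVER3"]   # SERVER1 is the preferred node

print(node_after_failure(nodes, {"SERVER2", "SERVER3"}, AUTO))    # SERVER2
print(node_after_failure(nodes, {"SERVER2", "SERVER3"}, MANUAL))  # None
print(node_after_return(nodes, "SERVER2", "SERVER1", AUTO))       # SERVER1
print(node_after_return(nodes, "SERVER2", "SERVER1", MANUAL))     # SERVER2
```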
Assigning Nodes to a Resource
When you create a resource on a cluster or add a volume to a cluster, the nodes in the cluster are automatically assigned to the resource or volume. The order of assignment is the order the nodes appear in the resource list. You can assign or unassign nodes to the resource or volume, or change the failover order.
Select the Nodes property page tab on the property page book.
From the list of unassigned nodes, select the server you want the resource assigned to and click the right arrow button to move the selected server to the Assigned nodes list.
Repeat this step for as many servers as you want assigned to the resource. You can also use the left arrow button to unassign servers from the resource.
Click the up and down arrow buttons to change the failover order of the servers assigned to the resource or volume.
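The effect of the arrow buttons on the two lists can be modeled with plain Python lists. This is a hypothetical sketch of the behavior described above, not ConsoleOne code; the function and server names are invented.

```python
def assign(assigned, unassigned, node):
    """Right arrow: move a node from the unassigned list to the end of the assigned list."""
    unassigned.remove(node)
    assigned.append(node)


def unassign(assigned, unassigned, node):
    """Left arrow: move a node from the assigned list back to the unassigned list."""
    assigned.remove(node)
    unassigned.append(node)


def move_up(assigned, node):
    """Up arrow: move a node one position earlier in the failover order."""
    i = assigned.index(node)
    if i > 0:
        assigned[i - 1], assigned[i] = assigned[i], assigned[i - 1]


assigned = ["SERVER1", "SERVER2"]
unassigned = ["SERVER3"]

assign(assigned, unassigned, "SERVER3")   # SERVER3 joins at the end of the order
move_up(assigned, "SERVER3")              # SERVER3 now precedes SERVER2
print(assigned)     # ['SERVER1', 'SERVER3', 'SERVER2']
print(unassigned)   # []
```

The first entry in the assigned list remains the preferred node, so moving a server to the top of the list also changes where an automatic failback will send the resource.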
Cluster Services Management
This section provides an overview of various management tasks you can perform with Novell Cluster Services.
You can move resources to different servers in your cluster without waiting for a failure to occur. You might want to migrate resources to lessen the load on a specific server, to free up a server so it can be brought down for scheduled maintenance, or to increase the performance of the resource or application by putting it on a faster machine.
Moving resources allows you to balance the load and evenly distribute applications among the servers in your cluster.
In ConsoleOne, browse and select the cluster object that contains the resource you want to migrate.
Ensure the right half of ConsoleOne displays the Cluster State view by selecting View | Cluster State from the menu at the top of the screen.
In the Cluster Resource List, select the resource you want to migrate. The Cluster Resource Manager screen appears, displaying the server the selected resource is currently running on and a list of servers to which you can migrate the resource.
Select a server from the list and click Migrate to move the resource to the selected server.
Note: If you select a resource and click OFFLINE, the selected resource will be unloaded from the server. It will not load on any other servers in the cluster and will remain unloaded until you load it again. This option is useful for editing resources since resources can't be edited while loaded or running on a server.
Identifying Cluster States
The Cluster State view in ConsoleOne gives you information about the status of servers and resources in your cluster. Cluster servers and resources display in different colors, depending on their operating state. When servers and resources are green, they are in a normal operating condition.
Red signifies that a server that was part of the cluster has failed and needs administrator intervention. When a resource is red, it has stopped running and is waiting to be failed over or failed back to a designated server.
Grey indicates that a server is not part of the cluster, or its state is unknown. When a resource is grey, it is not assigned and not running on any server.
The yellow ball designates the master server in the cluster. The master server is initially the first server in the cluster, but another server can become master should the first server fail.
The Epoch number indicates the number of times the cluster state has changed. The cluster state will change every time a server joins or leaves the cluster.
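The Epoch number and the master server can be illustrated with a small Python model. This is a sketch under stated assumptions, not a description of the product's internals: the `ClusterView` class is hypothetical, the epoch is assumed to start at 0, and the model assumes the longest-standing remaining member becomes master when the current master leaves (the text says only that another server can take over).

```python
class ClusterView:
    """Hypothetical model of cluster membership, epoch, and master server."""

    def __init__(self):
        self.members = []  # servers in the order they joined
        self.epoch = 0     # assumed starting value

    def join(self, server):
        """A server joins the cluster; the cluster state changes."""
        if server not in self.members:
            self.members.append(server)
            self.epoch += 1

    def leave(self, server):
        """A server leaves (or fails out of) the cluster; the state changes."""
        if server in self.members:
            self.members.remove(server)
            self.epoch += 1

    @property
    def master(self):
        """The first remaining member, or None if the cluster is empty."""
        return self.members[0] if self.members else None


cluster = ClusterView()
for name in ("SERVER1", "SERVER2", "SERVER3"):
    cluster.join(name)
print(cluster.epoch, cluster.master)   # 3 SERVER1

cluster.leave("SERVER1")               # the first server fails
print(cluster.epoch, cluster.master)   # 4 SERVER2
```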
Clicking on the top bar of the Cluster State view screen will launch a more detailed report of the state of your cluster. You can view this report, or save it to an HTML file for printing or viewing with a browser.
Novell Cluster Services Console Commands
Novell Cluster Services creates two NCF script files that can be run from the server console. These scripts can be useful for updating your cluster software or troubleshooting cluster problems.
ULDNCS.NCF unloads Novell Cluster Services software from the server.
LDNCS.NCF reloads Novell Cluster Services software on the server.
Novell Cluster Services provides other server console commands to help you perform certain cluster related tasks. Type HELP CLUSTER at the console prompt to get information on the commands and their functions.
Novell Cluster Services dramatically increases the availability of Web and Internet application servers and transparently migrates Internet applications and users from a failed server to other servers in the cluster. By providing a solution that ensures Internet applications will always be available, Novell continues to provide solutions that form the intelligent infrastructure of e-business on the Web.
* Originally published in Novell AppNotes