Over and Over and Over and Over
01 Mar 2005
You've probably read a lot about server clustering solutions. But writers often don't mention the specific type of clustering solution they are discussing, leaving the discussion unclear. Four different types of clustering solutions exist:
High-availability clustering
High-performance computational clustering
Workload-balance clustering
Clustered parallel file systems
Each of the four types of clustering serves a unique purpose and requires different software to accomplish its purpose. For an explanation of the four types of clustering solutions, see The Four Faces of Clustering.
Of the different types of clustering solutions, high-availability clustering is in the most demand. In fact, 70 percent of the 1,371 IT directors who responded to a recent IDC survey cited high availability as their primary reason for clustering. (Source: IDC, U.S. Server Clustering 2004--2008, Forecast: Scale-Out Computing from the End User's Perspective, 2004.)
Many organizations need 24x7 access to data and resources. High-availability clustering solutions meet this need. Open Enterprise Server conveniently includes a high-availability clustering solution for both NetWare and Linux: Novell Cluster Services.
Novell Cluster Services Feature Overview
Novell Cluster Services 1.7, which is the same version available in NetWare 6.5, is included in Open Enterprise Server. In addition, Open Enterprise Server includes Novell Cluster Services 1.7 for SUSE LINUX Enterprise Server 9. By including Novell Cluster Services for both the NetWare and Linux platforms, Open Enterprise Server enables you to create clusters of NetWare servers, clusters of Linux servers, or clusters that include both NetWare and Linux servers. Creating any of these clusters requires the Open Enterprise Server versions of NetWare and Linux.
When you purchase Open Enterprise Server, you will be able to create a two-node cluster (each server in a cluster is called a node). You can easily expand the cluster by purchasing additional licensing. In fact, Novell Cluster Services is capable of scaling up to 32 nodes in a cluster--and each node can have as many as 32 processors.
Novell Cluster Services has a rich feature set--too rich to fully describe here--but the core functions include failover, fan-out failover and resource migration.
One of the main purposes of any high-availability clustering software is to provide failover. A simple scenario shows how failover works: A company (let's call it Shakespeare, Inc.) sets up a two-node cluster with a shared-disk subsystem. The two servers are named Petruchio and Kate. (See Figure 1.) (A shared-disk subsystem, also called a Storage Area Network (SAN), is a group of shared disks connected to all the nodes in a cluster. See Connecting the SAN.) Petruchio and Kate each control access to a Novell Storage Services storage pool: PET_POOL and KATE_POOL, respectively. Uninterrupted access to these pools is vital because employees require the documents in these pools to do their jobs.
The actual disks for the two servers are physically located on the shared-disk subsystem. Petruchio is configured to mount PET_POOL and Kate is configured to mount KATE_POOL. Petruchio and Kate communicate their status to one another by means of a heartbeat. Every second (the default setting), Petruchio and Kate exchange heartbeat packets. If Petruchio is taken down for maintenance or fails, it no longer sends or acknowledges heartbeat packets and Kate detects that Petruchio is down. Kate then automatically mounts PET_POOL and continues to serve the users who normally access their documents through Petruchio. Once Petruchio is repaired, Kate and Petruchio communicate again, and PET_POOL automatically fails back to Petruchio.
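The heartbeat, failover and failback sequence described above can be sketched in a few lines of Python. This is only an illustrative model, not how Novell Cluster Services is implemented: the one-second interval comes from the scenario, while the `Node` class, the function names and the `MISSED_BEATS_ALLOWED` tolerance are assumptions made up for this example.

```python
from dataclasses import dataclass, field

HEARTBEAT_INTERVAL = 1.0   # seconds; the default stated in the article
MISSED_BEATS_ALLOWED = 8   # hypothetical tolerance, not a documented value


@dataclass
class Node:
    name: str
    pools: set = field(default_factory=set)  # pools this node has mounted
    last_heartbeat: float = 0.0              # time the last heartbeat was seen


def is_down(node: Node, now: float) -> bool:
    """Presume a node is down once its heartbeats have stopped for too long."""
    return now - node.last_heartbeat > HEARTBEAT_INTERVAL * MISSED_BEATS_ALLOWED


def fail_over(failed: Node, survivor: Node) -> None:
    """The survivor mounts every pool the failed node was serving."""
    survivor.pools |= failed.pools
    failed.pools = set()


def fail_back(owner: Node, holder: Node, home_pools: set) -> None:
    """Return the owner's home pools once it has rejoined the cluster."""
    moved = holder.pools & home_pools
    holder.pools -= moved
    owner.pools |= moved


petruchio = Node("Petruchio", {"PET_POOL"})
kate = Node("Kate", {"KATE_POOL"})

# Kate is still heard from at t = 10; Petruchio has been silent since t = 0.
kate.last_heartbeat = 10.0
if is_down(petruchio, now=10.0):
    fail_over(petruchio, kate)
print("After failover, Kate serves:", sorted(kate.pools))

# Petruchio is repaired, resumes heartbeats, and takes PET_POOL back.
petruchio.last_heartbeat = 20.0
fail_back(petruchio, kate, {"PET_POOL"})
print("After failback, Petruchio serves:", sorted(petruchio.pools))
```

Because both servers see the same disks on the shared-disk subsystem, "moving" a pool in this model is only a change of which node has it mounted; no data is copied.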
The benefit of failover is obvious: The employees at Shakespeare, Inc. have automatic, non-stop access to the documents they need--even during routine maintenance or server failures.
Novell Cluster Services also provides a more customizable version of failover called fan-out failover. To describe it, let's use a three-node scenario with names borrowed from Greek mythology. (See Figure 2.) A company called Moerae operates six Web sites--a few sites for customers, one for employees and a blog site. Web sites A and B are on server Atropos; Web sites C and D are on server Clotho; and Web sites E and F are on server Lachesis. Moerae has configured the three servers in a cluster with a shared-disk subsystem.
With fan-out failover clustering, if Atropos fails, its two Web sites do not both have to move to the same server; they can be split between the remaining servers. IT personnel at Moerae preconfigured Web site A to move to Clotho and Web site B to move to Lachesis if Atropos fails. Because these targets are configured in advance, failovers occur quickly and automatically.
Fan-out failover enables you to balance your server resources. For example, if Clotho were a more powerful server than Lachesis and Web site A were much larger than Web site B, then the configuration that Moerae chose for their failover makes very good sense. In fact, if Web site B were not considered mission-critical (if it were the owner's blog site, for example), it could be configured not to fail over at all. In this situation, Web site A would move to Clotho and continue with little or no interruption, while Web site B would go down until Atropos was repaired.
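As a rough sketch of the fan-out idea, the Moerae configuration can be modeled as a table of preconfigured targets per resource. The dictionaries and the `fan_out` function below are hypothetical illustrations, not a Novell Cluster Services interface; an empty target list stands for a resource that is configured not to fail over.

```python
# Where each Web site runs in the Moerae example.
placement = {
    "Web site A": "Atropos", "Web site B": "Atropos",
    "Web site C": "Clotho", "Web site D": "Clotho",
    "Web site E": "Lachesis", "Web site F": "Lachesis",
}

# Preconfigured failover targets for the sites on Atropos.
# An empty list would mean "do not fail over" (the blog-site variant).
failover_targets = {
    "Web site A": ["Clotho"],
    "Web site B": ["Lachesis"],
}


def fan_out(failed_node, placement, failover_targets, live_nodes):
    """Return a new placement after failed_node goes down.

    A resource with no live target maps to None, meaning it stays
    offline until the failed node is repaired.
    """
    result = dict(placement)
    for resource, node in placement.items():
        if node != failed_node:
            continue
        targets = failover_targets.get(resource, [])
        result[resource] = next((t for t in targets if t in live_nodes), None)
    return result


after = fan_out("Atropos", placement, failover_targets, {"Clotho", "Lachesis"})
print(after["Web site A"], after["Web site B"])
```

Setting `failover_targets["Web site B"] = []` reproduces the case in which the blog site simply goes down until Atropos is repaired.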
In addition to providing automatic failover and allowing for fan-out failovers, Novell Cluster Services enables you to migrate resources manually. Resource migration provides a number of benefits. For example, if the IT personnel at our fictional Moerae company need to perform maintenance on Atropos, they simply migrate Web sites A and B to Clotho and Lachesis, take down Atropos and perform the required maintenance. Afterward, they reboot Atropos and migrate Web sites A and B back to it. The users continue to get uninterrupted service, and IT personnel do not have to work strange hours. The financial department is also pleased, because Moerae does not have to pay the IT personnel overtime.
One real-world example of the benefit of Novell Cluster Services resource migration can be seen at St. Vincent Indianapolis Hospital, a Novell customer. St. Vincent Indianapolis Hospital is part of St. Vincent Health, the largest health care provider in Indiana. Prior to installing Novell Cluster Services, IT staff at St. Vincent had to plan server maintenance two months in advance, then perform that maintenance on Sunday mornings between 2 a.m. and 4 a.m. Because St. Vincent is a hospital, it operates around the clock, so even on early Sunday mornings, the departments affected by the server maintenance had to prepare and make alternative plans for those two hours. Now that St. Vincent has Novell Cluster Services, IT staff can perform routine maintenance during normal work hours without disrupting services to any of the hospital's 8,000 staff and physicians, while still guaranteeing 24x7 patient care.
Another important benefit of resource migration is scalability. For example, if Moerae's Web site F (on server Lachesis) suddenly became extremely popular, the hits might become more than Lachesis could handle. Without Novell Cluster Services, this problem would mean taking Web site F down while a new, higher-end server is installed to run it. Fortunately, Moerae uses Novell Cluster Services, so they simply buy a new server, attach it to the cluster, and migrate Web site F to it. Most of their end users don't even notice.
The Advent of Mixed-Node Clusters
Considering the core advantages of high-availability clustering with Novell Cluster Services, it is easy to see why Novell customers who have already implemented Cluster Services for NetWare never want to go without it; they're addicted to having all of their services available all of the time and to never having to take services down to upgrade or maintain their NetWare servers.
Suggest Open Enterprise Server to an all-NetWare clustered shop, and you might imagine that the customers running that shop would be concerned about adding Linux servers to their fail-proof environment. In fact, Richard D. Jones, product manager of storage and clustering at Novell, explained that one large Novell customer indicated they would not move from their current NetWare version to any other version or platform unless they could do a rolling upgrade--that is, upgrade one node in a cluster at a time. During a rolling upgrade, the cluster remains operational at all times.
Those customers with all-NetWare clustered shops needn't be concerned: Open Enterprise Server enables you to run both NetWare and Linux servers in mixed-node clusters--that is, clusters that include servers with different operating systems. Mixed-node clusters enable rolling upgrades to be accomplished over a longer period of time. Many customers have taken months to complete a rolling upgrade. You don't have to feel rushed. For loyal Novell customers who want to begin introducing Linux servers into their NetWare environment, Novell Cluster Services provides a smooth migration path by allowing them to cluster NetWare and Linux servers together. Those customers who have grown accustomed to 24x7 availability of their services won't lose that availability while migrating to Linux.
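The shape of a rolling upgrade can be sketched as a loop that handles one node at a time. The function below is a simplified model under stated assumptions, not a Novell procedure: the node and resource names are hypothetical, every resource is assumed to be able to run on either platform, and resources are spread round-robin over the other nodes while their home node is being upgraded.

```python
def rolling_upgrade(platforms, placement):
    """Upgrade NetWare nodes to Linux one node at a time.

    platforms: dict of node name -> "netware" or "linux"
    placement: dict of resource name -> node currently hosting it
    Returns (platforms, placement, log) without modifying the inputs.
    """
    platforms, placement, log = dict(platforms), dict(placement), []
    for node in list(platforms):
        if platforms[node] != "netware":
            continue
        others = [n for n in platforms if n != node]
        if not others:
            raise ValueError("a rolling upgrade needs at least two nodes")
        hosted = [r for r, n in placement.items() if n == node]
        # Spread this node's resources over the remaining nodes.
        for i, resource in enumerate(hosted):
            placement[resource] = others[i % len(others)]
        log.append(f"{node}: migrated {len(hosted)} resource(s) away")
        # Only this one node is out of service while it is upgraded.
        platforms[node] = "linux"
        log.append(f"{node}: upgraded to linux")
        # Bring the resources home again once the node has rejoined.
        for resource in hosted:
            placement[resource] = node
    return platforms, placement, log


final_platforms, final_placement, log = rolling_upgrade(
    {"node1": "netware", "node2": "netware", "node3": "netware"},
    {"mail": "node1", "web": "node2", "files": "node3"},
)
print("\n".join(log))
```

Between iterations the cluster is in a mixed-node state, which is exactly the period that can stretch over weeks or months.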
Be aware of a couple of caveats concerning mixed-node clusters: While a cluster is in mixed-node state, the storage in the volumes cannot be expanded. You must wait until you have completed a rolling upgrade on an entire cluster. The cluster also must once again have homogeneous server operating systems before you can expand the file system. Also, due to a restriction in the cluster install utility for Linux, an all-Linux cluster cannot have a NetWare node installed into it. You may add Linux nodes to either Linux clusters or NetWare clusters. But to add a NetWare node to a cluster, there must already be at least one other NetWare node in the cluster.
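These caveats amount to two simple rules, which can be written down as checks. The function names and the `"netware"`/`"linux"` strings are hypothetical; the sketch only restates the rules from the paragraph above and is not part of any Novell tool.

```python
def can_add_node(cluster_platforms, new_platform):
    """Apply the rule for adding a node to an existing cluster.

    cluster_platforms: list of "netware" / "linux" strings, one per node.
    """
    if new_platform == "linux":
        # Linux nodes may join Linux clusters or NetWare clusters.
        return True
    if new_platform == "netware":
        # A NetWare node needs at least one NetWare node already present.
        return "netware" in cluster_platforms
    raise ValueError(f"unknown platform: {new_platform}")


def can_expand_storage(cluster_platforms):
    """Storage can be expanded only when every node runs the same platform."""
    return len(set(cluster_platforms)) == 1


print(can_add_node(["linux", "linux"], "netware"))    # all-Linux cluster
print(can_add_node(["netware", "linux"], "netware"))  # mixed-node cluster
print(can_expand_storage(["netware", "linux"]))       # mid-upgrade
```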
Of course, another obvious advantage of mixed-node clusters is the flexibility they offer. You can run a particular service on the server platform you prefer for that service, while still having a backup on the other server platform.
NetWare and Linux--Moving in Together
So you can put NetWare and Linux in the same cluster and you can see why you might want to, but you probably still have lingering concerns: What about NetWare and Linux file systems? What will automatically fail over? And what can you do with the services that will not automatically fail over?
In answer to the first question, Novell Cluster Services doesn't care which file system a server in a cluster uses, as long as the file system is a fast-mount, journaled file system. Novell Storage Services is the only fast-mount, journaled file system supported by NetWare. Thus, NetWare servers must use Novell Storage Services. For Linux servers, you have a few more choices. Examples of fast-mount, journaled file systems for the Linux platform include Reiser and EXT3. You can use these Linux file systems with Novell Cluster Services in Linux-only clusters. However, all servers in an individual cluster must use the same file system. For this reason, servers in mixed-node clusters need to use Novell Storage Services. (For more information about Novell Storage Services for Linux, see In Good Company.)
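The file system rules above can be expressed as a small validation routine. This is a sketch under assumptions: the table lists only the file systems named in this article, Novell Storage Services ("NSS") is assumed to be a valid choice for Linux-only clusters as well, and `check_cluster` is a hypothetical helper rather than a real utility.

```python
# Fast-mount, journaled file systems named in the article, per platform.
FAST_MOUNT_JOURNALED = {
    "netware": {"NSS"},
    "linux": {"NSS", "Reiser", "EXT3"},
}


def check_cluster(nodes):
    """Return a list of problems with a proposed cluster layout.

    nodes: dict of node name -> (platform, file_system)
    """
    problems = []
    for name, (platform, fs) in nodes.items():
        if fs not in FAST_MOUNT_JOURNALED[platform]:
            problems.append(f"{name}: {fs} is not supported on {platform}")
    if len({fs for _, fs in nodes.values()}) > 1:
        problems.append("all nodes in a cluster must use the same file system")
    return problems


mixed = {"kate": ("netware", "NSS"), "petruchio": ("linux", "EXT3")}
for problem in check_cluster(mixed):
    print(problem)
```

Switching `petruchio` to `("linux", "NSS")` yields an empty list, which matches the conclusion that mixed-node clusters need Novell Storage Services on every node.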
What will automatically fail over? Theoretically, any service that is identical across platforms will fail over from NetWare to Linux or from Linux to NetWare. Services that are considered identical across platforms have identical configuration files and block data formats on both platforms.
Apache 2.0.50 is an example of one such service. Apache data and configuration files are identical between NetWare and Linux platforms. In fact, Novell has already demonstrated that Apache can seamlessly fail over from NetWare to Linux and back again. Another example of an application that should fail over automatically is MySQL 4.0.21.
As to the question regarding how to handle services that will not automatically fail over, you can manually migrate these services between server platforms. However, doing so is not as easy as resource migration between identical platforms. Novell Cluster Services enables you to migrate the storage volume to the new platform, but configuration files for the service must be translated and created manually.
For example, moving from Common Internet File System (CIFS) on the NetWare platform to SAMBA on the Linux platform can be an arduous task, even within a cluster. CIFS on NetWare and SAMBA on Linux both emulate Windows Server Message Block (SMB) and CIFS protocols, enabling Windows clients to communicate with NetWare or Linux servers using their native protocols.
The Windows SMB/CIFS protocols define share points, which control access to shared folders and resources. However, the way NetWare CIFS and Linux SAMBA configure these share points is completely different. Thus, to migrate between them, you must manually copy what the share points are on one platform and create the same share points on the other platform. Once the manual configuration is done, you can migrate the disk volume using Novell Cluster Services.
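One way to reduce the manual effort is to record the share points once in a neutral form and generate the Samba side from that record. The sketch below makes several assumptions: the share names and paths are invented, the Python dictionaries are a hypothetical intermediate format (they are not how NetWare CIFS stores its shares), and only three common smb.conf parameters (`path`, `read only` and `comment`) are emitted.

```python
def to_smb_conf(shares):
    """Render a list of share definitions as smb.conf share sections."""
    lines = []
    for share in shares:
        lines.append(f"[{share['name']}]")
        lines.append(f"   path = {share['path']}")
        lines.append(f"   read only = {'yes' if share['read_only'] else 'no'}")
        if share.get("comment"):
            lines.append(f"   comment = {share['comment']}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)


# Hypothetical share points copied down by hand from the NetWare side.
shares = [
    {"name": "docs", "path": "/srv/cluster/docs", "read_only": False,
     "comment": "Company documents"},
    {"name": "archive", "path": "/srv/cluster/archive", "read_only": True},
]

print(to_smb_conf(shares))
```

The generated sections would still need to be reviewed and merged into the Samba configuration by hand before the disk volume is migrated.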
While NetWare and Linux cannot provide fault tolerance for one another when it comes to services that will not fail over, you can nevertheless move these services between platforms to perform server maintenance or to scale up your cluster. Although the process can be difficult, the service can continue to run on its original platform while you prepare to migrate the disk volume so that you maintain high availability for the service.
Connecting the SAN
Novell Cluster Services requires a SAN in order to provide high availability. Novell Cluster Services supports three options for attaching the SAN to the cluster: shared Small Computer Systems Interface (SCSI), Fibre Channel, and Internet SCSI (iSCSI).
Novell Cluster Services supports shared SCSI-attached disk systems. However, shared SCSI is limited in a number of ways: First, shared SCSI is slow. Even SCSI-3 maxes out at 640 Mbps. Second, shared SCSI supports distances only up to 12 meters. Finally, the maximum number of disk drives you can connect with SCSI-3 is 16, and the maximum number of server nodes is two. Although SCSI may be an economical solution for a two-node cluster, it lacks the performance and scalability offered by the more powerful Fibre Channel and iSCSI options.
So which is better, Fibre Channel or iSCSI? While the computer industry tolerates endless debates on this question, the bottom line is that both Fibre Channel and iSCSI are powerful solutions with their own merits. Novell Cluster Services fortunately supports both so that you can choose the solution that works best for your business.
Fibre Channel has been around in the SAN market long enough to have a fairly large installed base. It runs at full-speed (1.06 Gbps), 2x (2.12 Gbps), or 4x (4.26 Gbps). Fibre Channel can achieve distances of up to 20 kilometers, but it requires single-mode fiber (SMF) connections. The other components you will need to connect a Fibre Channel SAN include a Fibre Channel switch and a Fibre Channel card for each server. (See Figure 3.) Fibre Channel offers a number of powerful features, including strong Quality of Service (QoS) features. But you can't have your cake and eat it, too; Fibre Channel is expensive.
iSCSI is a Gigabit Ethernet solution. It can run at 1 Gbps or 10 Gbps. The 10 Gbps variant can run short distances (hundreds of meters) over inexpensive, uncooled fiber or can travel distances that exceed 50 kilometers over SMF. Because iSCSI requires a switch and a card in each server, it looks much like its Fibre Channel counterpart--at first glance, anyway. (See Figure 4.) These iSCSI components are commonly available Ethernet cards and switches and are generally less expensive than their Fibre Channel counterparts. iSCSI also arguably saves businesses a buck or two in training. As a Gigabit Ethernet solution, iSCSI is all Ethernet- and IP-based and therefore requires little specialized training.
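To put the Fibre Channel and iSCSI line rates quoted above in perspective, the following sketch estimates how long it would take to move 100 GB at each rate. It is idealized arithmetic only: it assumes the full line rate is usable for data, ignores protocol and encoding overhead, and treats 1 GB as 10**9 bytes. The 100 GB figure is an arbitrary example.

```python
# Line rates quoted in the article, in gigabits per second.
LINK_RATES_GBPS = {
    "Fibre Channel 1x": 1.06,
    "Fibre Channel 2x": 2.12,
    "Fibre Channel 4x": 4.26,
    "iSCSI over 1 Gbps Ethernet": 1.0,
    "iSCSI over 10 Gbps Ethernet": 10.0,
}


def transfer_seconds(gigabytes, gbps):
    """Ideal time to move `gigabytes` of data at `gbps` gigabits per second."""
    return gigabytes * 8 / gbps


for name, rate in LINK_RATES_GBPS.items():
    print(f"{name}: {transfer_seconds(100, rate):.0f} s for 100 GB")
```

Real throughput will be lower on any of these links, so the numbers are useful only for comparing the options with one another.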
The iManager Advantage
You can manage your NetWare-only, Linux-only, or mixed-node cluster and its resources from any location using the browser-based iManager console that is packaged with Open Enterprise Server. When you install Novell Cluster Services, a Cluster Management link appears in iManager. Using this link, you can manage resources, which include Novell Storage Services pools or other file system volumes and any server-based applications or services that you choose to designate as resources, such as Web sites, e-mail servers or databases.
iManager provides many tools to simplify complex cluster-management tasks. This article provides only a sampling of the many powerful iManager features, including configuring failover and migrating resources.
You can use iManager to configure a variety of options for failover, including the following:
Resources in a cluster are automatically assigned to nodes when they are created. The default assigned node order is the order in which the nodes appear on the list of resources in iManager. Resources fail over to other nodes according to the assigned node list. You can use the iManager graphical user interface to remove nodes from the list, add nodes to the list or change the order of the assigned nodes. (See Figure 5.)
In addition, failover mode for each resource can be set to Auto or Manual. When failover mode is set to Auto, the resource automatically starts on the next server in the Assigned Nodes list in the event of a hardware or software failure. If you set the failover mode to Manual, the resource will enter an alert state in the event of a failure, and you can then choose whether or not the resource will fail over.
Another useful configuration is Resource Priority, which allows you to control the order in which resources start during a failover. For example, if a node fails and two resources fail over to the same node, the resource priority determines which resource loads first, thus ensuring that the most critical resources load first.
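The three settings just described (the assigned node list, the failover mode and the resource priority) interact in a way that is easy to model. The sketch below is an illustration only, not the iManager data model: the `Resource` class and `plan_failover` function are hypothetical, and treating a larger priority number as "loads first" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    assigned_nodes: list   # ordered failover preference
    failover_mode: str     # "auto" or "manual"
    priority: int          # assumed: larger numbers load first


def plan_failover(resources, current, failed_node, live_nodes):
    """Return (start_order, alerts) for the resources on failed_node.

    start_order: list of (resource name, target node), most critical first
    alerts: names of manual-mode resources waiting for an administrator
    """
    moves, alerts = [], []
    for r in resources:
        if current[r.name] != failed_node:
            continue
        if r.failover_mode == "manual":
            alerts.append(r.name)
            continue
        # Auto mode: take the first assigned node that is still alive.
        target = next((n for n in r.assigned_nodes if n in live_nodes), None)
        if target is not None:
            moves.append((r, target))
    moves.sort(key=lambda move: -move[0].priority)
    return [(r.name, target) for r, target in moves], alerts


resources = [
    Resource("mail", ["node2", "node3"], "auto", 10),
    Resource("web", ["node3", "node2"], "auto", 50),
    Resource("db", ["node2"], "manual", 90),
]
current = {"mail": "node1", "web": "node1", "db": "node1"}

order, alerts = plan_failover(resources, current, "node1", {"node2", "node3"})
print("start order:", order)
print("waiting for administrator:", alerts)
```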
Migrating resources using iManager is a very simple process. Doing so requires only that you locate in the iManager console the resource that you want to migrate and check the box next to this resource. Then just click Migrate. (See Figure 6.) A page then appears, displaying a list of possible servers from which you may select the one to which you want to migrate the resource. You can also choose to take a service offline or put it back online by clicking the corresponding links.
Because many businesses now provide access to data and resources through the Internet, they need those services to be available to customers and staff around the clock. Novell Cluster Services for Open Enterprise Server is the high-availability clustering tool that makes those services available 24x7.
In addition, Novell Cluster Services for Open Enterprise Server is the first clustering software to enable mixed-node clusters for even higher availability and a smooth migration path to Linux. With Novell Cluster Services, Open Enterprise Server gives you the power to create two-node clusters right out of the box, ensuring your end users will never be without access to services.
The Four Faces of Clustering
High-Availability Clustering
High-availability clustering enables non-stop access to data and resources. It enables services to be moved from one server to another in the cluster, either manually so that maintenance can be performed on a server without taking services down, or automatically in the event of a failure. Novell Cluster Services is a high-availability clustering solution.
High-Performance Computational Clustering
High-performance computational clustering software divides a processing job into manageable chunks, allows these chunks to be processed on separate servers, and then reassembles the processed job and returns the results. For applications that require intense processing power, this type of clustering negates the need for a high-end server and enables companies to buy a number of lower-end servers, significantly decreasing their investment.
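The divide, process and reassemble pattern can be sketched on a single machine, with worker threads standing in for the separate servers. This is a toy illustration of the pattern only, not of any particular clustering product: the job (summing squares), the chunk size and the function names are all made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Divide the job into manageable chunks."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(chunk):
    """The work one 'server' does on its chunk: here, a sum of squares."""
    return sum(x * x for x in chunk)


def run_job(items, workers=4, chunk_size=1000):
    """Process the chunks on separate workers, then reassemble the result."""
    chunks = chunked(items, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


print(run_job(list(range(10)), chunk_size=3))  # prints 285
```

In a real computational cluster the chunks travel over the network to other machines, so the work per chunk has to be large enough to outweigh the cost of distributing it.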
If your organization has a need for high-performance computational clustering, Novell has a solution: a high-performance service is built into SUSE LINUX Enterprise Server 9 (although it is included in Open Enterprise Server, high-performance computational clustering is priced separately).
Workload-Balance Clustering
Workload-balance clustering is most often used to cluster Web servers. This type of clustering relies on the stateless connection between browsers and Web servers. In this stateless connection, when you request information from a Web site, your browser contacts the Web server and downloads the information, after which the server breaks the connection until you request information again. Thus, a server cluster serving a Web site can enable users to access any available server in the cluster for each individual connection. This both speeds performance and enables a number of lower-end servers to act together to replace a single higher-end server. It also provides fault tolerance for the Web site. Novell provides a workload-balance server clustering solution called Volera Excelerator.
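Because each request is independent, a workload balancer can hand every new connection to whichever server is next in line and simply skip servers that are down. The round-robin dispatcher below is a minimal sketch of that idea; the class and server names are hypothetical, and it does not describe how Volera Excelerator works.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand each new connection to the next healthy server in turn."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._ring = cycle(self._servers)

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        """Return the server that should handle the next connection."""
        if not self._healthy:
            raise RuntimeError("no healthy servers available")
        # One full trip around the ring is enough to find a healthy server.
        for _ in range(len(self._servers)):
            server = next(self._ring)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
print([balancer.pick() for _ in range(4)])

balancer.mark_down("web2")   # web2 fails; later requests skip it
print([balancer.pick() for _ in range(4)])
```

The fault tolerance mentioned above falls out of the same mechanism: a failed server is just one that the dispatcher stops choosing.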
Clustered Parallel File Systems
In a clustered parallel file system, a database application provides simultaneous, shared access to the database to the servers in the cluster. Clustered parallel file systems enable the application to combine the processing power of all the servers in the cluster. They also provide failover security. A popular example of a clustered parallel file system is Oracle Real Application Clusters (RAC). Novell and its partners provide clustered parallel file systems for SUSE LINUX Enterprise Server 9.
* Originally published in Novell Connection Magazine