A Quick Guide to Web Server Acceleration


RON LEE
Senior Research Engineer
Advanced Development Group

01 Oct 1997

Tired of waiting for interminably slow web servers to crank out your user requests? Get up to speed with BorderManager Fast Cache, a web server accelerator that can supercharge even the slowest of web servers.

Introduction
The Power of Caching
The Accelerated Web Server Model
Eight Ways to Implement a Web Server Accelerator
Conclusion

Introduction

Given the increasing popularity of the Internet and in-house intranets, there seems to be no end in sight for the burgeoning demand for web access. As the number of web users grows, so does their impatience with slow web server response times. To make matters worse, traditional solutions for the web's fast-paced growth and slow response times involve expensive hardware upgrades.

Novell's answer to this dilemma is surprisingly simple: take the proven advantages of caching and apply them in the web service environment. Web server acceleration, as provided by Novell's BorderManager and BorderManager Fast Cache, promises to deliver web access at cached speeds, while at the same time relieving your web server of up to 99 percent of its workload. An accelerated web server thus overcomes the performance mismatch between today's web server technologies and the growing user population's desire for instant access.

This AppNote provides a guide to web server acceleration for network designers and administrators who want to get up to speed quickly on this promising new technology. It describes how web server acceleration works and illustrates the kind of performance improvements you can expect from an accelerated web server. To help you in planning your own supercharged web server environment, it outlines eight ways to implement Novell's web server accelerator to meet a variety of design needs, ranging from simplicity of design to ultra high-end performance.

For more information on BorderManager and other AppNotes regarding these technologies, visit Novell's BorderManager web site at http://www.novell.com/products/bordermanager.

The Power of Caching

Without the aid of caching, web server performance is inadequate for most web server applications. Web servers quickly become the system bottleneck, with slow response times as the most common ailment. When slow performance is combined with the growing demand for web services, most web servers quickly run out of steam.

As a workaround for relieving heavy web server resource utilization, many organizations typically consider one of two options: (1) upgrade or replace the current system with larger, more expensive systems; or (2) split the site's content across multiple servers (up to 40 or 50 servers in some extreme cases). But with the web server acceleration available in Novell's BorderManager Fast Cache product, there is a third, much more cost-effective option. Web server acceleration uses the power of caching to breathe new life into overburdened web servers.

The fundamental problem underlying web service performance is that web services were originally designed without consideration for the powerful advantages of caching. Figure 1 compares the performance of Netscape's Enterprise Server running on a dual-processor Sun Ultra Enterprise 3000 to the performance of an accelerated version of the same site using BorderManager's web server acceleration on a uniprocessor Compaq ProLiant 6000.

Figure 1: Web server performance, before and after caching is applied.

Not only is the accelerated site much faster and up to 10 times more scalable, but the majority of the web server's workload is moved to the web server accelerator's cache, significantly extending the usable lifespan of the existing web server hardware.

How Caching Has Helped Novell

Novell's public web site (www.novell.com) has grown steadily in popularity since its inception several years ago. In fact, earlier this year Novell's web servers were due for an expensive replacement because they didn't have sufficient capacity for the site's growing demand and were producing slow response times. But rather than sink more money into expensive hardware upgrades, Novell opted for the accelerated approach. On July 28, 1997, Novell installed BorderManager web server accelerators at its Internet border to take advantage of the increased speed and scalability that caching had to offer.

Figure 2 shows the positive workload trend on the www.novell.com web servers after the BorderManager web server accelerators were installed.

Figure 2: Once BorderManager was implemented, Novell's web server workload fell dramatically while its web site performance skyrocketed.

In Novell's case, all worries about web server performance were eliminated when the average web server workload dropped by 90%. Moreover, three days after implementing BorderManager, Novell introduced an entirely new set of web site content that was much deeper than anything published on the site before. Many web pages expanded from three graphical elements to thirty or more. Most web sites could not have handled such an increase due to lack of web server horsepower. In this case, Novell's new web server accelerators allowed Novell to make content changes that would have otherwise cost hundreds of thousands of dollars in upgraded web server hardware to support.

For more information concerning Novell's own BorderManager installation, see "Web Server Acceleration Using BorderManager: A Case Study of www.novell.com" in the August 1997 issue of AppNotes. For more information on Novell's Internet and intranet caching technologies, see "Three Ways to Deliver Cached Performance to Your Intranet and Internet Users" in the September 1997 issue of AppNotes.

The overwhelming cost-effectiveness of Novell's cache technologies, combined with inexpensive Intel-architecture servers, will soon make non-cached web server systems obsolete.

The Accelerated Web Server Model

When you pair an accelerator with a web server, both the web server and the accelerator take on new roles:

The accelerator assumes the IP address of the web server. By assuming the web server's address, the web server accelerator receives all requests addressed to the web server and handles those requests as if it were the web server.
The web server takes a new IP address known only to the web server accelerator. The accelerator becomes the web server's only user, filling its cache with content requested by the user community in the intranet or Internet.

Two Types of Requests

To better understand the accelerated web server model, you need to be familiar with two terms that are used widely in the Internet industry to refer to different types of web requests: static and dynamic.

Static requests are requests for static content. These are HyperText Transfer Protocol (HTTP) requests which reference files, including HyperText Markup Language (HTML) files, graphics files, sound, or video. All static content is cacheable by an accelerator.

Dynamic requests are requests for dynamically-generated content. These are HTTP requests which reference programs, including CGI scripts, NSAPI scripts, ISAPI scripts, or PERL scripts. These scripts are often used to query databases. An accelerator must pass dynamic requests through to the web server; they cannot be serviced from cache. Although many web sites have databases and are referred to as "dynamic" sites, the majority of their content still consists of static elements that can be cached by an accelerator.
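The static/dynamic distinction can be sketched as a simple URL test. This is a minimal illustration, not BorderManager's actual logic: the function name `is_dynamic` and the prefix and suffix lists are assumptions chosen for the example, and a real site would substitute the locations of its own scripts.

```python
from urllib.parse import urlsplit

# Hypothetical markers for dynamically generated content; a real site
# would list whatever paths and extensions its own scripts live under.
DYNAMIC_PREFIXES = ("/cgi-bin/",)
DYNAMIC_SUFFIXES = (".cgi", ".pl")


def is_dynamic(url: str) -> bool:
    """Return True if the request appears to reference a program, not a file."""
    parts = urlsplit(url)
    if parts.query:
        # A query string usually means parameters are being fed to a script.
        return True
    path = parts.path.lower()
    return path.startswith(DYNAMIC_PREFIXES) or path.endswith(DYNAMIC_SUFFIXES)
```

Under these assumptions, `/index.html` and `/images/logo.gif` are treated as static, while `/cgi-bin/search?q=cache` and `/scripts/report.pl` are treated as dynamic and would be passed through to the web server.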

Servicing Requests for Static Content

When an accelerator receives requests for static content, such as HTML pages or graphics files, the accelerator first tries to respond from its own cache. If the requested object is cached, the accelerator responds immediately with the cached object. If the requested object is not cached, the accelerator fills its cache from the web server and then responds to the request from cache. Approximately 90-100% of all requested objects are static and are therefore cacheable.
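The cache-first behavior described above can be sketched in a few lines. This is a toy model under stated assumptions, not the product's implementation: the `Accelerator` class, its in-memory dictionary, and the `fetch_from_origin` callback (a stand-in for a request to the web server) are all hypothetical.

```python
from typing import Callable, Dict


class Accelerator:
    """Toy cache-first handler for static content (illustrative only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # stand-in for a request to the web server
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        if path in self._cache:
            self.hits += 1                   # object already cached
        else:
            self.misses += 1
            self._cache[path] = self._fetch(path)  # fill the cache from the web server
        return self._cache[path]             # the response always comes from cache
```

If the same path is requested twice, the callback runs only once; the second request never reaches the web server.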

Servicing Requests for Dynamically-Generated Content

When an accelerator receives requests for dynamically-generated content, such as CGI or PERL scripts, the accelerator passes those requests directly to the web server for processing. Once the web server completes the response, the accelerator immediately passes the response through to the requester. Approximately 10% or less of all web server requests are for dynamically-generated content and are therefore non-cacheable.

However, many of these dynamic requests eventually take advantage of the accelerator's caching. Responses to many of these requests contain references to multiple static elements that must be requested by the browser before the transaction is complete. Once the browser receives the response to the dynamic request, the browser begins requesting the rest of the page's static (cacheable) elements. This allows transactions that begin with a dynamic request to take full advantage of the accelerator's shared cache.

Accelerator Transparency

All of the accelerator's functionality is completely transparent to the web server and to the web server's user community. No changes are necessary to the web server, the browsers, or the browser clients. This transparency allows BorderManager to accelerate any web server, including Netscape Enterprise and FastTrack Servers, Microsoft Internet Information Servers, and Apache Web Servers.

Ensuring Cache Freshness

The cacheability of web content--whether a particular element can be cached and how long it may be cached before it is out-dated--is determined by the webmaster. BorderManager uses several methods to guarantee that users get the same information from the accelerator that they would get from the web server:

Some web content is flagged non-cacheable. BorderManager respects these settings and does not cache these documents. Requests for this material are always passed through to the web server.
Some web content has explicit expiration dates and times. BorderManager respects these settings and expires the content accordingly.
Web content that doesn't have an explicit time-to-live (TTL) is handled based on settable parameters. (See the NWAdmin help screen for these Advanced Web Proxy Cache features.)
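The three rules above amount to a short decision procedure. The sketch below is an illustration only: the `CachedObject` record, the function name, and the one-hour `DEFAULT_TTL_SECONDS` are assumptions, and the real default is whatever the administrator sets in NWAdmin.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default; in the product this is a settable parameter.
DEFAULT_TTL_SECONDS = 3600


@dataclass
class CachedObject:
    fetched_at: float                    # time the object was cached (epoch seconds)
    cacheable: bool = True               # False if the webmaster flagged it non-cacheable
    expires_at: Optional[float] = None   # explicit expiration time, if one was supplied


def can_serve_from_cache(obj: CachedObject, now: float) -> bool:
    if not obj.cacheable:
        return False                     # always pass through to the web server
    if obj.expires_at is not None:
        return now < obj.expires_at      # honor the explicit expiration
    return now - obj.fetched_at < DEFAULT_TTL_SECONDS  # fall back to the default TTL
```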

Novell's webmaster keeps the accelerators at Novell's border synchronized by tickling the accelerators each time a document is updated on the master web server. When a document on the accelerated web server is changed, the webmaster sends a request for that document to each of the web server accelerators and includes a "Pragma: no-cache" header in the request. This has the effect of forcing an immediate update of that document in all of the BorderManager accelerators, resulting in near-perfect synchronization across all caches.

Although this process has been automated at Novell, you can perform the same function manually by using the Reload (in Netscape browsers) or Refresh (in Microsoft browsers) feature in your browser. Point your browser at the accelerator's IP address, provide the Uniform Resource Identifier (URI) for the element you want to refresh, and select Reload. This browser feature places a "Pragma: no-cache" header in the request which instructs the accelerator (or any cache) to service the request from the origin web server.
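The same refresh can be scripted. The following is a minimal sketch using Python's standard library, not Novell's own automation, which this article does not show. The function names and the addresses in the usage note are hypothetical.

```python
from urllib.request import Request, urlopen


def build_refresh_request(accelerator_host: str, path: str) -> Request:
    """Build a GET for `path` that asks the accelerator to go back to the origin."""
    url = f"http://{accelerator_host}{path}"
    return Request(url, headers={"Pragma": "no-cache"})


def refresh(accelerator_hosts, path, timeout=10):
    """Send the refresh request to each accelerator; return the HTTP status codes."""
    statuses = []
    for host in accelerator_hosts:
        with urlopen(build_refresh_request(host, path), timeout=timeout) as resp:
            statuses.append(resp.status)
    return statuses
```

For example, `refresh(["192.0.2.10", "192.0.2.11"], "/index.html")` would request the updated page through two accelerators at those (made-up) addresses.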

A Solution for "Cache-Busters"

Many commercial web sites generate revenue based on the number of hits tallied on their site's advertisements. Webmasters for these sites are leery of caching and web server acceleration because existing caching technology provides no way for a cache or accelerator to report the number of hits on a given element back to the site of origin. To overcome this problem, many webmasters use "cache-busting" techniques to ensure that none of their content is cached. Their site retains all of its content, allowing them to keep an accurate tally of all the hits on that content.

In the near future, new caching protocols will be developed to overcome the need for cache-busting and allow caches to work in harmony with hit collection and reporting services. Until that day arrives, webmasters can still take advantage of BorderManager's web server acceleration.

The solution is to flag all the site's elements cacheable except those that require accurate hit counts. This configuration allows the majority of the site's content to be delivered by the web server accelerator while keeping full control of the site's hit counts.

For example, consider a static HTML page that contains an advertisement and several other static elements. If all of the elements except the advertisement are cacheable, they can all be pushed out to a web server accelerator. Requests for that page then generate an initial request for the HTML document that is serviced by the accelerator. This preliminary request is then followed by multiple requests for the remaining static elements. When the request for the advertisement is received by the web server accelerator, it passes the request through to the web server for servicing. This way the web server takes advantage of web server acceleration while keeping an accurate count of hits on specific elements on the page or site.
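A small simulation makes the effect concrete. The page layout, the paths, and the function name below are invented for illustration. The point is only that the non-cacheable advertisement reaches the web server on every page view, while each cacheable element reaches it once.

```python
from collections import Counter

# Hypothetical page: every element is cacheable except the advertisement.
PAGE_ELEMENTS = {
    "/index.html": True,
    "/images/logo.gif": True,
    "/images/photo.jpg": True,
    "/ads/banner.gif": False,   # flagged non-cacheable so the origin sees every hit
}


def simulate_page_views(views: int) -> Counter:
    """Count the requests that reach the web server over `views` page loads."""
    origin_hits: Counter = Counter()
    cached = set()
    for _ in range(views):
        for path, cacheable in PAGE_ELEMENTS.items():
            if cacheable and path in cached:
                continue                  # served by the accelerator
            origin_hits[path] += 1        # passed through, or first cache fill
            if cacheable:
                cached.add(path)
    return origin_hits
```

With 100 page views, the web server in this model sees 100 requests for the advertisement and one request for each of the other three elements, rather than 400 requests in total.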

Eight Ways to Implement a Web Server Accelerator

Web server acceleration is a flexible network service that can be designed to complement the architecture of any intranet or Internet web site. To demonstrate this flexibility, the following examples show how this powerful service can be combined to meet several different needs, including simplicity of design, scalability and high-end performance, reliability, and smart mirroring. The eight examples are:

A single server configuration in which a web server and web server accelerator are combined in a single IntranetWare server.
A dedicated web server accelerator.
A web server accelerator for multiple mirrored web servers.
A web server accelerator for multiple unique web servers.
A web server with multiple web server accelerators.
Novell's mission-critical configuration: an infrastructure that combines multiple mirrored web servers and multiple web server accelerators for optimal redundancy.
A web server accelerator configured for optimal scalability.
A web server accelerator configured to accelerate a remote web site.

All of the web server accelerators in these examples are running IntranetWare 4.11 with either BorderManager 1.0 or BorderManager Fast Cache, plus BorderManager Update 1. The web servers could be any web server running on any hardware platform (except in the first example, in which the web server must be IntranetWare-compatible).

Each of the diagrams specifies the BorderManager settings that are required for each specific accelerator configuration. IP addresses are given to show the relationship between each example's IP addresses and the BorderManager settings for each configuration.

The following two figures are the Windows 95 or Windows NT NWAdmin screens you'll use to set up the web server accelerator. Figure 3 is the main BorderManager Setup screen, accessed by selecting the BorderManager Setup tab in NWAdmin for a specific BorderManager server. Figure 4 is the Web Proxy Cache screen, accessed by selecting Web Proxy Cache after completing the BorderManager setup in Figure 3.

Using this familiar administrative utility, it is a relatively simple matter to set up a web server accelerator with Novell's BorderManager (see the sidebar "Set Up a Web Server Accelerator in Ten Easy Steps" for a quick step-by-step overview of the process).

Set Up a Web Server Accelerator in Ten Easy Steps

To set up a web server accelerator, install BorderManager or BorderManager Fast Cache on an IntranetWare 4.11 server, install the BorderManager NWAdmin plug-ins, and complete the following steps:

Enter the web server's IP address in the BorderManager server's HOSTS file. To edit the file, type LOAD EDIT ETC/HOSTS at the server console.
Using NWAdmin, select your BorderManager server and proceed to the BorderManager Setup screen (see Figure 3).
Select and enable Proxy Cache Services.
Enter the web server accelerator's IP addresses and designate them as private, public, or both (see the example diagrams for the appropriate designations).
Exit and save your BorderManager Setup.
Using NWAdmin, select your BorderManager server and proceed to the Web Proxy Cache screen (see Figure 4).
Enable the HTTP Accelerator.
Create an accelerated web server configuration by entering the web server's host name and HTTP port, along with the IP addresses of the public interfaces that will accept requests for the web server.
Exit and save your Web Proxy Cache settings.
Test your configuration from the public side of the accelerator by opening the IP address of the accelerator in a browser. If everything is functioning properly, you should receive the web server's default home page.

Figure 3: The NWAdmin BorderManager Setup Screen.

Figure 4: The NWAdmin Web Proxy Cache Screen.

Single Server Configuration

Figure 5 shows how an IntranetWare server can combine an IntranetWare-compatible web server with the BorderManager web server accelerator. In this configuration, the web server operates on a secondary IP address within the server, while the web server accelerator receives all requests for the web server on the server's primary IP address.

Figure 5: A web server and web server accelerator combined in a single server.

This single-server combination is an ideal configuration for sites where server hardware is limited. In this case, the web server accelerator handles all requests for static content (usually 95-99 percent of the web server's workload), and responds at cached speeds.

Note: In this case, the web server would have to be IntranetWare 4.11-compatible because BorderManager requires IntranetWare.

BorderManager's acceleration code paths are so efficient that, while handling over a million hits per day, CPU utilization for the service hovers in the 5-15 percent range with occasional peaks in the 20-25 percent range (running on a 200MHz Intel Pentium Pro system). This ultra-efficiency allows you to host other services, including web services, on the same system without having to worry about running out of resources.

If you prefer not to allocate the additional IP address for the web server in this configuration, you can use an address in the internal IP address space (127.0.0.0). The address 127.0.0.1 is reserved as the internal loopback address, leaving 127.0.0.2 (or 127.0.0.3, and so on) to be used as an internal secondary IP address for the web server.
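As a quick check of the addressing described above, the sketch below lists the first few loopback addresses after 127.0.0.1 and confirms that they fall inside the 127.0.0.0/8 loopback block. The function name is hypothetical. Whether a given server accepts such an address as a secondary IP address is a matter for that server's own configuration.

```python
import ipaddress

LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")
RESERVED_LOOPBACK = ipaddress.ip_address("127.0.0.1")


def internal_secondary_addresses(count: int):
    """Return the first `count` loopback addresses after 127.0.0.1."""
    candidates = [RESERVED_LOOPBACK + offset for offset in range(1, count + 1)]
    # Every candidate must still belong to the loopback block.
    assert all(addr in LOOPBACK_NET for addr in candidates)
    return [str(addr) for addr in candidates]
```

Calling `internal_secondary_addresses(2)` yields 127.0.0.2 and 127.0.0.3, the two addresses mentioned above.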

Dedicated Web Server Accelerator

Figure 6 shows how a web server accelerator can be used as the front end for any web server, including IntranetWare-, Unix-, or NT-based web servers. In this configuration, the web server is moved to a private IP address known only to the web server accelerator. The accelerator is known by the advertised IP address for the web service. For example, the web server in Figure 6 may have originally had the IP address 137.64.1.1. But in this accelerated model, that public address is given to the web server accelerator, while the web server is provided a private address known only to the web server accelerator.

Figure 6: A web server with a dedicated web server accelerator.

In this case, BorderManager's web server accelerator offloads the majority of the web server's workload, allowing the web server to focus its resources on non-cacheable requests.

Mirrored Web Servers

Figure 7 shows how to build redundancy into your accelerated web servers to allow adequate fail-safe systems in mission-critical sites. One web server is the "master-out" server, containing the masters of all of your web content. The additional web servers are mirrors of the master--exact replicas of both web server and content.

In this configuration, all three web servers are known to the web server accelerator. The accelerator actively load balances its requests to the three web servers based on the number of outstanding requests on each web server. Thus, all three web servers are actively servicing requests and are ready to assume more workload in the event one of the web servers goes down.
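Balancing on outstanding requests can be sketched as follows. This is an illustrative model of the idea, not BorderManager's code: the `Balancer` class, its method names, and the server names in the usage note are assumptions.

```python
from typing import Dict, Iterable


class Balancer:
    """Send each request to the web server with the fewest outstanding requests."""

    def __init__(self, servers: Iterable[str]) -> None:
        self.outstanding: Dict[str, int] = {name: 0 for name in servers}

    def dispatch(self) -> str:
        # Ties go to the server listed first.
        server = min(self.outstanding, key=self.outstanding.get)
        self.outstanding[server] += 1
        return server

    def complete(self, server: str) -> None:
        self.outstanding[server] -= 1     # a response came back from this server

    def remove(self, server: str) -> None:
        self.outstanding.pop(server)      # server went down; the others absorb its load
```

With servers named `master`, `mirror1`, and `mirror2`, three back-to-back requests go to one server each. If `master` is then removed, later requests are shared between the two mirrors.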

Figure 7: Multiple mirrored web servers and a web server accelerator.

Novell's webmaster uses several applications to handle web server mirroring. The first is a component of Novell's web authoring and publishing system. New and updated elements are automatically pushed out to each of the web server mirrors at the same time they are placed on the master. This process also sends a request for each updated document to each of the accelerators and includes a "Pragma: no-cache" header in the request. This has the effect of forcing an immediate update of that document in each of the accelerators, resulting in near-perfect synchronization across all caches.

For fault-tolerance, another application runs every hour during off-peak periods. Its purpose is to review the entire site and make certain that each of the web server mirrors is completely synchronized with the master.
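A periodic consistency check of this kind can be sketched by comparing content digests. Novell's actual tool is not described in this article, so everything here is an assumption made for illustration: the function names, the use of SHA-256, and the representation of each server's content as a dictionary of path to bytes.

```python
import hashlib
from typing import Dict, List


def digest(content: bytes) -> str:
    """Fingerprint a document so copies can be compared cheaply."""
    return hashlib.sha256(content).hexdigest()


def out_of_sync(master: Dict[str, bytes], mirror: Dict[str, bytes]) -> List[str]:
    """Return paths that are missing, stale, or left over on the mirror."""
    paths = set(master) | set(mirror)
    return sorted(
        path
        for path in paths
        if path not in master                       # left over on the mirror
        or path not in mirror                       # missing from the mirror
        or digest(master[path]) != digest(mirror[path])  # content differs
    )
```

An empty result means the mirror matches the master. Any path returned would be re-pushed or deleted as appropriate.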

Multiple Unique Web Servers

Figure 8 shows how a single web server accelerator can accelerate multiple unique web servers. In this configuration, you may have a Sun system running Netscape's Enterprise Server and an Intel-architecture server running Microsoft's Internet Information Server. Any number of any type of web server can be accelerated by a single BorderManager web server accelerator.

Figure 8: Multiple unique web servers and a web server accelerator.

In this case, the web server accelerator caches the web content from all three web servers and responds to requests for all three servers from cache.

Multiple Web Server Accelerators

Figure 9 shows how to build redundancy into your web server acceleration infrastructure by using more than one accelerator.

Figure 9: A web server with multiple web server accelerators.

In this configuration, all three web server accelerators fill their caches from the single web server. Thus, all three accelerators are servicing requests for the web site and are ready to assume more workload in the event one of the accelerators goes down.

Novell's Mission-Critical Configuration

Figure 10 shows an example of a commercial web site configuration, with multiple web server mirrors and multiple accelerators. The example is Novell's own web site at www.novell.com, a mission-critical site that requires complete redundancy and fail-over protection to provide 24x7 service to a global user community. At over four million hits per day, Novell's site requires only one web server accelerator to handle the load. The two additional accelerators provide needed fault tolerance and fail-safe capabilities that a single accelerator cannot provide.

Figure 10: Novell's border implementation using multiple mirrored web servers and multiple web server accelerators for mission-critical redundancy.

Without the services of BorderManager, Novell's web servers would have been due for costly replacement. Now the site is fully redundant and performance is dramatically improved.

Each of the three accelerators handles over 1 million hits per day and caches approximately 43,000 of the web servers' content elements. Caching on these systems consumes 6.5MB of RAM and 1.4GB of disk storage. These are light storage requirements for a web site that contains a total of 201,377 elements and requires 10GB of disk storage. This 10:1 disk storage ratio suggests that 10 percent of Novell's site is receiving the majority of the site's traffic--an ideal scenario for caching. In fact, these results suggest that a site's accelerator typically requires only about one-tenth the storage capacity of the web server.
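The figures quoted above can be turned into fractions directly. The short calculation below uses only the numbers in the preceding paragraph; the variable names are the only additions. On these numbers the cached share comes out somewhat above one-tenth, so the one-tenth figure is best read as a rough rule of thumb.

```python
cached_elements = 43_000      # elements held by each accelerator
total_elements = 201_377      # elements on the web site
cache_disk_gb = 1.4           # disk used by each accelerator's cache
site_disk_gb = 10.0           # disk used by the web site

element_fraction = cached_elements / total_elements   # share of elements in cache
disk_fraction = cache_disk_gb / site_disk_gb           # share of storage in cache

print(f"{element_fraction:.1%} of elements, {disk_fraction:.0%} of disk")
# prints "21.4% of elements, 14% of disk"
```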

For more information on the LRU sitting time statistic, see "Tuning Cache with the NetWare 4 LRU Sitting Time Statistic" in the March 1995 issue of Novell AppNotes.

BorderManager also stores recently used content in IntranetWare's file cache. For experimentation purposes, the three accelerators have 128MB, 256MB, and 512MB of RAM. Our tuning process on these systems, using MONITOR's Cache Statistics and the LRU sitting time, suggests that the 128MB system is perfectly suited to its 1 million hit-per-day workload. During peak utilization, the LRU sitting time on this system hovers above 13 minutes. The LRU sitting times on the other two systems are one hour and ten hours respectively, suggesting that the additional memory is wasted on these systems.

The Smart Mirror: Accelerating a Remote Web Server

Traditional web server mirrors--exact replicas of a web server and its contents--are used by webmasters to solve several problems. One use of the web server mirror is to add redundancy to a site to protect against a single point of failure, as described in the Mirrored Web Servers and Novell's Mission-Critical Configuration examples above.

Web server mirrors can also be used to distribute access to remote sites. For example, Novell's Support web site (support.novell.com) is mirrored in Germany to support Novell's customers in Europe. However, this type of mirror is difficult to maintain because it is a high-maintenance web server with copies of live content in a remote location. Moreover, it requires additional technical resources and expensive hot-spare equipment.

In many cases, a web server accelerator is the best solution for this second scenario. An accelerator is easily configured to accelerate the remote master site (see Figure 11).

Figure 11: The "smart mirror" isn't a mirror at all, but a shared cache of the accelerated site.

In this case, the web server accelerator is installed at the remote location and left to run indefinitely. Maintenance on the system is low because the accelerator cannot run out of memory or disk space--it uses both, based on least-recently-used (LRU) algorithms that discard the least recently used content to make room for more recently used content. In this type of setup, a web server accelerator could potentially run unattended for several years.
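The reason an LRU-managed cache cannot fill up is easiest to see in code. The sketch below is a generic size-bounded LRU cache built on Python's `OrderedDict`. It illustrates the general technique, not the cache inside BorderManager, and the class and method names are assumptions.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Size-bounded cache that discards the least recently used objects first."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.used = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        if key in self._items:
            self.used -= len(self._items.pop(key))   # replace an older copy
        self._items[key] = value
        self.used += len(value)
        while self.used > self.capacity and self._items:
            _, evicted = self._items.popitem(last=False)  # drop the least recently used
            self.used -= len(evicted)
```

However many objects are added, `used` never ends above `capacity` after a `put`, which is why the store needs no manual cleanup. One simplification to note: an object larger than the whole capacity is simply evicted again straight away.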

Configuring for Maximum Scalability

For high-end configurations, BorderManager uses subnetting to squeeze the full capacity out of an Intel-architecture server. This capability is the true measure of scalability and an accurate measure of a network service's efficiency.

In a Fast Ethernet configuration, for example, BorderManager uses the full bandwidth of four adapters to transmit over 40 megabytes of payload per second (MBps).
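For a sense of scale, the payload figure can be compared with the raw line rate of the adapters. The calculation below assumes 100 megabits per second per Fast Ethernet adapter, counts one direction only, and takes a megabyte as one million bytes. These are assumptions of the sketch, not measurements from the article.

```python
FAST_ETHERNET_MBIT_PER_S = 100   # nominal line rate of one Fast Ethernet adapter
ADAPTERS = 4                     # adapters carrying traffic
PAYLOAD_MBYTES_PER_S = 40        # payload throughput quoted above

raw_mbytes_per_s = FAST_ETHERNET_MBIT_PER_S / 8 * ADAPTERS   # 12.5 MBps per adapter
payload_share = PAYLOAD_MBYTES_PER_S / raw_mbytes_per_s

print(f"raw capacity {raw_mbytes_per_s} MBps, payload share {payload_share:.0%}")
# prints "raw capacity 50.0 MBps, payload share 80%"
```

Under these assumptions, 40 MBps of payload is about 80 percent of the nominal capacity of four adapters, with the remainder left for protocol headers and framing.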

Figure 12: With five Intel EtherExpress PRO/100 Server Adapters operating in polled-mode (without interrupts), high-end Intel-architecture servers can scale to 40 MBps throughput.

There is only one adapter that provides this kind of scalability in BorderManager servers: an intelligent, Event Control Block (ECB)-aware adapter. Novell and Intel have jointly developed Intel's EtherExpress PRO/100 Server Adapter for this solution. This adapter uses an Intel i960, a 32-bit RISC microprocessor, to offload the majority of LAN channel processing from the server CPU onto the adapter. (For more information, see "The Benefits of Using Intelligent LAN Adapters in NetWare Servers" in the May 1995 issue of Novell AppNotes.)

By using this adapter in polled-mode (the default driver configuration in IntranetWare uniprocessor systems), you eliminate the overhead of interrupt-processing in the LAN channel. This can have a significant effect on the server when you consider that servers with interrupt-based LAN channels typically spend 40 percent of their CPU cycles in LAN adapter interrupt service routines (ISRs).

If you're aiming to build the fastest, most scalable BorderManager server, Novell recommends the Intel EtherExpress PRO/100 Server Adapter in a multiple LAN channel configuration. Even if you're only going to install a single network adapter, the EtherExpress PRO/100 provides the most efficient use of the server CPU.

For benchmarking purposes, we use systems that contain dual peer PCI busses, and in each configuration we balance the LAN adapters across both busses. System memory is also an important factor. SDRAM provides the highest throughput.

Conclusion

If you are looking for a cost-effective way to provide more efficient web services for your web servers, look no further than BorderManager from Novell. When configured as a web server accelerator, BorderManager's Fast Cache provides up to 10 times the scalability of traditional web servers and can offload up to 99 percent of a web server's workload. Moreover, BorderManager can provide these benefits on inexpensive Intel-architecture servers. Once you experience the performance boost that comes from caching in a web access environment, you'll never want to go back to those non-cached systems that gave rise to the nickname "World Wide Wait."

* Originally published in Novell AppNotes
