Web Server Acceleration with Novell's BorderManager: A Case Study of WWW.NOVELL.COM
Senior Research Engineer
Advanced Development Group
01 Aug 1997
Got a bad case of slow web servers? This AppNote presents a case study of how Novell's own web team is using BorderManager's web server acceleration capabilities to turbocharge the corporate web site and streamline administration.
- Introduction
- The Evolution of WWW.NOVELL.COM
- Content Caching: The Web's Equivalent of a Free Lunch
- Novell's Current Web Site Solution
- Summary
Introduction
Information businesses around the globe are rushing to hang their shingle on the World Wide Web (WWW). Most are counting on web presence to boost their name recognition and enhance their image as worldwide enterprises, with the hopes of forming new customer relationships and ultimately generating more sales. Many are telling phenomenal and surprising success stories. With over 1,000,000 hits per day, Novell's corporate web site, www.novell.com, is one of those success stories.
Novell's corporate web site is strategic to the company, providing visitors a view of the company and presenting up-to-the-minute information about our products, solutions, programs, and channel partners. It's also an interactive site where communication and feedback can take place between the visitor and Novell. Like many successful sites, www.novell.com handles a lot of traffic and is constantly growing. It is also susceptible to all of the traditional problems related to hosting a corporate web site, including Internet security, 24x7 reliability, and performance. In fact, there's quite an investment of time and money required just to keep a large and dynamic web site up and running.
As an early implementer, Novell quickly saw the need to simplify their web publishing operations and infrastructure. In the process, Novell's IT management and webmasters found a remarkably simple solution to the sometimes nightmarish problems of managing a corporate web site, such as trying to centrally manage a distributed content set while trying to increase performance for customers worldwide.
This AppNote narrates Novell's quest for a solution and describes their eventual use of Novell's BorderManager to redesign their web configuration and increase security and reliability, while boosting performance by an order of magnitude. This case study is targeted to IT executives, corporate webmasters, and network engineers who are interested in applying web server acceleration to their web infrastructures. Whether you're still waiting to enter the fray or you're already holding the Internet tiger by the tail, this case study will give you a blueprint to implement BorderManager so you can achieve similar results.
The Evolution of WWW.NOVELL.COM
Novell's story begins long before the Web was widely recognized as the next great frontier of corporate marketing. By the mid-1980s, Novell had developed a very large presence on CompuServe and delivered marketing, sales channel support, and customer support through CompuServe forums and FTP download sites.
Early Outsourcing
When the World Wide Web began to emerge as a viable alternative for delivering information, Novell outsourced all of its original web site development and hosting (with the exception of support.novell.com) to a third-party web development agency. All of the organizations within Novell who had used CompuServe facilities for everything from marketing to support jumped into the web publishing business. As a result, the initial www.novell.com was composed of a number of sites all served by separate web servers: a corp.novell.com, a netware.novell.com, an education.novell.com, and various other servers, each with its own content set, authoring system, and the normal headaches associated with having multiple systems when one would have been sufficient (see Figure 1).
Figure 1: Novell's outsourced WWW infrastructure.
Moving WWW.NOVELL.COM In-House
When Novell hired a corporate webmaster, he re-evaluated the design of the web site. The main objective of the new design was to provide Novell's company information and all the traditional content and functionality that Internet-savvy customers had come to expect from a corporate web site. Additionally, the design required 24x7 reliability for 100% uptime and adequate capacity for the rapidly expanding site.
To meet these overall success factors for the site, the new web team decided on two primary goals:
First, they had to centralize the content set. This would allow them to manage what would otherwise be duplicate copies of the same content set distributed across many servers located all over the world. Centralizing the web server resources would also ease their efforts to provide redundant systems for fail-over.
The second goal was to distribute access to the site internationally so that countries with poor communications infrastructures could have appreciably better access than they had with Novell's existing single point of presence.
The web team and IT management knew they had a pretty big challenge right from the beginning. Based on their experience with too many web servers, Novell came to believe that, for the purposes of managing a web site, the fewer web servers you have the better. So the web team formulated a plan to try to condense the contents and collate them under one server. After factoring in the need for 24x7 reliability, the design was expanded to include two additional servers (see Figure 2).
Figure 2: Novell's first in-house www.novell.com configuration.
The additional servers would be identical mirrors of the first and include a fail-over process in the event one of the servers crashed. Shortly thereafter, IS installed the three web servers, mirrored the contents, and placed them on the web. Because all three systems' IP numbers are registered as www.novell.com, the DNS resolution of www.novell.com randomly provides one of the three addresses to clients all over the world, allowing the three servers to handle requests on a pseudo-round-robin basis, with each handling approximately one-third of the load.
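The pseudo-round-robin effect of registering several addresses under one name can be illustrated with a small simulation. This is only a sketch under stated assumptions: the addresses are placeholders from the reserved documentation range, not Novell's real IP numbers, and `resolve` is a hypothetical stand-in for a DNS server that hands out one registered address at random.

```python
import random
from collections import Counter

# Placeholder addresses (reserved documentation range), not Novell's real ones.
WWW_ADDRESSES = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]


def resolve(rng: random.Random) -> str:
    """Hypothetical resolver: return one of the registered addresses at random."""
    return rng.choice(WWW_ADDRESSES)


def simulate(lookups: int, seed: int = 0) -> Counter:
    """Count how many simulated lookups land on each server."""
    rng = random.Random(seed)
    return Counter(resolve(rng) for _ in range(lookups))


if __name__ == "__main__":
    total = 30_000
    counts = simulate(total)
    for address, hits in sorted(counts.items()):
        print(f"{address}: {hits / total:.1%} of lookups")
```

Over many lookups each address receives close to one-third of the requests, although any single client may see the same address repeatedly.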
At first, the only page that went up on the new servers was a home page. The remainder of Novell's corporate content was still located on the web servers operated by the third-party agency. Because Novell didn't know what kind of traffic to expect on the new consolidated web site, the new site ran with just this single home page for several months. Serving a single HTML document from a set of three Compaq 4500 systems, each with four 100MHz Pentiums, 256MB of RAM, and 16GB RAID subsystems, was obviously an extreme case of overkill. But with all of the content on the outsourced web servers, this was the perfect opportunity to see what kind of traffic a single Novell web server would attract.
The communications infrastructure for www.novell.com started out with a single T1 line. Within several months Novell needed two T1 lines, and several months later needed four, and soon after six. It was clear that demand would stay ahead of any new infrastructure Novell could put in place.
At the same time, the Novell web team was dealing with the time-consuming effort of consolidating the content from multiple outsourced servers. Sadly, the existing content couldn't simply be copied en masse onto the new server because of considerable name space overlap in the top-level directories. There was no way to resolve the mismatches without completely reorganizing the content, a tremendously painful, time-consuming, and error-prone process.
Performance Problems and a Short-Term Solution
During the web server consolidation, Novell's content became evenly split between the new in-house master server and the outsourced servers. The popularity of the site was also growing and overrunning the communication lines. It was at this point that Novell began to see system utilization on corp.novell.com that was unacceptable. There were two possible solutions:
- Buy a larger server
- Balance the load by breaking up the consolidated content set and moving the content onto multiple servers
In an attempt to overcome burgeoning web server workloads and bandwidth problems in Provo, Novell's support.novell.com web team (separate from the corporate web team) began to distribute mirrors of the support web servers to other parts of the world. Web servers were introduced in Germany and the United Kingdom that initially were independent of the Provo server system (see Figure 3). Their contents were manually generated replicas of the most popular portions of the corporate server.
Figure 3: An interim solution with several drawbacks.
Although these systems improved access for customers in those regions, the mirrors quickly became labor intensive because the content wasn't automatically synchronized. A better long-term solution had to be found.
Diverging Paths
At this point in the redesign process, the solutions to Novell's two original goals began to take diverging paths. By consolidating the content set onto one mirrored set of central web servers, Novell had simplified the process of making the whole site redundant. They had also improved the ability to manage the site because the content sets were now all in one place rather than distributed throughout the network. But the performance of both the web servers and the Internet connection was unable to keep up with demand. And now the short-term solution used by the support.novell.com team, placing additional mirrored sites around the globe, was adding enormous complexity and resource burdens to an already overloaded web team.
Content Caching: The Web's Equivalent of a Free Lunch
While searching for ways to speed up these systems, Novell's corporate webmaster came across a web site that described a new type of WWW content caching software called Harvest and another called Squid. The Harvest cache was designed by the Computer Science Departments at the University of Southern California and the University of Colorado - Boulder (http://excalibur.usc.edu/cache-html/cache.html), and the Squid cache was developed at the National Laboratory for Applied Network Research (NLANR - http://squid.nlanr.net). Both provided something called web server acceleration to reduce the web's burden on popular web servers. Several organizations with large Internet sites were successfully using this technology to reduce costs and keep pace with the popularity of their sites. Beyond easing the load on web servers, caching could also save bandwidth, increase the speed of service requests, and protect the web servers from clients that generate repeated requests.
When a Harvest or Squid cache is placed in front of a web server, the cache pretends to be the web server and forwards requests that aren't cached to the real web server. Requests for cacheable objects, such as HTML pages and graphics, are served by the cache, providing web server acceleration, while requests for non-cacheable objects, such as CGI-BIN programs, are served by the true web server. Because the majority of web sites are made up of over 95% cacheable objects, web server acceleration can dramatically reduce a web server's workload and increase the performance of a web server by an order of magnitude, all without degrading the site's performance.
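The request flow described above can be sketched in a few lines. This is a toy model, not the Harvest, Squid, or BorderManager implementation: the `Accelerator` class, the `demo_origin` function, and the rule that anything under `/cgi-bin/` is non-cacheable are all assumptions made for illustration.

```python
from typing import Callable, Dict


def is_cacheable(path: str) -> bool:
    """Assumed rule: anything under /cgi-bin/ is dynamic and never cached."""
    return not path.startswith("/cgi-bin/")


class Accelerator:
    """Toy cache that sits in front of an origin web server."""

    def __init__(self, origin: Callable[[str], bytes]) -> None:
        self.origin = origin
        self.cache: Dict[str, bytes] = {}
        self.origin_requests = 0  # requests that reached the real web server

    def get(self, path: str) -> bytes:
        if is_cacheable(path) and path in self.cache:
            return self.cache[path]  # served by the cache
        # Cache miss or non-cacheable object: forward to the real web server.
        self.origin_requests += 1
        body = self.origin(path)
        if is_cacheable(path):
            self.cache[path] = body
        return body


def demo_origin(path: str) -> bytes:
    """Hypothetical origin server that returns a body for any path."""
    return f"content of {path}".encode()


if __name__ == "__main__":
    accelerator = Accelerator(demo_origin)
    for _ in range(10):
        accelerator.get("/index.html")
        accelerator.get("/cgi-bin/search")
    print("requests that reached the origin:", accelerator.origin_requests)
```

In this sketch the static page reaches the origin only on its first request, while every CGI request is passed through, which is why the share of cacheable objects on a site largely determines how much load the accelerator removes.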
To Novell's webmaster, this ability to accelerate a web site sounded like a free lunch. He downloaded the source code for the Squid cache, which was available under the GNU General Public License, compiled it, and placed the cache software in front of the web server process on one of the outsourced web servers. The result: the load on the web server process dropped dramatically.
Achieving Webmaster Nirvana
Novell's webmaster wondered if he would gain the same benefit by running the Squid cache on a different system at some other location. So the webmaster installed a Squid cache in his office in San Jose and configured it to cache the www.novell.com server located in Provo. He hoped to gain cached access to the corporate web site which resided on a loaded server, behind a loaded T1 connection and a loaded firewall. The result was instantaneous cached access to the Provo content and an end to his long response times.
Novell's webmaster couldn't believe what he was seeing. He had cached his working set of www.novell.com (Novell's entire corporate content, with the exception of the one home page) on a 486/66 PC with 16MB of RAM allocated to the cache and 100MB of disk space. This small system made an impressive difference in the load on a very expensive web server. If that was all it cost to cut the load on his server in half, then he had found webmaster nirvana.
It was at that point, about a year into the lifetime of the site, that the web team realized the two goals of central management and distributed performance weren't really at odds with each other. They could cache their centralized content set and distribute those caches in areas of high traffic and achieve the second goal of globally distributing the content set without pushing high cost, high maintenance web server mirrors to the four corners of the globe.
Novell had achieved their first goal of central management by condensing the contents into the set of the three mirrored servers. And now, they achieved their second goal by distributing caches of the central site around the world. These successful experiences helped Novell's web team realize that caching was the best way to achieve these goals.
A Better Cache
About the same time that Novell's corporate webmaster was discovering the power of web caches, Novell's Advanced Development Group began to develop an Internet object cache of their own. Drawing on years of caching research and development for NetWare and IntranetWare, Novell designed an Internet object cache that provides both proxy caching and web server acceleration with performance and scalability unmatched by the Harvest and Squid solutions. In fact, the Novell cache can service close to 4000 connections per second and deliver up to 32MB of payload per second, which is up to ten times the capacity of existing Unix and NT solutions. And these results were produced on a uniprocessor Pentium Pro system. So the Squid caches originally used by Novell's corporate web team have now been replaced by Novell's Internet object cache, which is included in Novell's BorderManager suite of Internet technologies.
So that's what led Novell's web team to the point where they are right now. Novell has substantial content along with an authoring system that enables any authorized Novell employee anywhere in the world to submit their content to the centralized server. And their centralized web servers are now the target of BorderManager caches placed around the world to speed worldwide access.
Novell's Current Web Site Solution
The results of Novell's early experimentation with caching and eventual installation of Novell's own BorderManager technologies are still hard to believe. It's not very often that elegant solutions are as simple and inexpensive as BorderManager's web server acceleration capabilities.
The addition of BorderManager has allowed Novell to make a dramatic change to their Internet infrastructure. Internet web servers have traditionally been placed between an organization's inner and outer firewalls, which leaves the servers exposed to the world, untrusted by the organization, and difficult for web publishers to manage through the firewall. Using BorderManager, this traditional configuration can be redrawn to your advantage.
Today, the only www.novell.com systems located within the DMZ (the "no man's land" in front of Novell's inner firewall) are two BorderManager servers, which are configured as web server accelerators. These two servers handle all requests for cacheable objects, including all HTML pages and graphics files. The result is that 90% of the total web traffic aimed at www.novell.com has been offloaded from the web servers.
High Availability
From a performance perspective, one BorderManager web server accelerator is sufficient to handle the equivalent of five T3 lines (32MB of payload per second), but Novell's high-availability web site justifies additional equipment to provide sufficient fail-over mechanisms in the case of failure. So Novell's web server accelerators operate on two servers whose IP numbers are registered as www.novell.com. Both actively service requests and are available if one of the servers fails. A third BorderManager server is configured and ready to run as a hot standby system. These precautions eliminate the single point of failure in the system and also protect against conditions that might lead to a single point of failure.
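As a rough check on the T3 comparison above, the quoted payload rate can be converted to line equivalents. The sketch assumes the nominal T3 (DS3) line rate of 44.736 Mbit/s, treats 1MB as 10^6 bytes, and ignores framing and protocol overhead, so the result is an approximation rather than a measured figure.

```python
# Nominal T3 (DS3) line rate in megabits per second.
T3_MBIT_PER_S = 44.736

# Payload rate quoted in the text, in megabytes per second.
PAYLOAD_MBYTE_PER_S = 32

# Convert bytes to bits, then express the rate as a number of T3 lines.
payload_mbit_per_s = PAYLOAD_MBYTE_PER_S * 8
t3_equivalents = payload_mbit_per_s / T3_MBIT_PER_S

if __name__ == "__main__":
    print(f"{payload_mbit_per_s} Mbit/s is about {t3_equivalents:.1f} T3 lines")
```

Under these assumptions the raw figure comes out between five and six T3 lines, which is consistent with the text's rounded "five T3 lines" once real-world overhead on each line is allowed for.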
Improved Firewall Security and Web Server Access
The web servers, which were originally located in the DMZ, have been moved inside Novell's firewall, a change that both tightens the security of the firewall and improves the web authors' access to the web servers (see Figure 4).
The web servers are also configured with two servers actively servicing requests from the web server accelerators and one hot standby system. One of BorderManager's features allows it to access multiple web servers using a form of load balancing: a round-robin process based on the number of outstanding requests on each web server connection. So the two web servers are actively servicing requests, making each a perfect fail-over system for the other.
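One way to picture load balancing based on outstanding requests is the sketch below. It is an illustration only, not BorderManager's actual algorithm or interface: the `Balancer` class, its method names, and the server names `web1` and `web2` are hypothetical, and ties are broken by simply taking the first server listed.

```python
from typing import Dict, Iterable


class Balancer:
    """Toy balancer that prefers the server with the fewest outstanding requests."""

    def __init__(self, servers: Iterable[str]) -> None:
        self.outstanding: Dict[str, int] = {name: 0 for name in servers}

    def acquire(self) -> str:
        """Pick a server for a new request and count it as outstanding."""
        name = min(self.outstanding, key=self.outstanding.get)
        self.outstanding[name] += 1
        return name

    def release(self, name: str) -> None:
        """Record that a request sent to this server has completed."""
        self.outstanding[name] -= 1

    def mark_failed(self, name: str) -> None:
        """Stop sending requests to a server that is no longer responding."""
        self.outstanding.pop(name, None)


if __name__ == "__main__":
    balancer = Balancer(["web1", "web2"])
    first = balancer.acquire()
    second = balancer.acquire()
    print(first, second)
    balancer.mark_failed("web1")
    print(balancer.acquire())
```

Because a failed server is simply removed from the pool, the remaining server absorbs all new requests, which mirrors the fail-over role the two active web servers play for each other.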
Since the web servers are inside the firewall, Novell's web authors now have no reason to make connections through the firewall to get to the web servers. This greatly simplifies Novell's firewall construction because the firewall need only allow the two web server accelerators through to the web servers on private connections. Content development and management processes are also simplified because the web servers are located inside the corporate firewall. For instance, CGI applications were previously forced to go through a CGI-proxy to service requests on a corporate database behind the firewall. Now, with the web servers behind the firewall, the need for that CGI-proxy goes away because the web servers have a direct connection to the database server.
Figure 4: Novell's new Internet infrastructure using BorderManager and web server acceleration.
Hardware and Software Configurations
Novell's two BorderManager web server accelerators live inside Novell's DMZ on a Bay Networks 28000-series 10/100Mbps Ethernet switch. Two Cisco 7000-series routers confine the DMZ. The BorderManager hardware platform is an Intel 200MHz Pentium Pro with 256MB of RAM, a 16GB disk subsystem, and an Intel EtherExpress PRO/100 Server Adapter. Software includes IntranetWare 4.11 and the public release of BorderManager 1.0.
The web servers are rack-mounted Compaq ProLiant 4500R systems. Each includes four 100MHz Pentium processors, 256MB RAM, and 16GB of disk storage. All three systems are running UnixWare v2.1.2 and Apache v1.2x web server software. Novell has been experimenting with several commercial web server products, but plans to switch over to Novonyx's port of Netscape Enterprise Server as soon as that product is ready for release.
New System Benefits
This technology has given Novell considerable flexibility in terms of bandwidth, delivery, maintenance, and cost. The following are some of the benefits Novell can demonstrate as a direct result of using BorderManager and web server acceleration.
Enhanced Company Image. It's one thing to say you're a global company; it's another to actually be a global company. Although every web site on the Internet is accessible around the globe, the reality is that international bandwidth and performance can prevent your target market from making use of your site without significant pain. Using the proxy cache component of BorderManager, Novell is able to distribute caches of its site into areas where bandwidth is a problem. Since this is done with low-cost PC systems and very little administrative burden, Novell's new "globally friendly" appearance is a real win.
Simplified Management and Control. Novell is happy with the web server accelerators because they're simple. All of the accelerator caches are essentially 100% hands-off autonomous systems. Nobody ever looks at them for any reason . . . they just run. It's an ideal system from an administrative point of view because Novell can distribute the BorderManager servers without the additional requirement of putting technical expertise at every location. It's set up and configured one time, and from then on it simply delivers. Contrast this with Novell's main servers that require constant attention.
With these simple and low-cost accelerators in place, Novell is free to maintain its strategy of centralizing the content set even while experiencing overwhelming growth and pressure from international offices to host their own web sites. In fact, the effectiveness of centralizing the content management and distributing that content via BorderManager servers has led early detractors of the plan to change their minds. With these object caches in place, the international offices couldn't tell they were browsing content in the United States.
This newfound success in reining in international content developers led the web team to create an authoring system that directed the entire flow of new content from around the world through a single processing system in Provo. Thus the BorderManager caches allowed Novell to manage the content creation centrally while distributing access in a way that complemented their worldwide organization. This also made Novell more competitive because they're not stuck with a single, high-cost solution for increasing access to their site. They have a more flexible, lower-cost solution.
In Novell's experience, cache configuration and administration have been an order of magnitude less burdensome than replicating the web servers with mirrored web servers would have been. Novell also believes they could not replace any of the BorderManager caches with standalone mirrored systems at this point because the maintenance component would be excessive compared to what the caches require.
Stellar Performance and Scalability. BorderManager's performance characteristics are providing Novell the flexibility they need to keep their main web site centralized and still meet the needs of content authors around the world. In their internal marketing pitch to authors, they can guarantee them the benefit of globally cached distribution of their content without requiring a significant change in the way they construct their content. The content providers are ecstatic.
Novell is also taking advantage of BorderManager's scalability. With the addition of up to four Intel EtherExpress PRO/100 Server adapters running in polled mode, Novell's existing BorderManager hardware and software combination is capable of handling close to 4000 connections per second with a sustained throughput of up to 32 MB per second. That's not bad, especially when you consider www.novell.com's three web servers were handling much less than 100 connections per second to make the 1,000,000 hits-per-day mark.
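The comparison between the site's daily hit count and the accelerator's quoted capacity can be made concrete with a little arithmetic. This sketch assumes that hits are spread evenly over the day and that each hit corresponds to one connection; real traffic is burstier, so the peak rate would be higher than the average computed here.

```python
# Figures quoted in the text.
HITS_PER_DAY = 1_000_000
ACCELERATOR_CONNECTIONS_PER_S = 4000

SECONDS_PER_DAY = 24 * 60 * 60

# Average request rate, assuming hits are spread evenly across the day.
average_hits_per_s = HITS_PER_DAY / SECONDS_PER_DAY

# How many times the average rate fits into the quoted accelerator capacity.
headroom = ACCELERATOR_CONNECTIONS_PER_S / average_hits_per_s

if __name__ == "__main__":
    print(f"average load: {average_hits_per_s:.1f} hits per second")
    print(f"quoted capacity is about {headroom:.0f} times the average load")
```

The average works out to roughly a dozen hits per second, which agrees with the text's "much less than 100 connections per second" and shows how much headroom the quoted capacity leaves.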
Lower Costs. By placing a BorderManager web server accelerator in front of Novell's web servers, Novell is relieving their expensive web server systems of more than 90% of their workload. Novell has saved a small fortune by avoiding an otherwise natural upgrade process on those systems. In fact, all three were soon to be replaced by three Sun Enterprise 3000 servers. Now the three web servers will remain on Compaq equipment.
Novell has also reduced costs by placing BorderManager at sites around the world. These caches are accelerating 90% of www.novell.com's requests for customers in those areas. This means those requests aren't traveling to the central web site in Provo and don't require bandwidth on Novell's web servers or their connection to the Internet.
New Business Opportunities. In terms of creating new business opportunities, Novell's new web architecture allows its strategic partners to cache Novell's entire web site for their internal benefit and the benefit of their customers. For example, a Novell reseller might cache Novell's web site at their location using BorderManager web server accelerators. After downloading sales materials and software once, they would be able to distribute those cached products in a rapid fashion to their customers. Another example is the ability of an organization that relies heavily on Novell products and support to cache Novell's web site at their location. They would have much faster access to the kind of information they need to effectively run their organization.
The tremendous benefit in these situations is that BorderManager caches at the end-points are inexpensive and easy to set up and maintain. Novell's infrastructure doesn't require any special configuration and complements existing Unix and Windows NT systems.
Perhaps best of all, Novell's authoring system is capable of expiring BorderManager cache content. If these caches located at Novell's strategic partners' sites are known to Novell, Novell can include their caches in the authoring system and guarantee the cache owners that their contents are always up to date. The authoring system takes on the burden of expiring the contents that are being modified on the main web servers.
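The idea of an authoring system that expires content in known caches can be sketched as follows. The article does not describe the actual mechanism, so everything here is hypothetical: the `PartnerCache` and `AuthoringSystem` classes, their method names, and the in-memory dictionaries stand in for whatever interface the real systems use.

```python
from typing import Dict, List


class PartnerCache:
    """Toy stand-in for a remote cache (not BorderManager's real interface)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: Dict[str, str] = {}

    def expire(self, path: str) -> None:
        # Drop the cached copy so the next request refetches it from the origin.
        self.objects.pop(path, None)


class AuthoringSystem:
    """Publishes content centrally and expires it in every known cache."""

    def __init__(self) -> None:
        self.origin: Dict[str, str] = {}
        self.known_caches: List[PartnerCache] = []

    def register(self, cache: PartnerCache) -> None:
        self.known_caches.append(cache)

    def publish(self, path: str, body: str) -> None:
        self.origin[path] = body
        for cache in self.known_caches:
            cache.expire(path)


if __name__ == "__main__":
    authoring = AuthoringSystem()
    reseller = PartnerCache("reseller")
    authoring.register(reseller)
    reseller.objects["/products.html"] = "old page"
    authoring.publish("/products.html", "new page")
    print("stale copy still cached:", "/products.html" in reseller.objects)
```

Only the modified path is expired, so unchanged objects stay cached and the partner's cache keeps its benefit while never serving a stale copy of updated content.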
Summary
As Novell's web site was being distributed via BorderManager caches around the world, the web team began referring to the process as "virtualizing" the web site. By virtualizing the entire web site, Novell makes access to the site more convenient for a wider range of people and believes that will result in increased sales and an enhanced perception of the Novell name. Through these experiences, Novell's web team now believes it is beginning to realize the true potential of the World Wide Web.
* Originally published in Novell AppNotes