An Introduction to Workload Characterization
Senior Consultant
Systems Engineering Division
01 May 1991
This AppNote introduces the science of workload characterization as a crucial component of any performance analysis process. It is the first in a series that will detail the tools and uses of LAN workload characterization, especially in relation to network design, optimization, and benchmarking.
- The Blessing and Curse of Network Transparency
- Workload Characterization Defined
- An Example Client Workload
- An Example Server Workload
- Conclusions
- Your Role in Future Workload Characterization Research
The Blessing and Curse of Network Transparency
The transparency of network traffic, or workload, can be a blessing and a curse. Your ability, as the engineer, to hide the complexity of media access protocols and data communication apparatus from the user is the blessing. Your inability to see inside the network and ascertain the amount and types of workload being generated by network users, however, can be a curse. Misconceptions abound.
For instance, word processor users, most of whose computing is performed at the local workstation, often believe their typing speed has a significant impact on the aggregate network workload. Nothing could be further from the truth. On the other hand, some seemingly innocent network configurations or events can create substantial fluctuations in workload and, therefore, response times.
The fact that local area networks tend to be organic in nature only adds complexity to these scenarios. Users often purchase and install their own workstations and network gear. Small networks sprout here and there without any perceptible impact - initially. Finally, the ease of interconnection often produces internet permutations that occur gradually, even imperceptibly, until major problems with response times and manageability surface.
Amidst these kinds of misconceptions and complexity, you're asked to perform all sorts of magic such as making accurate decisions about the impact of moves and changes, predicting network growth, and optimizing servers - each a close cousin to guesswork without a quantitative understanding of the workload in question.
Workload characterization provides a sound understanding of how your clientele is using their network. The resulting information can be used to simplify troubleshooting tasks and help you make decisions regarding network design and optimization. In addition, characterization data, placed in the right format, produces excellent persuasion tools for management, even non-technical management. And the data reflect the workload of the users rather than your standalone personal opinion. The combination of the characterization data and your opinion equals a professional opinion from which a sound recommendation can be formed. And that makes you a better source of information. That's my goal.
This AppNote is the first in a series that will detail the tools and uses of LAN workload characterization, especially in relation to network design, optimization, and benchmarking.
Workload Characterization Defined
The dictionary defines workload as "the amount of work assigned to, or done by, a worker or unit of workers in a given time period" (The American Heritage Dictionary, 2nd Edition).
Within the confines of a network, workload is the amount of work assigned to, or done by, a client, workgroup, server, or internetwork in a given time period. Therefore, workload characterization is the science that observes, identifies and explains the phenomena of work in a manner that simplifies your understanding of how the network is being used.
With the use of graphs and descriptive metrics, you can begin to collect useful historical information concerning your networks. This historical information, describing the volume, intensity, and patterns of workload created by your clientele, is the only accurate foundation for performance evaluations of any kind.
Depth Perception
One of the most valuable benefits of workload characterization is the immediate perspective gained from a simple graph. I'll never forget the first time I picked up a book about the discovery and exploration of the lost ship Titanic. The scale drawing in Figure 1 was a startling example to me of how one picture can describe something I had no conceptual understanding of - the true depth of the ocean. Now, after seeing the drawing, I have a notion of ocean depth as it relates to the size of a large ocean liner.
Workload characterization provides the equivalent view of a network - scale drawings detailing the total bandwidth of the network, the significance of users, and the bandwidth requests of a particular workload. All this, displayed for you graphically, makes the transparent workload just as visible and meaningful as a scale drawing of the Titanic sitting at the bottom of the ocean.
Levels of Characterization
In order to make sense of the billions of bits and bytes being transmitted inside a network cable plant, you must look at the workload from at least three different perspectives, or levels. Not everyone is interested in all of the information you can gather, and sometimes the sheer volume of traffic at the lowest level of workload can be misleading.
Figure 1: A Scale Drawing of the Sunken Titanic.
Dr. Domenico Ferrari, a professor in the Computer Science Division of the University of California at Berkeley, is one of the foremost researchers in the area of performance analysis and workload characterization. He uses a three-level diagram similar to the one in Figure 2 to classify workload characteristics.
Figure 2: Levels of Workload Characterization.
Physical Level Characterization
The physical level is the computer's view of the workload. At this layer, you can look at bytes, the quantity of bytes, and their frequency. You can also look at packets, the size of packets, and their quantity and frequency.
Questions you might ask at this level include:
Is the network cable plant under- or over-utilized?
Is there a fair level of access given to all clients?
Is the network free of significant error conditions?
What levels of housekeeping traffic are required to keep the protocol and network services operational?
Although several media access protocols, such as Ethernet, provide a promiscuous mode for network analysis with home-grown applications, a protocol analyzer is the most accurate source of information at this level.
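The physical-level metrics described above can be sketched in a few lines of Python. This is only an illustration, not a tool from this AppNote: the record layout (a timestamp in seconds and a packet size in bytes), the function name physical_summary, and the default 10 Mbit/s bandwidth are all assumptions, and the calculation ignores framing overhead such as preambles and interframe gaps.

```python
from collections import defaultdict


def physical_summary(packets, interval_s=60.0, bandwidth_bps=10_000_000):
    """Summarize (timestamp_seconds, size_bytes) records at the physical level.

    bandwidth_bps is an assumed media speed; set it to match your cable plant.
    """
    if not packets:
        return {"packets": 0, "bytes": 0, "mean_size": 0.0, "utilization": {}}

    total_bytes = sum(size for _, size in packets)

    # Group bytes into fixed-width time buckets (bucket 0 starts at t = 0).
    bytes_per_bucket = defaultdict(int)
    for timestamp, size in packets:
        bytes_per_bucket[int(timestamp // interval_s)] += size

    # Percent utilization = bits carried / bits the media could carry.
    capacity_bits = bandwidth_bps * interval_s
    utilization = {
        bucket: 100.0 * nbytes * 8 / capacity_bits
        for bucket, nbytes in sorted(bytes_per_bucket.items())
    }

    return {
        "packets": len(packets),
        "bytes": total_bytes,
        "mean_size": total_bytes / len(packets),
        "utilization": utilization,
    }
```

Plotting the utilization values against their bucket numbers gives a percent-utilization-over-time view of the kind used in the example characterizations later in this AppNote.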
Logical Level Characterization
The logical level is the programmers' view of workload. Programmers aren't so concerned at this level with bits and bytes as they are with the individual response times and sequencing of remote procedure calls made by the application. Here, they may look at the packet's contents and the response times of specific request/response sequences.
Questions most often asked by programmers at this layer might include:
Does the packet contain a service request or a response?
What kind of request or response is it?
How quickly does the requested service respond to a request?
What sequence of network events occurs when high-level API calls are made by the application?
Can I fine-tune these sequences?
Fine tuning is possible to a degree that is often discernible by the user. So an analysis at this level can be rewarding for the programmer whose application will be using the network frequently.
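As a rough illustration of a logical-level measurement, the sketch below pairs each request with its response and reports the elapsed time. The record layout (timestamp, kind, sequence number) and the function name response_times are hypothetical; a real trace would have to be decoded from whatever your protocol analyzer exports.

```python
def response_times(events):
    """Pair requests with responses by sequence number.

    events: iterable of (timestamp_seconds, kind, sequence) tuples, where
    kind is "request" or "response". Returns {sequence: elapsed_seconds}.
    Responses with no matching request are ignored.
    """
    pending = {}   # sequence -> time the request was seen
    elapsed = {}   # sequence -> response time in seconds
    for timestamp, kind, sequence in sorted(events):
        if kind == "request":
            pending[sequence] = timestamp
        elif kind == "response" and sequence in pending:
            elapsed[sequence] = timestamp - pending.pop(sequence)
    return elapsed
```

Sorting the resulting times, or grouping them by request type, is one way to find the sequences most worth fine-tuning.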
Functional Level Characterization
The functional layer is the users' view of workload. This layer is made up of a series of transactions such as reports, queries, sorts, and searches. Changes in transaction response times are immediately apparent to the user at this level and are usually elevated to a high priority by management when the users' expectations aren't being met.
Questions you might ask at this level include:
Why is my word processor creating a temporary file before it writes the file to the print queue?
Is my network's housekeeping traffic consuming enough bandwidth to be the cause of my performance problems?
Will a 386-based workstation significantly improve my network response time versus the use of my current 286-based workstation?
This level is more application-dependent and less system-dependent than the other two levels. Performance at this level varies widely based upon your selection of software, the physical and logical architecture of the network, and configurable software settings.
An Example Client Workload
The primary contributor to network workload is the user, or network client. The term client may be better than user in this case because a client can represent applications that are human-driven, as well as applications that run unabated, such as batch-driven processes, reports, sorts, and software compile and link routines.
A network client generates workload by making requests in a client/server relationship with a server, or by communicating with other clients and services in a peer-to-peer relationship. While both types of traffic are found in the NetWare environment, the majority of traffic is made up of file and print requests of the client/server type.
The example workload in Figure 3 represents the workload produced by a NetWare client. In this case, the client is a Legal Secretary named Laura Maxwell. Laura works in a small legal office with four attorneys. She uses WordPerfect v5.1 running in the Microsoft Windows environment. Within the various classifications of word processor users--personal, professional, legal, and desktop publishing--Laura, with her legal responsibilities, is on the heavier end of the workload scale.
Figure 3: Workload Characterization for Laura Maxwell, Legal Secretary.
The x-axis, in this case, represents four hours of work time, from 8 a.m. to 12 p.m. on a Tuesday morning. The y-axis represents, at its peak, 100 percent utilization of the media.
Looking at the graph of overall throughput of Laura's workstation, you could say Laura's use of the network was light. This doesn't mean Laura wasn't working. It means that most of Laura's work was performed at her workstation. This is due to the inherent client/server architecture of WordPerfect - a standalone network application. Only file import, export, and file search activity made significant use of the network's resources.
This is a much different picture than you might have drawn if you were to watch Laura work or were given only Laura's throughput statistics. The graph represents 30,705 requests and 30,705 responses from the server for a total of 61,410 packets, weighing in at just over 16MB of raw data. Surprised? You should be. Most word processing users fall below a legal secretary's utilization figures, making these figures all the more interesting if any of your clients are WP users.
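To see why such large totals still amount to a light workload, it helps to spread them over the four-hour window. The sketch below uses the packet count quoted above and takes "just over 16MB" as 16 x 2**20 bytes. The 10 Mbit/s media speed is an assumption, since this AppNote does not state the speed of Laura's network, so treat the resulting percentage as illustrative only.

```python
def average_load(total_bytes, packets, duration_s, bandwidth_bps):
    """Return (mean packet size in bytes, mean utilization in percent)."""
    mean_packet_bytes = total_bytes / packets
    mean_utilization_pct = 100.0 * total_bytes * 8 / (bandwidth_bps * duration_s)
    return mean_packet_bytes, mean_utilization_pct


# Figures from the text: 30,705 requests plus 30,705 responses over four hours.
LAURA_PACKETS = 30_705 + 30_705
LAURA_BYTES = 16 * 2**20            # "just over 16MB", rounded down
FOUR_HOURS_S = 4 * 60 * 60
ASSUMED_BANDWIDTH_BPS = 10_000_000  # assumption: not stated in the text

size, utilization = average_load(
    LAURA_BYTES, LAURA_PACKETS, FOUR_HOURS_S, ASSUMED_BANDWIDTH_BPS
)
print(f"mean packet size: {size:.0f} bytes")
print(f"mean utilization: {utilization:.3f} percent")
```

Under these assumptions the average packet is a few hundred bytes and the average utilization is a small fraction of one percent, which is consistent with the light usage the graph shows.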
The one graph of Laura's workload and the workload numbers listed above represent just the tip of the iceberg. But you get the idea of how useful this information can be when it is simplified with an appropriate context and scale.
One obvious conclusion is that users like Laura do not use a significant portion of network bandwidth. So what does the combined workload of over 100 users look like? For the answer, let's take a look at a characterization of a NetWare server in action.
An Example Server Workload
The workload created by an individual user, or client, on a network is just one component of the total workload found on a NetWare network. Not only are other clients vying for the same network services, but a continual stream of network management functions run in the background. These housekeeping functions, although minimal, play an important role in keeping the network running. A few general categories of housekeeping functions include server advertising, routing information, logins and logouts, connection management, and resource accounting.
From a NetWare server's perspective, workload includes the requests generated by all of the server's logged-in clients, the server's responses to those clients, and all of the housekeeping functions mentioned above. In addition, multiple LAN connections utilizing the server's routing services can add workload made up of packets from internet connections passing through the server's router. All of these kinds of workload encountered by a server are addressed to that specific server. Requests include the server's address as the destination. Replies include the server's address as the source. Packets using the server as a router also include the server's address as their destination.
The only other type of traffic not mentioned above is broadcast traffic with a destination address of FFFFFF FFFFFF. All of these types of traffic (broadcasts included) are monitored and graphed in the workload characterization below.
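The addressing rules above suggest a simple way to sort captured packets from the server's point of view. The following is a minimal sketch: the function names, the category labels, and the sample addresses are hypothetical, and because requests and routed packets both carry the server's address as their destination, address inspection alone cannot tell those two apart.

```python
from collections import Counter

BROADCAST = "FFFFFFFFFFFF"


def _normalize(address):
    """Strip spaces and colons so 'FFFFFF FFFFFF' compares equal to BROADCAST."""
    return address.replace(" ", "").replace(":", "").upper()


def classify(source, destination, server):
    """Label one packet relative to the server being characterized."""
    src, dst, srv = _normalize(source), _normalize(destination), _normalize(server)
    if dst == BROADCAST:
        return "broadcast"
    if dst == srv:
        return "to_server"      # a client request or a packet to be routed
    if src == srv:
        return "from_server"    # a reply from the server
    return "other"              # not part of this server's workload


def tally(packets, server):
    """Count packets per category; packets is a list of (source, destination)."""
    return Counter(classify(src, dst, server) for src, dst in packets)
```

Counting the categories over fixed time intervals gives the raw numbers behind a server characterization graph.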
Figure 4 represents the workload received and transmitted by a mission-critical server at Novell's Provo development center, called PRV-SRS. The server is dedicated to an application written in Dataflex that manages Novell's customer service and technical support database. At the time of this characterization, the database residing on PRV-SRS totaled 4GB and the number of active users connected to the server remained steady at 132.
The x-axis, in this case, represents twelve hours of work time, from 6 a.m. to 6 p.m. on a Wednesday. The y-axis represents, at its peak, 100 percent utilization of the media.
Looking at the graph, you immediately recognize a work pattern characteristic of time and motion studies in any industry. Generally, people accomplish the most in the morning hours and never quite hit the same levels in the afternoon. These patterns are not only interesting but point out peak and valley periods of usage that can be helpful while troubleshooting or when looking for the most appropriate time to make untimely network repairs during work hours.
Figure 4: Workload Characterization of Server PRV-SRS.
Also, you can tell that PRV-SRS users are not utilizing the full bandwidth of the cable plant. This is not to say that they should, but that there is room within the full bandwidth of the media for additional PRV-SRS users or additional servers and their users.
As with Laura's workload above, you might have come to another conclusion about PRV-SRS's workload if you only had the statistics to go on. For instance, PRV-SRS crunched 4,780,529,270 bytes of data, served 9,234,525 requests during that time period, and transmitted the same number of replies for a total of 18,467,182 packets, weighing in at just over 4,559MB of workload.
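One way to bring totals like these down to a human scale is to convert them into rates. The sketch below uses only figures quoted in this AppNote (the byte count, the request count, the 132 active users, and the twelve-hour window). The function name server_rates is hypothetical, and the megabyte conversion assumes 2**20 bytes per MB, which is the convention that reproduces the 4,559MB figure.

```python
def server_rates(total_bytes, requests, users, duration_s):
    """Convert raw server totals into per-second and per-user rates."""
    return {
        "megabytes": total_bytes / 2**20,
        "bytes_per_second": total_bytes / duration_s,
        "requests_per_second": requests / duration_s,
        "requests_per_user": requests / users,
    }


rates = server_rates(
    total_bytes=4_780_529_270,
    requests=9_234_525,
    users=132,
    duration_s=12 * 60 * 60,
)
for name, value in rates.items():
    print(f"{name}: {value:,.1f}")
```

These are averages over the whole twelve hours, so they flatten the morning peak that the graph makes obvious; they complement the graph rather than replace it.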
The benefits of the characterization graphs are many. Not only does the characterization allow you to easily compare the workload of one server to another and observe general trends, the graph allows you to communicate accurate information about the server's workload characteristics to others.
Conclusions
Admittedly, the characterizations in this AppNote barely touch the surface of detail and usefulness of accurate, simplified workload metrics. Yet, without this quantitative understanding of workload, attempts at network design, optimization, or procurement (benchmarking) are trial and error at best.
Although many experts within the industry have drawn up their own design heuristics based upon experience, even they have misconceptions and a daunting number of unanswered questions. Because of these unknowns in the network equation, workload characterization must be a fundamental and essential element of each of these processes.
Simple byte and packet counts are unusable in most cases due to the inability of most of us to deal with such large numbers. So the statistics, whatever their source, must be clothed in a manner that makes reality obvious and minimizes misconceptions.
Your Role in Future Workload Characterization Research
Workload characterization is part of a better toolkit. A research project at Novell is currently focused on the development and use of workload characterization tools, especially as they relate to design and optimization processes. As these tools are completed, they will be published as AppNotes. A call for characterizations will accompany many of these tools. With the help of AppNote readers, Novell will compile a database of workload characterizations from hundreds of installations. A set of network design and optimization strategies based upon that database of workloads will follow.
Although protocol analyzers are necessary for accurate workload characterization today, the goal of this ongoing performance analysis project is the creation of software tools and design and optimization guides that can be used by anyone who owns NetWare products - without the need for expensive protocol analyzers.
Editor's Note: The author accepts written feedback at FAX (801) 429-5511.
* Originally published in Novell AppNotes
Disclaimer
The origin of this information may be internal or external to Novell. While Novell makes all reasonable efforts to verify this information, Novell does not make explicit or implied claims to its validity.