How NetWare Loadable Modules Operate in the NetWare Environment

MATT HAGEN
Technical Consultant
Systems Engineering Division

01 Sep 1991


This AppNote explores NetWare Loadable Modules (NLMs) and how they operate within the multithreaded, nonpreemptive environment of the NetWare v3.x operating system. It presents a simple short-order cook analogy to help programmers and non-programmers alike better understand the intricate relationship between NLMs and the fascinating world of NetWare.

Introduction

One day in 1667, as a young man looked on, an apple fell from a tree in Woolsthorpe, England. The event was not unusual; apples had been falling in this garden for many years and people had been watching them fall for just as long. But the young man, a 25-year-old professor of mathematics at Cambridge, had been brooding about his studies as he gazed off in the direction of the tree that day, and the sight of the falling apple prompted a sudden leap of understanding in his mind. He realized that every particle in the universe attracts every other particle. Later, he formulated the law of universal gravitation, the foundation of modern astronomy.

Like Isaac Newton, we too can occasionally see in some simple event the essence of a tangled concept. The simple event presented in this AppNote is the job of a short-order cook. The tangled concept that will hopefully be illuminated by examining this everyday event is the manner in which the NetWare v3.x operating system deals with NetWare Loadable Modules (NLMs). The first few pages of this AppNote give a detailed description of a typical short-order cook's job. While you may not immediately see any possible connection with NLMs, read the description carefully. The rest of the AppNote draws on this analogy to untangle some important NetWare/NLM concepts.

The Cavernous Craw

Every day, from the grills of the golden arches to the floors of Jerry's Diner, the short-order cook - that mainstay of the modern meal - delights hungry crowds with tantalizing odors of sizzling beef and skillful shows of spatulate artistry. Surely during mealtimes past you have watched this dexterous man or woman prepare a dozen lunches in tandem: flipping a chicken breast, salting a burger, submerging more fries, taking an order, toasting some bread - all the while whistling a tune or carrying on some jocular exchange with a customer.

Take Phillip Phlapjack, for instance. As the short-order chef for the Cavernous Craw, Phil is the kingpin of the kitchen. He is the doer of the work. Without Phil there might be ideas for lunch, ingredients for lunch, grills for lunch, and demands for lunch - but there would be no lunch.

Phil possesses a surprising amount of knowledge. He knows how to do many things. He knows dozens of recipes; he knows how to stock the grill area; he knows how to clean the grill, the toaster, the fry vat, and the counter; and he knows how to order produce, meats, and cheeses from local delis.

And talk about busy! Phil has plenty of chores to do. In fact, Phil has categorized his chores into three groups. The first group includes permanent chores. For example, every week Phil has to survey the restaurant's depleted refrigerator and order new foodstuffs from wholesalers in the area. Every morning he must stock the grill area with lunch meat, cheese, tomatoes, hamburger, and other sandwich ingredients. Every evening he has to put everything away. And, at idle times during the day, he has to clean the grill and wipe down the counters.

The second group includes temporary chores. Phil considers each meal that he cooks for a customer a temporary chore. Depending on the time of day and the kind of meals he is cooking, Phil can handle between five and ten meals simultaneously. This was not always so. Years ago, Phil would conscientiously finish each order before moving on to the next. Then it occurred to him that, while waiting for a chicken breast to cook, he could fill two orders for pita bread specials and even sink some fries for another order before the chicken was ready. Now Phil finds it faster to complete a small portion of one chore and then move on to the next chore in a continual round-robin of work, rather than completing each chore before moving on. Frequently, Phil mixes portions of permanent and temporary chores during this round-robin of activity.

Phil's years of experience have also taught him how much to do of one chore before moving on to the next. He makes this switch between chores thousands of times per day.

Phil calls the third group of chores immediate chores. When a customer orders a meal, or a spill occurs in the grill area, or the grease trough catches fire, Phil must momentarily stop what he is doing and respond in some way. The response is the immediate chore. When a customer orders lunch, Phil jots down the order and clips the paper onto the order queue. When a spill occurs, Phil either mops it up immediately or sweeps it under the counter, depending on the size and consistency of the spill. For a grease fire, Phil immediately covers the trough with a metal lid to suffocate the fire, grabs an extinguisher just in case, and scans the customer line for signs of an OSHA representative.

Phil follows two rules regarding immediate chores. First, when performing an immediate chore like listening to a customer and jotting down an order, he always starts and completes it without allowing himself to be further interrupted by another customer. Second, whatever the interruption, Phil tries to minimize the immediate chore and save the rest of the work as a temporary chore. That's why when a customer orders a meal, Phil simply makes a note of it (classifying the actual cooking of the meal as a temporary chore which he will handle in his own good time) rather than cooking the entire meal immediately.

One last thing: Recently the manager asked Phil to take on another job. He asked Phil to ensure daily that all perishable ingredients are rotated properly. Phil is happy to add this job to his list of permanent chores.

Short-Order NetWare

The kitchen at the Cavernous Craw is a good paradigm for a NetWare file server. Like the kitchen, the file server has a doer, the knowledge of how to do things, and categories of chores to do. Let's examine each of these three elements.

The Doer

The file server's central processing unit (CPU) is the doer. It executes all instructions. It is equivalent to Phil in the foregoing analogy. Without the CPU, there might be an operating system and several tasks to accomplish, but there would be no agent to do the computing.

The Know-How

The NetWare operating system data and code contain the knowledge of how to do things: how to unpackage and interpret an incoming packet, how to flush a cache buffer to disk, how to organize and allocate memory, and how to write a message to the console. The OS is equivalent to Phil's recipes and other kitchen know-how. Without the operating system, there might be a willing CPU and a load of tasks to do, but there would be no instructions explaining how to do the tasks.

The Chores

NetWare processes and interrupts define the chores that the CPU must perform. Processes are equivalent to Phil's permanent and temporary chores. Interrupts are equivalent to Phil's immediate chores. Without processes and interrupts, there might be a willing CPU and lots of operating system code, but there would be nothing for them to do.

Permanent Chores. The permanent chores in the foregoing analogy represent background processes which execute periodically at regular intervals. For example, the NetWare operating system has a Cache Update Process that periodically checks RAM for aged (or dirty) cache buffers that need to be written to disk. It also has a Poll Process that checks a list of received packets that need to be processed.

Temporary Chores. These represent file server processes which reside inactively on a list until assigned (by a background process) to perform a temporary job, such as processing a received packet.

Immediate Chores. Phil's immediate chores represent interrupt service routines (ISRs), the server's initial reaction to packet reception, the completion of a disk write, the depression of a key on the keyboard, and other events that demand immediate attention. Like Phil, the server typically does not allow an ISR to be further interrupted. And like Phil, the server tries to minimize the length of all ISRs. Remember that Phil's initial reaction to a customer's order is to define and queue the order rather than to cook the entire meal immediately. In the same way, the server's reaction to packet reception is to merely queue the packet during the ISR. Later, a background process assigns a server process to handle the temporary chore of interpreting and answering the packet.
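The split between an immediate chore and a temporary chore can be sketched in portable C. This is a minimal simulation, not NetWare code: the names PacketQueue, packet_isr(), and service_queue(), the queue size, and the use of plain integers as packets are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_SIZE 8            /* hypothetical capacity of the receive queue */

typedef struct {
    int packets[QUEUE_SIZE];    /* stand-ins for received packet buffers */
    size_t head;                /* next slot to read  */
    size_t tail;                /* next slot to write */
    size_t count;               /* packets currently queued */
} PacketQueue;

/* "Immediate chore": runs at interrupt time, so it only records the packet.
   Returns 1 if the packet was queued, 0 if the queue was full. */
int packet_isr(PacketQueue *q, int packet)
{
    assert(q != NULL);
    if (q->count == QUEUE_SIZE)
        return 0;
    q->packets[q->tail] = packet;
    q->tail = (q->tail + 1) % QUEUE_SIZE;
    q->count++;
    return 1;
}

/* "Temporary chore": runs later at process time and does the real work.
   Here the "work" is just summing the packet values.
   Returns the number of packets handled. */
size_t service_queue(PacketQueue *q, long *work_done)
{
    size_t handled = 0;
    assert(q != NULL && work_done != NULL);
    while (q->count > 0) {
        *work_done += q->packets[q->head];
        q->head = (q->head + 1) % QUEUE_SIZE;
        q->count--;
        handled++;
    }
    return handled;
}
```

Like Phil jotting down an order, packet_isr() does the least it can and returns; the cooking happens in service_queue(), whenever a process gets around to it.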

Other Aspects of the Analogy

The Cavernous Craw restaurant analogy also brings to light a number of other salient points about NetWare:

Multitasking. Phil handles many chores at the same time, accomplishing small portions of each chore in a continual round-robin of activity. The CPU of a file server (guided by the NetWare operating system) uses this same technique to alternately execute small sequences of code for different processes, giving the impression of simultaneous execution. The technique is called multitasking.

Context Switching. Phil stops working on one chore and starts on another thousands of times each day. NetWare does the same thing. This refocusing of attention is called context switching.

Nonpreemption. Phil cannot indiscriminately decide when he wants to switch from one chore to another. To an extent the nature of each chore dictates when it can and cannot be left alone for a while in favor of another chore. For example, once Phil scoops a half-grilled hamburger from the grill, he must flip it back onto the grill before going on to the next chore. This is even more true in a NetWare file server. In NetWare, each process decides when it is time to relinquish control of the CPU and allow some other process to direct the CPU's activities. Nothing can preempt the process that currently controls the CPU (except, of course, an interrupt). The NetWare OS is thus described as a nonpreemptive environment.
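A toy model may make the round-robin concrete. In the sketch below, each chore is a function that does one small portion of work and then returns; returning is the voluntary yield. The names Chore, countdown_step(), and run_round_robin() are hypothetical, and a real NetWare process keeps its place with a stack and a code pointer rather than by returning from a function.

```c
#include <assert.h>
#include <stddef.h>

/* A chore does one small portion of work per call, then returns.
   Returning is the voluntary yield; nothing forces it to stop mid-step.
   The return value is 1 while work remains and 0 when the chore is done. */
typedef int (*ChoreStep)(void *state);

typedef struct {
    ChoreStep step;   /* the code the chore executes */
    void *state;      /* the chore's private data */
    int finished;     /* set once step() reports completion */
} Chore;

/* Example chore: count down a private counter, one tick per turn. */
int countdown_step(void *state)
{
    int *remaining = state;
    assert(remaining != NULL);
    if (*remaining > 0)
        (*remaining)--;
    return *remaining > 0;
}

/* Visit every unfinished chore in turn until none remain.
   Returns the total number of turns taken. */
unsigned run_round_robin(Chore *chores, size_t n)
{
    unsigned turns = 0;
    size_t active = 0;

    for (size_t i = 0; i < n; i++)
        if (!chores[i].finished)
            active++;

    while (active > 0) {
        for (size_t i = 0; i < n; i++) {
            if (chores[i].finished)
                continue;
            if (!chores[i].step(chores[i].state)) {
                chores[i].finished = 1;
                active--;
            }
            turns++;
        }
    }
    return turns;
}
```

Note what the scheduler loop cannot do: it has no way to cut a step short. If one step() never returned, every other chore would starve, which is exactly the obligation a nonpreemptive environment places on each process.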

Priorities. Some chores have higher priorities than others. For example, Phil scrapes the grill and cleans the counters only when he has nothing else to do. A NetWare process can have a priority ranging from 2 to 250. The higher the priority number, the more likely the process is to regain the CPU quickly. Most processes (including processes that CLIB creates) run at priority 50.

Re-entrancy. Sometimes Phil makes two hamburger specials at the same time. He performs two chores simultaneously using the same recipe. A NetWare file server does this all the time. For example, two file server processes might be assigned to handle two request packets at the same time. The one process deals with the first request. The other process handles the second request. This situation includes two processes (chores) executing one piece of code (recipe) at the same time. Code that can handle this situation is called re-entrant code.
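The difference between a recipe that two chores can share and one they cannot is easy to show in C. The following is an illustrative sketch with hypothetical function names; it is not taken from NetWare. The first function keeps its working storage in one shared static buffer, while the second uses storage supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Not re-entrant: every caller shares one static buffer, so a second
   call overwrites the result the first caller is still using. */
const char *label_shared(int order)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "order-%d", order);
    return buf;
}

/* Re-entrant: all working storage belongs to the caller. For a process,
   that storage would typically live on the process's own stack. */
const char *label_reentrant(int order, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    snprintf(buf, size, "order-%d", order);
    return buf;
}
```

Two processes that interleave calls to label_shared() end up reading each other's labels. With label_reentrant(), each process keeps its own result, because the only data the function touches is data the caller owns.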

Process and Interrupt Times. When Phil performs permanent or temporary chores, he is working in a mode that allows him to switch between chores and listen for interruptions. When he is performing an immediate chore, he can neither switch to another chore nor pay attention to other interruptions. Similarly, a NetWare file server runs at either process time or interrupt time. At process time a (usually) reschedulable, interruptible process is controlling the CPU. At interrupt time a (usually) uninterruptible interrupt service routine is occupying the CPU.

Loading Loadable Modules. At one point, the manager gave Phil the new job of checking expiration dates and rotating food. This new job was in addition to Phil's current work load. The manager actually gave Phil two things: (1) instructions explaining how to do the work, and (2) a request to do the work.

Like the manager, when we load a NetWare loadable module (NLM) into server memory, we typically give the CPU two things: (1) new code explaining how to do a chore, and (2) a process telling the CPU to do the chore. This new process must coordinate with several other processes telling the CPU to do other chores. Later in the AppNote, we will analyze exactly how NetWare enables us to load NLMs. First, though, let's explore the split-second world of NetWare processes.

The World of Processes

The human mind is an elusive entity. Invisible and intangible, it is impossible to touch or examine directly. But we can study it indirectly by noticing its relationship to (and effect on) certain things in the physical world. For example, we can study the anatomy of the human brain. We can also study human behavior.

A NetWare process is also difficult to pin down. But we can understand it better by examining its physical manifestations and its behavior. The physical manifestations of a process include three blocks of memory: a structure called a Process Control Block (PCB), a stack, and some code. Figure 1 illustrates these three tangible elements of a NetWare process.

The PCB structure shown in Figure 1 defines the process. The operating system uses the Link field to link PCBs together in various lists. The Stack field points to a stack of memory reserved exclusively for the process. The Stack Pointer field points to values within the stack. In addition to these, a PCB also includes other fields that are not shown in the figure. These fields stipulate the size of the stack, the name of the process, the scheduling priority, and semaphore information.

The box in the middle of Figure 1 represents a stack. Every NetWare process has its own stack where it stores variables and other information specific to the process. During a context switch, before relinquishing control of the CPU, the process stores a Code Pointer on the stack. The code pointer acts as a bookmark; it tells the CPU where (in the code) to start executing when the process regains control of the CPU.

Figure 1: The three elements of a NetWare process include PCB, stack, and code.

The Code in Figure 1 represents the sequence of instructions that the process tells the CPU to execute. Since processes often do the same chore over and over again, the sequence is usually circular. Somewhere in this eternal loop of instructions, the process must relinquish control, allowing the CPU to execute the instructions of another process. As mentioned earlier, this is called a nonpreemptive or voluntary context switch. Let's look more closely at a NetWare context switch.

Context Switch

Figure 2 shows three NetWare processes: A, B, and C. Process A is the Running Process because the CPU is executing Process A's code and using Process A's stack. Processes B and C are queued on the Run Queue, awaiting their turn to tell the CPU what to do.

Eventually, Process A must relinquish control of the CPU and allow Process B to run. To do so, Process A must make a call to ContextSwitch(), a small but crucial subroutine in the kernel of the NetWare OS. ContextSwitch() performs the following steps:

  1. It copies the CPU's Code Pointer value into Process A's stack, effectively placing a bookmark in Process A's code.

  2. It copies the CPU's Stack Pointer value into Process A's PCB.

  3. It copies Process B's Stack Pointer into the CPU.

  4. It copies the Code Pointer value from Process B's stack into the CPU, effectively telling the CPU where to start executing again.

Figure 2: Three processes as they exist prior to a context switch.

In this way, the running process passes control of the CPU to the next process on the run queue. In the next section, we'll find out where the running process goes after it relinquishes control.
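The four steps can be traced in a small simulation. This is a sketch under stated assumptions, not the NetWare kernel routine: the Cpu structure, pcb_init(), the integer "code pointer" (a step number rather than a machine address), and the downward-growing stack are all hypothetical modeling choices. Only the Link, Stack, and Stack Pointer fields come from the PCB described above.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 16           /* hypothetical stack size for the model */

typedef struct Pcb {
    struct Pcb *link;            /* Link field: next PCB on a list */
    int *stack;                  /* Stack field: the process's private stack */
    int *stack_pointer;          /* Stack Pointer field: saved top of stack */
} Pcb;

typedef struct {
    int *stack_pointer;          /* the CPU's stack pointer */
    int code_pointer;            /* the CPU's code pointer, as a step number */
} Cpu;

/* Prepare a process that has never run: push its starting code pointer so
   the first switch to it finds a bookmark like any later switch would. */
void pcb_init(Pcb *pcb, int *stack_memory, int start_code_pointer)
{
    assert(pcb != NULL && stack_memory != NULL);
    pcb->link = NULL;
    pcb->stack = stack_memory;
    pcb->stack_pointer = stack_memory + STACK_WORDS;  /* empty; grows down */
    *--pcb->stack_pointer = start_code_pointer;
}

/* The four steps performed when `from` hands the CPU to `to`. */
void context_switch(Cpu *cpu, Pcb *from, Pcb *to)
{
    assert(cpu != NULL && from != NULL && to != NULL);
    *--cpu->stack_pointer = cpu->code_pointer;  /* 1. bookmark on from's stack */
    from->stack_pointer = cpu->stack_pointer;   /* 2. save SP in from's PCB    */
    cpu->stack_pointer = to->stack_pointer;     /* 3. load to's SP into CPU    */
    cpu->code_pointer = *cpu->stack_pointer++;  /* 4. pick up to's bookmark    */
}
```

If Process A switches away at step 105 and Process B later switches back, the CPU's code pointer returns to 105: the bookmark survives on A's private stack while B runs.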

Process States and State Transitions

Figure 3 shows the three possible states and the four possible state transitions of a NetWare process. The three states include running, waiting, and sleeping, as described below.

Running. When a process is in the Running state, the Running Process variable points to the process' PCB and the process has control of the CPU. Only one process can be in this state at a time.

Waiting. When a process is in the Waiting state, it is linked into the Run Queue, a linked list ordered by PCB priority number.

Figure 3: NetWare process states and transitions.

Sleeping. When a process is in the Sleeping state, it is not the running process, nor is it linked into the Run Queue. When asleep, a process is in one of a variety of places, where another process (or an interrupt) can find it and wake it back up.
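A priority-ordered Run Queue can be sketched as a singly linked list of PCBs. The code below is illustrative only. It assumes that higher priority numbers sit nearer the head of the queue (consistent with the earlier statement that a higher number regains the CPU sooner) and that a newly queued PCB goes behind others of equal priority. The function names and the reduced Pcb structure are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Pcb {
    struct Pcb *link;     /* Link field: next PCB on the list */
    const char *name;     /* process name */
    int priority;         /* scheduling priority */
} Pcb;

/* Link pcb into the queue behind every PCB of equal or higher priority. */
void run_queue_insert(Pcb **head, Pcb *pcb)
{
    assert(head != NULL && pcb != NULL);
    Pcb **slot = head;
    while (*slot != NULL && (*slot)->priority >= pcb->priority)
        slot = &(*slot)->link;
    pcb->link = *slot;
    *slot = pcb;
}

/* Unlink and return the PCB at the head of the queue, or NULL if empty. */
Pcb *run_queue_remove_head(Pcb **head)
{
    assert(head != NULL);
    Pcb *first = *head;
    if (first != NULL) {
        *head = first->link;
        first->link = NULL;
    }
    return first;
}
```

With this ordering, two priority-50 processes queued before a priority-60 process still run after it, and they run in the order in which they were queued.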

The four possible state transitions include the following:

Waiting to Running. The letter A in Figure 3 represents the transition from the Waiting state to the Running state. This transition is called gaining control of the CPU. The running process calls ContextSwitch() to effect this transition.

Running to Waiting. The letter B represents the transition from the Running state to the Waiting state. This state change is usually called rescheduling. The CLIB function ThreadSwitch() causes this transition, placing the running process on the Run Queue behind the other processes of the same priority.

Running to Sleeping. The letter C represents the transition from the Running state to the Sleeping state. Software engineers refer to this transition as going to sleep. Sometimes the process' PCB winds up linked into one of several linked lists (of sleeping processes) watched over by the timer interrupt, the Asynchronous Event Scheduler (AES) Sleep process, or the AES No Sleep process. At other times, the PCB pointer is saved in the field of a screen structure representing a screen on which the process is waiting for keyboard input. The CLIB function delay() puts the running process to sleep on the timer interrupt's linked list of sleeping processes.

Sleeping to Waiting. The letter D represents the transition from the Sleeping state to the Waiting state. Engineers refer to this transition as waking up. A process cannot wake itself up. Instead, another process or an interrupt must gain control of the CPU, find the PCB pointer to the sleeping process (perhaps unlinking it from some list of sleeping processes), and link the PCB into the Run Queue. The CLIB function ResumeThread() allows the running process to wake up the specified process as long as it is sleeping on the timer interrupt's list, the AES Sleep process's list, or the AES No Sleep process's list.
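The three states and four transitions amount to a small state machine, which can be written down directly. The enum and function names below are hypothetical; the letters are those of Figure 3.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { RUNNING, WAITING, SLEEPING } ProcessState;

/* Returns the Figure 3 letter ('A' to 'D') for a legal transition,
   or 0 if no such transition is described. */
char transition_letter(ProcessState from, ProcessState to)
{
    if (from == WAITING  && to == RUNNING)  return 'A'; /* gaining control */
    if (from == RUNNING  && to == WAITING)  return 'B'; /* rescheduling    */
    if (from == RUNNING  && to == SLEEPING) return 'C'; /* going to sleep  */
    if (from == SLEEPING && to == WAITING)  return 'D'; /* waking up       */
    return 0;
}

/* Apply a transition if it is legal.
   Returns 1 and updates *state on success, 0 otherwise. */
int change_state(ProcessState *state, ProcessState to)
{
    assert(state != NULL);
    if (transition_letter(*state, to) == 0)
        return 0;
    *state = to;
    return 1;
}
```

The table has two deliberate gaps: a sleeping process never jumps straight to Running, and a waiting process never goes to sleep. A sleeper must first be linked into the Run Queue by someone else, and only the running process can put itself to sleep.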

NetWare Loadable Modules

Now that we've examined the world of NetWare processes, let's explore how a third-party developer can add a process (PCB, stack, and code) to the NetWare environment in the form of a server application.

In 1987, Novell released NetWare v2.1 with support for primitive server applications called Value-Added Processes (VAPs). The NetWare v2.1 platform provided a limited VAP interface and an inflexible VAP environment, forcing supervisors to bring down a server and then bring it back up again to load a VAP. But it did allow programmers to load their own code into file server RAM and then create their own processes to execute the code.

But even as Novell was releasing NetWare v2.1, the Novell OS architects were redesigning and preparing NetWare to run on the Intel 386 chip, taking the opportunity to rethink portions of the operating system, including support for server applications. One objective was to modularize the OS into dynamically loadable and unloadable components of functionality, allowing customers to customize their servers. The resulting NetWare 3.x loadable module platform includes a rich, varied interface for server applications (NLMs) and a dynamic environment that allows supervisors to load and unload a variety of NLMs, including LAN drivers, protocol stacks, disk drivers, name spaces, and server-based utilities, without interrupting file server operation.

The following two sections examine both the NLM interface and the NLM architecture required by an environment that allows the loading and unloading of server applications.

NLM Interface

The NLM interface consists of a low-level interface called OSLIB and a higher-level interface consisting of CLIB.NLM and other library NLMs.

OSLIB. The NetWare v3.x operating system includes one low-level NLM library called OSLIB which exports about 750 routines and 190 variables to client NLMs. Figure 4 illustrates OSLIB.

Figure 4: The OSLIB API for NLMs.

The following items shed some light on Figure 4.

  • OSLIB is a composite of interfaces. It includes the Link Support Layer API for LAN driver and protocol stack NLMs, the disk subsystem API for disk driver NLMs, the file system API for name space NLMs, and a host of other functions and variables for all NLMs.

  • Aside from those parts that deal with LAN and disk driver NLMs, OSLIB remains unpublished to the third-party world. The intent is not to rob developers of access to useful OS internals, but rather to prevent third-party programmers from using an evolving (creeping) interface. As explained below, CLIB provides access to OSLIB functionality.

  • Several Novell utility NLMs like INSTALL and MONITOR access OSLIB directly. Why? They do so partly because these utilities need access to sensitive OS internals, but mostly because these utilities were written before CLIB was created.

  • Several Novell library NLMs like STREAMS, CLIB, NUT, TLI, SQL, and BTRIEVE also access OSLIB directly.

CLIB and Company. Novell library NLMs, collectively exporting hundreds of functions and variables, form a second API above OSLIB that most third-party NLMs access. CLIB is the cornerstone of this rich API. Figure 5 illustrates the CLIB NLM API.

Figure 5: The high-level CLIB API for NLMs.

Note the following items about Figure 5:

  • CLIB is the cornerstone of the secondary API. For functions like strlen(), CLIB simply performs the work and returns the answer to the calling NLM. For functions like malloc() and read() which require access to NetWare OS internals, CLIB calls the OSLIB interface. For streams-related functions, CLIB calls the STREAMS.NLM API.

  • LIB.NLM represents a host of other Novell library NLMs.

  • SS.NLM is a small (minuscule) library NLM described in the "NetWare v3.x Operating System Statistics Exposed!" AppNote published in July 1991.

  • UTILITY.NLM is a representative example for all third-party NLMs that access CLIB.

NLM Architecture

By definition, NetWare Loadable Modules are both loadable and unloadable. This feature places an architectural requirement on all NLMs. Like an Apollo spaceship that requires boosters during takeoff, a capsule during space flight, and parachutes during splashdown, all NLMs must include an initialization part, a main body, and a deinitialization part. Figure 6 illustrates these three parts of a generic NLM.

Figure 6: The three parts of a generic NLM.

Initialization

    init()
    {
        create process or hook interrupt
    }

Main Body

    chore()
    {
        work to do
    }

Deinitialization

    deinit()
    {
        destroy process or unhook interrupt
    }

The following imaginary sequence of events explains how each part of the generic NLM shown in Figure 6 is used during load, run, and unload time.

Events During Load Time. A console operator decides to load GENERIC.NLM into file server memory.

  1. The console operator types the following at the server console:

    load a:generic <Enter>

  2. The keyboard interrupt wakes up the Console Command Process (CCP), a NetWare background process whose job it is to execute all console commands.

  3. CCP bides its time on the Run Queue until it finally becomes the running process and takes control of the CPU.

  4. CCP checks the console command line and finds the string "load a:generic," recognizing the word "load" as a legitimate console command.

  5. CCP searches the diskette in drive A for GENERIC.NLM and finds it there.

  6. CCP loads the GENERIC.NLM code into file server memory.

  7. CCP executes init(), which must either create a process to execute chore() or hook an interrupt to execute the code. If not, chore() will sit in server RAM and never be executed. We'll assume init() directs CCP to create a process (named GP) and place the process on the Run Queue.

  8. CCP relinquishes control of the CPU, allowing other processes to run.

Events During Run Time. Eventually, the GP process gains control of the CPU and executes chore(), which (we will assume) includes an eternal loop. Shortly thereafter, GP relinquishes control and either goes to sleep or reschedules itself on the run queue. During the life of the NLM, GP repeats this cycle thousands of times.

Events During Unload Time. The console operator decides to unload the NLM.

  1. The operator types the following at the server console:

    unload generic <Enter>

  2. Again, the keyboard interrupt wakes up CCP, which finds "unload generic" on the command line and recognizes "unload" as a valid console command.

  3. CCP executes deinit(). Since init() created the process GP, deinit() must destroy GP. If init() had hooked an interrupt, deinit() would have to restore the interrupt.

  4. CCP deallocates the server RAM occupied by GENERIC's code image, effectively unloading the NLM from server memory.

As described in the preceding steps, an NLM's initialization part must create at least one process or hook at least one interrupt. This initialization part does not have to be called init(). It can create many processes and hook many interrupts. It can allocate screens, semaphores, sockets, and scores of other resources.

An NLM's main body contains the recipe(s) for the chore(s) that the NLM performs. The main body does not have to be called chore(). It can also allocate (and deallocate) any number of resources.

An NLM's deinitialization part must destroy at least one process or restore at least one interrupt. It does not have to be called deinit(). It can deallocate other resources as needed.
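The load, run, and unload sequence can be mimicked in ordinary C. The sketch below is a simulation under stated assumptions: the Process and Nlm structures and the nlm_load(), nlm_run_once(), and nlm_unload() functions are hypothetical stand-ins for what the console command process and the scheduler do, and "creating a process" is reduced to filling in a structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a server process: the chore it runs. */
typedef struct {
    void (*chore)(void *data);
    void *data;
    int alive;
} Process;

typedef struct {
    int (*init)(Process *slot);      /* initialization part   */
    void (*deinit)(Process *slot);   /* deinitialization part */
    Process process;                 /* created by init, destroyed by deinit */
    int loaded;
} Nlm;

int generic_passes;                  /* work counter for the demo NLM */

/* Main body of GENERIC: one pass through its loop of work. */
void generic_chore(void *data)
{
    int *passes = data;
    (*passes)++;
}

int generic_init(Process *slot)
{
    slot->chore = generic_chore;     /* "create process" to execute chore() */
    slot->data = &generic_passes;
    slot->alive = 1;
    return 1;
}

void generic_deinit(Process *slot)
{
    slot->alive = 0;                 /* "destroy process" */
    slot->chore = NULL;
}

/* What "load" does: run init, which must leave a live process behind. */
int nlm_load(Nlm *nlm)
{
    assert(nlm != NULL && nlm->init != NULL);
    nlm->loaded = nlm->init(&nlm->process) && nlm->process.alive;
    return nlm->loaded;
}

/* One scheduling turn: run the chore once if the process still exists. */
void nlm_run_once(Nlm *nlm)
{
    assert(nlm != NULL);
    if (nlm->loaded && nlm->process.alive)
        nlm->process.chore(nlm->process.data);
}

/* What "unload" does: run deinit, then forget the module. */
void nlm_unload(Nlm *nlm)
{
    assert(nlm != NULL);
    if (!nlm->loaded)
        return;
    nlm->deinit(&nlm->process);
    nlm->loaded = 0;
}
```

The model makes the pairing rule visible: whatever init() creates, deinit() must undo. If generic_deinit() left the process alive, the chore would keep pointing into code that the unload step is about to discard.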

The CLIB Booster and Parachute

During the development of CLIB.NLM, the software engineers noticed a similarity between most utility NLMs. Specifically, they noticed that most NLMs allocate a process and a screen during init() and deallocate the process and screen during deinit(). So, the CLIB architects decided to write generic init() and deinit() routines that all CLIB client NLMs could use.

By default, the init() routine creates a process (thread) that executes a function called main(). It also creates a screen for the NLM. The deinit() routine deallocates both the screen and the thread. The decision required only that all client NLMs name their chore() functions main(), already a well-established C tradition.

A CLIB client NLM's main() function, in turn, can allocate other resources, including threads, screens, semaphores, sockets, and memory, so long as the NLM returns the resources before exiting.
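The convention can be illustrated with a small model. Nothing below is the CLIB source or its API: ClientNlm, clib_style_init(), clib_style_deinit(), and tidy_main() are hypothetical names, the client's main function is passed in explicitly so the sketch compiles as ordinary C, and the "thread" simply runs the function to completion.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int thread_created;    /* set by the generic initialization */
    int screen_created;    /* set by the generic initialization */
    int extra_resources;   /* resources the client holds and has not returned */
    int exit_code;         /* value returned by the client's main function */
} ClientNlm;

typedef int (*MainFn)(ClientNlm *self);

/* Generic initialization supplied by the library: create a screen and a
   thread, then let the thread execute the client's main function. */
void clib_style_init(ClientNlm *nlm, MainFn client_main)
{
    assert(nlm != NULL && client_main != NULL);
    nlm->screen_created = 1;
    nlm->thread_created = 1;
    nlm->extra_resources = 0;
    nlm->exit_code = client_main(nlm);  /* simulated: runs to completion here */
}

/* Generic deinitialization: release the screen and the thread.
   Returns the number of client resources that were never returned. */
int clib_style_deinit(ClientNlm *nlm)
{
    assert(nlm != NULL);
    nlm->thread_created = 0;
    nlm->screen_created = 0;
    return nlm->extra_resources;
}

/* A well-behaved client: allocates two resources and returns both. */
int tidy_main(ClientNlm *self)
{
    self->extra_resources += 2;   /* for example, a semaphore and a socket */
    /* ... the NLM's real work would go here ... */
    self->extra_resources -= 2;
    return 0;
}
```

The client author writes only tidy_main(). The booster and the parachute come from the library, and the only bookkeeping left to the client is to hand back whatever it allocated itself.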

Summary

This AppNote has explored the multithreaded, nonpreemptive environment of the NetWare operating system, examined the world of NetWare processes, discussed the architectural requirements of NLMs, and pointed out one important peculiarity of CLIB client NLMs. Future NetWare programming AppNotes will build on this conceptual foundation.

* Originally published in Novell AppNotes


Disclaimer

The origin of this information may be internal or external to Novell. While Novell makes all reasonable efforts to verify this information, Novell does not make explicit or implied claims to its validity.

© Copyright Micro Focus or one of its affiliates