Graceful NLM Demise

Kevin Burnett
Senior Research Engineer
Novell AppNotes
kburnett@novell.com

01 Mar 2003

In most environments, the program is finished when it runs as it should. However, an NLM is not finished until it can terminate successfully. Here are some rules to consider when writing an NLM.

Rule One: "Everyone put away your own things when you are finished with them."

If your NLM application allocates a resource, it must free it when no longer needed; but this rule does not stop there. If an NLM thread allocates a resource, it must free it when no longer needed.

Some NLM developers attempt to write an all-purpose "clean up" procedure. Such a procedure is inherently very difficult to get right because it attempts to free resources that it did not allocate. Don't do this. Having each thread free its own allocated resources is much more efficient.
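
As a minimal sketch of this rule, written in portable C rather than against the NLM SDK, the routine below allocates a scratch buffer, uses it, and frees it on every exit path. The names WorkerBody and SCRATCH_SIZE and the return codes are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE 256           /* hypothetical buffer size */

/* Hypothetical worker body: owns its scratch buffer from start to finish.
   Returns 0 on success, -1 if the allocation failed or the input is too long. */
int WorkerBody(const char *input, size_t *lengthOut)
{
    char *scratch;
    int rc = 0;

    assert(input != NULL && lengthOut != NULL);

    scratch = (char *)malloc(SCRATCH_SIZE);
    if (scratch == NULL)
        return -1;                 /* nothing was allocated, nothing to free */

    if (strlen(input) >= SCRATCH_SIZE)
        rc = -1;                   /* too long; still falls through to free() */
    else
    {
        strcpy(scratch, input);
        *lengthOut = strlen(scratch);
    }

    free(scratch);                 /* freed by the same code that allocated it */
    return rc;
}
```

Because the buffer never outlives the routine that created it, no other part of the program has to know that it exists.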

Rule Two: Implement a signal (SIGTERM) handler

Users can attempt to unload your NLM using the UNLOAD command. Your NLM must be prepared for this event. The best way to handle this is to use the signal() function to implement a SIGTERM signal handler. Then, do not allow your SIGTERM handler to "return" until your main() thread, and any other NLM threads, have terminated.

When I write an NLM, I instinctively create two global integer variables: NLM_exiting and NLM_threadCnt. Initially, NLM_exiting is set to FALSE (0) and NLM_threadCnt is set to zero. The first statement in my NLM's main() is ++NLM_threadCnt. The last statement before main()'s final "return" statement is --NLM_threadCnt. Further, the NLM_exiting variable is monitored by all loops within my NLM. If NLM_exiting is ever true, all threads free any allocated resources and then self-terminate. Given this, my NLM's SIGTERM signal handler looks something like the following:

void NLM_SignalHandler(int sig)
{
    switch(sig)
    {
        case SIGTERM:
            NLM_exiting = TRUE;
            /* Hold the console thread until every NLM thread has exited */
            while(NLM_threadCnt != 0)
                ThreadSwitchWithDelay();
            break;
    }

    return;
}

Many software engineers are tempted to use the AtUnload() or atexit() functions instead of a SIGTERM signal handler. Their success in doing so is somewhat limited. When NetWare executes the handler for these functions, all of the NLM's threads have already been summarily terminated, whether or not they have had a chance to clean up their resources. To be blunt, I do not use AtUnload() or atexit() in my NLMs.

Note: Your SIGTERM handler will be executed by an OS Thread. (OS threads cannot take advantage of most of the functions offered in the NLM SDK.) In fact, this OS thread is the one and only Console Prompt thread. This thread is the one that accepts and executes commands from the file server's Console screen (or the colon prompt).

Note: When your NLM's SIGTERM handler captures control of this thread, it has literally captured the command prompt. Understand that the console command prompt will not be available again until your SIGTERM signal handler releases it. It is also important to note that your SIGTERM signal handler must not destroy this thread. Doing so would be destroying the console command prompt. Therefore, do not call exit() from your SIGTERM handler.

At times, your SIGTERM handler may need to call NLM SDK functions that require a full CLIB thread context. It is possible to borrow a CLIB context for the SIGTERM handler's OS thread, effectively converting it to a CLIB thread.

To do this, you must first have the thread group ID of another active thread within your NLM. Your NLM's main() thread is generally suitable for this purpose. You can get its thread group ID by calling GetThreadGroupID() from main() and storing the ID in a global variable (such as NLM_mainThreadGroupID). Your SIGTERM handler can then temporarily assume the same context as your main() thread by calling the SetThreadGroupID() function.

Before calling SetThreadGroupID() in your SIGTERM handler, call GetThreadGroupID() to obtain the handler's original context. You must restore the SIGTERM handler's original context before returning. The code below shows an example main() and SIGTERM handler that demonstrate this technique.

int NLM_mainThreadGroupID;
int NLM_threadCnt = 0;
int NLM_exiting = FALSE;
    
void NLM_SignalHandler(int sig)
{
    int handlerThreadGroupID;

    switch(sig)
    {
        case SIGTERM:
            NLM_exiting = TRUE;

            /* Borrow main()'s CLIB context for this OS thread */
            handlerThreadGroupID = GetThreadGroupID();
            SetThreadGroupID(NLM_mainThreadGroupID);

            /* NLM SDK functions may be called here */
            while(NLM_threadCnt != 0)
                ThreadSwitchWithDelay();

            /* Restore the handler's original context before returning */
            SetThreadGroupID(handlerThreadGroupID);
            break;
    }
    return;
}
    
void main(void)
{
    ++NLM_threadCnt;
    
    NLM_mainThreadGroupID =
        GetThreadGroupID();
    signal(SIGTERM, NLM_SignalHandler);
    
    /* Body of main continues here... */
    
    --NLM_threadCnt;
    return;
}

Rule Three: Be Aware of Code That Might Be Blocked or Suspended when UNLOADed

Assume that the body of main() above includes a statement such as getch(). The getch() function blocks (or suspends) the thread's execution until a character is received from the keyboard. It is highly probable that the console operator will attempt to UNLOAD your NLM while it is waiting for keyboard input.

The SIGTERM handler is waiting for main() to decrement the NLM_threadCnt value to zero before proceeding, and main() will not do that until it receives a character. Therefore, it will appear to the console operator that the System Console screen is "hung" and that your NLM will not unload as requested.

The SIGTERM handler is responsible for waking up any blocked or suspended threads so that they can become aware of the NLM_exiting value. (Obviously, then, it is the responsibility of each thread to check the NLM_exiting value as often as appropriate.) The SIGTERM handler can help wake up a thread blocked on the getch() function by calling the ungetch() function, which stuffs a character into the keyboard buffer. The blocked getch() reads this character out of the keyboard buffer and execution can proceed.
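
The wake-up can be sketched as follows. So that the sketch compiles anywhere, StubGetch() and StubUngetch() are hypothetical single-threaded stand-ins for getch() and ungetch(): they model only a one-character push-back slot, not real blocking. ReadCommand(), WakeKeyboardReader(), WAKE_CHAR, and NO_CHAR are likewise invented names.

```c
#include <assert.h>

#define TRUE      1
#define FALSE     0
#define WAKE_CHAR '\r'             /* hypothetical dummy character */
#define NO_CHAR   (-1)

int NLM_exiting = FALSE;

static int pushedBack = NO_CHAR;   /* models the one-character push-back slot */

/* Stand-in for ungetch(): stores one character for the next read. */
int StubUngetch(int c)
{
    if (pushedBack != NO_CHAR)
        return NO_CHAR;            /* slot already full */
    pushedBack = c;
    return c;
}

/* Stand-in for getch(): the real call would block until a key arrives. */
int StubGetch(void)
{
    int c = pushedBack;

    assert(c != NO_CHAR);          /* this sketch cannot actually block */
    pushedBack = NO_CHAR;
    return c;
}

/* Worker side: a character that arrives during shutdown is a wake-up,
   not input, so check NLM_exiting before using it. */
int ReadCommand(void)
{
    int c = StubGetch();

    if (NLM_exiting)
        return NO_CHAR;            /* discard the character and unwind */
    return c;
}

/* Handler side: set the flag first, then wake the blocked reader. */
void WakeKeyboardReader(void)
{
    NLM_exiting = TRUE;
    StubUngetch(WAKE_CHAR);
}
```

The ordering in WakeKeyboardReader() matters: if the character were pushed before the flag was set, the reader could wake, see NLM_exiting still false, treat the dummy character as input, and block again.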

Other blocking functions to watch out for include gets(), t_snd(), NWSList(), NWSMenu(), SuspendThread(), and delay().
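
For a long delay(), one option is to sleep in short slices and re-check NLM_exiting between them, so the thread never needs to be woken at all. This is a substitute technique offered as a sketch, not something the rules above prescribe. To keep it portable, the sleep routine is passed in as a function pointer; in an NLM it would presumably be a thin wrapper around delay(). SlicedDelay, SLICE_MS, SleepFn, and CountingSleep are hypothetical names.

```c
#include <assert.h>

#define TRUE     1
#define FALSE    0
#define SLICE_MS 100u              /* hypothetical slice length */

int NLM_exiting = FALSE;

typedef void (*SleepFn)(unsigned milliseconds);

/* Sleeps up to totalMs in slices of at most SLICE_MS.
   Returns the number of milliseconds actually slept, which is less than
   totalMs when NLM_exiting became true part-way through. */
unsigned SlicedDelay(unsigned totalMs, SleepFn sleepFn)
{
    unsigned slept = 0;

    assert(sleepFn != 0);
    while (slept < totalMs && !NLM_exiting)
    {
        unsigned slice = totalMs - slept;

        if (slice > SLICE_MS)
            slice = SLICE_MS;
        sleepFn(slice);
        slept += slice;
    }
    return slept;
}

/* Test double for the sleep routine: records time instead of sleeping
   and raises NLM_exiting once 250 ms have notionally passed. */
unsigned fakeElapsed = 0;

void CountingSleep(unsigned milliseconds)
{
    fakeElapsed += milliseconds;
    if (fakeElapsed >= 250u)
        NLM_exiting = TRUE;
}
```

With this shape, the worst-case unload latency contributed by the delay is one slice rather than the full delay period.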

Rule Four: Don't Forget Child Threads and Call-back Routines

As shown in the sample code, the first thing main() should do is increment the NLM_threadCnt; and the last thing is to decrement the value. If the NLM calls BeginThread(), or similar functions, the spawned thread should also increment and decrement the NLM_threadCnt just as main() does.
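
A spawned thread can follow the same pattern as main(). The sketch below is plain C so that it compiles outside NetWare. WorkerThread takes a single void * argument, the usual shape for a thread start routine, but WorkItem, its fields, the Yield callback (a stand-in for ThreadSwitchWithDelay()), and RequestShutdown are assumptions made for illustration.

```c
#include <assert.h>

#define TRUE  1
#define FALSE 0

int NLM_threadCnt = 0;
int NLM_exiting   = FALSE;

/* Hypothetical per-thread argument block. */
typedef struct
{
    int remaining;                 /* units of work still to do */
    int completed;                 /* units of work finished */
    void (*Yield)(void);           /* stand-in for ThreadSwitchWithDelay() */
} WorkItem;

void WorkerThread(void *arg)
{
    WorkItem *item = (WorkItem *)arg;

    ++NLM_threadCnt;               /* first statement, as in main() */

    while (!NLM_exiting && item->remaining > 0)
    {
        --item->remaining;         /* one unit of placeholder work */
        ++item->completed;
        if (item->Yield != 0)
            item->Yield();         /* give other threads a turn */
    }

    /* Free anything this thread allocated here (Rule One). */

    assert(NLM_threadCnt > 0);
    --NLM_threadCnt;               /* last statement before returning */
}

/* Test double used as a Yield routine: it requests shutdown, as the
   SIGTERM handler would while this thread is switched out. */
void RequestShutdown(void)
{
    NLM_exiting = TRUE;
}
```

The loop tests NLM_exiting on every pass, so the thread leaves through the same decrement whether it ran out of work or was asked to stop.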

If your NLM sets up other call-back routines, each call-back routine must also increment and decrement the NLM_threadCnt. Call-back functions might include those registered through NWAddFSMonitorHook(), NWRegisterNCPExtension(), or RegisterForEvent().
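
A call-back can be guarded in the same way. The sketch below uses a hypothetical call-back signature and return codes, since each registration function defines its own; only the increment, the NLM_exiting check, and the decrement are the point.

```c
#include <assert.h>

#define TRUE  1
#define FALSE 0

int NLM_threadCnt = 0;
int NLM_exiting   = FALSE;

#define CB_HANDLED    0            /* hypothetical return codes */
#define CB_DECLINED (-1)

/* Hypothetical call-back: real signatures depend on the API that
   invokes it. eventsSeen is caller-owned state passed in by pointer. */
int ExampleCallback(int eventCode, int *eventsSeen)
{
    int rc = CB_HANDLED;

    ++NLM_threadCnt;               /* the calling thread is now inside the NLM */

    if (NLM_exiting)
        rc = CB_DECLINED;          /* shutting down: take on no new work */
    else
    {
        (void)eventCode;           /* placeholder for real event handling */
        ++*eventsSeen;
    }

    assert(NLM_threadCnt > 0);
    --NLM_threadCnt;               /* every exit path passes through here */
    return rc;
}
```

Using a single exit point keeps the count balanced; an early "return" placed before the decrement would leave the SIGTERM handler waiting forever.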

Rule Five: Allow Your NLM To Terminate Normally If Appropriate

Just as your SIGTERM handler waits for the NLM_threadCnt to go to zero, your main() thread should never terminate until the NLM_threadCnt is one (i.e., only main() is still running). This allows any thread in your application to shut down the NLM by setting NLM_exiting to true. It also forces main() to stay alive until all other NLM threads have terminated. For example:

void main(void)
{
    ++NLM_threadCnt;
    
    NLM_mainThreadGroupID =
            GetThreadGroupID();
    signal(SIGTERM, NLM_SignalHandler);
    
    /* Body of main continues here... */
    
    while(NLM_threadCnt != 1)
            ThreadSwitchWithDelay();
    
    --NLM_threadCnt;
    return;
}

Rule Six: Don't Forget CTRL-C

Your user can break out of your NLM using CTRL-C. To avoid this, you can register a SIGINT signal handler or disable CTRL-C's functionality. The following illustrates how to implement a simple SIGINT handler that causes your NLM to ignore CTRL-C. Notice that the SIGINT signal handler must be re-registered each time a CTRL-C event occurs.

int NLM_mainThreadGroupID;
int NLM_threadCnt = 0;
int NLM_exiting = FALSE;
    
void NLM_SignalHandler(int sig)
{
    int handlerThreadGroupID;

    switch(sig)
    {
        case SIGTERM:
            NLM_exiting = TRUE;

            /* Borrow main()'s CLIB context for this OS thread */
            handlerThreadGroupID = GetThreadGroupID();
            SetThreadGroupID(NLM_mainThreadGroupID);

            /* NLM SDK functions may be called here */
            while(NLM_threadCnt != 0)
                ThreadSwitchWithDelay();

            /* Restore the handler's original context before returning */
            SetThreadGroupID(handlerThreadGroupID);
            break;

        case SIGINT:
            /* Re-register so that the next CTRL-C is also ignored */
            signal(SIGINT, NLM_SignalHandler);
            break;
    }
    return;
}
    
void main(void)
{
    ++NLM_threadCnt;
    
    NLM_mainThreadGroupID =
            GetThreadGroupID();
    signal(SIGTERM, NLM_SignalHandler);
    signal(SIGINT, NLM_SignalHandler);
    
    /* Body of main continues here... */
    
    --NLM_threadCnt;
    return;
}

You may also elect simply to call SetCtrlCharCheckMode() to disable CTRL-C's function.

Following the above rules will help your NLM find its way peacefully into that place where all good software goes after it has been duly executed. Help your NLM put its affairs in order so that it can rest peacefully and avoid an undue ABnormal END.

This information was taken from the DeveloperNet University Course called "Programming NDS with NetWare Loadable Modules (NLM)." You can find this course on the web at http://developer.novell.com/education/tutorials/nlm_nds/index.html

* Originally published in Novell AppNotes
