The NetWare Internal Debugger: Walking the Stack

Tom Buckley
Senior Software Engineer
Novell, Inc.

01 Jun 2002

NetWare includes a powerful tool for debugging problems: the internal debugger. This tool, while intimidating at first, has several advantages:

It is available on any NetWare server.
It can always show what's going on.
It is powerful enough to find the root cause for almost every problem that occurs.

Additionally, when a core dump is taken from a server, the virtual debugger (VDB) provides much of the same functionality.

I want to discuss how to use the NetWare debugger and the technique of "walking the stack" to determine the root cause of some typical Abends that occur.

Abend Terminology

To begin, we must discuss some Abend terminology. There are two basic types of Abnormal Ends, or Abends: processor-detected and code-detected.

A processor-detected Abend is identified by the words "Processor Exception" following the type of Abend. This includes "Page Fault," "Machine Check," or any other type that is followed by "Processor Exception." These Abends occur when the processor detects an error, such as a read from or a write to non-existent memory. An example would be a function referencing [EAX] when EAX = 00000000.

A code-detected Abend typically has a bit more information and does not mention the processor. The messages include "Free Detected a Corrupt Trailing Red Zone" or "CsleepUntilInterrupt Called on a Processor Other Than 0." These messages indicate a problem that the software has found through one of its numerous built-in checks. When these errors occur, the server halts so you can perform further debugging and determine the root cause of the corruption.

Assembly Instructions and "Walking the Stack"

The internal and virtual debuggers both display their information as Assembly instructions. Using the debuggers effectively requires a good knowledge of Assembly language in order to navigate through the instructions. As you spend more time in the debugger, it becomes easier to understand how C code is translated into Assembly, which helps greatly in using the debugger.

One of the most useful skills to learn in the debugger is how to "walk the stack," which consists of removing from the stack what has already been done. You do this "unwinding" of what has already occurred in order to reconstruct how the server reached the state it was in when it stopped running.

The stack itself is a lot like a trail of bread crumbs, with the processor leaving behind a line of crumbs showing where it has been. Walking the stack means walking back along that trail, picking up the bread crumbs.

When Working with the Stack

There are two things to keep in mind when working with the stack. The first is counting in hexadecimal. Hex counts from 0 to F, then 10. When walking through the stack trail, it is important to remember that each long on the stack is 4 bytes, so the address advances by 4 for each long. So when you look at a paragraph (one 16-byte line) on the stack, you would count 4, 8, C, 10 (see Figure 1).

The second thing to keep in mind is ESP, which is the register that holds the current stack pointer. This is the starting point for determining what has occurred.

Figure 1: Counting by 4 across a line (each line is considered a paragraph). Four paragraphs are shown in this example.
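The counting described above can be sketched in a few lines of Python. This is only an illustration: the function names and the starting ESP value are made up for the example, and the layout assumes 4-byte longs with four longs per 16-byte paragraph.

```python
def count_longs(n):
    """Running byte count after each of n longs, as hex strings."""
    return [format(4 * i, "X") for i in range(1, n + 1)]


def paragraph_addresses(base, longs_per_line=4):
    """Address of each 4-byte long in the paragraph that starts at base."""
    return [base + 4 * i for i in range(longs_per_line)]


if __name__ == "__main__":
    print(count_longs(4))  # the count for one paragraph

    esp = 0x00F0A000  # hypothetical stack pointer, for illustration only
    for line in range(4):  # four paragraphs, as in Figure 1
        base = esp + 0x10 * line
        print(" ".join(f"{a:08X}" for a in paragraph_addresses(base)))
```

Each printed line starts 10 (hex) bytes after the previous one, which is why a full paragraph ends the count at 10.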

Also, there are six typical instructions that change the stack: PUSH, POP, ADD, SUB, CALL, and RETURN (shown as RET in the disassembly). Their functionality goes like this:

PUSH pushes a value onto the stack
POP removes a value from the stack
ADD when done to ESP removes items from the stack
SUB when done to ESP adds items to the stack
CALL PUSHes a RETURN address onto the stack
RETURN POPs a return address off the stack
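To make the effect of these six instructions on ESP concrete, here is a minimal Python model of a stack. It is a sketch, not a processor emulator: the class name, method names, and addresses are invented for the example, and it assumes 4-byte values and a stack that grows toward lower addresses (which is why SUB adds space and ADD removes it).

```python
class Stack:
    """Toy model of a 32-bit stack that grows toward lower addresses."""

    def __init__(self, esp):
        self.esp = esp
        self.mem = {}  # address -> 32-bit value

    def push(self, value):
        self.esp -= 4              # make room first
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4              # release the slot
        return value

    def add_esp(self, n):
        self.esp += n              # ADD ESP, n removes n bytes

    def sub_esp(self, n):
        self.esp -= n              # SUB ESP, n reserves n bytes

    def call(self, return_address):
        self.push(return_address)  # CALL pushes the return address

    def ret(self):
        return self.pop()          # RET pops the return address


if __name__ == "__main__":
    s = Stack(0x1000)              # hypothetical starting ESP
    s.push(0x11)                   # a parameter
    s.call(0x00401005)             # hypothetical return address
    s.sub_esp(8)                   # two longs of locals
    print(f"ESP inside the function: {s.esp:08X}")
    s.add_esp(8)                   # discard the locals
    print(f"Returning to: {s.ret():08X}")
```

Notice that every operation moves ESP by a multiple of 4, which is what makes counting by longs possible when you walk back.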

What Happens on the Stack When a Call Is Made

When troubleshooting an Abend, the first thing to understand is what happens on the stack when a call is made. At the time of the call, the address of the instruction that follows the call is PUSHed onto the stack. This is known as the RETURN address, and it always points to the instruction immediately following the CALL instruction.

At the point of the CALL, execution moves to the new function. Most C functions then preserve the values of some registers by PUSHing them onto the stack. The registers typically preserved are EBX, ESI, EDI, and EBP. Because they are preserved, they are restored to their original values when the function finishes (see Figure 2).

After the register preserves, any space needed on the stack for local variables is reserved with the SUB ESP, X instruction, where X is the number of bytes required.

Figure 2: What the beginning (or top) of a typical function looks like in the debugger. Each compiler generates its own version, but this example shows a typical function.
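Once you know how many registers a function preserves and how much space it reserves for locals, you can work out where its return address sits relative to ESP. The sketch below shows that arithmetic. The function name is hypothetical, and the order in which the registers are PUSHed is an assumption made for illustration; as noted, each compiler does it its own way, so read the real order from the top of the function in the debugger.

```python
def prologue_layout(local_bytes, preserved=("EBX", "ESI", "EDI", "EBP")):
    """Offsets from ESP (just after the function's entry code) to each item.

    preserved is listed in the order the registers are PUSHed, so the
    last one in the tuple ends up closest to ESP.
    """
    layout = []
    if local_bytes:
        layout.append((0, f"locals ({local_bytes:X}h bytes)"))
    offset = local_bytes
    for reg in reversed(preserved):
        layout.append((offset, f"saved {reg}"))
        offset += 4
    layout.append((offset, "return address"))
    return layout


if __name__ == "__main__":
    # Four preserved registers followed by SUB ESP, 10
    for offset, item in prologue_layout(0x10):
        print(f"[ESP+{offset:02X}] {item}")
```

With four preserves and 10 (hex) bytes of locals, the return address is 20 (hex) bytes above ESP: 10 for the locals plus 4 for each of the four registers.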

As you continue down through the execution of the function, you will see that one function eventually calls another. The call is preceded by PUSHing the parameters for the function onto the stack. The CALL then moves execution to the called function, which PUSHes its own register preserves and SUBs space for its locals (see Figure 3).

Figure 3: Two separate calls being made. Each consists of the parameters being PUSHed onto the stack, the actual CALL opcode, and then the ADD ESP, C instruction that cleans up the stack.
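The ADD ESP, C in the caption is a useful clue: C (hex) is 12 bytes, or three 4-byte parameters. The following sketch shows that relationship and checks that ESP ends up where it started. It assumes every parameter occupies one long and that the caller, not the called function, cleans up the parameters; the function names are made up for the example.

```python
def cleanup_bytes(param_count):
    """Operand of the ADD ESP, n that removes param_count parameters."""
    return 4 * param_count


def params_from_cleanup(add_esp_operand):
    """Number of 4-byte parameters implied by an ADD ESP, n instruction."""
    if add_esp_operand % 4:
        raise ValueError("operand is not a whole number of longs")
    return add_esp_operand // 4


def simulate_call(esp, param_count):
    """Track ESP through one complete call and return its final value."""
    esp -= 4 * param_count              # PUSH each parameter
    esp -= 4                            # CALL pushes the return address
    esp += 4                            # RET in the called function pops it
    esp += cleanup_bytes(param_count)   # ADD ESP, n in the caller
    return esp


if __name__ == "__main__":
    print(f"ADD ESP, {cleanup_bytes(3):X} removes 3 parameters")
    print(f"ADD ESP, C implies {params_from_cleanup(0xC)} parameters")
```

Reading the ADD ESP operand after a CALL is often the quickest way to tell how many parameters to remove when you walk back past that call.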

When walking the stack, these items, listed here in the order they were placed on the stack, need to be removed for each function:

PARAMETERS
CALL (RETURN ADDRESS)
PRESERVES
LOCALS

By going through this list, it is possible to walk back through the stack far enough to determine which application function sent in bad parameters. This will help you find the culprit at the point of the problem.
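The whole walk can be sketched as a short Python routine. Starting at ESP, it removes each function's locals, preserves, return address, and parameters, which is the reverse of the order in which they were placed on the stack. Everything here is hypothetical: the function name, the dump values, and the frame sizes are invented for illustration. In a real session you read the sizes from each function's entry code and from the ADD ESP instruction after each CALL.

```python
def walk_stack(dump, frames):
    """Walk back from ESP through a list of frames.

    dump   -- list of 32-bit values; dump[0] is the long at ESP
    frames -- innermost function first, each a tuple of
              (name, local_bytes, preserved_count, param_count)
    """
    index = 0
    results = []
    for name, local_bytes, preserved_count, param_count in frames:
        index += local_bytes // 4        # remove LOCALS
        index += preserved_count         # remove PRESERVES
        return_address = dump[index]     # the CALL's return address
        index += 1
        params = dump[index:index + param_count]  # PARAMETERS
        index += param_count
        results.append((name, return_address, params))
    return results


if __name__ == "__main__":
    dump = [
        0x11111111, 0x22222222,  # inner: 8 bytes of locals
        0xAAAA0001, 0xAAAA0002,  # inner: two preserved registers
        0x00401234,              # return address into outer
        0x00000000,              # parameter passed to inner
        0x33333333,              # outer: 4 bytes of locals
        0xBBBB0001,              # outer: one preserved register
        0x00405678,              # return address into outer's caller
        0x00000005, 0x00000006,  # parameters passed to outer
    ]
    frames = [("inner", 8, 2, 1), ("outer", 4, 1, 2)]
    for name, ret, params in walk_stack(dump, frames):
        shown = ", ".join(f"{p:08X}" for p in params)
        print(f"{name}: returns to {ret:08X}, parameters: {shown}")
```

In this made-up dump, the parameter passed to inner is 00000000, and the return address shows which function's CALL passed it in.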

Next time, we'll take a walk through the F dump that you can generate through the ABEND.NLM.

* Originally published in Novell AppNotes
