Walking the Stack with the NetWare Internal Debugger: The Example
Senior Software Engineer
01 Aug 2002
In my first article explaining the intricacies of the NetWare internal debugger, we went through the debugger terminology. Now let's walk through an actual server abend and see what to look for in an abended server's registers with the debugger.
First run ABEND.NLM and choose Abend #1. This will abend the server and you'll see the screen shown in Figure 1.
Running the ABEND module abends the server and displays this message
To help you determine the problem, use the question mark (?) command to see what function the server is currently in. In this case, Figure 2 shows that the faulting address is in the LIBC.NLM module, at offset +00066863h (hexadecimal) from the start of its code.
Use the question mark (?) to help you find out which module is at fault
You next use the "r" command to look at the registers, which you can also see in Figure 2. The r command shows that in this instance, the server is abending because register ESI contains 00000000h, which is a null pointer. (A null pointer does not refer to valid memory, so reading through it abends the server.)
You then use the unassemble command (u), starting at the top of the function, to find out more about register ESI. In this case, the function is memcpy, which you can see from the output of the question mark command.
The u command shows you that register ESI received its value from a MOV instruction, on the line that reads:
MOV ESI, [ESP+10]
From this information, you know it was a passed-in parameter, since ESP (Extended Stack Pointer) is used to access the parameters. From here you need to walk the stack to determine who passed in the offending parameter. We'll start this process by dumping a manageable portion of the stack, as shown in Figure 3.
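To see why an operand such as [ESP+10] refers to a passed-in parameter, it can help to model the stack layout. The following is a minimal sketch in C, assuming a 32-bit frame in which memcpy has pushed two registers before reading its arguments and the return address sits directly above them. The helper name param_offset_from_esp is hypothetical and is not part of any NetWare API.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one stack slot on 32-bit x86. */
enum { SLOT_BYTES = 4 };

/*
 * Hypothetical helper: byte offset from ESP to the Nth parameter
 * (1-based), assuming the callee has already pushed `saved_regs`
 * registers and the return address sits directly above them.
 */
uint32_t param_offset_from_esp(uint32_t saved_regs, uint32_t param_index)
{
    assert(param_index >= 1);
    /* Skip the saved registers, the return address, and earlier parameters. */
    return (saved_regs + 1 + (param_index - 1)) * SLOT_BYTES;
}
```

With two saved registers, param_offset_from_esp(2, 2) gives 10h, which matches the [ESP+10] operand if the debugger displays the offset in hexadecimal.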
In Figure 3, the command "dd esp 40" dumps the first four paragraphs (40h, or 64, bytes) of the stack, starting at ESP. This should allow you to see what has gone on before.
The dd command tells the debugger to display the values to which the Extended Stack Pointer (ESP) is pointing
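The size arithmetic behind that command can be sketched in C. This assumes the length argument 40 is hexadecimal and counts bytes, that a paragraph is 16 bytes, and that dd displays 4-byte doublewords. The function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

enum { PARAGRAPH_BYTES = 16, DWORD_BYTES = 4 };

/* Number of 16-byte paragraphs covered by a dump of `length_bytes`. */
uint32_t paragraphs_in(uint32_t length_bytes)
{
    assert(length_bytes % PARAGRAPH_BYTES == 0);
    return length_bytes / PARAGRAPH_BYTES;
}

/* Number of 4-byte doublewords covered by a dump of `length_bytes`. */
uint32_t dwords_in(uint32_t length_bytes)
{
    assert(length_bytes % DWORD_BYTES == 0);
    return length_bytes / DWORD_BYTES;
}
```

Under those assumptions, a length of 0x40 covers four paragraphs, or sixteen doublewords of stack.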
The D06E44A4 at the start of the dump is the same as ESP, since ESP always points to the top of the stack. The 0x00000000 was placed on the stack by the PUSH ESI in memcpy. (Remember, the stack is a snapshot of what has already happened, so you need to work backwards, taking items off until you find the offending parameters.) The next value, 0xD3B05480, is from the PUSH EDI in memcpy. These two values are the registers that memcpy preserved on entry.
The next value, D3ACD17C, is the return address from the call to memcpy(). From this return address, you can see which application called the memcpy function. You verify this by unassembling it and then by using the ? command to determine what NLM was calling memcpy. This is shown in Figure 4.
The question mark command followed by the return address points to the module that abended the server. The -5 1 portion of the U command backs us up one opcode to make certain we know from where the Call was made
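The reason for backing up 5 bytes can be sketched in C. This assumes the call was encoded as a near relative CALL (opcode E8 followed by a 32-bit displacement), which is 5 bytes long on 32-bit x86. The function names are hypothetical, and any displacement you pass in is something you would read from the unassembled bytes, not a value given in this article.

```c
#include <assert.h>
#include <stdint.h>

enum { CALL_REL32_BYTES = 5 };  /* 0xE8 opcode + 4-byte displacement */

/* Address of the CALL instruction, given the return address it pushed. */
uint32_t call_site_from_return(uint32_t return_address)
{
    assert(return_address >= CALL_REL32_BYTES);
    return return_address - CALL_REL32_BYTES;
}

/*
 * Target of a near relative CALL. The displacement is relative to the
 * instruction after the CALL, which is the return address itself.
 */
uint32_t call_target(uint32_t return_address, int32_t rel32)
{
    return return_address + (uint32_t)rel32;
}
```

For the return address D3ACD17C, call_site_from_return gives D3ACD177 as the address where the CALL instruction would begin under this assumption.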
Now you know that ABEND.NLM called memcpy and passed in bad parameters, which led to the abend. At this point, system administrators can take this information back to the manufacturer of the offending software and see if this problem has been reported and if there are any patches or updates.
If you are a developer, you'll want to go a little further. Developers will want to unassemble ABEND.NLM a bit to take a look at things before ABEND calls the memcpy function. You do this by again using the u command with the return address minus 30 bytes, as seen in Figure 5. This starts the listing 30 bytes before the return address, so you can see the instructions leading up to ABEND.NLM's call to memcpy.
The -30 on the u command starts the listing 30 bytes before the return address, showing the instructions that lead up to ABEND.NLM's call to the memcpy function
So now you know that the bad parameter memcpy read from the stack at ESP+10 was PUSHed onto that stack by ABEND.NLM. The second parameter to memcpy came from [EBP-14], a local variable containing an invalid memory location, and that is the cause of the abend.
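In C terms, the pattern described here might look like the sketch below. It is not ABEND.NLM's actual source, which this article does not show; the function names and the buffer size are hypothetical. The guarded variant shows one way a caller could reject a null source pointer instead of handing it to memcpy.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical guarded copy: returns 0 on success, or -1 if either
 * pointer is null, so memcpy is never called with a null pointer.
 */
int checked_copy(void *dest, const void *src, size_t n)
{
    if (dest == NULL || src == NULL) {
        return -1;
    }
    memcpy(dest, src, n);
    return 0;
}

/*
 * Mimics the faulty caller: a local pointer that never receives a valid
 * address is passed as the second (source) argument.
 */
int copy_from_bad_local(char *dest, size_t n)
{
    const char *source = NULL;  /* stands in for the local variable at fault */
    return checked_copy(dest, source, n);
}
```

Here copy_from_bad_local reports an error rather than faulting, because the null source is caught before memcpy runs.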
From this information, you know the problem is not the fault of LIBC.NLM, but needs to be fixed by the writer of the ABEND.NLM software.
Feel free to work through the other three coredumps on ABEND.NLM. (You can find the ABEND.NLM utility at http://www.novell.com/coolsolutions/tools/13516.html, under the ABENDEMO.NLM: Server ABEND Testing Utility heading.) They were chosen to get more difficult as they go and should provide several examples of abends that require walking through the stack. However, be sure to do this on a non-production server; otherwise you will wreak havoc on an unsuspecting public.
* Originally published in Novell AppNotes