Generating NLMs Without Watcom C


JAN BEULICH
Developer Support Engineer
Developer Support

01 Mar 1997

Considers the use of compilers other than Watcom C that are equally capable of generating 32-bit FLAT-model code.

Copyright 1997 by Novell, Inc. All rights reserved. No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, for any purpose without the express written permission of Novell.

All product names mentioned are trademarks of their respective companies or distributors.

Introduction
The EXE2NLM File Set
Naming Conventions
CLIB Independent NLMs
OS Dependent Compiler Generated Code
Other Considerations
EXE2NLM Command Line Options
CS Command Line Options
CS Configuration File Options
Debugging

Introduction

While Watcom C/C++ provides a stable cross-platform development environment that by default includes NetWare Loadable Modules as a development target, there is no reason not to consider other compilers that are equally capable of generating 32-bit FLAT-model code. Watcom's current domination of NLM development is based primarily on the fact that Watcom was a very early provider of a 32-bit compiler. Today, with the Win32 platforms gaining more and more importance, FLAT-model support is a must for any compiler vendor that does not want to be out of the market within the next few years.

There are of course some basic things that one should be aware of before deciding which compiler to use. First, FLAT-model code running on Win32 runs equally well on NetWare because the only restriction is that the two memory models must be equal, which they are (the only difference being that in NetWare everything runs in an almost unprotected single address space environment whereas Win32 user mode applications run in separated protected address spaces). Following that, the major problem one is going to face is the generation of the actual NLM.

Again, Watcom provides a linker with built-in support for NLM generation. But Novell itself also provides a linker, namely NLMLINKx. The limitation of this linker is that it is bound to a certain .OBJ module format variant (PharLap EZ-OMF), and that it is not familiar with recent additions to the .OBJ module format. Next there is Base Technologies NLink Pro, which lifts some (but not all) of the restrictions NLMLINKx places on .OBJ files, but this still has the drawback that you must own another product in addition to the compiler. This is where the EXE2NLM utility set comes into play.

The EXE2NLM File Set

The EXE2NLM file set provides several ways of removing the restrictions named above, in order to give the NLM developer more flexibility in deciding which development environment to use, and preferably to let the developer stay with a familiar compiler rather than having to switch. Below we discuss the advantages and disadvantages of these different approaches.

In order to easily follow the descriptions, you should be familiar with MAKE files, as well as with the FLAT memory model and some basics about .OBJ and .LIB files.

Method 1: Compile to IBM/MS OMF format, convert .OBJ file to PharLap format, and link with NLMLINKx

I believe this is the easiest, fastest, and most reliable method as long as the restrictions described below do not prevent its use. It requires you to insert only a single step into your normal compile process. You have to compile with a compiler that is capable of generating OMF format files, and you have to pass the resulting .OBJ file through the OBJ utility, specifying the /EZ command line option, in order to convert it to the format NLMLINKx understands. In a makefile this would look like this:

.c.obj:
	$(CC) $(CFLAGS) $<
	obj /ez $*

The link process does not require anything special. Compilers suitable for this process are Borland C++, as well as IBM's CSet/2 and Visual Age (although there may be some OMF constructs that would not be understood by NLMLINKx). Microsoft's Visual C++ does not fit here because it generates COFF format .OBJ files rather than OMF. Also, do not attempt to use Borland's Turbo Assembler with the /op switch to generate PharLap OMF .OBJ files; the format generated this way is not the one expected by NLMLINKx. The disadvantages of this method come mainly from limitations that NLMLINKx places on the .OBJ files it understands.

It is suitable for most (possibly even all) C constructs, but not for C++ constructs like exception handling, RTTI, and compiler instantiated inline functions. It is also not suitable for certain assembly language constructs like absolute externals and fixups required for dup()ed data items (constructs which result in LIDATA records requiring fixups).

You can examine makefile.ez for a complete example using Borland C++ with this method.

Method 2: Compile to the compiler's native format, use the compiler's native linker, and pass the resulting .EXE file through EXE2NLM

This is the most flexible variant enumerated here because it allows all language constructs to be handled by the established interface between compiler and linker. No modification is required to the intermediate .OBJ files. The first important difference in setting up the link process is that the NWPRE.OBJ file from the NetWare SDK needs to be replaced by the correct NWPREC.??? file from the EXE2NLM\OBJ directory (OMF\NWPREC.PE for Borland C++, OMF\NWPREC.IBM for IBM CSet/2 and Visual Age, and COFF\NWPREC.PE for MS Visual C++).

The other major difference is the use of import libraries instead of .IMP files. This is required because the linkers we are talking about here do not understand the .IMP files provided for use with NLMLINKx or WLINK. To generate the libraries, use IMP2LIB to convert the .IMP files to the proper library file format. Note that you will need to do this only once (possibly repeating this only when you get updated versions of the NetWare SDK), and you will be able to use these libraries for all your projects, so this is not an additional step required for every NLM you build.

The only additional step to take here is finally converting the linker output file to NLM format using EXE2NLM.

Although this method is very flexible, there are still some drawbacks. First, the code size of the resulting NLM becomes larger when it is converted from a PE (Portable Executable) file, due to the internal format of these files. Second, you have to ensure that the linker output file (regardless of whether this is PE [Win32 executable], LE [Win386 Linear Executable], or LX [OS/2 Linear Executable]) contains exactly one code section and exactly one data section (except for LE and LX files, where there may be a completely uninitialized second data section that is marked as being the stack in the EXE header). For PE files in particular, the limitation on code sections cannot be lifted because of the internal format; for the other formats, as well as for multiple data sections in PE files, lifting this restriction would be a future option.

You can examine makefile.pe for a complete example using Borland C++, makefile.ibm for IBM CSet/2, or makefile.ms for Microsoft Visual C++ with this method.

Method 3: Compile to IBM/MS format, convert to true FLAT format if necessary, and link using LINK386

This method falls somewhere between methods 1 and 2. It is targeted at Borland C++ only at this time, and removes the image size (and implied speed) impact of using PE files as the intermediate format. Due to restrictions in LINK386 this again imposes some limitations on the possible language constructs, but these are not as tight as for NLMLINKx (I do not yet have a complete picture of what works and what does not).

The compiling technique is as for method 1, with the difference that the OBJ utility must be passed the /FLAT option to convert the .OBJ files to true FLAT format. You can omit the conversion if you are sure the compiler generates correct FLAT-model fixups, but there is no harm in passing a correct .OBJ file through the OBJ utility. Note that Borland's compilers do not generate correct FLAT format .OBJ files, whereas compiling through assembly seems to always generate correct FLAT files. Failure to provide correctly formatted .OBJ files to LINK386 leads to omitted fixups in the intermediate executable and hence also to missing fixup information in the final NLM. The corresponding MAKE rule would look like this:

.c.obj:
	$(CC) $(CFLAGS) $<
	obj /flat $*

The linking method resembles the one used in method 2, with the difference that NWPREC.LEX must be used as the initialization file. You can examine makefile.lex for a complete example using Borland C++ with this method.

Global Data that Requires Initialization, Static C++ Objects

If you use static C++ objects that require construction and/or destruction, or if you use Borland's #pragma startup() or #pragma exit() (or the IBM/MS equivalents, which, as far as I can tell, require you to put certain data structures directly into specially named data segments), you will need to replace NWPREC.??? with NWPRECPP.???. You will also need to link the support library (BCPP.LIB, VCPP.LIB, or ICPP.LIB) so that this additional work is completed before and after main() runs.

Naming Conventions

The default naming rule in NetWare does not prepend an underscore to functions imported from other NLMs (namely the C Runtime Library NLM, CLIB). Some compilers support omitting the underscore, while others don't. Generally it is preferable to have the compiler omit the underscores, but if that is not possible, the underscores can still be removed with the help of the import library files that you have to use while linking. To do that, specify the /U switch to IMP2LIB when generating the import libraries from the .IMP files. This causes the linker to resolve the symbols with underscores added to them, but to put them in the intermediate executable without the underscores.

Note that if the compiler does not allow omitting the underscores you will not be able to use method 1 presented above.

CLIB Independent NLMs

If you are going to build NLMs (such as drivers or other low-level components) that do not interface with CLIB but use OS functions directly, you will not need to use NWPRExxx.???. Instead, you will have to export an entry procedure (named _Prelude), an unload procedure (named _Stop), and optionally a check unload status procedure (named _Check). Failure to do so will prevent the NLM from being generated, with the exception that _Prelude may be replaced by an entry point definition. Note that you may internally use other names for these procedures, but you will then need to use the .DEF file feature for renaming exported functions during the link process.

OS Dependent Compiler Generated Code

Unfortunately the current compilers that target the Win32 or OS/2 platforms use (probably due to speed considerations) certain platform-specific features for C++ and/or structured exception handling. Specifically, they depend on the 80x86 FS segment register to point to the current thread's Thread Information Block. This is a Win32 and OS/2 specific convention that does not apply to NetWare, and accessing FS in the same way as under Win32 or OS/2 will lead to system crashes. Therefore, either do not use exception handling or change the compiler's output prior to translating it into machine code.

To do this you will need to use the CS utility that calls the configured compiler to translate C/C++ code into assembler code, then parses the .ASM file for any OS-specific contents (changing it to library calls), and calls the configured assembler to translate the resulting assembler code into machine code. An example of this process, along with sample configuration files for CS, is given in the SAMPLE.XX directory.

Using exception handling and dynamic casts always requires you to link a support library as well as to attach to (become dependent on) NWXCPT.NLM, which provides a very basic implementation of Win32 and OS/2 compatible exception handling OS support. This module is not expected to work on NetWare versions prior to 4.0, so using exception handling will make your NLM incompatible with NetWare 3.

Currently the exception handling library support has been built for Borland C++ only, and hence the required intermediate utility has also been tested with this compiler/assembler pair only.

Warning: If you are going to use TLINK32 for linking, do not use TASM32 for translating the intermediate .ASM files into .OBJ files, because it omits certain necessary records, which prevents TLINK32 from linking correctly. Use TASM instead.

Other Considerations

Due to implementation differences it is necessary that you avoid calling CLIB functions that return structures unless they have been implemented statically in the support library that corresponds to your compiler (BCPP.LIB, VCPP.LIB, or ICPP.LIB). The CLIB implementation uses Watcom's schema of passing a pointer to the structure; other compilers use other methods which are not compatible and will result in your NLM crashing or misbehaving. Currently only div() and ldiv() have been identified to require this manipulation.
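Where no support library supplies a static version, one way to stay clear of the structure-returning CLIB call is to compute the result inside your own module. The following is only a minimal sketch of that idea; the names local_div_t and local_div are hypothetical and are not part of CLIB or of the EXE2NLM file set.

```c
/* Hypothetical local replacement for div(); not part of CLIB or EXE2NLM. */
typedef struct {
    int quot;   /* quotient (truncated toward zero under C99 and later) */
    int rem;    /* remainder */
} local_div_t;

/* The structure is produced and consumed by code from the same compiler,
   so no structure return value ever crosses the CLIB boundary. */
static local_div_t local_div(int numer, int denom)
{
    local_div_t result;

    result.quot = numer / denom;
    result.rem  = numer % denom;
    return result;
}
```

A caller would use local_div(7, 2) where it previously used div(7, 2); the same pattern applies to ldiv() with long operands.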

Always turn on the compiler switches that support multi-threading, unless you neither use callbacks nor create multiple threads.

Always turn exception handling off unless you use the compiler shell described above to modify the OS dependent parts of the compiler output (which is currently possible only with Borland C++, as mentioned above).

The switch from the old (so-called monolithic) to the new (so-called modular) CLIB has brought with it some compatibility issues, which were resolved by introducing new symbol names and mapping the old names to them through #defines. These new symbols are clearly not available in the old CLIB versions, and hence NLMs referencing them will currently not load there. CLIB engineering has proposed supplying a shim module that maps these new symbols to their old counterparts (of course sacrificing the new functionality, but allowing the NLMs to load).

All NWPRE files supplied here reference the new symbols, and until this proposed shim module is made available from the CLIB team this file set includes a stripped-down version (CLIBAUX.NLM) that provides only the most common routines that need to be defined to allow NLMs to load on old platforms.

EXE2NLM Command Line Options

Options are case-insensitive. Options can be preceded by either '/' or '-'.

Option	Description
/b:<filename>	Specifies a bag file (like cross domain call data or SMP marshalling information) to insert
/c=<string>	Specifies the copyright string to insert in the NLM header
/d=<string>	Specifies the description string to insert in the NLM header
/f:<flag>	Specifies one or more (comma separated) flags to set in the NLM header's flags field; named flags are MULTIPLE, OS_DOMAIN, PSEUDOPREMPTION, REENTRANT, and SYNCHRONIZE; other flags can be specified numerically
/h:<filename>	Specifies a help file to insert (use HELPLIB to generate one)
/k=<number>	Specifies the CLIB stack size in bytes, kilobytes (last character must be 'K'), or megabytes (last character must be 'M')
/l	Inserts special module dependency to force new symbols to become accessible on legacy platforms (see above information on CLIBAUX.NLM)
/m:<filename>	Specifies a message file to insert (use MSGLIB to generate one)
/n	Omit CLIB specific information from the NLM header (use only if you do not reference CLIB)
/o:<filename>	Specifies a custom data file to insert
/p:<name>	Specifies additional module dependency that is not automatically taken from the intermediate executable file
/s:<string>	Specifies CLIB initial screen name
/t:<string>	Specifies CLIB initial thread name
/u	Prevents removing uninitialized data from the load image; normally if the size of zeroes at the end of the data section exceeds a certain small number it is replaced by a small piece of code that zeroes this area at runtime
/v:<number>[.<number>[<character>]]	Sets module version number embedded in NLM header
/x	Includes extended header even if it is not required
/y=<number>	Set module type (see NLMLINKx for possible values and their meanings)
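The decision that /u suppresses can be pictured roughly as follows. This is a sketch of the general idea only, not EXE2NLM's actual code: the function names are hypothetical, and the threshold is left as a parameter because the text above says only "a certain small number".

```c
#include <stddef.h>

/* Counts the zero bytes at the end of a data image. */
static size_t trailing_zero_bytes(const unsigned char *data, size_t size)
{
    size_t count = 0;

    while (count < size && data[size - 1 - count] == 0)
        count++;
    return count;
}

/* Returns how many bytes would be stored in the load image: the zero tail
   is dropped only when it is longer than the (hypothetical) threshold,
   in which case it would have to be zeroed again at run time. */
static size_t stored_data_size(const unsigned char *data, size_t size,
                               size_t threshold)
{
    size_t zeros = trailing_zero_bytes(data, size);

    return zeros > threshold ? size - zeros : size;
}
```

Zero bytes in the middle of the data are never affected; only the run that reaches the end of the section is a candidate for removal.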

Options that take values (strings, filenames) can generally either be followed by the data directly, or the value may be separated from the option using a ':' or '=' character. Because the command line (using all these switches) may become quite long, the options can be placed in a response file (one option per line), the name of which is specified as the last (possibly only) parameter to EXE2NLM, preceded by an '@' character.
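To illustrate the size notation accepted by /k, here is a small sketch of how such a value could be converted to a byte count. It is not taken from EXE2NLM: the function name parse_stack_size is hypothetical, and it assumes that 1K means 1024 bytes and that the suffix may be written in either case.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: converts "8192", "16K", or "1M" to a byte count.
   Returns 0 for malformed input. Assumes 1K = 1024 bytes. */
static unsigned long parse_stack_size(const char *text)
{
    char *end;
    unsigned long value;

    if (!isdigit((unsigned char)text[0]))
        return 0;                       /* must start with a digit */
    value = strtoul(text, &end, 10);
    if (*end == '\0')
        return value;                   /* plain byte count */
    if (end[1] != '\0')
        return 0;                       /* suffix must be the last character */
    switch (toupper((unsigned char)*end)) {
    case 'K': return value * 1024UL;
    case 'M': return value * 1024UL * 1024UL;
    default:  return 0;
    }
}
```

With these assumptions, /k=16K and /k=16384 would request the same stack size.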

CS Command Line Options

Options should be treated as case-sensitive (for future compatibility), although they are currently handled case-insensitively. Options can be preceded by either '/' or '-'.

Option	Description
/I:<filename>	Use alternate .INI file; by default the .INI file is searched for in the current directory and in the load directory with a name equal to that of the program, but with the extension replaced by .INI
/Ka	Keep intermediate assembler file
/Kc	Keep comments in assembler file
/Kd	Keep debug info in assembler file
/Lc=<number>	Number of lines to keep cached in order to analyze them and possibly optimize code (default is 8)
/Ls=<number>	Maximum allowed line length (default is 128 bytes; C++ names in particular may require more space)
/P	Only parse .ASM file (calls neither the compiler nor the assembler)
/V	Verbose operation, primarily displays information on every item encountered that needs replacement
/W	In case of error wait for a key to be pressed before terminating (to prevent automatic window closure when called from IDEs)
/!<string>	Pass <string> as option to compiler

Options can be (one option per line) placed in a response file, the name of which can be specified as the last (possibly only) parameter to CS, and preceded by an '@' character.

CS Configuration File Options

Configuration files are split into sections. Currently only a [compiler] and an [assembler] section are supported. Section contents are cumulative; that is, sections can be repeated. For options that allow only a single value the last value encountered is used and no warning or error is generated; multi-value options are accumulated and passed to the target utility in the sequence they were encountered in the configuration file.

Option names are case-insensitive; option values are never modified and passed as is to the target utility. Options and their values are separated by any number of white space characters followed by an '=' character. Every character up to the end of the line is taken as the option value.

Comments are supported on separate lines only and must have a ';' character as first non-blank.

Option	Description
Program=<filename>	Specifies <filename> as the target utility to execute (the PATH environment variable is searched if filename does not include a path)
ResponseFile=<boolean>	Specifies whether the target utility can be passed (understands) a response file containing the command line; the response file will always be preceded with an '@' character
Redirection=<boolean>	Specifies whether the target utility's standard input can be redirected to a file containing the command line
Options=<string>	Specifies options to pass to the target utility; this can be repeated, and options are passed in the sequence they are encountered
AsmOutput=<string>	Specifies the option(s) required to force the compiler to generate an assembly language output file (only allowed in [compiler] section)
OutputOption=<string>	Specifies option(s) required to have the target utility generate an output file that has a name not corresponding to the input file; if this option string starts with a ',' the option (including the comma) is added after the input file (normally it is added before)
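The line syntax described above (sections, ';' comments, case-insensitive option names, values taken verbatim up to the end of the line) can be sketched as a small classifier. This is an illustration only, not the CS implementation; the function and enum names are hypothetical, and the caller is assumed to have stripped the trailing newline.

```c
#include <ctype.h>
#include <string.h>

enum line_kind { LINE_BLANK, LINE_COMMENT, LINE_SECTION, LINE_OPTION, LINE_INVALID };

/* Classifies one configuration line (without its trailing newline).
   LINE_SECTION: name receives the section name as written.
   LINE_OPTION:  name receives the lower-cased option name and *value points
                 at the unmodified text following the '=' character.
   name_size must be at least 1. */
static enum line_kind parse_config_line(const char *line, char *name,
                                        size_t name_size, const char **value)
{
    const char *p = line;
    size_t n = 0;

    *value = NULL;
    name[0] = '\0';
    while (*p == ' ' || *p == '\t')
        p++;
    if (*p == '\0')
        return LINE_BLANK;
    if (*p == ';')
        return LINE_COMMENT;            /* ';' is the first non-blank */
    if (*p == '[') {
        p++;
        while (*p != '\0' && *p != ']' && n + 1 < name_size)
            name[n++] = *p++;
        name[n] = '\0';
        return *p == ']' ? LINE_SECTION : LINE_INVALID;
    }
    while (*p != '\0' && *p != '=' && !isspace((unsigned char)*p)
           && n + 1 < name_size)
        name[n++] = (char)tolower((unsigned char)*p++);
    name[n] = '\0';
    while (*p == ' ' || *p == '\t')     /* any white space before '=' */
        p++;
    if (n == 0 || *p != '=')
        return LINE_INVALID;
    *value = p + 1;                     /* value is passed on as is */
    return LINE_OPTION;
}
```

Accumulating repeated Options= values and keeping only the last value of single-value options would be handled by the caller, per section.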

Debugging

An alternate server-only debugger (NWDBG.NLM) is also included in this file set. Its primary goal was to replace the command-line driven interface of the NetWare internal debugger while mostly copying its functionality. The most important addition is symbol-file support (.SYM files in IBM/MS format, not in Novell format; conversion is possible through SYMCONV).

Function and control keys are sufficiently described through the <F1> and Ctrl-<F1> help options, but the syntax for input required at certain places may not be obvious.

Setting Data Breakpoints. <address>[,<1|2|4>[W|I][@<processor>]]

Examples:

to set a byte-wide read memory breakpoint enter	12345678
to set a dword-wide read memory breakpoint enter	12345678,4
to set a byte-wide read/write memory breakpoint enter	12345678,1w
to set a word-wide I/O breakpoint enter	12345678,2i
to set a word-wide read memory breakpoint specific to processor 2 enter	12345678,2@2
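The breakpoint syntax can also be read as a small grammar. The sketch below parses it in C purely as an illustration; it is not the debugger's own code. The structure and function names are hypothetical, and it assumes that addresses are entered in hexadecimal and that a breakpoint without W or I is a read breakpoint, as the examples suggest.

```c
#include <ctype.h>
#include <stdlib.h>

struct data_breakpoint {
    unsigned long address;
    int size;        /* 1, 2, or 4 bytes; defaults to 1 */
    char kind;       /* 'R' read (default), 'W' read/write, 'I' I/O */
    int processor;   /* -1 when no @<processor> part is given */
};

/* Parses <address>[,<1|2|4>[W|I][@<processor>]].
   Returns 1 on success, 0 on a syntax error. */
static int parse_data_breakpoint(const char *text, struct data_breakpoint *bp)
{
    char *end;

    bp->size = 1;
    bp->kind = 'R';
    bp->processor = -1;
    if (!isxdigit((unsigned char)*text))
        return 0;
    bp->address = strtoul(text, &end, 16);   /* assumed hexadecimal */
    if (*end == '\0')
        return 1;                            /* address only */
    if (*end != ',')
        return 0;
    end++;
    if (*end != '1' && *end != '2' && *end != '4')
        return 0;
    bp->size = *end - '0';
    end++;
    if (toupper((unsigned char)*end) == 'W' || toupper((unsigned char)*end) == 'I') {
        bp->kind = (char)toupper((unsigned char)*end);
        end++;
    }
    if (*end == '@') {
        end++;
        if (!isdigit((unsigned char)*end))
            return 0;
        bp->processor = (int)strtol(end, &end, 10);
    }
    return *end == '\0';
}
```

Note that under this reading the size field is required whenever a type letter or a processor number follows.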

Setting Code Breakpoints. In the code pane <F2> can be used to toggle code breakpoints at the current cursor position, or the input window can be used to specify addresses. Non-temporary code breakpoints are currently global for all processors.

Changing Code and Data Breakpoints. In the breakpoint list window, the cursor keys, <SPACE>, <DEL>, and Ctrl-<ENTER> are recognized. <SPACE> toggles the breakpoint on and off, <DEL> removes the breakpoint, and Ctrl-<ENTER> moves the code or data window display location to the highlighted breakpoint. Use <ENTER> or <ESC> to close the breakpoint list window.

Follow Address in Data Pane. The selection window allows changing the target address type using cursor keys and <SPACE>. The target address type consists of the data type (code or data), the address size (16- or 32-bit), and the distance (near, far, or selector:0). Both code and data pane have a 15-entry history of the last addresses displayed.

Changing Memory. In both the code and data pane memory can be changed by just starting to type. Be aware that in the data pane memory is updated immediately after a display element has been completely overwritten or <ENTER> has been pressed. In the code pane you have to type one complete assembly level instruction (symbols are not yet allowed here). Pseudo instructions DB, DW, and DD can be used.

Changing Registers. Move the highlight cursor to the register you want to change and simply type the new value (symbols are allowed here). Note that the whole register is updated; there is no way to preserve part of the register (e.g., to change only AL within EAX). The flags register is somewhat special: only the individual flag fields that are displayed as characters below the numeric value can be toggled, using <SPACE>.

Locating the Nearest Symbol. Just press '?', optionally followed by an address (the default is the current cursor position), to get the closest preceding and following symbols, if any. Exported symbols found in NetWare's internal symbol tables, symbols loaded from Novell debug info placed in NLMs, and symbols from .SYM files loaded by the debugger are all used for this, as well as for any other symbol-related operation.

Generally, .SYM files are searched for:

when loading the debugger: in the debugger load directory and the search path for all modules that are already in memory; this allows loading .SYM files for OS components.
when loading a module to be debugged with the DEBUG console command (which simply replaces the LOAD command for this purpose); the DEBUG command has the side-effect of breaking at the first instruction of the NLM.

Use the cursor keys to navigate through all other information lists.

Restarting the Server. Alt-Ctrl-<DEL> can be used to restart the server from within the debugger. Use this with caution because no updates are made to the disk system in this case. Note that this may sometimes not work (e.g., when there are hardware interrupts pending) and may even sometimes hang the machine because there is no exact way to determine whether reboot is possible at a given time. (You may, however, do this from any processor; the debugger will switch to the boot-strap processor and shut down all others.) There is no way to reboot the computer from within the debugger.

Processor Faults and Abends. Generally, as this is a development tool, all faults and abends should be considered unrecoverable. Simply acknowledge the message with any key except <ESC> and then do any analysis of the problem in the debugger. However, some faults can be forwarded to the OS by pressing <ESC> on the error message screen. You can also do the analysis in the debugger first and then press <F9>, which will (if you did not remove the error condition that led to the fault) redisplay the error message, so that you can press <ESC> then. Note that abends are generally unrecoverable and can never be forwarded to the OS, so with the debugger running you cannot use the automatic abend recovery on 4.11 servers.

* Originally published in Novell AppNotes

Disclaimer

The origin of this information may be internal or external to Novell. While Novell makes all reasonable efforts to verify this information, Novell does not make explicit or implied claims to its validity.