---------------------------------------------------
Interfacing DJGPP with Assembly-Language Procedures
Written by Matthew Mastracci
Last Revised: 10/11/1997
---------------------------------------------------

Table of Contents
=---------------------------------------------------
Section
=---------------------------------------------------------------------------
1.0 Introduction
    1.1 A few words
    1.2 Where to get NASM
2.0 Assembly language with DJGPP
    2.1 Inline assembly (AT&T style)
    2.2 When and when not to use NASM
3.0 DJGPP and NASM
    3.1 Introduction to NASM
    3.2 Using NASM with makefiles
    3.3 Using NASM with RHIDE
    3.4 Getting used to NASM
    3.5 A note on memory references
    3.6 Returning 64-bit values
    3.7 Name-mangling
    3.8 Internal data structures
    3.9 Exportable data structures
    3.10 A note about labels
    3.11 Accessing external symbols
4.0 Advanced NASM topics
    4.1 Accessing real-mode interrupts
    4.2 Direct memory access (to protected-mode memory)
    4.3 Direct memory access (to real-mode memory)
    4.4 malloc() and NASM
    4.5 Hooking interrupts
    4.6 _CRT0_FLAG_LOCK_MEMORY
    4.7 Real-mode callback functions
    4.8 Doubleword-aligned accesses
5.0 Contacting the author
    5.1 Closing comments
    5.2 Getting DJGPPASM.DOC

=---------------------------------------------------
1.0 - Introduction
=---------------------------------------------------

1.1 - A few words
=---------------------------------------------------

Most programmers have no trouble dealing with assembly language in real
mode. Your code is in one segment, your data is in another and you have
complete, unhindered access to the entire system. Things like "general
protection faults", "segment violations" and "page faults" seemed like
things only Windows programmers had to deal with.

In protected mode, however, things are different. Instead of segments, we
have to use selectors. You aren't allowed to write to any memory location
you please and the absolute addresses you took for granted in real mode
don't work in quite the same way.
The purpose of this tutorial is to ease the transition from real mode to
protected mode and to help the reader grasp the fundamental concepts behind
well-behaved assembly language.

1.2 - Where to get NASM
=---------------------------------------------------

The easiest place to get NASM from is Simtel. There is a large list of
Simtel mirrors in the DJGPP FAQ. It's in the "/pub/simtelnet/msdos/asmutl"
directory, with the name NASM???.ZIP (where ??? is the latest version
number).

=---------------------------------------------------
2.0 - Assembly language with DJGPP
=---------------------------------------------------

2.1 - Inline assembly (AT&T style)
=---------------------------------------------------

DJGPP allows you to code full-blown inline assembly with one catch: you
have to use a syntactically different set of opcodes referred to as
"AT&T-syntax." On top of that, Gas, DJGPP's assembler, is only used to
getting code straight from GCC, which means that it has very limited
syntax-checking and may not do exactly what you think you told it to.

To begin, let's look at the format of an inline-assembler statement in a
function:

unsigned short int AddFour(unsigned short int x)
{
    unsigned short int y;

    __asm__ __volatile__(
        "movw %w1, %%ax\n"
        "addw $4, %%ax\n"
        "movw %%ax, %w0\n"
        : "=r" (y)
        : "r" (x)
        : "ax"
    );
    return y;
}

This function (as you may have guessed) adds four to a given parameter and
returns the value. Because the code uses ax as scratch space, ax is listed
as a clobbered register so the compiler will not keep x or y in it. You
may, however, wonder about the strange arrangement of registers and
variables for the opcodes. Compared to Intel-syntax asm, they're backwards!
In essence, standard x86 opcodes "act" from right to left, ie: mov ax, bx
gets the value of ax from bx; add ax, 4 adds 4 to ax. AT&T opcodes,
however, act from left to right.
Here are some examples:

    Intel             AT&T
    mov bx, ax        movw %%ax, %%bx
    add ax, 4         addw $4, %%ax

Basically, the format of an inline assembly statement is:

__asm__ [__volatile__](
    "opcodes"
    : output-vars
    : input-vars
    : modified-regs
);

- "__asm__" instructs the compiler to treat the parameters of the statement
  as pure assembly and to pass them to the assembler (Gas) as written;
- "__volatile__" is an optional statement which instructs the compiler not
  to move any of your opcodes around from where you place them or combine
  them as it pleases. This is probably a good idea, as Gas is designed to
  take pure GCC output and may take some unwanted liberties with the code;
- output-vars is a list of constraints indicating which variables will be
  modified by the routine: use the format "=r" (x), where r is the
  constraint for x (r asks for a general register) and x is the output
  variable to be modified;
- input-vars is a list of constraints indicating which variables will be
  used by the routine: use the format "r" (x), where r is the constraint
  for x and x is the input variable to be referenced in the routine;
- modified-regs is a list of hard registers which are "clobbered" by the
  function.

Inside the actual assembly statements, you'll notice that you don't
actually say "movw x, %%ax" or something like that. That's because the
compiler "renames" the variables in the order you "mention" them. The first
variable defined as an output variable will be "%w0". If a second output
variable exists, it will be "%w1"; if it doesn't exist, the first input
variable will be "%w1" instead. In essence, the numbering starts at zero
with the first of the output-vars, counts up through the end of
output-vars and then continues through input-vars, incrementing by one
each time.

Inline assembly is convenient in DJGPP and quite fast, but its complexity
in programming and general unreliability make it difficult to use for long
programming tasks. In the next section, we will explore an alternative,
Intel-syntax-based method that works incredibly well with DJGPP.

2.2 - When and when not to use NASM
=---------------------------------------------------

There are a few points to consider when choosing whether NASM will be the
best assembler for you to use, or even whether a combination of the two
methods would be better.

1. Using the __asm__ directive means that you can inline your assembly
   language function, which you can't do with NASM. On top of that, you can
   tell the compiler which registers you clobber and it can work around
   that.

2. GAS was only designed to take input from the compiler, not from the
   programmer. There are a few things you need to watch out for, as its
   error-checking is fairly minimal.

3. NASM follows the TASM/MASM format for assembly in most cases. If you're
   used to these assemblers, the adjustment period is much shorter, whereas
   with AT&T syntax, you have to practice a little more to get used to it.

4. NASM is a solid assembler that won't optimize behind your back. Each
   instruction you enter is assembled exactly as you enter it. Also, NASM
   supports MMX, which (as far as I can tell) isn't supported by GAS.

My personal preference is this: use NASM for your major assembly routines
(ie: the sprite-drawing routines in a graphic library or callback
functions) and use the __asm__ directive for minor functions that you may
want to inline at some point in the future. If you're planning on creating
a .S file, you can probably save a lot of headache by just creating it as a
.ASM file and assembling it with NASM instead of GAS.

=---------------------------------------------------
3.0 - DJGPP and NASM
=---------------------------------------------------

3.1 - Introduction to NASM
=---------------------------------------------------

There is a freeware package named NASM, the Netwide ASseMbler project. It
is a fully-functional assembler with most of the capabilities of commercial
products such as MASM or TASM, but (in the spirit of DJGPP and other
freeware compilers) costs you nothing.
Obtain the latest version of the package, create a %DJDIR%\NASM directory
and unZIP the archive inside the directory with the -d option. This will
create the proper directories for the program. Finally, either copy the
NASM executable to your %DJDIR%\BIN directory or add the directory to your
path.

3.2 - Using NASM with makefiles
=---------------------------------------------------

If you use GNU's make to handle your projects, setting up NASM as a
compiler is fairly easy. For each file, the dependencies/method line should
look like this:

filename.o : filename.asm ; nasm -f coff filename.asm

You could also create a rule for making .o files from .asm files, to save
time:

%.o : %.asm ; nasm -f coff $<

3.3 - Using NASM with RHIDE
=---------------------------------------------------

If you want to include a NASM-compiled file in your project, follow these
steps for each .asm file:

1. Open the project window;
2. Select the .asm file you want to compile with NASM;
3. Hit Ctrl-O for the file's local options;
4. Select "User" for compiler;
5. Enter "nasm -f coff $(SOURCE_FILE)" in the "Compiler" text area; and
6. Set the error-checking to "built-in C-compiler".

This should work for all versions of RHIDE, starting at version 1.2. In
future versions, the author may have implemented a new system for using
external compilers. Check the program updates for more information.

3.4 - Getting used to NASM
=---------------------------------------------------

To begin, let's create a sample assembly language program:

nasmtest.asm:

[BITS 32]

[GLOBAL _AddFour__FUi]

[SECTION .text]

; ---------------------------------------------------------------------------
; Prototype: unsigned int AddFour(unsigned int x);
; Returns:   x + 4
; ---------------------------------------------------------------------------
x_AddFour       equ 8

_AddFour__FUi:
        push    ebp
        mov     ebp, esp

        mov     eax, [ebp + x_AddFour]
        add     eax, 4

        mov     esp, ebp
        pop     ebp
        ret

Let's also create a C++ program to test it:

nasmtest.cc:

#include <stdio.h>

extern unsigned int AddFour(unsigned int);

int main(void)
{
    printf("AddFour(4) = %i\n", AddFour(4));
    return 0;
}

To make things easier, create a makefile like so:

nasmtest.mak:

nasmtest.exe : nasmtest.cc nasmtest.o ; gcc nasmtest.cc -o nasmtest.exe nasmtest.o -v -Wall

nasmtest.o : nasmtest.asm ; nasm -f coff nasmtest.asm

Type "make -f nasmtest.mak" to create the executable. When you run it, it
should print "AddFour(4) = 8". Let's go through the assembler source:

[BITS 32]

This line instructs NASM that you want it to prefix 16-bit code and default
to 32-bit code, not the other way around. You must have this in any
functions designed to run in protected mode. If you don't have it (in a
version of NASM prior to 0.91) you'll probably generate a page fault (try
it), because all of your memory references will have the top word truncated
(ie: ebp = 0x47282567, chopped to 0x2567).

[GLOBAL _AddFour__FUi]

This line tells NASM that the label is to be exported and will be
accessible as an external symbol to other modules. There are other things
that can be placed here, but we will learn about those later.

[SECTION .text]

This defines the .text segment, which is the segment placed first in the
assembled file. It contains any executable code (exportable or not) for the
module. There are other types of segments which we will learn about later.

; ---------------------------------------------------------------------------
; Prototype: unsigned int AddFour(unsigned int x);
; Returns:   x + 4
; ---------------------------------------------------------------------------

These lines give the prototype of the function for your C++ code, for easy
reference. They aren't necessary, but they help you remember what your
function takes as parameters and what it returns.

x_AddFour       equ 8

As NASM has no direct support for parameter structures (as of yet), we can
define our function's parameters with this method. These constants are used
for referencing parameters placed on the stack by the calling function. The
first parameter sits at ebp + 8 because, once the stack frame is set up,
[ebp] holds the caller's saved ebp and [ebp + 4] holds the return address.
Each parameter will take a minimum of four bytes (padded with zeros), which
helps you keep your code 32-bit.

_AddFour__FUi:

This label declaration defines a function called "AddFour." There is,
however, a suffix which represents the parameters of the function. This
suffix is unique to C++ (the label would be "_AddFour" in C) to account for
external function overloading. Each suffix begins with "__F" to say that it
is, in fact, a function. Following this, there are characters representing
the data types (some of them):

    Character   Data Type
    c           char
    d           double
    f           float
    i           int
    s           short
    v           void
    x           long long int

In addition, there are prefixes you can add to the characters to change
them:

    Prefix      Meaning
    U           unsigned (ie: Uc is 'unsigned char')
    P           pointer (ie: Pi is 'int *')
    R           reference (ie: RUc is 'unsigned char &')

        push    ebp
        mov     ebp, esp

These lines are analogous to the "push bp/mov bp, sp" opcodes in real-mode.
They are used to preserve the 32-bit values of the caller's stack frame.

        mov     eax, [ebp + x_AddFour]
        add     eax, 4

We load eax with the 32-bit unsigned int from the stack and add four to it.
Note that integer and pointer return values are returned in eax (extended
into edx for 64-bit values).

        mov     esp, ebp
        pop     ebp
        ret

This simply restores the caller's stack frame and returns to the caller.

3.5 - A note on memory references
=---------------------------------------------------

In NASM, memory references work a little differently than they do in most
other assemblers. To specify the address of a variable (a symbol), you
don't use the "offset" operator. Instead, you just specify the name of the
symbol like so:

        mov     esi, output_variable    ; loads ESI with the *address* of
                                        ; output_variable
        mov     output_variable, esi    ; illegal! trying to load into an
                                        ; immediate value

If you want to access the contents of a memory location, put the variable
name in square brackets:

        mov     esi, [output_variable]  ; loads ESI with the *value* of
                                        ; output_variable
        mov     [output_variable], eax  ; loads output_variable with the
                                        ; contents of eax

That's all there is. If you're ever confused, just imagine that the
assembler is replacing each of the symbols with its absolute constant value
(which it is, as a matter of fact). That way (if output_variable is at
offset 2342), this makes no sense:

        mov     2342, esi

But these do make sense:

        mov     esi, 2342
        mov     [2342], eax

3.6 - Returning 64-bit values
=---------------------------------------------------

One of the great things about DJGPP is its built-in support for 64-bit
integers (signed and unsigned). How can we use these in NASM? Simple: we
return the low doubleword in eax, just as we would return a regular int,
and we return the high doubleword in edx.
Consider:

[BITS 32]

[GLOBAL _BigNum__FUiUi]

[SECTION .text]

; ---------------------------------------------------------------------------
; Prototype: unsigned long long int BigNum(unsigned int a, unsigned int b);
; Returns:   unsigned 64-bit integer, with a as the high doubleword and b
;            as the low doubleword
; ---------------------------------------------------------------------------
a_BigNum        equ 8
b_BigNum        equ 12

_BigNum__FUiUi:
        push    ebp
        mov     ebp, esp

        mov     edx, [ebp + a_BigNum]   ; high doubleword goes in edx
        mov     eax, [ebp + b_BigNum]   ; low doubleword goes in eax

        mov     esp, ebp
        pop     ebp
        ret

Let's create a quick C++ program to test this:

#include <stdio.h>

extern unsigned long long int BigNum(unsigned int a, unsigned int b);

int main(void)
{
    unsigned int a = 0x11111111, b = 0x22222222;

    printf("The 64-bit integer made from 0x%0x and 0x%0x is 0x%016Lx (%Lu)",
           a, b, BigNum(a, b), BigNum(a, b));
    return 0;
}

The range of the 64-bit 'unsigned long long int' is from 0 to
18,446,744,073,709,551,615. That's 18 quintillion, or 18x10^18!

3.7 - Name-mangling
=---------------------------------------------------

As mentioned before, C++ function names are "mangled" so that the
programmer can overload a function with different parameters. If you don't
want your function names to have annoying trailers like "__Fv" or
"__FUiUicP11__dpmi_regs", there's another way to do it. If you load up one
of DJGPP's supplied include files, you'll notice this near the start:

#ifdef __cplusplus
extern "C" {
#endif

int foo(int bar);
void fx(void);
...

#ifdef __cplusplus
}
#endif

The extern "C" declaration tells the compiler that the function was
originally compiled with a regular C-compiler, not a fancy C++-compiler
that mangles your function names. This way, you can write your assembly
language functions with just a preceding underscore (like _foo or _bar).
Before you get mad at C++ for mangling your simple names, take a look at
what C++ does to your complicated class names!

3.8 - Internal data structures
=---------------------------------------------------

Many of the functions you will write with the assembler will require
various variables that won't be seen outside the scope of the module. To do
this, let's create a new section below the function AddFour. Put this at
the end of nasmtest.asm:

[SECTION .data]

AddConstant     dd 00000004h

Now modify AddFour like so:

        push    ebp
        mov     ebp, esp

        mov     eax, [ebp + x_AddFour]
        add     eax, [AddConstant]

        mov     esp, ebp
        pop     ebp
        ret

The code now references data inside the data segment (everything in .data
is placed in the data segment referenced by the selector in ds). With NASM,
you must place the variable's name in square brackets to indicate a
reference to a memory location. If you forget the brackets, the assembler
will treat it as the actual offset of the variable, often leading to
unwanted side-effects.

3.9 - Exportable data structures
=---------------------------------------------------

What if we want a module to make a variable accessible to a module it gets
linked with? Let's add a version string to the module for the calling
program to read. Add this in the .data section, under AddConstant:

_VersionString  db "NASMTEST.ASM - Version 0.0a", 00h

Now, at the top of the file, under the first [GLOBAL ...] declaration, add:

[GLOBAL _VersionString]

This label is now exportable as a variable in the calling program. Add the
following line to nasmtest.cc, under the declaration of AddFour:

extern char VersionString[];

And under the first printf() statement, add:

printf("VersionString = '%s'\n", VersionString);

Compile this, and you should see the exported string from the assembly
module. Note that the variable can be accessed internally as well.
Consider:

; ---------------------------------------------------------------------------
; Prototype: void NewVersion(char c);
; Returns:   nothing
; ---------------------------------------------------------------------------
c_NewVersion    equ 8

_NewVersion__Fc:
        push    ebp
        mov     ebp, esp

        mov     al, [ebp + c_NewVersion]
        ; mov eax, [ebp + c_NewVersion] would be just as valid, as the
        ; stack is padded with zeros

        mov     byte [_VersionString + 26], al
        ; 26 is the offset of the 'a' in VersionString from position zero

        mov     esp, ebp
        pop     ebp
        ret

Add the following two lines as well:

NewVersion('g');
printf("VersionString = '%s'\n", VersionString);

Don't forget to export the function with [GLOBAL ...] and declare the
external function in your C++ source.

What if we wanted to offer a complicated structure? Add the following
declaration to nasmtest.cc:

extern struct
{
    unsigned int x, y;
    char * VersionStringPointer;
} DataValues;

Now add the following in the .data section:

_DataValues:
        dd      00000001                ; unsigned int x
        dd      00000002                ; unsigned int y
        dd      _VersionString          ; char * VersionStringPointer
                                        ; Loaded with offset of version
                                        ; string, effectively a pointer

And export it with:

[GLOBAL _DataValues]

Now you can check that the structure works:

printf("x = %i, y = %i\n", DataValues.x, DataValues.y);
// This line uses a char * instead of char like before
printf("VersionString = '%s'", DataValues.VersionStringPointer);

Other variable types may be passed back and forth in this manner. Just
remember to ensure that the structures on both sides are exactly the same.

3.10 - A note about labels
=---------------------------------------------------

In essence, there are three types of labels within NASM:

1) Exportable references: labels which are declared in a [GLOBAL ...]
   statement at the top of a file, accessible by other modules linked with
   this one.
   These labels reference functions, structures and variables, and begin
   with an underscore, ie:

_InitMode13h__Fv:               ; Exportable function
        ; function code

_DataTable:                     ; Exportable structure
        db      00h
        db      00h

_XCoord         dw 0000h        ; Exportable variable

2) Internal references: labels referencing functions, structures and
   variables which are private to the module are written without a prefix
   character:

SaveRegs:                       ; Internal function
        ; function code

DescriptorTable:                ; Internal structure
        dw      0000h, 0000h

RAMPointer:     dd 00000000h    ; Internal variable

3) Internal jump targets: labels which will be the target of conditional
   and unconditional jumps are written with a period as a prefix:

.LoopStart:
        loop    .LoopStart
        jmp     .DoneProc
.DoneProc:

Adhering to these guidelines will ensure your code works properly with
DJGPP and most debuggers.

3.11 - Accessing external symbols
=---------------------------------------------------

Your assembly language module can access public symbols from other modules
it gets linked with as well. Create two strings in the .data section like
so:

; Note that 0ah is used instead of \n, as \n is not interpreted by
; printf, but processed by the compiler instead.
PrintTemplate1  db "VersionString = '%s'", 0ah, 00h
PrintTemplate2  db "unsigned int Exported = 0x%08x", 0ah, 00h

Now create an assembler function to access some external symbols:

; ---------------------------------------------------------------------------
; Prototype: void PrintStrings(void);
; Returns:   nothing
; ---------------------------------------------------------------------------
_PrintStrings__Fv:
        push    ebp
        mov     ebp, esp

        ; Remember that C pushes parameters from right to left, not
        ; left to right like Pascal!

        ; Note that we push the pointer for a string...
        push    dword _VersionString
        push    dword PrintTemplate1
        call    _printf

        ; ... but push the actual value of most everything else
        push    dword [_Exported]
        push    dword PrintTemplate2
        call    _printf

        mov     esp, ebp                ; also discards the pushed parameters
        pop     ebp
        ret

We'll have to tell NASM that we have a few external symbols, so it doesn't
give us any errors. Put these by the [GLOBAL ...] definitions:

[GLOBAL _PrintStrings__Fv]

[EXTERN _printf]
[EXTERN _Exported]

Now remove any code previously contained in main() and add the following
lines:

extern void PrintStrings(void);

unsigned int Exported;

int main(void)
{
    Exported = 0xdeadbeef;
    PrintStrings();
    return 0;
}

=---------------------------------------------------
4.0 - Advanced NASM topics
=---------------------------------------------------

4.1 - Accessing real-mode interrupts
=---------------------------------------------------

When you write your C++ code, you usually don't worry about having to call
a real-mode interrupt. Usually, you just define a register structure and
call __dpmi_int or one of the various other wrappers provided by DJGPP. In
assembly language, however, you don't have it quite so easy. The easiest
way is to call the DPMI server's function 0x0300, which simulates a
real-mode interrupt.
Create a new file with the following:

inttest.asm:

[BITS 32]

[SECTION .text]

SaveRegs:
        mov     [SaveEAX], eax
        mov     [SaveEBX], ebx
        mov     [SaveECX], ecx
        mov     [SaveEDX], edx
        mov     [SaveESI], esi
        mov     [SaveEDI], edi
        mov     [SaveEBP], ebp
        pushf
        pop     eax
        mov     [SaveFlags], ax
        ret

RestoreRegs:
        pushf
        pop     eax
        mov     ax, [SaveFlags]
        push    eax
        popf
        mov     eax, [SaveEAX]
        mov     ebx, [SaveEBX]
        mov     ecx, [SaveECX]
        mov     edx, [SaveEDX]
        mov     esi, [SaveESI]
        mov     edi, [SaveEDI]
        mov     ebp, [SaveEBP]
        ret

[SECTION .data]

; Register set structure for int 31h, function 0300h
RegSet:
SaveEDI         dd 00000000
SaveESI         dd 00000000
SaveEBP         dd 00000000
                dd 00000000     ; Note: reserved
SaveEBX         dd 00000000
SaveEDX         dd 00000000
SaveECX         dd 00000000
SaveEAX         dd 00000000
SaveFlags       dw 0000
SaveES          dw 0000
SaveDS          dw 0000
SaveFS          dw 0000
SaveGS          dw 0000
SaveIP          dw 0000
SaveCS          dw 0000
SaveSP          dw 0000
SaveSS          dw 0000

SaveRegs and RestoreRegs can now be called from your program to fill and
examine the values stored in the structure RegSet. Now you can simply wrap
an interrupt call like:

        int     10h

like so:

        ; Pass our registers to the interrupt
        call    SaveRegs
        ; Function 0300h: Simulate real-mode interrupt
        mov     ax, 0300h
        ; Interrupt to simulate: int 10h
        mov     bx, 0010h
        ; Copy zero bytes from our stack (you probably won't need otherwise)
        mov     cx, 0000h
        ; Load es with ds, so es:edi points to RegSet
        push    ds
        pop     es
        mov     edi, RegSet
        ; Do it
        int     31h
        ; Get the return values back
        call    RestoreRegs

As you can see, there is a lot of overhead involved in calling a real-mode
interrupt from protected-mode. The message? If you don't have to, don't.
Use these sparingly, in parts of your code that aren't time critical. Let's
try this with a real-life example.
Add these two export declarations:

[GLOBAL _InitMode13h__Fv]
[GLOBAL _RestoreTextMode__Fv]

Next, add their respective functions:

; ---------------------------------------------------------------------------
; Prototype: void InitMode13h(void);
; Returns:   nothing
; ---------------------------------------------------------------------------
_InitMode13h__Fv:
        push    ebp
        mov     ebp, esp

        ; ah = 00h, Sets video mode with int 10h to mode al
        mov     ax, 0013h
        call    SaveRegs
        ; Simulate real-mode interrupt
        mov     ax, 0300h
        ; Of int 10h
        mov     bx, 0010h
        ; Copy nothing from the stack
        mov     cx, 0000h
        ; Load es with ds, so es:edi points to RegSet
        push    ds
        pop     es
        mov     edi, RegSet
        ; Do it
        int     31h
        ; Load results back into registers
        call    RestoreRegs

        mov     esp, ebp
        pop     ebp
        ret

; ---------------------------------------------------------------------------
; Prototype: void RestoreTextMode(void);
; Returns:   nothing
; ---------------------------------------------------------------------------
_RestoreTextMode__Fv:
        push    ebp
        mov     ebp, esp

        ; ah = 00h, Sets video mode with int 10h to mode al
        mov     ax, 0003h
        call    SaveRegs
        ; Simulate real-mode interrupt
        mov     ax, 0300h
        ; Of int 10h
        mov     bx, 0010h
        ; Copy nothing from the stack
        mov     cx, 0000h
        ; Load es with ds, so es:edi points to RegSet
        push    ds
        pop     es
        mov     edi, RegSet
        ; Do it
        int     31h
        ; Load results back into registers
        call    RestoreRegs

        mov     esp, ebp
        pop     ebp
        ret

And now create the test program:

inttest.cc:

#include <conio.h>
#include <go32.h>
#include <sys/farptr.h>

extern void InitMode13h(void);
extern void RestoreTextMode(void);

int main(void)
{
    int x, y;

    InitMode13h();
    _farsetsel(_dos_ds);
    for (y = 0; y < 100; y++)
    {
        for (x = 0; x < 100; x++)
        {
            _farnspokeb(0xa0000 + y * 320 + x, ((x * y) >> 2) % 256);
        }
    }
    getch();
    RestoreTextMode();
    return 0;
}

This should produce a 100x100 square in the top left of your screen,
containing a neat-looking pattern. In a nutshell, that's how you can use
interrupts to perform tasks that are complicated, but aren't called enough
times to warrant optimization.
You may have noticed that the routine isn't quite instant, however (if you
compiled the program with optimizations off). In the next section, we will
explore ways to increase the speed of direct memory accesses of all kinds.

4.2 - Direct memory access (to protected-mode memory)
=---------------------------------------------------

As we've seen earlier, data segment pointers can be accessed simply by
using square brackets (ie: mov ax, [DataSegmentWord]). In fact, this
applies to all standard pointers, including those obtained from malloc().
They are all 32-bit near pointers because, hey, in protected-mode,
everything is "near." This simplifies buffer-to-buffer copying a little.
You can copy a large amount of memory (up to 256k) between two external
malloc()'d pointers by simply writing:

        cld                             ; Forward copy, inc esi/edi
        mov     esi, [_PointerFrom]     ; ds:esi is the source
        mov     edi, [_PointerTo]       ; es:edi is the dest, assumes es = ds
        mov     ecx, Count              ; Count is the actual number of bytes
        rep     movsd                   ; divided by 4 (ie: double-words)

Most of the time, you will find es equal to ds while executing your code.
Don't rely on this, however, unless you are absolutely sure. It's a good
idea to set es to ds at the start of the code, to ensure you don't end up
writing to another selector and causing a GPF or worse. If your code can't
afford to waste cycles (and is called repeatedly), consider setting es to
ds in another function and calling it once before the other calls. Be sure
that you aren't calling anything that may change the value of es under your
nose, though.

4.3 - Direct memory access (to real-mode memory)
=---------------------------------------------------

But what if you want to access something from real-mode linear memory?
Here's where it gets a little tricky. You can't simply pull a descriptor
for your particular segment from thin air and access it that way.
DJGPP provides a convenient way to get a descriptor for something like
this, however:

int __dpmi_segment_to_descriptor(int _segment);

You simply pass it the real-mode segment you want a descriptor for and it
returns the selector you can use to access it. That's all it takes. Let's
add a function to inttest.asm to do this for us:

; ---------------------------------------------------------------------------
; Prototype: int InitGraphics(void);
; Returns:   -1 on error, else zero
; ---------------------------------------------------------------------------
_InitGraphics__Fv:
        push    ebp
        mov     ebp, esp

        push    dword 0a000h            ; Push segment for function
        call    ___dpmi_segment_to_descriptor
                                        ; Note the triple underscore
        cmp     eax, -1                 ; Check for error
        jne     .SelectorOkay           ; No error, selector is valid

        mov     word [VideoRAM], 0ffffh ; Error, make selector invalid
        jmp     .Done                   ; End function, will return -1

.SelectorOkay:
        mov     [VideoRAM], ax          ; Load selector from ax
        xor     eax, eax                ; Clear eax to return zero

.Done:
        mov     esp, ebp
        pop     ebp
        ret

We'll also have to add a variable to the .data section:

VideoRAM        dw 0000h

And add some declarations at the beginning:

[GLOBAL _InitGraphics__Fv]

[EXTERN ___dpmi_segment_to_descriptor]

Now that we have a selector that references segment 0xa000, we can create a
PutPixel routine on its way to being much faster than before:

; ---------------------------------------------------------------------------
; Prototype: void PutPixel(unsigned int x, unsigned int y,
;                          unsigned char Color);
; Returns:   nothing
; ---------------------------------------------------------------------------
x_PutPixel      equ 8
y_PutPixel      equ 12
Color_PutPixel  equ 16

_PutPixel__FUiUiUc:
        push    ebp
        mov     ebp, esp
        push    ebx                     ; GCC expects ebx to be preserved

        mov     eax, [ebp + y_PutPixel]
        mov     ebx, eax
        shl     eax, 6                  ; Note: y shl 6 + y shl 8 = 320 * y
        shl     ebx, 8
        add     ebx, eax
        add     ebx, [ebp + x_PutPixel]
        mov     fs, [VideoRAM]
        mov     al, [ebp + Color_PutPixel]
        mov     [fs:ebx], al

        pop     ebx
        mov     esp, ebp
        pop     ebp
        ret

Now our main procedure becomes:

int main(void)
{
    int x, y;

    InitGraphics();
    InitMode13h();
    for (y = 0; y < 100; y++)
    {
        for (x = 0; x < 100; x++)
        {
            PutPixel(x, y, ((x * y) >> 2) % 256);
        }
    }
    getch();
    RestoreTextMode();
    return 0;
}

You can remove the #include<...> statements at the start of the file, with
the exception of conio.h. Once you've added the extern declarations, it
should compile and run like before.

One important aspect of __dpmi_segment_to_descriptor is that the number of
selectors available to it is finite. Thus, you should only call the
InitGraphics() function once (along with any others that use this
function), at the beginning of the program. In any case, calling it more
than once is redundant and will put a drain on resources and speed.

4.4 - malloc() and NASM
=---------------------------------------------------

One of the most important functions for managing allocated memory is
malloc(). It allows you to assign any pointer to a contiguous block of
memory and access it like a structure, array, variable or even as a
function. Let's look at an example:

alloctst.cc

#include <stdio.h>
#include <stdlib.h>

extern void FillString(char * StringToFill);

int main(void)
{
    char * TestString;

    TestString = (char *)malloc(20);
    FillString(TestString);
    printf("TestString = '%s'\n", TestString);
    free(TestString);
    return 0;
}

alloctst.asm

[BITS 32]

[GLOBAL _FillString__FPc]

[SECTION .text]

; ---------------------------------------------------------------------------
; Prototype: void FillString(char * StringToFill);
; Returns:   nothing
; ---------------------------------------------------------------------------
StringToFill_FillString equ 8

_FillString__FPc:
        push    ebp
        mov     ebp, esp
        push    esi                     ; GCC expects esi and edi to be
        push    edi                     ; preserved

        mov     edi, [ebp + StringToFill_FillString]
        mov     esi, Filler
        mov     ecx, 20                 ; 19 characters plus the final 00h
        rep     movsb

        pop     edi
        pop     esi
        mov     esp, ebp
        pop     ebp
        ret

[SECTION .data]

Filler          db "<- Buffer filled ->", 00h

As you may have guessed, the program copies the filler (without any sort of
bounds checking) to the passed string and then prints it. In this way,
malloc() can be used for many different applications in your program.
In graphical applications, you may find these buffers useful for storing
sprites, sound and tile data. With a little bit of effort, you can create
dynamically-loaded drivers that you can load and unload into your program as
needed. This could be useful for creating a standard set of procedures for
sound output with a number of drivers that can be loaded for each different
sound card.

4.5 - Hooking interrupts
=---------------------------------------------------

Interrupt-hooking is an integral part of many different types of programs,
from timer-synchronization in games to serial port monitoring in
communication programs. Let's try a quick example. Create a makefile and
enter the following program:

hookint.asm:

[BITS 32]

[GLOBAL _TickHandler__Fv]
[GLOBAL _TickHandler__Size]

[EXTERN _Timer1]
[EXTERN _Timer2]
[EXTERN _Flag1]
[EXTERN _Flag2]
[EXTERN _TickCount]

[SECTION .text]

_TickHandler__Fv:
  inc dword [_TickCount]
  dec dword [_Timer1]
  dec dword [_Timer2]
  cmp dword [_Timer1], 0
  jz .SetFlag1
  jmp .TestFlag2

.SetFlag1:
  mov dword [_Flag1], 1

.TestFlag2:
  cmp dword [_Timer2], 0
  jz .SetFlag2
  jmp .Done

.SetFlag2:
  mov dword [_Flag2], 1

.Done:
  ret

; Calculate size of function by subtracting offsets
_TickHandler__Size dd $-_TickHandler__Fv

[SECTION .data]

hookint.cc:

// Adapted from the DJGPP test program "libc\go32\timer.c"

#include <stdio.h>
#include <pc.h>
#include <dpmi.h>
#include <go32.h>

#define LOCK_VARIABLE(x) _go32_dpmi_lock_data((void *)&x, (long)sizeof(x));

extern void TickHandler(void);
extern int TickHandler__Size;

int Timer1 = 1, Timer2 = 1, Flag1, Flag2, TickCount = 0;

int main()
{
  _go32_dpmi_seginfo OldHandler, NewHandler;

  printf("Grabbing timer interrupt...\n");
  _go32_dpmi_get_protected_mode_interrupt_vector(8, &OldHandler);

  _go32_dpmi_lock_code(TickHandler, (long)TickHandler__Size);
  LOCK_VARIABLE(Timer1);
  LOCK_VARIABLE(Timer2);
  LOCK_VARIABLE(Flag1);
  LOCK_VARIABLE(Flag2);
  LOCK_VARIABLE(TickCount);

  NewHandler.pm_offset = (int)TickHandler;
  NewHandler.pm_selector = _go32_my_cs();
  _go32_dpmi_chain_protected_mode_interrupt_vector(8, &NewHandler);

  while (!kbhit())
  {
    if (Flag1)
    {
      printf("Timer 1 expired: %i\n", TickCount);
      Flag1 = 0;
      Timer1 = 5;
    }
    if (Flag2)
    {
      printf("Timer 2 expired: %i\n", TickCount);
      Flag2 = 0;
      Timer2 = 7;
    }
  }
  getkey();

  printf("Releasing timer interrupt...\n");
  _go32_dpmi_set_protected_mode_interrupt_vector(8, &OldHandler);

  return 0;
}

What does this program do? Let's examine it:

  _go32_dpmi_get_protected_mode_interrupt_vector(8, &OldHandler);

This line reads the selector and offset of the previous handler for INT 8
(the timer interrupt) into the structure OldHandler for use later.

  _go32_dpmi_lock_code(TickHandler, (long)TickHandler__Size);
  LOCK_VARIABLE(Timer1);
  LOCK_VARIABLE(Timer2);
  LOCK_VARIABLE(Flag1);
  LOCK_VARIABLE(Flag2);
  LOCK_VARIABLE(TickCount);

Here, we ensure that the handler and the variables it touches don't get
paged out from under our noses and cause a page fault. As the function call
for locking a variable is fairly complicated to type repeatedly, we define a
macro to save some time. The size of the code to lock is calculated at
assembly time and stored in the _TickHandler__Size variable in hookint.asm,
which the C++ code then reads as an int.

  NewHandler.pm_offset = (int)TickHandler;
  NewHandler.pm_selector = _go32_my_cs();
  _go32_dpmi_chain_protected_mode_interrupt_vector(8, &NewHandler);

We then fill in the selector/offset pair for our new handler and add it to
the interrupt chain for INT 8. Note that _go32_my_cs() returns the selector
for the program's code. At this point, a wrapper has been created for our
function so that it will execute every time the interrupt is called. You
won't need to terminate your routine with iret, as you normally do, because
the function that chains the vector creates a special wrapper which calls
your routine as an ordinary function. You cannot, however, use any special
system functions while in your interrupt routine (like printf, fopen, fread,
etc.), as most of these are non-reentrant.
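Because the handler cannot call printf itself, the example splits the work:
the handler only touches plain variables, and the main loop does the
printing. The sketch below mimics that split in portable C with no real
interrupt hooked. The names tick_handler and service_timers are hypothetical
stand-ins for the assembly handler and for one pass of the while loop, and
the ticks are delivered simply by calling tick_handler directly:

```c
/* State shared between the "handler" and the main loop.  A real handler
   runs asynchronously, so the variables are declared volatile here. */
static volatile int timer1 = 1, timer2 = 1;
static volatile int flag1 = 0, flag2 = 0;
static volatile int tick_count = 0;

/* Stand-in for _TickHandler__Fv: only arithmetic and stores, no library
   calls, so nothing non-reentrant runs at "interrupt" time. */
void tick_handler(void)
{
    tick_count++;
    timer1--;
    timer2--;
    if (timer1 == 0)
        flag1 = 1;
    if (timer2 == 0)
        flag2 = 1;
}

/* Stand-in for one pass of the main loop: clears each latched flag,
   reloads its timer (5 and 7 ticks, as in hookint.cc) and returns how
   many timers were serviced.  Non-reentrant work such as printf belongs
   here, outside the handler. */
int service_timers(void)
{
    int serviced = 0;

    if (flag1) {
        flag1 = 0;
        timer1 = 5;
        serviced++;
    }
    if (flag2) {
        flag2 = 0;
        timer2 = 7;
        serviced++;
    }
    return serviced;
}
```

In the real program the calls to tick_handler come from the timer interrupt
through the wrapper, and the body of service_timers is where the printf
calls live.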
  _go32_dpmi_set_protected_mode_interrupt_vector(8, &OldHandler);

Once we have finished with it, we can return the handler to its previous
state by passing the old handler's address we saved before.

Our protected-mode interrupt handler is quite simple. It increments the
variable TickCount for every timer tick and decrements Timer1 and Timer2. If
either of the two timers is equal to zero, it sets the flag associated with
the timer. The flags are latched, meaning that you must zero them after you
have dealt with them. This also means that if you aren't able to catch one
or more timer expiries (because the system is busy), you'll only have to
service it once.

Interrupt handlers can be quite complicated if necessary. The only
restriction is that you can't call any functions that aren't re-entrant.
This includes most of the system functions and any of yours that fall under
the same category. In most cases, you should stay within your function as
much as possible to prevent possible conflicts.

4.6 - _CRT0_FLAG_LOCK_MEMORY
=---------------------------------------------------

It is possible to lock all of your program's memory using one of the CRT0
startup flags, _CRT0_FLAG_LOCK_MEMORY. This flag effectively disables
virtual memory (disk swapping). It's a great feature if you want to write a
program that shouldn't be swapped to disk at any time (perhaps a game). To
use it, you simply add the following lines to your program:

  #include <crt0.h>

  int _CRT0_STARTUP_FLAGS = _CRT0_FLAG_LOCK_MEMORY;

This will ensure that your code, data and allocated memory will all be
locked, without the need to use _go32_dpmi_lock_data(). The amount of
available memory will decrease, however.

4.7 - Real-mode callback functions
=---------------------------------------------------

Interrupt handlers are great for handling interrupts, but what if you want a
real-mode program to call one of your functions when a certain event occurs?
If you were in real mode too, you could just get the segment and offset of
your function and pass it on to the other program and be done with it. In
protected mode, there's one more step: a wrapper. Just like before, there's
a great library function that does all the tough stuff for you:

  _go32_dpmi_allocate_real_mode_callback_retf()

How do we use this? It's easy. Let's look at another example:

rmcbtest.cc:

#include <stdio.h>
#include <conio.h>
#include <dos.h>
#include <dpmi.h>
#include <go32.h>
#include <crt0.h>

int _CRT0_STARTUP_FLAGS = _CRT0_FLAG_LOCK_MEMORY;

/* To make sure the name doesn't get mangled */
extern "C" void MouseCallback(_go32_dpmi_registers * r);

extern volatile int MouseButtons;
extern volatile int MouseX;
extern volatile int MouseY;

_go32_dpmi_registers regs;

int main(void)
{
  clrscr();

  _go32_dpmi_seginfo info;

  /* Set up the handler */
  info.pm_offset = (int)MouseCallback;
  _go32_dpmi_allocate_real_mode_callback_retf(&info, &regs);

  __dpmi_regs r;

  /* Set the horizontal range valid from 0 to 1000 */
  r.x.ax = 0x07;
  r.x.cx = 0;
  r.x.dx = 1000;
  __dpmi_int(0x33, &r);

  /* Set the vertical range valid from 0 to 1000 */
  r.x.ax = 0x08;
  r.x.cx = 0;
  r.x.dx = 1000;
  __dpmi_int(0x33, &r);

  /* Install the real-mode callback routine */
  r.x.ax = 0x0c;
  r.x.cx = 0x1f;  /* 0x1f traps on movement and on LMB/RMB presses and
                     releases */
  r.x.dx = info.rm_offset;
  r.x.es = info.rm_segment;
  __dpmi_int(0x33, &r);

  while (!MouseButtons)
  {
    printf("(%i, %i)\n", MouseX, MouseY);
    delay(250);
  }

  /* Clean up handler */
  r.x.ax = 0x0c;
  r.x.cx = 0;
  r.x.dx = 0;
  r.x.es = 0;
  __dpmi_int(0x33, &r);

  _go32_dpmi_free_real_mode_callback(&info);

  printf("Mouse button pressed.");

  return 0;
}

rmcbtest.asm:

[BITS 32]

[GLOBAL _MouseCallback]
[GLOBAL _MouseX]
[GLOBAL _MouseY]
[GLOBAL _MouseButtons]

[SECTION .text]

; ---------------------------------------------------------------------------
; Prototype: void MouseCallback(__dpmi_regs * r)
; Returns: nothing
; ---------------------------------------------------------------------------
Pointer_MouseCallback equ 8
_MouseCallback:
  push ebp
  mov ebp, esp
  push esi                      ; Preserve esi for the caller

  mov esi, [ebp + Pointer_MouseCallback]
  xor eax, eax
  mov ax, [esi + 16]            ; offset of bx in __dpmi_regs
  mov [_MouseButtons], eax
  mov ax, [esi + 24]            ; offset of cx in __dpmi_regs
  mov [_MouseX], eax
  mov ax, [esi + 20]            ; offset of dx in __dpmi_regs
  mov [_MouseY], eax

  pop esi
  mov esp, ebp
  pop ebp
  ret

[SECTION .data]

_MouseX        dd 00000000h
_MouseY        dd 00000000h
_MouseButtons  dd 00000000h

The code is fairly straightforward and similar to the code we created for
handling real-mode interrupts. To set up the callback, we set the pm_offset
member of the info struct and pass it to the routine allocation function
(you know, the one with the excessively long name). The function creates a
wrapper and passes the wrapper's entry point back in the rm_segment and
rm_offset members of the info struct. You might have noticed that it also
requires a global variable (this is important: you can't give it any other
type of variable). The function then uses this variable and passes a pointer
to it to your function. If you write your function correctly, you can save
yourself a lot of trouble by passing it to a struct internal to your asm
function. Be careful with this, though: it could change at any time and
break your programs.

To access the register struct, just load esi from [ebp + 8] and then use
[esi + n] to access the registers. You'll need to count the offsets manually
from DPMI.H, but don't worry too much, there aren't very many members.

4.8 - Doubleword-aligned accesses
=---------------------------------------------------

In some cases, on 386 machines and up, accessing memory aligned to a word or
double-word boundary is faster than an unaligned access. On higher-end
machines, double-word boundaries offer the greatest benefit. In NASM, we can
tell the assembler that we want to align an entire program segment to a
double-word boundary easily:

  [segment .text ALIGN=4]

How can we tell the assembler that we want to align data or functions to a
double-word boundary?
It's actually quite simple. Using the assembler variables "$$" (the start
address of the current segment), "$" (the address of the current opcode) and
the "times" directive, we can create a statement like so:

  times ($$ - $) & 3 nop  ; Align the next instruction/data to
                          ; a double-word boundary, assuming
                          ; segment is aligned to double-word

It seems fairly lengthy, and indeed, it is. If you have NASM version 0.94 or
later, however, the macro facility comes in handy:

  %define align times ($$ - $) & 3 nop

Now if you want to align any of your data/procedures, just use the align
keyword as if it were a directive. Make sure that you put it before the
label, or else you'll end up jumping into the nop's and slowing down your
program:

  align
  _PutPixel__FUiUiUc:

To see this in action, load up the INTTEST.ASM file and add the %define line
listed above to the top of the file and our new "align" keyword before each
of the functions. Also add the keyword before the definition of the VideoRAM
variable:

  align
  VideoRAM dw 0000h

Add "ALIGN=4" to each of the segment definitions as well:

  [segment .text ALIGN=4]
  [segment .data ALIGN=4]

Compile the program and then load it into a debugger (FSDB works well for
this). Trace through the program up to the InitGraphics() function call and
step into the function. Go back a few bytes and notice how there are a few
nop's before the actual function. Also look at the starting address of the
function. It'll end in either 0, 4, 8, or C, meaning it's double-word
aligned. If you want, remove the "align" macros from the file and recompile.
Notice how the functions aren't aligned anymore. Neat, huh?

=---------------------------------------------------
5.0 - Contacting the author
=---------------------------------------------------

5.1 - Closing comments
=---------------------------------------------------

This document is still in an unfinished state, so there may be some errors
(glaring or otherwise), omissions or misinformation.
If you happen to stumble across any of these (even typos), feel free to send
me email to:

  mmastrac@acs.ucalgary.ca

5.2 - Getting DJGPPASM.DOC
=---------------------------------------------------

The latest public version of this document is always available at:

  http://www.ucalgary.ca/~mmastrac/djgppasm.doc

The examples created in the document are available in a separate zip-file
at:

  http://www.ucalgary.ca/~mmastrac/djnasmex.zip

This manual compiled using MC v1.05 by Matthew Mastracci