16-bit assembly programming using NASM.
The Netwide Assembler (NASM) is an assembler for the x86 CPU architecture, portable to nearly every modern platform, and with code generation for many platforms old and new. The good news for DOS nostalgics is that the NASM team continues to support DOS, so the latest version of NASM is available for this operating system. You can download NASM from the developers' website; be sure to pick the (latest) release for DOS.
The intention of this tutorial is primarily to show how to use NASM to build 16-bit real mode and protected mode programs on DOS. As the tutorial includes general information about assembly programming, and some extensively commented sample programs, it may also be seen as a starting point for assembly newbies to create their own programs (without really being an introduction to the NASM assembly programming language). The program samples have been built and tested on FreeDOS 1.3 RC5, using NASM 2.16.01. The tutorial should also apply to MS-DOS or other DOS operating systems. Note that the protected mode executables built here also run on the first Windows releases and in Command Prompt of the following ones (at least up to and including Windows 2000). Use the following link to download the source code of the sample programs.
The NASM download is a ZIP archive (in my case: nasm-2.16.01-dos.zip). I unpacked it on my Windows 10 and created an ISO to get the files onto my FreeDOS VMware virtual machine. On FreeDOS, I created the directory C:\NASM and copied all files from the CD to there. The screenshot shows the content of the NASM directory. The assembler executable is called NASM.EXE.
Before starting to try out the assembler, here are some important facts to know about assembly programming:
- Assembly language (or assembler language) is any low-level programming language in which there is a very strong correspondence between the instructions in the language and the architecture's machine code instructions. Assembly language usually has one statement per machine instruction. Because assembly depends on the machine code instructions, each assembly language is specific to a particular computer architecture. The family of x86 assembly languages represents decades of advances on the original Intel 8086 architecture. In addition to there being several different dialects based on the assembler used, additional processor instructions, registers and other features have been added over the years while still remaining backwards compatible to the 16-bit assembly used in the 1980s.
- There are a variety of x86 assemblers. Most of them, for example the Intel assembler, Microsoft Assembler (MASM), Netwide Assembler (NASM) and Borland's Turbo Assembler (TASM), use the so-called Intel syntax, which is rather different from the AT&T syntax used, for example, by the AT&T assembler (AS) and the GNU assembler (GAS), and often also with inline assembly. The syntax used by the first group of assemblers is based on the specifications defined by Intel, but each assembler has its own simplifications of the syntax rules or syntax extensions. This means that if you have written a program that assembles without errors with MASM, don't be surprised if NASM gives you lots of syntax error messages.
- With the assembly source file as input, the assembler produces an object file that (normally) has to be passed to a linker to create the executable. Object files may have different formats, and we'll have to tell the assembler which of them it should produce. Here is an overview:
        - Flat-form binary output (bin): This format does not produce object files, but generates nothing in the output file except the code you wrote. Such "pure binary" files are used by DOS .COM executables and .SYS device drivers. Pure binary output is also useful for operating system and boot loader development. We will use the bin format when creating 16-bit real mode executables.
        - Microsoft OMF object files (obj): This format is the one produced by MASM and TASM, which is typically fed to 16-bit DOS linkers to produce .EXE files. It is also the format used by OS/2. NASM also fully supports the 32-bit extensions to this format. We will use the obj format when creating 16-bit protected mode executables.
        - Common object file format (coff): This output type produces COFF object files suitable for linking with the DJGPP linker. We would use them if we wanted to create DJGPP based 32-bit programs for DOS.
        - Microsoft Win32 object files (win32) and Executable and linkable format object files (elf) are used on newer operating systems.
 
- An assembly program may be subdivided into several sections. In some object file formats, the number and names of sections are fixed; in others, the user may make up as many as they wish. A section is the part of the output file into which the code you write is assembled; as this code is loaded into memory for execution, a section also corresponds to a given memory segment. Usually, we consider the following three program sections:
        - The text section is used for keeping the actual code (assembly instructions).
        - The data section is used for declaring initialized data or constants.
        - The bss section is used for declaring variables (uninitialized data).
- In memory, these sections correspond to the following segments:
        - The code segment is represented by the text section. This defines an area in memory that stores the instruction codes.
        - The data segment is represented by the data and the bss sections. The data section is used to declare the memory region where data elements are stored for the program. This section cannot be expanded after the data elements are declared, and it remains static throughout the program. The bss section is also a static memory section that contains buffers for data to be declared later in the program. This buffer memory is zero-filled.
        - The stack segment is a memory area that will contain data values passed to functions and procedures within the program.
 
- There is a fundamental difference between assembly and higher-order language programming. With the latter, the programmer does not have to care at all what the CPU does when a given statement is executed, nor (at least in most cases) where in memory the code is executed and where the data used is stored. Using assembly, a basic knowledge of the computer hardware, in particular of the CPU, is required. This is obvious, because in some sense you directly order the CPU to execute a given machine instruction. Machine instructions (normally) use one of the internal registers of the CPU (for example to load a value from memory into it, to store its content to some memory location, or to add the value from another register or memory location to it), so our assembly instructions will mostly "do something" with a register or its content. Also, any data used by these instructions is located at a given memory address. The assembly language allows us to use variables, which seems to be the same as in higher-order languages. However, these variables are only a convenience to refer to a given memory area; they do not include any information about what type (character, number...) the data is, nor about the value's length. What the content of given successive memory addresses actually represents is under the full control and responsibility of the programmer!
- Arithmetic operations work on numbers in binary representation (in the source code, they are usually written in decimal or hexadecimal). Whether a number is to be considered as signed or unsigned depends on the programmer. Numbers read from the keyboard are ASCII characters, and numbers output to the screen also have to be in ASCII format. So, all number input/output operations require some kind of conversion.
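The conversion mentioned in the last point can be sketched for a single digit as follows. This is a minimal illustration and not one of the tutorial's sample programs: the digit is hard-coded instead of being read from the keyboard, and the program uses the bin format and the DOS functions described further down in the text.

```nasm
; Sketch: converting between an ASCII digit and its binary value
; (assemble with: nasm -f bin <file>.asm -o <file>.com)
        org     100h
section .text
        mov     al, '7'         ; ASCII code 37h, as it would arrive from the keyboard
        sub     al, '0'         ; AL = 7, the binary value of the digit
        add     al, 2           ; arithmetic is done on the binary value: AL = 9
        add     al, '0'         ; back to ASCII: AL = 39h, the character '9'
        mov     dl, al
        mov     ah, 02h         ; DOS write-character function
        int     21h
        mov     ax, 4c00h       ; DOS exit function
        int     21h
```

Numbers with several digits need a loop that multiplies the running value by 10 before adding the next digit (input) or divides repeatedly by 10 (output).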
These long theoretical considerations may be annoying, but I think that you really have to know and understand these basic concepts if you want to create assembly programs...
On an x86 32-bit DOS system, there are three kinds of executables that we can create with NASM:
- 16-bit real mode programs. This is the easiest way to create DOS programs with NASM; we don't even need a linker. Be aware, however, that programs running in real mode can do "all they want"; there is no protection against deleting data or overwriting the code of some loaded TSR or driver, or of DOS itself. Such programs will be discussed in the next section.
- 16-bit protected mode programs. This requires a linker to transform the object code into an executable. Protected mode programs may not be able to do what you want if you have to access the computer hardware. On the other hand, they are safer than real mode programs and you can safely run them in Command Prompt of older Windows releases. I will talk about them in the section following the discussion of the real mode programs.
- 32-bit protected mode programs. Working with 32 bits may largely increase the possibilities. To create such programs on DOS, we need an extender such as DJGPP. If you want to learn to build 32-bit protected mode NASM programs on DOS, please have a look at my tutorial 32-bit assembly programming using NASM and GCC.
Building 16-bit real mode programs.
Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. The mode gets its name from the fact that addresses in real mode always correspond to real locations in memory. Real mode is characterized by a 20-bit segmented memory address space (giving 1 MB of addressable memory, that the real mode program has to share with the programs and drivers already there) and unlimited direct software access to all addressable memory, I/O addresses and peripheral hardware. Real mode provides no support for memory protection, multitasking, or code privilege levels. For details, have a look at the following Wikipedia article.
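As a small worked example of the 20-bit segmented address space: in real mode, a physical address is computed as segment × 16 + offset. The sketch below lets NASM do this arithmetic at assembly time; the names and the sample values are arbitrary illustrations, and the fragment only defines constants (it produces no code).

```nasm
; Sketch: how a real mode segment:offset pair maps to a physical address
seg_val equ     0B800h                  ; segment part of the address
off_val equ     0010h                   ; offset part of the address
linear  equ     seg_val * 16 + off_val  ; 0B8000h + 10h = 0B8010h
```

Multiplying by 16 shifts the segment value one hexadecimal digit to the left, which is how two 16-bit values address 1 MB of memory.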
Here is the code of HELLO1.ASM, a simple real mode "Hello World" program.
      
      org 100h
      section .text
      start:
              ; Display message calling the DOS write-string function
              mov     dx, hello
              mov     ah, 9
              int     0x21
              ; Terminate the program calling the DOS exit function
              mov     ax, 0x4c00
              int     0x21
      section .data
      hello   db     'hello, world', 13, 10, '$'
      section .bss
              ; Put uninitialised data here
      
      
Some general notes concerning this code:
- As with most assemblers, each NASM source line contains (unless it is a macro, a preprocessor directive or an assembler directive) some combination of the four fields: label, instruction, operand(s), comment. NASM allows totally free-format coding, so a label does not have to start at column 1 and an instruction may start at column 1. For readability, a fixed format with, for example, instructions starting at column 9 and operands starting at column 17 is recommended. Labels may, but need not, be terminated by a colon (:). Comments are preceded by a semicolon (;).
- Programs for the bin output format use the directive ORG to specify the origin address at which NASM will assume the program begins when it is loaded into memory. For a .COM program, the origin has to be 100h, because DOS loads the program at offset 100h of its segment (the first 256 bytes are occupied by the program segment prefix). The "h" placed after a number means that this is a hexadecimal value (another possibility to represent hexadecimal numbers is to place "0x" in front of them, as 0x21 and 0x4c00 in the code above).
- The bin output format knows the three sections we spoke about before. They are introduced by the directive section, and their names are respectively .text, .data, and .bss. None of them is mandatory. The start: label is an ordinary label marking the first instruction; a .COM program always begins execution at the first byte of the file.
- Assembly programs have access to the BIOS and operating system functions by using interrupts. In DOS programs, we can use the instruction INT 21h to call the DOS functions for keyboard input, screen output or termination of the program. For details, cf. Using interrupts to call DOS functions.
- To declare initialized data (in the .data section), we mostly use the pseudo-instruction db. In the program code above, this instruction initializes a memory area of 15 bytes at the beginning of the .data section; the 15 successive memory locations will contain the ASCII characters corresponding to the "hello, world" string, followed by the bytes 0Dh and 0Ah (hexadecimal values for decimal 13 and 10, actually representing a carriage-return + linefeed) and the byte 24h (hexadecimal ASCII code for the dollar sign). Note that for the assembler, all this memory content is just "some bytes of data", which only become a number, or a character, or whatever, when the programmer decides so! Also note that there are similar pseudo-instructions to declare a word (2 bytes): dw, or a double-word (4 bytes): dd (floating point numbers may be initialized using dd, dq, or dt).
- There is no example in the code above, but let's mention it here: To declare uninitialized data, we mostly use the pseudo-instruction resb, followed by the number of bytes to be reserved. Similarly to db, you can also use resw, resd, etc.
- The data in our .data section above is preceded by a label that I called "hello". This variable-label is similar to a variable in higher-order programming languages, but without telling anything about the variable's type (it's the programmer who decides what each byte actually represents), nor its length (number of bytes). In fact the label "hello" refers to the memory address of the first byte of data declared with the db pseudo-instruction. For details, cf. CPU registers and memory addresses.
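To make the notes on data declarations and variable-labels more concrete, here is a minimal sketch in the bin format. All label names and values are invented for this illustration; the point is the difference between a label used without brackets (the address) and with brackets (the content stored at that address).

```nasm
; Sketch only: label names and values are invented for this illustration
        org     100h
section .text
        mov     bx, count       ; no brackets: BX = address of "count"
        mov     ax, [count]     ; brackets: AX = word stored at that address (1000)
        mov     [table], ax     ; store AX into the first word of "table"
        mov     ax, 4c00h       ; DOS exit function
        int     21h
section .data
msg     db      'OK', '$'       ; 3 bytes: 'O', 'K', '$'
count   dw      1000            ; 1 word (2 bytes)
total   dd      100000          ; 1 double-word (4 bytes)
section .bss
uname   resb    20              ; 20 uninitialized bytes
table   resw    8               ; 8 uninitialized words (16 bytes)
```

The assembler does not check that the register size matches the way the data was declared; loading AX from a label declared with db would simply read two successive bytes.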
        NASM.EXE has a whole bunch of command line parameters. Two of them are of particular interest for us. First of all, we have to indicate
        the format of the output produced by the assembler (flat-form binary in our case). This is done by using -f bin. With
        bin, the output actually is a real mode executable rather than an object file and to create a file with .com extension, we'll have to indicate
        the name of the assembler output, using -o <filename>.com. The command to assemble HELLO1.ASM in order to create a 16-bit real mode DOS
        executable is as follows:
        
           nasm -f bin hello1.asm -o hello1.com
        
      
The screenshot below shows the execution of this command (no output if there aren't any errors or warnings), a directory listing with the source and executable (note that HELLO1.COM is only 27 bytes!), and the execution of the program (displaying the text "hello, world").
Building 16-bit protected mode programs.
Protected mode, also called protected virtual address mode, is an operational mode of x86-compatible CPUs that allows system software to use features such as segmentation, virtual memory, paging and safe multi-tasking designed in order to increase an operating system's control over application software. For details, have a look at the following Wikipedia article.
Here is HELLO2.ASM, the protected mode version of our "Hello World" program from above.
        
        segment code
        ..start:
                ; Initialization
                mov     ax, data
                mov     ds, ax
                mov     ax, stack
                mov     ss, ax
                mov     sp, stacktop
                ; Display message calling the DOS write-string function
                mov     dx, hello
                mov     ah, 9
                int     0x21
                ; Terminate the program calling the DOS exit function
                mov     ax, 0x4c00
                int     0x21
        segment data
        hello   db      'hello, world', 13, 10, '$'
        segment stack   stack
                resb    64
        stacktop:
        
      
Comparing with the code of our program with bin output, we note the following differences:
- The obj output is relocatable: the linker and the DOS program loader decide where the segments end up in memory, so you don't (and can't) specify an origin where to load the program, and there is no ORG directive.
- Instead of defining three sections, we'll have to define three segments. These are introduced by the directive segment and called here (as seen further up in the text): code, data, and stack. The obj format does not prescribe these names, so you may choose others.
- It is mandatory to define a stack area. In the example above the segment called "stack" is defined as type "stack" (second "stack" of the directive) and we reserve 64 bytes of uninitialized space for it. The label stacktop: points to the top of the stack area (note that a label without ending colon placed in an otherwise empty line produces an assembler warning).
- The code segment must contain the special symbol ..start, which marks the memory location that will be the entry point into the resulting executable file.
- Before adding your own code, some initialization has to be done. The corresponding code sets up DS to point to the data segment, and initializes SS and SP to point to the top of the provided stack. Don't bother about this code; just copy it to your sources...
        As said above, to create our 16-bit protected mode .exe file, we'll have to tell the assembler to create a Microsoft OMF object file. This is done, by setting
        the command line parameter -f obj. No need to use the -o parameter in this case: the default name of the output file is identical to the
        assembler source file name, with an extension of .obj. So, for the program HELLO2.ASM, the command to assemble it is as follows:
        
           nasm -f obj hello2.asm
        
      
Unlike before, the assembler output is object code that cannot be executed without being processed by a linker. The NASM download archive doesn't include a linker, so we'll have to find one somewhere else.
There are several 16-bit linkers for DOS available free of charge. The most commonly used are probably ALINK and QLINK. Both require the presence of a DPMI host. An alternative (used by myself when creating the samples in this tutorial), is to use the linker included with MASM. I actually use the one included with MASM 6.11, that you can download from the WinWorld website. For your convenience, I placed a copy of LINK.EXE on my site.
        To link our HELLO2.OBJ file in order to create an executable called HELLO2.EXE, run the command
           link hello2.obj,hello2.exe
        where you can omit the second parameter: first, it's the default used by the linker, and second, you'll be asked for the name (and some other info) when you
        link the object.
      
The screenshot below shows the linking of HELLO2.OBJ (as you can see, Run File is automatically set to HELLO2.EXE; just hit ENTER to accept the default, as you can do for the other items asked), a directory listing with source, object and executable file (note that HELLO2.EXE is much bigger than the real mode HELLO1.COM created before), and finally the execution of HELLO2.EXE.
Note: I think that you don't have to worry about the real mode warning issued by LINK.EXE. If you want to prevent it, start your FreeDOS system using HIMEMX.EXE (+ JEMM386.EXE) instead of JEMMEX.EXE as memory driver (add a menu option to your FDCONFIG.SYS, if you don't already have one...).
Running NASM using a custom batch file.
The aim of DOS batch files is to make life easier by defining several DOS commands in a file that you can run like a shell script. My NASM.BAT aims to create either a real mode or a protected mode executable, starting from an assembly source file <filename>.asm. The batch file expects 2 command line parameters: the first one to define what kind of executable you want to create (-r for a real mode program, -p for a protected mode program), the second one being the name of the assembly source file without file extension. The executable produced will be called <filename>.com, or <filename>.exe, depending on the first parameter. Here is the content of my file:
        
        @echo off
        if "%1"=="-r" set _MODE=real
        if "%1"=="-p" set _MODE=protected
        set _PROG=%2
        if "%_MODE%"=="" goto NoMode
        if "%_PROG%"=="" goto NoProg
        if "%_MODE%"=="protected" goto Prot
        :Real
        if exist %_PROG%.com del %_PROG%.com
        c:\nasm\nasm.exe -f bin %_PROG%.asm -o %_PROG%.com
        goto End
        :Prot
        if exist %_PROG%.obj del %_PROG%.obj
        if exist %_PROG%.exe del %_PROG%.exe
        c:\nasm\nasm.exe -f obj %_PROG%.asm
        if not exist %_PROG%.obj goto End
        c:\masm611\bin\link.exe %_PROG%.obj,%_PROG%.exe
        goto End
        :NoMode
        echo Invalid or missing mode
        goto Usage
        :NoProg
        echo Missing file name
        :Usage
        rem The usage text avoids the pipe character, which the shell would interpret
        echo Usage: nasm -r or -p, then asm-file-name
        :End
        set _MODE=
        set _PROG=
      
Note: The paths c:\nasm\nasm.exe and c:\masm611\bin\link.exe correspond to where NASM and MASM were installed on my system. You may have to adapt them (or add the directories containing the files to your PATH environment variable, or copy the two files to your c:\freedos\bin directory).
Using interrupts to call DOS functions.
Interrupts are signals generated by the computer every time a certain event occurs. Hardware interrupts are triggered by hardware devices such as the network or sound card. Software interrupts may be triggered from within an assembly program using the INT instruction. The interrupts are numbered from 0x00 to 0xFF and each has a special function (most are unused). Their effect is that the normal program execution is suspended and control is transferred to a so-called interrupt handler. This is a portion of code to be executed when this specific interrupt occurs. As an example, interrupt 09h (IRQ1) is a keyboard interrupt. It is called each time a key on the keyboard is pressed, and the corresponding handler (part of the BIOS) reads the value of the key pressed and puts it into the keyboard buffer. Besides this, some special keys or key combinations induce some supplementary action, as for example Shift+PrtScr that triggers interrupt 05h (sending the screen content to the printer), or CTRL+ALT+DEL that makes a direct call to address 0FFFFh:00h (reboot of the system).
Here is an overview of some of the x86 interrupts:
| Interrupts | Description | 
|---|---|
| 00h - 07h | Processor interrupts | 
| 08h - 0Fh | Peripherals interrupts (IRQ0 to IRQ7) | 
| 10h | Video interface management | 
| 13h | Disk access interface management | 
| 16h | Keyboard interface management | 
| 17h | Printer interface management | 
| 1Bh | CTRL+Break keys pressed on the keyboard | 
| 20h | Termination of a program | 
| 21h | DOS services (API) | 
| 24h | Critical error | 
| 27h | Termination of TSR | 
| 2Ah | Network interface management | 
| 33h | Mouse interface management | 
| 40h | Floppy drive interface management | 
| 42h | Default video management | 
| 5Ch | NetBIOS management | 
| 70h - 77h | IRQ8 to IRQ15 (IRQ AT/286) | 
| 80h | Call to the Linux kernel (API) | 
MS-DOS (like other DOS systems, including FreeDOS) provides many common services through interrupt 21h. Entire books have been written about the variety of functions available. Here is just a description of six of them. The general way to proceed (for the functions described) is the following:
- Load the data or its starting address (if there is any data) into the DL resp. DX register.
- Load the function code into the AH register (the exit function also takes a return code in AL, so both can be loaded at once through AX).
- Execute the INT 21h instruction to call the DOS function.
- Access the return data (if there is any) in the AL register resp. from the memory area that we indicated when calling the function.
        The DOS exit function terminates the program that calls it and returns to the DOS prompt. Here is how to call it:
        
           mov  ax, 4c00h
           int  21h
        
        No data is returned. AH has to contain the function code 4Ch and AL the return code passed back to DOS (here 0); loading AX with 4c00h sets both at once. We have this code in our two "Hello World" programs, and we will
        also have it in the other samples.
      
        The DOS write-character function writes a character to the screen. Here is how to call it:
        
           mov  dl, ...
           mov  ah, 02h
           int  21h
        
        The DL register has to contain the character (i.e. its ASCII code) to be displayed. We can load the character as a constant, from another register or from memory
        (cf. further down in the text); there is no data returned.
      
        The DOS write-string function writes a string (sequence of characters) to the screen. Here is how to call it:
        
           mov  dx, <address>
           mov  ah, 09h
           int  21h
        
        The DX register has to contain the address of the first character to be displayed. The display on the screen starts
        with the character at this address and continues with the characters at the successive memory locations until the dollar symbol ($) is found
        (this character not being displayed). After the display of the last character, the cursor stays in the same line at its current position, except if the last two
        characters were the ASCII codes 0Dh (13) followed by 0Ah (10), which code for a carriage return + linefeed (new line indicator on DOS and
        Windows systems). There is no data returned. We now have the knowledge to understand how the display in the two "Hello World" programs works: mov dx,
        hello loads the DX register with the address referred to by the variable "hello", i.e. the address of the character "h" of the "hello, world" string. This "h" and the following
        characters are displayed on the screen; display ends with the character preceding the dollar symbol. As the last two characters sent to the screen were 0Dh followed
        by 0Ah, a carriage-return + linefeed is performed and the cursor is moved to the beginning of the next line.
      
        The DOS read-character function reads a character from the keyboard. Here is how to call it:
        
           mov  ah, 01h
           int  21h
        
        This function has no input data; the return data is the ASCII code of the character pressed on the keyboard in the AL register.
        Besides reading the character, the character is also echoed (displayed onto the screen).
      
        The DOS read-character (no echo) function also reads a character from the keyboard. Here is how to call it:
        
           mov  ah, 08h
           int  21h
        
        The function works the same way as the one described before, except that the character is not displayed onto the screen.
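As a minimal sketch of how the character functions combine, the following bin format program waits for a key without echo and then displays the character itself. It is only an illustration and not one of the tutorial's sample programs.

```nasm
; Sketch: read one key without echo, then display it with write-character
        org     100h
section .text
        mov     ah, 08h         ; read-character (no echo): ASCII code returned in AL
        int     21h
        mov     dl, al          ; write-character expects the character in DL
        mov     ah, 02h
        int     21h
        mov     ax, 4c00h       ; DOS exit function
        int     21h
```

Replacing the mov dl, al by mov dl, '*' would display an asterisk instead of the key pressed, as is commonly done for password input.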
      
        The DOS read-string function reads a string (sequence of characters terminated by the ENTER key) from the keyboard. Here is how to call it:
        
           mov  dx, <address>
           mov  ah, 0ah
           int  21h
        
        This is a little bit more complicated... The DX register has to be loaded with the address of the buffer, where the string (sequence of
        ASCII codes) will be placed. However, first we have to indicate the maximum string length the input may have (1 to 255). This value has
        to be stored by the programmer at the address passed as input to the function (start of the buffer, offset +0) and must equal the maximum number of characters that
        the user may enter plus 1 (the carriage-return code 0Dh will be returned as last character). As we don't know how many characters the user actually enters, the
        actual string length has to be returned by the function. It will be stored at the memory location immediately following the buffer start
        (offset +1). Note that the count returned only equals the string data length (the carriage-return will be stored in the buffer, but it will not be included in
        the count of characters entered). Finally, as the first two locations of the buffer are used for the counters, the string data entered starts
        at offset +2.
      
To illustrate how to use the DOS read-string function, let's write a "Hello User" program. We first ask the user for their name, then send the greeting "Hello <name>!" to the screen. Here is the code of the HELLO3.ASM sample:
        
        segment code
        ..start:
                ; Initialization
                mov     ax, data
                mov     ds, ax
                mov     ax, stack
                mov     ss, ax
                mov     sp, stacktop
                ; Ask for name (calling the DOS write-string function)
                mov     dx, qname
                mov     ah, 9
                int     0x21
                ; Get name from keyboard buffer (calling the DOS read-string function)
                mov     dx, buffer
                mov     ah, 0x0a
                int     0x21
                ; Display "hello" message (calling the DOS write-string function)
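                lea     esi, [buffer + 2]
                mov     cl, [buffer + 1]
                lea     edi, [hname]
                ; Skip the copy if the user entered nothing (a count of 0
                ; would otherwise wrap around to 255 in the loop below)
                test    cl, cl
                jz      endcopy
        copychar:
                mov     bl, [esi]
                mov     [edi], bl
                inc     esi
                inc     edi
                dec     cl
                test    cl, cl
                jnz     copychar
        endcopy:
                mov     byte [edi], '!'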
                mov     byte [edi + 1], 13
                mov     byte [edi + 2], 10
                mov     byte [edi + 3], '$'
                mov     dx, hello
                mov     ah, 9
                int     0x21
                ; Terminate the program (calling the DOS exit function)
                mov     ax, 0x4c00
                int     0x21
        segment data
        maxlen  equ     20
        buffer  db      maxlen + 1
                resb    maxlen + 2
        qname   db      'What is your name? ', '$'
        hello   db      13, 10, 'Hello '
        hname   resb    maxlen
                resb    4
        segment stack   stack
                resb    64
        stacktop:
        
      
Let's begin by having a look at the data segment. The pseudo-instruction EQU declares a constant. Assembly constants are as in higher-order programming languages: they can't be changed by the program and the assembler simply replaces all occurrences of them by the value they have been assigned. Our constant maxlen = 20 is the maximum length of the user name (an arbitrarily chosen value; the largest possible value is maxlen = 254, because maxlen + 1 has to fit into the single byte at the start of the buffer).
"qname" references the memory area where we have stored the string to ask the user for their name. Be sure not to forget the terminal dollar character if you use interrupt 21h to display a string on the screen (just try out what happens if you do...)! Note that the string is not followed by the characters 0Dh 0Ah; there will be no carriage-return + linefeed, the cursor will stay at the position after the last character displayed (and it's here that the name entered by the user will be displayed).
"hello" references the memory area for our greeting message. This area starts with some initialized data (a carriage-return + linefeed and the string "Hello "), followed by 20 bytes reserved for the name entered by the user and 4 further bytes of uninitialized space. Do you have an idea what they are for?
"buffer" references the memory area that will be used for the user input. As we saw when describing the DOS read-string function, the first byte of this area has to contain the maximum number of characters that the user may enter; in our case the length of the name + the carriage-return = maxlen + 1 (21). The remaining space of the buffer will be filled in by the DOS function. Remember that the first byte of the return data is the actual string length, and that the carriage-return is returned as last character. So, the length of the area of uninitialized data that we have to reserve is 1 + (maxlen + 1) = maxlen + 2 (22).
Now, the code segment. After the initialization that has to be done in programs built from OBJ output, we display the "What is your name? " string on the screen. This is done by loading the address of the first byte of the string ("qname") into DX, loading AH with 09h, the code of the DOS write-string function, and calling INT 21h. We then want the user to enter their name. As we saw before, in order to use INT 21h for this, DX has to be loaded with the memory address of the first byte of the buffer area ("buffer") and AH has to be loaded with 0Ah, the code of the DOS read-string function. Program execution is suspended until the user hits the ENTER key (CTRL+Break triggers interrupt 1Bh and CTRL+C is handled by DOS through interrupt 23h; in both cases the program will be aborted and the system will return to the DOS prompt). When the program execution resumes, our buffer will have been filled with the function return data. In order to display the greeting message, we'll have to copy the user name from the buffer to the memory area referred to by the variable "hname". This copy will be done byte by byte. As we will use the DOS write-string function to display the greeting, we'll have to add at least one character (the dollar symbol) to the output string. I will explain the code of all this data movement in the next section. The display itself is done as before (DX having to be loaded with the address of the first byte of the greeting message, i.e. "hello"). Finally, we terminate the program by calling the DOS exit function.
CPU registers and memory addresses.
To speed up the processor operations, the processor includes some internal memory storage locations, called registers. The registers store data elements for processing without having to access the memory. An x86 processor has a whole bunch of registers and I will only mention some of them here. If you are serious about assembly programming, you might want to have a look at NASM Assembly - Registers at the tutorialspoint website for details.
There are four 32-bit data registers that are used for arithmetic, logical, and other operations. These 32-bit registers can be used in three ways:
- As complete 32-bit data registers: EAX, EBX, ECX, EDX.
- The lower halves of the 32-bit registers can be used as four 16-bit data registers: AX, BX, CX and DX.
- Lower and higher halves of the four 16-bit registers mentioned above can be used as eight 8-bit data registers: AH, AL, BH, BL, CH, CL, DH, and DL.
For most operations, you can use any of the 4 (resp. 8) registers as you want. However, each of them also has specific functions. AX is the primary accumulator; it is used in input/output and most arithmetic instructions. For example, in a multiplication operation (assembly instruction with a single operand), one of the multiplication operands always has to be stored in EAX, AX or AL (according to the size of the operand). BX is called the base register; in 16-bit addressing, it is the only one of these four data registers that may be used for indirect addressing. For example, the instruction MOV [BX], AX causes the contents of AX to be stored in the memory location whose address is given in BX. CX is known as the count register, as the ECX and CX registers store the loop count in iterative operations (looping, shift and rotate, and string instructions). DX is known as the data register. It is used in input/output operations, and also (together with AX) for multiplication and division operations involving large values.
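To illustrate how the 8-bit and 16-bit register names overlap, and how AX and CX play their implicit roles, here is a minimal sketch. The values and the label "again" are arbitrary and only chosen for illustration; they are not part of the sample programs.

```nasm
        mov   ax, 1234h      ; AX = 1234h, so AH = 12h and AL = 34h
        mov   al, 0ffh       ; only the low byte changes: AX = 12FFh
        mov   bl, 10
        mul   bl             ; byte multiplication: AX = AL * BL = 255 * 10 = 09F6h
        mov   cx, 3          ; CX is the implicit counter of the LOOP instruction
again:
        inc   dx             ; loop body, executed 3 times
        loop  again          ; CX = CX - 1, jump to "again" while CX is not 0
```

Note that MUL BL never names AL or AX: the accumulator is implied by the instruction, just as CX is implied by LOOP.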
The 32-bit index registers, ESI and EDI, and their 16-bit rightmost portions SI and DI, are mostly used for indexed addressing. E.g.: the instruction MOV BL, [ESI + NUMLEN + 3] causes the contents of the memory location, whose address is given by the value in ESI + the value of the constant "numlen" + 3, to be stored in register BL. In string operations, SI is used as source index, and DI is used as destination index.
The status register, or flags register, is a collection of 1-bit values which reflect the current state of the processor and the results of recent operations. Many instructions change the status of the flags and some other conditional instructions test the value of the flags to take the control flow to another location. Here are some common flag bits:
| Flag | Description | 
|---|---|
| Carry | This flag is set if the last arithmetic operation ended with a leftover carry bit coming off the left end of the result. This signals an overflow on unsigned numbers. | 
| Overflow | This flag is set if the last arithmetic operation caused a signed overflow. For example, after adding 0001h to 7FFFh, resulting in 8000h; read as two's complement numbers, this corresponds to adding 1 to 32767 and ending up with -32768. | 
| Zero | This flag is set if the last computation had a zero result. After a comparison, this indicates that the values compared were equal (since their difference was zero). | 
| Sign | This flag is set if the last computation had a negative result (a 1 in the leftmost bit). | 
Most of the time you will not have to deal with the flags register explicitly; instead, you will execute one of the conditional branch instructions, Jcc, where cc is a mnemonic condition code, as in the following table:
| Codes | Description | 
|---|---|
| C, or NC | Carry resp. no carry | 
| O, or NO | Overflow resp. no overflow | 
| Z, or NZ | Zero resp. not zero | 
| E, or NE | Equal resp. not equal | 
| L, or NL; LE, or NLE | Less resp. not less; less or equal resp. not less or equal | 
| G, or NG; GE, or NGE | Greater resp. not greater; greater or equal resp. not greater or equal | 
| S, or NS | Sign resp. no sign (i.e. negative resp. not negative) | 
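As a small sketch of how a comparison and the conditional jumps work together, the following fragment compares two numbers and displays '<', '=' or '>'. The labels are hypothetical, and DOS function 02h (display the character in DL) is only used here to keep the example short.

```nasm
        mov   ax, 5
        cmp   ax, 7          ; computes 5 - 7 and sets the flags; AX is unchanged
        je    equal          ; not taken: the zero flag is clear
        jl    less           ; taken: 5 is less than 7 (signed comparison)
greater:
        mov   dl, '>'
        jmp   done
equal:
        mov   dl, '='
        jmp   done
less:
        mov   dl, '<'
done:
        mov   ah, 2          ; DOS function 02h: display the character in DL
        int   21h            ; displays '<' for the values used here
```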
Most assembly language instructions require operands to be processed. An operand address provides the location, where the data to be processed is stored. When an instruction requires two operands, the first operand is generally the destination, which contains data in a register or memory location and the second operand is the source. The source contains either the data to be delivered itself (immediate addressing) or the address (in a register or in memory) of the data. Generally, the source data remains unaltered after the operation. There are three basic modes of addressing: register addressing, immediate addressing, and memory addressing.
        In the register addressing mode, the operand is contained in a register. Depending upon the instruction, the register may be the first operand,
        the second operand or both. As processing data between registers does not involve memory, it provides fastest processing of data. Some examples:
        
           mov  dx, qname
           mov  ah, 9
           add  al, bl
           add  bl, al
        
        In the first two examples (from our HELLO3.ASM program), the register is the first operand, i.e. a destination; the address referred to by the variable "qname"
        is moved to the DX register, and the constant value 9 is moved to AH. In the third and fourth examples, both operands are registers; in both cases the values
        in registers AL and BL are added. In the third example, the result is stored in AL, in the fourth example, it is stored in BL.
      
In the immediate addressing mode, the operand is a constant value or an expression. When an instruction with two operands uses immediate addressing, the first operand is normally a register, and the second operand is an immediate constant. If labels are used, labels declared with EQU indicate a value, labels declared otherwise refer to an address. The first two examples above use immediate addressing. In the first example, the immediate operand is the address referred to by "qname", in the second example, the immediate operand is the number 9.
In the memory addressing mode, the operand is a value stored in memory. The memory operand may be the first operand, or the second operand, but not both (the other operand normally being a register). Operands specified in a memory-addressing mode require access to the main memory, usually to the data segment. As a result, they tend to be slower than either of the two previous addressing modes.
To locate a data item in the data segment, we need two components: the data segment start address and an offset value within the segment. The start address of the segment is typically found in the special purpose DS register. The offset value is often called the effective address.
        There are various memory-addressing modes, differing in the way how the offset value of the data is specified. In the direct addressing mode,
        the offset value is specified directly as part of the instruction. In an assembly language program, this value is usually indicated by the variable name of the data
        item. The assembler will translate this name into its associated offset value during the assembly process. To facilitate this translation, the assembler maintains a
        symbol table. The symbol table stores the offset values of all variables in the assembly language program. Examples from our HELLO3.ASM
        program:
        
           lea  edi, [hname]
           mov  cl, [buffer + 1]
           lea  esi, [buffer + 2]
        
        The first example loads the effective address of the variable "hname" into the EDI register. The second example does not load an address, but the byte stored at the
        memory location referred to by the variable "buffer" + 1 byte, i.e. the memory address located at an offset of +1 with respect to "buffer". If you consider what I said
        concerning the keyboard buffer, the instruction MOV CL, [BUFFER + 1] loads the CL register with the number of characters actually entered by
        the user. The third example loads the ESI register with the effective address immediately following the one used in example 2 (memory location with offset +2 with respect to
        "buffer"), which actually is the memory location containing the first character entered by the user. Important: Unlike what you may read on some Internet
        sites or in some assembly manuals written for other assemblers, with NASM, square brackets are mandatory whenever the content of a memory location is meant; a variable name without brackets always stands for its address!
      
        In the indirect addressing mode, the operand stored in memory is not coded (as address or variable) in the assembly instruction, but the
        operand's address is loaded into an index register (ESI, EDI, SI, DI; you can also use EBX and BX) and this register is used in the instruction
        to designate the memory operand. As this operand actually is the value stored at the address contained in the index register and not the value in the index register,
        the index register has to be put between square brackets. Examples from HELLO3.ASM:
        
           mov  bl, [esi]
           mov  [edi], bl
        
        In the first example, the content of the address contained in ESI is moved into the BL register (as BL is an 8-bit register, 1 byte will be moved). In the second
        example, the content of BL is moved into the address contained in EDI. With the two instructions together, we actually move one byte of data from one memory location
        to another. And in our sample program, by incrementing the address in ESI and EDI and repeating the move, we can copy the name entered by the user from the keyboard
        buffer area to the "hname" area in our output string.
      
        I said above that immediate addressing is normally used with a register as first operand, and that in memory addressing mode, one of the operands is normally a register.
        NASM also allows you to use a constant value with memory addressing. However, this presents a problem. If a register is involved, we know the
        size of the other operand, which is equal to the size of the register. Thus, moving a constant to DL will move one byte, moving it to DX will move 2 bytes. But what if the
        destination is a memory address? How does the assembler know how many bytes we want to move to this, and possibly the following addresses? It can't know, and
        that's why we explicitly tell it by using a type specifier. In NASM, these specifiers are: BYTE (1 byte),
        WORD (2 bytes), DWORD (4 bytes), QWORD (8 bytes), and TWORD (10
        bytes; called TBYTE in some other assemblers). Examples from our HELLO3.ASM program:
        
           mov  byte [edi + 1], 13
           mov  byte [edi + 2], 10
           mov  byte [edi + 3], '$'
        
        In all three examples, we move one single byte. In fact, there are three bytes that are moved to 3 successive memory locations. Do you recognize the byte sequence?
        Carriage-return + linefeed, followed by a dollar symbol to terminate the memory area containing a string that will be written to the screen using DOS interrupt 21h.
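
To make the effect of the type specifier visible, here is a minimal sketch that writes the same constant with different sizes. The variable "var" is hypothetical and is assumed to have at least 4 bytes reserved in the data segment.

```nasm
        mov   byte [var], 1      ; writes 1 byte:  01h
        mov   word [var], 1      ; writes 2 bytes: 01h 00h (low byte first)
        mov   dword [var], 1     ; writes 4 bytes: 01h 00h 00h 00h (386 or later)
        ; mov [var], 1           ; rejected by NASM: operation size not specified
        mov   [var], al          ; no specifier needed: AL implies 1 byte
```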
      
        A final point that has to be discussed before we return to our HELLO3.ASM program is the way data is stored in memory. Data has
        essentially 4 sources: user input via the keyboard, constants and declared variables in the assembly program, results of some computation, and data read from a file.
        Keyboard input always produces ASCII codes (and screen output always has to be ASCII codes), regardless of whether a character, a string or a number
        is entered. In an assembly program, a character value is stored as an ASCII code, a string as a sequence of ASCII codes, stored at successive memory addresses.
        All integer numbers used in an assembly program are stored in binary format (usually written in hexadecimal notation), regardless of whether they are written as hexadecimal, decimal
        or otherwise in the source (except for a string representation). Integer computations, like arithmetic or logical operations, normally expect such binary numbers as input, and generate
        a binary number as result. Whether the integers have to be treated as signed or unsigned is up to the programmer. Floating point numbers have their own format. Some
        examples from our HELLO3.ASM program:
        
             maxlen  equ  20
             buffer  db   maxlen + 1
             hello   db   13, 10, 'Hello '
        
        The first example doesn't store anything in memory; the constant will be replaced by its value: 20 -> 14h.
        In the second example the value 21 -> 15h will be stored at address "buffer" (more exactly with an offset +"buffer" with respect to the content of the DS register).
        The third example stores a sequence of bytes at address "hello" and following: 13 -> 0Dh, 10 -> 0Ah, Hello + space -> 48h 65h 6Ch 6Ch 6Fh 20h.
      
Things become a little bit more complicated when storing multibyte data. This applies when moving the content of 16- or 32-bit registers to memory, or declaring variables with DW, DD, etc. In fact, there are two ways to store this data, two completely different byte-ordering schemes possible: Either, the processor stores the most significant byte before the least significant one, or it stores it after it. In both cases, the memory area has to be referred to by specifying the lowest memory address. With a word (2 bytes) of data, the MSB is stored at this address and the LSB at the following one, in the case of CPUs that use big-endian byte ordering. In the case of CPUs that use little-endian byte ordering (as do x86 CPUs), it's the LSB that is stored at the lowest address (the one referred to by the variable), and the MSB that is stored at the following one.
        Let's see an example. Consider the following code:
        
             num  resb  2
                  mov  ax, 25159
                  mov  [num], ax
        
        25159 decimal is 6247h hexadecimal, so a word-size number will be stored at memory locations num and num+1. With little-endian byte ordering,
        the LSB is stored at the lowest address (the one referred to by the variable). Thus, our data in memory will be as follows: byte 47h stored at address num and byte 62h
        stored at address num+1.
      
        Here is an example with a negative number:
        
             num  dw  -29255
        
        -29255 decimal is 8DB9h hexadecimal, and will be stored as follows: byte B9h stored at address num and byte 8Dh stored at address num+1.
      
        And an example with a double-word:
        
             num  dd  542803535
        
        542803535 decimal is the 32-bit number 205A864Fh hexadecimal, and will be stored as: 4Fh at address num, 86h at num+1, 5Ah at num+2, and 20h at num+3.
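
The byte order can be checked by reading the stored word back one byte at a time. This is a minimal sketch, assuming a hypothetical variable "num" with at least 2 bytes reserved in the data segment.

```nasm
        mov   word [num], 6247h  ; stores 47h at num and 62h at num + 1
        mov   al, [num]          ; AL = 47h (least significant byte)
        mov   ah, [num + 1]      ; AH = 62h (most significant byte), so AX = 6247h
        mov   bx, [num]          ; a word read reassembles the value: BX = 6247h
```

As long as a value is written and read with the same size, the byte order is handled by the CPU and you don't have to care about it; it only matters when you access the individual bytes.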
      
        So, we have now all the necessary knowledge to understand how, in our HELLO3.ASM program, the name entered by the user is moved to the corresponding area within the
        greeting output string. We will not use the x86 string instructions, but will do the move byte by byte. First we initialize our copy routine with the code
        
             lea  esi, [buffer + 2]
             lea  edi, [hname]
             mov  cl, [buffer + 1]
        
        We load the index register ESI with the address of the first character of the name entered by the user (remember that the keyboard input returned by the DOS read-string
        function starts within the buffer at the offset +2), and the index register EDI with the address within the output string area where the first character of the name has
        to be placed. As we will copy the name byte by byte, we need a loop counter. Here, we use the register CL, that we initialize with the length of the name (that has been
        stored into the buffer area at offset +1 by the DOS read-string function).
      
Note: Maybe you wonder about the LEA instruction used here. LEA means "load effective address" and, with a simple variable name as operand, it gives the same result as a MOV of the address: LEA EDI, [hname] and MOV EDI, hname both load EDI with the address of "hname". The difference is that with MOV, the address is a constant calculated at assembly time, whereas with LEA, the address is calculated by the CPU at runtime. This allows you to use instructions like, for example, LEA EBX, [array + ESI] to load EBX with the address of an element of "array", whose index (byte offset) is in the ESI register. Note that when using a register's content to compute an address, you cannot use MOV this way: MOV EBX, array + ESI is not permitted, because at assembly time, the value in ESI is not known (and MOV EBX, [array + ESI], which is a valid instruction, loads the content of the memory location and not its address). The square brackets around the second operand of LEA may seem illogical, as it's an address (and not the content at an address) that is loaded. The reason is that this operand is written exactly like a memory operand; LEA just calculates its address without accessing the memory.
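The following fragment is a small sketch of where LEA becomes useful. The variable "table" is hypothetical (a sequence of bytes declared in the data segment), and 16-bit registers are used for simplicity.

```nasm
        mov   bx, table          ; BX = address of "table", a constant known at assembly time
        lea   bx, [table]        ; same result, but calculated by the CPU
        mov   si, 3
        lea   bx, [table + si]   ; BX = address of "table" + 3; SI is only known at runtime
        mov   al, [bx]           ; AL = byte stored at that address (4th byte of "table")
```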
        Here is the code of the loop that copies the name:
        
             copychar:
               mov   bl, [esi]
               mov   [edi], bl
               inc   esi
               inc   edi
               dec   cl
               test  cl, cl
               jnz   copychar
        
        We copy the byte from the source address (stored in ESI) to the destination address (stored in EDI). Then we increment the two index registers, which thus point to the
        next address within the source resp. destination area. For each byte copied, we decrement the counter (CL register). The loop is terminated when the counter is zero
        (the TEST instruction used here does a logical AND of its two operands, without changing the destination operand, but setting the flags; the
        zero flag will only be set here if the value in CL is 0).
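
For comparison only, here is how the same copy might look with the x86 string instruction MOVSB, which HELLO3.ASM deliberately does not use. This is a sketch under the assumption that "buffer" and "hname" are declared as in HELLO3.ASM; it uses the 16-bit registers SI, DI and CX, so the code that follows would have to use DI instead of EDI.

```nasm
        push  ds
        pop   es                 ; MOVSB writes to ES:DI, so ES must equal DS
        cld                      ; direction flag cleared: SI and DI are incremented
        mov   si, buffer + 2     ; source: first character of the name
        mov   di, hname          ; destination within the greeting string
        mov   cl, [buffer + 1]   ; length of the name ...
        xor   ch, ch             ; ... extended to 16 bits in CX
        rep   movsb              ; copy CX bytes from DS:SI to ES:DI
```

After REP MOVSB, DI points to the byte just behind the copied name, as EDI does after the loop above. If the user hasn't entered anything, CX is 0 and nothing is copied.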
      
The following lines of HELLO3.ASM are easy to understand. We add some further characters to the output string (an exclamation mark and a carriage-return + linefeed), then we add, and this is really important as you should know, the dollar symbol to terminate the output string (note the usage of the type specifier BYTE in these moves of an immediate addressing mode operand to a memory location), and finally we load DX with the address of the output string (and AH with function code 09h) to display the greeting message, calling the DOS write-string function.
Introduction to ASCII arithmetic.
Besides showing how to use NASM on DOS, the purpose of this tutorial also is to give people who want to learn assembly programming some basic knowledge in order to create their own programs, so it obviously has to contain some details concerning numbers and how to process them. I said above that all keyboard input and screen output is in ASCII format. I also said that arithmetic and logical operations normally work on binary numbers (written in hexadecimal notation in this tutorial). One way to work with integers would thus be to convert the input operands from ASCII to binary numbers, to perform the calculations, and to convert the results back to ASCII in order to display them on the screen. This way to proceed is not covered in this tutorial. If you seriously plan to write assembly language programs performing calculations, you should download one of the assembly books available on the Internet; chances are that you'll find the code of such conversion routines in most of them.
Instead of converting the numbers from ASCII to binary and vice-versa, the tutorial shows how you can do calculations on integers as they are when entered from the keyboard, i.e. integers in ASCII representation. The x86 processor includes instructions to perform ASCII arithmetic operations. In short (and without discussing the details of how this actually works), it's using the ASCII codes of two integer digits as operands, using the standard addition, subtraction, multiplication, or division instructions to calculate the result, which then has to be adjusted using the corresponding ASCII adjust instruction (this gives a result in unpacked BCD representation, which is easily converted to ASCII in order to be displayed on the screen).
The following table shows the x86 arithmetic instructions and the corresponding ASCII adjust instructions.
| Operation | Arithmetic instruction | ASCII adjust instruction | 
|---|---|---|
| Addition | ADD, ADC | AAA (ASCII adjust after addition) | 
| Subtraction | SUB, SBB | AAS (ASCII adjust after subtraction) | 
| Multiplication | MUL | AAM (ASCII adjust after multiplication) | 
| Division | DIV | AAD (ASCII adjust before division) | 
ASCII addition: As the ASCII adjust instructions require the usage of the AX register, a typical ASCII addition (with 2 1-digit operands) consists in the following:
- Making sure that the AH register is cleared.
- Moving the first operand to AL.
- Adding the second operand (register, immediate, or memory addressing) to AL (result in AL).
- Doing the ASCII adjust after addition (result as unpacked BCD representation in AX).
- Converting the unpacked BCD to ASCII (for example, using OR AX, 3030h).
        Example:
        
           sub  ah, ah      ; clear AH
           mov  al, '6'     ; AL = 36h
           add  al, '7'     ; AL = 36h + 37h = 6Dh
           aaa              ; AX = 0103h
           or   ax, 3030h   ; AX = 3133h
        
      
Note: Unpacked BCD (binary coded decimal) is an integer representation where each digit of the decimal number is stored in a byte of its own, as the binary value (0 to 9) of this digit. Examples for a 2-byte unpacked BCD: 8 = 0008h, 12 = 0102h.
ASCII subtraction: It works the same way as ASCII addition, using SUB and AAS instead of ADD and AAA. If there is a borrow, the carry flag is set and can be used with the SBB (subtract with borrow) instruction in order to perform ASCII subtractions on multi-digit operands, using a loop.
        Example:
        
           sub  ah, ah      ; clear AH
           mov  al, '9'     ; AL = 39h
           sub  al, '3'     ; AL = 39h - 33h = 06h
           aas              ; AX = 0006h
           or   ax, 3030h   ; AX = 3036h
        
      
If you do these calculations for the operation 3 - 9, the AAS instruction will give the result AX = FF04h and after conversion from BCD to ASCII, AL = 34h, ASCII code for the character '4'. The number is recognized as negative, as the AAS instruction has set the carry flag. The result itself is obviously not what it should be. So, another way has to be used to perform subtractions with a negative result. On the other hand, the result obtained is useful for subtractions of multi-digit operands that involve a borrow. In the subtraction 53 - 29, for example, the first loop iteration gives 3 - 9 -> AL = 34h (cf. above) and the second one (we use SBB and the carry has been set because of the borrow) gives 5 - 2 - 1 -> AL = 32h. The result of the operation is thus 3234h. These are the ASCII codes for '2' and '4', and 24 is indeed the correct result of the subtraction.
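The subtraction 53 - 29 described above could be coded as in the following sketch. It is not part of the sample programs; the variables "num1", "num2" and "diff" and the label "nextdigit" are hypothetical, and the first operand is assumed to be greater than or equal to the second one.

```nasm
        ; in the data segment
num1    db    '53'
num2    db    '29'
diff    resb  2

        ; in the code segment
        mov   si, 1              ; index of the last (least significant) digit
        mov   cx, 2              ; number of digits
        clc                      ; no borrow before the first digit
nextdigit:
        mov   al, [num1 + si]
        sbb   al, [num2 + si]    ; subtract the digit and the previous borrow
        aas                      ; adjust; the carry flag is set if there is a borrow
        pushf                    ; OR clears the carry flag, so we save the flags
        or    al, 30h            ; unpacked BCD -> ASCII
        popf                     ; restore the borrow for the next digit
        mov   [diff + si], al
        dec   si                 ; DEC does not change the carry flag
        loop  nextdigit          ; LOOP does not change any flag
```

With the values above, the first iteration stores 34h at diff + 1 and the second one stores 32h at diff, so "diff" contains the string '24'.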
ASCII multiplication: The AAM instruction is used to adjust the result of a MUL instruction. Note that MUL has only one operand; for byte multiplication, the other operand must be a value in the AL register, and the result is a word that will be stored in AX (in the case of a word multiplication, one operand must be in AX; the result will be a double word with its MSW stored in DX and its LSW stored in AX). It is not necessary here to clear the AH register. On the other hand, multiplication should not be performed on ASCII numbers; use unpacked BCD operands instead. This means that for two numbers entered from the keyboard, we'll have to convert them from ASCII to unpacked BCD: 30h -> 00h, 31h -> 01h, ... To do so, we have to mask off the upper four bits of the number in ASCII representation (which may be done by an AND operation of the register containing the operand and 0Fh).
        Multiplication example:
        
           mov  al, '3'     ; AL = 33h (ASCII)
           mov  bl, '9'     ; BL = 39h (ASCII)
           and  al, 0fh     ; AL = 03h (BCD)
           and  bl, 0fh     ; BL = 09h (BCD)
           mul  bl          ; AX = 03h * 09h = 001Bh
           aam              ; AX = 0207h
           or   ax, 3030h   ; AX = 3237h
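
Division is not used in the sample programs of this tutorial, but for completeness, here is a minimal sketch of how AAD is applied before DIV. The values are arbitrary; the dividend is a 2-digit unpacked BCD number in AX, and the quotient is assumed to be a single digit.

```nasm
        mov   ax, 0207h      ; unpacked BCD for 27 (AH = tens, AL = units)
        mov   bl, 4          ; divisor (binary value, not ASCII)
        aad                  ; AL = AH * 10 + AL = 1Bh, AH = 0 -> AX = 001Bh
        div   bl             ; AL = quotient = 6, AH = remainder = 3
        or    al, 30h        ; quotient as ASCII: AL = 36h ('6')
```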
        
      
So, we now have all the necessary knowledge to understand the CALC.ASM sample. The program allows the user to add, subtract and multiply two 1-digit positive integers (no calculation if the subtraction result is negative). The user is asked for an operation string of the form <operand1><operator><operand2>, where <operand1> and <operand2> are numbers from 0 to 9 (no validity check done), and <operator> is one of the symbols '+', '-', and '*'. Here is the code:
        
        segment code
        ..start:
                ; Initialization
                mov     ax, data
                mov     ds, ax
                mov     ax, stack
                mov     ss, ax
                mov     sp, stacktop
                ; Display title
                mov     dx, stitle
                mov     ah, 9
                int     21h
                ; Ask for operation
                mov     dx, sop
                mov     ah, 9
                int     21h
                ; Get operation from keyboard buffer
                mov     dx, buffer
                mov     ah, 0ah
                int     21h
                mov     al, [buffer + 2]              ; first operand
                mov     ah, 0
                mov     bl, [buffer + 4]              ; second operand
                mov     cl, [buffer + 3]              ; operator
                ; Branch depending on operator
                cmp     cl, '+'
                je      addition
                cmp     cl, '-'
                je      subtraction
                cmp     cl, '*'
                je      multiplication
                ; Display invalid operator message
                mov     dx, serr
                mov     ah, 9
                int     0x21
                jmp     exit
                ; Do addition
        addition:
                add     al, bl
                aaa
                jmp     result
                ; Do subtraction
        subtraction:
                cmp     al, bl
                jl      negative
                sub     al, bl
                aas
                jmp     result
        negative:
                mov     dx, sneg
                mov     ah, 9
                int     21h
                jmp     exit
                ; Do multiplication
        multiplication:
                and     al, 0fh
                and     bl, 0fh
                mul     bl
                aam
                ; Display result
        result:
                or      ax, 3030h
                mov     [res], ah
                mov     [res + 1], al
                mov     dx, sresult
                mov     ah, 9
                int     21h
                ; Terminate the program
        exit:
                mov     ax, 0x4c00
                int     21h
        segment data
        stitle  db      'Addition, subtraction, and multiplication', 13, 10
                db      'of two 1-digit positive integers', 13, 10, '$'
        sop     db      'Operation ? ', '$'
        serr    db      13, 10, 'Unknown operator!', 13, 10, '$'
        sneg    db      13, 10, 'Result is negative', 13, 10, '$'
        buffer  db      4
                resb    5
        sresult db      13, 10, 'Result = '
        res     resb    2
                db      13, 10, '$'
        segment stack   stack
                resb    64
        stacktop:
        
      
I think that the code of this program shouldn't be too difficult to understand. We read the operation from the keyboard and move the first operand to AL, the second operand to BL and the operator to CL (remember that user input in the buffer starts with offset +2). We then check which operator has to be used. The instruction CMP compares two values and sets the flags according to the comparison result; here we use the JE (jump if equal) instruction to branch to the label corresponding to the operation that we want to perform. If the operator is unknown, we display an error message. Addition and multiplication are done as explained above. The same for subtraction, but only after having tested that the result is not negative; we compare the two operands using CMP and use a JL (jump if less) to branch to the "negative" label (display of the message "Result is negative" instead of doing the subtraction). All 3 operations lead to the label "result" where we convert the BCD value in AX to ASCII, then move the operation result to the corresponding area within the output string. Note that we don't move AX as a whole, but move AH and AL separately; this is necessary because the little-endian byte ordering used by the x86 CPUs would reverse the two bytes! And finally we display the result string calling the DOS write-string function.
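If you prefer to store the two digits with a single word move, the bytes have to be swapped first. The following lines are a sketch of an alternative to the two byte moves in CALC.ASM, not the code of the sample; "res" is the variable declared in the program.

```nasm
        or    ax, 3030h      ; AH = tens digit, AL = units digit (both ASCII)
        xchg  ah, al         ; swap them: the tens digit is now in the low byte
        mov   [res], ax      ; low byte (tens) -> res, high byte (units) -> res + 1
```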
        The sample program CALC2.ASM included in the download archive is identical to CALC.ASM, except that user input, calculation and the display
        of the result are placed within a loop. This allows you to do successive calculations without leaving the program. The simplest way to leave the
        loop (and then terminate the program) is to do so if the user hasn't entered any operation (just hit the ENTER key). This can, for example, be done by the following
        code ("exit" being the label referring to the address where the program termination code starts):
        
           mov  al, [buffer + 1]
           cmp  al, 0
           je   exit
        
        All we have to do is to check if the number of characters entered by the user (this value has been placed into the keyboard buffer at offset +1 by the DOS read-string
        function) is zero and if so exit the loop and terminate the program.
      
Here is the code of sample CALC3.ASM, which is a further extension of our simple calculation programs: If a subtraction has a negative result, instead of displaying a message, we do the calculation and display the negative result.
        
        segment code
        ..start:
                ; Initialization
                mov     ax, data
                mov     ds, ax
                mov     ax, stack
                mov     ss, ax
                mov     sp, stacktop
                ; Display title
                mov     dx, stitle
                mov     ah, 9
                int     21h
                ; Ask for operation
        loop:
                mov     dx, sop
                mov     ah, 9
                int     21h
                ; Read operation into keyboard buffer
                mov     dx, buffer
                mov     ah, 0ah
                int     21h
                ; Exit loop if no entry
                mov     al, [buffer + 1]
                cmp     al, 0
                je      exit
                ; Get operands and operator from keyboard buffer
                mov     al, [buffer + 2]              ; first operand
                mov     ah, 0
                mov     bl, [buffer + 4]              ; second operand
                mov     cl, [buffer + 3]              ; operator
                ; Branch depending on operator
                cmp     cl, '+'
                je      addition
                cmp     cl, '-'
                je      subtraction
                cmp     cl, '*'
                je      multiplication
                ; Display invalid operator message
                mov     dx, serr
                mov     ah, 9
                int     0x21
                jmp     loop
                ; Do addition
        addition:
                mov     byte [res], ' '
                add     al, bl
                aaa
                jmp     result
                ; Do subtraction
        subtraction:
                mov     byte [res], ' '
                cmp     al, bl
                jge     subtract
                mov     cl, al
                mov     al, bl
                mov     bl, cl
                mov     byte [res], '-'
        subtract:
                sub     al, bl
                aas
                jmp     result
                ; Do multiplication
        multiplication:
                mov     byte [res], ' '
                and     al, 0fh
                and     bl, 0fh
                mul     bl
                aam
                ; Display result
        result:
                or      ax, 3030h
                mov     [res + 1], ah
                mov     [res + 2], al
                mov     dx, sresult
                mov     ah, 9
                int     21h
                jmp     loop
                ; Terminate the program
        exit:
                mov     ax, 0x4c00
                int     21h
        segment data
        stitle  db      'Addition, subtraction, and multiplication', 13, 10
                db      'of two 1-digit positive integers', 13, 10, '$'
        sop     db      'Operation ? ', '$'
        serr    db      13, 10, 'Unknown operator!', 13, 10, '$'
        buffer  db      4
                resb    5
        sresult db      13, 10, 'Result = '
        res     resb    3
                db      13, 10, '$'
        segment stack   stack
                resb    64
        stacktop:
        
      
Let's have a look at the code. First of all, our output number now has a length of 3 characters: the sign plus two digits; so, we have to reserve 3 bytes for the variable "res" (vs. 2 bytes in CALC.ASM). The first part of the code is identical to the one in CALC.ASM, except that I use a loop for successive calculations, as described for CALC2.ASM. Addition and multiplication are the same as in CALC.ASM, except that we have to consider the sign of the result: as it is always positive, we move a space to "res" offset +0. The code for the subtraction is different, of course. Doing calculations with negative numbers does not necessarily mean that we have to work with negative binary numbers. This is the case here, where the subtraction a - b, with a < b, can be calculated as b - a (moving a minus sign to the first byte of the result area). And that's what's done in this program: I check if the first operand is less than the second one. If not, I just do the subtraction (I assumed the result to be positive and moved a space to the "sign location" before doing the compare); if so, I simply swap the operands (and move a '-' to the "sign location"), then continue with the standard subtraction code. The display of the result is the same as in CALC.ASM, but the moves have to be adapted. Because of the sign, the MSB now has to be moved to the location with offset +1, and the LSB to the location with offset +2 (vs. offsets +0 and +1 in CALC.ASM).
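To check the swap logic on paper, here is a small Python model of the subtraction branch. It is only a sketch, not part of the tutorial's sample programs: the function name subtract_digits is made up for this illustration, and the model assumes that both inputs are single ASCII digits, as CALC3.ASM does.

```python
def subtract_digits(a: str, b: str) -> str:
    """Model of the CALC3.ASM subtraction; returns the 3-character "res" field."""
    x, y = ord(a), ord(b)   # ASCII codes, as they arrive in AL and BL
    sign = ' '              # assume a non-negative result
    if x < y:               # first operand smaller: swap and remember the sign
        x, y = y, x
        sign = '-'
    diff = x - y            # SUB AL, BL; there is no borrow, so AAS has nothing to adjust
    # OR AX, 3030h turns AH = 0 and AL = diff into the two ASCII digits
    return sign + '0' + chr(diff | 0x30)


print(subtract_digits('3', '8'))   # -05
print(subtract_digits('8', '3'))   #  05 (with a leading space)
```

As the swap guarantees that the larger digit is always the minuend, the model never has to deal with a borrow; the sign is the only thing that differs between the two cases.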
Multibyte integer arithmetic.
Being able to do arithmetic operations on 1-digit numbers doesn't take us really far, so some hints concerning arithmetic operations with integers with more than 1 digit should not be missing in the tutorial.
The sample ADD6.ASM asks for two 6-digit positive integers and outputs their sum (as a 7-digit positive integer) onto the screen. As in the programs before, there is no numeric validity check; the user has to enter the full 6 digits (i.e. including the leading zeros), and the result is always displayed as a 7-digit number (i.e. with the leading zeros). It should not be too difficult to adapt the program for subtraction; multiplication would probably need a bigger effort. One nice thing in this program: the number length (number of digits) is defined as a constant, and this constant is used to calculate offsets (instead of using numeric literals). This allows us to adapt the code for the addition of two positive integers of any size (up to 254 digits, as the buffer size numlen + 1 has to fit into one byte) by changing one single line in the source. Here is the code:
        
        segment code
        ..start:
                ; Initialization
                mov     ax, data
                mov     ds, ax
                mov     ax, stack
                mov     ss, ax
                mov     sp, stacktop
                ; Ask for first number
        loop:
                mov     dx, snum1
                mov     ah, 09h
                int     21h
                ; Read first number
                mov     dx, num1
                mov     ah, 0ah
                int     21h
                mov     al, [num1 + 1]
                cmp     al, 0
                je      exit                        ; exit the loop if no input
                ; Ask for second number
                mov     dx, snum2
                mov     ah, 09h
                int     21h
                ; Read second number
                mov     dx, num2
                mov     ah, 0ah
                int     21h
                ; Do the addition
                lea     esi, [num1 + numlen + 1]
                lea     edi, [res + numlen]
                mov     cl, numlen
                clc
                pushf
        digit:
                mov     al, [esi]                   ; first operand
                mov     ah, 0
                mov     bl, [esi + numlen + 3]      ; second operand
                popf
                adc     al, bl
                aaa
                pushf
                or      ax, 3030h
                mov     [edi], al
                dec     esi
                dec     edi
                dec     cl
                jnz     digit
                mov     [edi], ah
                popf
                ; Display result
                mov     dx, sresult
                mov     ah, 09h
                int     21h
                jmp     loop
                ; Terminate the program
        exit:
                mov     ax, 0x4c00
                int     21h
        segment data
        numlen  equ     6
        stitle  db      13, 10, 'Addition of two 6-digit positive integers', '$'
        snum1   db      13, 10, 'First number  ? ', '$'
        snum2   db      13, 10, 'Second number ? ', '$'
        sresult db      13, 10, 'Result = '
        res     resb    numlen + 1
                db      13, 10, '$'
        num1    db      numlen + 1
                resb    numlen + 2
        num2    db      numlen + 1
                resb    numlen + 2
        segment stack   stack
                resb    64
        stacktop:
        
      
The program consists of a loop that reads the two operands, does the addition and displays the result. The loop (and the program) is terminated if the user enters no data (just hits ENTER) when asked for the first operand. The two operands are read into the two keyboard input buffers "num1" and "num2". The value at offset +1 in the "num1" buffer area is the length of the first operand read; testing if this value is 0 allows us to exit the loop if the user didn't enter any data for the first operand.
A multibyte addition is done in a loop, where each iteration adds two 1-digit operands (this is what we did in the program samples before), starting with the LSB and continuing until all digits have been processed. The important thing here is to consider the possible carry: if a 1-digit operands addition produces a carry, it has to be added when adding the following two 1-digit operands. The obvious problem that occurs here is that the operations after the addition can (and will) modify the carry flag, and we lose its value. This means that after a 1-digit operands addition has been done, we have to save the carry flag, in order to be able to add it when performing the following 1-digit operands addition. The method used here is probably the easiest way to save the value of any flag: push the flags register onto the stack and pop it back when you need the flag values from before.
        Let's have a look at the loop initialization code:
        
           lea  esi, [num1 + numlen + 1]
           lea  edi, [res + numlen]
           mov  cl, numlen
           clc
           pushf
        
        We load ESI with the address of the first operand (using offsets, we will also use ESI to point to the second operand), and EDI with the address of the addition
        result. CL is loaded with the loop counter variable, i.e. the number of the operands' digits. To do the multibyte addition, we must proceed from LSB to MSB, so
        starting with the last digit of the operands and writing the corresponding sum digit to the last digit location of the result area. With the first digit located
        at offset +0, the nth digit is located at an offset of +(n-1). As the operands have numlen digits and the first digit in the "num1" buffer has an offset of +2,
        the address of the last digit of operand 1 is given by num1 + 2 + (numlen - 1) = num1 + numlen + 1. The result area is one digit longer than the operands' length
        (numlen + 1), so the address of the last digit of the result is given by res + [(numlen + 1) - 1] = res + numlen.
        The last two lines of the code above concern the carry. As within the loop we pop the flags register before doing the 1-digit operands addition, we have to push
        it onto the stack here; this is done using the instruction PUSHF. To be sure that the carry is zero for the first addition, we perform
        a CLC (clear carry) before pushing the register.
      
The addition of the two 1-digit operands is similar to the one in the programs before (as we have to consider the carry of the preceding addition, we must use the instruction ADC instead of ADD): We load one operand into AL, the other into BL, add the two (plus the carry) with the result in AL, then perform the AAA and OR AX, 3030h, and finally store AL to its correct position within the "res" variable area. The address of the first operand's digit is contained in ESI, the corresponding result digit's address in EDI. The address of the second operand's digit is computed by adding a fixed offset to the address of the first operand's digit (content of ESI). If you look at the data segment, you can see that the two numbers' buffer areas follow each other, the buffer of the first operand ending with 1 extra location for the carriage-return, the buffer of the second operand beginning with two extra locations for the maximum and the actual string length. Besides the extra locations, the offset depends on the operands' size (number of digits), declared by the constant "numlen". The second operand's address is thus given by ESI + numlen + 1 + 2 = ESI + numlen + 3.
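The offset arithmetic can be verified with a few lines of Python. This is a sketch under the assumption that the data segment is laid out exactly as in the listing above ("res", then 13, 10, '$', then "num1", then "num2"); the function name layout and the dictionary keys are made up for this illustration, and all offsets are counted from the start of "res".

```python
def layout(numlen):
    """Offsets, relative to "res", of the areas that ADD6.ASM works on."""
    res = 0                          # res: numlen + 1 result digits
    num1 = res + (numlen + 1) + 3    # skip the result digits and 13, 10, '$'
    num2 = num1 + 1 + (numlen + 2)   # skip num1: size byte plus numlen + 2 bytes
    return {
        "num1": num1,
        "last_res": res + (numlen + 1) - 1,     # res + numlen
        "last_num1": num1 + 2 + (numlen - 1),   # num1 + numlen + 1
        "last_num2": num2 + 2 + (numlen - 1),
    }


offsets = layout(6)
print(offsets["last_num1"] - offsets["num1"])        # 7, i.e. numlen + 1
print(offsets["last_num2"] - offsets["last_num1"])   # 9, i.e. numlen + 3
```

Whatever value is chosen for numlen, the distance between corresponding digits of the two operands stays numlen + 3, which is why a single constant is enough to adapt the program.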
The carry that we need to consider in our ADC instruction is the one set (or not) by AAA; thus, it is immediately after this instruction that we have to save it (using PUSHF to push the flags register onto the stack). And to use this carry with the ADC of the next digit, we restore it with a POPF immediately before the addition instruction.
Finally, we have to point the indexes to the next digit of the operands and of the result; this is done by decrementing the values in ESI and EDI (decrementing, because we add the digits from LSB to MSB). We also decrement the loop counter (number of digits that remain to be processed). And we continue looping until this counter is zero.
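To see why the carry has to be taken after AAA, it may help to follow the two instructions step by step. The following Python function is a simplified model (my own sketch, with the made-up name adc_aaa) of ADC AL, BL followed by AAA; it assumes that AL and BL hold ASCII digits (30h to 39h) and that AH is 0 on entry, as in ADD6.ASM.

```python
def adc_aaa(al, bl, carry):
    """Model of ADC AL, BL followed by AAA, with AH = 0 on entry.

    Returns (ah, al, carry) as they stand after AAA.
    """
    # ADC: the auxiliary flag AF is the carry out of the low nibble
    af = ((al & 0x0F) + (bl & 0x0F) + carry) > 0x0F
    al = (al + bl + carry) & 0xFF
    ah = 0
    # AAA: adjust if the low nibble is not a valid decimal digit, or if AF is set
    if (al & 0x0F) > 9 or af:
        al = (al + 6) & 0xFF
        ah = 1
        carry = 1       # AAA sets CF: this is the decimal carry to save
    else:
        carry = 0       # AAA clears CF
    return ah, al & 0x0F, carry


print(adc_aaa(0x39, 0x39, 1))   # (1, 9, 1): 9 + 9 + 1 = 19
print(adc_aaa(0x32, 0x33, 0))   # (0, 5, 0): 2 + 3 = 5
```

The binary addition of two ASCII digits never overflows a byte, so the carry flag left by ADC itself is always 0; the decimal carry only exists once AAA has done its adjustment.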
Two things remain to be done when the loop has terminated. First, we haven't yet considered the carry of the MSB digits addition (the case where the result is greater than 999999). As we saw when discussing the 1-digit operands ASCII addition, the addition result is in AX, so after the OR AX, 3030h, AH contains the correct value: 31h if there has been a carry, 30h if not. Thus, we can use the instruction MOV [EDI], AH to fill in the MSB of the result (EDI, having been decremented after the last addition, now points to the first location within the result's memory area, i.e. the location where we want to store AH). Second, the flags pushed by the PUSHF of the last iteration are still on the stack; the final POPF removes them, so that the stack is balanced again before the next calculation.
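Seen from a higher level, the whole addition loop does nothing else than the schoolbook addition. Here is a short Python sketch of it (the function name add_decimal_strings is made up for this illustration; the comments relate each step to the corresponding instructions of ADD6.ASM, and the inputs are assumed to be digit strings of equal length).

```python
def add_decimal_strings(a: str, b: str) -> str:
    """Add two equal-length strings of decimal digits, from the last digit to the first."""
    assert len(a) == len(b)
    result = [''] * (len(a) + 1)           # one digit more than the operands
    carry = 0                              # CLC before the loop
    for i in range(len(a) - 1, -1, -1):    # last digit first, like DEC ESI / DEC EDI
        total = int(a[i]) + int(b[i]) + carry
        carry, digit = divmod(total, 10)   # what ADC + AAA leave in CF and AL
        result[i + 1] = str(digit)
    result[0] = str(carry)                 # MOV [EDI], AH: the leading '0' or '1'
    return ''.join(result)


print(add_decimal_strings('999999', '000001'))   # 1000000
print(add_decimal_strings('123456', '654321'))   # 0777777
```

The variable carry plays the role of the flags saved on the stack: it survives from one iteration to the next, and its last value becomes the first digit of the result.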
A final sample to terminate this tutorial. The program FIBO.ASM doesn't contain any new assembly language elements, but it is interesting, because it solves a problem that is stated in lots of programming language manuals: the calculation of the Fibonacci series. This is a series of numbers defined by the function f(n) = f(n-2) + f(n-1), with f(0) = 0 and f(1) = 1. FIBO.ASM calculates and displays the first 20 numbers of the series, as shown on the screenshot below.
[Screenshot: output of FIBO.ASM, the first 20 numbers of the Fibonacci series]
I think that with the knowledge that you have acquired when working through this tutorial, it should not be too difficult to understand the code of FIBO.ASM, shown here, without further explanations.
        
        segment code
        ..start:
                ; Initialization
                mov     ax, data
                mov     ds, ax
                mov     ax, stack
                mov     ss, ax
                mov     sp, stacktop
                ; Calculate Fibonacci series
                lea     esi, [series + 3]
                mov     ch, count - 2
        loop:
                mov     cl, 4
                clc
                pushf
        fdigit:
                mov     al, [esi]
                mov     ah, 0
                mov     bl, [esi + 4]
                popf
                adc     al, bl
                aaa
                pushf
                or      ax, 3030h
                mov     [esi + 8], al
                dec     esi
                dec     cl
                jnz     fdigit
                popf
                add     esi, 8
                dec     ch
                jnz     loop
                ; Display the series
                lea     esi, [series]
                mov     ch, count
        display:
                mov     cl, 4
                lea     edi, [sfib]
        ddigit:
                mov     bl, [esi]
                mov     [edi], bl
                inc     esi
                inc     edi
                dec     cl
                jnz     ddigit
                mov     dx, sfib
                mov     ah, 9
                int     0x21
                dec     ch
                jnz     display
                ; Terminate the program
        exit:
                mov     ax, 0x4c00
                int     21h
        segment data
        count   equ     20
        series  db      '0000'
                db      '0001'
                times count - 2 resb 4
        sfib    resb    4
                db      13, 10, '$'
        segment stack   stack
                resb    64
        stacktop:
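If you want to check the output of FIBO.ASM, the following Python sketch produces the same series in the same way: each number is a string of 4 ASCII digits, and each new number is the digit-by-digit sum of the two preceding ones. The names COUNT, WIDTH and add_digits are chosen for this illustration only; as in the assembly program, a carry out of the most significant digit would simply be dropped (this doesn't happen for the first 20 numbers).

```python
COUNT = 20   # same role as the constant "count" in FIBO.ASM
WIDTH = 4    # digits per number


def add_digits(a, b):
    """Digit-by-digit decimal addition of two WIDTH-digit strings."""
    digits = []
    carry = 0
    for x, y in zip(reversed(a), reversed(b)):   # last digit first
        carry, d = divmod(int(x) + int(y) + carry, 10)
        digits.append(str(d))
    return ''.join(reversed(digits))             # a final carry is dropped


series = ['0'.zfill(WIDTH), '1'.zfill(WIDTH)]    # '0000' and '0001'
while len(series) < COUNT:
    series.append(add_digits(series[-2], series[-1]))

for number in series:
    print(number)
```

The last line printed is 4181, which still fits into 4 digits; to display more numbers of the series, the width of the numbers would have to be increased in both the model and the assembly program.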
        
      
I hope that by following this tutorial, you got everything you need to create 16-bit real mode and 16-bit protected mode assembly programs on DOS using NASM, and that the sample programs, shown with my explanations and comments, have given you a base for developing your own assembly programs. If you are serious about 16-bit assembly programming, the manual Introduction to Assembly Programming - For Pentium and RISC Processors, Sivarama P. Dandamudi, © 2005, 1998 Springer Science+Business Media, Inc. might be really helpful. It contains a comprehensive introduction to basic computer organization, the Pentium processor, and the x86 assembly language, including lots of code and sample programs that you can use as a base for your own assembly projects. The manual is available as PDF document on the Internet...
If you find this text helpful, please support me and this website by signing my guestbook.