Computing: DOS, OS/2 & Windows Programming

Introduction to assembly macros (FASM 64-bit).

Some important notes that you should read before starting the tutorial:

The FASM preprocessor.

FASM, like many other assemblers, includes a so-called preprocessor, i.e. a program (or part of a program such as an assembler or compiler) that modifies the source code before it is assembled (or compiled). The preprocessor scans the source and replaces some things with others. For example, you can use a name to designate some lines of code that you use several times, and the preprocessor replaces this name with the corresponding code. Another example is the creation of your own, new instructions, which the FASM preprocessor replaces with code made of regular x64 assembly instructions. This means that you, the programmer, have to tell the preprocessor what it should preprocess and how. This is done using preprocessor directives. An important point to remember is that the preprocessor has no knowledge of assembly; all it understands are these directives. Any part of the code that is not meant for the preprocessor is passed on to the assembler unchanged.

The first thing the FASM preprocessor does is remove all comments from the code. So anything after a semicolon (unless the semicolon is part of a quoted string) is never passed to the assembler.

FASM lets you break an instruction into several lines by ending a line with the line-break character "\" (backslash). The preprocessor removes this character and concatenates the two lines of code into one.
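A minimal sketch of these two steps (the labels "table" and "text" are only assumptions for the illustration):

```asm
; This whole line is a comment and never reaches the assembler.
table   db      1, 2, 3, 4, \
                5, 6, 7, 8          ; joined with the previous line
text    db      'a;b', 00h          ; the ";" inside the quotes is data
```

After preprocessing, the assembler sees a single db line with eight values for "table", and the string 'a;b' is left untouched.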

The include directive is used to include assembly code from a file. The preprocessor reads this file and replaces the directive with the code contained in that file.

The preprocessor translates string literals to binary.

The pseudo-instruction equ is in fact a simple preprocessor directive. If your code contains something like
  array_length equ 20
the preprocessor will replace each occurrence of "array_length" by "20".
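A small sketch of what this textual replacement means in practice (the names "array" and "array_bytes" are assumptions added for the example). The = directive is shown for contrast: it defines a numeric constant that is evaluated by the assembler, not by the preprocessor.

```asm
array_length equ 20                     ; textual: "array_length" becomes "20"
array        dq  array_length dup (0)   ; the assembler sees: dq 20 dup (0)
array_bytes  =   array_length*8         ; the assembler computes 20*8 = 160
```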

And finally, the preprocessor replaces all macros by the corresponding macro definition.

Macros without arguments.

Macros, or macroinstructions, are custom instructions, defined by the programmer, and replaced by regular assembly instructions by the preprocessor. Macros are "declared" to the preprocessor by a macro definition, that starts with the preprocessor directive macro. In the simplest case, a macro definition is of the following form:
  macro <macro-name> {
    <macro-body>
  }
and in our assembly program, we'll use <macro-name> to call the macro (what happens in reality is that the preprocessor replaces <macro-name> by the assembly instructions that make up the macro definition, and this code will be executed).

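As an example, let's write two macros that push and pop, respectively, the 4 data registers RAX, RBX, RCX, and RDX. FASM accepts several operands, separated by spaces, after push and pop and processes them from left to right; as the stack is last-in, first-out, the registers have to be popped in the reverse order. Here is the code (the macro names are up to you, of course):
  macro pushregs {
    push rax rbx rcx rdx
  }
  macro popregs {
    pop rdx rcx rbx rax
  }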
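To see what the assembler actually receives, here is a sketch with two hypothetical macros (the names save_regs and restore_regs are chosen for this illustration only) and the instructions that their calls expand to:

```asm
; Hypothetical macro names, used for this sketch only.
macro save_regs {
        push    rax rbx             ; same as: push rax / push rbx
}
macro restore_regs {
        pop     rbx rax             ; same as: pop rbx / pop rax
}

        save_regs                   ; expands to the two push instructions
        ; ... code that changes RAX and RBX ...
        restore_regs                ; RAX and RBX hold their old values again
```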

I said before that the preprocessor doesn't know anything about assembly instructions. This means that if we use the name of a regular x64 assembly instruction as the name of a macro, the preprocessor will replace this instruction with the macro definition. This is a kind of "overloading" of the assembly instruction by the macro. As an example, consider the instruction pusha, which pushes the registers AX, CX, DX, BX, SP, BP, SI, and DI onto the stack. Using this instruction in 64-bit assembly results in an "illegal instruction" error (pusha and popa do not exist in 64-bit mode). If, now, we define the following macro
  macro pusha {
    push rax rcx rdx rbx rsp rbp rsi rdi
  }
and use pusha in our assembly source, the preprocessor will replace it with the macro definition, and the code assembles without errors, pushing the 8 (64-bit) registers onto the stack (unlike the real pusha, the value pushed for RSP is the one it has after the first four pushes). So, using assembly instructions as macro names is possible, but I guess that it's not the best practice to do it!

Consider the following program sample, that displays a "Hello World" message after having cleared the screen by printing the "clear-screen" ANSI escape sequence (cf. my tutorial Text positioning and coloring using ANSI escape sequences on Windows).
  format ELF64
  section '.data' writeable
  ecls      db      1Bh, '[2J', 00h
  hello     db      'Hello World!', 0Dh, 0Ah, 00h
  section '.text' executable
  public main
  extrn printf
  main:
            mov     rbp, rsp
            sub     rsp, 32
            and     rsp, -16
            mov     rcx, ecls
            call    printf
            mov     rcx, hello
            call    printf
            mov     rsp, rbp
            xor     rax, rax
            ret

How can we implement the clear screen function as a macro?

Our macro will have two particularities: 1. it has to call the C function printf; 2. it has to include the declaration of the ANSI escape sequence (a string literal).

The call of printf need not bother us. The preprocessor is only concerned with the directives that are intended for it, and the assembler will have no problem dealing with the call when it assembles the code by which the preprocessor has replaced the macro call.

In the program above, the escape sequence has been defined in the .data section, where data normally belongs. However, FASM doesn't have any problem with data being declared in the .text (code) section, and I guess that this is not unusual in macro definition code. The only thing to pay attention to is to make sure that the data can never be interpreted as instructions and is never "executed".

Here is a first version of the cls macro definition (we will see below that this code can result in an error message during assembly):
  format ELF64
  macro cls {
            jmp     continue
    ecls    db      1Bh, '[2J', 00h
    continue:
            mov     rcx, ecls
            call    printf
  }

The code is easy to understand. We declare the ANSI escape sequence within the macro just as we did before in the .data section. This data part (within the code) is skipped by using a jmp instruction.

Here is another way to implement the declaration of some data within an assembly code part (the error during assembly being possible just as before):
  format ELF64
  macro cls {
            call    continue
            db      1Bh, '[2J', 00h
    continue:
            pop     rcx
            call    printf
  }

To avoid the "execution" of the data, this time we use a call instruction (instead of the jmp instruction used before). Before actually transferring control to the subroutine, this instruction pushes the return address onto the stack. This return address is the address of the first byte following the call instruction. And that is ... the address of our ANSI escape sequence data! Thus, by popping the return address into RCX, we load RCX with the address of the string to display, which is all that is needed to call printf.

I said that these two implementations of the macro can result in an error during assembly. Do you see why? What do you think happens if the macro is called twice? The preprocessor replaces each macro call with the macro body. So, if cls is called twice, the macro body is copied twice, including the labels that it declares ("ecls" and "continue" in the first version, "continue" in the second one). The assembler would find itself with a program in which the same label is declared more than once; the result would be a "symbol already defined" error!

To avoid this problem, FASM includes the directive local. If a label is defined as being local, its name will be dynamically changed by the preprocessor. In fact, each time a local label is encountered, a suffix of the form ?x, where x is a hexadecimal number, incremented during each replacement, is added to the label name. In the code, presented to the assembler, the label "continue" would thus be named "continue?1", when cls is called for the first time, "continue?2", when it is called for the second time, etc. No more duplicate label declarations, and the program can be assembled without errors, independently of how many times the macro is called.
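A minimal sketch of the directive at work, using a hypothetical macro (zero_fill, with the assumed data labels buffer1 and buffer2) that contains a loop label and is called twice:

```asm
; Hypothetical macro: store "count" zero bytes at the address in RDI.
; "count" is assumed to be greater than zero.
macro zero_fill count {
local next_byte
        mov     rcx, count
  next_byte:
        mov     byte [rdi], 0
        inc     rdi
        dec     rcx
        jnz     next_byte
}

        mov     rdi, buffer1        ; buffer1, buffer2: assumed data labels
        zero_fill 16                ; gets its own copy of "next_byte"
        mov     rdi, buffer2
        zero_fill 32                ; a second, differently named copy
```

Without the local line, the second call would declare "next_byte" again and the assembly would fail.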

Here is the code of the sample program macro1.asm that displays a "Hello World" message after having cleared the screen using the corrected version of the first implementation of the cls macro. Starting the label names with 2 dots (..) is not mandatory, but it is good practice for labels local to a macro: unlike an ordinary global label, a label starting with two dots does not become the new prefix for the dot-prefixed local labels (such as .loop) of the surrounding code.
  format ELF64
  macro cls {
  local ..ecls, ..code
            jmp      ..code
    ..ecls  db       1Bh, '[2J', 00h
    ..code:
            mov      rcx, ..ecls
            call     printf
  }
  section '.data' writeable
  hello     db       'Hello, World!', 0Dh, 0Ah, 00h
  section '.text' executable
  public main
  extrn printf
  main:
            mov      rbp, rsp
            sub      rsp, 32
            and      rsp, -16
            cls
            mov      rcx, hello
            call     printf
            mov      rsp, rbp
            xor      rax, rax
            ret

You can download the source code of all program samples of the tutorial from my website. Note that the program macro2.asm, contained in the download archive, is identical to macro1.asm, but using call instead of jmp in the macro definition.

Macros with (simple) arguments.

Like functions and procedures, macros may have one or more arguments. If there are several arguments, they are separated by commas (,). General form of the definition of a macro with arguments:
  macro <macro-name> <argument1>, <argument2>, ... {
    <macro-body>
  }

Example with one argument: Macro to print an unsigned integer:
  macro print_int number {
  local ..fmt, ..code
            jmp      ..code
    ..fmt   db       '%llu', 0Dh, 0Ah, 00h  ; %llu: the value is 64 bits wide
    ..code:
            mov      rcx, ..fmt
            mov      rdx, number
            call     printf
  }

This macro can be called with either a 64-bit register, or a 64-bit value located in memory; examples:
  print_int rax
  print_int [integer]
  (with, for example: integer dq 20)

Example with 2 arguments: Swapping two 64-bit values (located either in a register or memory):
  macro swap val1, val2 {
            push     val1
            push     val2
            pop      val1
            pop      val2
  }

Calling examples:
  swap rax, rbx
  swap rax, [integer]
  swap [integer1], [integer2]

Note: If a macro is called with fewer arguments than specified in its definition, the remaining arguments will have empty values. By placing an asterisk (*) after an argument name, you can mark this argument as required; the preprocessor will not allow it to have an empty value. Optional arguments may be assigned a default value, using the equal (=) sign.
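A short sketch of a required argument (load_reg is a hypothetical macro, not part of the tutorial samples):

```asm
; Hypothetical macro: both arguments are marked as required with "*".
macro load_reg reg*, value* {
        mov     reg, value
}

        load_reg rax, 5             ; expands to: mov rax, 5
        ; load_reg rax              ; would be rejected: "value" may not be empty
```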

Is it possible to pass a string literal to a macro? No problem, just do it the same way as with a register or an address. The following macro displays a string, passed as literal; carriage-return-linefeed and null terminator are added by the macro.
  macro print_literal string {
  local ..str, ..code
            jmp      ..code
    ..str   db       string
            db       0Dh, 0Ah, 00h
    ..code:
            mov      rcx, ..str
            call     printf
  }

Calling example:
  print_literal "Hello, World!"

We can improve our macro by letting the caller decide whether the end-of-line characters are added. One way would be to add a Boolean argument and, depending on it, print a carriage-return-linefeed or not. Another way is to make the end-of-line characters 0Dh and 0Ah part of the string (the macro argument) when they are wanted. Here, however, we have a problem. The argument is then of the form <literal>, 0Dh, 0Ah, and the two commas would be interpreted as argument separators. The FASM preprocessor has a special syntax for arguments that include commas: you have to enclose the argument in angle brackets (< and >).

Here is the new version of the macro:
  macro print_literal string {
  local ..str, ..code
            jmp      ..code
    ..str   db       string
            db       00h
    ..code:
            mov      rcx, ..str
            call     printf
  }

And we can call it with, or without the carriage-return-linefeed.
  print_literal "Hello, "
  print_literal <"World!", 0Dh, 0Ah>

The program sample fact.asm calculates the factorial of a number entered by the user. The display is done using the macros print_literal and print_int.

  format ELF64
  macro print_int integer {
  local ..fmt, ..code
            jmp      ..code
    ..fmt   db       '%llu', 0Dh, 0Ah, 00h  ; %llu: factorials above 12! need 64 bits
    ..code:
            mov      rcx, ..fmt
            mov      rdx, integer
            call     printf
  }
  macro print_literal string {
  local ..str, ..code
            jmp      ..code
    ..str   db       string
            db       00h
    ..code:
            mov      rcx, ..str
            call     printf
  }
  section '.data' writeable
  frmat     db       '%u', 00h
  number    dq       ?
  save      dq       ?
  section '.text' executable
  public main
  extrn printf
  extrn scanf
  main:
            mov      rbp, rsp
            sub      rsp, 32
            and      rsp, -16
            print_literal "Please, enter an integer number from 1 to 20? "
            mov      rcx, frmat
            mov      rdx, number
            call     scanf
            mov      rax, [number]
            cmp      rax, 1
            jl       invalid
            cmp      rax, 20
            jg       invalid
            mov      rbx, rax
  next:
            dec      rbx
            cmp      rbx, 0
            je       done
            mul      rbx
            jmp      next
  done:
            mov      [save], rax
            print_literal "The factorial of this number is "
            mov      rax, [save]
            print_int rax
            jmp      exit
  invalid:
            print_literal <"Error: Number out of range!", 0Dh, 0Ah>
  exit:
            mov      rsp, rbp
            xor      rax, rax
            ret

I guess that the simplest way to return a string from a macro is to pass, as an argument, a pointer to the memory location where it should be stored. The program sample convert.asm reads a string from the keyboard (using the macro str_read, which also transforms underscores to spaces, a work-around to allow entering strings containing spaces), and transforms it to uppercase (using the macro uppercase) and to lowercase (using the macro lowercase). The original string and the two converted strings are printed out using the macro str_writeln. The program also shows that one macro can call another one.

  format ELF64
  ; Read a string of maximum length (defined by input format)
  macro str_read text, string, fmt {
  local ..txt, ..fmt, ..code, ..next, ..continue, ..done
            jmp      ..code
    ..txt   db       text, 00h
    ..fmt   db       fmt, 00h
    ..code:
            mov      rcx, ..txt
            call     printf
            mov      rcx, ..fmt
            mov      rdx, string
            call     scanf
            ; Convert '_' to space
            mov      rdi, string
    ..next:
            mov      al, [rdi]
            cmp      al, 00h
            je       ..done
            cmp      al, '_'
            jne      ..continue
            mov      byte [rdi], ' '
    ..continue:
            inc      rdi
            jmp      ..next
    ..done:
  }
  ; Print a string
  macro str_write string {
            mov      rcx, string
            call     printf
  }
  ; Print a string with CR-LF
  macro str_writeln string {
            str_write string
            newline
  }
  ; Print a new line
  macro newline {
  local ..str, ..code
            jmp      ..code
    ..str   db       0Dh, 0Ah, 00h
    ..code:
            mov      rcx, ..str
            call     printf
  }
  ; Convert a string to uppercase
  macro uppercase string {
  local ..next, ..continue, ..done
            mov      rsi, string
            mov      rdi, rsi
    ..next:
            lodsb
            cmp      al, 00h
            je       ..done
            cmp      al, 'a'
            jl       ..continue
            cmp      al, 'z'
            jg       ..continue
            sub      al, 32
    ..continue:
            stosb
            jmp      ..next
    ..done:
  }
  ; Convert a string to lowercase
  macro lowercase string {
  local ..next, ..continue, ..done
            mov      rsi, string
            mov      rdi, rsi
    ..next:
            lodsb
            cmp      al, 00h
            je       ..done
            cmp      al, 'A'
            jl       ..continue
            cmp      al, 'Z'
            jg       ..continue
            add      al, 32
    ..continue:
            stosb
            jmp      ..next
    ..done:
  }
  ; BSS section
  section '.bss' writeable
  string    db       101 dup (?)
  ; Code section (main program)
  section ".text" executable
  public main
  extrn printf
  extrn scanf
  main:
            mov      rbp, rsp;
            sub      rsp, 32
            and      rsp, -16
            str_read "Please enter a string? ", string, "%100s"
            str_writeln string
            uppercase string
            str_writeln string
            lowercase string
            str_writeln string
            mov      rsp, rbp
            xor      rax, rax
            ret

Macros with group arguments.

Group arguments allow you to specify several values for an argument when calling a macro (which makes the macro one with a variable number of arguments). In the macro definition, they are enclosed in square brackets. If the macro definition also includes simple arguments, these must precede the group argument(s). When calling a macro with one group argument, all values specified after the simple arguments are passed to the group argument. General format of the definition of a macro with one group argument:
  macro <macro-name> <argument1>, <argument2>, ..., [<group-argument>] {
    <macro-body>
  }

Macros with a group argument are special: the macro body is expanded (repeated) once for each value passed to the group argument.

Example:
  macro name_list count, [names] {
    db count
    db names, 00h
  }

The macro call name_list 3, "Aly", "Juno", "Yoko" will be transformed by the preprocessor to the following code (the whole macro body, including the count byte, is repeated once per name):
    db 3
    db "Aly", 00h
    db 3
    db "Juno", 00h
    db 3
    db "Yoko", 00h

Macros with a group argument may include the special preprocessor directives forward, reverse, and common. These directives allow to subdivide the macro body into blocks, executed one after the other. In a forward block, all block instructions are executed for each value of the group-argument. It's the same for reverse blocks, except that the values are taken in reverse order (starting with the last one). Instructions of a common block, on the other hand, are executed only once.
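A sketch that combines the three kinds of blocks, using a hypothetical macro (call_saving) that preserves a list of registers around a call. The label helper is an assumption, and the sketch ignores stack alignment and shadow space:

```asm
; Hypothetical macro: preserve the listed registers around a call.
macro call_saving target, [reg] {
forward
        push    reg                 ; one push per register, left to right
common
        call    target              ; emitted once
reverse
        pop     reg                 ; one pop per register, right to left
}

        call_saving helper, rbx, rsi, rdi
        ; expands to:
        ;   push rbx / push rsi / push rdi
        ;   call helper
        ;   pop rdi / pop rsi / pop rbx
```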

Let's use blocks to improve our "name_list" macro:
  macro name_list count, [names] {
  common
    db count
  forward
    db names, 00h
  }

The macro call name_list 3, "Aly", "Juno", "Yoko" will now be transformed by the preprocessor as follows (the count byte appears only once):
    db 3
    db "Aly", 00h
    db "Juno", 00h
    db "Yoko", 00h

The program sample macro3.asm shows how to use a macro with a group argument to calculate and print the sum of 2 or more integers.

  format ELF64
  macro print_sum [numbers] {
  common
  local ..fmt, ..code
            jmp      ..code
    ..fmt   db       'Sum is: %lld', 0Dh, 0Ah, 00h
    ..code:
            mov      rcx, ..fmt
            xor      rdx, rdx
    forward
            add      rdx, numbers
    common
            call     printf
  }
  section '.text' executable
  public main
  extrn printf
  main:
            mov      rbp, rsp
            sub      rsp, 32
            and      rsp, -16
            print_sum 10, 10
            print_sum 1, -2, 3, -4
            print_sum 1, 2, 3, 4, 5, 6, 7, 8, 9
            mov      rsp, rbp
            xor      rax, rax
            ret

If you pass several values to a group argument used in a common block, then all values are passed to this argument at once. Example:
  macro define_string [strings] {
  common
    db strings, 00h
  }
When calling the macro as define_string "Hello, World!", 0Dh, 0Ah, the preprocessor will generate the code
    db "Hello, World!", 0Dh, 0Ah, 00h

A macro definition may include two (or more) group-arguments. The names of the different group-arguments have to be placed between square brackets and separated by a comma. When the macro is called, the values are passed alternately to the different group-arguments. In the case of 2 group-arguments: value 1 to first argument, value 2 to second argument, value 3 to first argument, etc.

The program sample macro4.asm shows how to use a macro with two group arguments to display a list of persons with their profession.

  format ELF64
  macro persons_list [names, professions] {
    common
    local ..fmt, ..code
            jmp      ..code
    ..fmt   db       '%s is a %s', 0Dh, 0Ah, 00h
    ..code:
    forward
    local ..name, ..prof, ..continue
            jmp      ..continue
    ..name  db       names, 00h
    ..prof  db       professions, 00h
    ..continue:
            mov      rcx, ..fmt
            mov      rdx, ..name
            mov      r8, ..prof
            call     printf
  }
  section '.text' executable
  public main
  extrn printf
  main:
            mov      rbp, rsp
            sub      rsp, 32
            and      rsp, -16
            persons_list "Aly Baba", "programmer"
            persons_list "Aly", "Free Pascal programmer", "Juno", "movie actress", "Yoko", "chemical engineer"
            mov      rsp, rbp
            xor      rax, rax
            ret

Preprocessor operators.

The concatenation operator (#) is used to concatenate two symbols into one. Example: Generation of a conditional jump depending on macro argument:
  macro jump_if operand1, condition, operand2, label {
    cmp operand1, operand2
    j#condition label
  }
When calling the macro as jump_if ecx, le, eax, next, the preprocessor will generate the code
    cmp ecx, eax
    jle next

You may also use this operator to concatenate quoted strings into one. We'll see an example further down in the text.

The "string" operator (`) is used to transform a symbol into a quoted string. This is, in particular, useful when passing a quoted string argument from one macro to another.
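A small sketch of both operators in one hypothetical macro (named_value is not part of the tutorial samples): # builds a new label from the argument, and ` turns the argument into a quoted string.

```asm
; Hypothetical macro: define a qword variable plus a string holding its name.
macro named_value name, value {
  name        dq      value
  name#_str   db      `name, 00h
}

named_value counter, 42
; expands to:
;   counter      dq  42
;   counter_str  db  'counter', 00h
```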

As code example of the preprocessor operators, let's review the sample program convert.asm. This program uses the macro str_read to read a string from the keyboard, one of its arguments being the input format for the C function scanf(), that actually does the reading. Wouldn't it be better (or, at least nicer) if, instead, we could simply use the maximum string length? That's what is implemented in sample program convert2.asm (that I started with as a copy of convert.asm).

As I wanted to keep the name of the string reading macro in the main program, I renamed the original "str_read" macro to "readstring". It's this macro that will call scanf(), just as before, thus, no code changes here. However, the main program will not call this macro directly, but will call the new "str_read" macro, that transforms the maximum string length argument (passed by the main program) into a scanf() input string, as it is expected by "readstring", that it calls to actually do the reading.

Here is the code of the new "str_read" macro:
  macro str_read text, string, maxlen {
    readstring `text, string, "%"#maxlen#"s"
  }

The macro "readstring" (the original "str_read" macro from program sample convert.asm) is defined with 3 arguments: a quoted string (text to display), a symbol (label for the input buffer), and another quoted string (the input format). I said above that the concatenation operator may also be used with quoted strings. Here is the example: we use it to create the input format from 3 quoted strings (the maximum string length being passed as an argument to "str_read"). The macro also shows the usage of the "string" operator. As the text to display has to be passed as a quoted string to the "readstring" macro, we apply this operator to the first argument (text).

With these changes done, we can read our string, calling the macro as str_read "Please enter a string? ", string, "100". I will not show the code of convert2.asm here. Except for the changes described, it's the same as for convert.asm. The download archive with the tutorial samples includes the entire sample convert2.asm, of course.

Conditional blocks.

There is no preprocessor conditional syntax in FASM. But FASM includes the assembly directive if, which can be used in conjunction with the preprocessor to achieve the same results as with preprocessor conditionals. Apart from the fact that this approach uses more time and memory, it is important to be aware that the if statement is evaluated by the assembler, i.e. after the code has been parsed and changed by the preprocessor.

The FASM if directive works the same way as it does in higher programming languages. In its simplest form the condition is formed by a single block as follows
  if <logical-expression>
    ...
  end if

To also specify code that is executed if the logical expression is false, two blocks are used as follows
  if <logical-expression>
    ...
  else
    ...
  end if

Further conditions (with supplementary blocks) may be specified using else if <logical-expression>.

Logical expressions are made of one or several comparison expressions (the result of which being a logical value, "true" or "false"). Logical expressions may be combined using the operators & (logical "and"), or | (logical "or"). There is also a logical "not" operator; it is written as ~.

Comparison expressions with numerical values allow the usual operators = (equal), < (less), > (greater), <= (less or equal), >= (greater or equal), and <> (not equal). As in most higher programming languages, a numerical expression (as for example just a variable name) may be used instead of the comparison expression. The evaluation of this expression yields "false" if the result is zero, "true" in all other cases.

Example:
  if count & ~ count mod 4
    ...
  end if
This logical expression is true if "count" (defined before, of course) is not zero and if it is divisible by 4 (if count is divisible by 4, count mod 4 = 0, that corresponds to logical "false", thus we use the ~ (not) operator to get a logical value of "true", if count is divisible by 4).
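A minimal sketch of a numerical expression used as condition. The constant debug and the label dbg_msg are assumptions made for this illustration; printf is called as in the samples above.

```asm
debug = 1                           ; hypothetical switch; set to 0 to drop the block

if debug
            mov     rcx, dbg_msg    ; dbg_msg: assumed zero-terminated string label
            call    printf
end if
```

With debug set to 0, the assembler skips the two instructions and they do not appear in the object code at all.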

There are also operators that allow the comparison of values being any chains of symbols. The eq operator compares whether two values are exactly the same (from the point of view of the assembler). The in operator checks whether a given value is a member of the list of values following this operator. The list should be enclosed between < and > characters; its members should be separated with commas.

The eqtype operator checks whether the two compared values have the same structure, and whether the structural elements are of the same type. The distinguished types include numerical expressions, individual quoted strings, floating point numbers, address expressions (expressions enclosed in square brackets or preceded by ptr operator), instruction mnemonics, registers, size operators, jump type and code type operators. And each of the special characters that act as a separator (like comma or colon), is the separate type itself. For example, two values, each one consisting of a register name followed by a comma and a numerical expression, will be regarded as of the same type, no matter what kind of register and how complicated the numerical expression is, except for quoted strings and floating point values, which are special kinds of numerical expressions and are treated as different types. Thus the eax,16 eqtype fs,3+7 condition is true, but eax,16 eqtype eax,1.6 is false.

Finally, the used, defined, and definite operators allow to check if a symbol is used or has been defined; cf. the FASM documentation for details.
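As a sketch of the used operator (the label newline_str is an assumption for the example), a piece of data can be assembled only when some part of the program actually refers to it:

```asm
if used newline_str                 ; true if the label is referenced anywhere
newline_str db 0Dh, 0Ah, 00h
end if
```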

The "special" operators described above are normally used in if directives within a macro, as we will see in the following examples.

Consider a macro that declares a symbol assigned to either a word, a doubleword, or a string value (depending on the situation). We can code it using if and eq as follows:
  macro declare_item item, value {
    if item eq WORD
      dw value
    else if item eq DWORD
      dd value
    else if item eq STRING
      db value, 0
    end if
  }
The macro call declare_item STRING, "Hello, World!", for example, would result in the following preprocessor output:
    if STRING eq WORD
      dw "Hello, World!"
    else if STRING eq DWORD
      dd "Hello, World!"
    else if STRING eq STRING
      db "Hello, World!", 0
    end if
And the assembler would create the object code corresponding to the instruction
    db "Hello, World!", 0

FASM allows to redefine standard assembly instructions. Consider, for example, to extend the instruction mov, giving the possibility to use 3 arguments, the second being copied to the first, and the third to the second. Here is how this could be done:
  macro mov operand1*, operand2*, operand3 {
    if operand3 eq
      mov operand1, operand2
    else
      mov operand1, operand2
      mov operand2, operand3
    end if
  }
Note the logical expression with no operand following eq; it is true if "operand3" is empty (i.e. is not specified when calling the macro). This allows using mov with 2 arguments (as you normally do), but also in its extended form with 3 arguments.

The operator in is a shorter form for several eq operations combined by a logical "or". Consider another extension of mov, that also allows both operands to be segment registers. Here is how this could be done:
  macro mov operand1, operand2 {
    if operand1 in <cs, ds, es, fs, gs, ss> & operand2 in <cs, ds, es, fs, gs, ss>
      push operand2
      pop operand1
    else
      mov operand1, operand2
    end if
  }
In the "normal" case, the new mov will work just as before; if both arguments are segment registers, however, push and pop are used.

And here the example of a further extension of the mov instruction; it allows to copy a value from one memory location to another memory location:
  macro mov operand1, operand2 {
    if operand1 operand2 eqtype [0] [0]
      push operand2
      pop operand1
    else
      mov operand1, operand2
    end if
  }
If the macro is, for example called as mov [var], 5, the instruction mov will be used; if it is called as mov [var1], [var2], push and pop will be used.

Note: A more readable way to write the logical expression would be to use the & operator: if operand1 eqtype [0] & operand2 eqtype [0]

The sample program macros5.asm asks the user for their name and uses this name to display a personal greeting message. The program uses a simplified version of the str_read macro seen before and an extended version of str_write that can display not only a string identified by the label of the address where it is located, but also a quoted literal. Here is the code:
  format ELF64
  ; Read a string of maximum length (defined by input format)
  macro str_read text, string, fmt {
  local ..txt, ..fmt, ..code
            jmp      ..code
    ..txt   db       text, 00h
    ..fmt   db       fmt, 00h
    ..code:
            mov      rcx, ..txt
            call     printf
            mov      rcx, ..fmt
            mov      rdx, string
            call     scanf
  }
  ; Print a string
  macro str_write string {
  local ..continue
        if string eqtype 'string'
            call     ..continue
            db       string, 00h
          ..continue:
            pop      rcx
            call     printf
        else if string eqtype ..continue
            mov      rcx, string
            call     printf
        end if
  }
  ; Print a string with new line
  macro str_writeln string {
            str_write string
            newline
  }
  ; Print a new line
  macro newline {
  local ..str, ..code
            jmp      ..code
    ..str   db       0Dh, 0Ah, 00h
    ..code:
            mov      rcx, ..str
            call     printf
  }
  ; BSS section
  section '.bss' writeable
  uname     db       26 dup (?)
  ; Code section (main program)
  section ".text" executable
  public main
  extrn printf
  extrn scanf
  main:
            mov      rbp, rsp
            sub      rsp, 32
            and      rsp, -16
            str_read "Please enter your name? ", uname, "%25s"
            str_write "Hello, "
            str_write uname
            str_writeln "! How are you?"
            mov      rsp, rbp
            xor      rax, rax
            ret

Repeating blocks.

First, as with conditionals, we can use the repeating-block directives of the FASM assembler. Second, there are also several repeating macroinstructions.

The assembler directive times repeats one instruction a specified number of times. It should be followed by a numerical expression specifying the number of repeats and the instruction to be repeated. When the special symbol % is used inside the instruction, it is equal to the number of the current repeat. Example:
  times 5 db %
will declare five bytes with values 1, 2, 3, 4, 5.

The assembler directive repeat repeats a block of instructions. It should be followed by a numerical expression specifying the number of repeats. The instructions to be repeated are expected on the following lines, ending with the end repeat directive. Example:
  repeat 8
    mov byte [rdi], %
    inc rdi
  end repeat
This stores the numbers 1 to 8 into successive memory locations, starting with the address contained in RDI.
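
Since % is an ordinary numeric value during assembly, it can also be used inside expressions. A minimal sketch (the label squares is made up for this illustration) that builds a small table of square numbers at assembly time:
  squares:
  repeat 5
    db % * %              ; the repeat counter multiplied by itself
  end repeat
This should declare five bytes with the values 1, 4, 9, 16, 25, without any code being executed at run time.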

The assembler directive while repeats a block of instructions as long as the condition specified by the logical expression following it is true. The block of instructions to be repeated should end with the end while directive. Before each repetition, the logical expression is evaluated and when its value is false, the assembly is continued starting from the first line after the end while. Example:
  i = 1
  while i <= 8
    mov byte [rdi+i-1], i
    i = i + 1
  end while
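This stores the same values at the same memory locations as the previous example. The difference is that RDI is not modified: each offset is calculated at assembly time and encoded directly into the corresponding mov instruction.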
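
The while directive is handy when the number of repeats is not known in advance, but follows from a condition. A small sketch, assuming nothing beyond the directives described above (the names powers and p are made up for this illustration), that declares the powers of two that fit into a byte:
  powers:
  p = 1
  while p <= 128
    db p                  ; 1, 2, 4, ..., 128
    p = p shl 1           ; double the value at assembly time
  end while
The variable p exists only during assembly; the result in the object file is simply eight data bytes.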

The sample program macros6.asm fills an array with successive values of bytes using the macro fill_array (usage of the directive repeat), and then prints the array out using the macro print_array (usage of the directive while).
  format ELF64
  ; Fill array with byte numbers from M to N
  macro fill_array array, max, first, last {
            mov      rdi, array
            repeat last - first + 1
              if % > max
                break
              end if
              mov  byte [rdi+%-1], first + % - 1
            end repeat
  }
  ; Print array of bytes
  macro print_array array, len {
  local ..fmt, ..eol, ..code
            jmp      ..code
    ..fmt   db       '%5u ', 00h
    ..eol   db       0Dh, 0Ah, 00h
    ..code:
            i = 1
            while i <= len
              mov   rcx, ..fmt
              mov   rdi, array
              xor   rdx, rdx
              mov   dl, [rdi+i-1]
              call  printf
              i = i + 1
            end while
            mov      rcx, ..eol
            call     printf
  }
  ; BSS section
  section '.bss' writeable
  array     db       10 dup (?)
  ; Code section (main program)
  section '.text' executable
  public main
  extrn printf
  extrn scanf
  main:
            mov      rbp, rsp
            sub      rsp, 32
            and      rsp, -16
            fill_array array, 10, 1, 5
            print_array array, 5
            fill_array array, 10, 20, 30
            print_array array, 10
            mov      rsp, rbp
            xor      rax, rax
            ret

The program prints two lines: the first contains the numbers from 1 to 5, the second the numbers from 20 to 29. Note that the program does not write past the end of the array when we try to fill it with the numbers from 20 to 30, which is actually 11 numbers (one more than fits into the area reserved for the array): the if and break directives stop the repetition once max items have been stored.

FASM includes the following repeating macroinstructions: rept, irp, irps, and irpv.

The rept directive makes a given number of duplicates of the block enclosed in the curly brackets { and }. The number of duplicates is given by the number following the directive. This number may be followed by the name of a counter symbol, optionally followed by the base for the counting, separated from the counter name by a colon (:). Examples:
Copy of given value to 10 successive memory locations:
  rept 10 {
  mov [rdi], rax
  add rdi, 8
  }
Repetitive symbol generation:
  rept 3 counter {
    byte#counter db counter
  }
which will generate the following:
  byte1 db 1
  byte2 db 2
  byte3 db 3
Reset to zero of the SSE registers XMM0 to XMM7:
  rept 8 n:0 {
    pxor xmm#n, xmm#n
  }

Note that multiple counters, separated by commas (,) and each with its own base, may be specified.
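
A sketch of what such a rept with two counters could look like, based on the description in the FASM manual (the names item, idx and val are made up for this illustration):
  rept 3 idx, val:10 {
    item#idx dq val       ; idx counts from 1, val counts from 10
  }
If I read the manual correctly, this generates:
  item1 dq 10
  item2 dq 11
  item3 dq 12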

The irp directive iterates the single argument through a given list of parameters (values). Example:
  irp oddnumber, 1, 3, 5, 7, 9 {
    db oddnumber
  }
will generate the following:
  db 1
  db 3
  db 5
  db 7
  db 9

The irps directive iterates the single argument through a given list of symbols. Example:
  irps register, rax rbx rcx rdx {
    xor register, register
  }
will generate the following:
  xor rax, rax
  xor rbx, rbx
  xor rcx, rcx
  xor rdx, rdx
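
The irpv directive is described in the FASM manual as iterating through all the values that were assigned to a symbolic variable with equ, starting with the oldest one. I did not test it for this tutorial, so take the following as a sketch based on that description only (the names list and item are made up for this illustration):
  list equ 1
  list equ 2
  list equ 3
  irpv item, list {
    db item
  }
According to the manual, this should generate:
  db 1
  db 2
  db 3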

Structures.

Structures are similar to macros (you may even say that they are a special variant of macro); they are used to define data structures (comparable with C structures, and Pascal records). General format of a structure definition:
  struc <structure-name> <arguments> {
    <structure-body>
  }

Let's consider the following structure definition:
  struc point x, y {
    .x dq x
    .y dq y
  }
This definition is, however, not enough. In fact, to use a structure, you'll have to create an instance of the structure, using a label as identifier. Examples:
  p1 point 1, 2
  p2 point ?, ?
The label (name of the instance) will be attached at the beginning of every item name within the struc macroinstruction that starts with a dot. For the examples above, the preprocessor will generate the following:
  p1.x dq 1
  p1.y dq 2
  p2.x dq ?
  p2.y dq ?
and we can code instructions like mov [p2.x], -2, and mov [p2.y], -1.
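
Because a structure is expanded like a macro, its body may also contain assembly-time calculations. A sketch of a variant of point (here called point_s) that additionally defines the size of each instance; the item .size and the instance name origin are assumptions made for this illustration:
  struc point_s x, y {
    .x dq x
    .y dq y
    .size = $ - .x        ; number of bytes between .x and the end of the instance
  }
  origin point_s 0, 0     ; defines origin.x, origin.y and origin.size
With two dq items, origin.size should evaluate to 16, so that an instruction like mov rcx, origin.size loads the size of the structure in bytes.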

If a name consisting of nothing but a single dot is found somewhere inside the definition of a structure, it is replaced by the name of the label for the given instance of the structure. In this case, the label is not defined automatically, which allows you to customize the definition completely. Example:
  struc db [data] {
    common
    . db data
    .size = $ - .
  }
The instruction msg db "Hello!", 0Dh, 0Ah, 00h will be transformed by the preprocessor to
  msg db "Hello!", 0Dh, 0Ah, 00h
  msg.size = $ - msg
This is actually a redefinition of the pseudo-instruction db that, besides declaring the data label, also calculates the size of the defined data (note that in this example the data size includes the null terminator).

The sample program structure1.asm defines a rectangle as a structure described by its top-left and bottom-right coordinates. Two instances of this rectangle are created, one being initialized when the instance is created, the other by assigning values to the structure's items. The program then calculates and displays the area of each rectangle (called surface in the program's messages).
  format ELF64
  struc rectangle x1, y1, x2, y2 {
    .x1     dw       x1
    .y1     dw       y1
    .x2     dw       x2
    .y2     dw       y2
  }
  macro print_int integer {
  local ..fmt, ..code
            jmp      ..code
    ..fmt   db       '%u', 0Dh, 0Ah, 00h
    ..code:
            mov      rcx, ..fmt
            xor      rdx, rdx
            mov      dx, integer
            call     printf
  }
  macro print_literal string {
  local ..str, ..code
            jmp      ..code
    ..str   db       string
            db       00h
    ..code:
            mov      rcx, ..str
            call     printf
  }
  section '.data' writeable
  rect1     rectangle 0, 0, 25, 4
  rect2     rectangle ?, ?, ?, ?
  section '.text' executable
  public main
  extrn printf
  main:
            mov      rbp, rsp
            sub      rsp, 32
            and      rsp, -16
            print_literal "The surface of rectangle 1 is "
            mov      ax, [rect1.x2]
            sub      ax, [rect1.x1]
            mov      bx, [rect1.y2]
            sub      bx, [rect1.y1]
            imul     bx
            print_int ax
            mov      [rect2.x1], 10
            mov      [rect2.y1], 20
            mov      [rect2.x2], 60
            mov      [rect2.y2], 25
            print_literal "The surface of rectangle 2 is "
            mov      ax, [rect2.x2]
            sub      ax, [rect2.x1]
            mov      bx, [rect2.y2]
            sub      bx, [rect2.y1]
            imul     bx
            print_int ax
            mov      rsp, rbp
            xor      rax, rax
            ret

The sample program structure2.asm does the same as structure1.asm, but uses a macro to calculate the rectangles' area.
  format ELF64
  struc rectangle x1, y1, x2, y2 {
    .x1     dw       x1
    .y1     dw       y1
    .x2     dw       x2
    .y2     dw       y2
  }
  macro print_int integer {
  local ..fmt, ..code
            jmp      ..code
    ..fmt   db       '%u', 0Dh, 0Ah, 00h
    ..code:
            mov      rcx, ..fmt
            xor      rdx, rdx
            mov      dx, integer
            call     printf
  }
  macro print_literal string {
  local ..str, ..code
            jmp      ..code
    ..str   db       string
            db       00h
    ..code:
            mov      rcx, ..str
            call     printf
  }
  macro rect_surface x1, y1, x2, y2 {
            mov      ax, x2
            sub      ax, x1
            mov      bx, y2
            sub      bx, y1
            imul     bx
  }
  section '.data' writeable
  rect1     rectangle 0, 0, 25, 4
  rect2     rectangle 10, 20, 60, 25
  section '.text' executable
  public main
  extrn printf
  main:
            mov      rbp, rsp
            sub      rsp, 32
            and      rsp, -16
            print_literal "The surface of rectangle 1 is "
            rect_surface [rect1.x1], [rect1.y1], [rect1.x2], [rect1.y2]
            print_int ax
            print_literal "The surface of rectangle 2 is "
            rect_surface [rect2.x1], [rect2.y1], [rect2.x2], [rect2.y2]
            print_int ax
            mov      rsp, rbp
            xor      rax, rax
            ret

That's it for my tutorial about assembly macros using FASM 64-bit. As the title indicates, it is an introduction to the subject, no more and no less. FASM macros can do a lot more than what is described here. Some features that I tried failed: using rept, I always got the assembler error message Incomplete macro, and passing a structure as an argument to a macro and then using the structure's items within the macro resulted in an Undefined symbol error; I could not determine the cause in either case. Other features have not been discussed because I didn't try them out, for example the virtual directive, or passing a structure as an argument to a procedure. You can find more details about macros in the Flat Assembler Programmer's Manual, available as a PDF on several websites, and searching the Internet should turn up further macro-specific information. Finally, note that several macro libraries are available for download on development-related websites.

