Inline assembly

Using 32-bit and 64-bit inline assembly with Free Pascal.

There are two ways to use assembly code with a Free Pascal program:

Including the assembly code within the Free Pascal source code.
Creating a separate assembly file and linking the object file resulting from the assembly with the Free Pascal program.

The first approach is called inline assembly, and it is the subject of this tutorial. Please note that the included programs (the sample source code is available for download) have been created and tested on Windows; they might not work on other platforms. Knowledge of Free Pascal and x86 assembly (Intel syntax) is a prerequisite. To reproduce all samples of the tutorial, you need to install either both the 32-bit and the 64-bit version of Lazarus (cf. the tutorial Installing Lazarus/Free Pascal on MS Windows, which explains how to make a secondary Lazarus installation), or Lazarus 64-bit with the 32-bit cross-compiler add-on (cf. the tutorial 32-bit cross-compiling on Windows 64-bit).

As the Free Pascal compiler supports both the Intel syntax and the AT&T syntax, you have to tell it which one your code uses. For the Intel syntax, include the compiler directive:
{$asmMode intel}

Assembly code included within the main program.

By "included within the main program", I mean code that is not written as a separate assembly function or procedure. Assembly code may be inserted directly into the Free Pascal source by putting it within the block:
asm
...
end;

There are quite few official websites that discuss mixing Free Pascal and assembly. If you find a page, it is mostly the answer to some question in a forum. The code that I found there often didn't work when I tried it; maybe it did with Delphi, but not with Free Pascal. And there is one important point to be aware of: Free Pascal inline assembly is architecture-dependent, i.e. in most cases you have to use different assembly code depending on the target platform: Windows 32-bit or Windows 64-bit (note that programs for Windows 32-bit also run on a 64-bit Windows).

Inline assembly for a 32-bit target.

This is really easy: just write your assembly code as if you were writing an independent assembly routine. The Free Pascal variables may be used by their name, very much as you would use variable labels in assembly.

Our first sample program ass1_x86 swaps the two variables A and B. It takes just 3 assembly instructions; here is the code of the program:

program ass1_x86;
{$asmMode intel}
var
  A, B: LongInt;
begin
  Write('Enter two integer values A and B? '); Readln(A, B);
  Writeln('A = ', A, ' B = ', B);
  Writeln('Swapping the variables A and B...');
  asm
    mov eax, [A]     { EAX := A }
    xchg eax, [B]    { EAX := B, B := old value of A }
    mov [A], eax     { A := old value of B }
  end;
  Writeln('A = ', A, ' B = ', B);
  Write('Hit ENTER to terminate the program '); Readln;
end.

Two notes concerning this code:

To avoid surprises, you should declare your integer variables as LongInt, when using assembly for a 32-bit target.
The square brackets are actually optional. I would, however, recommend using them: if you were writing a standalone assembly routine and A and B were variable labels, you would have to use square brackets, too, in order to tell the assembler to use the value at that address.

Inline assembly for a 64-bit target.

Why should this code (with Int64 variables and the 64-bit instead of the 32-bit registers) not work on Windows 64-bit? If we try it and compile the program in the Lazarus IDE, we get the warning Object file contains 32-bit absolute relocation, as shown on the screenshot below. The program actually runs, but with wrong results (input: A = 10, B = 20; after the swap: A = 0, B = 20).

Free Pascal inline assembly: Absolute addressing issue with 64-bit code

On several forum pages, this warning is explained as an alignment issue, but it is more probable that the problem is the use of absolute addresses. When a variable is referenced with an absolute address and that address is not within the first 4 GB, the result is a wrong memory reference, because the reference is truncated to 32 bits. The correct way to proceed on a 64-bit platform is to use AMD64 RIP-relative addressing. RIP-relative addressing is the x86-64 specific term for what most other architectures call PC-relative addressing. PC-relative means that, rather than holding an absolute address, an instruction that references memory holds an offset relative to the instruction pointer (on x86-64, the address of the next instruction).

And here is how this looks in our variable swapping program ass1_x64:
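To make the idea a bit more tangible, here is a minimal sketch for a 64-bit Windows target. It lets the processor compute the address of a variable relative to RIP and compares it with the address that Free Pascal itself reports. The program name riprel_check and the variable P are made up for this illustration; they are not part of the sample programs.

```pascal
program riprel_check;
{$asmMode intel}
var
  A: Int64;
  P: Pointer;   { hypothetical helper variable, receives the computed address }
begin
  asm
    lea rax, [rip + A]    { address of A, computed relative to RIP }
    mov [rip + P], rax    { store that address in the Pascal variable P }
  end;
  { If RIP-relative addressing resolves to the real location of A, both addresses match }
  Writeln('Same address as @A: ', P = @A);
end.
```

If the two addresses match, the program should report TRUE, whatever address the variable A was actually loaded at.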

program ass1_x64;
{$asmMode intel}
var
  A, B: Int64;
begin
  Write('Enter two integer values A and B? '); Readln(A, B);
  Writeln('A = ', A, ' B = ', B);
  Writeln('Swapping the variables A and B...');
  asm
    mov rax, [rip + A]     { RAX := A (RIP-relative access) }
    xchg rax, [rip + B]    { RAX := B, B := old value of A }
    mov [rip + A], rax     { A := old value of B }
  end;
  Writeln('A = ', A, ' B = ', B);
  Write('Hit ENTER to terminate the program '); Readln;
end.

Notes:

Always use RIP-relative addressing instead of absolute addressing when using assembly for a 64-bit target.
To avoid surprises, you should declare your integer variables as Int64, when using assembly for a 64-bit target; we'll see such a "surprising" (wrong) result with a sample program further down in the text.
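As the two swap programs differ only in the variable type and in the assembly block, one possible way to keep both variants in a single source file is conditional compilation. The following is a sketch under assumptions, not one of the sample programs: the program name swap_by_target is made up, the variables use the pointer-sized type PtrInt instead of LongInt or Int64, and the blocks are selected with the CPUI386 and CPUX86_64 symbols that the compiler defines for the respective targets.

```pascal
program swap_by_target;
{$asmMode intel}
var
  { PtrInt is 32 bits wide on a 32-bit target and 64 bits wide on a 64-bit target }
  A, B: PtrInt;
begin
  A := 10; B := 20;
  {$IF defined(CPUX86_64)}
  asm
    { 64-bit target: RIP-relative addressing, 64-bit registers }
    mov rax, [rip + A]
    xchg rax, [rip + B]
    mov [rip + A], rax
  end;
  {$ELSEIF defined(CPUI386)}
  asm
    { 32-bit target: direct addressing, 32-bit registers }
    mov eax, [A]
    xchg eax, [B]
    mov [A], eax
  end;
  {$ELSE}
    {$FATAL This sketch only covers 32-bit and 64-bit x86 targets}
  {$ENDIF}
  Writeln('A = ', A, ' B = ', B);
end.
```

Only the block that matches the target is ever seen by the assembler, so the same file can be built with the 32-bit and with the 64-bit compiler.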

Assembly procedures and "by reference" arguments.

In the next program sample, we put the variable swapping code into an assembly procedure, called from the main program like any other procedure. The two variables to be swapped have to be declared by reference (using var) in the procedure header. This means that we can't use the code above to perform the swap: by-reference arguments aren't values, but pointers to variables in the calling program block (here the main program). The obvious way to proceed is to load the pointers into the index registers and to use these registers to access the values of A and B in the assembly code. This approach works well with both a 32-bit and a 64-bit target.
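To see what "by reference" means without any assembly involved, the following plain Pascal sketch writes the pointers out explicitly. The procedure SwapViaPointers is a made-up name for this illustration; it does by hand what a procedure with var arguments does behind the scenes.

```pascal
program byref_as_pointer;
var
  A, B: Int64;

{ Same job as a procedure declared with "var M, N: Int64", }
{ but with the pointers passed and dereferenced explicitly }
procedure SwapViaPointers(M, N: PInt64);
var
  Temp: Int64;
begin
  Temp := M^;    { read the value that M points to }
  M^ := N^;      { overwrite it with the value that N points to }
  N^ := Temp;
end;

begin
  A := 10; B := 20;
  SwapViaPointers(@A, @B);    { pass the addresses of A and B }
  Writeln('A = ', A, ' B = ', B);
end.
```

The assembly procedures do the same thing: M and N hold addresses, and the values are read and written through registers loaded with these addresses.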

Here is the code of the programs ass2_x86 and ass2_x64.

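program ass2_x86;
{$asmMode intel}
var
  A, B: LongInt;

procedure SwapInt(var M, N: LongInt); assembler;
asm
  mov esi, M         { ESI := address of the first variable }
  mov edi, N         { EDI := address of the second variable }
  mov eax, [esi]     { EAX := value of the first variable }
  xchg eax, [edi]    { exchange it with the value of the second variable }
  mov [esi], eax     { store the old second value into the first variable }
end;

begin
  Write('Enter two integer values A and B? '); Readln(A, B);
  Writeln('A = ', A, ' B = ', B);
  Writeln('Swapping the variables A and B...');
  SwapInt(A, B);
  Writeln('A = ', A, ' B = ', B);
  Write('Hit ENTER to terminate the program '); Readln;
end.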

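program ass2_x64;
{$asmMode intel}
var
  A, B: Int64;

procedure SwapInt(var M, N: Int64); assembler;
asm
  mov rsi, M         { RSI := address of the first variable }
  mov rdi, N         { RDI := address of the second variable }
  mov rax, [rsi]     { RAX := value of the first variable }
  xchg rax, [rdi]    { exchange it with the value of the second variable }
  mov [rsi], rax     { store the old second value into the first variable }
end;

begin
  Write('Enter two integer values A and B? '); Readln(A, B);
  Writeln('A = ', A, ' B = ', B);
  Writeln('Swapping the variables A and B...');
  SwapInt(A, B);
  Writeln('A = ', A, ' B = ', B);
  Write('Hit ENTER to terminate the program '); Readln;
end.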

The screenshot shows the execution of the 32-bit program within the Lazarus IDE.

Free Pascal inline assembly: Execution of a 32-bit 'swap variables' program

The primary goal of the assembler directive is to tell the fpc compiler that the following code actually is assembly, allowing it to perform some optimizations of the object code created.

The procedures are the same for a 32-bit and a 64-bit target, except for the registers used: ESI and EDI (and EAX) vs. RSI and RDI (and RAX).

Assembly functions using the register calling convention.

Calling an assembly function is somewhat tricky, because the way to proceed depends on the calling convention used. Free Pascal supports several of them, and we use a further directive to tell the fpc compiler which one should be used in our program. On Windows 64-bit, arguments are always passed in registers. As this is a rather simple way to handle the function arguments, we can use the register calling convention for a 32-bit target, too. It is important to note, however, that the registers used for a 32-bit and for a 64-bit target are not the same!

On a 32-bit platform, the registers used for the first three arguments are EAX, EDX and ECX, and the function return value has to be in EAX. On 64-bit Windows, the registers used for the first four arguments are RCX, RDX, R8 and R9, and the function return value has to be in RAX.

To show how this works, let's write a simple function that determines the sign of an integer value. The function should return 0 if the argument is zero, 1 if it is greater than zero and -1 if it is less than zero. Note that my "sign" assembly code is not the best one; there are ways to do this with fewer and faster instructions...
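As a quick check of the 64-bit register assignment, here is a minimal sketch of a function with two arguments. The names add_x64 and AddInts are made up for this illustration, and the code assumes a 64-bit Windows target (other 64-bit platforms pass arguments in different registers).

```pascal
program add_x64;
{$asmMode intel}

{ Hypothetical example: returns X + Y, assuming 64-bit Windows }
function AddInts(X, Y: Int64): Int64; assembler; register;
asm
  mov rax, rcx    { first argument X arrives in RCX }
  add rax, rdx    { second argument Y arrives in RDX }
                  { the sum is left in RAX, the return value register }
end;

begin
  Writeln('40 + 2 = ', AddInts(40, 2));
end.
```

On 64-bit Windows this should print the sum 42; if the two registers were mixed up or the wrong ones were used, the result would be visibly wrong.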

Here is the code of the sample program ass3_x86:

program ass3_x86;
{$asmMode intel}
var
  A: LongInt;

function Sign(N: LongInt): LongInt; assembler; register;
asm
  xor ecx, ecx       { ECX := 0, the result for N = 0 }
  cmp eax, 0         { the argument N arrives in EAX }
  je @Done
  jl @Negative
  mov ecx, 1         { N > 0 }
  jmp @Done
@Negative:
  mov ecx, -1        { N < 0 }
@Done:
  mov eax, ecx       { the return value has to be in EAX }
end;

begin
  Write('Enter a signed integer A? '); Readln(A);
  Writeln('The sign of A = ', Sign(A));
  Write('Hit ENTER to terminate the program '); Readln;
end.

No need to make a mov to get the function argument: As we use the register calling convention, the first (and only, in this case) argument is automatically available in EAX. I use ECX to store the function return value; initialized at 0, it is set to 1 if the content of EAX is greater than 0, to -1 if it is less. As the register calling convention states that the function return value has to be in EAX, we move ECX to EAX at the end of the routine. This automatically sets the Free Pascal pre-declared variable Result to the content of EAX (so, no need to use mov @Result, <register> before returning from an assembly function).

And the code of the sample program ass3_x64:

program ass3_x64;
{$asmMode intel}
var
  A: Int64;

function Sign(N: Int64): Int64; assembler; register;
asm
  xor rax, rax       { RAX := 0, the result for N = 0 }
  cmp rcx, 0         { the argument N arrives in RCX }
  je @Done
  jl @Negative
  mov rax, 1         { N > 0 }
  jmp @Done
@Negative:
  mov rax, -1        { N < 0 }
@Done:
end;

begin
  Write('Enter a signed integer A? '); Readln(A);
  Writeln('The sign of A = ', Sign(A));
  Write('Hit ENTER to terminate the program '); Readln;
end.

On a 64-bit system, the first (and here only) argument is in RCX. This is convenient (in comparison with 32-bit), as we can immediately use RAX (that will be used to return the function result), to store the sign value.
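As mentioned, the sign can be determined with fewer instructions. One possible variant without any jumps is sketched below; it is an alternative for illustration, not the code of the sample programs. The names sign_branchless and SignFast are made up, and the code assumes a 64-bit Windows target (argument in RCX).

```pascal
program sign_branchless;
{$asmMode intel}

{ Hypothetical jump-free variant of the Sign function, assuming 64-bit Windows }
function SignFast(N: Int64): Int64; assembler; register;
asm
  xor eax, eax     { RAX := 0 (writing EAX also clears the upper half of RAX) }
  test rcx, rcx    { set the flags according to the value of N }
  setg al          { AL := 1 if N > 0, otherwise 0 }
  sar rcx, 63      { RCX := -1 if N < 0, otherwise 0 }
  or rax, rcx      { combine both parts: 1, 0 or -1 }
end;

begin
  Writeln(SignFast(-5), ' ', SignFast(0), ' ', SignFast(7));
end.
```

The xor has to come before the test, because it changes the flags itself. For the three sample values, the function should return -1, 0 and 1.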

I said before that it's best to use LongInt variables on a 32-bit and Int64 variables on a 64-bit platform. The screenshot below shows the execution of the 64-bit program within the Lazarus IDE. Note that the wrong function result (1 for a negative number) is due to using a LongInt instead of an Int64 argument. Another example: using an Int64 argument but a Byte function result would return 255 for a negative number!
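Both effects can be reproduced in plain Pascal, without any assembly. The sketch below assumes that the upper 32 bits of the register are zero when a LongInt is passed, which is what the wrong result suggests; the program name size_mismatch_demo is made up for this illustration.

```pascal
program size_mismatch_demo;
var
  N: LongInt;
  R: Int64;
begin
  N := -5;
  { Reinterpret the 32 bits of N as unsigned, then widen them to 64 bits: }
  { this is the value a 64-bit comparison sees if the upper half is zero  }
  Writeln('As a 64-bit value: ', Int64(LongWord(N)));
  R := -1;
  { Keeping only the lowest byte of -1 (all bits set) leaves 255 }
  Writeln('As a Byte result: ', Byte(R));
end.
```

The first line shows 4294967291, a positive number, which explains why the sign function answers 1; the second line shows 255.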

Free Pascal inline assembly: 64-bit 'sign' function with wrong result due to a bad argument size

Assembly functions with an array argument.

We have seen that on a 64-bit system, using the register calling convention, the parameter values of a function with two scalar arguments are passed to the assembly code in the registers RCX and RDX. But how do we proceed if one of the arguments is an array? If you have read through the code of the sample programs of the tutorial, you may already guess the answer. An array should (always) be passed "by reference" to a function. Thus, the array argument actually is a pointer to the first element of the array declared in the calling program block. So, using an index register to point to the array and its elements is a quite simple way to get the array values.

The program ass4_x64 uses an assembly function to determine the maximum value in an array of integers entered by the user. Here is the code:

program ass4_x64;
{$asmMode intel}
var
  N, I: Int64;
  Numbers: array of Int64;

function Max(N: Int64; var Series: array of Int64): Int64; assembler; register;
asm
  mov rsi, rdx       { RSI := pointer to the first array element }
  mov rax, [rsi]     { RAX := first element = maximum so far }
@Next:
  add rsi, 8         { advance to the next Int64 element }
  dec rcx            { one element less to process }
  jz @Done
  cmp rax, [rsi]
  jge @Next          { the maximum so far is still the largest }
  mov rax, [rsi]     { new maximum }
  jmp @Next
@Done:
end;

begin
  Writeln('Enter a series of integers; terminate with 0? ');
  N := 0;
  repeat
    Write('? '); Readln(I);
    if I <> 0 then begin
      Inc(N);
      SetLength(Numbers, N);
      Numbers[N - 1] := I;
    end;
  until I = 0;
  if N > 1 then
    Writeln('The maximum of this series = ', Max(N, Numbers));
  Write('Hit ENTER to terminate the program '); Readln;
end.

When entering the function, RCX contains the number of elements in the array and RDX contains the pointer to the array, i.e. the pointer to the array's first element. The value in RCX can be used directly as the counter of a loop: it is decremented at each iteration, and when it reaches 0, all elements have been processed and the loop has to be terminated. Concerning the array, we load the pointer in RDX into RSI. This allows us to access the array elements using indexed addressing with RSI. Before entering the loop, we load the first array element into RAX, which will contain the function return value (the maximum value of the array elements). At each loop iteration, we add 8 to RSI (Int64 variables have qword size), thus pointing to the next array element. If this element's value is greater than the current maximum in RAX, it becomes the new maximum. When the element counter is 0, we're done and may return to the calling program (the Free Pascal variable Result being automatically assigned the value in RAX).
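For comparison, the same scan can be written in plain Pascal; such a reference version is handy for cross-checking the result of the assembly function. The names max_reference and MaxRef are made up for this sketch, and, unlike the assembly function, it takes the element count from High(Series) instead of a separate argument.

```pascal
program max_reference;
{$mode objfpc}

{ Hypothetical plain Pascal counterpart of the assembly function Max }
function MaxRef(const Series: array of Int64): Int64;
var
  I: Integer;
begin
  Result := Series[0];               { like: mov rax, [rsi] }
  for I := 1 to High(Series) do      { visit the remaining elements }
    if Series[I] > Result then       { like: cmp rax, [rsi] followed by jge @Next }
      Result := Series[I];           { like: mov rax, [rsi] }
end;

var
  Numbers: array of Int64;
begin
  SetLength(Numbers, 4);
  Numbers[0] := 3; Numbers[1] := -7; Numbers[2] := 12; Numbers[3] := 5;
  Writeln('The maximum of this series = ', MaxRef(Numbers));
end.
```

For the four sample values, the maximum reported is 12.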

Assembly procedures with register calling convention.

In the program sample ass2_x64, we have seen how to use the index registers to access "by reference" arguments of a procedure. But how do we proceed in the case of a "by value" argument? I think that the simplest way is to do the same as with the function in the previous example: passing the arguments using the register calling convention. And to fill in the procedure results (which are always "by reference" arguments), we use an index register, as described for the sample program ass2_x64.

The program ass5_x64 uses an assembly procedure to determine the minimum and maximum value in an array of integers entered by the user. Here is the code:

program ass5_x64;
{$asmMode intel}
var
  N, I, Min, Max: Int64;
  Numbers: array of Int64;

procedure MinMax(N: Int64; var Series: array of Int64; out Min, Max: Int64); assembler; register;
asm
  mov rsi, rdx       { RSI := pointer to the first array element }
  mov rax, [rsi]     { RAX := maximum so far }
  mov rbx, [rsi]     { RBX := minimum so far }
@Next:
  add rsi, 8         { advance to the next Int64 element }
  dec rcx            { one element less to process }
  jz @Done
  cmp rax, [rsi]
  jl @Greater        { element is greater than the maximum so far }
  cmp rbx, [rsi]
  jg @Smaller        { element is smaller than the minimum so far }
  jmp @Next
@Greater:
  mov rax, [rsi]
  jmp @Next
@Smaller:
  mov rbx, [rsi]
  jmp @Next
@Done:
  mov rsi, Max       { RSI := address of the variable receiving the maximum }
  mov [rsi], rax
  mov rsi, Min       { RSI := address of the variable receiving the minimum }
  mov [rsi], rbx
end;

begin
  Writeln('Enter a series of integers; terminate with 0? ');
  N := 0;
  repeat
    Write('? '); Readln(I);
    if I <> 0 then begin
      Inc(N);
      SetLength(Numbers, N);
      Numbers[N - 1] := I;
    end;
  until I = 0;
  if N > 1 then begin
    MinMax(N, Numbers, Min, Max);
    Writeln('The minimum of this series = ', Min);
    Writeln('The maximum of this series = ', Max);
  end;
  Write('Hit ENTER to terminate the program '); Readln;
end.

The screenshot below shows the execution of the program within the Lazarus IDE.

Free Pascal inline assembly: Execution of a 64-bit 'array minimum and maximum' procedure

Passing the number of array elements and the pointer to the array is done in exactly the same way as in program sample ass4_x64. The logic of the routine is also the same, except that we make two comparisons, one to find the maximum (stored in RAX), and one for the minimum (stored in RBX). Whereas with a function, we can directly return the result in the RAX register, with a procedure, we have to copy the result values to the calling program variables. As Min and Max are actually pointers to variables in the Free Pascal main program, we use the index register RSI to make an indexed move to these memory addresses (as in program sample ass2_x64).

I hope that this tutorial helps the reader to understand the fundamentals of using inline assembly in a Free Pascal program, and that the included sample programs are useful as templates for their own mixtures of Pascal and assembly code. Programming in assembly may seem really complex and may thus scare people off giving it a try. But maybe this tutorial has convinced you that, even for a beginner, it is not impossible to create programs containing some assembly functions or procedures. Anyway, programming is fun! Enjoy!
