Reverse Engineering with FASM and x64dbg

Published Jan 31, 2019

Learn how to use FASM for building our first assembly language program and debug the executable using x64dbg in this article by Reginald Wong, a lead anti-malware researcher at Vipre Security, a J2 Global company, covering various security technologies focused on attacks and malware.

The main piece of knowledge required in advance for any reverse engineer is assembly language. Understanding assembly language is like learning the ABCs of reversing. It may look hard at first, but eventually it will become like muscle memory. Assembly language is the human-readable form of the machine code that the processor executes. The source code of a program can be understood by humans but not by the machine, so it has to be compiled down to machine code, which we read as assembly language, before the machine can run it.

But what if the source code is not available? Then our only way to understand what a program does is to read its assembly code. In a way, what we are doing here is turning assembly language code back into source code, which is why this is called reversing.

All programming languages need to be built into an executable for the system platform that the program targets. To spare you from entering each opcode byte into a binary file by hand, developers have made tools that convert source code into an executable containing code that the machine can understand. Now, let's take a look at one of the popular assembly language builders today.

FASM

FASM, or Flat Assembler, is similar to MASM and NASM. Like MASM, it has its own source editor. Like NASM, its sections are easy to identify and configure, and the software comes in flavors for both Windows and Linux.

FASM can be downloaded from http://flatassembler.net/. In our assembly language programming, we will use FASM, since we can use its editor in both Windows and Linux.

x64dbg

This debugger is highly recommended because its developers keep it up to date and work closely with the community. It supports both 64-bit and 32-bit Windows platforms, has many useful plugins available, and has an interface similar to OllyDbg's.

x64dbg can be downloaded from https://x64dbg.com/.

Hello World

We are going to use FASM for building our first assembly language program, and we will debug the executable using x64dbg.

Installation of FASM

Using our Windows setup, download FASM from http://flatassembler.net/ and extract it into a folder of your choice.

Run FASMW.EXE to bring up the FASM GUI.

It works!

In your text editor, type in the following code, which is also available at https://github.com/PacktPublishing/Mastering-Reverse-Engineering/blob/master/ch3/fasmhello.asm:

format PE CONSOLE
entry start

include '%include%\win32a.inc' 

section '.data' data readable writeable 
  message db 'Hello World!',0
  msgformat db '%s',0

section '.code' code readable executable 
  start:
    push message
    push msgformat
    call [printf]
    push 0
    call [ExitProcess]

section '.idata' import data readable writeable 
  library kernel32, 'kernel32.dll', \
          msvcrt, 'msvcrt.dll'
  import kernel32, ExitProcess, 'ExitProcess'
  import msvcrt, printf, 'printf'

Save it by clicking on File->Save as..., then click on Run->Compile.

The executable file will be created in the same folder where the source was saved.

If "Hello World!" did not show up, note that this is a console program. Open a command terminal and run the executable from there.

Dealing with common errors when building

Write Failed Error – The builder or compiler is not able to write to the output file. The executable file it is trying to build may still be running. Look for the previously run program and terminate it, for example from the process list or Task Manager.

Unexpected Characters – Check the syntax at the indicated line. Sometimes the included files also need to be updated because the syntax has changed in recent versions of the builder.

Invalid argument – Check the syntax at the indicated line. A definition or a declaration may be missing parameters.

Illegal instruction – Check the syntax at the indicated line. If you are sure that the instruction is valid, the builder version may not be one that supports it. When updating the builder to the most recent version, also update the source to comply with that version.

Dissecting the program

Now that we have built our program and got it working, let's discuss what the program contains and is intended for.

A program is mainly structured with a code section and a data section. The code section, as its name states, is where the program code is placed. The data section, on the other hand, is where the data used by the program code, such as text strings, is located. There are requirements before a program can be compiled. These requirements define how the program will be built. For example, we can tell the compiler to build the program as a Windows executable instead of a Linux executable. We can also tell the compiler at which line in the code the program should start running. An example of a program structure is given here:

6.png

We can also define the external library functions that the program will be using. This list is described under a separate section called the Import section. A compiler can support various other sections as well. One example of these extended sections is the resource section, which contains data such as icons and images.

With a basic picture of how a program is structured, let's see how our program was written. The first line, format PE CONSOLE, indicates that the program will be compiled as a Windows PE executable file and built to run on the console, better known in Windows as Command Prompt.

The next line, entry start, means that the program will start running the code located at the start label. The name of the label can be changed as desired by the programmer. The next line, include '%include%\win32a.inc', adds declarations from the FASM library file win32a.inc. These declarations are what allow the printf and ExitProcess API functions to be imported in the .idata section, which is discussed later.

There are three sections built in this program: the data, code, and idata sections. The section names here are labeled as .data, .code, and .idata. The permissions for each section are also indicated as readable, writeable, or executable. The data section is where integers and text strings are placed and listed using the define byte (db) directive. The code section is where the instructions to be executed are placed. The idata section is where imported API functions are declared.

On the next line, we see that the data section is defined as a writeable section:

section '.data' data readable writeable

The program's .data section contains two variables, message and msgformat. Both text strings are ASCIIZ (ASCII-Zero) strings, which means that they are terminated with a zero (0) byte. These variables are defined with the db directive:

message db 'Hello World!',0
msgformat db '%s',0
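
To make the zero terminator concrete, here is a minimal C sketch of the same bytes that message db 'Hello World!',0 defines. The array name is borrowed from the listing, while the two helper functions are hypothetical names introduced only for illustration:

```c
#include <stddef.h>
#include <string.h>

/* The same bytes as "message db 'Hello World!',0": twelve characters
   followed by an explicit zero byte. */
static const char message[] = {
    'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', '!', 0
};

/* Hypothetical helper: bytes occupied by the definition,
   including the terminating zero. */
size_t message_size(void) {
    return sizeof message;
}

/* Hypothetical helper: characters before the zero byte.
   strlen finds the end of the string by scanning for that byte. */
size_t message_length(void) {
    return strlen(message);
}
```

The definition occupies one byte more than the visible text. Functions such as printf rely on that final zero byte to know where the string ends.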

The next line defines the code section. It is defined with read and execute permissions:

section '.code' code readable executable

The .code section is where the start: label and our code are located. Label names are followed by a colon character.

In C programming, printf is a function commonly used to print out messages to the console using the C syntax, as follows:

int printf ( const char * format, ... );

The first parameter is the message containing format specifiers. The remaining parameters contain the actual data that fills in the format specifiers. From an assembly language perspective, the printf function is an API function in the msvcrt library. An API function call is set up by placing the arguments on the stack, last argument first, before calling the function. If your program is built in C, a function that requires 3 parameters (for example, myfunction(arg1, arg2, arg3)) will have the following as an equivalent in assembly language:

push <arg3>
push <arg2>
push <arg1>
call myfunction

For a 32-bit address space, the push instruction writes a DWORD (32 bits) of data to the top of the stack. The address of the top of the stack is stored in the ESP register. When a push instruction is executed, ESP decreases by 4. If the argument is a text string or a data buffer, its address is pushed to the stack. If the argument is a number, the value itself is pushed to the stack.
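
The effect of push on ESP can be modeled with a toy stack in C. This is only a sketch of the behavior described above, not how the processor is implemented. The type ToyStack, the constant STACK_BYTES, and the toy_ functions are all hypothetical names, and the model does no overflow checking:

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64  /* hypothetical size of the simulated stack */

typedef struct {
    uint8_t  memory[STACK_BYTES]; /* simulated stack memory */
    uint32_t esp;                 /* offset of the current top of the stack */
} ToyStack;

/* Start with an empty stack: ESP points just past the end of the memory. */
void toy_init(ToyStack *s) {
    memset(s->memory, 0, sizeof s->memory);
    s->esp = STACK_BYTES;
}

/* push: decrease ESP by 4, then store the DWORD at the new top. */
void toy_push(ToyStack *s, uint32_t value) {
    s->esp -= 4;
    memcpy(&s->memory[s->esp], &value, sizeof value);
}

/* Read the DWORD located `offset` bytes above the top of the stack. */
uint32_t toy_peek(const ToyStack *s, uint32_t offset) {
    uint32_t value;
    memcpy(&value, &s->memory[s->esp + offset], sizeof value);
    return value;
}
```

After toy_init, calling toy_push with arg3, then arg2, then arg1 lowers esp by 12 in total. It also leaves arg1 at the top of the stack, which is why the arguments are pushed in reverse order.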

Following the same API calling structure, with two arguments, our program called printf in this manner:

push message
push msgformat
call [printf]

The addresses labeled message and msgformat, both located in the data section, are pushed to the stack as a setup before calling the printf function. Placing an address in square brackets, [], means that the value stored at that address is used instead of the address itself. Here, printf is actually a label for a local address in the program, declared in the .idata section, where the address of the printf API function from the msvcrt library is stored once the program is loaded. [printf] then means that we are using that stored address. Thus, call [printf] will execute the printf function from the msvcrt library.
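
A rough C analogy for call [printf] is a call through a function pointer stored in a variable. In the sketch below, printf_slot is a hypothetical stand-in for the .idata entry labeled printf, and call_through_slot is a hypothetical wrapper. It illustrates the indirection only and is not what FASM generates:

```c
#include <stdio.h>

static const char message[]   = "Hello World!";
static const char msgformat[] = "%s";

/* Hypothetical stand-in for the .idata entry: a variable that
   holds the address of the real printf function. */
static int (*printf_slot)(const char *, ...) = printf;

/* Rough equivalent of:
       push message
       push msgformat
       call [printf]
   The call reads the address stored in printf_slot and jumps to it.
   Returns what printf returns: the number of characters written. */
int call_through_slot(void) {
    return (*printf_slot)(msgformat, message);
}
```

The call never names the real function directly. It uses whatever address the slot currently holds, which is why the same executable works wherever msvcrt.dll happens to be loaded.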

The same goes for ExitProcess. ExitProcess is a kernel32 function that terminates the running process. It requires a single parameter, which is the exit code. An exit code of 0 means that the program will terminate without any errors:

push 0
call [ExitProcess]

In C syntax, this code is equivalent to ExitProcess(0), which terminates the program with a success result defined with zero.

The program's .idata section contains external functions and is set with read and write permissions:

section '.idata' import data readable writeable

The import definitions come in two parts. The first part indicates which library files the functions are located in. The library command sets the required libraries and uses the syntax library <library name>, <library file>. A backslash, \, indicates that the next line is a continuation of the current line:

library kernel32, 'kernel32.dll', \
        msvcrt, 'msvcrt.dll'

Once the libraries are declared, specific API functions are indicated using the import command. The syntax is import <library name>, <function name>, <function name in library file>. Two external API functions are imported here, kernel32's ExitProcess and msvcrt's printf:

import kernel32, ExitProcess, 'ExitProcess'
import msvcrt, printf, 'printf'

An annotated version of the program can be found at https://github.com/PacktPublishing/Mastering-Reverse-Engineering/blob/master/ch3/FASM%20commented.txt

The library of API functions can be found in the MSDN library, which also has an offline version packaged in the Visual Studio installer. It contains detailed information about what the API function is for and how to use it. The online version looks like the following:

7.png

If you found this article interesting, you can explore Mastering Reverse Engineering to implement reverse engineering techniques for analyzing software, exploiting software targets, and defending against security threats like malware and viruses.
