Reverse Engineering: Ghidra and Radare2

From Embedded Lab Vienna for IoT & Security

Summary

This documentation serves as a short tutorial for reverse engineering. It is split into two parts:

1) Computer and operating systems basics (foundations to understand reverse engineering)
2) Reverse Engineering with 2 common tools (Ghidra and Radare2)

Feel free to skip the first part if you are already familiar with these concepts.

Computer architecture

There are a variety of computer architectures; two of the most prominent are the Harvard and von Neumann concepts.

Harvard Concept

The Harvard concept is widespread in various fields, particularly where high efficiency and speed in data processing are required. The Harvard architecture refers to a concept where instructions and data are stored in separate memories. The processor (CPU) has a separate bus for the instruction memory and the data memory. To communicate with the outside world, such as users or other computers, input and output units (input and output devices) are required. Possible input devices include keyboard, mouse, microphone, or network devices, and possible output devices include display, speaker, or network devices.

Image.png

Von Neumann Concept

Most of today's commercial computers for private users are based on the von Neumann concept. A key aspect of the von Neumann architecture is storing data and programs in the same memory. The processor is connected to the memory and the I/O units through the same bus system, allowing it to access the stored instructions and associated data in the memory as well as I/O data. To communicate with the outside world, such as users or other computers, input and output units (input and output devices) are also required. Possible input devices include keyboard, mouse, microphone, or network devices, and possible output devices include display, speaker, or network devices.

Image2.png

The following chapters will exclusively focus on the von Neumann concept, even though there are similarities between the concepts mentioned above.

The Processor

The processor (CPU) is the central component of every computer and essentially consists of the following components:

  • Arithmetic and Logic Unit (ALU): Performs arithmetic and logical operations.
  • Working or Arithmetic Registers: Store operands and results temporarily during operations.
  • Control Unit: Manages internal components of the processor such as ALU, registers, bus systems, and other resources.

Each processor has its own instruction set, dependent on the processor type or architecture. The instruction set comprises a set of commands. Each command is associated with a bit pattern known as the instruction code, machine code, or opcode. Commands stored in memory are executed sequentially. The sequence of commands in memory is referred to as a program. Depending on the commands in the program, the processor performs various operations. When designing a computer architecture, the functionality of the computer is defined in part by the processor's instruction set.

The processor includes an internal bus system and interfaces with external address, data, and control buses (the control bus is not depicted here for simplicity).

The address bus is used to communicate with devices such as memory cells, input/output devices, etc., for interaction purposes. The data bus is used to exchange data with addressed devices. The control bus has diverse functionalities, such as controlling the direction of data exchange on the data bus.

The highly simplified and schematically represented processor in Figure 3 features a working register A, used, for example, to store interim results. All arithmetic and logical commands are executed by the Arithmetic Logic Unit (ALU), where in this case, the working register A must always contain one of the operands. For operations involving two operands, the second operand is loaded from memory by specifying its address through the intermediate register ZR. The result of the operation is then written back into working register A, replacing its previous content. Depending on the result of an operation, the ALU sets individual bits in the status register (Flag Register F). These bits, known as flags, indicate the outcome of the executed operation. Examples of flags in the status register include the Zero Flag, Carry Flag, and Overflow Flag. The state of these flags in the status register can be checked and used, for instance, for conditional jump commands to execute conditional branching in the program.

The control unit houses all the logic necessary for executing commands and controlling the resources accordingly.

Image3.png

Command Execution

Essentially, the following cyclic scheme outlines the execution of an instruction:

  • FETCH (Fetch instruction from memory)
  • EXECUTE (Execute instruction)

During the Fetch phase, the instruction located at the address stored in the program counter is loaded into the instruction register. Subsequently, the program counter is incremented by one. If an instruction spans multiple words, the control unit repeats this process until the entire instruction is loaded.

During the Execute phase, the instruction is executed. For example, an address is issued through an intermediate register, an operand is fetched from that address, it is added to the second operand already stored in working register A, the result is then stored back into working register A, and corresponding flags are set in the flag register.

Branch instructions allow for interruption or alteration of the sequential program flow. In the case of a branch instruction, the program flow continues at the target address specified by the branch instruction.

The processor presented in this section is a highly simplified model. Commercial processors typically feature a multitude of working registers, registers with specialized functions, and data pathways beyond those outlined here for address and data buses.

Memory

The memory can essentially be envisioned as a one-dimensional array of field elements with a fixed bit width, where each field element can be accessed by a unique address.

Hardware programming

Machine Commands

Typically, a machine instruction consists of an opcode followed by an address portion. An opcode is a unique bit pattern interpreted by the control unit, prompting it to execute all necessary steps for that particular instruction. The address portion, on the other hand, is used to address one or more operands. Machine instructions can either vary in length, which complicates and slows down the control unit, or have a fixed length, which avoids these drawbacks.

Addressing Types:

Register addressing:

In register addressing, only registers are referenced.

MOV R1, R2  ; Copy the value from register R2 to register R1

Immediate addressing:

In immediate addressing, the operand itself is specified in the address part of the machine instruction.

MOV R0, #5  ; Load the immediate value 5 into register R0

Absolute addressing:

Absolute addressing specifies the memory address of the operand in the address part of the machine instruction.

MOV EAX, [0x1234]  ; Loads the value from memory address 0x1234 into register EAX

Relative addressing:

Relative addressing specifies the desired address as an offset relative to a base address. The base address is typically stored in a defined register, and the offset is provided in the machine instruction. The Program Counter (PC) contents are often used as the base address.

LDR R0, [PC, #4]  ; Load the value from address (PC + 4) into register R0

Indexed Addressing:

Indexed addressing is similar to relative addressing. It also calculates an offset from a base address. However, in indexed addressing, the address is provided in the instruction, and the offset is fetched from a special register called the index register.

MOV EAX, [0x1234 + ESI]  ; Load the value from memory address (0x1234 + contents of index register ESI) into EAX

Indirect Addressing (Pointer):

In indirect addressing, the instruction does not contain the operand's address itself. Instead, it specifies a location that holds this address: a memory location or, as in the following example, a register.

LDR R0, [R1]  ; Load the value from the memory address stored in R1 into R0


Note: The examples provided here are presented in assembler code for readability and understanding, and should be interpreted as pseudocode until the introduction of assembler language.


Subprograms

To save memory, frequently executed program segments are stored in memory only once. These commonly executed program segments, also known as functions, can be accessed using specific jump instructions. Depending on the instruction, either the jump address is immediately loaded into the Program Counter (PC) and execution jumps to that address, or the current Program Counter contents are stored first, followed by loading the jump address into the PC. This allows the program flow to resume at the point where the function was called after exiting the function.

In ARMv7, for example, you can jump to a subroutine (function) using the following two commands:

BAL target_label         ; Branch Always: Jump immediately to the address specified by "target_label"
BL function_label        ; Branch with Link: The address of the next instruction (the return address) is stored in the Link Register (LR). Then, jump to the address specified by "function_label".

A function called as a subroutine can be exited using a specific return instruction.

In ARMv7, after completing a subroutine, you can return to the calling function using the following commands:

BX LR                     ; "Branch and Exchange": The processor jumps to the address stored in the specified register (LR).
MOV PC, LR                ; Move the contents of the Link Register (LR) into the Program Counter (PC).

These instructions allow for efficient management of program flow, returning execution to the point where the subroutine was called. To store the Program Counter and other data (such as register contents), the stack can be utilized. The stack is a simple last-in, first-out (LIFO) managed memory structure. This method of management facilitates handling of nested subroutine calls. Typically, the stack is located in main memory, and the processor maintains the memory address of the most recently stored item in a dedicated register known as the Stack Pointer (SP).


Note: The examples provided here have been presented in assembler code for readability and understanding. At this point, they should be interpreted as pseudocode, pending the introduction of assembler language.

Higher-level programming languages and abstraction concepts

The two main problems of hardware programming described so far are that machine instruction bit patterns are difficult to remember, and programs in machine code are hard to relocate due to their absolute addresses.

Assembler

The solution to the issues of poor memorability and readability of machine code involves introducing an abstraction where machine instructions are associated with words and symbols used in programming instead of directly using the machine instruction bit patterns in source code files. Now, programs can be written in text files using these words and symbols, known as assembly language, rather than machine instruction bit patterns. These text files are processed by a translation program called an assembler, which converts them back into machine instruction bit patterns, transforming the word- and symbol-based program into executable code. Text files containing assembly code are commonly referred to as assembly source files and are typically saved with the extension ".s". Files containing the output translated by the assembler are called object files and are usually saved with the extension ".o".

Here's an example of an assembly program that adds two registers:

.global _start
_start:
   movl    $0, %esi
   movl    $3, %eax
   addl    %eax, %esi

An assembler program can be translated into machine code on Linux using the following command:

as test.s -o test.o

Please note, this creates an unlinked program object in machine code!

To display machine code as assembler code, i.e., to disassemble it under Linux, you can use the following command:

objdump -d test.o

Linker

The solution to the problem of difficulty in relocation involves introducing symbolic addresses, which act as "variables" called "labels" and are definitively set by the linker. The assembler fills these symbolic addresses with zeros as placeholders and creates relocation information tables that instruct the linker to insert the addresses of these labels at the respective locations.

From now on, programs can be divided into components and developed independently. Files containing the output bound by the linker are executable files known as binaries. They receive an extension dependent on the operating system (e.g., ".exe" in Windows).

Unlinked object files can be linked into a binary under Linux using the following command:

ld test.o -o test

This command links the object file "test.o" into an executable binary named "test".

Compiler

Writing program logic in assembler is unfortunately still somewhat cumbersome. Firstly, machine instructions expressed in words or symbols, despite commonalities across different processors, remain heavily dependent on the specific processor architecture. Secondly, human thought constructs can be complex and may be implemented differently or with difficulty in various processor assemblers. This situation has led to another level of abstraction: the development of higher-level programming languages. These languages provide means to express logical constructs in ways that are more easily understandable to humans and can be formulated universally across different processor instruction sets. From this point onward, primitives can be formulated in these higher-level languages, which are then translated into the specific assembler instruction set of a processor by a program called a compiler.

Compilation Process Example

Here is a highly simplified example illustrating the respective representations of a sample program:

  • Example.c:
#include <stdlib.h>
int main() {
   exit(0);
}

Compilation process:

Command to invoke only the GCC compiler under Linux:

gcc -S Example.c
  • Example.s:
main:
   pushl   %ebp
   movl    %esp, %ebp
   pushl   $0
   call    exit
   addl    $4, %esp
   movl    %ebp, %esp
   popl    %ebp
   ret

Assembly Process:

Command under Linux to translate an assembly program into machine code:

as Example.s -o Example.o
  • Example.o:
0000    55
0001    89E5
0003    6A00
0005    E800000000
000a    83C404
000d    89EC
000f    5D
0010    C3
main:   0
0006:   exit ADDR32 ; Relocation information: insert the address of "exit" at address 0006

Linking Process:

Command under Linux to combine object files into a binary:

ld Example.o -o Example
  • Example:
…
0030    55
0031    89E5
0033    6A00
0035    E848010000 ; Address of the function "exit" was inserted by the linker
003a    83C404
003d    89EC
003f    5D
0040    C3
…

Working with an operating system

Despite higher-level programming languages, challenges such as operating specific hardware, multitasking, and virtual memory management persist. The operating system (OS) was introduced to address these issues. An operating system is a program or a collection of programs that performs various tasks, including:

  • Loading and executing programs
  • Managing multitasking
  • Virtualizing memory, allowing each program to be written as if it starts at address 0 without conflict in multitasking scenarios (this is achieved, for example, using specialized hardware like a Memory Management Unit, MMU)
  • Managing hardware and providing hardware abstractions through drivers, eliminating the need for every programmer to understand the intricate details of hardware operation (e.g., configuring voltage levels manually for network card operations)
  • Providing a user-friendly interface for users

To utilize the functionalities of an operating system, various programming interfaces (APIs) were developed and standardized. These APIs offer similar functions (known as primitives). One prominent example is the Portable Operating System Interface (POSIX) standard (ISO/IEC/IEEE 9945), which has been largely implemented by operating systems like Linux.

Examples of primitives offered by POSIX in a higher-level programming language like C include:

* int open(const char *path, int oflag, ...) for opening files
* int close(int fildes) for closing files
* pid_t fork(void) for creating a new process
* void (*signal(int sig, void (*func)(int)))(int) for registering signal handler functions
* ...

At the hardware level, these primitives are implemented through system calls to the operating system. These system calls are highly dependent on the operating system and processor.

For instance, the Linux kernel maintains a list of all system calls it provides, known as the System Call Table. Each system call is associated with a unique number and a kernel internal function that performs the actual tasks. To invoke a system call, for example on 32-bit x86 machines, one loads the number of the desired call into the EAX register and then triggers software interrupt 128 (hexadecimal: 0x80). The arguments for the system call are passed in CPU registers (EBX, ECX, EDX, ESI, EDI, and EBP) rather than on the stack, similar to a fastcall-style calling convention.

The software interrupt, also known as an exception, halts the execution of the program in user mode and triggers the execution of an exception handler in kernel mode. The kernel's exception handler reads the EAX register and, if it contains a valid system call number, invokes the corresponding kernel function from the System Call Table with the arguments stored in the other registers. After validating the arguments, the kernel performs the requested tasks. Upon completion of this function, the exception handler finishes its work, and normal program execution resumes.

Example of using a system call under Linux on a 32-bit ARM processor:

.global _start
_start:
   MOV R0, #1         @ Set R0 to define the data stream (STDOUT here). This data stream is typically connected to the console.
   LDR R1, =message   @ Load R1 with a pointer to the data to be written.
   LDR R2, =len       @ Load R2 with the length of the data to be written.
   MOV R7, #4         @ Set R7 to define the desired system call (here, write).
   SWI 0              @ Trigger software interrupt 0 (syscall).
   MOV R0, #0         @ Set R0 to the exit status (0 = success).
   MOV R7, #1         @ Set R7 to define the exit system call.
   SWI 0              @ Trigger software interrupt 0 (syscall).
.data
message:
   .ascii "Hello World! \n"  @ Define the message to be printed (write needs no terminating NUL).
len = .-message              @ Calculate the length of the message.

Explanation:

Register R0 is used to define the data stream (in this case, STDOUT), which is the default output connected to the console.
Register R1 serves as a pointer to the data that will be written.
Register R2 indicates the length of the data to be written.
Register R7 is used to specify the desired system call (in this case, writing to STDOUT).

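Finally, the operating system must be instructed to terminate the program so that the processor does not continue executing whatever follows in memory. This is achieved with system call number 1 (exit), which is triggered through the same software interrupt (SWI 0) as the write call.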

Reverse Engineering

Description

Reverse engineering is the process of analyzing a finished product or system to understand its design, structure, and functionality. This process is frequently applied in software development, as well as in hardware development and other engineering disciplines.

Applications of Reverse Engineering

Software Development:

  • Bug Fixing and Debugging
  • Compatibility and Interoperability
  • Security Analysis
  • Detecting License Violations

Hardware Development:

  • Product Analysis
  • Repair and Maintenance
  • Manufacturing Spare Parts

Typical Steps in Reverse Engineering

1. Collecting information about the target product

2. Disassembly and Decompilation (converting binary code or machine code into a higher-level programming language or human-readable form)

3. Analysis and Documentation of the individual components and their functionality

4. Recovery and Reconstruction

Legal and Ethical Considerations

Reverse engineering can raise legal and ethical questions, particularly concerning intellectual property. While it is legal in many cases, there are certain situations where it can lead to legal disputes:

  • Copyright: Copying or modifying protected code without the rights holder's permission
  • Patents: Patent infringements if reproduced products use patented technologies
  • Terms of Use: Many software licenses explicitly prohibit reverse engineering

Tools and Techniques

Two common tools, both free and open source, are:

* Ghidra
* Radare2

Ghidra

Description

Ghidra is an open-source software reverse engineering framework created by the National Security Agency (NSA) and publicly released in 2019. It offers a powerful environment for reverse engineering and malware analysis. Ghidra supports a wide variety of processor architectures and file formats, making it a versatile tool for security researchers and developers.

Installation

Requirements:

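* Java Development Kit (JDK) 11 or higher (recent Ghidra releases require a newer JDK, so check the installation guide of the release you download)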
* At least 4 GB of RAM (8GB or more recommended)

Steps:

* Download the latest version of Ghidra from the official website: https://ghidra-sre.org/
* Unzip the downloaded archive to a directory of your choice
* Ensure that the JDK is installed and the 'JAVA_HOME' environment variable is set correctly

Starting Ghidra:

Navigate to the directory where you unzipped Ghidra and run the start script:

* Linux/macOS: ./ghidraRun
* Windows: ghidraRun.bat

Basic Features

Creating a new project:

1. Start Ghidra and create a new project ('File -> New Project')
2. Choose a project type (usually Non-shared Project) and specify a location and name for the project
3. Import the binary file you want to analyze ('File -> Import File')

Analyzing the Binary:

1. After importing the file, double-click on it in the explorer to open it
2. Ghidra will automatically perform an initial analysis of the file. Confirm the suggested settings and start the analysis.

Disassembly and Decompilation:

Disassembly: Shows the machine code of the binary file as assembly instructions. This is useful for understanding the low-level execution of the file.
Decompilation: Converts the machine code into a higher-level, human-readable form (similar to C code).

Advanced Features

  • Scripting

Ghidra supports scripting to automate complex analyses. You can write scripts in Java or Python (Jython).

  • Debugging

Ghidra can be integrated with external debuggers to perform dynamic analysis. This allows you to set breakpoints and step through the code.

Of course there are many other notable features, but these are beyond the scope of this documentation.

Radare2

Description

Radare2 is a comprehensive framework for reverse engineering and binary analysis. It offers a wide range of features, from static and dynamic analysis to patching and debugging support. Due to its power and flexibility, radare2 is popular among security experts, malware analysts, and developers.

Installation

Radare2 can be installed on various operating systems (it comes pre-installed on Kali Linux). These are the steps for the most common platforms:

  • Linux
git clone https://github.com/radareorg/radare2.git
cd radare2
sys/install.sh
  • macOS
brew install radare2
  • Windows
Download the installation package from the official website and follow the instructions.

Basic Commands

After installation, you can start radare2 by typing 'r2' followed by the path to the binary you want to analyze:

r2 /path/to/binary

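A typical session starts at the radare2 prompt with the following commands:

  • aaa - analyze the binary so that functions and references are available to further commands
  • afl - list all functions found by the analysis
  • s - seek to an address or symbol (for example, 's main')
  • / - search for a string or byte sequence
  • ? - help and information
  • V - enter visual mode, which starts in the hex view
  • VV - enter visual graph mode with a graphical representation of the current function

Inside visual mode, the following keys are available:

  • p - change view forwards
  • P - change view backwards
  • q - quit visual mode and return to the prompt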

Advanced Features

  • Scripting

With radare2, you can write scripts in various languages to perform complex analyses and automations.

  • Debugging

Radare2 also supports debugging binaries:

r2 -d /path/to/binary

In debug mode, additional commands are available:

  • db - sets a breakpoint
  • dc - continue execution
  • dr - displays register contents

Of course there are many other notable features, but these are beyond the scope of this documentation.

Training Code

simpleMath.c

#include <stdio.h>
int add(int a, int b) {
   return a + b;
}
int subtract(int a, int b) {
   return a - b;
}
int multiply(int a, int b) {
   return a * b;
}
int main() {
   int x = 5;
   int y = 3;
   int sum = add(x, y);
   int diff = subtract(x, y);
   int prod = multiply(x, y);
   printf("Sum: %d\n", sum);
   printf("Difference: %d\n", diff);
   printf("Product: %d\n", prod);
   return 0;
}

Don't forget to compile the file with gcc first, for example with 'gcc simpleMath.c -o simpleMath'!

References