Reverse Engineering: Ghidra and Radare2
Summary
This documentation serves as a short tutorial for reverse engineering. It is split into two parts:
1) Computer and operating system basics (foundations for understanding reverse engineering)
2) Reverse engineering with two common tools (Ghidra and Radare2)
Feel free to skip the first part if you are already familiar with these concepts.
Computer architecture
There are a variety of computer architectures, the most prominent of which include both the Harvard and von Neumann concepts.
Harvard Concept
The Harvard concept is widespread in various fields, particularly where high efficiency and speed in data processing are required. The Harvard architecture refers to a concept where instructions and data are stored in separate memories. The processor (CPU) has a separate bus for the instruction memory and the data memory. To communicate with the outside world, such as users or other computers, input and output units (input and output devices) are required. Possible input devices include keyboard, mouse, microphone, or network devices, and possible output devices include display, speaker, or network devices.
Von Neumann Concept
Most of today's commercial computers for private users are based on the von Neumann concept. A key aspect of the von Neumann architecture is storing data and programs in the same memory. The processor is connected to the memory and the I/O units through the same bus system, allowing it to access the stored instructions and associated data in the memory as well as I/O data. To communicate with the outside world, such as users or other computers, input and output units (input and output devices) are also required. Possible input devices include keyboard, mouse, microphone, or network devices, and possible output devices include display, speaker, or network devices.
The following chapters will exclusively focus on the von Neumann concept, even though there are similarities between the concepts mentioned above.
The Processor
The processor (CPU) is the central component of every computer and essentially consists of the following components:
- Arithmetic and Logic Unit (ALU): Performs arithmetic and logical operations.
- Working or Arithmetic Registers: Store operands and results temporarily during operations.
- Control Unit: Manages internal components of the processor such as ALU, registers, bus systems, and other resources.
Each processor has its own instruction set, dependent on the processor type or architecture. The instruction set comprises a set of commands. Each command is associated with a bit pattern known as the instruction code, machine code, or opcode. Commands stored in memory are executed sequentially. The sequence of commands in memory is referred to as a program. Depending on the commands in the program, the processor performs various operations. When designing a computer architecture, the functionality of the computer is defined in part by the processor's instruction set.
The processor includes an internal bus system and interfaces with external address, data, and control buses (the control bus is not depicted here for simplicity).
The address bus is used to communicate with devices such as memory cells, input/output devices, etc., for interaction purposes. The data bus is used to exchange data with addressed devices. The control bus has diverse functionalities, such as controlling the direction of data exchange on the data bus.
The highly simplified and schematically represented processor in Figure 3 features a working register A, used, for example, to store interim results. All arithmetic and logical commands are executed by the Arithmetic Logic Unit (ALU), where in this case, the working register A must always contain one of the operands. For operations involving two operands, the second operand is loaded from memory by specifying its address through the intermediate register ZR. The result of the operation is then written back into working register A, replacing its previous content. Depending on the result of an operation, the ALU sets individual bits in the status register (Flag Register F). These bits, known as flags, indicate the outcome of the executed operation. Examples of flags in the status register include the Zero Flag, Carry Flag, and Overflow Flag. The state of these flags in the status register can be checked and used, for instance, for conditional jump commands to execute conditional branching in the program.
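The flag behaviour described above can be sketched in C. This is a minimal sketch under assumptions: an 8-bit working register, and a hypothetical `Flags` structure and `alu_add` function that are not part of any real processor's interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag register layout for an 8-bit toy ALU. */
typedef struct {
    bool zero;     /* result is 0 */
    bool carry;    /* unsigned result did not fit into 8 bits */
    bool overflow; /* signed result did not fit into 8 bits */
} Flags;

/* Adds an operand to working register A and updates the flags.
 * The returned value is the new content of register A. */
uint8_t alu_add(uint8_t a, uint8_t operand, Flags *f)
{
    uint16_t wide = (uint16_t)a + (uint16_t)operand;
    uint8_t result = (uint8_t)wide;

    f->zero = (result == 0);
    f->carry = (wide > 0xFF);
    /* Signed overflow: both inputs share a sign that differs from the result's sign. */
    f->overflow = (((a ^ result) & (operand ^ result) & 0x80) != 0);
    return result;
}
```

For example, adding 1 to 127 sets the overflow flag (the signed result no longer fits), while adding 1 to 255 sets the zero and carry flags.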
The control unit houses all the logic necessary for executing commands and controlling the resources accordingly.
Command Execution
Essentially, the following cyclic scheme outlines the execution of an instruction:
- FETCH (Fetch instruction from memory)
- EXECUTE (Execute instruction)
During the Fetch phase, the instruction located at the address stored in the program counter is loaded into the instruction register. Subsequently, the program counter is incremented by one. If an instruction spans multiple words, the control unit repeats this process until the entire instruction is loaded.
During the Execute phase, the instruction is executed. For example, an address is issued through an intermediate register, an operand is fetched from that address, it is added to the second operand already stored in working register A, the result is then stored back into working register A, and corresponding flags are set in the flag register.
Branch instructions allow for interruption or alteration of the sequential program flow. In the case of a branch instruction, the program flow continues at any other point defined by the branch instruction.
The processor presented in this section is a highly simplified model. Commercial processors typically feature a multitude of working registers, registers with specialized functions, and data pathways beyond those outlined here for address and data buses.
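The fetch and execute cycle can be illustrated with a small simulator. This is a sketch of a toy accumulator machine, not a real instruction set: the opcodes, the two-cell instruction format (opcode followed by an address), and the names `Cpu`, `cpu_run`, and `cpu_demo` are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical opcodes of a toy accumulator machine. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3, OP_JMP = 4, OP_JZ = 5 };

typedef struct {
    uint8_t pc;       /* program counter */
    uint8_t a;        /* working register A */
    int zero;         /* zero flag */
    uint8_t mem[256]; /* one memory for instructions and data (von Neumann) */
} Cpu;

/* Runs until HALT or until max_steps instructions were executed.
 * Returns the number of executed instructions. */
int cpu_run(Cpu *cpu, int max_steps)
{
    int steps = 0;
    while (steps < max_steps) {
        /* FETCH: each instruction spans two cells, so the PC is advanced twice. */
        uint8_t opcode = cpu->mem[cpu->pc++];
        uint8_t addr = cpu->mem[cpu->pc++];
        steps++;

        /* EXECUTE */
        switch (opcode) {
        case OP_LOAD:
            cpu->a = cpu->mem[addr];
            break;
        case OP_ADD:
            cpu->a = (uint8_t)(cpu->a + cpu->mem[addr]);
            cpu->zero = (cpu->a == 0);
            break;
        case OP_STORE:
            cpu->mem[addr] = cpu->a;
            break;
        case OP_JMP: /* branch: continue at another address */
            cpu->pc = addr;
            break;
        case OP_JZ:  /* conditional branch depending on the zero flag */
            if (cpu->zero) cpu->pc = addr;
            break;
        default:     /* OP_HALT or an unknown opcode stops the machine */
            return steps;
        }
    }
    return steps;
}

/* Demo program: mem[18] = mem[16] + mem[17]. */
uint8_t cpu_demo(void)
{
    Cpu cpu = {0};
    const uint8_t program[] = { OP_LOAD, 16, OP_ADD, 17, OP_STORE, 18, OP_HALT, 0 };
    memcpy(cpu.mem, program, sizeof program);
    cpu.mem[16] = 5;
    cpu.mem[17] = 3;
    cpu_run(&cpu, 100);
    return cpu.mem[18];
}
```

Calling `cpu_demo()` loads 5 into register A, adds 3, and stores the result 8 in cell 18.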
Memory
The memory can essentially be envisioned as a one-dimensional array of field elements with a fixed bit width, where each field element can be accessed by a unique address.
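In C, this view of memory maps directly onto an array. The sketch below assumes 8-bit cells and 256 addresses; the function names and the little-endian byte order for 16-bit values are illustrative choices, not properties of a specific machine.

```c
#include <stdint.h>

#define MEM_CELLS 256

/* Memory as a one-dimensional array: the index is the address, each cell holds 8 bits. */
static uint8_t memory[MEM_CELLS];

void mem_write8(uint8_t address, uint8_t value) { memory[address] = value; }
uint8_t mem_read8(uint8_t address) { return memory[address]; }

/* A 16-bit value spans two neighbouring cells; little-endian order is assumed here. */
void mem_write16(uint8_t address, uint16_t value)
{
    memory[address] = (uint8_t)(value & 0xFF);
    memory[(uint8_t)(address + 1)] = (uint8_t)(value >> 8);
}

uint16_t mem_read16(uint8_t address)
{
    return (uint16_t)(memory[address] | (memory[(uint8_t)(address + 1)] << 8));
}
```

Writing 0xBEEF to address 0x10 places 0xEF in cell 0x10 and 0xBE in cell 0x11.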
Hardware programming
Machine Commands
Typically, a machine instruction consists of an opcode followed by an address portion. An opcode is a unique bit pattern interpreted by the control unit, prompting it to execute all necessary steps for that particular instruction. The address portion, on the other hand, is used to address one or more operands. Machine instructions can either vary in length, which complicates and slows down the control unit, or have a fixed length, which avoids these drawbacks.
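As an illustration, a fixed-length instruction can be packed and unpacked with shifts and masks. The 16-bit format used here (4-bit opcode, 12-bit address) and the names `Instruction`, `encode_instruction`, and `decode_instruction` are assumptions for this sketch and do not describe a real processor.

```c
#include <stdint.h>

typedef struct {
    uint8_t opcode;   /* stored in the upper 4 bits of the instruction word */
    uint16_t address; /* stored in the lower 12 bits */
} Instruction;

/* Packs opcode and address portion into one 16-bit instruction word. */
uint16_t encode_instruction(Instruction in)
{
    return (uint16_t)(((in.opcode & 0x0Fu) << 12) | (in.address & 0x0FFFu));
}

/* Splits an instruction word back into opcode and address portion,
 * as a control unit would do when decoding. */
Instruction decode_instruction(uint16_t word)
{
    Instruction in;
    in.opcode = (uint8_t)(word >> 12);
    in.address = (uint16_t)(word & 0x0FFFu);
    return in;
}
```

With this layout, opcode 0x2 and address 0x123 are encoded as the word 0x2123.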
Addressing Types:
Register addressing:
In register addressing, only registers are referenced.
MOV R1, R2 ; Copy the value from register R2 to register R1
Immediate addressing:
In immediate addressing, the operand itself is specified in the address part of the machine instruction.
MOV R0, #5 ; Load the immediate value 5 into register R0
Absolute addressing:
Absolute addressing specifies the memory address of the operand in the address part of the machine instruction.
MOV EAX, [0x1234] ; Loads the value from memory address 0x1234 into register EAX
Relative addressing:
Relative addressing specifies the desired address as an offset relative to a base address. The base address is typically stored in a defined register, and the offset is provided in the machine instruction. The Program Counter (PC) contents are often used as the base address.
LDR R0, [PC, #4] ; Load the value from address (PC + 4) into register R0
Indexed Addressing:
Indexed addressing is similar to relative addressing. It also calculates an offset from a base address. However, in indexed addressing, the address is provided in the instruction, and the offset is fetched from a special register called the index register.
MOV EAX, [0x1000 + ESI] ; Load the value from memory address (0x1000 + ESI) into EAX; ESI serves as the index register
Indirect Addressing (Pointer):
In indirect addressing, the actual address resides in a memory location. This means that the address of the memory location containing the desired address is specified.
LDR R0, [R1] ; Load the value from the memory address stored in R1 into R0
Note:
The examples provided here are presented in assembler code for readability and understanding, and should be interpreted as pseudocode until the introduction of assembler language.
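The addressing types differ only in how the operand is located. The following C sketch makes this explicit for a hypothetical machine with four registers and a small word-addressed memory; the `Machine` structure and all function names are assumptions, and bounds checks are omitted for brevity.

```c
#include <stdint.h>

typedef struct {
    uint32_t r[4];    /* general-purpose registers R0..R3 */
    uint32_t pc;      /* program counter */
    uint32_t mem[64]; /* word-addressed memory; callers must stay within 0..63 */
} Machine;

/* Register addressing: the operand is the content of a register. */
uint32_t load_register(const Machine *m, int reg) { return m->r[reg]; }

/* Immediate addressing: the operand is part of the instruction itself. */
uint32_t load_immediate(uint32_t value) { return value; }

/* Absolute addressing: the instruction contains the operand's memory address. */
uint32_t load_absolute(const Machine *m, uint32_t addr) { return m->mem[addr]; }

/* Relative addressing: base address in a register (here the PC), offset in the instruction. */
uint32_t load_relative(const Machine *m, int32_t offset)
{
    return m->mem[m->pc + (uint32_t)offset];
}

/* Indexed addressing: base address in the instruction, offset in an index register. */
uint32_t load_indexed(const Machine *m, uint32_t base, int index_reg)
{
    return m->mem[base + m->r[index_reg]];
}

/* Indirect addressing: the memory cell at pointer_addr holds the operand's address. */
uint32_t load_indirect(const Machine *m, uint32_t pointer_addr)
{
    return m->mem[m->mem[pointer_addr]];
}
```

All of these functions can be pointed at the same memory cell, which shows that only the address calculation changes, not the data access itself.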
Subprograms
To save memory, frequently executed program segments are stored in memory only once. These commonly executed program segments, also known as functions, can be accessed using specific jump instructions. Depending on the instruction, either the jump address is immediately loaded into the Program Counter (PC) and execution jumps to that address, or the current Program Counter contents are stored first, followed by loading the jump address into the PC. This allows the program flow to resume at the point where the function was called after exiting the function.
In ARMv7, for example, you can jump to a subroutine (function) using the following two commands:
BAL target_label ; Branch Always: Jump immediately to the address specified by "target_label"
BL function_label ; Branch with Link: The current PC contents, which hold the address of the next instruction, are stored in the Link Register (LR). Then, jump to the address specified by "function_label".
A function called as a subroutine can be exited using a specific return instruction.
In ARMv7, after completing a subroutine, you can return to the calling function using the following commands:
BX LR ; "Branch and Exchange": The processor jumps to the address stored in the specified register (LR).
MOV PC, LR ; Move the contents of the Link Register (LR) into the Program Counter (PC).
These instructions allow for efficient management of program flow, returning execution to the point where the subroutine was called. To store the Program Counter and other data (such as register contents), the stack can be utilized. The stack is a simple last-in, first-out (LIFO) managed memory structure. This method of management facilitates handling of nested subroutine calls. Typically, the stack is located in main memory, and the processor maintains the memory address of the most recently stored item in a dedicated register known as the Stack Pointer (SP).
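How a stack supports nested subroutine calls can be sketched as follows. The `Stack` type and the functions `sub_call` and `sub_return` are hypothetical. Unlike the description above, `sp` counts the used slots here, so the most recently stored item is at `slots[sp - 1]`.

```c
#include <stdint.h>

#define STACK_SIZE 16

typedef struct {
    uint32_t slots[STACK_SIZE];
    int sp; /* number of used slots; slots[sp - 1] is the most recently stored item */
} Stack;

/* Stores a value on top of the stack; returns -1 if the stack is full. */
int stack_push(Stack *s, uint32_t value)
{
    if (s->sp >= STACK_SIZE) return -1;
    s->slots[s->sp++] = value;
    return 0;
}

/* Removes the most recently stored value (last in, first out); returns -1 if empty. */
int stack_pop(Stack *s, uint32_t *value)
{
    if (s->sp <= 0) return -1;
    *value = s->slots[--s->sp];
    return 0;
}

/* Subroutine call: save the address of the next instruction, then jump to the target. */
int sub_call(Stack *s, uint32_t *pc, uint32_t next_instruction, uint32_t target)
{
    if (stack_push(s, next_instruction) != 0) return -1;
    *pc = target;
    return 0;
}

/* Return: restore the saved address into the program counter. */
int sub_return(Stack *s, uint32_t *pc)
{
    return stack_pop(s, pc);
}
```

Because the stack is LIFO, two nested calls followed by two returns resume execution first in the inner caller and then in the outer caller.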
Note:
The examples provided here have been presented in assembler code for readability and understanding. At this point, they should be interpreted as pseudocode, pending the introduction of assembler language.
Higher-level programming languages and abstraction concepts
The two main problems of hardware programming described so far are that machine instruction bit patterns are difficult to remember, and programs in machine code are hard to relocate due to their absolute addresses.
Assembler
The solution to the issues of poor memorability and readability of machine code involves introducing an abstraction where machine instructions are associated with words and symbols used in programming instead of directly using the machine instruction bit patterns in source code files. Now, programs can be written in text files using these words and symbols, known as assembly language, rather than machine instruction bit patterns. These text files are processed by a translation program called an assembler, which converts them back into machine instruction bit patterns, transforming the word- and symbol-based program into executable code. Text files containing assembly code are commonly referred to as assembly source files and are typically saved with the extension ".s". Files containing the output translated by the assembler are called object files and are usually saved with the extension ".o".
Here's an example of an assembly program that adds two registers:
.global _start
_start:
    movl $0, %esi
    movl $3, %eax
    add %eax, %esi
An assembler program can be translated into machine code on Linux using the following command:
as test.s -o test.o
Please note, this creates an unlinked program object in machine code!
To display machine code as assembler code, i.e., to disassemble it under Linux, you can use the following command:
objdump -d test.o
Linker
The solution to the relocation problem involves introducing symbolic addresses, which act as "variables" called "labels" and are definitively set by the linker. The assembler initially fills these symbolic addresses with zeros as placeholders and creates relocation information tables that instruct the linker to insert the addresses of these labels at the respective locations.
From now on, programs can be divided into components and developed independently. Files containing the output produced by the linker are executable files known as binaries. They receive an extension dependent on the operating system (e.g., ".exe" in Windows).
Unlinked object files can be linked into a binary under Linux using the following command:
ld test.o -o test
This command links the object file "test.o" into an executable binary named "test".
Compiler
Writing program logic in assembler is unfortunately still somewhat cumbersome. Firstly, machine instructions expressed in words or symbols, despite commonalities across different processors, remain heavily dependent on the specific processor architecture. Secondly, human thought constructs can be complex and may be implemented differently or with difficulty in various processor assemblers. This situation has led to another level of abstraction: the development of higher-level programming languages. These languages provide means to express logical constructs in ways that are more easily understandable to humans and can be formulated universally across different processor instruction sets. From this point onward, primitives can be formulated in these higher-level languages, which are then translated into the specific assembler instruction set of a processor by a program called a compiler.
Compilation Process Example
Here is a highly simplified example illustrating the respective representations of a sample program:
- Example.c:
#include <stdlib.h>

int main() {
    exit(0);
}
Compilation process:
Command to invoke only the GCC compiler under Linux:
gcc -S Example.c
- Example.s:
main:
    pushl %ebp
    movl %esp, %ebp
    pushl $0
    call exit
    addl $4, %esp
    movl %ebp, %esp
    popl %ebp
    ret
Assembly Process:
Command under Linux to translate an assembly program into machine code:
as Example.s -o Example.o
- Example.o:
0000 55
0001 89E5
0003 6A00
0005 E800000000
000a 83C404
000d 89EC
000f 5D
0010 C3
main: 0
0006: exit ADDR32 ; Relocation information: insert the address of "exit" at address 0006
Linking Process:
Command under Linux to combine object files into a binary:
ld Example.o -o Example
- Example (the linked binary; under Windows it would be named Example.exe):
…
0030 55
0031 89E5
0033 6A00
0035 E848010000 ; Address of the function "exit" was inserted by the linker
003a 83C404
003d 89EC
003f 5D
0040 C3
…
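The linker's patching step in this example can be sketched in C. The `Relocation` structure and `apply_relocation` function are hypothetical and greatly simplified. A real linker also decides, based on the relocation type, whether the inserted value is an absolute address or an offset, whereas this sketch only writes a given 32-bit value in little-endian byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified relocation entry, modelled on the "0006: exit ADDR32" line above. */
typedef struct {
    size_t offset;      /* position of the 4-byte placeholder in the code */
    const char *symbol; /* name of the symbol whose address is still missing */
} Relocation;

/* Overwrites the placeholder with the resolved 32-bit value (little-endian).
 * Returns -1 if the placeholder would lie outside the code buffer. */
int apply_relocation(uint8_t *code, size_t code_len, Relocation rel, uint32_t value)
{
    if (rel.offset > code_len || code_len - rel.offset < 4) return -1;
    for (int i = 0; i < 4; i++)
        code[rel.offset + i] = (uint8_t)(value >> (8 * i));
    return 0;
}
```

Applied to the bytes of Example.o with offset 6 and the value 0x00000148, the placeholder `00 00 00 00` becomes `48 01 00 00`, which matches the listing of the linked binary.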
Working with an operating system
Despite higher-level programming languages, challenges such as operating specific hardware, managing multitasking, and handling virtual memory persist. The concept of an operating system (OS) was introduced to address these issues. An operating system is a program or a collection of programs that performs various tasks, including:
- Loading and executing programs
- Managing multitasking
- Virtualizing memory, allowing each program to be written as if it starts at address 0 without conflict in multitasking scenarios (this is achieved, for example, using specialized hardware like a Memory Management Unit, MMU)
- Managing hardware and providing hardware abstractions through drivers, eliminating the need for every programmer to understand the intricate details of hardware operation (e.g., configuring voltage levels manually for network card operations)
- Providing a user-friendly interface for users
To utilize the functionalities of an operating system, various programming interfaces (APIs) were developed and standardized. These APIs offer similar functions (known as primitives). One prominent example is the Portable Operating System Interface (POSIX) standard (ISO/IEC/IEEE 9945), which has been largely implemented by operating systems like Linux.
Examples of primitives offered by POSIX in a higher-level programming language like C include:
* int open(const char *path, int oflag, ...) for opening files
* int close(int fildes) for closing files
* pid_t fork(void) for creating a new process
* void (*signal(int sig, void (*func)(int)))(int) for registering signal handler functions
* ...
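A short sketch of how some of these primitives are used from C follows. The helper names `count_bytes` and `run_child` are made up for this example. In addition to the primitives listed above, it relies on the POSIX functions `read`, `waitpid`, and `_exit`.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Opens a file, counts its bytes with read(), and closes it again.
 * Returns the byte count, or -1 on error. */
long count_bytes(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    char buf[4096];
    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;

    close(fd);
    return n < 0 ? -1 : total;
}

/* Creates a child process with fork(); the child terminates immediately with
 * exit_code, and the parent waits for it and returns the observed exit code. */
int run_child(int exit_code)
{
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(exit_code); /* child process */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1; /* parent process */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, `run_child(7)` returns 7 in the parent, and `count_bytes("/dev/null")` returns 0 on systems that provide that device.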
At the hardware level, these primitives are implemented through system calls to the operating system. These system calls are highly dependent on the operating system and processor.
For instance, the Linux kernel maintains a list of all system calls it provides, known as the System Call Table. Each system call is associated with a unique number and a kernel internal function that performs the actual tasks. To invoke a system call, for example on x86 machines, one loads the number of the desired call into the EAX register and then triggers software interrupt 128 (hexadecimal: 0x80). The arguments for the system call are passed in CPU registers (on 32-bit x86 Linux: EBX, ECX, EDX, ESI, EDI, and EBP), similar to a fastcall-style calling convention.
The software interrupt, also known as an exception, halts the execution of the program in user mode and triggers the execution of an exception handler in kernel mode. The kernel's exception handler reads the EAX register and, if it contains a valid system call number, invokes the corresponding kernel function from the System Call Table with the arguments stored in the other registers. After validating the arguments, the kernel performs the requested tasks. Upon completion of this function, the exception handler finishes its work, and normal program execution resumes.
Example of using a POSIX system call under Linux:
.global _start
_start:
    MOV R0, #1 @ Set R0 to define the data stream (STDOUT here). This data stream is typically connected to the console.
    LDR R1, =message @ Load R1 with a pointer to the data to be written.
    LDR R2, =len @ Load R2 with the length of the data to be written.
    MOV R7, #4 @ Set R7 to define the desired system call (here, write to STDOUT).
    SWI 0 @ Trigger software interrupt 0 (syscall).
    MOV R7, #1 @ Set R7 to define the exit system call.
    SWI 0 @ Trigger software interrupt 0 (syscall).
.data
message:
    .asciz "Hello World! \n" @ Define the message to be printed.
len = .-message @ Calculate the length of the message.
Explanation:
Register R0 is used to define the data stream (in this case, STDOUT), which is the default output connected to the console. Register R1 serves as a pointer to the data that will be written. Register R2 indicates the length of the data to be written. Register R7 is used to specify the desired system call (in this case, writing to STDOUT).
Finally, the operating system must be instructed to terminate the program so that data processing does not continue indefinitely. This is achieved with system call number 1 (exit), which is again triggered through software interrupt 0.
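The same mechanism can be tried from C without writing assembly. The sketch below assumes Linux with glibc, whose `syscall()` wrapper places the system call number and arguments into the registers expected on the current architecture. The wrapper names `raw_write` and `raw_getpid` are made up for this example.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invokes the write system call directly by its number instead of
 * going through the usual write() library function. */
long raw_write(int fd, const char *data, size_t len)
{
    return syscall(SYS_write, fd, data, len);
}

/* Invokes the getpid system call directly by its number. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

Calling `raw_write(1, "Hello World!\n", 13)` writes the message to STDOUT, just like the assembly program above, and `raw_getpid()` returns the same value as the library function `getpid()`.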
Reverse Engineering
Description
Reverse engineering is the process of analyzing a finished product or system to understand its design, structure, and functionality. This process is frequently applied in software development, as well as in hardware development and other engineering disciplines.
Applications of Reverse Engineering
Software Development:
- Bug Fixing and Debugging
- Compatibility and Interoperability
- Security Analysis
- Detecting License Violations
Hardware Development:
- Product Analysis
- Repair and Maintenance
- Manufacturing Spare Parts
Typical Steps in Reverse Engineering
1. Collecting and gathering information about the target product
2. Disassembly and Decompilation (converting binary code or machine code into a higher-level programming language or human-readable form)
3. Analysis and Documentation of the individual components and their functionality
4. Recovery and Reconstruction
Legal and Ethical Considerations
Reverse engineering can raise legal and ethical questions, particularly concerning intellectual property. While it is legal in many cases, there are certain situations where it can lead to legal disputes:
- Copyright: Copying or modifying protected code without the rights holder's permission
- Patents: Patent infringement if reproduced products use patented technologies
- Terms of Use: Many software licenses explicitly prohibit reverse engineering
Tools and Techniques
There are two common tools which are free and open-source:
* Ghidra
* Radare2
Ghidra
Description
Ghidra is open-source software created by the National Security Agency (NSA) and publicly released in 2019. It offers a powerful environment for reverse engineering and malware analysis. Ghidra supports a wide variety of processor architectures and file formats, making it a versatile tool for security researchers and developers.
Installation
Requirements:
* Java Development Kit (JDK) 11 or higher (recent Ghidra releases require a newer JDK; check the release notes of the version you download)
* At least 4 GB of RAM (8 GB or more recommended)
Steps:
* Download the latest version of Ghidra from the official website: https://ghidra-sre.org/
* Unzip the downloaded archive to a directory of your choice
* Ensure that the JDK is installed and the 'JAVA_HOME' environment variable is set correctly
Starting Ghidra:
Navigate to the directory where you unzipped Ghidra and run the start script:
* Linux/macOS: ./ghidraRun
* Windows: ghidraRun.bat
Basic Features
Creating a new project:
1. Start Ghidra and create a new project ('File -> New Project')
2. Choose a project type (usually Non-shared Project) and specify a location and name for the project
3. Import the binary file you want to analyze ('File -> Import File')
Analyzing the Binary:
1. After importing the file, double-click on it in the explorer to open it
2. Ghidra will offer to perform an initial analysis of the file. Confirm the suggested settings and start the analysis.
Disassembly and Decompilation:
- Disassembly: Shows the machine code of the binary file as assembly instructions. This is useful for understanding the low-level execution of the file.
- Decompilation: Converts the machine code into a higher-level, human-readable form (similar to C code).
Advanced Features
- Scripting
Ghidra supports scripting to automate complex analyses. You can write scripts in Java or Python (Jython).
- Debugging
Ghidra can be integrated with external debuggers to perform dynamic analysis. This allows you to set breakpoints and step through the code.
Of course there are many other notable features, but these are beyond the scope of this documentation.
Radare2
Description
Radare2 is a comprehensive framework for reverse engineering and binary analysis. It offers a wide range of features, from static and dynamic analysis to patching and debugging support. Due to its power and flexibility, radare2 is popular among security experts, malware analysts, and developers.
Installation
Radare2 can be installed on various operating systems (it comes pre-installed on Kali Linux). These are the steps for the most common platforms:
- Linux
git clone https://github.com/radareorg/radare2.git
cd radare2
sys/install.sh
- macOS
brew install radare2
- Windows
Download the installation package from the official website and follow the instructions.
Basic Commands
After installation, you can start radare2 by typing 'r2' followed by the path to the binary you want to analyze:
r2 /path/to/binary
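Typically you start by typing 'V' to enter visual mode, which opens in hex view. The following commands are commonly used. 'p', 'P', and 'q' are single keys in visual mode; 'aaa', 's', 'afl', and 'VV' are typed at the radare2 prompt, which can be reached from visual mode with ':'.
- p - rotate to the next view (hex, disassembly, ...)
- P - rotate to the previous view
- aaa - analyze the binary so that the analysis-based commands below can be used
- s - seek to an address or symbol, for example 's main' (use '/' to search for a keyword)
- afl - list all functions found by the analysis
- q - quit visual mode and return to the prompt (at the prompt, 'q' exits radare2)
- ? - help and information
- VV - opens visual graph mode with a graphical representation of the current function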
Advanced Features
- Scripting
With radare2, you can write scripts in various languages to perform complex analyses and automations.
- Debugging
Radare2 also supports debugging binaries:
r2 -d /path/to/binary
In debug mode, additional commands are available:
- db - sets a breakpoint
- dc - continue execution
- dr - displays register contents
Of course there are many other notable features, but these are beyond the scope of this documentation.
Training Code
simpleMath.c
#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int subtract(int a, int b) {
    return a - b;
}

int multiply(int a, int b) {
    return a * b;
}

int main() {
    int x = 5;
    int y = 3;
    int sum = add(x, y);
    int diff = subtract(x, y);
    int prod = multiply(x, y);
    printf("Sum: %d\n", sum);
    printf("Difference: %d\n", diff);
    printf("Product: %d\n", prod);
    return 0;
}
Don't forget to compile the file first with gcc, for example:
gcc simpleMath.c -o simpleMath
References
- https://ghidra-sre.org/
- https://rada.re/n/radare2.html
- https://github.com/radareorg/radare2
- https://github.com/NationalSecurityAgency/ghidra
- https://beginners.re/RE4B-DE.pdf
- Hardware fundamentals: Friedrich Bauer - lecture "Digitale Systeme" - Technische Universität Wien, Institut für Computertechnik, WS 2016/17
- Carl Hamacher, Zvonko Vranesic, Safwat Zaky, Naraig Manjikian - Computer Organization and Embedded Systems (Sixth Edition) (McGraw-Hill International Edition)
- Horst Schirmeier - lecture "Betriebssysteme" - Technische Universität Dortmund, SS 2020 https://www.youtube.com/watch?v=DX1wmistewI&list=PLOlqF42t6O1vcrGTDEagGE5PrVrO0ovqf
- https://www.linux-magazin.de/ausgaben/2004/08/kern-technik/
- https://articles.manugarg.com/systemcallinlinux2_6.html
- https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/