Buffer Overflows

From Embedded Lab Vienna for IoT & Security
Revision as of 16:52, 5 February 2020 by Ikramer

Summary

Since the rise of C in the early 1970s, buffer overflows have been a serious class of security vulnerability. Even though memory-safe high-level programming languages are typically not affected, the number of vulnerable systems is still rising.

At the same time, a wide array of countermeasures is increasingly being adopted. Features like executable space protection (e.g. Data Execution Prevention under Windows) have been deployed since the mid-2000s, and on the compiler side, technologies like StackGuard support several detection and prevention mechanisms (e.g. different types of Canaries). Furthermore, almost every widely used operating system supports Address Space Layout Randomization (ASLR), which makes buffer overflows harder to exploit reliably. For example, at the beginning of 2020, most major operating systems (Linux, Windows, macOS, iOS, Android, Solaris, OpenBSD, etc.) offer support for ASLR.

Requirements

  • Operating system: not limited
  • A vulnerable library (or function) within the attacked binary

Description

A buffer overflow occurs when more data is written to a memory region than it can hold. For example, a C program might write user input directly into a character array with a size of ten bytes. If the user enters more than ten characters and the program copies that data into the smaller array without checking its length, an overflow occurs.

Basic Vulnerability

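#include <string.h>

int main(int argc, char *argv[]) {
    char buffer[6];             /* room for 5 characters plus the terminating '\0' */
    if (argc < 2) {
        return 1;               /* no argument given; argv[1] would be NULL */
    }
    strcpy(buffer, argv[1]);    /* no bounds check: longer arguments overflow buffer */
    return 0;
}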

In this example, an argument passed to the executable (i.e. the binary compiled from this source) with 6 or more characters overflows the buffer, because strcpy also writes the terminating null byte. However, the input size needed to actually affect the program flow is usually larger and depends on the stack layout chosen by the compiler; on 32-bit binaries, stack slots are typically aligned to 4 bytes.
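The length condition described above can be sketched as a check performed before copying. This is a minimal illustration, not a complete hardening strategy; the helper names fits_in_buffer and checked_copy are hypothetical and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if src, including its terminating '\0',
 * fits into a destination of dst_size bytes, and 0 otherwise. */
int fits_in_buffer(const char *src, size_t dst_size) {
    assert(src != NULL);
    return strlen(src) < dst_size;
}

/* Hypothetical helper: copies src into dst only if it fits.
 * Returns 0 on success and -1 if the copy would overflow dst. */
int checked_copy(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL);
    if (!fits_in_buffer(src, dst_size)) {
        return -1;                          /* refuse instead of overflowing */
    }
    memcpy(dst, src, strlen(src) + 1);      /* + 1 copies the '\0' as well */
    return 0;
}
```

With a 6-byte buffer, checked_copy accepts a 5-character string such as "hello" and rejects a 6-character string such as "hello!", since the latter needs 7 bytes once the terminator is included.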

These vulnerabilities can be exploited in several different ways. Most prominently, return-oriented programming (ROP) attacks target the return addresses stored on the stack. By overwriting a return address, the attacker influences the control flow of the program. This remains viable even when security features such as executable-space protection are in place: instead of injecting executable instructions (which the OS might block), the attacker chains gadgets (small instruction sequences, typically ending in a return instruction) that already exist in the binary, and controls where each one returns to.

Countermeasures

Canaries

Canaries can be used to detect whether a buffer overflow has occurred. In the simplest case, a Canary is a specific value placed in memory directly after the buffer it protects. If that value changes at any point, it is safe to assume that a buffer overflow occurred, and countermeasures (e.g. termination) can be taken. Typical types of Canaries, which are supported by security hardening technologies like ProPolice or StackGuard (GCC), are Terminator Canaries, Random Canaries, and Random XOR Canaries.
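The idea can be illustrated with a small simulation. This is only a sketch of the principle, not how StackGuard or ProPolice implement it: compiler-inserted Canaries are placed automatically and random Canaries are chosen at run time. The struct layout, the constant CANARY_VALUE, and all function names here are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative fixed value; a real random Canary is chosen at run time. */
#define CANARY_VALUE 0xDEADC0DEu

/* Hypothetical layout: the Canary sits directly after the buffer it guards. */
struct guarded_buffer {
    char     data[8];
    uint32_t canary;
};

void guarded_init(struct guarded_buffer *g) {
    assert(g != NULL);
    memset(g->data, 0, sizeof g->data);
    g->canary = CANARY_VALUE;
}

/* Simulates an unchecked write of len bytes starting at data[0].
 * The length is clamped to the size of the struct, so the demonstration
 * itself never writes outside memory it owns. */
void simulated_unchecked_write(struct guarded_buffer *g,
                               const void *src, size_t len) {
    assert(g != NULL && src != NULL);
    if (len > sizeof *g) {
        len = sizeof *g;
    }
    memcpy(g, src, len);
}

/* Returns 1 if the Canary still holds its original value, 0 otherwise. */
int canary_intact(const struct guarded_buffer *g) {
    assert(g != NULL);
    return g->canary == CANARY_VALUE;
}
```

A write of 8 bytes stays inside data and leaves the Canary untouched, while a write of 12 bytes runs past the buffer and changes it, which canary_intact then reports. A real program would terminate at that point instead of continuing.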

Data Execution Prevention (DEP)

Data Execution Prevention (DEP) is a security feature originally developed by Microsoft for Windows XP SP2. There are two basic variants: hardware-based DEP and software-based DEP. If supported by the CPU as well as the process, hardware DEP is used; otherwise DEP has to be carried out in software, as part of the Windows operating system. The basic function of DEP is to prevent applications from executing code in memory areas marked as non-executable.

Address Space Layout Randomization (ASLR)

Since code reuse attacks (e.g. ROP attacks) require the attacker to know the memory addresses of gadgets, techniques that randomize these addresses have become increasingly popular. Randomizing the location of data and code regions, as ASLR does, offers a plausible defensive strategy: code region layout randomization hinders code reuse in exploits, and data randomization impedes the redirection of control flow by making it difficult to guess the location of injected code (partly paraphrased from the 2013 paper by Snow et al.).
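One way to observe the effect is to record the address of a stack variable, a heap allocation, and a function, and compare the values across several runs of the same program. The following is a minimal sketch; the names region_sample, sample_regions, and print_regions are hypothetical, and converting a function pointer to an integer is an assumption that holds on mainstream platforms but is not guaranteed by ISO C.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record of one address per memory region. */
struct region_sample {
    uintptr_t stack;
    uintptr_t heap;
    uintptr_t code;
};

/* Samples one address from the stack, the heap, and the code region. */
struct region_sample sample_regions(void) {
    struct region_sample s;
    int local = 0;
    void *block = malloc(16);
    assert(block != NULL);

    s.stack = (uintptr_t)&local;
    s.heap  = (uintptr_t)block;
    /* Assumption: function pointers convert to uintptr_t on this platform. */
    s.code  = (uintptr_t)&sample_regions;

    free(block);
    return s;
}

/* Prints the sampled addresses so that separate runs can be compared. */
void print_regions(const struct region_sample *s) {
    assert(s != NULL);
    printf("stack: 0x%" PRIxPTR "\n", s->stack);
    printf("heap:  0x%" PRIxPTR "\n", s->heap);
    printf("code:  0x%" PRIxPTR "\n", s->code);
}
```

Calling sample_regions and print_regions from main and running the program several times should show changing stack and heap addresses when ASLR is active. The code address usually changes only if the binary was built as a position-independent executable.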
