Safe list unlinking

History

Before we talk about safe list unlinking, let's discuss unsafe list unlinking.

In 2001, Michel "MaXX" Kaempf wrote the Phrack Article "Vudo malloc tricks" [1]. In this article, he describes multiple different techniques for corruption and exploitation of the heap. For this discussion, we're interested in topic 3.6.1 - The unlink() technique.

Gaining arbitrary code execution using the unlink() technique was first introduced by "Solar Designer". The technique tricks dlmalloc into processing carefully crafted forward, fd, and backward, bk, pointers of a chunk being unlink()ed from a doubly linked list. Abusing the unlink() macro in this manner provides an attacker the ability to conduct an arbitrary write to any location in memory which can lead to arbitrary code execution. The unlink() macro is as follows:

#define unlink( P, BK, FD ) {   \
    BK = P->bk;                 \
    FD = P->fd;                 \
    FD->bk = BK;                \
    BK->fd = FD;                \
}

The proof of concept provided in this article leverages a heap buffer overflow to overwrite the chunk metadata of a succeeding chunk, clearing the PREV_INUSE bit and tricking malloc into thinking that the overflown chunk is currently free. When the corrupted chunk is free()d, malloc consolidates the two chunks, executing the unlink() macro on the overflown chunk - triggering a reflective write. Because the attacker controls the user data of the overflown chunk, the attacker can craft the fd and bk pointers. In this example, the fd pointer is the address of an imaginary chunk that overlaps with the __free_hook, and the bk pointer is the address of an imaginary chunk that overlaps the shellcode.

At the time this article was written, W^X mitigations for memory segments were uncommon. The proof of concept stores sys_execve("/bin/sh", NULL, NULL) shellcode on the heap, uses the arbitrary write to write the heap address of the shellcode to the __free_hook, and then executes free() to gain code execution. The shellcode accounts for the reflective write conducted by unlink() at the fd of the imaginary chunk that overlaps the shellcode. The shellcode executes a jmp to avoid executing the invalid instructions represented by the fd pointer.

Safe unlinking

The unlink() technique was used to exploit a number of vulnerable applications and, surprisingly, it took a while before a fix was implemented for glibc. It wasn't until glibc 2.3.4, released in 2004, that checks were implemented for the fd and bk pointers of chunks being operated upon by unlink(). The updated unlink() macro can be found below [2]:

#define unlink( P, BK, FD ) {                                               \
    BK = P->bk;                                                             \
    FD = P->fd;                                                             \
    if (__builtin_expect (FD->bk != P || BK->fd != P, 0))                   \
        malloc_printerr (check_action, "corrupted double-linked list", P);  \
    else {                                                                  \
        FD->bk = BK;                                                        \
        BK->fd = FD;                                                        \
    }                                                                       \
}

In the updated unlink() macro, a check is now conducted to ensure that the bk of the fd chunk pointed to by the chunk being inspected, P, is P. A check is also conducted to ensure that the fd of the bk chunk pointed to by the chunk being inspected, P, is P. This prevents an attacker from setting the fd and bk of P to arbitrary locations in memory, thwarting the previous technique's method of using imaginary chunks for arbitary writes. If the program fails this check, the process raises the SIGABRT signal and terminates execution.

This mitigation has been proven to still be exploitable, it just requires specific circumstances to be feasible. A good example of a situation in which an attacker could still exploit this technique is if a pointer to the overflown chunk is present in some location in memory that the attacker can control, such as .data or .bss. The attacker can then craft the fd and bk pointers of the overflown chunk to point to imaginary chunks that overlap the portion of .data containing the heap pointer of the overflown chunk. This technique spoofs valid fd and bk chunks that will pass the updated unlink() check. If the program uses this location in .data to track heap chunk addresses for management and editing of data, the attacker can abuse this technique to corrupt pointers in .data, eventually creating an arbitrary write primitive.

Later mitigations

As more techniques for exploiting unlink() were discovered, more patches were released for glibc to mitigate exploitation of this mechanism. A good example is the below code that was implemented for the unlink_chunk() function in malloc. This code checks that the next chunk's prev_size field matches the current chunk's size field:

static void
unlink_chunk (mstate av, mchunkptr p)
{
  if (chunksize (p) != prev_size (next_chunk (p)))
    malloc_printerr ("corrupted size vs. prev_size");

<...snip...>

Patches continue to be released that work to harden glibc's implementation of malloc, the most recent patch being in 2019.

References

  1. http://phrack.org/issues/57/8.html
  2. http://phrack.org/issues/66/10.html