Safe list unlinking
History
Before we talk about safe list unlinking, let's discuss unsafe list unlinking.
In 2001, Michel "MaXX" Kaempf wrote the Phrack Article "Vudo malloc tricks" [1]. In this article, he describes multiple different techniques for corruption and exploitation of the heap. For this discussion, we're interested in topic 3.6.1 - The unlink() technique.
Gaining arbitrary code execution using the unlink() technique was first
introduced by "Solar Designer". The technique tricks dlmalloc into processing
carefully crafted forward, fd, and backward, bk, pointers of a chunk being
unlink()ed from a doubly linked list. Abusing the unlink() macro in this
manner provides an attacker the ability to conduct an arbitrary write to any
location in memory which can lead to arbitrary code execution. The unlink()
macro is as follows:
#define unlink( P, BK, FD ) { \
BK = P->bk; \
FD = P->fd; \
FD->bk = BK; \
BK->fd = FD; \
}
The proof of concept provided in this article leverages a heap buffer overflow
to overwrite the chunk metadata of a succeeding chunk, clearing the
PREV_INUSE bit and tricking malloc into thinking that the overflown chunk
is currently free. When the corrupted chunk is free()d, malloc
consolidates the two chunks, executing the unlink() macro on the overflown
chunk - triggering a reflective write. Because the attacker controls the user
data of the overflown chunk, the attacker can craft the fd and bk pointers.
In this example, the fd pointer is the address of an imaginary chunk that
overlaps with the __free_hook, and the bk pointer is the address of an
imaginary chunk that overlaps the shellcode.
At the time this article was written, W^X mitigations for memory segments were
uncommon. The proof of concept stores sys_execve("/bin/sh", NULL, NULL)
shellcode on the heap, uses the arbitrary write to write the heap address of
the shellcode to the __free_hook, and then executes free() to gain code
execution. The shellcode accounts for the reflective write conducted by
unlink() at the fd of the imaginary chunk that overlaps the shellcode. The
shellcode executes a jmp to avoid executing the invalid instructions
represented by the fd pointer.
Safe unlinking
The unlink() technique was used to exploit a number of vulnerable
applications and, surprisingly, it took a while before a fix was implemented
for glibc. It wasn't until glibc 2.3.4, released in 2004, that checks were
implemented for the fd and bk pointers of chunks being operated upon by
unlink(). The updated unlink() macro can be found below [2]:
#define unlink( P, BK, FD ) { \
BK = P->bk; \
FD = P->fd; \
if (__builtin_expect (FD->bk != P || BK->fd != P, 0)) \
malloc_printerr (check_action, "corrupted double-linked list", P); \
else { \
FD->bk = BK; \
BK->fd = FD; \
} \
}
In the updated unlink() macro, a check is now conducted to ensure that the
bk of the fd chunk pointed to by the chunk being inspected, P, is P. A
check is also conducted to ensure that the fd of the bk chunk pointed to by
the chunk being inspected, P, is P. This prevents an attacker from setting
the fd and bk of P to arbitrary locations in memory, thwarting the
previous technique's method of using imaginary chunks for arbitary writes. If
the program fails this check, the process raises the SIGABRT signal and
terminates execution.
This mitigation has been proven to still be exploitable, it just requires
specific circumstances to be feasible. A good example of a situation in which
an attacker could still exploit this technique is if a pointer to the overflown
chunk is present in some location in memory that the attacker can control, such
as .data or .bss. The attacker can then craft the fd and bk pointers
of the overflown chunk to point to imaginary chunks that overlap the portion
of .data containing the heap pointer of the overflown chunk. This technique
spoofs valid fd and bk chunks that will pass the updated unlink()
check. If the program uses this location in .data to track heap chunk
addresses for management and editing of data, the attacker can abuse this
technique to corrupt pointers in .data, eventually creating an arbitrary
write primitive.
Later mitigations
As more techniques for exploiting unlink() were discovered, more patches were
released for glibc to mitigate exploitation of this mechanism. A good example
is the below code that was implemented for the unlink_chunk() function in
malloc. This code checks that the next chunk's prev_size field matches the
current chunk's size field:
static void
unlink_chunk (mstate av, mchunkptr p)
{
if (chunksize (p) != prev_size (next_chunk (p)))
malloc_printerr ("corrupted size vs. prev_size");
<...snip...>
Patches continue to be released that work to harden glibc's implementation of
malloc, the most recent patch being in 2019.