A history lesson
Before we describe what ASLR is or how it’s used to prevent exploitation, we need to understand why it was created.
In 1996, a hacker by the name of Aleph One published an article in Phrack
Magazine titled “Smashing the Stack for Fun and Profit”. In this article,
Aleph One describes, in great detail, how buffer overflow vulnerabilities can be
exploited to gain arbitrary code execution. These exploitation techniques
use a stack buffer overflow to overwrite the saved return pointer of a
function with a stack address. When the vulnerable function executes its final
ret instruction, the process jumps into the stack and begins to execute
whatever data resides at that stack address, which is the attacker’s
shellcode. [1]
The hacker Solar Designer took this a step further by introducing the
ret2libc technique. ret2libc leverages a stack buffer overflow to gain
control of a function’s saved return address. However, instead of returning
into the stack, the process returns into libc’s system() symbol. The
attacker crafts the stack frame so that system() interprets the string
'/bin/sh' as its first argument, hijacking the process to call
system('/bin/sh'). [2]
In order to thwart these stack buffer overflow exploitation techniques, the
PaX Team released PaX, a patch for the Linux kernel that implements the
W^X protection for memory pages and ASLR. [3] In a future
section we’ll cover the concept of W^X and the non-executable stack.
So what is ASLR?
“The goal of Address Space Layout Randomization is to introduce randomness into addresses used by a given task. This will make a class of exploit techniques fail with a quantifiable probability and also allow their detection since failed attempts will most likely crash the attacked task.” - The PaX Team [4]
The exploit techniques of the past relied on the memory image of the target
process being the same on every run, with minor differences due to
changes in the environment. Attackers could expect that the libc system()
symbol and a '/bin/sh' string would be at static locations within memory for
each exploit. ASLR randomizes the base addresses of the heap, shared objects,
and the stack when these segments are mapped into process memory, attempting to
introduce enough entropy to prevent an attacker from successfully brute forcing
their locations. Without knowledge of the base address of libc within a
target’s memory, attackers conducting the ret2libc exploit technique have to
guess the system() symbol address and the '/bin/sh' string’s address.
ASLR in action
To see ASLR for yourself, first let’s turn it off. This technique works on Ubuntu 18.04 - if it doesn’t seem to work for you, try looking up how your operating system implements ASLR and how you can enable/disable it. To turn off ASLR globally for all processes, execute the following command:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Now, let’s see how this affects the mappings of our processes. Run strace on
/bin/ls a couple of times and look at the results of the mmap() calls being
made to set up the process’s memory mappings. Each call to mmap() ends up
with the same address on every run, right? Let’s re-enable ASLR on your
operating system:
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
Now, run strace on /bin/ls a couple of times. You should see that the results
of the mmap() calls are no longer the same each time /bin/ls is invoked. The
memory map of our /bin/ls process is being randomized each time we execute it.
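The same observation can be made without reading strace output by hand, by comparing where libc is mapped in several fresh processes. What follows is a minimal Python sketch, assuming a Linux system with /proc mounted. The helper names libc_base_from_maps and sample_bases are hypothetical, and matching on a file name that starts with "libc." or "libc-" is a heuristic rather than a guaranteed way to identify the C library.

```python
import os
import subprocess
import sys


def libc_base_from_maps(maps_text):
    """Return the lowest start address of any mapping whose file looks like libc."""
    bases = []
    for line in maps_text.splitlines():
        fields = line.split()
        # File-backed lines have six fields: range, perms, offset, dev, inode, path.
        if len(fields) < 6:
            continue
        name = fields[5].rsplit("/", 1)[-1]
        if name.startswith("libc.") or name.startswith("libc-"):
            bases.append(int(fields[0].split("-")[0], 16))
    return min(bases) if bases else None


def sample_bases(runs=3):
    """Spawn fresh Python processes and collect each one's libc base address."""
    child = "print(open('/proc/self/maps').read())"
    results = []
    for _ in range(runs):
        out = subprocess.run([sys.executable, "-c", child],
                             capture_output=True, text=True, check=True).stdout
        results.append(libc_base_from_maps(out))
    return results


if __name__ == "__main__":
    if os.path.exists("/proc/self/maps"):
        for base in sample_bases():
            print(hex(base) if base is not None else "libc mapping not found")
    else:
        print("/proc/self/maps is not available on this system")
```

With randomize_va_space set to 0 the printed bases should match across runs; with it set to 2 they should usually differ.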
Breaking ASLR
Is it possible to brute force?
Well, it depends. In this paper, researchers
demonstrated that, with enough tries on the same memory mapping, attackers
could brute force the delta_mmap value described in [4].
Symbols in libc sit at fixed offsets from the library’s base address, and we
can find these offsets by using objdump. If we know the usual base address of
libc within process memory, we can compute the address of our target symbol
and then account for the entropy introduced by ASLR. A condition that enabled
the researchers to continuously brute force a target to determine its
delta_mmap value is that the parent process executed fork() each time a
connection was made. When a process fork()s, the child process’s memory
mapping will be identical to the parent’s. The researchers could crash the
child process as many times as necessary to discover the delta_mmap value.
Once they discovered the delta_mmap value, the researchers computed the base
address of libc within the target’s memory, allowing them to compute the
location of any symbol in the target’s libc to build the final payload.
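The role fork() plays here can be illustrated with a toy simulation. This sketch does not touch a real process and is not the researchers’ code: the ForkingTarget class, the page-granular delta, the usual base 0x40000000, and the system() offset are all hypothetical values chosen for illustration. The point it models is that the random delta is picked once and inherited by every child, so wrong guesses only cost crashes.

```python
import random

PAGE_SHIFT = 12  # assume 4 KiB pages, so the delta moves libc in page-sized steps


class ForkingTarget:
    """Toy model of a server that fork()s a child for every connection."""

    def __init__(self, entropy_bits, usual_base, seed=None):
        rng = random.Random(seed)
        self.entropy_bits = entropy_bits
        self.usual_base = usual_base
        # Chosen once, when the parent starts; every forked child inherits it.
        self._delta = rng.randrange(1 << entropy_bits)
        self.crashes = 0

    def libc_base(self):
        return self.usual_base + (self._delta << PAGE_SHIFT)

    def connect(self, guessed_base):
        """Return True on a correct guess; otherwise one child crashes."""
        if guessed_base == self.libc_base():
            return True
        self.crashes += 1  # the parent survives and keeps the same layout
        return False


def brute_force_base(target):
    """Try every possible delta in order until a connection survives."""
    for delta in range(1 << target.entropy_bits):
        guess = target.usual_base + (delta << PAGE_SHIFT)
        if target.connect(guess):
            return guess
    return None


if __name__ == "__main__":
    SYSTEM_OFFSET = 0x3ADA0  # hypothetical offset of system() inside libc
    target = ForkingTarget(entropy_bits=16, usual_base=0x40000000, seed=1)
    base = brute_force_base(target)
    print("libc base:", hex(base))
    print("system():", hex(base + SYSTEM_OFFSET))
    print("children crashed:", target.crashes)
```

Because the layout never changes between attempts, the search is a simple linear scan that is guaranteed to finish within 2^entropy_bits tries.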
Other methods?
Another interesting thing to think about is the amount of entropy introduced by
a system’s implementation of ASLR. If the number of bits randomized within the
virtual address space of a process is small enough, even if the process doesn’t
fork() like in the example above and the mapping is different on each
invocation, we could still feasibly brute force the location of libc symbols
in memory with enough guesses.
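To put rough numbers on “enough guesses”, the sketch below compares the two situations under a simplifying assumption: the randomized value is drawn uniformly from 2^bits possibilities. The function names are hypothetical. When the layout stays fixed between attempts (the fork() case), guesses never need repeating and the average is about half the space. When the layout is re-randomized on every attempt, each guess succeeds independently with probability 1/2^bits.

```python
def expected_guesses_fixed_layout(bits):
    """Layout is fixed across attempts; scanning without repeats averages (N + 1) / 2."""
    n = 1 << bits
    return (n + 1) / 2


def expected_guesses_rerandomized(bits):
    """Layout changes on every attempt; each guess hits with probability 1 / N."""
    return float(1 << bits)


def success_probability_rerandomized(bits, attempts):
    """Chance of at least one hit after the given number of independent guesses."""
    n = 1 << bits
    return 1.0 - (1.0 - 1.0 / n) ** attempts


if __name__ == "__main__":
    for bits in (8, 16, 28):
        print(bits,
              expected_guesses_fixed_layout(bits),
              expected_guesses_rerandomized(bits),
              round(success_probability_rerandomized(bits, 1 << bits), 3))
```

Under this model, re-randomizing on every run only doubles the expected work, so the number of randomized bits matters far more than whether the process forks.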
A common method of beating ASLR is through utilizing some sort of sensitive
information leak, like a format string vulnerability. [6] If an
attacker can leak information from the process to discover the location of a
libc symbol in the process’s memory, the attacker can calculate the base of
libc within the process’s memory, beating ASLR. [7]
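The arithmetic behind that last step is short enough to write out. In this sketch the offsets for puts() and system() and the leaked address are hypothetical placeholders. In practice the offsets would be read from the target’s exact libc build, for example with objdump, and the function names are made up for illustration.

```python
PAGE_SIZE = 0x1000  # shared objects are mapped at page-aligned addresses


def libc_base_from_leak(leaked_addr, leaked_symbol_offset):
    """Subtract the symbol's fixed offset from its leaked runtime address."""
    base = leaked_addr - leaked_symbol_offset
    # A base that is not page-aligned suggests the wrong libc build or a bad leak.
    if base < 0 or base % PAGE_SIZE != 0:
        raise ValueError("leak and offset do not yield a page-aligned base")
    return base


def resolve(base, symbol_offset):
    """Runtime address of any other symbol in the same libc mapping."""
    return base + symbol_offset


if __name__ == "__main__":
    PUTS_OFFSET = 0x809C0    # hypothetical offset of puts()
    SYSTEM_OFFSET = 0x4F440  # hypothetical offset of system()
    leaked_puts = 0x7F3A1C2809C0  # hypothetical value obtained from a leak

    base = libc_base_from_leak(leaked_puts, PUTS_OFFSET)
    print("libc base:", hex(base))
    print("system():", hex(resolve(base, SYSTEM_OFFSET)))
```

The alignment check is a cheap sanity test: one leaked pointer is enough to recover the base, but only if the offsets come from the same libc build the target is running.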