Calling conventions
First, let’s talk about calling conventions. For the System V x86
application binary interface (ABI), parameters to functions are passed on the
stack in reverse order such that the first parameter is the last value pushed to
the stack.
Functions are called using the call instruction, pushing the address of the
next instruction onto the stack and jumping to the operand. Functions return to
the caller using ret, popping a value from the stack and jumping to it.
Function calls preserve the registers ebx, esi, edi, ebp, and esp.
eax, ecx, and edx are scratch registers, their values might not be the
same after a function call. The return value of a function is stored in eax,
unless the return value is a 64-bit value, then the higher 32-bits will be
returned in edx.
For the System V x86-64 ABI, parameters to functions are passed in the
registers rdi, rsi, rdx, rcx, r8, and r9. If a function call
requires more than 6 parameters, further values are passed to the function call
on the stack in reverse order. Functions return to the caller using ret,
popping a value from the stack and jumping to it.
Function calls preserve the registers rbx, rsp, rbp, r12, r13, r14,
and r15. rax, rdi, rsi, rdx, rcx, r8, r9, r10, and r11 are
scratch registers, their values might not be the same after a function call. The
return value of a function is stored in rax, unless the return value is a
128-bit value, then the higher 64-bits will be returned in rdx.
[1]
The standard C library and syscalls
What’s a libc function?
The term libc is shorthand for the “standard C library”. The most widely used
C library on Linux is the GNU C Library,
referred to as glibc. The pathname for your installation of libc on a Linux
operating system is probably something similar to: /lib/libc.so.6. For a
dynamically-linked target binary, you can use ldd to print its shared object
dependencies - you’ll usually find libc.so.6 there. [2]
libc functions are functions exported by the libc.so.6 shared library.
libc functions are commonly used functions within the C programming language,
for example: printf(), puts(), gets()
What’s a syscall?
A system call (syscall), sometimes called kernel calls, is a request for
service that a program makes of the kernel. This is usually a privileged
operation, like changing permissions of a file, opening a file descriptor, etc.
The libc functions mentioned earlier usually handle syscalls for the
programmer, however, an attacker sometimes has to make syscalls directly for
an exploit to work. [3]
To make a syscall directly on x86 on a Linux operating system, first ensure
your registers contain their necessary values. [4] Then, issue
the instruction int 0x80. This instruction is an interrupt that passes control
to interrupt vector 0x80. On Linux, this interrupt vector is used to handle
syscalls. [5]
To make a syscall directly on x86-64 on a Linux operating system, first
ensure your registers contain their necessary values. [6] Then,
issue the instruction syscall. [7]
Using ROP to call a libc function
Now that we understand the calling conventions, we can use ROP to call libc
functions. We’re going to assume that the attacker already has control over the
stack and the EIP or RIP. We’re also going to assume that the attacker has
used an information leak to expose sensitive information, namely the base of
libc within memory. With these conditions, the attacker can execute one of the
most basic ROP attacks: ret2libc.
The attacker must use their control over the EIP/RIP and stack to
construct a stack frame that will call libc’s system() function. The
attacker will also craft the stack frame so that the system() function will
use the string '/bin/sh' as an argument.
Here’s what the stack would look like for a ret2libc attack on x86:
p32(system) # Address of system() function within libc
p32(0xcafebabe) # Fake return pointer
p32(bin_sh_ptr) # arg0: pointer to '/bin/sh' stringAfter the stack is corrupted with the attacker’s ret2libc stack frame,
when the vulnerable function conducts a ret, the program will pop the
system() address from the stack and jump to system() within libc. The
system() function will continue execution and use the pointer to the
'/bin/sh' string as an argument, executing system('/bin/sh').
For x86-64, remember that we need to place our arguments into registers before
making calls to libc functions. A ret2libc attack will look slightly
different:
p64(pop_rdi_ret) # pop rdi; ret gadget
p64(bin_sh_ptr) # arg0: pointer to '/bin/sh' string
p64(system) # Address of system() function within libcFor x86-64, system() expects arg0 to be contained within the rdi
register. Using our techniques discussed earlier to find ROP gadgets, we find
a pop rdi; ret gadget within either libc or the target binary. We use this
gadget to pop the address of our '/bin/sh' string into rdi and then we
ret into the address of system(). Just like in x86, this executes
system('/bin/sh').
The next section will cover executing syscalls directly.
Related pages
References
- https://wiki.osdev.org/System_V_ABI
- https://man7.org/linux/man-pages//man7/libc.7.html
- https://www.gnu.org/software/libc/manual/html_node/System-Calls.html
- https://web.archive.org/web/20200218024630/https://syscalls.kernelgrok.com/
- https://en.wikipedia.org/wiki/Interrupt_vector_table
- https://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/
- https://www.cs.fsu.edu/~langley/CNT5605/2017-Summer/assembly-example/assembly.html