A race condition was found in the Linux kernel versions 3.14-rc1 through 4.12. The race happens between threads of inotify_handle_event() and vfs_rename() while running the rename operation against the same file. The next slab data or the slab's free list pointer can be corrupted with attacker-controlled data as a result of the race.
Hello,
A race condition was found in the Linux kernel, present from v3.14-rc1 up to and
including v4.12. The race happens between threads of inotify_handle_event() and
vfs_rename() while running the rename operation against the same file. The next
slab data or the slab's free list pointer can be corrupted with attacker-controlled
data as a result of the race.
The researchers of this flaw are Fan Wu and Shixiong Zhao from a research group
supervised by Dr. Heming Cui of the Department of Computer Science, The University
of Hong Kong. Thanks to Rui Gu and Prof. Junfeng Yang from Columbia University for
tools and suggestions.
References:
https://bugzilla.redhat.com/show_bug.cgi?id=1468283
https://access.redhat.com/security/vulnerabilities/3112931
https://patchwork.kernel.org/patch/9755753/
https://patchwork.kernel.org/patch/9755757/
An upstream patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=49d31c2f389acfe83417083e1208422b4091cd9
As for the flaw itself:
There is an easily reached race condition between inotify_handle_event() and
sys_rename(). A rename thread can change the dentry name after the fsnotify
thread has allocated memory for the name but before it copies the name:
CPU 1                                        CPU 2

fsnotify()
  inotify_handle_event(.., file_name)
    strlen(file_name) // file_name is "foobar"
    alloc_len += len + 1;
    event = kmalloc(alloc_len, GFP_KERNEL); // 7 bytes for the file_name
                                             sys_rename()
                                               __d_move() [in fs/dcache.c]
                                                 copy_name()
                                                 // rename to "foobar_lol_kek_u_pwned"
    strcpy(event->name, file_name);
    // now file_name points to "foobar_lol_kek_u_pwned"
    // but there is space only for "foobar\0"
    // the next slab or slab's *freelist is corrupted with user controlled data
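The interleaving above can be replayed deterministically in user space. The
following is only a sketch and not kernel code: struct shared_name,
overflow_bytes(), simulate_race() and simulate_snapshot() are hypothetical names
made up for illustration, and nothing is actually written out of bounds. It only
computes how many bytes the final strcpy() would write past a buffer that was
sized from a stale strlen(). The snapshot variant shows one general way to close
such a window (measure and copy from a stable private copy); no claim is made
that it matches the upstream patch line for line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a dentry name that another thread may rewrite. */
struct shared_name {
    char buf[64];
};

/* Bytes a strcpy() would write past a buffer of `allocated` bytes. */
static size_t overflow_bytes(size_t allocated, const char *name_at_copy_time)
{
    size_t needed = strlen(name_at_copy_time) + 1; /* include the NUL */
    return needed > allocated ? needed - allocated : 0;
}

/* Replays the racy ordering on one thread: measure, rename, copy. */
static size_t simulate_race(const char *old_name, const char *new_name)
{
    struct shared_name name;

    assert(strlen(old_name) < sizeof(name.buf));
    assert(strlen(new_name) < sizeof(name.buf));

    strcpy(name.buf, old_name);
    size_t allocated = strlen(name.buf) + 1;    /* CPU 1: size the buffer */
    strcpy(name.buf, new_name);                 /* CPU 2: rename lands here */
    return overflow_bytes(allocated, name.buf); /* CPU 1: would-be strcpy() */
}

/* Same ordering, but the name is snapshotted before it is measured. */
static size_t simulate_snapshot(const char *old_name, const char *new_name)
{
    struct shared_name name, snap;

    assert(strlen(old_name) < sizeof(name.buf));
    assert(strlen(new_name) < sizeof(name.buf));

    strcpy(name.buf, old_name);
    snap = name;                                /* stable private copy */
    size_t allocated = strlen(snap.buf) + 1;    /* measure the copy */
    strcpy(name.buf, new_name);                 /* rename changes only `name` */
    return overflow_bytes(allocated, snap.buf); /* copy from the snapshot */
}

#ifdef DEMO_MAIN /* build with -DDEMO_MAIN to run the demo */
int main(void)
{
    const char *before = "foobar";
    const char *after = "foobar_lol_kek_u_pwned";

    printf("racy:     %zu byte(s) past the buffer\n",
           simulate_race(before, after));
    printf("snapshot: %zu byte(s) past the buffer\n",
           simulate_snapshot(before, after));
    return 0;
}
#endif
```

With the names from the diagram, the buffer is sized for 7 bytes ("foobar" plus
the terminating NUL) while the renamed string needs 23, so the racy ordering
would write 16 bytes past the allocation; the snapshot ordering writes none.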
There is a working exploit in the wild that allows privilege escalation on 32-bit
kernels. We are unaware of such an exploit for 64-bit kernels, but these are affected
by this race too and we believe such an exploit could exist.
The result of exploiting the flaw is modified data after the slab, which can be
the next slab data, freelist pointer or something else (if the slab is the last
one in the cache).
The slab corruption caused by the exploit or the reproducer can be easily seen
with the "slub_debug=FZ" kernel parameter. The following log indicates a write beyond
the allocated slab, in this case a write to the slab's red zone:
[ 144.109993] =============================================================================
[ 144.110011] BUG kmalloc-64 (Not tainted): Redzone overwritten
[ 144.110011] -----------------------------------------------------------------------------
[ 144.110011] Disabling lock debugging due to kernel taint
[ 144.110011] INFO: 0xffff8800bbb544f0-0xffff8800bbb544f7. First byte 0x33 instead of 0xcc
[ 144.110011] INFO: Slab 0xffffea0002eed500 objects=51 used=23 fp=0xffff8800bbb54d70 flags=0x5fffff00000081
[ 144.110011] INFO: Object 0xffff8800bbb544b0 @offset=1200 fp=0xffff8800bbb544b0
[ 144.110011]
[ 144.110011] Bytes b4 ffff8800bbb544a0: cc cc cc cc cc cc cc cc 00 00 00 00 00 00 00 00 ................
[ 144.110011] Object ffff8800bbb544b0: b0 44 b5 bb 00 88 ff ff b0 44 b5 bb 00 88 ff ff .D.......D......
[ 144.110011] Object ffff8800bbb544c0: b8 78 62 bb 00 88 ff ff 20 00 00 08 00 00 00 00 .xb..... .......
[ 144.110011] Object ffff8800bbb544d0: 01 00 00 00 00 00 00 00 01 00 00 00 61 61 61 61 ............aaaa
[ 144.110011] Object ffff8800bbb544e0: 33 32 31 30 33 32 31 30 33 32 31 30 33 32 31 30 3210321032103210
[ 144.110011] Redzone ffff8800bbb544f0: 33 32 31 30 33 32 31 30 32103210
[ 144.110011] Padding ffff8800bbb544f8: 00 00 00 00 00 00 00 00 ........
[ 144.110011] CPU: 2 PID: 1016 Comm: inotify Tainted: G B ------------ 3.10.0-514.16.1.el7.x86_64 #1
[ 144.110011] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
[ 144.110011] ffff88007d801d00 0000000070c5e4c2 ffff88013bad7c08 ffffffff816869c3
[ 144.110011] ffff88013bad7c48 ffffffff811d9cad 0000000000000008 ffff880000000001
[ 144.110011] ffff8800bbb544f8 ffff88007d801d00 00000000000000cc ffff8800bbb544b0
[ 144.110011] Call Trace:
[ 144.110011] [<ffffffff816869c3>] dump_stack+0x19/0x1b
[ 144.110011] [<ffffffff811d9cad>] print_trailer+0x14d/0x200
[ 144.110011] [<ffffffff811d9e9f>] check_bytes_and_report+0xcf/0x110
[ 144.110011] [<ffffffff811dab33>] check_object+0x193/0x250
[ 144.110011] [<ffffffff8168380f>] free_debug_processing+0xcc/0x259
[ 144.110011] [<ffffffff81213130>] ? poll_select_copy_remaining+0x150/0x150
[ 144.110011] [<ffffffff81244f9e>] ? inotify_free_event+0xe/0x10
[ 144.110011] [<ffffffff81244f9e>] ? inotify_free_event+0xe/0x10
[ 144.110011] [<ffffffff811dca30>] __slab_free+0x250/0x2f0
[ 144.110011] [<ffffffff81213130>] ? poll_select_copy_remaining+0x150/0x150
[ 144.110011] [<ffffffff8168ba60>] ? __schedule+0x3b0/0x990
[ 144.110011] [<ffffffff81244f9e>] ? inotify_free_event+0xe/0x10
[ 144.110011] [<ffffffff811dd173>] kfree+0x103/0x140
[ 144.110011] [<ffffffff81244f9e>] inotify_free_event+0xe/0x10
[ 144.110011] [<ffffffff81242b30>] fsnotify_destroy_event+0x30/0x50
[ 144.110011] [<ffffffff81245424>] inotify_read+0x224/0x3e0
[ 144.110011] [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30
[ 144.110011] [<ffffffff811fe61e>] vfs_read+0x9e/0x170
[ 144.110011] [<ffffffff811ff1ef>] SyS_read+0x7f/0xe0
[ 144.110011] [<ffffffff81214954>] ? SyS_poll+0x74/0x110
[ 144.110011] [<ffffffff81697089>] system_call_fastpath+0x16/0x1b
[ 144.110011] FIX kmalloc-64: Restoring 0xffff8800bbb544f0-0xffff8800bbb544f7=0xcc
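To help read the log above, here is a user-space sketch of the red zone idea:
guard bytes set to a known value are placed right after the object, and a later
check reports the first guard byte that no longer holds that value. This is an
illustration only and not the SLUB implementation: slot_init(), slot_write() and
redzone_first_bad() are hypothetical helpers. The sizes (a 64-byte object
followed by an 8-byte red zone of 0xcc) and the name offset of 44 bytes are read
off the hex dump in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define OBJ_SIZE     64    /* kmalloc-64 object, as in the log */
#define REDZONE_SIZE 8     /* 0x...44f0-0x...44f7 in the log */
#define REDZONE_BYTE 0xcc  /* value the log expected to find */
#define SLOT_SIZE    (OBJ_SIZE + REDZONE_SIZE)

/* Zero the object area and fill the red zone with the guard value. */
static void slot_init(unsigned char *slot)
{
    memset(slot, 0, OBJ_SIZE);
    memset(slot + OBJ_SIZE, REDZONE_BYTE, REDZONE_SIZE);
}

/*
 * Unchecked-by-design write relative to the object start. It may run into
 * the red zone, but it is refused (-1) if it would leave the simulated slot.
 */
static int slot_write(unsigned char *slot, size_t off,
                      const void *src, size_t len)
{
    if (off > SLOT_SIZE || len > SLOT_SIZE - off)
        return -1;
    memcpy(slot + off, src, len);
    return 0;
}

/* Index of the first damaged red zone byte, or -1 if the zone is intact. */
static int redzone_first_bad(const unsigned char *slot)
{
    for (int i = 0; i < REDZONE_SIZE; i++)
        if (slot[OBJ_SIZE + i] != REDZONE_BYTE)
            return i;
    return -1;
}

#ifdef DEMO_MAIN /* build with -DDEMO_MAIN to run the demo */
int main(void)
{
    unsigned char slot[SLOT_SIZE];
    /* 28 name bytes, matching the pattern visible in the dump. */
    const char name[] = "aaaa" "3210" "3210" "3210" "3210" "3210" "3210";

    slot_init(slot);
    slot_write(slot, 44, name, 28); /* bytes 44..71: 8 land in the red zone */

    int bad = redzone_first_bad(slot);
    if (bad >= 0)
        printf("Redzone overwritten: byte %d is 0x%02x instead of 0x%02x\n",
               bad, (unsigned)slot[OBJ_SIZE + bad], (unsigned)REDZONE_BYTE);
    return 0;
}
#endif
```

Writing the 28 name bytes at offset 44 covers bytes 44 to 71 of the slot, so the
last 8 land in the red zone and its first byte becomes 0x33 ('3'), which is the
value the "First byte 0x33 instead of 0xcc" line reports.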
Best regards,
Vladis Dronov | Red Hat, Inc. | Product Security Engineer