oss-sec mailing list archives
[CVE-2023-42756] Linux kernel race condition in netfilter
From: Kyle Zeng <zengyhkyle () gmail com>
Date: Wed, 27 Sep 2023 13:44:48 -0700
Hi there,
I recently found a race condition bug in the Linux kernel between
IPSET_CMD_ADD and IPSET_CMD_SWAP in netfilter/ip_set, which can
lead to the invocation of `__ip_set_put` on a wrong `set`, triggering
the `BUG_ON(set->ref == 0);` check in it, which leads to local DoS.
I have confirmed that it affects at least upstream, v6.5-rc7, v6.1, and v5.10.
[Root Cause]
The bug is in the netfilter subsystem.
The `ip_set_swap` function takes `ip_set_ref_lock` and then does the
following to swap the sets:
~~~
strncpy(from_name, from->name, IPSET_MAXNAMELEN);
strncpy(from->name, to->name, IPSET_MAXNAMELEN);
strncpy(to->name, from_name, IPSET_MAXNAMELEN);
swap(from->ref, to->ref);
~~~
But in the retry loop in `call_ad`:
~~~
if (retried) {
	__ip_set_get(set);
	nfnl_unlock(NFNL_SUBSYS_IPSET);
	cond_resched();
	nfnl_lock(NFNL_SUBSYS_IPSET);
	__ip_set_put(set);
}
~~~
No lock is held across the `cond_resched()` call.
As a result, `ip_set_swap` (in thread 2, under `ip_set_ref_lock`) can swap
the set with another while thread 1 is in `cond_resched()`. When thread 1
wakes up, its `set` variable already refers to a different set, because the
name and `ref` were swapped underneath it. Calling `__ip_set_put` on it
then decrements the refcount of the wrong `set`, triggering the `BUG_ON` call.
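To make the interleaving concrete, here is a minimal single-threaded user-space sketch that replays the steps in the order described above. It is not kernel code and not the attached PoC. `model_set`, `model_get`, `model_put`, `model_swap`, `replay_race` and `MODEL_MAXNAMELEN` are hypothetical names, locking is omitted, and `model_put` returns -1 at the point where the kernel would hit `BUG_ON(set->ref == 0)`.

```c
#include <assert.h>
#include <string.h>

#define MODEL_MAXNAMELEN 32	/* hypothetical stand-in for IPSET_MAXNAMELEN */

/* Simplified user-space stand-in for the fields of struct ip_set used here. */
struct model_set {
	char name[MODEL_MAXNAMELEN];
	unsigned int ref;
};

static void model_get(struct model_set *set)
{
	set->ref++;
}

/* Returns -1 where the kernel would hit BUG_ON(set->ref == 0). */
static int model_put(struct model_set *set)
{
	if (set->ref == 0)
		return -1;
	set->ref--;
	return 0;
}

/* Mirrors the swap quoted above: names and ref counters trade places. */
static void model_swap(struct model_set *from, struct model_set *to)
{
	char from_name[MODEL_MAXNAMELEN];
	unsigned int tmp;

	memcpy(from_name, from->name, sizeof(from_name));
	memcpy(from->name, to->name, sizeof(from->name));
	memcpy(to->name, from_name, sizeof(to->name));
	tmp = from->ref;
	from->ref = to->ref;
	to->ref = tmp;
}

/* Replays the racy interleaving; returns the result of the final put. */
int replay_race(void)
{
	struct model_set a = { "a", 0 };
	struct model_set b = { "b", 0 };
	struct model_set *set = &a;	/* thread 1's local pointer */

	model_get(set);			/* thread 1: __ip_set_get(set) */
	assert(a.ref == 1);
	/* thread 1 drops the nfnl mutex and sits in cond_resched() */
	model_swap(&a, &b);		/* thread 2: IPSET_CMD_SWAP */
	/* thread 1 resumes; the reference it took now sits on b */
	return model_put(set);		/* thread 1: __ip_set_put(set) */
}
```

In this model `replay_race()` returns -1: after the swap, `a.ref` is 0 and the reference taken by thread 1 has moved to `b`, so the put lands on a counter that is already zero.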
According to Jozsef Kadlecsik, who fixed the bug, the root cause is that
the `call_ad` function is using a wrong ref counter. Instead of using
`__ip_set_get`, which operates on `set->ref`, the correct way is to
operate on `set->ref_netlink`.
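The same kind of model can illustrate why a separate counter helps. The sketch below assumes two things that are not quoted in this email and should be checked against the patch itself: that the swap path does not exchange `ref_netlink`, and that it refuses with -EBUSY while either set has a nonzero `ref_netlink`. All names (`fixed_set`, `fixed_get_netlink`, `fixed_put_netlink`, `fixed_swap`, `replay_fixed`) are hypothetical, and locking is again omitted.

```c
#include <assert.h>
#include <errno.h>

/* Simplified user-space model with two counters per set. */
struct fixed_set {
	unsigned int ref;		/* exchanged by swap */
	unsigned int ref_netlink;	/* assumed to stay with the struct */
};

static void fixed_get_netlink(struct fixed_set *set)
{
	set->ref_netlink++;
}

/* Returns -1 if the counter is already zero, 0 otherwise. */
static int fixed_put_netlink(struct fixed_set *set)
{
	if (set->ref_netlink == 0)
		return -1;
	set->ref_netlink--;
	return 0;
}

/* Assumption: swap is rejected while a netlink operation holds either set. */
static int fixed_swap(struct fixed_set *from, struct fixed_set *to)
{
	unsigned int tmp;

	if (from->ref_netlink || to->ref_netlink)
		return -EBUSY;
	tmp = from->ref;
	from->ref = to->ref;
	to->ref = tmp;
	return 0;
}

/* Same interleaving as the bug, using the netlink counter instead. */
int replay_fixed(int *swap_result)
{
	struct fixed_set a = { 0, 0 };
	struct fixed_set b = { 0, 0 };
	struct fixed_set *set = &a;

	fixed_get_netlink(set);			/* thread 1, before unlocking */
	assert(a.ref_netlink == 1);
	*swap_result = fixed_swap(&a, &b);	/* thread 2: swap attempt */
	return fixed_put_netlink(set);		/* thread 1, after relocking */
}
```

Under these assumptions the swap attempt fails with -EBUSY and the final put succeeds, because the counter thread 1 incremented is the same one it later decrements.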
[Severity]
It will invoke a `BUG_ON` call, leading to kernel panic.
In other words, it will lead to local DoS.
[Patch]
Jozsef Kadlecsik prepared a patch, which has already been merged into
mainline and the stable trees.
The patch can be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7433b6d2afd512d04398c73aa984d1e285be125b
[Proof-of-Concept]
Proof-of-concept code that triggers the bug is attached to this email.
Best,
Kyle
========================================================================
[ 5.110096] ------------[ cut here ]------------
[ 5.110337] kernel BUG at net/netfilter/ipset/ip_set_core.c:677!
[ 5.110618] invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 5.110892] CPU: 2 PID: 507 Comm: poc Not tainted 6.1.47+ #67
[ 5.111143] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 5.111490] RIP: 0010:call_ad+0x83e/0x850
[ 5.111677] Code: 89 df e8 35 c6 d2 fd e9 d4 fd ff ff 44 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c d7 fd ff ff 4c 89 ff e8 a7 c5 d2 fd e9 ca fd ff ff <0f> 0b e8 0b 09 85 00 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00
[ 5.112481] RSP: 0018:ffff88800c4d7350 EFLAGS: 00010246
[ 5.112718] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000000ff
[ 5.113047] RDX: ffff88800b658324 RSI: 0000000000000004 RDI: ffff88800c4d7314
[ 5.113373] RBP: ffff88800c4d7448 R08: dffffc0000000000 R09: ffffed100189ae63
[ 5.113696] R10: dfffe9100189ae64 R11: 1ffff1100189ae62 R12: dffffc0000000000
[ 5.114024] R13: 1ffff110016cb067 R14: ffff88800b658338 R15: ffffffff8557d401
[ 5.114346] FS: 00000000027203c0(0000) GS:ffff888034f00000(0000) knlGS:0000000000000000
[ 5.114745] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.115049] CR2: 000000000046c280 CR3: 000000000d71c005 CR4: 0000000000770ee0
[ 5.115478] PKRU: 55555554
[ 5.115653] Call Trace:
[ 5.115799] <TASK>
[ 5.115923] ? __die_body+0x67/0xb0
[ 5.116125] ? die+0xa0/0xc0
[ 5.116295] ? do_trap+0x124/0x350
[ 5.116485] ? call_ad+0x83e/0x850
[ 5.116670] ? call_ad+0x83e/0x850
[ 5.116855] ? handle_invalid_op+0x96/0xd0
[ 5.117084] ? call_ad+0x83e/0x850
[ 5.117270] ? exc_invalid_op+0x2f/0x40
[ 5.117453] ? asm_exc_invalid_op+0x16/0x20
[ 5.117633] ? call_ad+0x83e/0x850
[ 5.117782] ip_set_ad+0x68e/0x7d0
[ 5.117932] ? mutex_lock+0x76/0xc0
[ 5.118083] nfnetlink_rcv_msg+0x6a7/0x830
[ 5.118262] netlink_rcv_skb+0x15a/0x330
[ 5.118430] ? nfnetlink_unbind+0x180/0x180
[ 5.118632] nfnetlink_rcv+0x22d/0x1e70
[ 5.118797] ? __stack_depot_save+0x35/0x480
[ 5.118982] ? kasan_set_track+0x61/0x70
[ 5.119150] ? kasan_set_track+0x4c/0x70
[ 5.119318] ? __kasan_kmalloc+0x85/0x90
[ 5.119486] ? netlink_sendmsg+0x509/0xa00
[ 5.119660] ? __sys_sendto+0x494/0x4b0
[ 5.119826] ? __x64_sys_sendto+0xda/0xf0
[ 5.119998] ? do_syscall_64+0x67/0x90
[ 5.120159] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 5.120383] ? __netlink_lookup+0x2fa/0x310
[ 5.120562] netlink_unicast+0x675/0x8a0
[ 5.120731] netlink_sendmsg+0x685/0xa00
[ 5.120902] ? netlink_getsockopt+0x3f0/0x3f0
[ 5.121093] __sys_sendto+0x494/0x4b0
[ 5.121264] __x64_sys_sendto+0xda/0xf0
[ 5.121438] do_syscall_64+0x67/0x90
[ 5.121628] ? exit_to_user_mode_prepare+0x12/0xa0
[ 5.121874] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 5.122142] RIP: 0033:0x475b30
[ 5.122305] Code: c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 1d 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 68 c3 0f 1f 80 00 00 00 00 41 54 48 83 ec 20
[ 5.123328] RSP: 002b:00007ffc64795c98 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 5.123741] RAX: ffffffffffffffda RBX: 00007ffc64795f48 RCX: 0000000000475b30
[ 5.124512] RDX: 000000000000007c RSI: 00000000027244a0 RDI: 0000000000000005
[ 5.124905] RBP: 00007ffc64795d40 R08: 0000000000000000 R09: 0000000000000000
[ 5.125416] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[ 5.125860] R13: 00007ffc64795f38 R14: 0000000000500740 R15: 0000000000000002
[ 5.126282] </TASK>
[ 5.126408] Modules linked in:
[ 5.126613] ---[ end trace 0000000000000000 ]---
[ 5.127317] RIP: 0010:call_ad+0x83e/0x850
[ 5.127565] Code: 89 df e8 35 c6 d2 fd e9 d4 fd ff ff 44 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c d7 fd ff ff 4c 89 ff e8 a7 c5 d2 fd e9 ca fd ff ff <0f> 0b e8 0b 09 85 00 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00
[ 5.128567] RSP: 0018:ffff88800c4d7350 EFLAGS: 00010246
[ 5.128928] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000000ff
[ 5.129356] RDX: ffff88800b658324 RSI: 0000000000000004 RDI: ffff88800c4d7314
[ 5.129766] RBP: ffff88800c4d7448 R08: dffffc0000000000 R09: ffffed100189ae63
[ 5.130203] R10: dfffe9100189ae64 R11: 1ffff1100189ae62 R12: dffffc0000000000
[ 5.130602] R13: 1ffff110016cb067 R14: ffff88800b658338 R15: ffffffff8557d401
[ 5.130973] FS: 00000000027203c0(0000) GS:ffff888034f00000(0000) knlGS:0000000000000000
[ 5.131454] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.131809] CR2: 000000000046c280 CR3: 000000000d71c005 CR4: 0000000000770ee0
[ 5.132290] PKRU: 55555554
[ 5.132452] Kernel panic - not syncing: Fatal exception in interrupt
[ 5.133092] Kernel Offset: disabled
[ 5.133320] Rebooting in 1000 seconds..
Attachment:
poc.c
