oss-sec mailing list archives
Linux: general protection fault in __vmx_vcpu_run with nested virtualization
From: Linfeng Sun <slf () hdu edu cn>
Date: Mon, 6 Jan 2025 17:01:49 +0800 (GMT+08:00)
Hello list,

A bug has been detected in the Linux kernel's nested virtualization implementation, which can lead to a general protection fault in __vmx_vcpu_run when running a higher version L1 hypervisor kernel on an L0 host kernel version predating the following commit:

https://github.com/torvalds/linux/commit/45779be5ced626db836e612e0dc638a1601abcf2

The issue can be reproduced by running the provided PoC in the L1 environment as a user in the kvm group.

The following files are included as attachments:
1. L0 and L1 kernel .config
2. PoC script run in the L1 environment

------------[ Backtrace ]------------
[ 66.778426] Oops: general protection fault, maybe for address 0x0: 0000 [#1] PREEMPT SMP I
[ 66.779973] CPU: 0 UID: 0 PID: 1497 Comm: poc Not tainted 6.12.4 #2
[ 66.780523] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.14
[ 66.781289] RIP: 0010:__vmx_vcpu_run+0x99/0xa0
[ 66.781742] Code: 8b 58 58 4c 8b 60 60 4c 8b 68 68 4c 8b 70 70 4c 8b 78 78 48 8b 00 0f 1f8
[ 66.783296] RSP: 0018:ffff88801ea07978 EFLAGS: 00000042
[ 66.783768] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 66.784366] RDX: 0000000000000600 RSI: 0000000000000000 RDI: 0000000000000000
[ 66.785013] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 66.785623] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 66.786148] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 66.786668] FS: 000000001e0dd380(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
[ 66.787257] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 66.787717] CR2: 0000000000000000 CR3: 000000001f83a002 CR4: 0000000000772ef0
[ 66.788239] DR0: 0000000000003a92 DR1: 0000000000000000 DR2: 0000000000000400
[ 66.788765] DR3: 0000000000000007 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 66.789284] PKRU: 55555554
[ 66.789495] Call Trace:
[ 66.789688] <TASK>
[ 66.789858] ? die_addr+0x3c/0xa0
[ 66.790166] ? exc_general_protection+0x1a3/0x320
[ 66.790571] ? asm_exc_general_protection+0x26/0x30
[ 66.791098] ? __vmx_vcpu_run+0x99/0xa0
[ 66.791434] ? __vmx_vcpu_run+0x1f/0xa0
[ 66.791728] ? vmx_vcpu_enter_exit+0xbd/0x2d0
[ 66.792066] ? vmx_vcpu_run+0x8af/0x2870
[ 66.792376] ? __pfx_lock_release+0x10/0x10
[ 66.792727] ? __pfx_vmx_vcpu_run+0x10/0x10
[ 66.793053] ? trace_x86_fpu_regs_activated+0x135/0x190
[ 66.793466] ? vcpu_enter_guest.constprop.0+0x1997/0x4ed0
[ 66.793892] ? __pfx_vcpu_enter_guest.constprop.0+0x10/0x10
[ 66.794320] ? __pfx_lock_acquire+0x10/0x10
[ 66.794647] ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
[ 66.795101] ? lockdep_hardirqs_on_prepare+0x262/0x3f0
[ 66.795501] ? fpu_swap_kvm_fpstate+0x1c8/0x400
[ 66.795850] ? kvm_arch_vcpu_ioctl_run+0x1503/0x2340
[ 66.796232] ? kvm_arch_vcpu_ioctl_run+0xba2/0x2340
[ 66.796654] ? kvm_arch_vcpu_ioctl_run+0x1503/0x2340
[ 66.797196] ? kvm_vcpu_ioctl+0x687/0x14e0
[ 66.797576] ? do_vfs_ioctl+0x4ad/0x1840
[ 66.797898] ? __pfx_kvm_vcpu_ioctl+0x10/0x10
[ 66.798247] ? lock_release+0x20f/0x6f0
[ 66.798551] ? __pfx_lock_release+0x10/0x10
[ 66.798882] ? ioctl_has_perm.constprop.0.isra.0+0x2b2/0x410
[ 66.799322] ? __pfx_ioctl_has_perm.constprop.0.isra.0+0x10/0x10
[ 66.799789] ? __might_fault+0xe0/0x190
[ 66.800097] ? __might_fault+0x151/0x190
[ 66.800402] ? __asan_memset+0x24/0x50
[ 66.800713] ? selinux_file_ioctl+0x187/0x280
[ 66.801042] ? selinux_file_ioctl+0xb9/0x280
[ 66.801371] ? __pfx_kvm_vcpu_ioctl+0x10/0x10
[ 66.801713] ? __x64_sys_ioctl+0x19d/0x210
[ 66.802031] ? do_syscall_64+0xc1/0x1d0
[ 66.802330] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 66.802725] </TASK>
[ 66.802903] Modules linked in:
[ 66.803159] ---[ end trace 0000000000000000 ]---
[ 66.803518] RIP: 0010:__vmx_vcpu_run+0x99/0xa0
[ 66.803859] Code: 8b 58 58 4c 8b 60 60 4c 8b 68 68 4c 8b 70 70 4c 8b 78 78 48 8b 00 0f 1f8
[ 66.805360] RSP: 0018:ffff88801ea07978 EFLAGS: 00000042
[ 66.805752] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 66.806273] RDX: 0000000000000600 RSI: 0000000000000000 RDI: 0000000000000000
[ 66.806791] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 66.807310] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 66.807853] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 66.808377] FS: 000000001e0dd380(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
[ 66.808970] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 66.809404] CR2: 0000000000000000 CR3: 000000001f83a002 CR4: 0000000000772ef0
[ 66.809925] DR0: 0000000000003a92 DR1: 0000000000000000 DR2: 0000000000000400
[ 66.810452] DR3: 0000000000000007 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 66.810983] PKRU: 55555554
[ 66.811197] note: poc[1497] exited with irqs disabled
[ 66.811816] note: poc[1497] exited with preempt_count 1
[ 66.817794] ------------[ cut here ]------------
[ 66.818305] WARNING: CPU: 0 PID: 1497 at arch/x86/kernel/fpu/core.c:264 fpu_free_guest_fp0
[ 66.819083] Modules linked in:
[ 66.819337] CPU: 0 UID: 0 PID: 1497 Comm: poc Tainted: G D 6.12.4 #2
[ 66.819942] Tainted: [D]=DIE
[ 66.820172] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.14
[ 66.820886] RIP: 0010:fpu_free_guest_fpstate+0x98/0xb0
[ 66.821296] Code: 41 80 fc 03 75 1e e8 87 bb 44 00 48 c7 43 20 00 00 00 00 48 89 ef e8 874
[ 66.822674] RSP: 0018:ffff88801ea07ae0 EFLAGS: 00010293
[ 66.823078] RAX: 0000000000000000 RBX: ffff88801c852ac8 RCX: ffffffff813011de
[ 66.823669] RDX: ffff888014012280 RSI: ffffffff81301207 RDI: 0000000000000001
[ 66.824211] RBP: ffffc9000265f000 R08: 0000000000000000 R09: 0000000000000000
[ 66.824753] R10: 000000000000000b R11: 00000000000000a0 R12: 000000000000000b
[ 66.825290] R13: ffffc90002610000 R14: ffff88801c852280 R15: ffff88801ea07b68
[ 66.825839] FS: 0000000000000000(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
[ 66.826469] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 66.826909] CR2: 0000000000000000 CR3: 0000000005ad2005 CR4: 0000000000772ef0
[ 66.827471] DR0: 0000000000003a92 DR1: 0000000000000000 DR2: 0000000000000400
[ 66.828005] DR3: 0000000000000007 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 66.828545] PKRU: 55555554
[ 66.828761] Call Trace:
[ 66.828960] <TASK>
[ 66.829134] ? __warn+0xea/0x380
[ 66.829415] ? fpu_free_guest_fpstate+0x98/0xb0
[ 66.829789] ? report_bug+0x2f8/0x3f0
[ 66.830091] ? fpu_free_guest_fpstate+0x98/0xb0
[ 66.830448] ? fpu_free_guest_fpstate+0x99/0xb0
[ 66.830823] ? handle_bug+0xe5/0x180
[ 66.831111] ? exc_invalid_op+0x35/0x80
[ 66.831423] ? asm_exc_invalid_op+0x1a/0x20
[ 66.831775] ? fpu_free_guest_fpstate+0x6e/0xb0
[ 66.832138] ? fpu_free_guest_fpstate+0x97/0xb0
[ 66.832511] ? fpu_free_guest_fpstate+0x98/0xb0
[ 66.832871] kvm_arch_vcpu_destroy+0x96/0x2a0
[ 66.833211] kvm_destroy_vcpus+0x111/0x290
[ 66.833562] ? __pfx_kvm_destroy_vcpus+0x10/0x10
[ 66.833923] ? kvm_arch_vcpu_put+0x587/0x920
[ 66.834264] kvm_arch_destroy_vm+0x2e1/0x470
[ 66.834615] ? __pfx_kvm_arch_destroy_vm+0x10/0x10
[ 66.834986] ? synchronize_srcu+0x1b5/0x250
[ 66.835329] ? __pfx_kvm_vm_release+0x10/0x10
[ 66.835705] kvm_put_kvm+0x4a6/0xa00
[ 66.835988] ? __pfx_kvm_vm_release+0x10/0x10
[ 66.836327] kvm_vm_release+0x3d/0x50
[ 66.836637] __fput+0x3f6/0xb40
[ 66.836901] ? trace_irq_enable.constprop.0+0xd2/0x110
[ 66.837311] task_work_run+0x169/0x260
[ 66.837634] ? __pfx_task_work_run+0x10/0x10
[ 66.837966] ? do_raw_spin_unlock+0x53/0x220
[ 66.838309] do_exit+0xab9/0x2a30
[ 66.838596] ? _printk+0xbf/0x100
[ 66.838873] ? __pfx__printk+0x10/0x10
[ 66.839172] ? __pfx_do_exit+0x10/0x10
[ 66.839497] make_task_dead+0x174/0x3c0
[ 66.839803] ? do_syscall_64+0xc1/0x1d0
[ 66.840105] rewind_stack_and_make_dead+0x16/0x20
[ 66.840499] RIP: 0033:0x432de9
[ 66.840754] Code: Unable to access opcode bytes at 0x432dbf.
[ 66.841177] RSP: 002b:00007ffc61c6a708 EFLAGS: 00000217 ORIG_RAX: 0000000000000010
[ 66.841764] RAX: ffffffffffffffda RBX: 00007ffc61c6a938 RCX: 0000000000432de9
[ 66.842297] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
[ 66.842836] RBP: 00007ffc61c6a720 R08: 00007ffc61c6a720 R09: 00007ffc61c6a720
[ 66.843358] R10: 00007ffc61c6a720 R11: 0000000000000217 R12: 0000000000000001
[ 66.843910] R13: 00007ffc61c6a928 R14: 0000000000000001 R15: 0000000000000001
[ 66.844438] </TASK>
[ 66.844638] irq event stamp: 4538
[ 66.844904] hardirqs last enabled at (4537): [<ffffffff8123a20f>] vmx_vcpu_run+0x8af/0x20
[ 66.845563] hardirqs last disabled at (4538): [<ffffffff847e8623>] exc_general_protection0
[ 66.846243] softirqs last enabled at (3744): [<ffffffff81304ef8>] fpu_swap_kvm_fpstate+00
[ 66.846927] softirqs last disabled at (3742): [<ffffffff81304daf>] fpu_swap_kvm_fpstate+00
[ 66.847628] ---[ end trace 0000000000000000 ]---
[ 66.848051] Oops: general protection fault, probably for non-canonical address 0xfbd59c00I
[ 66.848927] KASAN: maybe wild-memory-access in range [0xdead000000000110-0xdead0000000001]
[ 66.849563] CPU: 0 UID: 0 PID: 1497 Comm: poc Tainted: G D W 6.12.4 #2
[ 66.850140] Tainted: [D]=DIE, [W]=WARN
[ 66.850428] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.14
[ 66.851099] RIP: 0010:__schedule+0x112f/0x3020
[ 66.851475] Code: 4c 89 e8 48 c1 e8 03 42 80 3c 38 00 0f 85 47 16 00 00 4d 8b 6d 00 4d 858
[ 66.852820] RSP: 0018:ffff88801ea07960 EFLAGS: 00010806
[ 66.853211] RAX: 1bd5a00000000022 RBX: ffff888014012280 RCX: 1ffff1100390a5ae
[ 66.853734] RDX: 00000027551399a8 RSI: ffffffff812197dc RDI: dead000000000110
[ 66.854261] RBP: ffff88801ea07a90 R08: 0000000000000000 R09: fffffbfff0c813b9
[ 66.854806] R10: 0000000000000000 R11: ffff8880140126f8 R12: ffff88806ce3bd40
[ 66.855435] R13: dead000000000100 R14: ffff888009598000 R15: dffffc0000000000
[ 66.855968] FS: 0000000000000000(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
[ 66.856573] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 66.857006] CR2: 0000000000000000 CR3: 0000000005ad2005 CR4: 0000000000772ef0
[ 66.857531] DR0: 0000000000003a92 DR1: 0000000000000000 DR2: 0000000000000400
[ 66.858059] DR3: 0000000000000007 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 66.858586] PKRU: 55555554
[ 66.858799] Call Trace:
[ 66.858988] <TASK>
[ 66.859160] ? die_addr+0x3c/0xa0
[ 66.859435] ? exc_general_protection+0x1a3/0x320
[ 66.859803] ? asm_exc_general_protection+0x26/0x30
[ 66.860181] ? vmx_vcpu_put+0xbc/0x7d0
[ 66.860501] ? __schedule+0x112f/0x3020
[ 66.860805] ? __pfx___schedule+0x10/0x10
[ 66.861117] ? __virt_addr_valid+0x100/0x5d0
[ 66.861455] ? trace_irq_enable.constprop.0+0xd2/0x110
[ 66.861847] ? kasan_quarantine_put+0x84/0x1d0
[ 66.862192] __cond_resched+0x45/0x70
[ 66.862481] cpus_read_lock+0x20/0x160
[ 66.862777] static_key_slow_dec+0x53/0xc0
[ 66.863108] kvm_free_lapic+0x187/0x1c0
[ 66.863448] kvm_arch_vcpu_destroy+0x10a/0x2a0
[ 66.863791] kvm_destroy_vcpus+0x111/0x290
[ 66.864111] ? __pfx_kvm_destroy_vcpus+0x10/0x10
[ 66.864463] ? kvm_arch_vcpu_put+0x587/0x920
[ 66.864801] kvm_arch_destroy_vm+0x2e1/0x470
[ 66.865138] ? __pfx_kvm_arch_destroy_vm+0x10/0x10
[ 66.865507] ? synchronize_srcu+0x1b5/0x250
[ 66.865831] ? __pfx_kvm_vm_release+0x10/0x10
[ 66.866166] kvm_put_kvm+0x4a6/0xa00
[ 66.866449] ? __pfx_kvm_vm_release+0x10/0x10
[ 66.866791] kvm_vm_release+0x3d/0x50
[ 66.867084] __fput+0x3f6/0xb40
[ 66.867339] ? trace_irq_enable.constprop.0+0xd2/0x110
[ 66.867745] task_work_run+0x169/0x260
[ 66.868037] ? __pfx_task_work_run+0x10/0x10
[ 66.868365] ? do_raw_spin_unlock+0x53/0x220
[ 66.868695] do_exit+0xab9/0x2a30
[ 66.868956] ? _printk+0xbf/0x100
[ 66.869215] ? __pfx__printk+0x10/0x10
[ 66.869510] ? __pfx_do_exit+0x10/0x10
[ 66.869804] make_task_dead+0x174/0x3c0
[ 66.870114] ? do_syscall_64+0xc1/0x1d0
[ 66.870418] rewind_stack_and_make_dead+0x16/0x20
[ 66.870783] RIP: 0033:0x432de9
[ 66.871026] Code: Unable to access opcode bytes at 0x432dbf.
[ 66.871469] RSP: 002b:00007ffc61c6a708 EFLAGS: 00000217 ORIG_RAX: 0000000000000010
[ 66.872023] RAX: ffffffffffffffda RBX: 00007ffc61c6a938 RCX: 0000000000432de9
[ 66.872543] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
[ 66.873060] RBP: 00007ffc61c6a720 R08: 00007ffc61c6a720 R09: 00007ffc61c6a720
[ 66.873580] R10: 00007ffc61c6a720 R11: 0000000000000217 R12: 0000000000000001
[ 66.874104] R13: 00007ffc61c6a928 R14: 0000000000000001 R15: 0000000000000001
[ 66.874634] </TASK>
[ 66.874810] Modules linked in:
[ 66.875057] ---[ end trace 0000000000000000 ]---
------------------------------

Best regards,
Linfeng Sun
Attachment: config-L0
Attachment: config-L1
Attachment: poc.c
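For readers triaging the backtrace in this report: it contains three separate kernel events, namely the initial general protection fault in __vmx_vcpu_run, a WARNING raised while the vCPU is being torn down, and a second general protection fault in __schedule. A minimal Python sketch that lists each event with its faulting kernel symbol might look like the following. It is not part of the original report. It assumes the log is available with one entry per line, as dmesg prints it; the function name summarize and the short SAMPLE excerpt are illustrative only.

```python
import re

# "[ 66.778426] text" -> timestamp and message body.
ENTRY = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")
# Kernel-mode RIP lines use code segment 0010; user-mode ones (0033) are skipped.
KERNEL_RIP = re.compile(r"^RIP: 0010:(\S+)")

# Short excerpt of the log from the report, one entry per line (assumed layout).
SAMPLE = """\
[ 66.778426] Oops: general protection fault, maybe for address 0x0: 0000 [#1] PREEMPT SMP I
[ 66.781289] RIP: 0010:__vmx_vcpu_run+0x99/0xa0
[ 66.803518] RIP: 0010:__vmx_vcpu_run+0x99/0xa0
[ 66.818305] WARNING: CPU: 0 PID: 1497 at arch/x86/kernel/fpu/core.c:264 fpu_free_guest_fp0
[ 66.820886] RIP: 0010:fpu_free_guest_fpstate+0x98/0xb0
[ 66.840499] RIP: 0033:0x432de9
[ 66.848051] Oops: general protection fault, probably for non-canonical address 0xfbd59c00I
[ 66.851099] RIP: 0010:__schedule+0x112f/0x3020
"""


def summarize(log_text):
    """Return one (timestamp, headline, kernel_symbol) tuple per Oops/WARNING."""
    events = []
    for raw in log_text.splitlines():
        entry = ENTRY.match(raw.strip())
        if entry is None:
            continue  # not a timestamped kernel log line
        timestamp, body = float(entry.group(1)), entry.group(2)
        if body.startswith(("Oops:", "WARNING:")):
            # A new event starts; its faulting symbol is filled in later.
            events.append([timestamp, body, None])
        elif events and events[-1][2] is None:
            # Only the first kernel-mode RIP after a headline is recorded.
            rip = KERNEL_RIP.match(body)
            if rip:
                events[-1][2] = rip.group(1)
    return [tuple(event) for event in events]


if __name__ == "__main__":
    for timestamp, headline, symbol in summarize(SAMPLE):
        print(f"{timestamp:.6f}  {symbol}  {headline}")
```

Only the first event comes directly from the VM entry path. The other two are raised while the faulting task exits and KVM destroys the VM, which is why the first RIP is the most useful starting point for analysis.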
