oss-sec mailing list archives

Linux: general protection fault in __vmx_vcpu_run with nested virtualization


From: Linfeng Sun <slf () hdu edu cn>
Date: Mon, 6 Jan 2025 17:01:49 +0800 (GMT+08:00)

Hello list,

A bug in the Linux kernel's nested virtualization implementation can lead to a general 
protection fault in __vmx_vcpu_run when a newer L1 hypervisor kernel runs on an L0 host 
kernel that predates the following commit:
https://github.com/torvalds/linux/commit/45779be5ced626db836e612e0dc638a1601abcf2

The issue can be reproduced by running the provided PoC in the L1 environment as a user in 
the kvm group. The following files are included as attachments:
        1. L0 and L1 kernel .config
        2. PoC script run in the L1 environment
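
Before running the PoC, it may help to confirm that the L1 guest exposes a usable /dev/kvm and that nested virtualization is enabled on the L0 host. The C sketch below is not part of the original report. The function names are hypothetical, and the sysfs path named in the comment assumes an Intel L0 host using the kvm_intel module.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `path` (normally "/dev/kvm" inside L1)
 * can be opened read/write by the current user, 0 otherwise. */
static int kvm_device_usable(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return 0;
    close(fd);
    return 1;
}

/* Hypothetical helper: interprets the text read from a module parameter
 * file such as /sys/module/kvm_intel/parameters/nested on the L0 host
 * (assumed path). A leading "Y", "y" or "1" is treated as enabled. */
static int nested_param_enabled(const char *text)
{
    return text != NULL &&
           (text[0] == 'Y' || text[0] == 'y' || text[0] == '1');
}
```

In L1, kvm_device_usable("/dev/kvm") returning 0 for a user who is supposed to be in the kvm group would suggest a permissions or configuration problem rather than the bug described here.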
------------[ Backtrace ]------------
[   66.778426] Oops: general protection fault, maybe for address 0x0: 0000 [#1] PREEMPT SMP I
[   66.779973] CPU: 0 UID: 0 PID: 1497 Comm: poc Not tainted 6.12.4 #2
[   66.780523] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.14
[   66.781289] RIP: 0010:__vmx_vcpu_run+0x99/0xa0
[   66.781742] Code: 8b 58 58 4c 8b 60 60 4c 8b 68 68 4c 8b 70 70 4c 8b 78 78 48 8b 00 0f 1f8
[   66.783296] RSP: 0018:ffff88801ea07978 EFLAGS: 00000042
[   66.783768] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   66.784366] RDX: 0000000000000600 RSI: 0000000000000000 RDI: 0000000000000000
[   66.785013] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[   66.785623] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   66.786148] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   66.786668] FS:  000000001e0dd380(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
[   66.787257] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.787717] CR2: 0000000000000000 CR3: 000000001f83a002 CR4: 0000000000772ef0
[   66.788239] DR0: 0000000000003a92 DR1: 0000000000000000 DR2: 0000000000000400
[   66.788765] DR3: 0000000000000007 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   66.789284] PKRU: 55555554
[   66.789495] Call Trace:
[   66.789688]  <TASK>
[   66.789858]  ? die_addr+0x3c/0xa0
[   66.790166]  ? exc_general_protection+0x1a3/0x320
[   66.790571]  ? asm_exc_general_protection+0x26/0x30
[   66.791098]  ? __vmx_vcpu_run+0x99/0xa0
[   66.791434]  ? __vmx_vcpu_run+0x1f/0xa0
[   66.791728]  ? vmx_vcpu_enter_exit+0xbd/0x2d0
[   66.792066]  ? vmx_vcpu_run+0x8af/0x2870
[   66.792376]  ? __pfx_lock_release+0x10/0x10
[   66.792727]  ? __pfx_vmx_vcpu_run+0x10/0x10
[   66.793053]  ? trace_x86_fpu_regs_activated+0x135/0x190
[   66.793466]  ? vcpu_enter_guest.constprop.0+0x1997/0x4ed0
[   66.793892]  ? __pfx_vcpu_enter_guest.constprop.0+0x10/0x10
[   66.794320]  ? __pfx_lock_acquire+0x10/0x10
[   66.794647]  ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
[   66.795101]  ? lockdep_hardirqs_on_prepare+0x262/0x3f0
[   66.795501]  ? fpu_swap_kvm_fpstate+0x1c8/0x400
[   66.795850]  ? kvm_arch_vcpu_ioctl_run+0x1503/0x2340
[   66.796232]  ? kvm_arch_vcpu_ioctl_run+0xba2/0x2340
[   66.796654]  ? kvm_arch_vcpu_ioctl_run+0x1503/0x2340
[   66.797196]  ? kvm_vcpu_ioctl+0x687/0x14e0
[   66.797576]  ? do_vfs_ioctl+0x4ad/0x1840
[   66.797898]  ? __pfx_kvm_vcpu_ioctl+0x10/0x10
[   66.798247]  ? lock_release+0x20f/0x6f0
[   66.798551]  ? __pfx_lock_release+0x10/0x10
[   66.798882]  ? ioctl_has_perm.constprop.0.isra.0+0x2b2/0x410
[   66.799322]  ? __pfx_ioctl_has_perm.constprop.0.isra.0+0x10/0x10
[   66.799789]  ? __might_fault+0xe0/0x190
[   66.800097]  ? __might_fault+0x151/0x190
[   66.800402]  ? __asan_memset+0x24/0x50
[   66.800713]  ? selinux_file_ioctl+0x187/0x280
[   66.801042]  ? selinux_file_ioctl+0xb9/0x280
[   66.801371]  ? __pfx_kvm_vcpu_ioctl+0x10/0x10
[   66.801713]  ? __x64_sys_ioctl+0x19d/0x210
[   66.802031]  ? do_syscall_64+0xc1/0x1d0
[   66.802330]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   66.802725]  </TASK>
[   66.802903] Modules linked in:
[   66.803159] ---[ end trace 0000000000000000 ]---
[   66.803518] RIP: 0010:__vmx_vcpu_run+0x99/0xa0
[   66.803859] Code: 8b 58 58 4c 8b 60 60 4c 8b 68 68 4c 8b 70 70 4c 8b 78 78 48 8b 00 0f 1f8
[   66.805360] RSP: 0018:ffff88801ea07978 EFLAGS: 00000042
[   66.805752] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   66.806273] RDX: 0000000000000600 RSI: 0000000000000000 RDI: 0000000000000000
[   66.806791] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[   66.807310] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   66.807853] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   66.808377] FS:  000000001e0dd380(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
[   66.808970] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.809404] CR2: 0000000000000000 CR3: 000000001f83a002 CR4: 0000000000772ef0
[   66.809925] DR0: 0000000000003a92 DR1: 0000000000000000 DR2: 0000000000000400
[   66.810452] DR3: 0000000000000007 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   66.810983] PKRU: 55555554
[   66.811197] note: poc[1497] exited with irqs disabled
[   66.811816] note: poc[1497] exited with preempt_count 1
[   66.817794] ------------[ cut here ]------------
[   66.818305] WARNING: CPU: 0 PID: 1497 at arch/x86/kernel/fpu/core.c:264 fpu_free_guest_fp0
[   66.819083] Modules linked in:
[   66.819337] CPU: 0 UID: 0 PID: 1497 Comm: poc Tainted: G      D            6.12.4 #2
[   66.819942] Tainted: [D]=DIE
[   66.820172] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.14
[   66.820886] RIP: 0010:fpu_free_guest_fpstate+0x98/0xb0
[   66.821296] Code: 41 80 fc 03 75 1e e8 87 bb 44 00 48 c7 43 20 00 00 00 00 48 89 ef e8 874
[   66.822674] RSP: 0018:ffff88801ea07ae0 EFLAGS: 00010293
[   66.823078] RAX: 0000000000000000 RBX: ffff88801c852ac8 RCX: ffffffff813011de
[   66.823669] RDX: ffff888014012280 RSI: ffffffff81301207 RDI: 0000000000000001
[   66.824211] RBP: ffffc9000265f000 R08: 0000000000000000 R09: 0000000000000000
[   66.824753] R10: 000000000000000b R11: 00000000000000a0 R12: 000000000000000b
[   66.825290] R13: ffffc90002610000 R14: ffff88801c852280 R15: ffff88801ea07b68
[   66.825839] FS:  0000000000000000(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
[   66.826469] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.826909] CR2: 0000000000000000 CR3: 0000000005ad2005 CR4: 0000000000772ef0
[   66.827471] DR0: 0000000000003a92 DR1: 0000000000000000 DR2: 0000000000000400
[   66.828005] DR3: 0000000000000007 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   66.828545] PKRU: 55555554
[   66.828761] Call Trace:
[   66.828960]  <TASK>
[   66.829134]  ? __warn+0xea/0x380
[   66.829415]  ? fpu_free_guest_fpstate+0x98/0xb0
[   66.829789]  ? report_bug+0x2f8/0x3f0
[   66.830091]  ? fpu_free_guest_fpstate+0x98/0xb0
[   66.830448]  ? fpu_free_guest_fpstate+0x99/0xb0
[   66.830823]  ? handle_bug+0xe5/0x180
[   66.831111]  ? exc_invalid_op+0x35/0x80
[   66.831423]  ? asm_exc_invalid_op+0x1a/0x20
[   66.831775]  ? fpu_free_guest_fpstate+0x6e/0xb0
[   66.832138]  ? fpu_free_guest_fpstate+0x97/0xb0
[   66.832511]  ? fpu_free_guest_fpstate+0x98/0xb0
[   66.832871]  kvm_arch_vcpu_destroy+0x96/0x2a0
[   66.833211]  kvm_destroy_vcpus+0x111/0x290
[   66.833562]  ? __pfx_kvm_destroy_vcpus+0x10/0x10
[   66.833923]  ? kvm_arch_vcpu_put+0x587/0x920
[   66.834264]  kvm_arch_destroy_vm+0x2e1/0x470
[   66.834615]  ? __pfx_kvm_arch_destroy_vm+0x10/0x10
[   66.834986]  ? synchronize_srcu+0x1b5/0x250
[   66.835329]  ? __pfx_kvm_vm_release+0x10/0x10
[   66.835705]  kvm_put_kvm+0x4a6/0xa00
[   66.835988]  ? __pfx_kvm_vm_release+0x10/0x10
[   66.836327]  kvm_vm_release+0x3d/0x50
[   66.836637]  __fput+0x3f6/0xb40
[   66.836901]  ? trace_irq_enable.constprop.0+0xd2/0x110
[   66.837311]  task_work_run+0x169/0x260
[   66.837634]  ? __pfx_task_work_run+0x10/0x10
[   66.837966]  ? do_raw_spin_unlock+0x53/0x220
[   66.838309]  do_exit+0xab9/0x2a30
[   66.838596]  ? _printk+0xbf/0x100
[   66.838873]  ? __pfx__printk+0x10/0x10
[   66.839172]  ? __pfx_do_exit+0x10/0x10
[   66.839497]  make_task_dead+0x174/0x3c0
[   66.839803]  ? do_syscall_64+0xc1/0x1d0
[   66.840105]  rewind_stack_and_make_dead+0x16/0x20
[   66.840499] RIP: 0033:0x432de9
[   66.840754] Code: Unable to access opcode bytes at 0x432dbf.
[   66.841177] RSP: 002b:00007ffc61c6a708 EFLAGS: 00000217 ORIG_RAX: 0000000000000010
[   66.841764] RAX: ffffffffffffffda RBX: 00007ffc61c6a938 RCX: 0000000000432de9
[   66.842297] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
[   66.842836] RBP: 00007ffc61c6a720 R08: 00007ffc61c6a720 R09: 00007ffc61c6a720
[   66.843358] R10: 00007ffc61c6a720 R11: 0000000000000217 R12: 0000000000000001
[   66.843910] R13: 00007ffc61c6a928 R14: 0000000000000001 R15: 0000000000000001
[   66.844438]  </TASK>
[   66.844638] irq event stamp: 4538
[   66.844904] hardirqs last  enabled at (4537): [<ffffffff8123a20f>] vmx_vcpu_run+0x8af/0x20
[   66.845563] hardirqs last disabled at (4538): [<ffffffff847e8623>] exc_general_protection0
[   66.846243] softirqs last  enabled at (3744): [<ffffffff81304ef8>] fpu_swap_kvm_fpstate+00
[   66.846927] softirqs last disabled at (3742): [<ffffffff81304daf>] fpu_swap_kvm_fpstate+00
[   66.847628] ---[ end trace 0000000000000000 ]---
[   66.848051] Oops: general protection fault, probably for non-canonical address 0xfbd59c00I
[   66.848927] KASAN: maybe wild-memory-access in range [0xdead000000000110-0xdead0000000001]
[   66.849563] CPU: 0 UID: 0 PID: 1497 Comm: poc Tainted: G      D W          6.12.4 #2
[   66.850140] Tainted: [D]=DIE, [W]=WARN
[   66.850428] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.14
[   66.851099] RIP: 0010:__schedule+0x112f/0x3020
[   66.851475] Code: 4c 89 e8 48 c1 e8 03 42 80 3c 38 00 0f 85 47 16 00 00 4d 8b 6d 00 4d 858
[   66.852820] RSP: 0018:ffff88801ea07960 EFLAGS: 00010806
[   66.853211] RAX: 1bd5a00000000022 RBX: ffff888014012280 RCX: 1ffff1100390a5ae
[   66.853734] RDX: 00000027551399a8 RSI: ffffffff812197dc RDI: dead000000000110
[   66.854261] RBP: ffff88801ea07a90 R08: 0000000000000000 R09: fffffbfff0c813b9
[   66.854806] R10: 0000000000000000 R11: ffff8880140126f8 R12: ffff88806ce3bd40
[   66.855435] R13: dead000000000100 R14: ffff888009598000 R15: dffffc0000000000
[   66.855968] FS:  0000000000000000(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
[   66.856573] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.857006] CR2: 0000000000000000 CR3: 0000000005ad2005 CR4: 0000000000772ef0
[   66.857531] DR0: 0000000000003a92 DR1: 0000000000000000 DR2: 0000000000000400
[   66.858059] DR3: 0000000000000007 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   66.858586] PKRU: 55555554
[   66.858799] Call Trace:
[   66.858988]  <TASK>
[   66.859160]  ? die_addr+0x3c/0xa0
[   66.859435]  ? exc_general_protection+0x1a3/0x320
[   66.859803]  ? asm_exc_general_protection+0x26/0x30
[   66.860181]  ? vmx_vcpu_put+0xbc/0x7d0
[   66.860501]  ? __schedule+0x112f/0x3020
[   66.860805]  ? __pfx___schedule+0x10/0x10
[   66.861117]  ? __virt_addr_valid+0x100/0x5d0
[   66.861455]  ? trace_irq_enable.constprop.0+0xd2/0x110
[   66.861847]  ? kasan_quarantine_put+0x84/0x1d0
[   66.862192]  __cond_resched+0x45/0x70
[   66.862481]  cpus_read_lock+0x20/0x160
[   66.862777]  static_key_slow_dec+0x53/0xc0
[   66.863108]  kvm_free_lapic+0x187/0x1c0
[   66.863448]  kvm_arch_vcpu_destroy+0x10a/0x2a0
[   66.863791]  kvm_destroy_vcpus+0x111/0x290
[   66.864111]  ? __pfx_kvm_destroy_vcpus+0x10/0x10
[   66.864463]  ? kvm_arch_vcpu_put+0x587/0x920
[   66.864801]  kvm_arch_destroy_vm+0x2e1/0x470
[   66.865138]  ? __pfx_kvm_arch_destroy_vm+0x10/0x10
[   66.865507]  ? synchronize_srcu+0x1b5/0x250
[   66.865831]  ? __pfx_kvm_vm_release+0x10/0x10
[   66.866166]  kvm_put_kvm+0x4a6/0xa00
[   66.866449]  ? __pfx_kvm_vm_release+0x10/0x10
[   66.866791]  kvm_vm_release+0x3d/0x50
[   66.867084]  __fput+0x3f6/0xb40
[   66.867339]  ? trace_irq_enable.constprop.0+0xd2/0x110
[   66.867745]  task_work_run+0x169/0x260
[   66.868037]  ? __pfx_task_work_run+0x10/0x10
[   66.868365]  ? do_raw_spin_unlock+0x53/0x220
[   66.868695]  do_exit+0xab9/0x2a30
[   66.868956]  ? _printk+0xbf/0x100
[   66.869215]  ? __pfx__printk+0x10/0x10
[   66.869510]  ? __pfx_do_exit+0x10/0x10
[   66.869804]  make_task_dead+0x174/0x3c0
[   66.870114]  ? do_syscall_64+0xc1/0x1d0
[   66.870418]  rewind_stack_and_make_dead+0x16/0x20
[   66.870783] RIP: 0033:0x432de9
[   66.871026] Code: Unable to access opcode bytes at 0x432dbf.
[   66.871469] RSP: 002b:00007ffc61c6a708 EFLAGS: 00000217 ORIG_RAX: 0000000000000010
[   66.872023] RAX: ffffffffffffffda RBX: 00007ffc61c6a938 RCX: 0000000000432de9
[   66.872543] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
[   66.873060] RBP: 00007ffc61c6a720 R08: 00007ffc61c6a720 R09: 00007ffc61c6a720
[   66.873580] R10: 00007ffc61c6a720 R11: 0000000000000217 R12: 0000000000000001
[   66.874104] R13: 00007ffc61c6a928 R14: 0000000000000001 R15: 0000000000000001
[   66.874634]  </TASK>
[   66.874810] Modules linked in:
[   66.875057] ---[ end trace 0000000000000000 ]---
------------------------------
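
For readers mapping the user-space register dump back to the triggering call: ORIG_RAX 0x10 is the x86-64 ioctl syscall number, RDI 0x5 is the file descriptor, and RSI 0xae80 is the request. The sketch below is not from the original report; it decodes a request number using the generic Linux ioctl bit layout used on x86, and the struct and function names are illustrative only.

```c
#include <stdint.h>

/* Fields of a Linux ioctl request number in the generic (x86) layout:
 * bits 0-7 command number, bits 8-15 type ("magic"),
 * bits 16-29 argument size, bits 30-31 direction. */
struct ioctl_fields {
    unsigned dir;
    unsigned size;
    unsigned type;
    unsigned nr;
};

/* Illustrative helper: split a request number into its fields. */
static struct ioctl_fields decode_ioctl(uint32_t cmd)
{
    struct ioctl_fields f;
    f.nr   = cmd & 0xffu;
    f.type = (cmd >> 8) & 0xffu;
    f.size = (cmd >> 16) & 0x3fffu;
    f.dir  = (cmd >> 30) & 0x3u;
    return f;
}
```

decode_ioctl(0xae80) yields type 0xae (the KVM ioctl magic), nr 0x80, size 0 and no direction bits, which matches KVM_RUN and is consistent with the kvm_vcpu_ioctl and kvm_arch_vcpu_ioctl_run frames in the first trace.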
Best regards,
Linfeng Sun


Attachment: config-L0
Attachment: config-L1
Attachment: poc.c
