We have seen Linux VMs (kernel 5.4, 5.15, 6.x) get stuck with the following messages in the guest's dmesg:
- BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 37s!
- watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [kworker/3:3:2090493]
- rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
The physical CPUs are Intel(R) Xeon(R) Gold 6248R with the "Virtual Interrupt Delivery" and "Process posted interrupts" features enabled.
Investigation showed that vmx_pending_intr returns 0 even when vmexit->u.hlt.intr_status is 0xfe, 0xfd, ... and pir_desc->pending is 1.
A comment in vmx_inject_pir already describes the situation where 'pending' is 1 but every pirval is zero:
 * It is possible for pirval to be 0 here, even though the
 * pending bit has been set. The scenario is:
Correct the initial fix, 02cc877968bbcd57695035c67114a67427f54549 ("Recognize a pending virtual interrupt while emulating the halt instruction"), so that it covers both cases: pending is 0 and pending is 1.
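
For illustration only, the intended decision can be sketched as a small standalone C model. This is not the committed change: struct pir_model, pir_highest_vector and intr_pending_model are hypothetical names, and the sketch assumes the RVI priority class (taken from the low byte of intr_status) is compared against the PPR before 'pending' is consulted, so that a delivered-but-unrecognized interrupt is seen whether pending is 0 or 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Priority class = upper nibble of a vector or of the PPR. */
#define PRIO_MASK 0xf0u

/* Hypothetical, simplified stand-in for the posted-interrupt descriptor. */
struct pir_model {
	uint64_t pir[4];	/* 256 posted-interrupt request bits */
	uint64_t pending;	/* outstanding-notification flag */
};

/* Return the highest vector set in pir[], or -1 if all words are zero. */
static int
pir_highest_vector(const struct pir_model *d)
{
	for (int i = 3; i >= 0; i--) {
		uint64_t v = d->pir[i];

		if (v != 0) {
			int bit = 63;

			while ((v & (UINT64_C(1) << bit)) == 0)
				bit--;
			return (i * 64 + bit);
		}
	}
	return (-1);
}

/*
 * Model of the "is an interrupt pending?" decision made while emulating
 * HLT.  intr_status mirrors vmexit->u.hlt.intr_status and ppr mirrors
 * the virtual APIC's processor priority register.
 */
static bool
intr_pending_model(const struct pir_model *d, uint16_t intr_status,
    uint8_t ppr)
{
	unsigned int rvi = intr_status & PRIO_MASK;
	unsigned int prio = ppr & PRIO_MASK;
	int vec;

	/* Assumption: check RVI first, independent of 'pending'. */
	if (rvi > prio)
		return (true);

	if (d->pending == 0)
		return (false);

	/* 'pending' may be 1 while every pirval is 0. */
	vec = pir_highest_vector(d);
	if (vec < 0)
		return (false);
	return (((unsigned int)vec & PRIO_MASK) > prio);
}
```

In this model the observed state (pending is 1, every pirval is 0, intr_status is 0xfe) is reported as a pending interrupt, so the halted vCPU would be woken rather than left asleep.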
The same issue may also be the one reported here: debian-vm-freezes-after-several-hours
Sponsored by: vStack