Commit
When a system with E810 with existing VFs gets rebooted the following
hang may be observed.

 Pid 1 is hung in iavf_remove(), part of a network driver:
 PID: 1        TASK: ffff965400e5a340  CPU: 24   COMMAND: "systemd-shutdow"
  #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
  #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
  #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
  #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
  #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
  #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
  #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
  #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
  #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
  #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
 #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
 #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
 #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
 #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
 #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
     RIP: 00007f1baa5c13d7  RSP: 00007fffbcc55a98  RFLAGS: 00000202
     RAX: ffffffffffffffda  RBX: 0000000000000000  RCX: 00007f1baa5c13d7
     RDX: 0000000001234567  RSI: 0000000028121969  RDI: 00000000fee1dead
     RBP: 00007fffbcc55ca0   R8: 0000000000000000   R9: 00007fffbcc54e90
     R10: 00007fffbcc55050  R11: 0000000000000202  R12: 0000000000000005
     R13: 0000000000000000  R14: 00007fffbcc55af0  R15: 0000000000000000
     ORIG_RAX: 00000000000000a9  CS: 0033  SS: 002b

During reboot all drivers PM shutdown callbacks are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case it sleeps forever.
So if iavf_shutdown() gets invoked before iavf_remove() the system will
hang indefinitely because the adapter is already in state __IAVF_REMOVE.

Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().

Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove")
Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <[email protected]>
Signed-off-by: Stefan Assmann <[email protected]>
Reviewed-by: Michal Kubiak <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
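
The hang and the fix described above can be illustrated with a small user-space model. This is only a sketch and not the driver code: the enum values mirror the state names in the commit message, while model_remove(), state_allows_remove(), the with_fix flag and the max_polls bound are hypothetical names introduced for illustration. The real iavf_remove() sleeps between polls with no upper bound, which the model represents by reporting REMOVE_WOULD_HANG once its poll budget runs out.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the adapter states named in the commit message. */
enum model_state {
	MODEL_RUNNING,      /* __IAVF_RUNNING */
	MODEL_DOWN,         /* __IAVF_DOWN */
	MODEL_INIT_FAILED,  /* __IAVF_INIT_FAILED */
	MODEL_REMOVE        /* __IAVF_REMOVE, set by the shutdown path */
};

enum remove_outcome {
	REMOVE_PROCEEDS,       /* state was acceptable, teardown continues */
	REMOVE_RETURNS_EARLY,  /* fix applied: shutdown already ran */
	REMOVE_WOULD_HANG      /* poll budget exhausted, models the endless sleep */
};

/* The states in which the remove path is allowed to continue. */
bool state_allows_remove(enum model_state s)
{
	return s == MODEL_RUNNING || s == MODEL_DOWN || s == MODEL_INIT_FAILED;
}

/*
 * Hypothetical model of the remove path. The state never changes while we
 * poll, which matches the reboot case: nothing moves the adapter out of
 * the REMOVE state once the shutdown callback has run.
 */
enum remove_outcome model_remove(enum model_state s, bool with_fix,
				 int max_polls)
{
	/* The fix: bail out if the shutdown callback already ran. */
	if (with_fix && s == MODEL_REMOVE)
		return REMOVE_RETURNS_EARLY;

	for (int i = 0; i < max_polls; i++) {
		if (state_allows_remove(s))
			return REMOVE_PROCEEDS;
		/* The driver would sleep here before polling again. */
	}
	return REMOVE_WOULD_HANG;
}

/* Usage example: compare behaviour with and without the early return. */
void model_self_check(void)
{
	assert(model_remove(MODEL_REMOVE, false, 1000) == REMOVE_WOULD_HANG);
	assert(model_remove(MODEL_REMOVE, true, 1000) == REMOVE_RETURNS_EARLY);
	assert(model_remove(MODEL_RUNNING, true, 1000) == REMOVE_PROCEEDS);
}
```

In the model, the early return only affects the MODEL_REMOVE case; the other states still go through the unchanged polling loop.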