VMs frozen at boot after snap LXD update (4.24): io_uring problems

Booting back to the non-HWE kernel doesn’t work?

Yep, I reinstalled the node with the 20.04 HWE kernel and am now copying the VMs back to test.

After installing the HWE kernel I got the error “No key available with this passphrase.” on the ZFS partitions. Unfortunately, neither of my two passphrases works with the old kernel either. So be careful when combining a kernel upgrade with LUKS.

I did not expect this because the OS installation itself isn’t encrypted. Anyway… We have backups for that. Everything is up & running and the VMs are working with the HWE kernel.

Thanks @tomp and @stgraber

@tomp @stgraber
Hello gentlemen,
Today several of my prod servers hung after a reboot, and I came to the conclusion that the 4.24 LXD snap package is the cause. Downgrading to 4.23 from rescue helped, even though it was not straightforward. Then I found this issue. I see the fix has been merged into master. I am holding my breath and trying not to touch or reboot the other hundreds of servers… Could you please advise when a new snap package with the fix will be released? I need to decide whether to downgrade en masse or just wait…
Thanks in advance.

It’s building right now, I expect we’ll start an accelerated roll-out in the next couple of hours.

io_uring works with HWE and LUKS in 20.04. I couldn’t find any problems.
I will use HWE from now on (until 22.04 is available)

Today I saw that my / partition was full :frowning_face: because of this spam in syslog / kern.log:

Mar 26 13:29:48 server kernel: [  217.602554]  <TASK>
Mar 26 13:29:48 server kernel: [  217.602554]  kvm_arch_vcpu_ioctl_run+0xe6/0x5d0 [kvm]
Mar 26 13:29:48 server kernel: [  217.602555]  vcpu_enter_guest+0x354/0x11e0 [kvm]
Mar 26 13:29:48 server kernel: [  217.602589]  kvm_vcpu_ioctl+0x247/0x5f0 [kvm]
Mar 26 13:29:48 server kernel: [  217.602594]  ? kvm_skip_emulated_instruction+0x1f/0x40 [kvm]
Mar 26 13:29:48 server kernel: [  217.602616]  ? __fget_light+0xce/0xf0
Mar 26 13:29:48 server kernel: [  217.602619]  __x64_sys_ioctl+0x91/0xc0
Mar 26 13:29:48 server kernel: [  217.602622]  do_syscall_64+0x61/0xb0
Mar 26 13:29:48 server kernel: [  217.602624]  ? do_syscall_64+0x6e/0xb0
Mar 26 13:29:48 server kernel: [  217.602625]  ? do_syscall_64+0x6e/0xb0
Mar 26 13:29:48 server kernel: [  217.602627]  ? asm_sysvec_irq_work+0xa/0x20
Mar 26 13:29:48 server kernel: [  217.602630]  entry_SYSCALL_64_after_hwframe+0x44/0xae
Mar 26 13:29:48 server kernel: [  217.602633] RIP: 0033:0x7ff6420a83db
Mar 26 13:29:48 server kernel: [  217.602634] Code: 0f 1e fa 48 8b 05 b5 7a 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 7a 0d 00 f7 d8 64 89 01 48
Mar 26 13:29:48 server kernel: [  217.602636] RSP: 002b:00007ff637071248 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mar 26 13:29:48 server kernel: [  217.602638] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007ff6420a83db
Mar 26 13:29:48 server kernel: [  217.602640] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000019
Mar 26 13:29:48 server kernel: [  217.602641] RBP: 000055e0cf5e19e0 R08: 000055e0cd790e68 R09: 000055e0cd447370
Mar 26 13:29:48 server kernel: [  217.602642] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Mar 26 13:29:48 server kernel: [  217.602643] R13: 000055e0cdddefe0 R14: 0000000000000000 R15: 0000000000000000
Mar 26 13:29:48 server kernel: [  217.602645]  </TASK>
Mar 26 13:29:48 server kernel: [  217.602646] ---[ end trace 836ff300d5b60eae ]---
Mar 26 13:29:02 server kernel: [  171.484052] Call Trace:
Mar 26 13:29:02 server kernel: [  171.484051]  __x64_sys_ioctl+0x91/0xc0
Mar 26 13:29:02 server kernel: [  171.484053]  <TASK>
Mar 26 13:29:02 server kernel: [  171.484057]  do_syscall_64+0x61/0xb0
Mar 26 13:29:02 server kernel: [  171.484061]  ? syscall_exit_to_user_mode+0x27/0x50
Mar 26 13:29:02 server kernel: [  171.484065]  ? do_syscall_64+0x6e/0xb0
Mar 26 13:29:02 server kernel: [  171.484068]  ? syscall_exit_to_user_mode+0x27/0x50
Mar 26 13:29:02 server kernel: [  171.484071]  ? do_syscall_64+0x6e/0xb0
Mar 26 13:29:02 server kernel: [  171.484074]  ? do_syscall_64+0x6e/0xb0
Mar 26 13:29:02 server kernel: [  171.484076]  ? do_syscall_64+0x6e/0xb0
Mar 26 13:29:02 server kernel: [  171.484079]  ? asm_sysvec_call_function+0xa/0x20
Mar 26 13:29:02 server kernel: [  171.484055]  vcpu_enter_guest+0x354/0x11e0 [kvm]
Mar 26 13:29:02 server kernel: [  171.484084]  entry_SYSCALL_64_after_hwframe+0x44/0xae
Mar 26 13:29:02 server kernel: [  171.484089] RIP: 0033:0x7ff6420a83db
Mar 26 13:29:02 server kernel: [  171.484092] Code: 0f 1e fa 48 8b 05 b5 7a 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 7a 0d 00 f7 d8 64 89 01 48
Mar 26 13:29:02 server kernel: [  171.484095] RSP: 002b:00007ff637071248 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mar 26 13:29:02 server kernel: [  171.484099] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007ff6420a83db
Mar 26 13:29:02 server kernel: [  171.484101] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000019
Mar 26 13:29:02 server kernel: [  171.484104] RBP: 000055e0cf5e19e0 R08: 000055e0cd790e68 R09: 000000003b9aca00
Mar 26 13:29:02 server kernel: [  171.484106] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Mar 26 13:29:02 server kernel: [  171.484108] R13: 000055e0cdddefe0 R14: 0000000000000004 R15: 0000000000000000
Mar 26 13:29:02 server kernel: [  171.484112]  </TASK>
Mar 26 13:29:02 server kernel: [  171.484113] ---[ end trace 836ff300d5b227e9 ]---
Mar 26 13:29:02 server kernel: [  171.484111]  ? kvm_skip_emulated_instruction+0x1f/0x40 [kvm]
Mar 26 13:29:02 server kernel: [  171.484144] ------------[ cut here ]------------
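
To gauge how quickly traces like the ones above fill the disk, something along these lines could count completed traces per minute. This is only a rough sketch: the function name `traces_per_minute` is made up for illustration, and it assumes the syslog line format shown in the excerpt (month, day, time, host, then `kernel:`).

```python
import re
from collections import Counter

# Matches a syslog kernel line that ends a trace and captures the
# timestamp truncated to the minute, e.g. "Mar 26 13:29".
# Assumption: lines look like the excerpt in this thread.
END_TRACE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+kernel:.*---\[ end trace "
)


def traces_per_minute(lines):
    """Count '---[ end trace ... ]---' markers, grouped by minute."""
    counts = Counter()
    for line in lines:
        match = END_TRACE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical usage; the log path is an assumption (Ubuntu default):
# with open("/var/log/kern.log", errors="replace") as fh:
#     for minute, n in sorted(traces_per_minute(fh).items()):
#         print(minute, n)
```

A high count per minute right after boot would suggest the traces are what is filling the / partition, rather than ordinary log growth.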

I have reinstalled the server without the HWE kernel. Let’s see if it continues to run well.

-30 is safe, -38 may be good, -37 and -39 are definitely broken.
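
For anyone wanting to check many hosts against that list, here is a small sketch. It assumes the numbers refer to the ABI suffix of the kernel release string (the part after the first dash in `uname -r`); the thread does not state the base kernel version, so the check is only meaningful on the kernel series discussed here. The names `ABI_STATUS`, `abi_number` and `abi_status` are made up for illustration.

```python
import platform
import re

# Status per ABI number, copied from the post above.
# Assumption: these are Ubuntu kernel ABI suffixes; the base version
# is not given in the thread, so this table is series-specific.
ABI_STATUS = {30: "safe", 38: "may be good", 37: "broken", 39: "broken"}


def abi_number(release):
    """Extract the ABI number from a release string like 'X.Y.Z-NN-flavour'."""
    match = re.match(r"^\d+\.\d+\.\d+-(\d+)", release)
    return int(match.group(1)) if match else None


def abi_status(release):
    """Look up the reported status for a release string, or 'unknown'."""
    return ABI_STATUS.get(abi_number(release), "unknown")


# Hypothetical usage on the local host:
# print(platform.release(), "->", abi_status(platform.release()))
```

Anything reported as "unknown" simply was not mentioned in this thread and should not be read as safe.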
