INFO: task incusd:2939 blocked for more than 362 seconds

oddjobz · June 9, 2025, 10:09am

Anyone any thoughts on what might cause this … experienced during a container restart … container doesn’t restart, then neither will incus …

Jun 09 10:05:50 lite kernel: INFO: task incusd:2939 blocked for more than 362 seconds.
Jun 09 10:05:50 lite kernel:       Tainted: P           O       6.12.25-v8-16k #1
Jun 09 10:05:50 lite kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 09 10:05:50 lite kernel: task:incusd          state:D stack:0     pid:2939  tgid:2847  ppid:1      flags:0x0000000c
Jun 09 10:05:50 lite kernel: Call trace:
Jun 09 10:05:50 lite kernel:  __switch_to+0xf0/0x150
Jun 09 10:05:50 lite kernel:  __schedule+0x38c/0xdd8
Jun 09 10:05:50 lite kernel:  schedule+0x3c/0x148
Jun 09 10:05:50 lite kernel:  grab_super+0x158/0x1c0
Jun 09 10:05:50 lite kernel:  sget+0x150/0x268
Jun 09 10:05:50 lite kernel:  zpl_mount+0x134/0x2f8 [zfs]
Jun 09 10:05:50 lite kernel:  legacy_get_tree+0x38/0x70
Jun 09 10:05:50 lite kernel:  vfs_get_tree+0x30/0x100
Jun 09 10:05:50 lite kernel:  path_mount+0x410/0xa98
Jun 09 10:05:50 lite kernel:  __arm64_sys_mount+0x194/0x2c0
Jun 09 10:05:50 lite kernel:  invoke_syscall+0x50/0x120
Jun 09 10:05:50 lite kernel:  el0_svc_common.constprop.0+0x48/0xf0
Jun 09 10:05:50 lite kernel:  do_el0_svc+0x24/0x38
Jun 09 10:05:50 lite kernel:  el0_svc+0x30/0xd0
Jun 09 10:05:50 lite kernel:  el0t_64_sync_handler+0x100/0x130
Jun 09 10:05:50 lite kernel:  el0t_64_sync+0x190/0x198

oddjobz · June 9, 2025, 11:45am

Update: happened again on a different node, possibly associated with;

ovsdb-server[2598]: ovs|00034|raft|INFO|Transferring leadership to write a snapshot.
ovsdb-server[2598]: ovs|00035|raft|INFO|rejected append_reply (not leader)
ovsdb-server[2598]: ovs|00036|raft|INFO|rejected append_reply (not leader)
ovsdb-server[2598]: ovs|00037|raft|INFO|server 18e5 is leader for term 6

And

kernel: eth1: renamed from veth87299ae4
kernel: veth6f6b37eb: renamed from physn1QH2O
incusd[8150]: time="2025-06-09T12:39:03+01:00" level=warning msg="Could not find OVN Switch port associated to OVS interface" device=eth-1 driver=nic instance=kuma interface=vethf03f89dc project=default

This is bad news because it locks the entire node needing a reboot.

oddjobz · June 9, 2025, 5:27pm

Ok, I’ve not been able to reproduce it “exactly”, however it tends to happen when I’m changing an instance, either the profile (which implicitly changes the network) or adding / deleting interfaces, predominantly OVN networks and interfaces. Had it 4 times today so far … node won’t even close down, needs a power button.

stgraber · June 9, 2025, 5:31pm

The error seems to indicate a kernel level lock with a mount operation.

You can look at ps fauxww on the system and look for processes in D state to get an idea of what’s currently stuck, but when you get those kind of messages, there’s nothing that Incus can do about it, it’s no longer running until whatever syscall it’s stuck on finally completes.

oddjobz · June 9, 2025, 5:36pm

Sure, makes sense … but it doesn’t happen (at all) when I’m not messing with interfaces, which I’m doing through Incus. Whereas I appreciate the problem is at a lower level somewhere in the kernel, it would appear to be being triggered by Incus’ behavior … maybe the way or order in which incus is adding and removing things.

I’ll see if I can pin it down any further …

oddjobz · June 20, 2025, 8:02am

So this is triggered by stopping / starting instances with static IP addresses where the address is on an OVN network. Changing an address can seem to be the issue as one tends to restart an instance after setting it’s address as static.

Avoid restarting containers with pinned IP’s, problem vanishes
Only use dynamically allocated non-pinned IP’s, problem vanishes
Avoid using OVN, problem vanishes

simos · June 20, 2025, 8:36am

Can you make a reproducible with some minimal OVN network setup? That is, likely create a VM in Incus, in there do the minimal OVN setup, and make the problem appear.

oddjobz · June 20, 2025, 9:10am

In theory I could. This would however involve spare equipment I don’t currently have available and time that I seem to be running out of. I’d like to thank everyone who’s helped thus far but as far as OVN is concerned I’ve now run out of road.

I had a fully functional network yesterday morning that I’d spent 4-5 months building and getting to grips with, and a fully automated reproducible deployment system. A number of operational issues remained that I was working out solutions for, but I was fairly confident I knew what I was doing. Hey, 30+ years with ipv4, I must’ve learnt a little, right?

Then it stopped. Something obviously changed, but I don’t know what. After spending all day trying to recover (and failing) including redeploying, I came to the conclusion that I just can’t afford to run on a platform that is as fragile as the one I seem to have built and despite the investment and how much it kills me to bin all the work I’ve put in.

Within a matter of hours of making the decision I had a workable alternative that seems to provide all the facilities I was hoping for from OVN, “and” solves all the outstanding operational issues that seemed difficult to solve with OVN. My important / “live” stuff I’ve migrated off onto non-clustered hardware and I’ll be working on the alternative over the next few days … I doubt it’s going to take too long to implement.

I think from the get-go I mis-interpreted what OVN is for vs the alternatives and just went down the wrong route (for me).

oddjobz · July 1, 2025, 1:18am

Update:

So the original problem is not linked to OVN, nor is it linked to a particular kernel version or any kind of exotic configuration. The problem is relatively random and happens (sometimes) on container restarts. It only happens on instances with overridden network parameters.

I’m updating this because I thought it was just happening on overridden IP addresses, but it’s not, if you override the mac address it can also cause the problem. i.e. restarting a container with an overridden address.

I know technically this looks like a ZFS error, but it’s being caused by something very specific that Incus is doing “differently” when restarting containers with overridden network keys. (specifically the ip address and hw address)

It’s the worst kind of problem as it locks the entire machine, which means a full reboot. The knock-on is that because it locks Incus, it then locks nodes in the rest of the cluster if they try to update anything.

If there’s no fix, any thoughts on mitigation would be much appeaciated. If I don’t override the hardware address then it looks like I’m getting random mac addresses which change each restart. This “seems” to be new behaviour, but either way I can’t keep anything stable atm … either I’m constantly renumbering, or I override stuff and risk lock-ups …

My config is now very simple, local VLANS with my own DHCP server, remote is a L2 tunnel using GRE over Wireguard. Previous config was OVN with managed networks. Running across half a dozen machines with three different kernel versions. (Debian 6.6 => 6/12/25) I’m running an Ubuntu Node now just on the off-chance that makes a difference …

oddjobz · July 7, 2025, 8:56pm

Another Update:

Looks like this is the problem and it looks suspiciously like it might be fixed as of ZFS 2.3.3

github.com/openzfs/zfs

Hang in zpl_create when Docker mounts a dataset under load in 2.3.1

opened 03:57PM - 12 Mar 25 UTC

closed 07:14PM - 04 Apr 25 UTC

TheUbuntuGuy

Type: Defect

### System information Type | Version/Name --- | --- Distribution Name | Debia…n Distribution Version | 12 / Bookworm Kernel Version | 6.13.6 Architecture | amd64 OpenZFS Version | 2.3.1 ### Describe the problem you're observing After upgrading from `2.2.7` to `2.3.1`, repeatedly observing the following task hang mounting datasets when starting a Docker container: ``` INFO: task dockerd:191796 blocked for more than 1087 seconds. Tainted: P O 6.13.6-xxx #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:dockerd state:D stack:0 pid:1914796 tgid:1308 ppid:1 flags:0x00000002 Call Trace: <TASK> __schedule+0x47e/0xf90 ? super_lock.part.0+0x1b/0x70 ? super_lock+0x4d/0xc0 ? zpl_create+0x200/0x200 [zfs] schedule+0x26/0xa0 grab_super+0xd5/0x110 ? prepare_to_wait_event+0x110/0x110 sget+0xe3/0x250 ? get_anon_bdev+0x40/0x40 zpl_mount+0xdb/0x340 [zfs] legacy_get_tree+0x24/0x40 vfs_get_tree+0x1f/0xc0 path_mount+0x42b/0x960 do_mount+0x72/0x90 __x64_sys_mount+0x8e/0xd0 do_syscall_64+0x41/0xd0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x562651633b6e RSP: 002b:000000c0025a8288 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 000000c0001c9da0 RCX: 0000562651633b6e RDX: 000000c0017f2e40 RSI: 000000c0001c9e00 RDI: 000000c0001c9da0 RBP: 000000c0025a82c8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000206 R12: 000000c0017f2e40 R13: 0000000000000000 R14: 000000c001109c00 R15: 00000000000000e4 </TASK> ``` The trace is always the same. It never unblocks on its own, even when the system returns to idle, requiring a kernel reset to recover. ### Describe how to reproduce the problem Under heavy load, spawn a Docker container. It may be possible to trigger this with other ZFS mounts (not triggered by Docker's graphdriver), but I have not tried.

oddjobz · July 10, 2025, 10:40pm

Ok, so I’ve been hitting two new servers pretty heavily over the last two days, no signs of any lockups. These are both running Trixie which comes with zfs 2.3.2, so it looks (at this point) like the problem runs up to ZFS 2.3.1 and is fixed in ZFS 2.3.2. So upgrade to Trixie seems to be the Debian fix.

Incidentally Debian seem to be saying “upgrade at your own risk, we recommend a clean install”, but the upgrade worked fine for me. The only problem I had was with “systemd-networkd-wait-online” which seems to lock-out the boot process (for me) waiting on interfaces that will never come ready. (boots after a 2m timeout)

The best fix I’ve found thus far is;

# cat /etc/systemd/system/systemd-networkd-wait-online.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/lib/systemd/systemd-networkd-wait-online --interface=(your actual interfaces)

That gives a very crisp and clean boot.