Hi, me again, the mini-angel-of-death, with another bug report.
Had an error connecting to my incus-os system this morning:
$ incus list
Error: Get “https://192.168.1.2:8443/1.0”: Unable to connect to: 192.168.1.2:8443 ([dial tcp 192.168.1.2:8443: connect: connection refused])
The VMs were all running fine, but there was no administrative connection.
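For anyone wanting to confirm this kind of symptom from another machine, here is a minimal sketch that classifies a TCP connection attempt. It only uses Python's standard `socket` module; the function name `probe` is made up for illustration and is not part of Incus. A "refused" result means the host answered but nothing was listening on the port, which fits an OS that is up while the Incus API is down. A "timeout" would point more towards the host or network being unreachable.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connection attempt as 'open', 'refused', 'timeout' or 'error'."""
    try:
        # create_connection performs the full TCP handshake and raises on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host replied with a reset: it is reachable, but nothing listens on the port.
        return "refused"
    except socket.timeout:
        # No reply at all within the timeout.
        return "timeout"
    except OSError:
        # Anything else (no route to host, name resolution failure, ...).
        return "error"
```

Calling `probe("192.168.1.2", 8443)` against the address from the error above would be the obvious use.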
Connecting a monitor to the physical machine showed the following errors (which I have to reproduce in an abbreviated way by hand as there is no BMC on the “server” and I can’t show the console other than via blurry photos, blech, which I have and can post if needed):
…
INFO Applying Secure Boot certificate update version 202511292320
INFO Downloading application application=incus release=202511292320
ERROR failed to check for application updates err=Failed to run systemd-dissect ... "systemd-dissect" executable file not found in $PATH provider=images
INFO Downloading OS update release=202511292320
INFO Applying OS update release=202511292320
ERROR failed to check for OS updates err=Failed to run: unshare ... "unshare" executable file not found in $PATH provider=images
The last line is replicated in the center window on the console screen as well.
Remote connections fail with a “connection refused” message, and I haven’t rebooted yet. With the A-B system image support, I expect (and hope) that I’ll have two options (the previous 202511280022 version as well as the failed 202511292320 version) and can select the previous, working version.
Anyway, I don’t expect a real response to this; it’s more of an informational post about what happened when something went wrong during an update.
The fact that systemd-dissect couldn’t be found and then a variety of other stuff is misbehaving suggests storage issues with the main OS partition rather than something wrong with the update process itself.
Given Incus is offline and the system is failing to update, I’d recommend rebooting the system. That will trigger an immediate check and update on reboot. Once everything is back online, you’ll want to pull the log to see what’s going on.
My assumption is that you’ll find some kind of storage error which caused dm-verity to fail reads from the OS partition, leading to those failures.
Sounds reasonable. It’s a brand-spankin-new cheap M.2 NVMe drive of unknown origin (it came in a machine from a well-known, but smaller, Chinese PC maker).
How should I get the log? incus admin os debug log and incus query incus:/os/1.0/debug/log both produce logs that look largely identical (but in different formats) with a number of errors in them, but they’re nearly all related to missing kernel modules that I expect have been removed intentionally: …/kernel/drivers/bluetooth/btrtl.ko (repeated), mt7921-common.ko, amdxcp.ko, edac_mce_amd.ko (repeated), amd_atl.ko (repeated), and then a bunch of errors related to loading firmware for those same drivers. There are a few other errors, but none that seem serious or look pertinent (and those I mentioned don’t seem so, either).
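To sift through a log like that, a small filter can drop the lines that mention the drivers listed above and keep the rest of the error-looking lines. This is only a sketch: the function name `interesting_lines`, the error markers, and the assumption that the driver name appears on each noisy line are all guesses, since the exact log format is not shown here. Firmware errors that don't name the driver would need extra entries in the noise list.

```python
import sys

# Driver names from the post that were judged to be harmless noise.
KNOWN_NOISE = ("btrtl", "mt7921", "amdxcp", "edac_mce_amd", "amd_atl")

# Assumed markers of an error line; the real log format may differ.
ERROR_MARKERS = ("error", "failed")


def interesting_lines(lines, noise=KNOWN_NOISE, markers=ERROR_MARKERS):
    """Return error-looking lines that do not mention a known-noise driver."""
    kept = []
    for line in lines:
        lowered = line.lower()
        if not any(m in lowered for m in markers):
            continue  # not an error-looking line
        if any(n in lowered for n in noise):
            continue  # mentions a driver already judged harmless
        kept.append(line.rstrip("\n"))
    return kept


if __name__ == "__main__":
    # Read the log from standard input and print what survives the filter.
    for line in interesting_lines(sys.stdin):
        print(line)
```

Saved under a name of your choosing (say, the hypothetical filter_log.py), it could be fed from the first command: incus admin os debug log | python3 filter_log.py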
Oh, and on reboot, Incus-os downloaded and applied the Incus-os update (and rebooted automatically) and then applied the incus application update, all successfully.
The original errors only happened when downloading an update and nowhere else: not on boot, not at random times, just on updates.
Once I was aware of the other issue, monitoring my system showed regular errors when downloading updates to incusos only, exactly as the other issue would cause.
Once the fix was in place (version ‘555) there have been no more errors, and there have been three pushed updates since then.
It’s still possible that dodgy hardware is in play, as well, but it looks good and I’m going to call this fixed.