First things first, congrats on the new major release!
One of my IncusOS servers was updated to 7.0, and my containers can no longer start because the network interface they are attached to is no longer available.
Error in UI:
Loading network state failed
Network interface "monitoring-br0" not found
Error in logs: context-err="Failed starting: The DNS and DHCP service exited prematurely: exit status 127 (\"dnsmasq: error while loading shared libraries: /lib/x86_64-linux-gnu/libnetfilter_conntrack.so.3: cannot read file data: Input/output error\")" context-network="monitoring-br0" context-project="default" level="error" Failed initializing network
I was about to recreate the network interface, but I wanted to check whether there is a fix first, since some of my servers have many network interfaces and recreating them all could become a bigger problem.
I tried on another staging server and got the same result. What are the odds of two corrupted blocks on two different systems that were working just fine before the upgrade?
Not pointing fingers, I just want to understand whether something is wrong on my end.
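For anyone who wants to tell "the file is unreadable" apart from "the file is simply missing", here is a minimal Python sketch that reads a file end to end and reports which errno the kernel returns. It assumes an environment where Python is available and the file system in question is mounted (for example a rescue system), which a stock IncusOS install may not offer. `probe_read` and `STORAGE_ERRNOS` are hypothetical names made up for this illustration, not part of IncusOS or Incus.

```python
import errno

# errno values that point at the storage stack rather than at the file itself.
# EUCLEAN ("Structure needs cleaning") is Linux-specific, so look it up defensively.
STORAGE_ERRNOS = {errno.EIO: "EIO"}
if hasattr(errno, "EUCLEAN"):
    STORAGE_ERRNOS[errno.EUCLEAN] = "EUCLEAN"


def probe_read(path, chunk_size=1 << 20):
    """Read `path` end to end and return (ok, bytes_read, detail)."""
    total = 0
    try:
        with open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except OSError as exc:
        # Prefer the storage-related label, fall back to the generic errno name.
        name = STORAGE_ERRNOS.get(exc.errno, errno.errorcode.get(exc.errno, "unknown"))
        return False, total, f"{name}: {exc.strerror}"
    return True, total, "read completed"
```

Calling `probe_read("/lib/x86_64-linux-gnu/libnetfilter_conntrack.so.3")` on an affected system would presumably come back with `EIO`, matching the dnsmasq message. A clean result proves less than a failure does, because the data may have been served from the page cache rather than from the disk.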
Yeah, two different installs failing the same way pretty much rules out a hardware issue.
Which version of IncusOS did you update to?
We do have logic in place whenever the sysext images are refreshed to warn and remove any that are corrupt or otherwise don’t match their expected signature. This happens each time the system boots, as well as when applying updates. If @dwlfrth hasn’t seen any messages about “corrupt on-disk image, attempting to cleanup”, then the sysext images are at least good enough for systemd-sysext to load them.
Hello again,
I'm not quite sure what is happening. I changed the SSD and reinstalled the OS (previous version). It installed just fine and proceeded with the upgrade. Now it is looping with the following error:
ERROR Failed to run: systemd-sysext refresh: exit status 1 (Failed to read metadata for image incus: Structure needs cleaning)
I’m puzzled by this – based on the errors I’d say this is a hardware issue, but changing the drive would eliminate that.
I wonder if your root ext4 partition is becoming dirty somehow, which could then prevent systemd-sysext from loading Incus properly. If you’re able, can you follow the instructions in “Emergency Procedure for a Lost Client Certificate” in the IncusOS documentation to decrypt the root partition, then run fsck and see whether it reports fixing anything?
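When reporting back what fsck found, the exit status is often more telling than the console output. Below is a rough Python sketch that decodes it, using the bit values listed in the e2fsck(8) man page. The helper names are hypothetical, the device path is left as a parameter because the name of the decrypted root device is not given in this thread, and the sketch assumes `e2fsck` and Python are available in whatever environment the partition was unlocked from.

```python
import subprocess

# Exit status bits as documented in the e2fsck(8) man page; they can be combined.
FSCK_BITS = {
    1: "file system errors corrected",
    2: "system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Turn an e2fsck exit status into a list of human-readable findings."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_BITS.items() if status & bit]


def check_readonly(device):
    """Run a forced, read-only check of `device` (must be unmounted)."""
    # -f forces a check even if the file system is marked clean;
    # -n opens it read-only and answers "no" to every repair prompt.
    result = subprocess.run(
        ["e2fsck", "-f", "-n", device],
        capture_output=True,
        text=True,
    )
    return decode_fsck_status(result.returncode), result.stdout
```

Because `-n` never repairs anything, a damaged file system shows up here as "file system errors left uncorrected". That makes it a safe first pass before running an actual repair and checking whether the status changes to "file system errors corrected".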
We do have systemd-fsck-root.service in the IncusOS image, but it doesn’t appear to be active because of a condition. It would probably be good to actually get this running, so we can be more resilient to file system errors.