Fedora 41 container images result in "degraded" status - images:fedora/41 and images:fedora/41/cloud

I’ve come across a problem with the Fedora 41 images, images:fedora/41 and
images:fedora/41/cloud. I have a workaround, and I’d like to file a bug report
or even a pull request to get the issue fixed, so I’m asking for help with the next steps.

In brief, the path /run/systemd/nsresource/registry does not exist inside these
images, and therefore the systemd-nsresourced service fails to start. Creating
the missing directories and restarting the service is a workaround.

I guess the issue could be solved in
https://github.com/stgraber/distrobuilder/blob/main/sources/fedora-http.go and
I’d appreciate advice before I go further.

Steps to reproduce

Start with a very simple profile saved in a file called config.yaml:

devices:
  eth0:
    name: eth0
    network: incusbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk

Command to launch a test container called f41:

incus launch --no-profiles images:fedora/41 f41 < config.yaml

Command to show status:

incus exec f41 -- systemctl is-failed

Output:

degraded

Command to list failed units:

incus exec f41 -- systemctl --failed

Output:

  UNIT                        LOAD   ACTIVE SUB    DESCRIPTION
● systemd-nsresourced.service loaded failed failed Namespace Resource Manager
● systemd-nsresourced.socket  loaded failed failed Namespace Resource Manager Socket

Legend: LOAD   → Reflects whether the unit definition was properly loaded.
        ACTIVE → The high-level unit activation state, i.e. generalization of SUB.
        SUB    → The low-level unit activation state, values depend on unit type.

2 loaded units listed.

Investigation of root cause

Command to set the log level to debug:

incus exec f41 -- systemctl log-level debug

Command to restart the failing service:

incus exec f41 -- systemctl restart systemd-nsresourced.service

Command to view logs:

incus exec f41 -- journalctl -xeu systemd-nsresourced.service

Relevant lines of debug log output:

Dec 11 10:43:44 f41 (esourced)[220]: systemd-nsresourced.service: Executing: /usr/lib/systemd/systemd-nsresourced
Dec 11 10:43:44 f41 systemd-nsresourced[220]: Failed to open registry directory: Read-only file system
Dec 11 10:43:44 f41 systemd-nsresourced[220]: Failed to start up daemon: Read-only file system

Command to check the relevant package version:

incus exec f41 -- rpm --query --file /usr/lib/systemd/systemd-nsresourced

Output:

systemd-256.9-2.fc41.x86_64

Likely relevant C source code:
https://github.com/systemd/systemd/blob/v256.9/src/nsresourced/nsresourced-manager.c
https://github.com/systemd/systemd/blob/v256.9/src/nsresourced/userns-registry.c

From the latter, I think the relevant directory is
/run/systemd/nsresource/registry.

Neither of the following paths exists inside the container:

  • /run/systemd/nsresource/registry
  • /run/systemd/nsresource
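
The absence of both directories can be confirmed with a check along these lines. This is only a minimal sketch: the helper name check_dirs is made up for illustration, and it assumes the test container is called f41 as above.

```shell
# Hypothetical helper: report whether each given directory exists inside a container.
# Usage: check_dirs CONTAINER DIR...
check_dirs() {
    container=$1
    shift
    for dir in "$@"; do
        # `test -d` exits 0 only if the path exists and is a directory
        if incus exec "$container" -- test -d "$dir"; then
            echo "exists:  $dir"
        else
            echo "missing: $dir"
        fi
    done
}
```

Calling check_dirs f41 /run/systemd/nsresource /run/systemd/nsresource/registry on an affected container should report both paths as missing, matching the observation above.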

Workaround

Command to create the two missing directories:

incus exec f41 -- mkdir --parents /run/systemd/nsresource/registry

Command to restart the service:

incus exec f41 -- systemctl restart systemd-nsresourced.service

Command to show the status:

incus exec f41 -- systemctl is-failed

Output:

running

Similarly, incus exec f41 -- systemctl --failed now shows
0 loaded units listed.

This is surprising as we do not publish any image which doesn’t have a clean systemctl --failed output.

But I’m able to reproduce it here, so it’s likely that differences in kernel version or host OS cause some variation…

So it looks like the issue is that this unit runs under ProtectSystem=strict but then wants to write to /run, which is read-only at that point. I’m pretty confused as to why this seems to work fine in some cases and not in others…
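
One way to look at the sandboxing settings systemd actually applies to the unit is to query its properties. The helper below is a rough sketch, not something taken from the image build; the name show_sandboxing is hypothetical.

```shell
# Hypothetical helper: print the sandboxing properties of a unit inside a container.
# Usage: show_sandboxing CONTAINER UNIT
show_sandboxing() {
    # --property= takes a comma-separated list of property names
    incus exec "$1" -- systemctl show "$2" --property=ProtectSystem,ReadWritePaths
}
```

For example, show_sandboxing f41 systemd-nsresourced.service. If the diagnosis above holds, ProtectSystem would be expected to read strict, and ReadWritePaths shows whether any override granting write access is in effect.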

One workaround is to add ReadWritePaths=/run to our systemd override, which ensures units can write to /run and so avoids this problem.
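
For anyone wanting to try the same idea by hand before the images are rebuilt, a per-unit drop-in might look roughly like the sketch below. The helper name write_run_dropin, the file name 10-run-writable.conf, and the per-unit scope are all assumptions made for illustration; the override shipped in the images may be organised differently.

```shell
# Hypothetical helper: write a drop-in granting a unit write access to /run.
# Usage: write_run_dropin UNIT [ROOT]   (ROOT defaults to /, or point it at a rootfs)
write_run_dropin() {
    unit=$1
    root=${2:-}
    dir="$root/etc/systemd/system/$unit.d"
    mkdir -p "$dir"
    # The file name is arbitrary; systemd reads every *.conf in the .d directory
    cat > "$dir/10-run-writable.conf" <<'EOF'
[Service]
ReadWritePaths=/run
EOF
}
```

Run inside the container as write_run_dropin systemd-nsresourced.service, then follow with systemctl daemon-reload and a restart of the service so the new setting is picked up.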

That should get picked up within 48h or so.

Wow, that’s brilliant, thank you for turning around a fix so quickly. I appreciate your help.

Details of re-testing

A pass is the output running from the systemctl command below. The earlier
failure printed degraded.

Commands to re-test images:fedora/41:

incus launch --no-profiles images:fedora/41 f41 < config.yaml
incus exec f41 -- systemctl is-failed
incus stop f41
incus delete f41

Commands to re-test images:fedora/41/cloud:

incus launch --no-profiles images:fedora/41/cloud f41 < config.yaml
incus exec f41 -- systemctl is-failed
incus stop f41
incus delete f41
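
The two re-test sequences above differ only in the image name, so they could be folded into one helper. This is a sketch under a few assumptions: the function name retest_image is made up, config.yaml is expected in the current directory, and incus delete --force is used in place of the separate stop and delete steps.

```shell
# Hypothetical helper: launch a throwaway container, print its systemd state, clean up.
# Usage: retest_image IMAGE [NAME]   (expects config.yaml in the current directory)
retest_image() {
    image=$1
    name=${2:-f41}
    incus launch --no-profiles "$image" "$name" < config.yaml || return 1
    # </dev/null keeps incus exec from consuming the caller's stdin
    state=$(incus exec "$name" -- systemctl is-failed </dev/null)
    incus delete --force "$name"
    echo "$image: $state"
    # Succeed only when the overall state is "running"
    [ "$state" = "running" ]
}
```

Usage would be along the lines of: for image in images:fedora/41 images:fedora/41/cloud; do retest_image "$image"; done. If the check runs before the container has finished booting, it may report a transitional state rather than running, so a short pause before the check may be needed.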