OCI Error: Failed to retrieve PID of executing child process

When running an OCI container whose image sets a USER instruction at build time, incus exec fails with Error: Failed to retrieve PID of executing child process. I was able to reproduce this with the official haproxy and memcached images from Docker Hub. If I locate the config.json file under the /var/lib/incus directory that defines the container's uid/gid and change both to root (0), the problem goes away.
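To check which user an unpacked image will start its entrypoint as, something like the following Python sketch may help. It assumes the config.json follows the OCI runtime specification layout, where the user sits under process.user.uid and process.user.gid. The sample JSON and the function name are made up for illustration, and the real file will contain far more than this.

```python
import json


def oci_process_user(config_text):
    """Return (uid, gid) from the text of an OCI runtime config.json.

    Missing fields are treated as 0 (root); that default is an
    assumption of this sketch, not something Incus guarantees.
    """
    config = json.loads(config_text)
    user = config.get("process", {}).get("user", {})
    return user.get("uid", 0), user.get("gid", 0)


# Hypothetical, trimmed-down config.json for an image built with a USER line.
sample = '{"process": {"user": {"uid": 99, "gid": 99}, "args": ["haproxy"]}}'

uid, gid = oci_process_user(sample)
print(f"entrypoint runs as uid={uid} gid={gid}")
if uid != 0:
    print("non-root entrypoint: the case where exec fails as described above")
```

To use it on a real container, read the file's contents and pass them to oci_process_user instead of the sample string.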

Two questions:

Is it possible for Incus to support running the container as a different user and still be able to exec into it?

Is there any added security benefit to this over changing the container to run as root? One that stands out on the surface is that a compromised container can install additional packages as root, but not as the less privileged user. That may not matter much outside of the container, but it could have consequences inside the container or help an attacker escape it.

What version of Incus is that on?

Sorry for the missing details.

Ubuntu 24.04 fully updated.
Kernel 6.8.0
Incus 6.9 from the zabbly mirror.

That’s odd, Incus 6.9 is supposed to have the fix for that…
What OCI image are you using?

The two that I tried were haproxy and memcached. I was only really using haproxy, but I went looking for another image on Docker Hub that switches users before the entrypoint. Those were the two, and both behaved the same. Other images like generic Debian or Ubuntu just work, as they execute the entrypoint as root. It was interesting to track down the config.json file and override the user that executes the entrypoint to root: with no other changes, it just works. For now I am working around it by building my own image from the original on Docker Hub and running as root, but that leaves me wondering how to limit the risk of the container being exploited. My biggest concern is the ability to easily install software that could help escape the container, though I am not too familiar with the likelihood of that happening. My thinking is that it is better to limit it. Thank you for your support.

I encountered a similar issue and found a potential solution that might help others facing the same problem. While examining the file /run/incus/myapp/lxc.conf, I came across the following lines:

lxc.init.uid = 1000
lxc.init.gid = 1000
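As a rough way to spot this on other containers, here is a small Python sketch that pulls those two keys out of LXC config text. The function name and the sample text are hypothetical, and it only understands simple key = value lines, which is an assumption about how the generated lxc.conf is laid out.

```python
def lxc_init_ids(conf_text):
    """Extract lxc.init.uid / lxc.init.gid from LXC config text.

    Returns a dict holding only the keys that were found, with int values.
    """
    ids = {}
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not key = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in ("lxc.init.uid", "lxc.init.gid"):
            ids[key] = int(value.strip('"'))
    return ids


# Sample text mirroring the two lines quoted above.
sample = """
lxc.init.uid = 1000
lxc.init.gid = 1000
"""

found = lxc_init_ids(sample)
print(found)
if found.get("lxc.init.uid", 0) != 0:
    print("init is spawned as a non-root user")
```

For a real instance, the text would come from a file such as /run/incus/myapp/lxc.conf, which typically requires root to read.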

From my understanding, the issue arises because the init process is being spawned as a user other than root inside the container. When I inspected the Dockerfile used to build the OCI image I was working with, I noticed these lines:

# Create user with UID 1000
RUN useradd -m -u 1000 -s /bin/bash kevin

# Switch to user
USER kevin

ENTRYPOINT ["/bin/customapp"]

To address this issue, I modified the container’s configuration as follows:

config:
  raw.lxc: |
    lxc.init.uid="0"
    lxc.init.gid="0"
    lxc.execute.cmd="tail -f /dev/null"

I also changed the execute command since the default entrypoint command would return an error due to lack of arguments. By using tail -f /dev/null, I ensured the container stays alive without issues.

That’s how I understood the problem and the solution. I welcome any feedback or alternative explanations from the community!

As of Incus version 6.11, my container config now looks like this:

config:
  oci.entrypoint: "tail -f /dev/null"
  oci.gid: "0"
  oci.uid: "0"

This achieves the same result as the raw.lxc config from my previous post.
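If you would rather apply these keys from the command line than edit the YAML, the Python sketch below builds the matching incus config set invocations. The helper name and the instance name myapp are placeholders. The commands are only printed, not executed, so check them against your own setup before running anything.

```python
import shlex


def oci_override_commands(instance, uid=0, gid=0, entrypoint=None):
    """Build `incus config set` argument lists for the oci.* keys above."""
    settings = {"oci.uid": str(uid), "oci.gid": str(gid)}
    if entrypoint is not None:
        settings["oci.entrypoint"] = entrypoint
    return [
        ["incus", "config", "set", instance, f"{key}={value}"]
        for key, value in settings.items()
    ]


# "myapp" is a placeholder instance name.
commands = oci_override_commands("myapp", entrypoint="tail -f /dev/null")
for argv in commands:
    # shlex.join quotes arguments containing spaces for safe copy and paste.
    print(shlex.join(argv))
```

Keeping each command as an argument list means the entrypoint string with spaces stays a single argument if you later hand the list to subprocess.run.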

Although I’m happy to see these new fields, I feel that the OCI config docs should mention that the UID and GID fields both apply to the resulting lxc init and execute commands.

I don’t have an example, but someone out there may try to set up the init binary to run as root and their entrypoint/execute command to run as a different user. I don’t think this is possible to achieve, so a note could be helpful.