What should I do? Container can't start because of pidfds or cgroup

Hello everybody!

I created and prepared a container on Ubuntu 16.04 (lxc 2.0.11) and tested it on Ubuntu 20.04 and Ubuntu 22.04, where everything worked fine.

But when I tried to lxc-start the container on a private distro with a custom-built kernel, it wouldn’t start.

I have a feeling I’m getting very close to making it work.

Log:

lxc-start dgad 20230726023605.590 INFO     lxccontainer - lxccontainer.c:do_lxcapi_start:979 - Set process title to [lxc monitor] /var/log/image/lxc dgad
lxc-start dgad 20230726023605.606 DEBUG    lxccontainer - lxccontainer.c:wait_on_daemonized_start:840 - First child 44131 exited
lxc-start dgad 20230726023605.606 INFO     lsm - lsm.c:lsm_init:40 - Initialized LSM security driver nop
lxc-start dgad 20230726023605.607 DEBUG    terminal - terminal.c:lxc_terminal_peer_default:665 - No such device - The process does not have a controlling terminal
lxc-start dgad 20230726023605.608 WARN     cgroup - cgroup.c:cgroup_init:50 - Running with unknown cgroup layout
lxc-start dgad 20230726023605.608 INFO     start - start.c:lxc_init:837 - Container "dgad" is initialized
lxc-start dgad 20230726023605.609 ERROR    utils - utils.c:lxc_can_use_pidfd:1853 - Kernel does not support pidfds
lxc-start dgad 20230726023605.609 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWNS
lxc-start dgad 20230726023605.609 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWPID
lxc-start dgad 20230726023605.609 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWUTS
lxc-start dgad 20230726023605.609 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWIPC
lxc-start dgad 20230726023605.609 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWNET
lxc-start dgad 20230726023605.609 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved mnt namespace via fd 16
lxc-start dgad 20230726023605.609 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved pid namespace via fd 17
lxc-start dgad 20230726023605.609 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved uts namespace via fd 18
lxc-start dgad 20230726023605.609 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved ipc namespace via fd 19
lxc-start dgad 20230726023605.609 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved net namespace via fd 20
lxc-start dgad 20230726023605.609 ERROR    start - start.c:lxc_spawn:1741 - Failed to setup cgroup limits for container "dgad"
lxc-start dgad 20230726023605.609 DEBUG    network - network.c:lxc_delete_network:3672 - Deleted network devices
lxc-start dgad 20230726023605.609 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:859 - Received container state "ABORTING" instead of "RUNNING"
lxc-start dgad 20230726023605.609 ERROR    lxc_start - lxc_start.c:main:308 - The container failed to start
lxc-start dgad 20230726023605.609 ERROR    lxc_start - lxc_start.c:main:311 - To get more details, run the container in foreground mode
lxc-start dgad 20230726023605.609 ERROR    lxc_start - lxc_start.c:main:313 - Additional information can be obtained by setting the --logfile and --logpriority options
lxc-start dgad 20230726023605.610 ERROR    start - start.c:__lxc_start:1999 - Failed to spawn container "dgad"
lxc-start dgad 20230726023605.610 WARN     start - start.c:lxc_abort:1018 - No such process - Failed to send SIGKILL to 44133

Environment:
lxc-start --version: 4.0.6

Where should I go from here?

Is it necessary to make the kernel support pidfds? Or is there something wrong with my “cgroup layout”?

pidfds aren’t required; the relevant error here is the cgroup one.

What’s in your config as far as cgroups go (lxc.cgroup or lxc.cgroup2), and what do these commands print on the new host system?

  • cat /proc/self/cgroup
  • cat /proc/self/mountinfo | grep cgroup

Wow, thank you for noticing my post!

The entire config:

lxc.include = /usr/share/lxc/config/common.conf
lxc.arch = linux64
lxc.rootfs.path = /var/log/image/lxc/dgad/rootfs
lxc.uts.name = dgad

# Network configuration
lxc.net.0.type = veth
lxc.net.0.link = lxcbr0
lxc.net.0.flags = up
lxc.net.0.hwaddr = 00:16:3e:b4:3d:8c

The included config:

root@vmware:/var/log/image/lxc# cat /usr/share/lxc/config/common.conf | grep -i "cgroup"
# Default legacy cgroup configuration
# CGroup allowlist
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = b *:* m
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 1:7 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 10:229 rwm
# Default unified cgroup configuration
# CGroup allowlist
lxc.cgroup2.devices.deny = a
lxc.cgroup2.devices.allow = c *:* m
lxc.cgroup2.devices.allow = b *:* m
lxc.cgroup2.devices.allow = c 1:3 rwm
lxc.cgroup2.devices.allow = c 1:5 rwm
lxc.cgroup2.devices.allow = c 1:7 rwm
lxc.cgroup2.devices.allow = c 5:0 rwm
lxc.cgroup2.devices.allow = c 5:1 rwm
lxc.cgroup2.devices.allow = c 5:2 rwm
lxc.cgroup2.devices.allow = c 1:8 rwm
lxc.cgroup2.devices.allow = c 1:9 rwm
lxc.cgroup2.devices.allow = c 136:* rwm
lxc.cgroup2.devices.allow = c 10:229 rwm
lxc.mount.auto = cgroup:mixed proc:mixed sys:mixed

Proc:

root@vmware:~# cat /proc/self/cgroup
11:freezer:/
10:memory:/system.slice/system-getty.slice/getty@tty2.service
9:perf_event:/
8:debug:/
7:cpuset:/
6:blkio:/system.slice/system-getty.slice
5:pids:/system.slice/system-getty.slice/getty@tty2.service
4:net_cls,net_prio:/
3:devices:/system.slice/system-getty.slice
2:cpu,cpuacct:/system.slice/system-getty.slice
1:name=systemd:/system.slice/system-getty.slice/getty@tty2.service
0::/system.slice/system-getty.slice/getty@tty2.service
root@vmware:~# cat /proc/self/mountinfo | grep cgroup
root@vmware:~# #no output

OK, so your system, despite having cgroups configured, doesn’t seem to have a cgroup tree mounted at /sys/fs/cgroup. This is pretty weird, as systemd must have had access to a cgroup tree at some point to set things up.

Can you confirm that ls -lh /sys/fs/cgroup shows you nothing?
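
A mounted cgroup tree would also appear in /proc/self/mountinfo, so an empty grep there points the same way. Below is a minimal shell sketch of that check, assuming a POSIX sh with awk available. The function name cgroup_layout and its output labels (none, v1, v2, hybrid) are made up for illustration and are not part of LXC.

```shell
#!/bin/sh
# Classify the cgroup mounts found in mountinfo-formatted input on stdin.
# Prints one of: none, v1, v2, hybrid.
# NOTE: cgroup_layout is a hypothetical helper, not an LXC command.
cgroup_layout() {
    awk '
        {
            # Fields 1-6 are fixed; optional fields follow, then a "-"
            # separator. The filesystem type is the field right after it.
            for (i = 7; i <= NF; i++) {
                if ($i == "-") {
                    if ($(i + 1) == "cgroup")  v1 = 1
                    if ($(i + 1) == "cgroup2") v2 = 1
                    break
                }
            }
        }
        END {
            if (v1 && v2) print "hybrid"
            else if (v1)  print "v1"
            else if (v2)  print "v2"
            else          print "none"
        }
    '
}

# Usage on a live system:
#   cgroup_layout < /proc/self/mountinfo
```

A result of "none" would match the situation described here: the kernel lists controllers in /proc/self/cgroup, but no cgroup filesystem is mounted anywhere.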

Thank you for following up!

Yes, it shows nothing:

root@vmware:~# ls -lh /sys/fs/cgroup
total 0
root@vmware:~#

You mentioned “cgroup tree not mounted”; I suppose that’s true:

root@vmware:~# lxc-checkconfig
LXC version 4.0.6
--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled

--- Control groups ---
Cgroups: enabled

Cgroup v1 mount points:


Cgroup v2 mount points:


Cgroup v1 systemd controller: missing
Cgroup v1 freezer controller: missing
Cgroup namespace: required
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: enabled
Cgroup cpuset: enabled

--- Misc ---
Veth pair device: enabled, not loaded
Macvlan: enabled, not loaded
Vlan: enabled, not loaded
Bridges: enabled, not loaded
Advanced netfilter: enabled, not loaded
CONFIG_NF_NAT_IPV4: enabled, not loaded
CONFIG_NF_NAT_IPV6: missing
CONFIG_IP_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_IP6_NF_TARGET_MASQUERADE: missing
CONFIG_NETFILTER_XT_TARGET_CHECKSUM: enabled, not loaded
CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled, not loaded
FUSE (for use with lxcfs): enabled, loaded

--- Checkpoint/Restore ---
checkpoint restore: enabled
CONFIG_FHANDLE: enabled
CONFIG_EVENTFD: enabled
CONFIG_EPOLL: enabled
CONFIG_UNIX_DIAG: missing
CONFIG_INET_DIAG: enabled
CONFIG_PACKET_DIAG: missing
CONFIG_NETLINK_DIAG: missing
File capabilities:

Note : Before booting a new kernel, you can check its configuration
usage : CONFIG=/path/to/config /usr/bin/lxc-checkconfig

When I try to find blkio, freezer, pids on this distro, it shows nothing:

root@vmware:~# find / -name "blkio" 2>/dev/null
root@vmware:~# find / -name "freezer" 2>/dev/null
root@vmware:~# find / -name "pids" 2>/dev/null
root@vmware:~#

So I suppose systemd uses the cgroup tree mounted at /sys/fs/cgroup to manage processes, and now that /sys/fs/cgroup is empty, lxc-start can’t leverage systemd to start the container. Is my understanding correct?

Should my next step be to contact the kernel/distro developers and ask them about this weird behavior?

Extra information

Earlier I was running lxc-start directly from bash:

root@vmware:~# lxc-start -P /var/log/image/lxc/ -n dgad -l DEBUG --logfile DGADLOG

Because you mentioned

systemd must have had access to a cgroup tree at some point

I tried to start the container using a service file:

root@vmware:~# cat /lib/systemd/system/dgad-container.service
[Unit]
Description=Start dgad container

[Service]
Type=simple
TimeoutStopSec=120s
ExecStart=/usr/bin/lxc-start -P /var/log/image/lxc/ -n dgad -lDEBUG --logfile DGADLOG
ExecStop=/usr/bin/lxc-stop -P /var/log/image/lxc/ -n dgad
Delegate=yes
StandardOutput=syslog
StandardError=syslog
KillMode=mixed

[Install]
WantedBy=multi-user.target

And the log is different:

root@vmware:~# cat /DGADLOG
lxc-start dgad 20230727010401.763 INFO     lxccontainer - lxccontainer.c:do_lxcapi_start:979 - Set process title to [lxc monitor] /var/log/image/lxc dgad
lxc-start dgad 20230727010401.764 DEBUG    lxccontainer - lxccontainer.c:wait_on_daemonized_start:840 - First child 676435 exited
lxc-start dgad 20230727010401.764 INFO     lsm - lsm.c:lsm_init:40 - Initialized LSM security driver nop
lxc-start dgad 20230727010401.765 DEBUG    terminal - terminal.c:lxc_terminal_peer_default:665 - No such device - The process does not have a controlling terminal
lxc-start dgad 20230727010401.766 INFO     start - start.c:lxc_init:837 - Container "dgad" is initialized
lxc-start dgad 20230727010401.789 WARN     cgfsng - cgfsng.c:mkdir_eexist_on_last:1152 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.monitor.dgad"
lxc-start dgad 20230727010401.807 INFO     cgfsng - cgfsng.c:cgfsng_monitor_create:1368 - The monitor process uses "lxc.monitor.dgad" as cgroup
lxc-start dgad 20230727010401.861 WARN     cgfsng - cgfsng.c:mkdir_eexist_on_last:1152 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.payload.dgad"
lxc-start dgad 20230727010401.883 INFO     cgfsng - cgfsng.c:cgfsng_payload_create:1471 - The container process uses "lxc.payload.dgad" as cgroup
lxc-start dgad 20230727010401.902 ERROR    utils - utils.c:lxc_can_use_pidfd:1853 - Kernel does not support pidfds
lxc-start dgad 20230727010401.953 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWNS
lxc-start dgad 20230727010401.953 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWPID
lxc-start dgad 20230727010401.953 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWUTS
lxc-start dgad 20230727010401.953 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWIPC
lxc-start dgad 20230727010401.953 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved mnt namespace via fd 29
lxc-start dgad 20230727010401.953 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved pid namespace via fd 30
lxc-start dgad 20230727010401.953 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved uts namespace via fd 31
lxc-start dgad 20230727010401.953 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved ipc namespace via fd 32
lxc-start dgad 20230727010401.954 INFO     cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2875 - Limits for the legacy cgroup hierarchies have been setup
lxc-start dgad 20230727010401.955 WARN     cgfsng - cgfsng.c:cgfsng_setup_limits:2936 - Invalid argument - Ignoring cgroup2 limits on legacy cgroup system
lxc-start dgad 20230727010401.955 DEBUG    start - start.c:lxc_spawn:1773 - Preserved net namespace via fd 7
lxc-start dgad 20230727010401.955 WARN     start - start.c:lxc_spawn:1778 - File exists - Failed to allocate new network namespace id
lxc-start dgad 20230727010401.955 INFO     start - start.c:do_start:1198 - Unshared CLONE_NEWCGROUP
lxc-start dgad 20230727010401.971 DEBUG    storage - storage.c:storage_query:233 - Detected rootfs type "dir"
lxc-start dgad 20230727010401.971 DEBUG    conf - conf.c:lxc_mount_rootfs:1261 - Mounted rootfs "/var/log/image/lxc/dgad/rootfs" onto "/usr/lib/lxc/rootfs" with options "(null)"
lxc-start dgad 20230727010401.971 INFO     conf - conf.c:setup_utsname:749 - Set hostname to "dgad"
lxc-start dgad 20230727010401.971 INFO     conf - conf.c:mount_autodev:1056 - Preparing "/dev"
lxc-start dgad 20230727010401.971 DEBUG    conf - conf.c:mount_autodev:1059 - Using mount options: size=500000,mode=755
lxc-start dgad 20230727010401.981 INFO     conf - conf.c:mount_autodev:1106 - Prepared "/dev"
lxc-start dgad 20230727010402.103 DEBUG    conf - conf.c:mount_entry:1945 - Remounting "/sys/fs/fuse/connections" on "/usr/lib/lxc/rootfs/sys/fs/fuse/connections" to respect bind or remount options
lxc-start dgad 20230727010402.103 DEBUG    conf - conf.c:mount_entry:1964 - Flags for "/sys/fs/fuse/connections" were 4110, required extra flags are 14
lxc-start dgad 20230727010402.103 DEBUG    conf - conf.c:mount_entry:2008 - Mounted "/sys/fs/fuse/connections" on "/usr/lib/lxc/rootfs/sys/fs/fuse/connections" with filesystem type "none"
lxc-start dgad 20230727010402.622 INFO     conf - conf.c:lxc_fill_autodev:1149 - Populating "/dev"
lxc-start dgad 20230727010402.622 DEBUG    conf - conf.c:lxc_fill_autodev:1159 - Created device node "full"
lxc-start dgad 20230727010402.622 DEBUG    conf - conf.c:lxc_fill_autodev:1159 - Created device node "null"
lxc-start dgad 20230727010402.622 DEBUG    conf - conf.c:lxc_fill_autodev:1159 - Created device node "random"
lxc-start dgad 20230727010402.623 DEBUG    conf - conf.c:lxc_fill_autodev:1159 - Created device node "tty"
lxc-start dgad 20230727010402.623 DEBUG    conf - conf.c:lxc_fill_autodev:1159 - Created device node "urandom"
lxc-start dgad 20230727010402.623 DEBUG    conf - conf.c:lxc_fill_autodev:1159 - Created device node "zero"
lxc-start dgad 20230727010402.623 INFO     conf - conf.c:lxc_fill_autodev:1221 - Populated "/dev"
lxc-start dgad 20230727010402.623 INFO     utils - utils.c:lxc_mount_proc_if_needed:1256 - I am 1, /proc/self points to "1"
lxc-start dgad 20230727010402.624 DEBUG    conf - conf.c:lxc_setup_ttydir_console:1711 - Created directory for console and tty devices at "/usr/lib/lxc/rootfs/dev/lxc"
lxc-start dgad 20230727010402.624 DEBUG    conf - conf.c:lxc_setup_ttydir_console:1758 - Mounted "/dev/pts/7" onto "/usr/lib/lxc/rootfs/dev/lxc/console"
lxc-start dgad 20230727010402.624 DEBUG    conf - conf.c:lxc_setup_ttydir_console:1765 - Mounted "/dev/pts/7" onto "/usr/lib/lxc/rootfs/dev/lxc/console"
lxc-start dgad 20230727010402.625 DEBUG    conf - conf.c:lxc_setup_ttydir_console:1767 - Console has been setup under "/usr/lib/lxc/rootfs/dev/lxc/console" and mounted to "/usr/lib/lxc/rootfs/dev/console"
lxc-start dgad 20230727010402.959 DEBUG    conf - conf.c:lxc_setup_devpts_child:1547 - Mount new devpts instance with options "gid=5,newinstance,ptmxmode=0666,mode=0620,max=1024"
lxc-start dgad 20230727010402.960 DEBUG    conf - conf.c:lxc_setup_devpts_child:1575 - Created dummy "/dev/ptmx" file as bind mount target
lxc-start dgad 20230727010402.960 DEBUG    conf - conf.c:lxc_setup_devpts_child:1580 - Bind mounted "/dev/pts/ptmx" to "/dev/ptmx"
lxc-start dgad 20230727010402.993 DEBUG    conf - conf.c:lxc_allocate_ttys:936 - Created tty "/dev/pts/0" with ptx fd 30 and pty fd 31
lxc-start dgad 20230727010402.995 DEBUG    conf - conf.c:lxc_allocate_ttys:936 - Created tty "/dev/pts/1" with ptx fd 32 and pty fd 33
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:lxc_allocate_ttys:936 - Created tty "/dev/pts/2" with ptx fd 34 and pty fd 35
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:lxc_allocate_ttys:936 - Created tty "/dev/pts/3" with ptx fd 36 and pty fd 37
lxc-start dgad 20230727010402.100 INFO     conf - conf.c:lxc_allocate_ttys:953 - Finished creating 4 tty devices
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:lxc_setup_ttys:867 - Bind mounted "/dev/pts/0" onto "/dev/lxc/tty1"
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:lxc_setup_ttys:867 - Bind mounted "/dev/pts/1" onto "/dev/lxc/tty2"
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:lxc_setup_ttys:867 - Bind mounted "/dev/pts/2" onto "/dev/lxc/tty3"
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:lxc_setup_ttys:867 - Bind mounted "/dev/pts/3" onto "/dev/lxc/tty4"
lxc-start dgad 20230727010402.100 INFO     conf - conf.c:lxc_setup_ttys:898 - Finished setting up 4 /dev/tty<N> device(s)
lxc-start dgad 20230727010402.100 INFO     conf - conf.c:setup_personality:1611 - Set personality to "0x0"
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:setup_caps:2419 - Dropped mac_admin (33) capability
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:setup_caps:2419 - Dropped mac_override (32) capability
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:setup_caps:2419 - Dropped sys_time (25) capability
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:setup_caps:2419 - Dropped sys_module (16) capability
lxc-start dgad 20230727010402.100 DEBUG    conf - conf.c:setup_caps:2419 - Dropped sys_rawio (17) capability
lxc-start dgad 20230727010402.109 DEBUG    conf - conf.c:setup_caps:2422 - Capabilities have been setup
lxc-start dgad 20230727010402.109 NOTICE   conf - conf.c:lxc_setup:3446 - The container "dgad" is set up
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.deny" set to "a"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c *:* m"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "b *:* m"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c 1:3 rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c 1:5 rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c 1:7 rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c 5:0 rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c 5:1 rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c 5:2 rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c 1:8 rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c 1:9 rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c 136:* rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c 10:229 rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "c *:* rwm"
lxc-start dgad 20230727010402.109 DEBUG    cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2870 - Set controller "devices.allow" set to "a"
lxc-start dgad 20230727010402.109 INFO     cgfsng - cgfsng.c:cgfsng_setup_limits_legacy:2875 - Limits for the legacy cgroup hierarchies have been setup
lxc-start dgad 20230727010402.109 DEBUG    start - start.c:lxc_spawn:1849 - Preserved cgroup namespace via fd 12
lxc-start dgad 20230727010402.118 NOTICE   utils - utils.c:lxc_setgroups:1420 - Dropped additional groups
lxc-start dgad 20230727010402.118 NOTICE   start - start.c:start:2087 - Exec'ing "/sbin/init"
lxc-start dgad 20230727010402.119 ERROR    start - start.c:start:2090 - Permission denied - Failed to exec "/sbin/init"
lxc-start dgad 20230727010402.119 ERROR    sync - sync.c:__sync_wait:36 - An error occurred in another process (expected sequence number 7)
lxc-start dgad 20230727010402.119 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:859 - Received container state "ABORTING" instead of "RUNNING"
lxc-start dgad 20230727010402.119 ERROR    lxc_start - lxc_start.c:main:308 - The container failed to start
lxc-start dgad 20230727010402.119 ERROR    lxc_start - lxc_start.c:main:311 - To get more details, run the container in foreground mode
lxc-start dgad 20230727010402.119 ERROR    lxc_start - lxc_start.c:main:313 - Additional information can be obtained by setting the --logfile and --logpriority options
lxc-start dgad 20230727010402.119 ERROR    start - start.c:__lxc_start:1999 - Failed to spawn container "dgad"
lxc-start dgad 20230727010402.119 WARN     start - start.c:lxc_abort:1018 - No such process - Failed to send SIGKILL to 676437

I saw:

ERROR    start - start.c:start:2090 - Permission denied - Failed to exec "/sbin/init"

So I checked the permission of /sbin/init:

root@vmware:~# ls -l /sbin/init
lrwxrwxrwx 1 root root 22 Jul 24 08:52 /sbin/init -> ../lib/systemd/systemd
root@vmware:~# ls -l /var/log/image/lxc/dgad/rootfs/sbin/init
-rwxr-xr-x 1 root root 1546296 Jul 27 01:03 /var/log/image/lxc/dgad/rootfs/sbin/init
root@vmware:~# file /var/log/image/lxc/dgad/rootfs/sbin/init
/var/log/image/lxc/dgad/rootfs/sbin/init: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-x86-64.so.2, BuildID[sha1]=ed2733d1736b5398b6605cb9f6c517401f8fddf8, for GNU/Linux 3.2.0, stripped

Is there something wrong with my /sbin/init?

It’s quite possible that you have /var/log mounted as noexec which would explain that part.

I have a feeling we are getting very close!

You were right:

root@vmware:~# findmnt -l | grep noexec
...
/var                       overlay             overlay   rw,nosuid,nodev,noexec,noatime,...
/var/log                   /dev/mapper/...     ext4      rw,nosuid,nodev,noexec,noatime
...
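
The check above can be scripted for a specific path. This is a rough sketch, assuming util-linux findmnt is installed; has_noexec and check_exec_mount are hypothetical helper names, not LXC tools.

```shell
#!/bin/sh
# Succeed (exit 0) if a comma-separated mount option string contains "noexec".
# NOTE: has_noexec and check_exec_mount are illustrative helpers only.
has_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Report whether the filesystem holding a path is mounted noexec.
# "findmnt --target" resolves the mount that contains the given path.
check_exec_mount() {
    opts=$(findmnt -n -o OPTIONS --target "$1") || return 2
    if has_noexec "$opts"; then
        echo "$1: mounted noexec ($opts)"
    else
        echo "$1: exec allowed"
    fi
}

# Usage:
#   check_exec_mount /var/log/image/lxc/dgad/rootfs
```

Checking the rootfs path directly avoids scanning the whole findmnt listing by eye, since the container's /sbin/init is executed from whichever filesystem holds that path.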

I added another hard disk and formatted a partition on it:

root@vmware:~# fdisk /dev/sdb
root@vmware:~# mkfs.ext4 /dev/sdb1
root@vmware:~# mkdir -p /home/root/tmp/
root@vmware:~# mount -o rw,exec /dev/sdb1 /home/root/tmp/

and then created my lxc folder there

root@vmware:~/tmp# ls /home/root/tmp/lxc/dgad/
config  rootfs
root@vmware:~/tmp#

and ensured the permissions work:

chown 777 -R /home/

After starting the service, the log contains only one line:

lxc-start dgad 20230727041531.222 ERROR    lxc_start - lxc_start.c:main:192 - You lack access to /home/root/tmp/lxc

I guess this is the corresponding code, but I don’t understand why access failed:

/* lxc/src/lxc/tools/lxc_start.c */
        lxcpath = my_args.lxcpath[0];
        if (access(lxcpath, O_RDONLY) < 0) {
                ERROR("You lack access to %s", lxcpath);
                exit(err);
        }

I’ve also tried putting the folder under /dev, but it failed with:

lxc-start dgad 20230727045205.662 DEBUG    start - start.c:lxc_spawn:1849 - Preserved cgroup namespace via fd 12
lxc-start dgad 20230727045205.662 NOTICE   utils - utils.c:lxc_setgroups:1420 - Dropped additional groups
lxc-start dgad 20230727045205.663 NOTICE   start - start.c:start:2087 - Exec'ing "/sbin/init"
lxc-start dgad 20230727045205.664 NOTICE   start - start.c:post_start:2098 - Started "/sbin/init" with pid "9576"
lxc-start dgad 20230727045205.669 ERROR    commands - commands.c:lxc_cmd_get_init_pidfd_callback:457 - Failed to send init pidfd

Where should I go from here?

Hello Stéphane Graber,

I can’t express how thankful I am for your help. I have been on various forums, all tech-related, and this is the very first one where I actually got a response (from the king of LXC himself)!

The problems I’m facing so far are resolved, in short:

  • my system doesn’t have cgroup (somehow)
  • the container was actually started, but somehow lxc-ls showed it was “STOPPED”
  • I found out about this because ps shows processes that only run in the container
  • (after a while, lxc-ls did show “RUNNING”)

So my project is finally working!

I’ll try to explain the errors I’ve seen so far; I hope this makes others’ lives easier.

  1. This message doesn’t stop the container from starting, even though it is logged as “ERROR”; it just means that something non-critical is missing:
ERROR    utils - utils.c:lxc_can_use_pidfd:1853 - Kernel does not support pidfds
  2. If you see this one, check /sys/fs/cgroup. If it is empty, search the system for freezer or blkio. If you really can’t find anything like that, your system might not even have cgroups:
start - start.c:lxc_spawn:1741 - Failed to setup cgroup limits for container "dgad"
  3. If you see “Permission denied” even after running chmod, run findmnt -l and check whether the partition where your container is installed has the “noexec” option set. If it does, use "mount -o remount,exec " to clear the “noexec” flag:
start - start.c:start:2090 - Permission denied - Failed to exec "/sbin/init"
  4. I still can’t figure out why this one appeared, but after starting the container under the default namespace, it was gone:
commands - commands.c:lxc_cmd_get_init_pidfd_callback:457 - Failed to send init pidfd
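
The notes above can be condensed into a small log triage helper. This is only a sketch: triage_lxc_log is a hypothetical name, the hint texts simply restate the list above rather than anything LXC provides, and the patterns match the message strings exactly as they appear in the logs in this thread (LXC 4.0.6), so other versions may word them differently.

```shell
#!/bin/sh
# Print one hint per known message found in an lxc-start log read from stdin.
# NOTE: triage_lxc_log and its hints are illustrative, not an LXC feature.
triage_lxc_log() {
    log=$(cat)
    # Each table row is "<message fragment>|<hint>".
    while IFS='|' read -r pattern hint; do
        case "$log" in
            *"$pattern"*) echo "$hint" ;;
        esac
    done <<'EOF'
Kernel does not support pidfds|pidfd: non-critical, the container can still start
Failed to setup cgroup limits|cgroup: check whether /sys/fs/cgroup is empty
Permission denied - Failed to exec|exec: check the container filesystem for the noexec mount option
Failed to send init pidfd|init pidfd: cause not identified in this thread
EOF
}

# Usage:
#   triage_lxc_log < /DGADLOG
```

Hints are printed in the order of the table, not in the order the messages occur in the log, which keeps the output stable across runs.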

And again, I’m grateful for your help, Stéphane.