Sandboxed systemd service fails due to cgroups issue

I have a problem with a sandboxed systemd service. This is the unit:

[Unit]
After=network.target
Description=gitea

[Service]
Environment="GITEA_WORK_DIR=/var/lib/gitea"
Environment="HOME=/var/lib/gitea"
Environment="LOCALE_ARCHIVE=/nix/store/jbyaw0r48gxslxczwnjw5371rqj03gn8-glibc-locales-2.30/lib/locale/locale-archive"
Environment="PATH=/nix/store/404wfnlg9dvlzphd955zlqfclsaa31aj-gitea-1.11.8-bin/bin:/nix/store/xp5fj0915bkd0yidns2bkg8n7m9nfp8h-git-2.25.4/bin:/nix/store/x0jla3hpxrwz76hy9yckg1iyc9hns81k-coreutils-8.31/bin:/nix/store/97vambzyvpvrd9wgrrw7i7svi0s8vny5-findutils-4.7.0/bin:/nix/store/b0vjq4r4sp9z4l2gbkc5dyyw5qfgyi3r-gnugrep-3.4/bin:/nix/store/p34p7ysy84579lndk7rbrz6zsfr03y71-gnused-4.8/bin:/nix/store/vac1gmzh1xmk3s7w9pbjvirxqsg1npn0-systemd-243.7/bin:/nix/store/404wfnlg9dvlzphd955zlqfclsaa31aj-gitea-1.11.8-bin/sbin:/nix/store/xp5fj0915bkd0yidns2bkg8n7m9nfp8h-git-2.25.4/sbin:/nix/store/x0jla3hpxrwz76hy9yckg1iyc9hns81k-coreutils-8.31/sbin:/nix/store/97vambzyvpvrd9wgrrw7i7svi0s8vny5-findutils-4.7.0/sbin:/nix/store/b0vjq4r4sp9z4l2gbkc5dyyw5qfgyi3r-gnugrep-3.4/sbin:/nix/store/p34p7ysy84579lndk7rbrz6zsfr03y71-gnused-4.8/sbin:/nix/store/vac1gmzh1xmk3s7w9pbjvirxqsg1npn0-systemd-243.7/sbin"
Environment="TZDIR=/nix/store/8cz89zavyrm2bdrgkx4l66s5c7nx12dr-tzdata-2019c/share/zoneinfo"
Environment="USER=gitea"

CapabilityBoundingSet=
ExecStart=/nix/store/404wfnlg9dvlzphd955zlqfclsaa31aj-gitea-1.11.8-bin/bin/gitea web
ExecStartPre=/nix/store/4225kh8v54fdymwcmm6hzjy90m7q2kzf-unit-script-gitea-pre-start
Group=gitea
LockPersonality=true
MemoryDenyWriteExecute=true
NoNewPrivileges=true
PrivateDevices=true
PrivateMounts=true
PrivateUsers=true
ProtectControlGroups=true
ProtectHome=true
ProtectKernelModules=true
ProtectKernelTunables=true
ReadWritePaths=/var/lib/gitea
Restart=always
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
RestrictRealtime=true
SystemCallArchitectures=native
SystemCallFilter=~@clock @cpu-emulation @debug @keyring @memlock @module @mount @obsolete @raw-io @reboot @resources @setuid @swap
Type=simple
User=gitea
WorkingDirectory=/var/lib/gitea

It fails to start inside an LXD container:

Aug 04 13:52:40 nixos systemd[14745]: gitea.service: Executing: /nix/store/404wfnlg9dvlzphd955zlqfclsaa31aj-gitea-1.11.8-bin/bin/gitea web
Aug 04 13:52:40 nixos systemd[1]: gitea.service: Failed to read oom_kill field of memory.events cgroup attribute: No such file or directory
Aug 04 13:52:40 nixos systemd[1]: gitea.service: Child 14745 belongs to gitea.service.
Aug 04 13:52:40 nixos systemd[1]: gitea.service: Main process exited, code=dumped, status=31/SYS
Aug 04 13:52:40 nixos systemd[1]: gitea.service: Failed with result 'core-dump'.

I’m not sure whether this is a systemd problem or whether something is missing in my container configuration:

asbachb@ubuntu-8gb-nbg1-1:~$ lxc config show nixos-gitea -e
architecture: x86_64
config:
  raw.lxc: |-
    lxc.init.cmd = /sbin/init
    lxc.mount.entry = proc mnt/proc proc create=dir 0 0
    lxc.apparmor.profile = unconfined
  volatile.base_image: d5e2a0b1ddb4c5bc36ced85dd3472dabf4e58e9b3a9aa03de22839e333d3cd34
  volatile.eth0.host_name: veth71c87c04
  volatile.eth0.hwaddr: 00:16:3e:73:df:0a
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
devices:
  credentials:
    path: /etc/nixos/credentials
    pool: storage1
    source: nixos-credentials
    type: disk
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  nixpkgs:
    path: /tmp/nixpkgs
    pool: storage1
    source: nixos-nixpkgs
    type: disk
  root:
    path: /
    pool: storage1
    type: disk
ephemeral: false
profiles:
- nixos
- with-credentials
- with-nixpkgs
stateful: false
description: ""

@stgraber any thoughts on this?

What host distro and kernel?

Host

asbachb@ubuntu-8gb-nbg1-1:~$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.1 LTS"

kernel

Linux ubuntu-8gb-nbg1-1 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

lxd

asbachb@ubuntu-8gb-nbg1-1:~$ lxd --version
4.4

Ok, I was wondering if that was a cgroup2 thing, but given this is on Ubuntu 20.04, it shouldn’t be.

@brauner any idea what’s going on here?

Maybe systemd in the container is trying to mount cgroup2 only, and memory.events is being looked up for gitea, but the memory controller is not enabled in cgroup2 because it is already enabled in cgroup v1 on the host?

@brauner any output I could provide to prove your assumption?
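
One way to check that assumption from inside the container is to look at which cgroup filesystems are mounted and whether a memory.events file exists for the unit at all. This is only a rough sketch: the helper names cgroup_fs_types and find_memory_events are made up for illustration, and /sys/fs/cgroup is the usual default root rather than something confirmed for this container. memory.events is a cgroup v2 interface file, so it should only show up where the memory controller is attached to the unified hierarchy.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the cgroup layout; not part of systemd or LXD.

# Print the cgroup filesystem types found in a mounts file
# ("cgroup" = v1 hierarchy, "cgroup2" = v2/unified hierarchy).
# Reads /proc/self/mounts unless another file is given as $1.
cgroup_fs_types() {
    awk '$3 == "cgroup" || $3 == "cgroup2" { print $3 }' "${1:-/proc/self/mounts}" | sort -u
}

# Print every memory.events file below cgroup root $1 that belongs to unit $2.
# No output means no such file exists for that unit.
find_memory_events() {
    find "$1" -type f -name memory.events -path "*/$2/*" 2>/dev/null
}
```

Inside the container this could be run as cgroup_fs_types followed by find_memory_events /sys/fs/cgroup gitea.service. If the second call prints nothing, that would be consistent with the memory controller not being available on the unified hierarchy.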

@stgraber @brauner I’m still facing this problem, but I’m not sure whether it should be reported to systemd, handled by LXD, or fixed on the system side.

You can look at the systemd output from the container; it should be available in the console.log file in LXD’s log directory.

I guess there is nothing useful in here:

cat /var/snap/lxd/common/lxd/logs/nixos-gitea/console.log
<<< NixOS Stage 2 >>>

    running activation script...
setting up /etc...
starting systemd...
systemd 243 running in system mode. (+PAM +AUDIT -SELINUX +IMA +APPARMOR +SMACK -SYSVINIT +UTMP -LIBCRYPTSETUP +GCRYPT -GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID -ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
Detected virtualization lxc.
Detected architecture x86-64.

Welcome to NixOS 20.03 (Markhor)!

Set hostname to <nixos>.
cgroup compatibility translation between legacy and unified hierarchy settings activated. See cgroup-compat debug messages for details.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[  OK  ] Reached target Login Prompts.
[  OK  ] Reached target Containers.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Remote File Systems.
[  OK  ] Reached target Slices.
[  OK  ] Reached target Swap.
[  OK  ] Listening on Process Core Dump Socket.
[  OK  ] Listening on initctl Compatibility Named Pipe.
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Listening on Journal Socket.
[  OK  ] Listening on udev Control Socket.
[  OK  ] Listening on udev Kernel Socket.
         Starting Create list of static device nodes for the current kernel...
         Starting Set Up Additional Binary Formats...
Failed to set devices.allow on /system.slice/systemd-journald.service: Operation not permitted
         Starting Journal Service...
         Starting Firewall...
         Starting Apply Kernel Variables...
         Starting udev Coldplug all Devices...
[  OK  ] Started Create list of static device nodes for the current kernel.
         Starting Create Static Device Nodes in /dev...
[  OK  ] Started Set Up Additional Binary Formats.
[  OK  ] Started Create Static Device Nodes in /dev.
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
         Starting udev Kernel Device Manager...
[  OK  ] Started Apply Kernel Variables.
[  OK  ] Started udev Coldplug all Devices.
         Starting udev Wait for Complete Device Initialization...
[  OK  ] Started Firewall.
[  OK  ] Started Journal Service.
         Starting Flush Journal to Persistent Storage...
[  OK  ] Started Flush Journal to Persistent Storage.
         Starting Create Volatile Files and Directories...
[  OK  ] Started Create Volatile Files and Directories.
         Starting Update UTMP about System Boot/Shutdown...
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Started udev Kernel Device Manager.
[  OK  ] Started udev Wait for Complete Device Initialization.
[  OK  ] Reached target System Initialization.
[  OK  ] Started nix-gc.timer.
[  OK  ] Started nixos-upgrade.timer.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Listening on Nix Daemon Socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
         Starting DHCP Client...
         Starting Name Service Cache Daemon...
         Starting resolvconf update...
[  OK  ] Started D-Bus System Message Bus.
[  OK  ] Started Name Service Cache Daemon.
[  OK  ] Reached target Host and Network Name Lookups.
[  OK  ] Reached target User and Group Name Lookups.
         Starting Login Service...
[  OK  ] Started resolvconf update.
[  OK  ] Reached target Network (Pre).
[  OK  ] Reached target All Network Interfaces (deprecated).
         Starting Networking Setup...
[  OK  ] Started Networking Setup.
         Starting Extra networking commands....
[  OK  ] Started Extra networking commands..
[  OK  ] Reached target Network.
         Starting gitea...
         Starting Permit User Sessions...
[  OK  ] Started Permit User Sessions.
[  OK  ] Started gitea.
[  OK  ] Stopped gitea.
         Starting gitea...
[  OK  ] Started Login Service.
[  OK  ] Started gitea.
[  OK  ] Stopped gitea.
         Starting gitea...
[  OK  ] Started gitea.
[  OK  ] Stopped gitea.
         Starting gitea...
[  OK  ] Started gitea.
         Stopping Name Service Cache Daemon...
[  OK  ] Stopped Name Service Cache Daemon.
         Starting Name Service Cache Daemon...
[  OK  ] Started Name Service Cache Daemon.
[  OK  ] Stopped gitea.
         Starting gitea...
[  OK  ] Started gitea.
[  OK  ] Stopped gitea.
[FAILED] Failed to start gitea.
See 'systemctl status gitea.service' for details.
         Stopping Name Service Cache Daemon...
[  OK  ] Stopped Name Service Cache Daemon.
[  OK  ] Started DHCP Client.
[  OK  ] Reached target Network is Online.
[  OK  ] Reached target Multi-User System.
         Starting Name Service Cache Daemon...
[  OK  ] Started Name Service Cache Daemon.

Hm, it’s dumping core… Is this maybe a bug in gitea?

I don’t think so. When I remove all of the systemd sandboxing options, gitea starts as expected.
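
For what it’s worth, status=31/SYS in the journal output means the main process was killed by signal 31, which is SIGSYS on x86_64 Linux. That is the signal systemd’s SystemCallFilter= uses by default when a filtered system call is made, so the filter line is a plausible suspect, although nothing in this thread confirms which call is involved. A minimal sketch for narrowing it down follows; filter_groups is a hypothetical helper name, not a systemd tool. It splits the deny-list into its groups so they can be dropped from the unit one at a time.

```shell
#!/bin/sh
# Hypothetical helper: split a SystemCallFilter= line into one entry per line.
# The leading "~" (which marks the list as a deny-list) is stripped.
filter_groups() {
    printf '%s\n' "$1" \
        | sed -e 's/^SystemCallFilter=//' -e 's/^~//' \
        | tr -s ' ' '\n'
}
```

Passing the SystemCallFilter= line from the unit to filter_groups gives the list of groups to test individually. The system calls behind a group can be listed with systemd-analyze syscall-filter followed by the group name, for example @setuid.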