HELP! HELP! HELP! cgroup2-related issue on Ubuntu Jammy with Mullvad and PrivateInternetAccess VPN

Hi,
I am not able to run my container. I had the same issue on another host distro too.
Host: Ubuntu 22.04 LTS

$ lxc start ubuntu

Error:

Error: Failed to run: /snap/lxd/current/bin/lxd forkstart ubuntu /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/ubuntu/lxc.conf: 

Logs:

$ lxc info --show-log ubuntu
Log:

lxc ubuntu 20220726184019.309 WARN     cgfsng - ../src/src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits:3224 - Invalid argument - Ignoring cgroup2 limits on legacy cgroup system
lxc ubuntu 20220726184019.881 ERROR    conf - ../src/src/lxc/conf.c:turn_into_dependent_mounts:3919 - No such file or directory - Failed to recursively turn old root mount tree into dependent mount. Continuing...
lxc ubuntu 20220726184019.160 ERROR    cgfsng - ../src/src/lxc/cgroups/cgfsng.c:cgfsng_mount:2131 - No such file or directory - Failed to create cgroup at_mnt 24()
lxc ubuntu 20220726184019.160 ERROR    conf - ../src/src/lxc/conf.c:lxc_mount_auto_mounts:851 - No such file or directory - Failed to mount "/sys/fs/cgroup"
lxc ubuntu 20220726184019.160 ERROR    conf - ../src/src/lxc/conf.c:lxc_setup:4396 - Failed to setup remaining automatic mounts
lxc ubuntu 20220726184019.160 ERROR    start - ../src/src/lxc/start.c:do_start:1272 - Failed to setup container "ubuntu"
lxc ubuntu 20220726184019.160 ERROR    sync - ../src/src/lxc/sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 4)
lxc ubuntu 20220726184019.169 WARN     network - ../src/src/lxc/network.c:lxc_delete_network_priv:3631 - Failed to rename interface with index 0 from "eth0" to its initial name "veth58078e9e"
lxc ubuntu 20220726184019.169 ERROR    lxccontainer - ../src/src/lxc/lxccontainer.c:wait_on_daemonized_start:877 - Received container state "ABORTING" instead of "RUNNING"
lxc ubuntu 20220726184019.170 ERROR    start - ../src/src/lxc/start.c:__lxc_start:2107 - Failed to spawn container "ubuntu"
lxc ubuntu 20220726184019.170 WARN     start - ../src/src/lxc/start.c:lxc_abort:1036 - No such process - Failed to send SIGKILL via pidfd 20 for process 46858
lxc 20220726184024.335 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20220726184024.336 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"

lxc config:

$ lxc-checkconfig
LXC version 5.0.0~git2209-g5a7b9ce67
Kernel configuration not found at /proc/config.gz; searching...
Kernel configuration found at /boot/config-5.15.0-41-generic
--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled

--- Control groups ---
Cgroups: enabled
Cgroup namespace: enabled

Cgroup v1 mount points: 
/sys/fs/cgroup/net_cls

Cgroup v2 mount points: 
/sys/fs/cgroup

Cgroup v1 systemd controller: missing
Cgroup v1 freezer controller: missing
Cgroup v1 clone_children flag: enabled
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: enabled
Cgroup cpuset: enabled

--- Misc ---
Veth pair device: enabled, loaded
Macvlan: enabled, not loaded
Vlan: enabled, not loaded
Bridges: enabled, loaded
Advanced netfilter: enabled, loaded
CONFIG_IP_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_IP6_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_NETFILTER_XT_TARGET_CHECKSUM: enabled, not loaded
CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled, not loaded
FUSE (for use with lxcfs): enabled, not loaded

--- Checkpoint/Restore ---
checkpoint restore: enabled
CONFIG_FHANDLE: enabled
CONFIG_EVENTFD: enabled
CONFIG_EPOLL: enabled
CONFIG_UNIX_DIAG: enabled
CONFIG_INET_DIAG: enabled
CONFIG_PACKET_DIAG: enabled
CONFIG_NETLINK_DIAG: enabled
File capabilities: 

Note : Before booting a new kernel, you can check its configuration
usage : CONFIG=/path/to/config /usr/bin/lxc-checkconfig

Can you show grep cgroup /proc/mounts

Yes please have a look

cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
net_cls /sys/fs/cgroup/net_cls cgroup rw,relatime,net_cls 0 0

Can you try sudo umount /sys/fs/cgroup/net_cls, see if that fixes it?

finally, you are a life saver !!!

It’s working now
LOVE YOU
THANKS

Can you please also explain a bit what was happening and how cgroup was causing the problem ?

It’s the same issue as described in https://github.com/lxc/lxd/issues/10441

It’d be good to know what’s causing that net_cls mount on those affected systems as that’s a problem and really what should get fixed here…

I’ll look up issues on GitHub first next time, because I just found a few other solutions there before posting here.

Does this mean the problem may occur again, and that unmounting this resource will be required every time?

I did it every time I had to run a container. I resolved it by reinstalling the distro completely.

I had the same issue, and in my case the net_cls V1 cgroup is being mounted by pia-daemon, which is part of the PrivateInternetAccess VPN:

[linux_cgroup][daemon/src/linux/linux_cgroup.cpp:23][warning] The directory "/sys/fs/cgroup/net_cls" is not found, but is required by the split tunnel feature. Attempting to create.
[linux_cgroup][daemon/src/linux/linux_cgroup.cpp:30][info] Successfully created "/sys/fs/cgroup/net_cls"

I’ll disable the “Split tunnel” feature, that I wasn’t actively using anyway, and hopefully this will go away for good.

I’ve also done a complete reinstall, but the net_cls cgroup still gets mounted over cgroup2.

We’ve seen cases of Mullvad VPN causing this, see

@stgraber

In the case of net_cls cgroup mounting over cgroup2…

I use Mullvad VPN and I think I found a solution to my problem

net_cls interfering with lxd · Issue #3651 · mullvad/mullvadvpn-app · GitHub

Mullvad has updated their GitHub README.md file to include information about
a supported Mullvad environment variable:

**TALPID_NET_CLS_MOUNT_DIR** - On Linux, forces the daemon to mount the net_cls controller in the specified directory if it isn’t mounted already.

The Mullvad repository is:

GitHub - mullvad/mullvadvpn-app: The Mullvad VPN client app for desktop and mobile

The Mullvad GitHub README file describes the above environment variable and several others that Mullvad can use if configured.

Thanks, please can you explain the full solution/workaround here for other readers?

@tomp @stgraber
If you use LXD and have installed the Mullvad VPN you may find that you can no longer launch and start an LXD container.

Mullvad VPN mounts the “net_cls” cgroup v1 controller inside the cgroup2 hierarchy, which is the root of the problem.

To check whether the net_cls cgroup v1 controller is mounted over cgroup2, run:

$ grep cgroup /proc/mounts

If net_cls is shown mounted under /sys/fs/cgroup, as below:

cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
net_cls /sys/fs/cgroup/net_cls cgroup rw,relatime,net_cls 0 0

and you are unable to create and start a new LXD container, you will need to mount
net_cls somewhere else (literally anywhere else).
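
The same check can be scripted. This is only a minimal sketch and not part of the original instructions: the function name v1_mounts_under_v2 is made up for illustration, and it assumes the usual /proc/mounts field order (device, mount point, filesystem type). It prints every cgroup v1 mount point that sits below a cgroup2 mount point.

```shell
# List cgroup v1 mounts that sit inside a cgroup2 mount point.
# Usage: v1_mounts_under_v2 [mounts-file]   (defaults to /proc/mounts)
# NOTE: hypothetical helper, written for this example only.
v1_mounts_under_v2() {
    awk '
        $3 == "cgroup2" { v2[$2] = 1 }   # remember every cgroup2 mount point
        $3 == "cgroup"  { v1[$2] = 1 }   # remember every cgroup v1 mount point
        END {
            # Report each v1 mount point whose path starts with "<v2 mount point>/".
            for (m in v1)
                for (root in v2)
                    if (index(m, root "/") == 1)
                        print m
        }
    ' "${1:-/proc/mounts}"
}
```

Called with no argument it reads /proc/mounts. On a host in the state shown above it should print /sys/fs/cgroup/net_cls, and it should print nothing once net_cls has been moved elsewhere.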

The Mullvad Bug ID is:

net_cls interfering with lxd · Issue #3651 · mullvad/mullvadvpn-app · GitHub

So, as an example, let’s create a mount point /opt/net-cls-v1 (you can use any directory path and name you want):

$ sudo mkdir -p /opt/net-cls-v1

$ sudo mount -t cgroup -o net_cls net_cls /opt/net-cls-v1
$ sudo chown -R root:root /opt/net-cls-v1

After mounting net_cls on /opt/net-cls-v1, run:

$ grep cgroup /proc/mounts

Now you should see “net_cls” mounted on /opt/net-cls-v1

cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
net_cls /opt/net-cls-v1 cgroup rw,relatime,net_cls 0 0

Edit the Mullvad systemd unit file to make this new mount point for “net_cls” permanent (i.e. so that it survives reboots):

NOTE: this requires adding Environment="TALPID_NET_CLS_MOUNT_DIR=/opt/net-cls-v1/"
as a new line in the unit’s [Service] section (see below)…

$ cd /lib/systemd/system

$ sudo nano ./mullvad-daemon.service

Change the Mullvad unit file “mullvad-daemon.service” FROM:

# Systemd service unit file for the Mullvad VPN daemon
# testing if new changes are added

[Unit]
Description=Mullvad VPN daemon
Before=network-online.target
After=mullvad-early-boot-blocking.service NetworkManager.service systemd-resolved.service

StartLimitBurst=5
StartLimitIntervalSec=20
RequiresMountsFor=/opt/Mullvad\x20VPN/resources/

[Service]
Restart=always
RestartSec=1
ExecStart=/usr/bin/mullvad-daemon -v --disable-stdout-timestamps
Environment="MULLVAD_RESOURCE_DIR=/opt/Mullvad VPN/resources/"

[Install]
WantedBy=multi-user.target

To THIS:

# Systemd service unit file for the Mullvad VPN daemon
# testing if new changes are added

[Unit]
Description=Mullvad VPN daemon
Before=network-online.target
After=mullvad-early-boot-blocking.service NetworkManager.service systemd-resolved.service

StartLimitBurst=5
StartLimitIntervalSec=20
RequiresMountsFor=/opt/Mullvad\x20VPN/resources/

[Service]
Restart=always
RestartSec=1
ExecStart=/usr/bin/mullvad-daemon -v --disable-stdout-timestamps
Environment="MULLVAD_RESOURCE_DIR=/opt/Mullvad VPN/resources/"
Environment="TALPID_NET_CLS_MOUNT_DIR=/opt/net-cls-v1/"

[Install]
WantedBy=multi-user.target
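
Files under /lib/systemd/system belong to the package and are typically replaced when the package is upgraded, so the same Environment= line could instead live in a systemd drop-in. The following is only a sketch of that alternative and has not been verified against Mullvad itself: the helper name write_net_cls_dropin is made up for illustration, and the drop-in directory /etc/systemd/system/mullvad-daemon.service.d is an assumption based on the standard systemd drop-in layout.

```shell
# Write a systemd drop-in that sets TALPID_NET_CLS_MOUNT_DIR for the daemon.
#   $1: drop-in directory (assumed: /etc/systemd/system/mullvad-daemon.service.d)
#   $2: directory where net_cls should be mounted (e.g. /opt/net-cls-v1)
# NOTE: hypothetical helper, written for this example only.
write_net_cls_dropin() {
    dropin_dir=$1
    mount_dir=$2
    mkdir -p "$dropin_dir" || return 1
    # A drop-in only needs the section header plus the settings it adds.
    printf '[Service]\nEnvironment="TALPID_NET_CLS_MOUNT_DIR=%s"\n' "$mount_dir" \
        > "$dropin_dir/override.conf"
}
```

From a root shell this would be called as write_net_cls_dropin /etc/systemd/system/mullvad-daemon.service.d /opt/net-cls-v1, followed by systemctl daemon-reload so that systemd picks up the new file before the daemon is restarted.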

Finally, restart Mullvad’s daemon so that it mounts net_cls on /opt/net-cls-v1:

$ sudo systemctl restart mullvad-daemon

And verify that net_cls is now mounted on /opt/net-cls-v1

$ mount | grep net_cls
net_cls on /opt/net-cls-v1 type cgroup (rw,relatime,net_cls)
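
If you want to run this verification from a script (for example after every boot), something like the following could work. It is a minimal sketch under stated assumptions: the function name net_cls_mounted_at is made up for illustration, and it reads /proc/mounts rather than parsing the output of mount.

```shell
# Succeed (exit status 0) if a cgroup v1 hierarchy carrying the net_cls
# controller is mounted exactly at the given directory.
# Usage: net_cls_mounted_at <dir> [mounts-file]   (defaults to /proc/mounts)
# NOTE: hypothetical helper, written for this example only.
net_cls_mounted_at() {
    # "${1%/}" drops one trailing slash so /opt/net-cls-v1/ also matches.
    awk -v want="${1%/}" '
        $3 == "cgroup" && $2 == want && ("," $4 ",") ~ /,net_cls,/ { found = 1 }
        END { exit found ? 0 : 1 }
    ' "${2:-/proc/mounts}"
}
```

Example use: net_cls_mounted_at /opt/net-cls-v1 && echo "net_cls is mounted where expected"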

Now test by launching a new LXD container and you should see something like:

$ lxc launch ubuntu:22.04 cn1
Creating cn1
Starting cn1

$ lxc ls

This should now show your new cn1 container created and started properly.

And these changes will survive a reboot!

Absolutely was the cause for me, thank you for this. Would have never guessed

Note: I found a couple of problems in the above how-to about fixing the net_cls problem!

The command to mount “cgroup” was incorrect as it used to say:
$ sudo mount -t cgroup -o net_cls none /tmp/net-cls-v1

Note: /tmp is obviously not a good place if you want the change to survive reboots.

The mount command has been corrected in the above post to now say:
$ sudo mount -t cgroup -o net_cls net_cls /opt/net-cls-v1

The second problem was that running the earlier command

$ sudo systemctl edit mullvad-daemon.service

and saving did not update the [Service] section of the actual Mullvad unit file.
So the instructions above were changed to first:

$ cd /lib/systemd/system
then

edit the mullvad-daemon.service file and then add the following to the [Service] section:

Environment="TALPID_NET_CLS_MOUNT_DIR=/opt/net-cls-v1/"

Then restart the Mullvad daemon; “net_cls” should be remounted, and the change will survive reboots.

The original instructions have all been updated with the corrected information.
