KVM nested in LXC: problem accessing /dev/kvm inside LXC on Debian 11

Sorry for the ugly format - I try to improve this now…

Hi,

I am running an LXC container called android-dev on a 64-bit Debian 11 host. Android Studio requires /dev/kvm for the emulator, therefore this is KVM nested in LXC: Debian 11 > LXC > QEMU/KVM.

LXC runs on the Debian 11 host using a bridge. The config of the container is:

cat /var/lib/lxc/android-dev/config

# Uncomment the following line to support nesting containers:
# I tried with or without the next line
lxc.include = /usr/share/lxc/config/nesting.conf

# lxc.apparmor.profile = generated
lxc.apparmor.profile = unconfined
lxc.apparmor.allow_nesting = 1
lxc.net.0.type = veth
lxc.net.0.link = br0
lxc.net.0.flags = up
lxc.net.0.hwaddr = 00:16:3e:14:ba:34
lxc.rootfs.path = dir:/var/lib/lxc/android-dev/rootfs

# Common configuration
lxc.include = /usr/share/lxc/config/debian.common.conf

# Container specific configuration
lxc.tty.max = 4
lxc.uts.name = android-dev
lxc.arch = amd64
lxc.pty.max = 1024

# Permit access to /dev/loop*
#lxc.cgroup.devices.allow: b 7:* rwm
#lxc.cgroup.devices.allow: c 10:237 rwm

# Setup access to /dev/net/tun and /dev/kvm
lxc.mount.entry = /dev/net/tun dev/net/tun none bind,create=file 0 0
lxc.mount.entry = /dev/kvm dev/kvm none bind,create=file 0 0

cat /usr/share/lxc/config/nesting.conf

# Use a profile which allows nesting
lxc.apparmor.profile = lxc-container-default-with-nesting
# Add uncovered mounts of proc and sys, else unprivileged users
# cannot remount those
lxc.mount.entry = proc dev/.lxc/proc proc create=dir,optional 0 0
lxc.mount.entry = sys dev/.lxc/sys sysfs create=dir,optional 0 0

cat /usr/share/lxc/config/debian.common.conf

# This derives from the global common config
lxc.include = /usr/share/lxc/config/common.conf
# Doesn’t support consoles in /dev/lxc/
lxc.tty.dir =
# When using LXC with apparmor, the container will be confined by default.
# If you wish for it to instead run unconfined, copy the following line
# (uncommented) to the container’s configuration file.
#lxc.apparmor.profile = unconfined
# If you wish to allow mounting block filesystems, then use the following
# line instead, and make sure to grant access to the block device and/or loop
# devices below in lxc.cgroup.devices.allow.
#lxc.apparmor.profile = lxc-container-default-with-mounting
# Extra cgroup device access
## rtc
lxc.cgroup.devices.allow = c 254:0 rm
## tun
lxc.cgroup.devices.allow = c 10:200 rwm
## hpet
lxc.cgroup.devices.allow = c 10:228 rwm
## kvm
lxc.cgroup.devices.allow = c 10:232 rwm
## To use loop devices, copy the following line to the container’s
## configuration file (uncommented).
#lxc.cgroup.devices.allow = b 7:* rwm

cat /etc/lxc/default.conf

# lxc.apparmor.profile = generated
lxc.apparmor.profile = unconfined
lxc.apparmor.allow_nesting = 1
lxc.net.0.type = veth
lxc.net.0.link = lxcbr0
lxc.net.0.flags = up
lxc.net.0.hwaddr = 00:16:3e:xx:xx:xx

on the host
virt-host-validate

QEMU: Checking for hardware virtualization : PASS
QEMU: Checking if device /dev/kvm exists : PASS
QEMU: Checking if device /dev/kvm is accessible : PASS
QEMU: Checking if device /dev/vhost-net exists : PASS
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup ‘cpu’ controller support : PASS
QEMU: Checking for cgroup ‘cpuacct’ controller support : PASS
QEMU: Checking for cgroup ‘cpuset’ controller support : PASS
QEMU: Checking for cgroup ‘memory’ controller support : PASS
QEMU: Checking for cgroup ‘devices’ controller support : PASS
QEMU: Checking for cgroup ‘blkio’ controller support : PASS
QEMU: Checking for device assignment IOMMU support : PASS
QEMU: Checking if IOMMU is enabled by kernel : PASS
QEMU: Checking for secure guest support : WARN (Unknown if this platform has Secure Guest support)
LXC: Checking for Linux >= 2.6.26 : PASS
LXC: Checking for namespace ipc : PASS
LXC: Checking for namespace mnt : PASS
LXC: Checking for namespace pid : PASS
LXC: Checking for namespace uts : PASS
LXC: Checking for namespace net : PASS
LXC: Checking for namespace user : PASS
LXC: Checking for cgroup ‘cpu’ controller support : PASS
LXC: Checking for cgroup ‘cpuacct’ controller support : PASS
LXC: Checking for cgroup ‘cpuset’ controller support : PASS
LXC: Checking for cgroup ‘memory’ controller support : PASS
LXC: Checking for cgroup ‘devices’ controller support : PASS
LXC: Checking for cgroup ‘freezer’ controller support : FAIL (Enable ‘freezer’ in kernel Kconfig file or mount/enable cgroup controller in your system)
LXC: Checking for cgroup ‘blkio’ controller support : PASS
LXC: Checking if device /sys/fs/fuse/connections exists : PASS

inside android-dev lxc

virt-host-validate
QEMU: Checking for hardware virtualization : PASS
QEMU: Checking if device /dev/kvm exists : PASS
QEMU: Checking if device /dev/kvm is accessible : FAIL (Check /dev/kvm is world writable or you are in a group that is allowed to access it)
QEMU: Checking if device /dev/vhost-net exists : WARN (Load the ‘vhost_net’ module to improve performance of virtio networking)
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup ‘cpu’ controller support : PASS
QEMU: Checking for cgroup ‘cpuacct’ controller support : PASS
QEMU: Checking for cgroup ‘cpuset’ controller support : PASS
QEMU: Checking for cgroup ‘memory’ controller support : PASS
QEMU: Checking for cgroup ‘devices’ controller support : PASS
QEMU: Checking for cgroup ‘blkio’ controller support : PASS
QEMU: Checking for device assignment IOMMU support : PASS
QEMU: Checking if IOMMU is enabled by kernel : PASS
QEMU: Checking for secure guest support : WARN (Unknown if this platform has Secure Guest support)
LXC: Checking for Linux >= 2.6.26 : PASS
LXC: Checking for namespace ipc : PASS
LXC: Checking for namespace mnt : PASS
LXC: Checking for namespace pid : PASS
LXC: Checking for namespace uts : PASS
LXC: Checking for namespace net : PASS
LXC: Checking for namespace user : PASS
LXC: Checking for cgroup ‘cpu’ controller support : PASS
LXC: Checking for cgroup ‘cpuacct’ controller support : PASS
LXC: Checking for cgroup ‘cpuset’ controller support : PASS
LXC: Checking for cgroup ‘memory’ controller support : PASS
LXC: Checking for cgroup ‘devices’ controller support : PASS
LXC: Checking for cgroup ‘freezer’ controller support : FAIL (Enable ‘freezer’ in kernel Kconfig file or mount/enable cgroup controller in your system)
LXC: Checking for cgroup ‘blkio’ controller support : PASS
LXC: Checking if device /sys/fs/fuse/connections exists : PASS
root@android-dev:/#
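
To narrow down the "accessible" FAIL, it can help to try opening the device directly. This is only a minimal sketch: `can_open_rw` is a hypothetical helper name, and it is not claimed to be the check that virt-host-validate performs. It attempts a read-write open, which fails if either the file permissions or the container's device cgroup rules deny access.

```shell
#!/bin/sh
# can_open_rw PATH: succeed only if PATH exists and can be opened read-write
# by the current user. Hypothetical helper, not part of LXC or libvirt.
can_open_rw() {
    # Refuse missing paths first, because "<>" would otherwise create a file.
    [ -e "$1" ] || return 1
    # Open in a subshell so a failed redirection cannot abort the caller.
    ( : <>"$1" ) 2>/dev/null
}

# Example, inside the container:
#   can_open_rw /dev/kvm && echo "open OK" || echo "open failed"
```

Running it once as root and once as the user that starts Android Studio shows whether the problem is specific to one account.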

Something really interesting is that the group ownership of /dev/kvm on the host changes when the LXC container is started:

root@host:/var/lib/lxc/android-dev# ls -la /dev/kvm

crw-rw----+ 1 root kvm 10, 232 Jan 22 20:05 /dev/kvm

root@host:/var/lib/lxc/android-dev# lxc-start android-dev
root@host:/var/lib/lxc/android-dev# ls -la /dev/kvm

crw-rw----+ 1 root Debian-exim 10, 232 Jan 22 20:05 /dev/kvm

root@host:/var/lib/lxc/android-dev# lxc-stop android-dev

root@host:/var/lib/lxc/android-dev# ls -la /dev/kvm

crw-rw----+ 1 root Debian-exim 10, 232 Jan 22 20:05 /dev/kvm

inside lxc
ls -la /dev/kvm

crw-rw----+ 1 root kvm 10, 232 Jan 22 09:05 /dev/kvm

I saw that the group IDs in the container and on the host don’t match, most likely because I installed packages in a different order.
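
The listings above are consistent with a GID mismatch: /dev/kvm is bind-mounted, so host and container see the same device node, and a group change made inside the container would show up on the host under whatever group has that number there. That reading is an assumption, not something confirmed in this thread. A minimal sketch to compare the numeric IDs from the host; `group_gid` and `compare_kvm_gid` are hypothetical helper names, and the rootfs path is the one from lxc.rootfs.path in the config above.

```shell
#!/bin/sh
# Print the numeric GID of a group from a group(5)-format file.
# Usage: group_gid NAME [FILE]   (FILE defaults to /etc/group)
group_gid() {
    awk -F: -v name="$1" '$1 == name { print $3; exit }' "${2:-/etc/group}"
}

# Compare the host's kvm GID with the one in the container's rootfs.
# Returns success only if both exist and are equal.
compare_kvm_gid() {
    rootfs="${1:-/var/lib/lxc/android-dev/rootfs}"
    host_gid=$(group_gid kvm /etc/group)
    ct_gid=$(group_gid kvm "$rootfs/etc/group")
    echo "host kvm gid: ${host_gid:-none}, container kvm gid: ${ct_gid:-none}"
    [ -n "$host_gid" ] && [ "$host_gid" = "$ct_gid" ]
}

# Example, on the host:
#   compare_kvm_gid /var/lib/lxc/android-dev/rootfs
#   stat -c '%g' /dev/kvm    # numeric group currently set on the device node
```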

host:

root@host:/var/lib/lxc/android-dev# lsmod | grep kvm
kvm_intel 327680 0
kvm 921600 1 kvm_intel
irqbypass 16384 1 kvm

lxc:

root@android-dev:/# lsmod | grep kvm
kvm_intel 327680 0
kvm 921600 1 kvm_intel
irqbypass 16384 1 kvm

root@host:/var/lib/lxc/android-dev# cat /etc/modules

# /etc/modules: kernel modules to load at boot time.
coretemp
ipmi_devintf
ipmi_msghandler
ipmi_si
fuse
vhost_net
kvm_intel

root@android-dev:/# cat /etc/modules

# /etc/modules: kernel modules to load at boot time.
kvm_intel
vhost_net
kvm-intel

How can I get /dev/kvm to be usable within the LXC container?
Security is not an issue, I use the lxc container for pure software package separation purposes.

Help is appreciated.

Adding to my initial post:

I got a suggestion to add

lxc.cgroup.devices.allow = c 10:232 rwm

I tried it, but it didn’t work.

I am also struggling to understand how these permissions work, i.e. what c 10:232 rwm means.
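
For reference, a device cgroup rule has three fields: the device type (c for character, b for block, a for all), the major:minor device numbers (10:232 matches the "10, 232" shown by ls -la /dev/kvm above), and the allowed operations (r = read, w = write, m = mknod). The following is only an illustrative sketch that spells a rule out in words; `describe_rule` is a hypothetical helper, not an LXC command.

```shell
#!/bin/sh
# describe_rule TYPE MAJOR:MINOR ACCESS
# Spell out a devices-cgroup rule such as "c 10:232 rwm" in words.
describe_rule() {
    dtype=$1 majmin=$2 access=$3
    case "$dtype" in
        c) kind="character device" ;;
        b) kind="block device" ;;
        a) echo "all devices"; return 0 ;;
        *) echo "unknown device type: $dtype" >&2; return 1 ;;
    esac
    perms=""
    case "$access" in *r*) perms="$perms read" ;; esac
    case "$access" in *w*) perms="$perms write" ;; esac
    case "$access" in *m*) perms="$perms mknod" ;; esac
    # Everything before the colon is the major number, everything after the minor.
    echo "$kind major=${majmin%%:*} minor=${majmin##*:} allow:$perms"
}

# Example:
#   describe_rule c 10:232 rwm
#   describe_rule b '7:*' rwm     # quote the wildcard so the shell leaves it alone
```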

I found here: https://github.com/lxc/lxd/issues/2718

That it should be possible to run:

lxc config device add CONTAINER kvm unix-char path=/dev/kvm

The problem is that on Debian there is no such lxc “wrapper” command. There is lxc-config, but its feature set is different.

@stgraber Stephane, do you have any input on this?

This means that you can remove your lxc.mount.entry for /dev/kvm and can instead create the device node directly inside of the container.

This should allow you to do mknod /dev/kvm c 10 232 and then chmod 660 /dev/kvm and finally chown root:kvm /dev/kvm.

Creating your own device node in this way will save you from any potential impact on the entry on the host.
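
The three commands above can be wrapped so that they are safe to run on every container start. This is a minimal sketch under stated assumptions: `run` and `ensure_char_dev` are hypothetical helper names, and the DRY_RUN switch is only there to preview the commands without root.

```shell
#!/bin/sh
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# ensure_char_dev PATH MAJOR MINOR MODE OWNER
# Create a character device node unless one already exists at PATH.
ensure_char_dev() {
    path=$1 major=$2 minor=$3 mode=$4 owner=$5
    if [ -c "$path" ]; then
        echo "$path already exists, leaving it alone"
        return 0
    fi
    run mkdir -p "$(dirname "$path")"
    run mknod "$path" c "$major" "$minor"
    run chmod "$mode" "$path"
    run chown "$owner" "$path"
}

# Example (inside the container, as root):
#   ensure_char_dev /dev/kvm 10 232 660 root:kvm
```

Skipping an existing node avoids the "File exists" error from mknod when the script runs a second time.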

Thank you for your input, I adapted the config and added:

lxc.cgroup.devices.allow = c 10:232 rwm

to the config file and then ran

mknod /dev/kvm c 10 232
chmod 777 /dev/kvm
chown root:kvm /dev/kvm

running kvm-ok reports:

kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used

running virt-host-validate reports:

virt-host-validate
QEMU: Checking for hardware virtualization : PASS
QEMU: Checking if device /dev/kvm exists : PASS
QEMU: Checking if device /dev/kvm is accessible : FAIL (Check /dev/kvm is world writable or you are in a group that is allowed to access it)
QEMU: Checking if device /dev/vhost-net exists : WARN (Load the ‘vhost_net’ module to improve performance of virtio networking)
QEMU: Checking if device /dev/net/tun exists : FAIL (Load the ‘tun’ module to enable networking for QEMU guests)
QEMU: Checking for cgroup ‘cpu’ controller support : PASS
QEMU: Checking for cgroup ‘cpuacct’ controller support : PASS
QEMU: Checking for cgroup ‘cpuset’ controller support : PASS
QEMU: Checking for cgroup ‘memory’ controller support : PASS
QEMU: Checking for cgroup ‘devices’ controller support : PASS
QEMU: Checking for cgroup ‘blkio’ controller support : PASS
QEMU: Checking for device assignment IOMMU support : PASS
QEMU: Checking if IOMMU is enabled by kernel : PASS
QEMU: Checking for secure guest support : WARN (Unknown if this platform has Secure Guest support)
LXC: Checking for Linux >= 2.6.26 : PASS
LXC: Checking for namespace ipc : PASS
LXC: Checking for namespace mnt : PASS
LXC: Checking for namespace pid : PASS
LXC: Checking for namespace uts : PASS
LXC: Checking for namespace net : PASS
LXC: Checking for namespace user : PASS
LXC: Checking for cgroup ‘cpu’ controller support : PASS
LXC: Checking for cgroup ‘cpuacct’ controller support : PASS
LXC: Checking for cgroup ‘cpuset’ controller support : PASS
LXC: Checking for cgroup ‘memory’ controller support : PASS
LXC: Checking for cgroup ‘devices’ controller support : PASS
LXC: Checking for cgroup ‘freezer’ controller support : FAIL (Enable ‘freezer’ in kernel Kconfig file or mount/enable cgroup controller in your system)
LXC: Checking for cgroup ‘blkio’ controller support : PASS
LXC: Checking if device /sys/fs/fuse/connections exists : PASS

For testing I ran X as root and tried Android Studio. Neither virt-host-validate nor Android Studio is happy with these changes yet.

Got it working.

config file:

lxc.include = /usr/share/lxc/config/nesting.conf
lxc.apparmor.profile = unconfined
lxc.apparmor.allow_nesting = 1
lxc.net.0.type = veth
lxc.net.0.link = br0
lxc.net.0.flags = up
lxc.net.0.hwaddr = 00:16:3e:74:da:31
lxc.rootfs.path = dir:/var/lib/lxc/android-dev/rootfs
# lxc.include = /usr/share/lxc/config/debian.common.conf
lxc.tty.max = 4
lxc.uts.name = android-dev
lxc.arch = amd64
lxc.pty.max = 1024
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = b *:* m
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 254:0 rm
lxc.cgroup.devices.allow = c 10:229 rwm
lxc.cgroup.devices.allow = c 10:200 rwm
lxc.cgroup.devices.allow = c 1:7 rwm
lxc.cgroup.devices.allow = c 10:228 rwm
lxc.cgroup.devices.allow = c 10:232 rwm

inside the container run this:
mknod /dev/kvm c 10 232
chmod 777 /dev/kvm
chown root:kvm /dev/kvm
mkdir -p /dev/net
mknod /dev/net/tun c 10 200
chmod 777 /dev/net/tun
chown root:root /dev/net/tun

I needed chmod 777 for Android Studio to work; chmod 660 might be enough.

You must have installed lxc alone (using sudo apt install lxc). The command “lxc config ...” comes from lxd which is a snap package. Install it using sudo snap install lxd.

Hi Jithin,

the issue was resolved before.

No need for LXD, I just used LXC.

Kind regards

Hi, where is this configuration located? If it’s a specific configuration for a CT, where is the file?

on debian10/11:
/var/lib/lxc/CONTAINER/config

As root on the container:
root@Android10 ~# cat init.sh

mknod /dev/kvm c 10 232
chmod 777 /dev/kvm
chown root:kvm /dev/kvm
mkdir -p /dev/net
mknod /dev/net/tun c 10 200
chmod 777 /dev/net/tun
chown root:root /dev/net/tun

usermod -aG docker root
usermod -aG docker android

usermod -aG kvm root
usermod -aG kvm android

docker run --privileged -d -p 6080:6080 -p 5554:5554 -p 5555:5555 -e DEVICE="Samsung Galaxy S10" f48b3c678d6a

got:

root@Android10 ~# docker exec -it nifty_rhodes tail -f /var/log/supervisor/docker-android.stderr.log
The KVM line in /etc/group is: [kvm:x:104:]

If the current user has KVM permissions,
the KVM line in /etc/group should end with “:” followed by your username.

If we see LINE_NOT_FOUND, the kvm gr
More info on configuring VM acceleration on Linux:

General information on acceleration: Configure hardware acceleration for the Android Emulator  |  Android Developers.

kvm:x:102:root,android is present in /etc/group

LXC config for container
root@qatesting:/var/lib/lxc/105# cat config
lxc.cgroup.relative = 0
lxc.cgroup.dir.monitor = lxc.monitor/105
lxc.cgroup.dir.container = lxc/105
lxc.cgroup.dir.container.inner = ns
lxc.arch = amd64
lxc.include = /usr/share/lxc/config/debian.common.conf
lxc.apparmor.profile = generated
lxc.apparmor.allow_nesting = 1
lxc.apparmor.raw = mount fstype=fuse,
lxc.mount.entry = /dev/fuse dev/fuse none bind,create=file 0 0
lxc.monitor.unshare = 1
lxc.tty.max = 2
lxc.environment = TERM=linux
lxc.uts.name = Android10
lxc.cgroup2.memory.max = 17179869184
lxc.cgroup2.memory.swap.max = 8589934592
lxc.rootfs.path = /var/lib/lxc/105/rootfs
lxc.net.0.type = veth
lxc.net.0.veth.pair = veth105i0
lxc.net.0.hwaddr = B2:C9:5A:55:FC:6A
lxc.net.0.name = eth0
lxc.net.0.script.up = /usr/share/lxc/lxcnetaddbr
lxc.cgroup2.cpuset.cpus = 5,7-8,20-21,26,28,30

cat /usr/share/lxc/config/debian.common.conf

# This derives from the global common config
lxc.include = /usr/share/lxc/config/common.conf
# Doesn’t support consoles in /dev/lxc/
lxc.tty.dir =
# When using LXC with apparmor, the container will be confined by default.
# If you wish for it to instead run unconfined, copy the following line
# (uncommented) to the container’s configuration file.
lxc.apparmor.profile = unconfined
# If you wish to allow mounting block filesystems, then use the following
# line instead, and make sure to grant access to the block device and/or loop
# devices below in lxc.cgroup.devices.allow.
lxc.apparmor.profile = lxc-container-default-with-mounting
# Extra cgroup device access
## rtc
#lxc.cgroup.devices.allow = c 254:0 rm
## tun
#lxc.cgroup.devices.allow = c 10:200 rwm
## hpet
#lxc.cgroup.devices.allow = c 10:228 rwm
## kvm
#lxc.cgroup.devices.allow = c 10:232 rwm
## To use loop devices, copy the following line to the container’s
## configuration file (uncommented).
#lxc.cgroup.devices.allow = b 7:* rwm

lxc.cgroup.devices.allow = a
lxc.cap.drop =
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = b *:* m
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 254:0 rm
lxc.cgroup.devices.allow = c 10:229 rwm
lxc.cgroup.devices.allow = c 10:200 rwm
lxc.cgroup.devices.allow = c 1:7 rwm
lxc.cgroup.devices.allow = c 10:228 rwm
lxc.cgroup.devices.allow = c 10:232 rwm

uname -a
Linux qatesting 5.13.19-2-pve #1 SMP PVE 5.13.19-4 (Mon, 29 Nov 2021 12:10:09 +0100) x86_64 GNU/Linux

I’m working on an unprivileged container with fuse=1 and nesting=1 via Proxmox. Could anyone help me?