Problem starting container - Failed to retrieve mode of device /dev/nvidiactl

Hi,
I'm on LXD 3.0.3.
When I run lxc start guidisco (Ubuntu 19.04 with MATE), I get this message:
Error: Common start logic: Failed to retrieve mode of device /dev/nvidiactl: open /dev/nvidiactl: no such device or address

Sounds like you somehow have a /dev/nvidiactl device on your host but without the nvidia driver properly loaded/functional.

Can you show lxc config show --expanded guidisco as well as nvidia-smi?
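For anyone else hitting this, a quick host-side check along these lines may help tell a missing device node apart from one that exists but cannot be opened (which is what the error above describes). This is only a rough sketch: probe_device is a made-up helper name, not something LXD or the NVIDIA driver ships, and a node that is unreadable for permission reasons would also show up as unusable.

```shell
#!/bin/sh
# Hypothetical helper (not part of LXD): report whether a device node
# exists and whether it can actually be opened for reading.
probe_device() {
    dev="$1"
    if [ ! -e "$dev" ]; then
        # No node at all: LXD would have nothing to trip over.
        echo "missing"
    elif ( : < "$dev" ) 2>/dev/null; then
        # The open succeeded, so a driver is answering behind the node.
        echo "ok"
    else
        # Node exists but open failed (e.g. "no such device or address").
        echo "present-but-unusable"
    fi
}

# Typical use on the host (not run here):
#   probe_device /dev/nvidiactl
#   lsmod | grep -i nvidia     # is the kernel module loaded?
#   nvidia-smi                 # does the driver respond?
```

If the result is present-but-unusable, that would match the diagnosis above: the node is there, but the driver behind it is not loaded or not working.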

architecture: x86_64
config:
  environment.DISPLAY: :0
  image.architecture: amd64
  image.description: Ubuntu disco amd64 (20181223_07:42)
  image.name: ubuntu-disco-amd64-default-20181223_07:42
  image.os: ubuntu
  image.release: disco
  image.serial: "20181223_07:42"
  image.variant: default
  raw.idmap: both 1000 1000
  user.user-data: |
    #cloud-config
    runcmd:
      - 'sed -i "s/; enable-shm = yes/enable-shm = no/g" /etc/pulse/client.conf'
      - 'echo export PULSE_SERVER=unix:/tmp/.pulse-native | tee --append /home/ubuntu/.profile'
    packages:
      - x11-apps
      - mesa-utils
      - pulseaudio
  volatile.base_image: 8d2efd5e43bed79f6fa6917c31873fb858e48b86cc4a83d452b30261a1ee517f
  volatile.eth0.hwaddr: 00:16:3e:a4:1e:33
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":165536,"Nsid":0,"Maprange":1000},{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":1000,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":166537,"Nsid":1001,"Maprange":64535},{"Isuid":false,"Isgid":true,"Hostid":165536,"Nsid":0,"Maprange":1000},{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":1000,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":166537,"Nsid":1001,"Maprange":64535}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":165536,"Nsid":0,"Maprange":1000},{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":1000,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":166537,"Nsid":1001,"Maprange":64535},{"Isuid":false,"Isgid":true,"Hostid":165536,"Nsid":0,"Maprange":1000},{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":1000,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":166537,"Nsid":1001,"Maprange":64535}]'
  volatile.last_state.power: STOPPED
devices:
  PASocket:
    path: /tmp/.pulse-native
    source: /run/user/1000/pulse/native
    type: disk
  X0:
    path: /tmp/.X11-unix/X0
    source: /tmp/.X11-unix/X0
    type: disk
  eth0:
    name: eth0
    nictype: bridged
    parent: virbr0
    type: nic
  mygpu:
    type: gpu
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
- gui
stateful: false
description: ""

Is there some kind of blacklist where you can list devices, so the container ignores them?

Containers don’t normally rely on host devices, though in your case, you do have a gpu device which then requires LXD be able to access all GPU and GPU related devices on the host to set it up, including /dev/nvidiactl.

It sounds like your NVIDIA setup is currently non-functional on the host; fixing that will fix the container. Alternatively, if you don’t use the NVIDIA GPU, you can tell LXD to only pass whatever other GPU you have, which should get rid of that error.

OK, I found your answer to a similar topic here


I checked my Intel card and ran this command:
lxc config device add gui1404 gpu0 gpu pci=0000:00:02.0
but I still get
Error: Common start logic: Failed to retrieve mode of device /dev/nvidiactl: open /dev/nvidiactl: no such device or address
What am I doing wrong?

@Mariusz_Peryt did you first remove your mygpu device with lxc config device remove?
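To spell out the order of operations being suggested: the catch-all gpu device has to go first, and only then does a PCI-restricted one keep LXD away from the NVIDIA nodes. The sketch below just prints the two commands rather than running them. gpu_swap_commands is a made-up helper name, and the container and device names are taken from the posts above as an example. If mygpu actually comes from the gui profile rather than from the container itself, it would probably have to be removed from the profile instead (lxc profile device remove gui mygpu).

```shell
#!/bin/sh
# Hypothetical helper: print the lxc commands that swap a catch-all gpu
# device for one limited to a single PCI address. Nothing is executed.
gpu_swap_commands() {
    container="$1"   # e.g. guidisco
    old_dev="$2"     # existing unrestricted gpu device, e.g. mygpu
    new_dev="$3"     # name for the restricted device, e.g. gpu0
    pci="$4"         # PCI address of the GPU to pass, e.g. 0000:00:02.0
    echo "lxc config device remove $container $old_dev"
    echo "lxc config device add $container $new_dev gpu pci=$pci"
}

# Example with the names from this thread:
gpu_swap_commands guidisco mygpu gpu0 0000:00:02.0
```

Afterwards, lxc config show --expanded on the container should list only the restricted gpu device. If an unrestricted type: gpu entry is still there, LXD will keep trying to set up /dev/nvidiactl at start.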

I have the same question

An input/output error suggests some weird kernel issue. Is there anything in dmesg?