VirtualGL in a container

Hello,

I want to use VirtualGL inside a headless container (running on a remote server) and access it via VNC. I installed VirtualGL in a CentOS container, configured it with /opt/VirtualGL/bin/vglserver_config, and started a VNC session. Unfortunately, I get an error when trying to start glxgears:

$ vglrun glxgears
[VGL] ERROR: Could not open display :0.
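
(For reference, whether anything is actually listening on display :0 inside the container can be checked with something like the following; xdpyinfo is an assumption on my part and may need to be installed first:)

$ ls /tmp/.X11-unix/
$ DISPLAY=:0 xdpyinfo | head -3

Since the container is headless, I suspect there is simply no X server at :0 for vglrun to talk to.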

I can access the GPU inside the container:

$ nvidia-smi 
Fri Nov 22 05:21:54 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.31       Driver Version: 440.31       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P2000        Off  | 00000000:82:00.0 Off |                  N/A |
| 79%   46C    P0    19W /  75W |      0MiB /  5059MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The configuration log is as follows:

# /opt/VirtualGL/bin/vglserver_config

1) Configure server for use with VirtualGL
2) Unconfigure server for use with VirtualGL
X) Exit

Choose:
1

Restrict 3D X server access to vglusers group (recommended)?
[Y/n]


Restrict framebuffer device access to vglusers group (recommended)?
[Y/n]


Disable XTEST extension (recommended)?
[Y/n]

... Creating vglusers group ...
groupadd: group 'vglusers' already exists
Could not add vglusers group (probably because it already exists.)
... Creating /etc/opt/VirtualGL/ ...
... Granting read permission to /etc/opt/VirtualGL/ for vglusers group ...
... Modifying /etc/security/console.perms to disable automatic permissions
    for DRI devices ...
... Creating /etc/modprobe.d/virtualgl.conf to set requested permissions for
    /dev/nvidia* ...
... Attempting to remove nvidia module from memory so device permissions
    will be reloaded ...
rmmod: ERROR: Module nvidia is in use by: nvidia_uvm nvidia_modeset
... Granting write permission to /dev/nvidia-uvm /dev/nvidia-uvm-tools /dev/nvidia0 /dev/nvidiactl for vglusers group ...
chmod: changing permissions of '/dev/nvidia-uvm': Read-only file system
chmod: changing permissions of '/dev/nvidia-uvm-tools': Read-only file system
chmod: changing permissions of '/dev/nvidiactl': Read-only file system
chown: changing ownership of '/dev/nvidia-uvm': Read-only file system
chown: changing ownership of '/dev/nvidia-uvm-tools': Read-only file system
chown: changing ownership of '/dev/nvidiactl': Read-only file system
... Granting write permission to /dev/dri/card0 for vglusers group ...
... Modifying /etc/X11/xorg.conf.d/99-virtualgl-dri to enable DRI permissions
    for vglusers group ...
... Modifying /etc/X11/xorg.conf to enable DRI permissions
    for vglusers group ...
... Adding vglgenkey to /etc/gdm/Init/Default script ...
... Creating /usr/share/gdm/greeter/autostart/virtualgl.desktop ...
... Disabling XTEST extension in /etc/gdm/custom.conf ...
... Setting default run level to 5 (enabling graphical login prompt) ...
... Commenting out DisallowTCP line (if it exists) in /etc/gdm/custom.conf ...

Done. You must restart the display manager for the changes to take effect.

IMPORTANT NOTE: Your system uses modprobe.d to set device permissions. You
must execute rmmod nvidia with the display manager stopped in order for the
new device permission settings to become effective.
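
Since the chmod/chown calls above fail on the read-only /dev inside the container, I assume the device permissions would have to be adjusted on the host instead, roughly like this (just a sketch; the vglusers GID would have to match between host and container for this to be useful):

$ sudo groupadd vglusers   # on the host, if it does not already exist
$ sudo chgrp vglusers /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm /dev/nvidia-uvm-tools
$ sudo chmod g+rw /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm /dev/nvidia-uvm-tools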

I don't have any display manager running inside the container. How can I get VirtualGL running in an LXD container?

Thanks

I have VirtualGL running on the host, which is running lightdm. I checked it with:

$ vglconnect -s user@server
$ VGL_LOGO=1 vglrun +v glxinfo|head -10
[VGL] NOTICE: Added /usr/lib to LD_LIBRARY_PATH
[VGL] Shared memory segment ID for vglconfig: 10
[VGL] VirtualGL v2.6.2 64-bit (Build 20190603)
[VGL] Opening connection to 3D X server :0
[VGL] Using Pbuffers for rendering
name of display: localhost:10.0
display: localhost:10  screen: 0
direct rendering: Yes
server glx vendor string: VirtualGL
server glx version string: 1.4
server glx extensions:
    GLX_ARB_create_context, GLX_ARB_create_context_profile, 
    GLX_ARB_create_context_robustness, GLX_ARB_fbconfig_float, 
    GLX_ARB_get_proc_address, GLX_ARB_multisample, GLX_EXT_framebuffer_sRGB,

The container is defined as:

$ lxc config show plex
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Archlinux current amd64 (20190914_04:18)
  image.os: Archlinux
  image.release: current
  image.serial: "20190914_04:18"
  image.type: squashfs
  nvidia.runtime: "true"
  raw.idmap: |
    uid 816 816
    gid 816 816
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":816},{"Isuid":true,"Isgid":false,"Hostid":816,"Nsid":816,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":1000817,"Nsid":817,"Maprange":999999183},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":816},{"Isuid":false,"Isgid":true,"Hostid":816,"Nsid":816,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":1000817,"Nsid":817,"Maprange":999999183}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":816},{"Isuid":true,"Isgid":false,"Hostid":816,"Nsid":816,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":1000817,"Nsid":817,"Maprange":999999183},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":816},{"Isuid":false,"Isgid":true,"Hostid":816,"Nsid":816,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":1000817,"Nsid":817,"Maprange":999999183}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":816},{"Isuid":true,"Isgid":false,"Hostid":816,"Nsid":816,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":1000817,"Nsid":817,"Maprange":999999183},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":816},{"Isuid":false,"Isgid":true,"Hostid":816,"Nsid":816,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":1000817,"Nsid":817,"Maprange":999999183}]'
  volatile.last_state.power: STOPPED
devices:
  gpu:
    type: gpu
ephemeral: false
profiles:
- default
stateful: false
description: ""

Unfortunately, it no longer starts with nvidia.runtime: "true". It used to start before I set up VirtualGL. With nvidia.runtime turned off, the container does show the output of nvidia-smi.
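
For what it's worth, the failed start can at least be inspected from the host like this (standard LXD commands, nothing assumed beyond the config shown above):

$ lxc config set plex nvidia.runtime true
$ lxc start plex
$ lxc info plex --show-log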

I started from scratch and set up a new container whose config looks like this:

$ lxc config show sim1
architecture: x86_64
config:
  environment.DISPLAY: :0
  image.architecture: amd64
  image.description: Archlinux current amd64 (20191208_04:18)
  image.os: Archlinux
  image.release: current
  image.serial: "20191208_04:18"
  image.type: squashfs
  raw.idmap: |
    uid 1001 1001
    gid 100 100
    gid 1002 1002
  security.privileged: "true"
  volatile.base_image: bc145aae1ed946126064bf95758a31a3ae9672327643b4bed026a25c6c82eeef
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[]'
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
devices:
  Xauthority:
    path: /home/user/.Xauthority
    source: /home/user/.Xauthority
    type: disk
  eth0:
    nictype: bridged
    parent: vlan300br
    type: nic
  mygpu:
    productid: 1d01
    type: gpu
    vendorid: 10de
  vglxauthkey:
    path: /etc/opt/VirtualGL
    source: /etc/opt/VirtualGL
    type: disk
  x11:
    path: /mnt/x11
    source: /tmp/.X11-unix
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

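Before testing X forwarding, the GPU device nodes and the mounted X socket can be checked from the host with something like this (paths as configured above):

$ lxc exec sim1 -- ls -la /dev/dri
$ lxc exec sim1 -- ls -la /mnt/x11
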
I have /tmp/.X11-unix/X0 and /home/user/.Xauthority matching the host:

$ ls -la /tmp/.X11-unix/X0 
srwxrwxrwx 1 root root 0 Dec  8 21:21 /tmp/.X11-unix/X0
$ lxc exec sim1 -- ls -la /tmp/.X11-unix/X0 
srwxrwxrwx 1 root root 0 Dec  8 21:21 /tmp/.X11-unix/X0


$ lxc exec sim1 -- ls -la /home/user/.Xauthority 
-rw------- 0 user users 488 Dec  8 21:25 /home/user/.Xauthority
$ ls -la /home/user/.Xauthority 
-rw------- 1 user users 488 Dec  8 22:04 /home/user/.Xauthority
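
The cookies themselves can be compared on both sides with something like the following (xauth list only prints entries and does not modify the file):

$ xauth -f /home/user/.Xauthority list
$ lxc exec sim1 -- xauth -f /home/user/.Xauthority list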

However, if I try to connect via ssh or vglconnect, my xauth is not accepted.

$ vglconnect -s user@172.16.3.131

VirtualGL Client 64-bit v2.5.2 (Build 20191122)
vglclient is already running on this X display and accepting SSL
   connections on port 4243.
vglclient is already running on this X display and accepting unencrypted
   connections on port 4242.

Making preliminary SSH connection to find a free port on the server ...
Making final SSH connection ...
/usr/bin/xauth:  unable to rename authority file /home/user/.Xauthority, use /home/user/.Xauthority-n

$ ssh -X -Y  user@172.16.3.131
Last login: Sun Dec  8 22:05:08 2019 from 172.16.1.28
/usr/bin/xauth:  unable to rename authority file /home/user/.Xauthority, use /home/user/.Xauthority-n
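
My guess is that xauth writes its changes to a temporary file (the .Xauthority-n it mentions) and then tries to rename it over .Xauthority, which cannot succeed while .Xauthority is a bind-mounted disk device rather than a regular file. If that is the cause, one workaround might be to drop the Xauthority device and copy the :0 cookie into the container instead, roughly like this (file names are just examples, and I have not verified this):

$ xauth extract /tmp/x0.cookie :0                        # on the host
$ lxc file push /tmp/x0.cookie sim1/tmp/x0.cookie
$ lxc exec sim1 -- su - user -c 'xauth merge /tmp/x0.cookie'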

How can I make the container accept my xauth?