Can we set VM pre-start hooks?

I have a laptop that has Intel and Nvidia GPUs and the Nvidia GPU is not used by the host (no drivers installed). I only use it for GPU passthrough to VMs.

There’s a module called bbswitch that can be used to power down the Nvidia GPU when it is not in use, which can greatly extend battery life. Right now, my solution for VMs that get the GPU passed through is to run the following commands before the VM starts up:

```shell
# turn on the dGPU
echo ON > /proc/acpi/bbswitch
# unload the bbswitch module
rmmod bbswitch
```

and these when the VM shuts down:

```shell
# load bbswitch module
modprobe bbswitch
# turn off the dGPU
echo OFF > /proc/acpi/bbswitch
```

What is the recommended solution to have LXD automate these tasks for me?

I managed to emulate the post-stop hook by writing a python-lxd script that listens for the “instance-shutdown” lifecycle event and turns off the GPU.

While there’s an event for when the instance starts up, it only fires once the instance is already started, whereas ideally we’d remove the bbswitch module before it starts (I assume bbswitch binds to the GPU, which can cause problems if vfio-pci also tries to bind it).

This would be the correct approach, as LXD deliberately does not support hooks: they are difficult to reason about when making changes in the future (as it’s unknown what people are using them for).

See also

Does the VM not start if bbswitch module is loaded?

The VM does start, and it automatically turns ON the GPU if it was turned off before. In fact, I was relying on this behavior and had the python script run as a system service doing something like this:

  • Upon startup, it checks whether the VM is running; if not, it turns off the GPU
  • Every time it receives the shutdown event, it turns off the GPU 10 seconds later
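Roughly, the same service logic can also be expressed without python-lxd by parsing `lxc monitor` output in shell. A sketch, with `gpu-vm` as a placeholder instance name; it assumes `--format json` emits one event per line and simply globs for the action and instance name in that line:

```shell
#!/bin/sh
# Turn the dGPU off at service startup (if the VM is down) and again
# 10 seconds after each instance-shutdown event for that VM.
VM=gpu-vm        # placeholder instance name

gpu_off() {
    modprobe bbswitch
    echo OFF > /proc/acpi/bbswitch
}

# On startup: if the VM is not running, power the GPU down.
[ "$(lxc list -f csv -c s "$VM")" = "RUNNING" ] || gpu_off

# Watch lifecycle events and react to shutdowns of this VM.
lxc monitor --type=lifecycle --format=json |
while read -r event; do
    case $event in *instance-shutdown*) ;; *) continue ;; esac
    case $event in
    *"$VM"*)
        sleep 10    # give the VM time to fully release the device
        gpu_off
        ;;
    esac
done
```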

However, this solution is not perfect and I noticed some strange behavior: if I turn off the GPU while the VM is running (in fact, I even had to wait a while after the VM shut down), it messes with the GPU’s audio device, which then shows up as missing in Device Manager. I can only fix this by rebooting the host. Also, at random, when I restart the VM (a restart, not a full shutdown), the GPU simply disappears from the guest. None of these issues seem to happen when bbswitch is not loaded.

Another reason I was looking for an alternative is that it would also be useful to have the nvidia drivers running on the host, with the ability to dynamically unload them so the GPU can be passed through.

I wonder if it would be possible to implement a mechanism to delay instance startup via the REST API. For example, if the instance config has a delay flag set, LXD would send a “starting” event over the websocket and then only start the instance 5 seconds later, or as soon as an event consumer sent a message back (whichever comes first).

This would give external scripts a mechanism to emulate pre-startup hook behavior.

Are you able to unload the module whilst the VM is running?

Yes, it seems possible to unload the module while the VM is running, but it still seems to “break” something.

Here’s a test I just did:

  • load bbswitch and turn off the GPU
  • start the VM (this automatically turns the GPU on as shown by cat /proc/acpi/bbswitch)
  • unload bbswitch
  • reboot the VM

After the reboot, the VM is missing the GPU. When the VM loses the GPU, I can fix it by shutting it down and starting it again. Still, something seems off after this happens, because reading from /proc/acpi/bbswitch now has a small delay:

```
$ time cat /proc/acpi/bbswitch
0000:01:00.0 ON

real    0m0.340s
user    0m0.000s
sys     0m0.008s
```

This is the normal behavior:

```
$ time cat /proc/acpi/bbswitch
0000:01:00.0 ON

real    0m0.001s
user    0m0.001s
sys     0m0.000s
```

To summarize, it is possible to make this work despite the side effects, but the only clean approach is to unload the module before the VM starts.

I’ve sent a suggestion on how to implement the necessary infrastructure for pre-startup hooks: