Host keeps rebooting after launching a VM (XCP-NG hypervisor)

Hi,

We are getting a strange issue in LXD. After installing it, we managed to spin up a few containers without any issues. However, after trying out the virtual machine option, the host machine just keeps rebooting.

Some details I can gather at the moment:

Hypervisor used: XCP-NG
OS: Ubuntu 20.04 (I don’t have the kernel version at the moment, but it was just upgraded to the latest one)
LXD Version: lxd 5.6-794016a
Storage backend: Block based BTRFS

I can’t find any obvious issues in /var/log/syslog or lxd.log. However, a colleague has shared the attached log from the console when booting in rescue mode.

Another question: since we already have a few container workloads, I was wondering how to recover them first while we investigate why virtual machines cannot be started.

Does anyone know:

1.) How to force the containers/VMs not to start? Since the host is in rescue mode, I assume lxc commands will not work.
2.) How to disable the lxd service on startup and then bring it up manually without starting anything? I suspect the lxc start --vm that I ran caused this issue, because the reboots started from there, while the containers were perfectly fine.
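
A rough sketch of one way to approach question 1 once the LXD daemon is reachable again: set boot.autostart to false on every instance, so that the next LXD start-up does not restore anything. The function name disable_autostart_all is hypothetical, and the sketch only covers the current LXD project.

```shell
#!/bin/sh
# Sketch: mark every instance in the current project as "do not autostart".
# Assumes the `lxc` client is on PATH and the LXD daemon is reachable.
disable_autostart_all() {
    # One instance name per line (CSV output, name column only).
    lxc list --format csv -c n | while IFS= read -r name; do
        [ -n "$name" ] || continue
        # stdin is redirected so the command cannot consume the name list.
        lxc config set "$name" boot.autostart false < /dev/null
    done
}
```

For question 2, if LXD was installed from the snap (an assumption here), `snap stop --disable lxd` should keep the daemon from starting at boot, and `snap start --enable lxd` should bring it back. Because lxc commands need a running daemon, the function above only helps after LXD can start without crashing the host.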

Thanks in advance.

I’m not sure if Xen supports nested VMs out of the box.

Are you saying that LXD is trying to start the VM and crashing the host doing so?

Yes @tomp

I asked the admin to expose the CPU. Here is the lscpu output:

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   46 bits physical, 48 bits virtual
CPU(s):                          16
On-line CPU(s) list:             0-15
Thread(s) per core:              1
Core(s) per socket:              1
Socket(s):                       16
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           85
Model name:                      Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
Stepping:                        4
CPU MHz:                         2095.207
BogoMIPS:                        4190.20
Virtualisation:                  VT-x
Hypervisor vendor:               Xen
Virtualisation type:             full
L1d cache:                       512 KiB
L1i cache:                       512 KiB
L2 cache:                        16 MiB
L3 cache:                        352 MiB
NUMA node0 CPU(s):               0-15
Vulnerability Itlb multihit:     KVM: Mitigation: Split huge pages
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX conditional cache flushes, SMT disabled
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Mmio stale data:   Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Mitigation; Clear CPU buffers; SMT Host state unknown
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush acpi mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt clwb xsaveopt xsavec xgetbv1 xsaves pku ospke md_clear flush_l1d
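
The output above lists vmx among the flags, which is the Intel VT-x flag that KVM looks for. A minimal sketch of scripting the same check against /proc/cpuinfo follows; the function name has_hw_virt_flag is hypothetical. Seeing the flag only shows that the hypervisor exposes it to the guest, not that nested virtualisation is stable under Xen.

```shell
#!/bin/sh
# has_hw_virt_flag [CPUINFO]
# Succeeds if the first "flags" line lists vmx (Intel VT-x) or svm (AMD-V).
# CPUINFO defaults to /proc/cpuinfo; another path can be passed for testing.
has_hw_virt_flag() {
    cpuinfo=${1:-/proc/cpuinfo}
    grep -m1 '^flags' "$cpuinfo" | grep -Eqw '(vmx|svm)'
}
```

Usage: `has_hw_virt_flag && echo "virtualisation flag exposed"`.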

If you prevent the kvm or vsock modules from loading, that will stop LXD from trying to start the VM, allowing you to start LXD and then remove the VM.
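
One way this could be done is with a modprobe configuration file, sketched below. The module names (kvm_intel, kvm, vhost_vsock), the file name disable-kvm.conf and the function name write_kvm_blocklist are assumptions, not something confirmed in this thread; kvm_intel is picked because the CPU here is Intel.

```shell
#!/bin/sh
# write_kvm_blocklist DIR
# Writes DIR/disable-kvm.conf telling modprobe to refuse to load the
# listed modules. On a real host DIR would be /etc/modprobe.d (as root).
write_kvm_blocklist() {
    dir=${1:?usage: write_kvm_blocklist DIR}
    conf="$dir/disable-kvm.conf"   # hypothetical file name
    : > "$conf"
    for mod in kvm_intel kvm vhost_vsock; do
        # "install <mod> /bin/false" makes explicit modprobe calls fail;
        # "blacklist" alone only stops alias-based autoloading.
        printf 'blacklist %s\ninstall %s /bin/false\n' "$mod" "$mod" >> "$conf"
    done
}
```

After running `write_kvm_blocklist /etc/modprobe.d` as root and rebooting, LXD should come up without VM support, at which point something like `lxc delete <vm-name> --force` can remove the VM. Delete the file again afterwards if VM support is wanted later.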

See for the things that LXD checks to detect VM support:
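
As a rough illustration, a check for the device nodes that LXD is understood to look for might be sketched as below. The list of paths (/dev/kvm, /dev/vsock, /dev/vhost-vsock) is an assumption based on my reading of LXD's behaviour and may not match every version; the function name check_vm_devices is hypothetical.

```shell
#!/bin/sh
# check_vm_devices [ROOT]
# Prints one line per device node and returns non-zero if any is missing.
# ROOT is an optional path prefix, useful for testing against a fake tree.
check_vm_devices() {
    root=${1:-}
    missing=0
    for dev in /dev/kvm /dev/vsock /dev/vhost-vsock; do
        if [ -e "$root$dev" ]; then
            echo "present: $dev"
        else
            echo "missing: $dev"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_vm_devices` with no argument inspects the live system. If the kvm or vsock modules are blocked from loading, the matching nodes should be reported as missing.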

This might help:

Thanks a lot. I will try this, but it’s weird that when I tried to start the VM, the system just rebooted. After a few tries it would not come up at all and we needed to restore it. Is there anything else I can check on the LXD side to see what might be causing this, such as a dump of exactly what it does, persisted to a log before the host OS reboots?

LXD just runs QEMU for VMs, so it’s possible that this is an incompatibility of running QEMU KVM VMs inside Xen guests.