Do you get the same problem if you refresh to the 5.0/stable snap channel?
We can’t update lxd from 4.0.9-eb5e237 (23991 4.0/stable) to 5.0/stable because this is a production server with lots of virtual machines and containers running, and we can’t allow downtime for the other VMs and containers.
Do you have any other suggestions for fixing this important issue without refreshing lxd to 5.0 or restarting the server?
What does snap info lxd show?
I don’t think that LXD version has been updated for a long time, which then suggests something else has changed on that server.
But let’s check.
Also just as a side note, the LXD 4.0 LTS series is only receiving security bug fixes now, so for continued general bug fix/environmental change support you need to be running the LXD 5.0 LTS series.
See Managing the LXD snap for more info about the different snap channels.
Do you know of anything that has changed on that server recently? Any updates applied?
Can new VMs be launched?
Looking at the snap change log it seems there were some cherry picks of dependency updates 10 days ago on the 22nd Nov, and the latest 4.0/stable package was built on 25th Nov so would include those changes.
I suspect the qemu change is the most likely candidate for the breakage.
I’ll have a look and see if I can recreate it.
Please can you provide lxc config show <instance> --expanded
and lxc storage show <pool>
and lxc network show <network>
for the relevant instance, pool and network.
Thanks
I just recreated the same issue on the LXD 4.0/stable channel:
snap install lxd --channel=4.0/stable
lxd init --auto
lxc launch images:ubuntu/focal v1 --vm
Creating v1
Starting v1
Error: Failed to run: forklimits limit=memlock:unlimited:unlimited -- /snap/lxd/23991/bin/qemu-system-x86_64 -S -name v1 -uuid d44bab24-1a63-4f5a-b072-d5eef160a1aa -daemonize -cpu host,hv_passthrough -nographic -serial chardev:console -nodefaults -no-user-config -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=deny,resourcecontrol=deny -readconfig /var/snap/lxd/common/lxd/logs/v1/qemu.conf -spice unix=on,disable-ticketing=on,addr=/var/snap/lxd/common/lxd/logs/v1/qemu.spice -pidfile /var/snap/lxd/common/lxd/logs/v1/qemu.pid -D /var/snap/lxd/common/lxd/logs/v1/qemu.log -smbios type=2,manufacturer=Canonical Ltd.,product=LXD -runas lxd: : Process exited with non-zero value -1
Try `lxc info --show-log local:v1` for more info
lxc info --show-log local:v1
Name: v1
Location: none
Remote: unix://
Architecture: x86_64
Created: 2022/12/02 11:17 UTC
Status: Stopped
Type: virtual-machine
Profiles: default
Log:
What does sudo snap changes show for LXD? I wonder if we can get a past revision.
New VMs can’t be launched either; we get the same error.
We have the latest 4.0.9 release from the 4.0/stable channel (updated about a week ago):
snap list |grep lxd
lxd 4.0.9-eb5e237 23991 4.0/stable canonical** in-cohort
lxc config show w10-terminal-ssd --expanded
architecture: x86_64
config:
  boot.autostart: "true"
  boot.autostart.priority: "195"
  limits.cpu: "8"
  limits.memory: 24GB
  security.secureboot: "false"
  volatile.eth0.hwaddr: 00:16:3e:8d:2a:6e
  volatile.last_state.power: STOPPED
  volatile.uuid: ccfe6235-1186-446c-9e15-a28ee1b2a21a
  volatile.vsock_id: "662"
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br0
    type: nic
  root:
    path: /
    pool: ssd
    size: 240GB
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""
lxc storage show ssd
config: {}
description: ""
name: ssd
driver: btrfs
used_by:
- /1.0/instances/vartai2ssd
- /1.0/instances/w10-terminal-ssd
- /1.0/instances/w10-unsql-ssd
- /1.0/profiles/default
- /1.0/profiles/no_net
status: Created
locations:
- paralel-linux
- universe-linux
- cluster-linux
- blazar-linux
lxc network show br0
config: {}
description: ""
name: br0
type: bridge
used_by:
- /1.0/instances/gitlab
- /1.0/instances/nextcloud-dev
- /1.0/instances/w10-grybas
- /1.0/instances/w10-terminal-ssd
- /1.0/instances/wiki
- /1.0/instances/zabbix
- /1.0/profiles/ceph-hdd
- /1.0/profiles/ceph-ssd
- /1.0/profiles/default
- /1.0/profiles/default_hdd
managed: false
status: ""
locations: []
snap changes
no changes found
snap changes lxd
no changes found
What does this show?
snap list lxd --all
I have a test server (not connected to the LXD cluster) with the latest lxd 4.0.9 from the 4.0/stable channel, and the VMs don’t start there either. I then refreshed lxd on the test server to the 5.0/stable channel, and the issue is fixed in the latest LXD 5.0! I’m pasting the snap changes output:
mantas@neutron-star:/# snap changes
ID   Status  Spawn                     Ready               Summary
109  Error   8 days ago, at 08:17 UTC  today at 07:21 UTC  Auto-refresh snap "lxd"
110  Done    today at 07:21 UTC        today at 09:16 UTC  Auto-refresh snaps "lxd", "snapd"
111  Done    today at 09:17 UTC        today at 09:18 UTC  Remove "lxd" snap
113  Done    today at 09:20 UTC        today at 09:20 UTC  Install "lxd" snap from "4.0/stable" channel
114  Done    today at 11:33 UTC        today at 11:33 UTC  Refresh "lxd" snap from "5.0/stable" channel
Yes, I expected 5.0 LTS would work, as it’s similar to (or the same as) this issue:
snap list lxd --all
Name  Version        Rev    Tracking    Publisher   Notes
lxd   4.0.9-8e2046b  22753  4.0/stable  canonical✓  disabled,in-cohort
lxd   4.0.9-eb5e237  23991  4.0/stable  canonical✓  in-cohort
Ah OK, so you could start by increasing the number of revisions kept so you don’t lose a working one:
sudo snap set system refresh.retain=n
Then try:
sudo snap revert lxd --revision <revision number>
As you’re running the LTS series, reverting should be possible as we don’t include DB/API schema changes that would prevent reverting.
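If it helps, here is a minimal sketch of how the revision number for the revert could be picked out of the snap list lxd --all output rather than copied by hand. The function name disabled_lxd_rev is made up for illustration, and it assumes the Notes column is the last field and marks the older revision as disabled, as in the output pasted above.

```shell
# Print the revision of any disabled (previously installed) lxd snap.
# Expects `snap list lxd --all` style output on stdin, with the columns:
#   Name  Version  Rev  Tracking  Publisher  Notes
# Assumption: the Notes column is the last field on each line.
disabled_lxd_rev() {
    awk '$1 == "lxd" && $NF ~ /(^|,)disabled(,|$)/ { print $3 }'
}

# Example use on a real host (left commented out because it changes system state):
#   rev=$(snap list lxd --all | disabled_lxd_rev)
#   sudo snap set system refresh.retain=3   # 3 is only an example value
#   sudo snap revert lxd --revision "$rev"
```

If more than one disabled revision is retained, the function prints one per line, so it is worth checking the value of rev before running the revert.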
Just to check: will the other virtual machines and containers running on the same server be restarted when I revert lxd? AFAIK I should run this command, right?
snap revert lxd --revision 22753
No, they shouldn’t be, as this is the same as the snap refresh that occurs automatically, only in the other direction.
However, as you’re running a cluster, you might need to do this on the other members too. You’ll know if you need to, because the snap refresh will pause waiting for the other members to arrive at the same revision.
I think this shouldn’t be needed though, as it’s just a minor snap revision and not a schema or API change.
@stgraber is looking into this now, but we suspect the issue is that the more recent QEMU version in the 4.0 LTS snap is causing a seccomp violation.
We think this commit also needs to be backported into the LTS 4.0 series:
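Separately from the commit mentioned above, anyone wanting to check the seccomp theory on their own host could look for seccomp audit records in the kernel log. The following is only a rough sketch: the function name seccomp_qemu_events is made up, and it assumes the kernel logs seccomp kills as audit records of type 1326 (AUDIT_SECCOMP) that mention qemu somewhere on the line. Whether such lines appear at all depends on the host's audit and logging setup.

```shell
# Keep only kernel log lines that look like seccomp audit events for QEMU.
# type=1326 is the kernel's AUDIT_SECCOMP record type.
# Assumption: the record mentions "qemu" somewhere (e.g. in comm= or exe=).
seccomp_qemu_events() {
    # `|| true` keeps the exit status zero when nothing matches.
    grep -F 'type=1326' | grep -F 'qemu' || true
}

# Example use on a real host (commented out; usually needs root):
#   sudo dmesg | seccomp_qemu_events
```

No output means no matching records were found in that log, which does not by itself rule out a seccomp problem.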
Is there a fix for this? I’m trying to follow this thread but I’m not sure where to go with it.
I’m setting up a pfSense LXD VM from an .iso, per the Netgate doc.
.4.0-148-generic
VERSION="20.04.1 LTS (Focal Fossa)"
lxd --version
5.13 --edge
Error: Failed to run: forklimits limit=memlock:unlimited:unlimited fd=3 -- /snap/lxd/24814/bin/qemu-system-x86_64 -S -name pfsense -uuid cfe6a9cf-52fa-42ba-a67e-e0a903b6d5fa -daemonize -cpu host -nographic -serial chardev:console -nodefaults -no-user-config -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=allow,resourcecontrol=deny -readconfig /var/snap/lxd/common/lxd/logs/pfsense/qemu.conf -spice unix=on,disable-ticketing=on,addr=/var/snap/lxd/common/lxd/logs/pfsense/qemu.spice -pidfile /var/snap/lxd/common/lxd/logs/pfsense/qemu.pid -D /var/snap/lxd/common/lxd/logs/pfsense/qemu.log -smbios type=2,manufacturer=Canonical Ltd.,product=LXD -runas lxd -boot menu=on -machine pc-q35-2.6 -device virtio-vga -vnc :2 -drive file=/home/ubuntu/pfSense-CE-2.6.0-RELEASE-amd64.iso,index=0,media=cdrom,if=ide: qemu-system-x86_64: -vnc :2: VNC support is disabled
: Process exited with non-zero value 1
Try `lxc info --show-log pfsense` for more info
N/A
thanks