I’m using LXD to create multiple virtual machines in parallel, or within a very short time of each other. To do this, I launch them with lxc launch, stop them with lxc stop -f, and then delete them with lxc delete -f. I repeat this process to create new VMs.
This works most of the time, but occasionally I get an error message that says:
I am now getting different errors at random during lxc launch ... (3 out of 20 machines):
Stderr: Error: Failed creating instance from image: Failed reading image info
"/var/snap/lxd/common/lxd/images/102c0fdafc87c8be84a604f1cf4fdc2414f90bb31b9301fae1bba4d8201095a8.rootfs":
Failed to run: prlimit --cpu=2 --as=1000000000 qemu-img info -f qcow2 --output=json /var/snap/lxd/common/lxd/images/102c0fdafc87c8be84a604f1cf4fdc2414f90bb31b9301fae1bba4d8201095a8.rootfs:
Process exited with non-zero value 1 (aa-exec: ERROR: profile 'lxd_qemu-img-var-snap-lxd-common-lxd-images-102c0fdafc87c8be84a604f1cf4fdc2414f90bb31b9301fae1bba4d8201095a8.rootfs' does not exist)
and
Error: Failed to begin transaction: context deadline exceeded
I think all these errors are due to the fact that I’m launching the VMs potentially in parallel (20 of them?); otherwise they seem to work.
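If parallel launches are indeed the trigger, one way to test that is to cap how many lxc launch commands run at once and retry the ones that fail. The following is only a rough sketch, not something taken from LXD itself: the helper names (lxc, launch_with_retry, launch_all), the limit of 4 parallel launches, the 3 attempts and the 5 second delay are all assumptions made up for illustration.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def lxc(*args):
    """Run one lxc command and return (exit code, stderr). Hypothetical helper."""
    proc = subprocess.run(["lxc", *args], capture_output=True, text=True)
    return proc.returncode, proc.stderr


def launch_with_retry(name, image, run=lxc, attempts=3, delay=5.0):
    """Launch one VM, retrying a few times if lxc exits with an error.

    The retry count and delay are guesses, not recommended values.
    """
    for attempt in range(1, attempts + 1):
        code, _err = run("launch", image, name, "--vm")
        if code == 0:
            return True
        if attempt < attempts:
            # Wait a little longer after each failed attempt.
            time.sleep(delay * attempt)
    return False


def launch_all(names, image, max_parallel=4, run=lxc, delay=5.0):
    """Launch every VM in names with at most max_parallel launches in flight."""
    names = list(names)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = pool.map(
            lambda n: launch_with_retry(n, image, run=run, delay=delay),
            names,
        )
        # Map each VM name to True (launched) or False (gave up).
        return dict(zip(names, results))
```

For example, launch_all([f"vm{i}" for i in range(20)], "ubuntu:22.04", max_parallel=4) would start 20 VMs, four at a time (the image alias here is just a placeholder). If the errors disappear at a low max_parallel and come back at a high one, that points at concurrency or host load rather than at the image itself.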
This error looks like LXD may be refreshing the base image to a different hash ID at the same time that the image is being used to create an instance from it. There should be a lock that prevents this, but perhaps there’s an edge case here. I’ve assigned this to myself, so I will try to take a look at reproducing it.
The problem seems to be that I had more VMs than my CPU can handle.
I was running 20 VMs with 4 vCPUs each on a host with 24 cores in total, and each VM was running at full CPU capacity.
With a more realistic load, such as 6 VMs, everything seems to work.
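To put rough numbers on that, here is a small sketch of the vCPU arithmetic. It assumes the 6 VM case also uses 4 vCPUs per VM, which is not stated above, and the function names are made up for illustration.

```python
def vcpu_oversubscription(vm_count, vcpus_per_vm, host_cores):
    """Ratio of vCPUs handed out to VMs versus cores available on the host."""
    return vm_count * vcpus_per_vm / host_cores


def max_vms_without_oversubscription(vcpus_per_vm, host_cores):
    """Largest VM count whose total vCPUs still fit within the host cores."""
    return host_cores // vcpus_per_vm


# 20 VMs x 4 vCPUs on 24 cores: 80 vCPUs competing for 24 cores.
print(f"{vcpu_oversubscription(20, 4, 24):.2f}")  # prints 3.33

# 6 VMs x 4 vCPUs on 24 cores (4 vCPUs per VM is an assumption here).
print(f"{vcpu_oversubscription(6, 4, 24):.2f}")   # prints 1.00

print(max_vms_without_oversubscription(4, 24))    # prints 6
```

With every VM running flat out, a ratio above 3 would leave little CPU time for LXD itself, which would fit with timeouts such as "context deadline exceeded", though that link is a guess rather than something confirmed in this thread.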