I am using Debian 11 as the host and have installed LXD via snap, using the btrfs storage type since that was the default when I initially ran lxd init.
When I try to run an ubuntu virtual machine, I get the following error:
$ lxc start ubuntu
Error: open /var/snap/lxd/common/lxd/virtual-machines/ubuntu/config/server.crt: disk quota exceeded
Try `lxc info --show-log ubuntu` for more info
When I run that command for more info, I get:
Name: ubuntu
Status: STOPPED
Type: virtual-machine
Architecture: x86_64
Created: REDACTED
Last Used: REDACTED
Error: open /var/snap/lxd/common/lxd/logs/ubuntu/qemu.log: no such file or directory
When I try to edit the disk size from 15GB to something larger, I get the following error:
$ lxc config edit ubuntu
Config parsing error: Failed to update device "root": Failed resizing disk image "/var/snap/lxd/common/lxd/storage-pools/default/virtual-machines/ubuntu/root.img" to size 20000006144: Failed to create sparse file /var/snap/lxd/common/lxd/storage-pools/default/virtual-machines/ubuntu/root.img: truncate /var/snap/lxd/common/lxd/storage-pools/default/virtual-machines/ubuntu/root.img: disk quota exceeded
Press enter to open the editor again or ctrl+c to abort change
Also note that just opening the config without making any changes gives me an error:
$ lxc config edit ubuntu
Config parsing error: Failed to write backup file: Failed to create file "/var/snap/lxd/common/lxd/virtual-machines/ubuntu/backup.yaml": open /var/snap/lxd/common/lxd/virtual-machines/ubuntu/backup.yaml: disk quota exceeded
Press enter to open the editor again or ctrl+c to abort change
I don't really understand how all this works, but after skimming the documentation again to try to find info on how to increase the disk size, I tried attaching a new storage volume with:
$ lxc storage create tst3 btrfs
$ lxc storage volume create tst3 tstblockvol1 --type=block
$ lxc storage volume attach tst3 tstblockvol1 ubuntu
Error: Failed to write backup file: Failed to create file "/var/snap/lxd/common/lxd/virtual-machines/ubuntu/backup.yaml": open /var/snap/lxd/common/lxd/virtual-machines/ubuntu/backup.yaml: disk quota exceeded
I'm very new to LXD and don't work in a tech field, so I'm pretty out of my depth; I just wanted something more flexible/powerful than VirtualBox, and I could not get Virtual Machine Manager working the way I wanted it to. So any advice/suggestions would be appreciated.
I don't understand any of that. I don't work in a technology field. I see that it says "please ensure that the instance root disk's size.state property is set to 2x the size of the root disk's size to allow all blocks in the disk image file to be rewritten without reaching the qgroup quota", but I don't know what to do with that information. There is no "size.state" entry in the config of that particular virtual machine, and I am not sure what to set it to or how to change it.
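(Side note for anyone else hitting this: size.state is a key on the instance's root disk device rather than a top-level config option, and it only appears once it has been set. Assuming the instance is still called ubuntu, the device can be inspected with:
lxc config device show ubuntu
If the root disk is inherited from the default profile it won't be listed there, in which case lxc config show ubuntu --expanded shows the merged view.)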
Also check the storage pool size, and check that you've not just run out of disk space in the pool.
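For example, assuming the pool is still called default, the pool's total and used space can be checked with:
lxc storage info default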
I have not. Or at least, when I run 'lxc storage edit default' and jack up the size of the disk, it does nothing to change the error message.
Or, even better, don't use BTRFS for VMs.
Apologies for using the default configuration. If it's not recommended to use the default configuration, I would suggest changing the default configuration.
In LXD a VM is made up of two volumes: a small state volume (that contains config and metadata about the VM), and the data volume (that contains the VM's root disk).
By default in LXD, BTRFS storage pools don't have per-filesystem volume size limits, so the state volume won't be size limited. The VM root volume itself will be an image file sized to the pool's volume.size setting, or, if that is not set, it will default to 10GiB.
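As a rough illustration (pool name default assumed, and note this only affects volumes created afterwards), that pool-wide default can be read and changed with:
lxc storage get default volume.size
lxc storage set default volume.size=20GiB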
If you grow the VM’s root disk size after creation using:
lxc config device set <instance> root size=20GiB
Then this will resize the root image file to the specified size, and enable BTRFS quotas with a quota set at the specified size + the state volume’s default size (which is 100MiB).
However because of the nature of BTRFS’ quota tracking (as explained in that 4th bullet point) it is possible for the BTRFS quota to be exceeded due to the way that BTRFS tracks the blocks that are changing inside the VM’s root image file. The absolute worst case scenario is that it could actually keep track of up to 2x the allowed quota when using a VM, so we encourage setting the state volume size to 2x the specified root volume size to be safe.
This can be done by doing:
lxc config device set <instance> root size.state=40GiB
This will cause the BTRFS quota to be set at 20GiB + 40GiB, to allow the quota to track block changes in the VM's root disk.
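To sanity-check the values afterwards (the instance name is a placeholder, and these only read back keys once they have been set on the instance's own root device, as above), something like:
lxc config device get <instance> root size
lxc config device get <instance> root size.state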
This is all rather complicated, and this is why we actually suggest not using BTRFS for VMs.
On your point regarding LXD suggesting BTRFS for the default: the default pool type is actually ZFS, but it's only suggested if your system has it available. The "next best" (although we've seen now that "best" is somewhat nuanced based on your workload and preferences) is suggested as BTRFS, because it is the next most flexible and efficient for containers (which LXD started out only supporting).
In both cases LXD will by default create BTRFS and ZFS pools on a fixed-size loop file. This is to allow users to get up and running quickly, without having to provide a dedicated disk or partition for their instances. This works great for testing or development; however, stacking filesystems on top of the host's filesystem is not as efficient, so for production workloads we recommend not using loop-backed pools, see:
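For example, if you have an empty partition you can dedicate to LXD (/dev/sdb1 below is purely hypothetical and will be formatted), a non-loop pool can be created with:
lxc storage create fastpool zfs source=/dev/sdb1
The same form works with btrfs as the driver.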
Thank you for the explanation. It still does not make a whole lot of sense, but I don't think this is the time or place to try to educate me on what a volume is. I think I got the gist of it though.
Please can I see the output of the following commands to better ascertain the state of your system:
It's difficult, since I would need to redact anything that could identify me and there are too many options that I don't recognize.
Regardless, I ran the command you suggested:
lxc config device set ubuntu root size.state=40GiB
which did not work, so I just tried upping the size to 60, and now at least I can edit the config without getting errors. The size parts now read:
size: 20GB
size.state: 80GiB
But when I attempt to start the VM, it starts but won't allow me to connect with lxc exec ubuntu bash; it just spits out Error: LXD VM agent isn't currently running. Googling for that error turns up only the LXD GitHub repo and no troubleshooting help.
Attempting to use the SPICE window thing just shows the LXD logo forever. I also can't stop the VM unless I use the -f flag. Jacking up the size to 80 did not help either.
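(One way to see what the VM itself is doing in this state, rather than relying on the agent, is to attach to its console; assuming the instance is still called ubuntu:
lxc console ubuntu
Detach with ctrl+a q. If a SPICE client such as remote-viewer is installed, lxc console ubuntu --type=vga gives a graphical console instead. A guest stuck at an initramfs or fsck prompt will show up there, as the next reply illustrates.)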
I'm a long time reader, first time poster. I got some valuable help from this article today, so I've decided to pay it forward.
I evidently ran into the same issue experienced by the OP today. Here are my findings.
Like the OP, I was attempting to use BTRFS in production (I also didn’t know ZFS was the default, so long as it was installed, or that BTRFS was a runner-up) and had also used this command to stop a VM a few times:
lxc stop -f <instancename>
What I didn’t realise was that running that command to effectively yank its virtual power cord, coupled with the fact that the file system in that VM had gone read-only (due to the aforementioned issue of using BTRFS), had caused the VM file system to become corrupt. In other words, it couldn’t start.
I had already applied this command:
lxc storage set default btrfs.mount_options=compress-force
Where ‘default’ is the name of my storage (hey, no judging).
I had also applied this command:
lxc config device set <instancename> root size.state=50GiB
Where 50GiB is twice the volume size of 25GiB.
(As an aside, this is the command needed to rescue the VM. You can apply both, but this one helps on a per-case basis, without the performance impact of forced compression across the entire BTRFS volume; we don't all have M.2 drives or SSDs.)
lxc config device set <instancename> root size.state=50GiB
(As a further aside, if you still want to push forward with BTRFS, I'd suggest starting fresh and applying the following command before deploying any VMs.)
lxc storage set default btrfs.mount_options=compress-force
I didn’t stick around (I’m sorry) to determine if the command retrospectively fixes existing virtual machines. Perhaps someone else can chime in on this? Also, containers don’t seem to have these issues. Just VMs.
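If you want to check whether a pool already has that option applied (pool name default assumed), it can be read back with:
lxc storage get default btrfs.mount_options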
Back on track…
Still, when I started the VM, it ‘started’ but never became available.
Where 'blankIPv4' is the undetected IPv4 address and 'blankIPv6' is the undetected IPv6 address.
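The listing I was looking at came from something along these lines (the instance name is a placeholder):
lxc list <instancename>
Since no addresses ever appeared, I tried to get a shell into the VM: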
lxc shell <instancename>
As per the OP, I got this message:
Error: LXD VM agent isn't currently running
The LXD VM agent isn't running. It can't run because the OS can't start. The OS can't start because the file system is corrupt. The file system is corrupt because it was shut down uncleanly. It was shut down uncleanly because the disk was read-only. The disk was read-only because the underlying disk quota had been exceeded by the metadata!
Without much more to lose, I tried to exit the prompt with:
(initramfs) exit
(I was expecting to reboot and return to the same prompt, though I was hoping it would boot!) I got the following output:
rootfs contains a file system with errors, check forced.
rootfs: Inodes that were part of a corrupted orphan linked list found.
rootfs: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
fsck exited with status code 4
The root filesystem on /dev/sda2 requires a manual fsck
BusyBox v1.30.1 (Ubuntu 1:1.30.1-7ubuntu3) built-in shell (ash)
Enter 'help' for a list of built-in commands.
I know this one all too well. I followed the instructions:
(initramfs) fsck /dev/sda2
I pressed the ‘y’ key a few times:
fsck from util-linux 2.37.2
e2fsck 1.46.5 (30-Dec-2021)
rootfs contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes
Inode 18643 was part of the orphaned inode list. FIXED.
Inode 18666 was part of the orphaned inode list. FIXED.
Inode 18675 was part of the orphaned inode list. FIXED.
Inode 18704 was part of the orphaned inode list. FIXED.
Inode 266152 extent tree (at level 1) could be narrower. Optimize<y>? yes
Pass 1E: Optimizing extent trees
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (1705636, counted=1128276). Fix<y>? yes
Inode bitmap differences: -18632 -18643 -18666 -18675 -18704 Fix<y>? yes
Free inodes count wrong for group #1 (2036, counted=2041). Fix<y>? yes
Free inodes count wrong (2749440, counted=2744590). Fix<y>? yes
rootfs: ***** FILE SYSTEM WAS MODIFIED *****
rootfs: 228434/2973024 files (0.5% non-contiguous), 4949379/6077655 blocks
(initramfs)
From here I was able to take a full backup of my data (I’m using virtualmin), and start again. I wanted a fresh start on ZFS, so after backing up all the data in my VMs and exporting them onto the host, I removed lxd:
sudo snap remove lxd
2022-11-03T12:50:11+11:00 INFO Waiting for "snap.lxd.daemon.service" to stop.
Save data of snap "lxd" in automatic snapshot set #3
lxd removed
This took a little while. Then I listed the saved snapshot (I didn’t want a rerun of what I had just experienced when I reinstalled LXD).
sudo snap saved
Set  Snap  Age    Version      Rev    Size    Notes
3    lxd   27.1m  5.7-c62733b  23889  13.9GB  auto
I noted the number of the snapshot (3 in my case) and issued the following command:
sudo snap forget 3
Snapshot #3 forgotten.
I checked that it was really gone:
sudo snap saved
No snapshots found.
I'm using Pop!_OS, so I needed to install ZFS and reboot.
sudo apt install zfsutils-linux zfs-dkms
The zfs-dkms package is especially important. Many articles on the web just specify zfsutils-linux when you Google: how to install zfs on Pop!_OS.
You will receive a notice with regards to the license model of the kernel vs ZFS. Be mindful and thoughtful and then continue.
Once the installation was complete, I rebooted and installed LXD again.
sudo snap install lxd --channel=latest/stable
lxd 5.7-c62733b from Canonical✓ installed
I rebooted once more and then ran the lxd init utility.
lxd init
ZFS was available! Yay!
Would you like to use LXD clustering? (yes/no) [default=no]: no
Do you want to configure a new storage pool? (yes/no) [default=yes]: yes
Name of the new storage pool [default=default]: default
Name of the storage backend to use (ceph, cephobject, dir, lvm, zfs, btrfs) [default=zfs]: zfs
Create a new ZFS pool? (yes/no) [default=yes]: yes
Would you like to use an existing empty block device (e.g. a disk or partition)? (yes/no) [default=no]: no
Size in GiB of the new loop device (1GiB minimum) [default=30GiB]: 450GiB
Would you like to connect to a MAAS server? (yes/no) [default=no]: no
Would you like to create a new local network bridge? (yes/no) [default=yes]: yes
What should the new bridge be called? [default=lxdbr0]: lxdbr0
What IPv4 address should be used? (CIDR subnet notation, "auto" or "none") [default=auto]: 1.2.3.4/24
Would you like LXD to NAT IPv4 traffic on your bridge? [default=yes]: yes
What IPv6 address should be used? (CIDR subnet notation, "auto" or "none") [default=auto]: none
Would you like the LXD server to be available over the network? (yes/no) [default=no]: no
Would you like stale cached images to be updated automatically? (yes/no) [default=yes]: yes
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: yes
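For reference, the preseed it printed looks roughly like the following (trimmed, and the values are illustrative; the subnet, pool size and so on will match whatever was answered above):
config: {}
networks:
- config:
    ipv4.address: 1.2.3.4/24
    ipv6.address: none
  description: ""
  name: lxdbr0
  type: bridge
storage_pools:
- config:
    size: 450GiB
  description: ""
  name: default
  driver: zfs
profiles:
- config: {}
  description: ""
  devices:
    eth0:
      name: eth0
      network: lxdbr0
      type: nic
    root:
      path: /
      pool: default
      type: disk
  name: default
cluster: null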