We found out that our images are 16 GB each. We manage 17 containers, and most of them use less than 4 GB of disk. So far we have only limited disk usage on a few containers, but it looks like all the image files in /var/lib/lxd/images are 16 GB. We use a script to automate container backups; it deletes the images from /var/lib/lxd/images once the backup is done and exported to a remote machine.
Briefly, this is what we do daily:
lxc snapshot "${vm}" "${vm}_backup"
lxc publish "${vm}/${vm}_backup" --alias "${vm}_backup"
…
lxc image delete "${vm}_backup"
lxc delete "${vm}/${vm}_backup"
The problem is that we now have daily 16 GB files in /var/lib/lxd/images. Could we delete all of those files except the last one? What is the proper way to manage these huge files?
What size is the image when you run “lxc image list”? Does it compare to the real file size in /var/lib/lxd/images?
I believe the image files are tar.gz files. As a test to see why the files are so large, run this command: cat /var/lib/lxd/images/<image_file> | tar tvzf -
“lxc image list” shows every image at less than 500 MB, which is what we expect for LXD images. On the other hand, in /var/lib/lxd/images we have daily 16 GB files with no extension, neither *.tar.gz nor *.zfs. We have no tar.gz files there because we delete them once the backup is on the remote server. Here is part of our container listing showing the files:
What do you see if you try to list the contents of the first file in your list (0869ae…) using the command “cat /var/lib/lxd/images/0869ae… | tar tvzf -”?
I use BTRFS and not ZFS, so I can’t do any tests on my system. However, I am curious: have you limited the disk space per container to 16 GB via ZFS? Perhaps that is why you are getting the 16 GB size. Maybe those are ZFS image files and not necessarily “tar.gz” files?
/var/lib/lxd/images/ has the images of containers. These are typically downloaded and cached from either the ubuntu: or images: repository. As far as I understand, you do not need to take a backup of these files.
These are cached image files for containers.
lxc image list
will show something like
$ lxc image list
+-------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
| ALIAS | FINGERPRINT | PUBLIC | DESCRIPTION | ARCH | SIZE | UPLOAD DATE |
+-------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
| | 03c2fa6716b5 | no | ubuntu 16.04 LTS amd64 (release) (20170919) | x86_64 | 156.13MB | Sep 29, 2017 at 1:41pm (UTC) |
+-------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
| | 61d54418874f | no | ubuntu 16.04 LTS amd64 (release) (20171011) | x86_64 | 156.15MB | Oct 21, 2017 at 7:01pm (UTC) |
+-------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
Here it shows that there are two versions of cached Ubuntu 16.04 images for amd64.
Let’s remove them to save some space (in the ZFS pool):
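A minimal sketch of that removal, assuming the two fingerprints from the listing above. `delete_images` and `DRY_RUN` are hypothetical helper names used here for illustration, not part of LXD; only `lxc image delete` is the real command.

```shell
# Delete cached images by fingerprint.
# With DRY_RUN set, the commands are printed instead of executed.
delete_images() {
    for fp in "$@"; do
        ${DRY_RUN:+echo} lxc image delete "$fp"
    done
}

# Preview first: prints the two delete commands without running them.
DRY_RUN=1 delete_images 03c2fa6716b5 61d54418874f
```

Calling `delete_images 03c2fa6716b5 61d54418874f` without `DRY_RUN` would run the deletions for real.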
That’s it. The next time that you need to launch a new container, LXD will download and cache the container image. There is an option in LXD not to cache container images, which means that it will download it every time.
This is helping us find out what’s going on. Although the 16 GB files are not named *.tar.gz, I can see their content with the command “cat /var/lib/lxd/images/8be9b051cf77f578… | tar tvzf - | less”. Comparing them, they appear to have the same content and modification dates. I guess these files are generated by our backup script for the 17 containers, because they are created daily at the time our cron job runs. The question now is why they started to be created this month and not before; we have been using this infrastructure for months. Can we safely remove those 16 GB files? Is there a way to prevent them from being generated?
If you extract the image into a temp directory (cd /usr/local/tmp && cat /var/lib/lxd/images/8be9b051cf77f578…| tar xvzf -) and then run a “du” against that directory, do you see 16GB of files in use? If so, it seems you may have a temp/large file consuming lots of disk space. That would definitely be the root cause.
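If there is not enough free space to extract a 16 GB image, the listing alone can point at the culprit. A rough sketch, assuming GNU tar, where the size is the third column of the `tar tv` output (other tar implementations lay out the columns differently). `largest_entries` is a hypothetical helper name.

```shell
# Show the largest entries inside a gzip-compressed image tarball
# without extracting it. Assumes GNU tar's verbose listing format:
#   permissions owner/group size date time name
largest_entries() {
    # $1: path to the tarball; $2: number of entries to show (default 20)
    tar tvzf "$1" | sort -k3,3nr | head -n "${2:-20}"
}

# Usage (the file name is a placeholder, as in the earlier posts):
#   largest_entries /var/lib/lxd/images/<image_file>
```

A single multi-gigabyte entry near the top (a log, a database dump, a temp file) would explain the image size.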
Indeed. I got used to reading first what I have done and then what I got.
An official container image is about 160MB and if you apt update and install some services, the published image goes to around 360MB. These 16GB container images probably have your data (like database). It is important to verify what makes this 16GB size as @rkelleyrtp describes.
Personally, I would edit the cron script to add the date on the image alias, something like
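One way to do that, as a minimal sketch: `backup_alias` is a hypothetical helper and `web01` is a made-up container name, neither comes from the original script.

```shell
# Build a dated image alias, e.g. "web01_backup_2018-03-14".
backup_alias() {
    # $1: container name; $2: optional date (defaults to today, YYYY-MM-DD)
    printf '%s_backup_%s\n' "$1" "${2:-$(date +%Y-%m-%d)}"
}

vm="web01"   # hypothetical container name
alias_name="$(backup_alias "$vm")"

# The publish and delete steps of the backup script would then use it:
#   lxc publish "${vm}/${vm}_backup" --alias "${alias_name}"
#   ...
#   lxc image delete "${alias_name}"
echo "${alias_name}"
```

With a dated alias, each night's image has a distinct name, so a failed delete leaves an image that is easy to identify in “lxc image list”.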
As I read the original post again, it seems the OP’s problem is that the nightly images are not getting deleted by the cron job. Your idea of adding the date/time is spot-on. I wonder if the cron job is just picking an arbitrary image to delete as opposed to the expected one.
We found out that those 16 GB images correspond to only one of our containers. They are generated by our backup script, which had been working well for months and deleting the backup images correctly. The problem is that this image doesn’t appear in “lxc image list”, so we cannot remove it through “lxc image delete”. Also, each .zfs has a related file with the same fingerprint, which is not the case for any of the 16 GB files that we have in /var/lib/lxd/images. Can we delete those files in some way?
After a lot of diagnosis we found that the container from which the image is generated is so big that we run out of space. The command “lxc publish "${vm}/${vm}_backup" --alias "${vm}_backup"” then gets stuck and generates an orphaned image that is not listed by “lxc image list”, so we cannot delete it. Could you please enlighten us on how to delete those annoying images without compromising LXD stability? If they are orphaned, could we just use rm?
I assume that if LXD does not show those images with lxc image list, then LXD does not know about them and they can safely be removed. But let’s get a second opinion as well.
You may want to check that they’re also not listed in “lxc storage volume list POOL”.
If they’re not listed there either, then yes, removing the files is fine, it’s likely just some confusion which happened when running out of disk space or something.
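To see which files are candidates, something like the following sketch could help. It only prints file names and deletes nothing. `find_orphans` is a hypothetical helper, and it assumes you have already saved the full fingerprints LXD knows about into a text file, one per line (for example by extracting the fingerprint fields from “lxc image list --format json”, if your LXD version supports it). It does not cover the “lxc storage volume list POOL” check, which still needs to be done separately.

```shell
# Print files in an image directory whose fingerprint is NOT in a list
# of fingerprints known to LXD. Nothing is deleted.
find_orphans() {
    # $1: image directory; $2: file with one full fingerprint per line
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        name="$(basename "$f")"
        fp="${name%%.*}"    # drop a suffix such as .rootfs, if present
        grep -qxF -e "$fp" "$2" || printf '%s\n' "$f"
    done
}

# Usage (known.txt is a hypothetical file name):
#   find_orphans /var/lib/lxd/images known.txt
```

Review the printed list by hand before removing anything with rm.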