Multiple large files in /var/lib/lxd/images

berzas · October 25, 2017, 12:13pm

Hi,

We found out that our images are 16 GB in size. We manage 17 containers, and most of them use less than 4 GB of disk. So far we have limited disk usage on only a few containers, but it looks like all the images in /var/lib/lxd/images are 16 GB. We use a script to automate container backups; it removes the images from /var/lib/lxd/images once the backup is done and exported to a remote machine.

this is briefly what we do daily:
lxc snapshot "${vm}" "${vm}_backup"
lxc publish "${vm}/${vm}_backup" --alias "${vm}_backup"
…
lxc image delete "${vm}_backup"
lxc delete "${vm}/${vm}_backup"

The problem is that we now get a new 16 GB file in /var/lib/lxd/images every day. Can we delete all of those files except the latest one? What is the proper way to manage these huge files?

Thanks in advance for your support.

rkelleyrtp · October 25, 2017, 12:29pm

What filesystem are you using to store your images? ZFS, BTRFS, etc.

-Ron

berzas · October 25, 2017, 12:32pm

We are using ZFS. None of those huge files are *.zfs files.

rkelleyrtp · October 25, 2017, 12:46pm

What size is the image when you run “lxc image list”? Does it compare to the real file size in /var/lib/lxd/images?

I believe the image files are tar.gz files. As a test to see why the files are so large, run this command: cat /var/lib/lxd/images/<image_file> | tar tvzf -

Maybe you can see what is taking all the space.
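
If the listing is too long to read by eye, something like the sketch below may help narrow it down. It assumes GNU tar, whose verbose listing puts the size in bytes in the third column; the function name largest_entries is only an illustration, not part of LXD.

```shell
# Print the N largest entries of a gzip-compressed tarball, biggest first.
# Assumes GNU tar: column 3 of "tar -tv" output is the size in bytes.
# Usage: largest_entries <tarball> [count]
largest_entries() {
    tarball=$1
    count=${2:-20}
    # Sort numerically on the size column, largest first, keep the top N.
    tar -tvzf "$tarball" | sort -k3,3nr | head -n "$count"
}
```

For example: largest_entries /var/lib/lxd/images/<image_file> 20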

-Ron

berzas · October 25, 2017, 1:20pm

“lxc image list” shows every image at less than 500 MB, which is what we expect for LXD images. On the other hand, in /var/lib/lxd/images we have daily 16 GB files with no extension, neither *.tar.gz nor *.zfs. We have no tar.gz files there because we delete them once the backup is on the remote server. Here is a screenshot of part of the directory showing the files.

Also, “zfs list” shows 10.6 GB for every container except the ones whose disk usage we limited.

rkelleyrtp · October 25, 2017, 1:31pm

What do you see if you try to list the contents of the first file in your list (0869ae…) using the command “cat /var/lib/lxd/images/0869ae… | tar tvzf -”?

I use BTRFS and not ZFS, so I can’t do any tests on my system. However, I am curious: have you limited the disk space per container to 16GB via ZFS? Perhaps that is why you are getting the 16GB size. Maybe those are ZFS image files and not necessarily “tar.gz” files?

-Ron

simos · October 25, 2017, 1:31pm

/var/lib/lxd/images/ holds the images of containers. These are typically downloaded and cached from either the ubuntu: or images: repository. As far as I understand, you do not need to take a backup of these files.

lxc image list

will show something like

$ lxc image list
+-------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
| ALIAS | FINGERPRINT  | PUBLIC |                 DESCRIPTION                 |  ARCH  |   SIZE   |         UPLOAD DATE          |
+-------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
|       | 03c2fa6716b5 | no     | ubuntu 16.04 LTS amd64 (release) (20170919) | x86_64 | 156.13MB | Sep 29, 2017 at 1:41pm (UTC) |
+-------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
|       | 61d54418874f | no     | ubuntu 16.04 LTS amd64 (release) (20171011) | x86_64 | 156.15MB | Oct 21, 2017 at 7:01pm (UTC) |
+-------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+

Here it shows that there are two versions of cached Ubuntu 16.04 images for amd64.
Let’s remove them to save some space (in the ZFS pool):

lxc image delete 61d54418874f
lxc image delete 03c2fa6716b5

That’s it. The next time that you need to launch a new container, LXD will download and cache the container image. There is an option in LXD not to cache container images, which means that it will download it every time.

simos · October 25, 2017, 1:36pm

In your screenshot, one of the images has a filename that starts with 0dfcf962.
Let’s try to figure out which image that is.

lxc image list ubuntu: | grep 0dfcf962
lxc image list images: | grep 0dfcf962

Strangely, they are not known official images for LXD. Are these custom images?

rkelleyrtp · October 25, 2017, 1:38pm

From the OP’s first message, those images are created locally, not fetched from a remote image server…

berzas · October 25, 2017, 2:04pm

Thanks for pointing that out. Before starting this thread we removed 3 images that we were no longer using.

berzas · October 25, 2017, 2:12pm

This is helping us find out what’s going on. Although the 16 GB files are not named *.tar.gz, I can see their contents with the command “cat /var/lib/lxd/images/8be9b051cf77f578…| tar tvzf - | less”. Comparing them, they appear to have the same content and modification dates. I guess these files are generated by our backup script for the 17 containers, because they appear daily at the time our cron job runs. The question now is why they started being created this month and not before; we have been using this infrastructure for months. Can we safely remove those 16 GB files? Is there a way to prevent them from being generated?

rkelleyrtp · October 25, 2017, 2:26pm

If you extract the image into a temp directory (cd /usr/local/tmp && cat /var/lib/lxd/images/8be9b051cf77f578…| tar xvzf -) and then run a “du” against that directory, do you see 16GB of files in use? If so, it seems you may have a temp/large file consuming lots of disk space. That would definitely be the root cause.
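
If there is not enough free space to extract 16GB, a rough total can be had from the listing alone. This is only a sketch under the assumption of GNU tar (size in bytes in the third column of the verbose listing); the function name tarball_total_bytes is made up for illustration.

```shell
# Sum the sizes of all entries in a gzip-compressed tarball without
# extracting it. Assumes GNU tar: column 3 of "tar -tv" is bytes.
# Usage: tarball_total_bytes <tarball>
tarball_total_bytes() {
    # %.0f avoids integer overflow in awk implementations with 32-bit %d.
    tar -tvzf "$1" | awk '{ total += $3 } END { printf "%.0f\n", total }'
}
```

If the total is far below 16GB, the space is probably not going to file contents inside the archive.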

-Ron

simos · October 25, 2017, 3:13pm

Indeed. I got used to reading the “what I have done” first and then the “what I got”.

An official container image is about 160MB, and if you apt update and install some services, the published image goes to around 360MB. These 16GB container images probably contain your data (like a database). It is important to verify what makes up the 16GB, as @rkelleyrtp describes.

Personally, I would edit the cron script to add the date on the image alias, something like

lxc snapshot web web-20171025
lxc publish web/web-20171025 --alias web-20171025
lxc image delete web-20171024
lxc delete web/web-20171024

Here is a demo script,

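# Dry run: every command is only echoed. Remove "echo" to execute it.
export VM=web
export CURRDATE=$(date +%Y%m%d)
export PREVDATE=$(date --date="-1 day" +%Y%m%d)

# Snapshot the container and publish the snapshot as a dated image.
# The snapshot name is a positional argument; lxc snapshot has no --alias.
echo lxc snapshot "${VM}" "${VM}-${CURRDATE}"
echo lxc publish "${VM}/${VM}-${CURRDATE}" --alias "${VM}-${CURRDATE}"

# Remove yesterday's image and snapshot.
echo lxc image delete "${VM}-${PREVDATE}"
echo lxc delete "${VM}/${VM}-${PREVDATE}"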

rkelleyrtp · October 25, 2017, 3:22pm

As I read the original post again, it seems the OP’s problem is that the nightly images are not getting deleted by the cron job. Your idea of adding the date/time is spot-on. I wonder if the cron job is just picking an arbitrary image to delete as opposed to the expected one.

stgraber · October 25, 2017, 5:13pm

You can delete old images; containers that were created from them will be unaffected.

Do not modify anything under /var/lib/lxd by hand though; you’re likely to break LXD if you do.

Instead, you should be using “lxc image list” and “lxc image delete” to remove those you don’t need anymore.

berzas · October 25, 2017, 6:35pm

We found out that those 16 GB images correspond to only one of our containers. They are generated by our backup script, which had been working well for months and deleting the backup images correctly. The problem is that these images do not appear in “lxc image list”, so we cannot remove them through “lxc image delete”. Each .zfs file has a related file with the same fingerprint, but that is not the case for any of the 16 GB files in /var/lib/lxd/images. Can we delete those files in some way?

berzas · October 25, 2017, 8:28pm

After a lot of diagnosis we found that the container from which the image is generated is so big that we run out of space. The command “lxc publish "${vm}/${vm}_backup" --alias "${vm}_backup"” then gets stuck and generates an orphaned image that is not listed by “lxc image list”, so we cannot delete it. Could you please enlighten us on how to delete those annoying images without compromising LXD stability? If they are orphaned, could we just use rm?
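
Since the root cause was running out of space, a guard before the publish step might keep this from happening again. The sketch below is an assumption-laden illustration: the function name has_free_kb and the threshold are hypothetical, and it assumes the image is built on the filesystem that holds /var/lib/lxd/images.

```shell
# Succeed only if the filesystem holding <path> has at least <min_kb>
# kilobytes available. The threshold is a value you choose yourself.
# Usage: has_free_kb <path> <min_kb>
has_free_kb() {
    # df -P prints one POSIX-format line per filesystem; with -k, field 4
    # on the second line is the available space in 1024-byte blocks.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$2" ]
}
```

In the backup script this could wrap the publish call, for example: has_free_kb /var/lib/lxd/images 20000000 && lxc publish "${vm}/${vm}_backup" --alias "${vm}_backup"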

simos · October 25, 2017, 9:00pm

I assume that if LXD does not show those images with lxc image list, then LXD does not know about them and they can safely be removed. But let’s get a second opinion as well.

stgraber · October 28, 2017, 8:56am

You may want to check that they’re also not listed in “lxc storage volume list POOL”.

If they’re not listed there either, then yes, removing the files is fine; it’s likely just some confusion that happened when running out of disk space or something.
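
To see which files fall into that category, a comparison along these lines may help. It is a sketch only: the function name list_unknown_images is hypothetical, it reads known fingerprints (full or shortened) from standard input, and it prints names without deleting anything.

```shell
# Print the files in <dir> whose names start with none of the known
# fingerprints. Fingerprints are read from stdin, one per line; a
# shortened fingerprint matches as a prefix. Nothing is deleted.
# Usage: list_unknown_images <dir> < fingerprints.txt
list_unknown_images() {
    dir=$1
    known=$(cat)
    for path in "$dir"/*; do
        [ -f "$path" ] || continue
        name=$(basename "$path")
        matched=no
        for fp in $known; do
            case "$name" in
                "$fp"*) matched=yes; break ;;
            esac
        done
        [ "$matched" = yes ] || printf '%s\n' "$name"
    done
}
```

Assuming your LXD version supports CSV output and the fingerprint column, the input could come from: lxc image list --format csv -c f | list_unknown_images /var/lib/lxd/images. If it does not, copy the fingerprints from the “lxc image list” table into a file instead. Check the result against “lxc storage volume list POOL” as well before removing anything.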

berzas · October 28, 2017, 11:30am

Thanks for the support. We finally deleted the files with rm.