Out of memory issues in /var


#1

I’m running (mainly) Debian containers on snapd, on Debian stretch (stable, also known as Debian 9).

I ran the command
$ apt update
and I was told that there was insufficient memory to complete the task (there were a few lines before that).

With some assistance from a friend I’ve been trying to find where the space hog is:

root@debianserver:/# df -k
Filesystem 1K-blocks Used Available Use% Mounted on
udev 74254448 0 74254448 0% /dev
tmpfs 14853476 255724 14597752 2% /run
/dev/sda2 18011420 13298284 3775152 78% /
/dev/sda5 20027216 5944576 13042256 32% /usr
tmpfs 74267372 24696 74242676 1% /dev/shm
tmpfs 5120 0 5120 0% /run/lock
tmpfs 74267372 0 74267372 0% /sys/fs/cgroup
/dev/sda7 3023760 5044 2865116 1% /tmp
/dev/sda6 15053152 41868 14226900 1% /usr/local
/dev/sda4 10013512 9353460 131668 99% /var
/dev/sda8 5759082496 17018792 5740155304 1% /home
/dev/sda9 510984 144 510840 1% /boot/efi
tmpfs 14853472 8 14853464 1% /run/user/1000
tmpfs 14853472 4 14853468 1% /run/user/114
/dev/loop3 209715200 4530288 203402800 3% /media/darald/lxd2
/dev/loop12 48256 48256 0 100% /snap/lxd/5785
/dev/loop14 48256 48256 0 100% /snap/lxd/5841
/dev/loop13 48256 48256 0 100% /snap/lxd/5866
/dev/loop19 83840 83840 0 100% /snap/core/4189
/dev/loop18 83840 83840 0 100% /snap/core/4194
/dev/loop16 83840 83840 0 100% /snap/core/4201

So 13 GiB and 78% full in / is more than I had expected but 9.4 GiB and 99% full in /var is quite unexpected.
/dev/sda5 is swap (1.0 GiB, with over 60 GB of ram I didn’t think I would need a ‘standard’ sized swap).

root@debianserver:/var# du -d 1 -BM /var
12M /var/tmp
1M /var/www
7M /var/mail
13M /var/backups
1M /var/opt
6205M /var/snap
1M /var/local
131M /var/log
1M /var/lost+found
656M /var/cache
1409M /var/lib
4M /var/spool
8434M /var

Yet the properties reported for /var/snap are:
39 files
76.3 KiB (file size)
208.0 KiB (size on disk)

/var/snap/lxd
32 files
76.3 KiB
184.0 KiB
and
/var/snap/core
6 files
4 bytes
20.0 KiB

There’s some 6.20 GiB not accounted for!

Not such a large discrepancy but:

root@debianserver:/var# du -d 1 -BM /var/cache
1M /var/cache/cracklib
1M /var/cache/lightdm
308M /var/cache/apt
3M /var/cache/man
1M /var/cache/system-tools-backends
330M /var/cache/lxc
1M /var/cache/bind
8M /var/cache/debconf
1M /var/cache/dbconfig-common
1M /var/cache/snapd
1M /var/cache/libvirt
5M /var/cache/locate
1M /var/cache/dictionaries-common
1M /var/cache/cups
1M /var/cache/ldconfig
1M /var/cache/localepurge
1M /var/cache/PackageKit
1M /var/cache/tcpdf
1M /var/cache/awstats
1M /var/cache/samba
1M /var/cache/postgresql
2M /var/cache/fontconfig
1M /var/cache/lxd
1M /var/cache/apparmor
1M /var/cache/apache2
1M /var/cache/apt-cacher-ng
656M /var/cache

So it looks like snapd + lxd is somehow hogging a lot of space which I can’t really account for.
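
One way to chase a discrepancy like the one above is to list the largest files under the directory as root, since a tool running without root access may simply be unable to look inside. The following is only a sketch, assuming GNU find and sort; the function name largest_files is made up for illustration, and nothing in it is specific to snapd or lxd.

```shell
#!/bin/sh
# largest_files DIR [COUNT]   (hypothetical helper, not part of any package)
# Prints the COUNT largest regular files under DIR as
#   <allocated KiB> <apparent size in bytes> <path>
# sorted by allocated size, with ties broken by apparent size.
largest_files() {
    lf_dir=$1
    lf_count=${2:-10}
    # -xdev keeps find on DIR's own filesystem, so other mounts are skipped.
    # %k = allocated space in 1 KiB blocks, %s = apparent size in bytes.
    find "$lf_dir" -xdev -type f -printf '%k %s %p\n' 2>/dev/null |
        sort -k1,1nr -k2,2nr |
        head -n "$lf_count"
}

# Example (run as root so that every subdirectory is readable):
#   largest_files /var/snap 10
```

If the first column is far smaller than the second for some file, that file is sparse; if a single file accounts for most of the total, that is the one to look at.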

If this is really a ‘snapd’ issue I will be happy to file this over there.

Please advise.

TIA


(Stéphane Graber) #2

That’s certainly a weird behavior… It doesn’t really make sense that du thinks the directory is using 6GB but when listing its content it goes down to a few kBs…

Could it be du not accounting for some hidden directory or something in this case?

Another thought would be deleted files, but those shouldn’t show up in du's output at all…
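
To check the deleted-files theory without rebooting, the open file descriptors of running processes can be inspected. This is a rough sketch that assumes a Linux /proc filesystem, where the kernel appends " (deleted)" to the link target of an unlinked file; the function name deleted_open_files is invented here. If lsof is installed, `lsof +L1` reports similar information.

```shell
#!/bin/sh
# deleted_open_files   (hypothetical helper)
# Prints "<fd path> -> <target>" for every file descriptor that still
# points at a file which has been unlinked. Run as root to see the
# descriptors of all processes rather than just your own.
deleted_open_files() {
    for dof_fd in /proc/[0-9]*/fd/*; do
        # Descriptors can vanish while we iterate; skip those quietly.
        dof_target=$(readlink "$dof_fd" 2>/dev/null) || continue
        case "$dof_target" in
            /memfd:*) ;;    # in-memory files, they use no disk space
            /*' (deleted)') printf '%s -> %s\n' "$dof_fd" "$dof_target" ;;
        esac
    done
}

# Example:
#   deleted_open_files | grep ' /var/'
```

Space held by such files shows up in df but not in du, and it is released once the owning process closes the file or exits.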


#3

I’m really not sure what to do to loosen this knot!
Using the cli tools I can’t see what du says is there.

It’s /var/snap where the bulk of things are hiding. I’m guessing that /var/lib at just under 1.5 GB isn’t unusual, but snap taking 6.2 GB doesn’t seem real.

All that’s on snap is ‘core’ and 10 containers, only 1 of which I have been trying to load software onto. My guess is that at most a few hundred MB would be needed if things were the way they were in the last installation.

Any suggestions for rocks to turn over in looking for a solution?


(Stéphane Graber) #4

If you don’t mind rebooting the system, I’d recommend doing that. If the issue is files that are open but deleted, then those would go away and reduce your disk usage post-reboot. I’m still confused as to why the directory usage would show those though…


#5

I was able to effect a fix for the issue, but in the process I found a documentation lacuna. I didn’t try rebooting, and from the fix I have a request for at least documentation changes and/or package changes.
Solution first.

(At the suggestion of a mentor, an old-time Unix geek!)
I moved /var/snap to a directory with lots of space and then set up a symlink from /var/snap to the new location. The process was as follows (for anyone else needing the solution).
First make sure that ALL of your containers have been stopped.
Copy /var/snap to another directory that has lots (LOTS!!!) of space in it.
Then remove the original with $ rm -r /var/snap .
Lastly, create a symlink from /var/snap to /mynewdirectory/snap - - - I used $ ln -s /mynewdirectory/snap /var/snap .
I then rebooted using shutdown -r now to clear any caches and to set things (as advised by my mentor).
After the reboot completes, start all your containers.

Should be working (at least in my case everything is working - - - grin!).
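
For anyone who prefers a script, the copy, remove and symlink steps above could be sketched roughly as follows. This is an illustration only: the function name relocate_dir is made up, and the `-a` and `--sparse=always` options (GNU cp) are an assumption added here to keep ownership, permissions and sparse files intact, not something taken from the steps as posted. Stopping the containers beforehand and the reboot afterwards are left to the reader.

```shell
#!/bin/sh
# relocate_dir SRC DEST   (hypothetical helper)
# Copies the directory SRC to DEST, removes SRC, and leaves a symlink
# at SRC that points to DEST. DEST must be an absolute path that does
# not exist yet.
relocate_dir() {
    rd_src=$1
    rd_dest=$2
    case "$rd_dest" in
        /*) ;;
        *) echo "destination must be an absolute path: $rd_dest" >&2; return 1 ;;
    esac
    if [ ! -d "$rd_src" ] || [ -L "$rd_src" ]; then
        echo "not a real directory: $rd_src" >&2
        return 1
    fi
    if [ -e "$rd_dest" ]; then
        echo "destination already exists: $rd_dest" >&2
        return 1
    fi
    # Assumption: -a preserves owners, modes and links; --sparse=always
    # keeps sparse files from being expanded to their full size.
    cp -a --sparse=always "$rd_src" "$rd_dest" || return 1
    # The original is only removed once the copy has succeeded.
    rm -r "$rd_src" || return 1
    ln -s "$rd_dest" "$rd_src"
}

# Example, as root, with all containers stopped first
# (for instance with: lxc stop --all):
#   relocate_dir /var/snap /mynewdirectory/snap
```

The function refuses to run if the source is already a symlink or the destination exists, so running it twice by mistake should do no harm.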

This issue is caused, at least in part, by the assumption that the system is installed without separate partitions. There is no mention of where the ‘snaps’ will be stored, nor of the space that tools layered on snapd (in this case lxd) may require. The documentation should therefore either warn the user that this level of space may be needed, or clearly state that lxd should only be used on a system installed without separate partitions.
As lxd sits on top of snapd, which is, as stated elsewhere, the installation method preferred by the dev team, some changes would assist those who don’t necessarily do things the ‘ubuntu’ way.
It would be useful to have the ‘claimed’ space of sparse files (not just the ‘used’ space, which is vastly smaller) visible and reflected in file manager tool responses. Two figures may be needed, one for ‘used’ space and another for ‘claimed’ space - - this is likely far more problematic to implement.
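
The difference between ‘claimed’ and ‘used’ space can already be seen from the command line. The sketch below assumes GNU stat and truncate; the function name size_report is invented for illustration. It creates a sparse file that claims 100 MiB and prints both figures for it.

```shell
#!/bin/sh
# size_report FILE   (hypothetical helper)
# Prints "<claimed bytes> <used bytes> <file>", where "claimed" is the
# apparent file size and "used" is the space actually allocated on disk.
size_report() {
    sr_claimed=$(stat -c %s "$1") || return 1   # apparent size in bytes
    sr_blocks=$(stat -c %b "$1") || return 1    # number of allocated blocks
    sr_blksize=$(stat -c %B "$1") || return 1   # bytes per block counted by %b
    echo "$sr_claimed $((sr_blocks * sr_blksize)) $1"
}

demo=$(mktemp)
truncate -s 100M "$demo"   # claims 100 MiB without writing any data
size_report "$demo"        # first figure is 104857600; the second is usually far smaller
rm -f "$demo"
```

The same two views are available for whole directories with `du --apparent-size` (claimed) and plain `du` (used).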

I wish to thank @stgraber for his patience with my many questions and occasional tribulations. I do find myself wishing for something else like snapd that lxd could also use to effect an install. Yes, that would increase the workload for the dev team, but it would also likely force a more robust solution in snapd, which I think would help more than just myself.