Non loop-backed btrfs storage

Hi all, I installed LXD using snap and created a default storage pool using:

lxc storage create default btrfs source=/dev/sda3
lxc profile device add default root disk path=/ pool=default

In this case, /dev/sda3 is an empty partition (no filesystem). When I look at df -k, I see

[root@c3 ~]# df -k | grep loop
/dev/loop0         91264   91264         0 100% /var/lib/snapd/snap/core/8039
/dev/loop1         56192   56192         0 100% /var/lib/snapd/snap/lxd/12317

and when I do iotop -d 10,

Total DISK READ :      45.57 M/s | Total DISK WRITE :     153.92 K/s
Actual DISK READ:      75.70 M/s | Actual DISK WRITE:     190.51 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
 2497 be/0 root        9.83 M/s    0.00 B/s  0.00 %  0.13 % [loop1]

I’m a bit confused about whether this means the storage is loop-backed or not. Can anyone help me understand what I’m seeing? Thanks.

You won’t see it on the host as it’s only mounted inside the snap’s environment.

nsenter -t $(pgrep daemon.start) -m df should show it though.

These are squashfs filesystems for snap: they show no available space because they are read-only (they contain only programs; no data files can be stored in the snaps). You get two because you have the core snap (the base apps for snap management, since snap manages and updates itself) and you have only one snap installed, lxd.
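If you want to double-check this on your own host, the loop devices and their backing files can be inspected directly (a quick sketch; the exact output will differ per system):

```shell
# Show which file backs each loop device; on a snap system the backing
# files are the read-only .snap (squashfs) images.
losetup --list

# Cross-check against the mount table: squashfs entries correspond to
# the mounted snaps, not to your btrfs storage pool.
grep squashfs /proc/mounts || echo "no squashfs mounts here"
```

A btrfs pool created from a real partition like /dev/sda3 will not appear in either listing as a loop device.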

Thanks for the replies, I think I understand now. The reason for asking is that I seem to be getting big read I/O spikes on /dev/loop1 (the lxd snap). During those times I also noticed that LXD memory runs at double the normal amount (RSS maybe 100 MB).

[root@c3 ~]# ps aux | grep "lxd --logfile"
root      1280  0.1 1.25 847020 46560 ?        Sl   14:03   0:38 lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd

I enabled debug but didn’t see anything abnormal (and I think it has since rolled off). The read I/O spikes in particular seem unusual if the loop device is read-only.

For the record, I get much higher values with my config, also with snap LXD, on Ubuntu 18.04: around 130000 for VSZ. LXD does some housekeeping behind the curtain (see /var/snap/lxd/common/lxd/logs); maybe that’s the reason for the spikes.

Thanks for calibrating my expectations! Can this housekeeping be controlled (or disabled, if it’s not really necessary)? I am running an in-memory program which uses most of the RAM and gracefully reduces that usage before cron.daily and the like. LXD’s housekeeping (assuming that’s what causes the spike) is making the OOM killer kick in, and then everything becomes uncontrolled.

Hrrmf, I’m afraid you won’t get a very good deal on this. Looking at the lxd/lxd/task subdirectory in the source tree, the LXD scheduler doesn’t seem to be a very sophisticated affair, and I don’t see any API in this short but not-very-easy-to-understand code (not easy for me in a few minutes, that is). From the api_1.0.go source file, it seems that by setting the images.auto_update_interval and images.remote_cache_expiry keys to 0 it’s possible to stop those tasks from running, but that seems to be all; the intervals at which the other tasks run appear to be hardcoded. See for example in backup.go:

interval := time.Hour

and indeed from the logs this task runs hourly.

So there doesn’t seem to be any task management baked into LXD.
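That said, the two image-related keys mentioned above can still be zeroed out. Something like the following should work (a sketch based on my reading of api_1.0.go; I’m assuming 0 disables both tasks rather than just shortening the interval):

```shell
# Disable the periodic image auto-update check (0 = never).
lxc config set images.auto_update_interval 0

# Disable expiry of cached remote images (0 = no expiry-based cleanup).
lxc config set images.remote_cache_expiry 0
```

The hourly tasks such as the backup pruning shown above are unaffected by these keys, as far as I can tell.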

And from reading this comment in daemon.go:

Support for proper cancellation is something that has been started but has not been fully completed.

which dates from two years ago, it seems there is not much activity on this front.

So I think you will be better off setting aside memory for LXD’s ‘unexpected’ memory requirements.
Anyway, there is also snap itself to consider. But at least you can stop snap by firewalling away the Ubuntu servers.
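As an alternative to firewalling, recent snapd versions have built-in ways to quiet snap’s background activity (a sketch; the --hold flag needs a fairly new snapd, and stopping the daemon outright is a blunt instrument):

```shell
# Hold automatic snap refreshes indefinitely (newer snapd releases).
snap refresh --hold

# Or stop the snap daemon entirely until the next boot.
systemctl stop snapd.service snapd.socket
```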

OK, thank you very much for your research and advice! This seems like a situation that I’ll have to live with as you describe.

Looking at GitHub issues, this may be something that will be worked on, but even then it will not give LXD users this kind of control immediately.