Storage pool questions: is ZFS recommended over the directory backend strictly because of snapshots and copies? And why is a dedicated block device superior to a directory backend?

The LXD documentation says:

The two best options for use with LXD are ZFS and btrfs. … the directory backend is to be considered as a last resort option. … [it] is terribly slow and inefficient as it can’t perform instant copies or snapshots and so needs to copy the entirety of the instance’s storage every time.

I’ve been wondering for a while why the directory backend is the option of last resort. In particular, how is it slow to use the filesystem you live in? Also, wouldn’t a loopback device be the option of last resort? It’s obvious that snapshots and copies are going to be slow on, say, an ext4-based storage pool compared to ZFS or btrfs, where COW makes such things instantaneous, but is anything else going to be slower as well? If so, why?
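(For concreteness, by “directory backend” and “loopback” I mean pools created roughly like this; the pool names, path, and size are just examples:)

```
# Directory-backed pool on the filesystem the host already lives in
lxc storage create dirpool dir source=/srv/lxd-dir

# Loop-backed ZFS pool: LXD creates a sparse image file and builds a zpool on it
lxc storage create looppool zfs size=30GiB
```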

Second, how exactly is dedicating a full disk or partition to your LXD storage pool faster or better than just using a directory backend? I’m not seeing this. I have several legacy systems with very large hardware RAID partitions, so using ZFS would be a challenge. I could create a separate volume on the RAID device for my ZFS storage pool, but I don’t understand what advantages this affords me.
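(What I have in mind is something like the following, with the spare RAID volume exposed as a block device; /dev/sdb3 and the pool name are hypothetical:)

```
# Hand the carved-out RAID volume to LXD as a dedicated ZFS pool
lxc storage create raidpool zfs source=/dev/sdb3
```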

In most of my use cases I’ll be creating snapshots only occasionally (i.e. most of the volatile data will live outside the container, accessed through a bind mount or something) and will make copies even more rarely. If these are the only issues with using a directory backend, then this isn’t a concern for me.
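(By a bind mount I mean attaching a host directory to the container as a disk device, roughly like this; the instance and path names are made up:)

```
# Expose a host directory inside the container as a disk device
lxc config device add samba-dc shared-data disk source=/srv/data path=/data
```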


If your use case is running a single container, never creating new ones, never making a snapshot, and never needing to apply quotas, then dir is probably the best option.

For just about every other case, you’d want ZFS or btrfs as that will drastically reduce the disk usage when running more than one container, cut down container creation time from dozens of seconds or minutes to milliseconds, and allow you to do things like snapshots, fast backups, migration, … all of which would otherwise rely on slow rsync.
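To make that concrete, these are the operations that turn from minutes of rsync/unpack into near-instant copy-on-write operations on ZFS or btrfs (instance names here are just examples):

```
lxc launch ubuntu:20.04 c1      # after the first launch, new instances from the cached image are CoW clones
lxc snapshot c1 before-upgrade  # the snapshot is a filesystem snapshot, effectively instant
lxc copy c1 c2                  # the copy is a clone, no full duplication of the rootfs
```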

On pure filesystem performance, ZFS, even on a dedicated drive, will be slower than directory (ext4/xfs), so again, if your use case is a single container, no quotas, no snapshots, no migration, no backups, … then ZFS isn’t for you.

But the most common use case for us is users creating more than one container, often using them as throwaway environments, and for that, having to go through a full unpack and complete duplication of disk usage on every instance gets quite frustrating.


Thanks, Stéphane, for confirming my thoughts on this. Question: how does ZFS save on disk usage when running multiple containers? I’m not sure I see this unless you’re talking about running copies, in which case it’s obvious. If I have one Arch, one Ubuntu 14.04, one Ubuntu 20.04, and one CentOS 7 container, I don’t think ZFS will save me any space – is this correct?

I’m trying to use ZFS on all my new server projects, but for example, I’m currently building a container for a Samba AD-DC, where the users/machines don’t change that often. I will snapshot whenever there is a new user or machine; say every 1-2 months. I’m already making snapshots of other containers based on directory storage, and it’s just not that bad. Not instantaneous, but having to wait a couple of minutes once a month is less painful than reconfiguring the bare metal and starting over with ZFS (which does not like hardware RAID controllers one bit).

Of course if I were a service provider spinning up new containers several times an hour this would be an entirely different story. Maybe the language used in the documentation is a bit too Manichean though. <:)


That’s correct: the space saving is for multiple containers created from the same image, or for containers which are copies of each other. ZFS also comes with compression enabled in LXD, so technically you’d save a bit of space from that too, and you could turn on full block deduplication if you wanted maximal space savings at the cost of memory.
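If you want to see what the compression is buying you, or try deduplication, something like this works (assuming the zpool carries the same name as the storage pool, “default” here):

```
# Show the compression setting and the achieved ratio for the pool
zfs get compression,compressratio default

# Enable block-level deduplication (the dedup table costs RAM)
zfs set dedup=on default
```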

For your use case, what you’re describing sounds perfectly fine. Of course, you’d never want to do that with a Samba server that also acts as a file server, as the duplication on snapshots would be quite problematic, but for an AD-DC that’s just going to be a few hundred MBs of OS and Samba data, so not a huge deal.
