Using LXC over NFS, what is the ideal setup for performance?

Research suggests that running the LXC storage system on ZFS (backed by partition-based storage pools rather than loop files) yields the best performance.
That makes sense, but what is not clear is how container storage should be set up for best performance when it sits on network storage.

Scenario:

Multiple hosts, each running various LXC containers, are connected to a central NAS which provides all of their storage access via NFS.
In this case the NAS server uses ZFS internally to manage all its storage, but the hosts that connect to it use NFS to access images and container-generated content.

So naturally, when you set up a host and initialize LXD, you need to configure a storage system for the containers.
If you use ZFS from this perspective, you end up with ZFS running on top of NFS, which in turn is backed by a server running ZFS.

Surely this design is not ideal for performance, given all the overhead and nested filesystems, and directory-based storage on LXC is considered poor for performance and not recommended.

How do you manage shared storage over the network while still maintaining good overall system performance and keeping overhead down?

Here are some options that come to mind:

  • ZFS over ZFS is not so bad, and maybe it’s fine to use it like this in this scenario
  • Don’t use ZFS on the storage server; use a simpler filesystem like EXT4 instead, and let the host LXC’s ZFS driver handle it
  • Is directory-based storage acceptable in this case?
  • Keep the container images cached on the hosts, and manage container-generated content by mounting it over NFS

I’d probably stay away from putting a ZFS pool in a loop file over NFS; that seems like a recipe for massive problems if you hit any network glitch, and it is also potentially problematic because NFS locking works differently from normal filesystem locks.

Directory-based storage is probably your best option given this environment.
Container creation will be pretty slow and you’ll definitely want to stay away from snapshots, but other than that, this should be good enough.

Again, one thing to keep in mind is that NFS may not be fully POSIX compliant, and its locking mechanism differs from most other filesystems. So some services in your containers may fail or outright refuse to start because they’re running entirely from NFS.

If you have enough space on the machines running LXD, I’d recommend using a local ZFS pool on them plus a separate dir storage pool backed by NFS. You’d then allocate custom storage volumes from that NFS dir storage pool and attach them to specific locations in your containers.

This would get you the benefits of ZFS for most operations while still keeping the bulky data on the NFS server. You’d then want to have snapshots made of that data directly on the NFS server rather than through LXD.
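A minimal command sketch of that hybrid layout, assuming the device path (/dev/nvme0n1p2), the host-side NFS mount point (/mnt/nfs) and the pool/volume/container names, which are all placeholders to adapt:

```shell
# Local, partition-backed ZFS pool for container root filesystems
# (/dev/nvme0n1p2 is an assumed spare partition)
lxc storage create local zfs source=/dev/nvme0n1p2

# Point the default profile's root disk at the fast local pool
lxc profile device set default root pool=local

# dir pool on top of the host's NFS mount for the bulky data
lxc storage create nfs dir source=/mnt/nfs/lxd

# Custom volume on the NFS pool, attached into container c1
lxc storage volume create nfs bulkdata
lxc config device add c1 bulkdata disk pool=nfs source=bulkdata path=/srv/data
```

Snapshots of the bulkdata volume would then be taken on the NAS side, directly in ZFS, rather than through LXD.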

Yes, the hosts running the containers have enough space to store the images without a problem, and by the sounds of it, you have confirmed that it is indeed a bad idea to keep the running images on NFS.

I’m well aware of the caveats that come with NFS: it hampers real-time data access and gives up several important capabilities, e.g. file watching (and for good reason too; you can’t blame it, given that it runs over the network).

So let’s confirm some points that may promote best performance:

  • Use local, partition-backed ZFS pools (preferably on NVMe SSDs) to run the LXC container images from
  • Use NFS for snapshot storage; snapshots can be transferred to the host’s filesystem to run there
  • Use NFS to store LXC container app data by mounting it inside the container
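One way to provide that NFS-backed app data is a single host-side mount. A hedged /etc/fstab sketch, where the server name (nas), the export path (/tank/lxd) and the mount options are assumptions to adapt (nconnect needs a reasonably recent kernel):

```
# Assumed /etc/fstab entry for the shared NFS mount on each LXD host
nas:/tank/lxd  /mnt/nfs  nfs  vers=4.2,noatime,nconnect=4,_netdev  0 0
```

The host mounts the share once, and subtrees of it can then be handed to containers rather than each container carrying its own NFS client configuration.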

The only part I would like clarification on is the NFS allocation for container application use.
One way is simply to mount the shares via NFS within the container (I’ve had some trouble with that lately due to AppArmor), but perhaps this is something that can be improved. Are plain mounts done the traditional way fine, or is there a more optimal way to attach LXC-optimized storage over the network?

If you’re starting from a clean state (so you don’t need to attach a bunch of pre-existing paths from NFS) and you have your NFS share mounted at /mnt/nfs on the host, I’d do something like:

  • lxc storage create nfs dir source=/mnt/nfs/lxd

This will show up as a new storage pool, on which you can then create volumes with:

  • lxc storage volume create nfs foo

At which point they can be attached to an instance with:

  • lxc config device add c1 foo disk pool=nfs source=foo path=/mnt/foo

That way you can create as many volumes as you want through LXD and attach them.

If you have pre-existing directories on NFS that you need to attach to instances, then using a disk device with source=/mnt/nfs/… is likely the way to go. Mounting NFS from every instance is tricky, not to mention it would cause needless connections. Having the host maintain a single connection is probably best.
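For example, a pre-existing directory on the host’s NFS mount could be attached directly with a disk device; the container name, source directory and target path here are placeholders:

```shell
# Bind a host-mounted NFS subdirectory into container c1
lxc config device add c1 appdata disk source=/mnt/nfs/appdata path=/srv/appdata
```

This bypasses LXD-managed volumes entirely, which is handy when the directory layout on the NAS already exists.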

You made a very interesting point regarding multiple NFS paths in separate containers versus a single connection on the main host. I completely overlooked the fact that multiple NFS mounts would mean more connections. Is this really the case even if all of them connect to the same destination NFS server? I assumed the NFS client would realize this and handle/optimize it internally, but now that you’ve shed some light on this, I will definitely take it into consideration.

So to be clear, you are suggesting mounting a single general NFS path on the main system, which then acts as a storage source for LXD using the “dir” type, not “zfs” or any other sophisticated filesystem. You then use LXC to create “volumes” on this storage source, which act as isolated partitions for container use. This ultimately resolves the performance issues by keeping a single NFS connection, and the security/isolation issues by letting containers access only their designated storage areas.

I did not want to come here and waste everyone’s time asking for general sysadmin-type performance advice, but rather to learn to take advantage of any native/optimized LXC offerings that can aid in this. Excellent advice Stéphane; Canonical made a great choice selecting you as their master maintainer, and all the contributions you are providing are definitely noticed.