Distributed file system inside LXC/LXD containers

Hi all, a search didn’t turn up much about this, so I’m looking for your thoughts on distributed file systems.

We have geographically distributed Alpine hosts with btrfs housing LXC containers (but I guess LXC or LXD doesn’t make a significant difference here). We’d like to mount a distributed file system inside the container so that changes to files in one container are replicated to other containers. But I can’t find a good option, so looking for ideas.

So far we considered:

  • GlusterFS: Not available for Alpine.
  • Ceph: Memory requirements too high (there’s 1GB RAM or less on some hosts)
  • MooseFS: Stores chunks and has a single master for metadata, which makes it less suitable for a geographically distributed system
  • NFS/pNFS: Doesn’t store the file locally, so issues with read speeds in a geographically distributed setup
  • lsyncd: We currently use this and designate one container as the ‘master’, but we want to be able to make changes in an arbitrary container and have them replicated

Some more information:

  • Must be usable in unprivileged containers
  • Not looking to put the entire container on a distributed file system; just certain directories inside the container
  • Whether everything happens inside the container, or on the host and is bind-mounted into LXC, is not too important, although we’d prefer the former (a plain bind mount like the sketch after this list would be acceptable)
  • In this particular case the workload is heavily skewed towards reads
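For illustration, the kind of bind mount we mean would be something like this (a rough sketch only; the container name, paths and use of LXD’s shift option are assumptions, with a plain-LXC equivalent commented underneath):

    # LXD: bind-mount a host directory into unprivileged container "c1",
    # shifting ownership into the container's uid range
    lxc config device add c1 shared disk source=/srv/shared path=/srv/shared shift=true

    # Plain LXC equivalent: add a bind mount to the container's config file
    # lxc.mount.entry = /srv/shared srv/shared none bind,create=dir 0 0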

Any ideas are very welcome, thanks.

Hi tetech,
Actually you have considered the right candidates, and more or less the best option looks like lsyncd.
Its configuration is simple and conservative, and it also works fine on the Alpine distribution. Plus, you can consider pairing lsyncd with keepalived for high availability.
Regards.

Thanks for your reply. I am currently testing a combination of lsyncd + csync2. As this runs entirely in user space, I guess it no longer has anything to do with LXC/LXD, except that the inotify limit needs to be adjusted inside the container.
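For anyone else trying this, the inotify adjustment is just a sysctl; a minimal sketch (the values are arbitrary, and depending on the kernel version you may have to set it on the host rather than in the container):

    # Raise inotify limits so lsyncd can watch all the synced directories
    # (values are illustrative; persist via /etc/sysctl.d/ if this works for you)
    sysctl -w fs.inotify.max_user_watches=1048576
    sysctl -w fs.inotify.max_user_instances=512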

@tetech

I have tried distributed file systems in unprivileged containers.

LizardFS (similar to MooseFS):
It did not work well: there were many errors, and with Fan networking there were many issues with chunkserver-to-master communication.

NFS:
Couldn’t find an easy way to run it reliably and securely in an unprivileged container.

MinIO:
It claims to be S3 compatible, but in reality that is mostly marketing; it is not fully S3 compatible, and some code that works on S3 will not work with MinIO.

SeaweedFS (https://github.com/chrislusf/seaweedfs):
Looks very promising, with better S3 compatibility. It also has a filer component that can mount it in an unprivileged container as a POSIX-compliant file system, but I didn’t try it due to the additional complexity.

In the end I set up one container with its user home as storage and then mounted it in the other containers using sshfs. The whole system is automated, and each container uses ed25519 keys for the mount. It’s the most performant and secure solution I have found so far that is also easy to use.
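Roughly, the core of it looks like this (hostnames, paths and options here are placeholders rather than the exact setup):

    # In each client container: dedicated ed25519 key for the storage mount
    ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_storage -N ""
    ssh-copy-id -i ~/.ssh/id_ed25519_storage.pub storage@storage-ct

    # Mount the storage container's home over sshfs
    mkdir -p /mnt/shared
    sshfs -o IdentityFile=~/.ssh/id_ed25519_storage \
          -o reconnect,ServerAliveInterval=15,ServerAliveCountMax=3 \
          storage@storage-ct:/home/storage /mnt/shared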

Hello @roka and thanks for your input.

LizardFS has the same problems as MooseFS for a geographically distributed system, and it seems to be less actively maintained, so we considered and eliminated it.

NFS is too slow. We need replication to the local node; remote access is always too slow for us. However, we do use NFS for tasks that are not latency-sensitive (like backups). The way we do it is to create a WireGuard tunnel on the host and do the NFS transport over that, then bind-mount the NFS share into an unprivileged container.
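A rough outline of that arrangement (interface names, addresses and the container/device names are illustrative, and it assumes /etc/wireguard/wg0.conf already defines the peers):

    # On the host: bring up the WireGuard tunnel and mount NFS over it
    wg-quick up wg0
    mkdir -p /mnt/nfs-backups
    mount -t nfs4 10.10.0.2:/srv/backups /mnt/nfs-backups

    # Bind-mount the NFS mount into the unprivileged container (LXD syntax)
    lxc config device add backups-ct nfsbackups disk source=/mnt/nfs-backups path=/mnt/backups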

We did look at SeaweedFS. It seemed interesting, and I forget why we did not continue with it. One thing we weren’t keen on was that it chunks the files (as MooseFS also does).

The sshfs option has the same problem as NFS for us - it is not quick enough over links with 300ms latency.

Currently we are still testing the lsyncd + csync2 combo. It is not perfect, but it is working OK. And where it makes sense (e.g. database clusters) we handle replication at the application layer.