Sharing programs between containers

votsalo · March 26, 2019, 8:08pm

What are current best practices and future plans for sharing programs between containers? I mean the ability to install a program either on the host, or in a special resource (container, volume, etc.) and have it available in other containers. This is useful for performance reasons (a single copy of the program in memory) and also for easier maintenance (updating the program could be done once).

Existing program sharing methods:

Shared disk device. I put shared programs in a disk device that I share with most of my containers (through a profile, to make it easier). I have a host directory that is shared as /var/share/opt and is accessible read-only by all containers, except for one container which has read-write access (via id map). In that directory I have subdirectories like jdk1.8.0_191, which I then link to in each container via a symbolic link /opt/jdk8 -> /var/share/opt/jdk1.8.0_191. I can therefore have multiple versions, so each container can upgrade separately.
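
A minimal sketch of the per-container symlink switch described above, run against a throwaway directory tree rather than a real container. The helper name `point_link_at` and the second version directory `jdk1.8.0_202` are hypothetical and only illustrate one container upgrading while the older version stays in place for the others. It assumes a Linux filesystem, where renaming over an existing symlink replaces it in one step.

```python
import os
import tempfile
from pathlib import Path


def point_link_at(link: Path, target: Path) -> None:
    """Make `link` a symlink to `target`, replacing an existing link atomically."""
    tmp = link.with_name(link.name + ".tmp")
    if tmp.is_symlink():
        tmp.unlink()          # clear a leftover from an interrupted run
    os.symlink(target, tmp)   # create the new link under a temporary name
    os.replace(tmp, link)     # rename over the old link in a single step


def demo() -> str:
    """Build a throwaway tree, switch versions, and return the linked version."""
    with tempfile.TemporaryDirectory() as root:
        shared = Path(root, "var", "share", "opt")  # stands in for the shared disk device
        opt = Path(root, "opt")                     # stands in for the container's /opt
        opt.mkdir(parents=True)
        # The second version name is hypothetical, used only for the upgrade step.
        for version in ("jdk1.8.0_191", "jdk1.8.0_202"):
            (shared / version).mkdir(parents=True)

        link = opt / "jdk8"
        point_link_at(link, shared / "jdk1.8.0_191")  # initial setup
        point_link_at(link, shared / "jdk1.8.0_202")  # this container upgrades
        return Path(os.readlink(link)).name


if __name__ == "__main__":
    print(demo())
```

Because only the link changes, the shared directory itself can stay read-only inside the container, and other containers keep pointing at whichever version they chose.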

This method may require some work to separate the read-only parts of the shared program from its data, which takes some working knowledge of the program I’m trying to share. Programs like the JDK, Ant and Go can be shared as-is. For example, Go uses $GOPATH to find the (variable) .go files that it compiles. Application servers and PHP applications notoriously mix code and data (for example by having configuration files and plugins that are hard to separate from the core program, or by automatically updating their plugins), but some can be fixed by moving data directories outside the PHP code and using symbolic links.

This method doesn’t work with programs installed through the package manager (such as apt-get), nor through snap. For this reason, I often prefer to install the programs that I use manually (by unpacking an archive), rather than install them from the package manager. This also allows me to use the latest version more easily.

Clone a container. Cloning a container, using “lxc copy” (--container-only or from a snapshot) is a way to share all its programs, if a copy-on-write storage driver is used (zfs or btrfs). This is useful for containers that run the same programs, such as website frameworks. A template container would be used to install all necessary programs and then cloned to instantiate other containers that use the same programs. However, modifications to the original container don’t propagate to the cloned containers. One could carefully structure such a system to put most writeable data in a shared volume and then have scripts that would quickly clone the original template, and then move the data from the old instances to the updated instances. I do this in some cases.
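
The cloning scripts mentioned above could start from something like the following sketch, which only builds the `lxc copy` command line and does not run it. The function name `lxc_copy_argv` and the container and snapshot names in the usage note are hypothetical. The `--container-only` flag is the one named in the post; other LXD releases may spell it differently, so check `lxc copy --help` on your version first.

```python
from typing import List, Optional


def lxc_copy_argv(template: str, new_name: str,
                  snapshot: Optional[str] = None) -> List[str]:
    """Return the argv for cloning `template` (or one of its snapshots)."""
    if snapshot is not None:
        # Copying from a snapshot uses the "container/snapshot" source form.
        return ["lxc", "copy", f"{template}/{snapshot}", new_name]
    # Copy the container itself, without dragging its snapshots along.
    return ["lxc", "copy", template, new_name, "--container-only"]


if __name__ == "__main__":
    print(" ".join(lxc_copy_argv("template", "web1")))
    print(" ".join(lxc_copy_argv("template", "web2", snapshot="clean")))
```

To actually run a command, the returned list could be passed to `subprocess.run(argv, check=True)` on a host where the `lxc` client is installed.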

Ideas for sharing programs between containers using LXD features that don’t exist yet (as far as I know):

Share snaps between containers. I would install a snap once in something like a storage volume, and “attach” it to multiple containers. Don’t snaps already have a read-only portion for the programs and a read-write portion for data? The read-only part should be shared, along with any updates to it. The read-write part should be cloned. Sharing between containers seems an ideal use-case for snaps.

Share OS packages. Perhaps the same thing could be done with the standard package manager, e.g. apt, apk, etc, though this would require making those package managers more like snap. But because of package dependencies, a better solution would be to share all packages, described next.

Share all OS packages from one container to another container. Let’s say I have a container T that I use as a template for other containers. I would install any OS packages that I need in container T and then I would instantiate other T-like containers through an LXD command, similar to copy. This command would share all read-only package files and create a copy (or copy-on-write clone) of the variable (writeable) data. This is very similar to the existing lxc copy command, except that it would also propagate updates, additions, and deletions of packages. The difficulty is that most OS package managers don’t have a notion of read-only files (with the notable exception of snaps): everything is writeable by the container root.

Package management should evolve to match container use-cases.

gpatel-fr · March 27, 2019, 1:35pm

How about automatic sharing through zfs or btrfs deduplication? It seems much simpler and more reliable, and would not involve meddling with updates.

votsalo · March 27, 2019, 2:45pm

Do you use deduplication? Could you point me to the commands I’d need to use? Doesn’t it have significant memory requirements?

gpatel-fr · March 27, 2019, 2:56pm

Not yet, but I have already thought about it and I am planning (hoping) to use it someday. From what I have seen, btrfs doesn’t have online dedup, so it’s necessary to use some sort of cron job to launch a utility that dedups file by file (not block by block AFAIK, but I’m not sure). I don’t know about memory use, but it does not sound like something that should be very memory hungry. I have not gone further because, to be candid, the way to do it was not obvious at first glance, so I have given it a pass for now.
I know almost nothing about zfs (I don’t use it), but it’s famed for its online dedup, which should be very memory intensive indeed.
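
To get a rough idea of how much a file-by-file dedup pass could share between containers, a sketch like the one below groups files with identical content by hashing them. It only reports candidates and does not deduplicate anything; it is not how btrfs or zfs work internally. The function names, the `c1`/`c2` directory layout and the file names in the demo are all hypothetical.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are not read into memory at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> List[List[Path]]:
    """Return groups of regular files under `root` that have identical content."""
    by_digest: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


def demo() -> List[List[str]]:
    """Two fake container trees sharing one identical program file."""
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        for name in ("c1", "c2"):  # hypothetical container root directories
            (root / name).mkdir()
            (root / name / "tool.bin").write_bytes(b"same program bytes")
        (root / "c2" / "data.txt").write_bytes(b"container-specific data")
        return [[p.relative_to(root).as_posix() for p in group]
                for group in find_duplicates(root)]


if __name__ == "__main__":
    print(demo())
```

Pointing `find_duplicates` at the directory that holds the container root filesystems would show which program files are byte-identical across containers, which is the portion a dedup tool could in principle share.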