Lxd-ops: Separating container OS from user data

I have been separating user data (my data) from guest OS data in LXD containers, so that I can upgrade a container by removing the old OS and plugging in a new one.

I put user data in disk devices that I attach to the container, at the following standard paths, and I don’t put user data anywhere else:
/opt, /etc/opt, /var/opt, /usr/local/bin, /home, /var/log
In some cases, I add extra devices. For mariadb containers, I add /var/lib/mysql and /var/lib/mysql-log (which I use for innodb_log_group_home_dir).
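
For example, a single such device can be attached to a container directly (the container name c1 and the host path are illustrative, not my actual setup):

  lxc config device add c1 home disk source=/z/c1/home path=/home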

I have a handful of template containers, built with specific packages and other configuration. The template containers are stopped and snapshotted after they are created.
When I need a working container $c, I make it as follows (a command sketch follows the list):

  • I copy a template container (which just makes a zfs clone)
  • I create a new zfs filesystem (or several) for the attached devices, and create a subdirectory in it for each disk device (opt, etc, var, bin, home, log)
  • I create a $c.devices profile and I attach the disk devices to it.
  • I either rsync or zfs clone each disk device from the template container (or from another “device template”) to the working container
  • If I don’t use a template for the devices, I chown them to 1000000:1000000, which maps to the root user in the container, so the container can write to them.
  • I attach this profile to the working container, along with any other profiles that I need (generally to attach other disk devices with application directories)
  • I start the container
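
Put together, a minimal sketch of these steps, assuming a template named t-alpine with a snapshot snap0, a pool z mounted at /z, and a working container c1 (all names illustrative):

  lxc copy t-alpine/snap0 c1                 # makes a zfs clone of the template
  zfs create z/c1
  (cd /z/c1 && mkdir opt etc var bin home log)
  chown -R 1000000:1000000 /z/c1             # maps to root inside the container
  lxc profile create c1.devices
  lxc profile device add c1.devices opt disk source=/z/c1/opt path=/opt
  lxc profile device add c1.devices etc disk source=/z/c1/etc path=/etc/opt
  # ...and likewise for var, bin, home, log
  lxc profile add c1 c1.devices
  lxc start c1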

When I create a template container, I follow similar steps, but there is typically a lot more configuration (installing packages, running scripts, …).
A template container can be created by cloning another template container, so I have a hierarchy of templates, out of which I clone my working containers.
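
For illustration, a base template could start out along these lines (the image and the packages are just examples, not my actual templates):

  lxc launch images:alpine/3.13 t-alpine
  lxc exec t-alpine -- apk add rsync openssh
  lxc stop t-alpine
  lxc snapshot t-alpine snap0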

When I need to upgrade the OS of a working container, I rebuild the working container by deleting it and creating a new container to take its place, with the same user data (a command sketch follows the list):

  • I first upgrade the template container, or rebuild it from scratch
    Then for each working container:
  • I stop and delete the working container (but not the disk devices in the profile $c.devices)
  • I clone the template container (copy a snapshot)
  • I attach the same profiles as before to the working container, including $c.devices
  • I start the container
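
In commands, reusing the names from the earlier sketch (and assuming the upgraded template was snapshotted as snap1):

  lxc stop c1
  lxc delete c1                  # the c1.devices profile and its filesystems remain
  lxc copy t-alpine/snap1 c1     # clone of the upgraded template
  lxc profile add c1 c1.devices
  lxc start c1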

There are some other details:

  • Because I replace /etc completely, I keep the changed files in /etc/opt/etc/ (for example, /etc/opt/etc/php7/php.ini)
    When I recreate the container:
  • I rsync /etc/opt/etc/ /etc/
  • I replace the new sshd host keys with the old ones, so that ssh clients connecting to the replaced container don’t warn about an unrecognized host key (see the example after this list)
  • I recreate any users that are not in the template container
  • I generally avoid making changes to /etc. For example, I keep nginx and apache2 conf files in /etc/opt/apache2/ and /etc/opt/nginx/, and configure the web servers to include the configuration files from there.
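
For example, the /etc restore after recreating the container might look like this, run inside the container (the host-key location under /etc/opt/etc/ssh/ is an assumption about where the old keys were saved):

  rsync -a /etc/opt/etc/ /etc/
  cp /etc/opt/etc/ssh/ssh_host_* /etc/ssh/   # keep the old host keys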

When I replace a template container, I first rename it, so I can tell which working containers still use the old template (using zfs list -t snapshot -o name,clones on the LXD containers filesystem). Once the old template no longer has any clones, I delete it.
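
For example, if the LXD storage pool lives on the dataset z/lxd (illustrative):

  zfs list -t snapshot -o name,clones -r z/lxd/containers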

That way:

  • I know where my data is, how much space it takes, and how to port it to a new system.
  • I was able to migrate containers to another distribution, though I needed to change or remap file permissions because the corresponding application uids/gids differed between the two distributions.
  • I only need to back up my data, not multiple copies of guest OS instances.
  • I can snapshot my data independently of the OS.
  • I don’t have any hidden settings forgotten deep in the OS.
  • I keep my working containers “reproducible”. I can recreate them from a brand new container and my own data.
  • There is less duplication or divergence of OS files across containers, since all working containers are clones of a single template (or a few templates).

I have a program that automates these steps. It uses a yaml configuration file for each template container and working container, which lists what needs to be done to create or rebuild that container. These steps are:

  • copy a template container
  • create or clone zfs filesystems
  • copy files from a template
  • install packages (mostly for the template containers)
  • attach profiles
  • push files (lxc file push)
  • run scripts (lxc exec sh)
  • create users
  • create a snapshot (mostly for template containers)
  • stop the container (for template containers)

I have published the code that I use to do the above, here: https://github.com/melato/lxdops
I’d appreciate feedback. Am I duplicating something that already exists elsewhere?


Hi!

I am not aware of other software that helps with separating the user data in LXD containers like this.
I think this is a form of what is called host-based persistence.

I would like to try lxdops. Here are my modified build instructions. I am starting off with a new LXD VM with Ubuntu 20.04, and I install Go from the snap package (go1.16.2). Then,

ubuntu@lxdops:~$ export GO111MODULE=auto
ubuntu@lxdops:~$ go get melato.org/command
ubuntu@lxdops:~$ go get melato.org/lxdops
ubuntu@lxdops:~$ go get github.com/melato/lxdops
ubuntu@lxdops:~$ mkdir ~/go/bin
ubuntu@lxdops:~$ export GOBIN=~/go/bin
ubuntu@lxdops:~$ cd /home/ubuntu/go/src/github.com/melato/lxdops/main/
ubuntu@lxdops:/home/ubuntu/go/src/github.com/melato/lxdops/main$ date | sudo tee version
ubuntu@lxdops:/home/ubuntu/go/src/github.com/melato/lxdops/main$ go install lxdops.go
ubuntu@lxdops:/home/ubuntu/go/src/github.com/melato/lxdops/main$ cd ~/go/bin
ubuntu@lxdops:/home/ubuntu/go/bin$ ls -l
total 16016
-rwxr-xr-x 1 ubuntu ubuntu 16399768 Mar 31 22:30 lxdops
ubuntu@lxdops:/home/ubuntu/go/bin$

In your instructions you enable the -static flag, but I get the following warnings. I suppose you prefer to compile the binary as static so that you can take it and use it on other systems as well?

# command-line-arguments
/usr/bin/ld: /tmp/go-link-409360308/000002.o: in function `mygetgrouplist':
/snap/go/current/src/os/user/getgrouplist_unix.go:16: warning: Using 'getgrouplist' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /tmp/go-link-409360308/000001.o: in function `mygetgrgid_r':
/snap/go/current/src/os/user/cgo_lookup_unix.go:38: warning: Using 'getgrgid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /tmp/go-link-409360308/000001.o: in function `mygetgrnam_r':
/snap/go/current/src/os/user/cgo_lookup_unix.go:43: warning: Using 'getgrnam_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /tmp/go-link-409360308/000001.o: in function `mygetpwnam_r':
/snap/go/current/src/os/user/cgo_lookup_unix.go:33: warning: Using 'getpwnam_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /tmp/go-link-409360308/000001.o: in function `mygetpwuid_r':
/snap/go/current/src/os/user/cgo_lookup_unix.go:28: warning: Using 'getpwuid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /tmp/go-link-409360308/000011.o: in function `_cgo_26061493d47f_C2func_getaddrinfo':
/tmp/go-build/cgo-gcc-prolog:58: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

If these instructions look right, feel free to use in README.md.

You don’t have to use -static. I just tried without it on Ubuntu (with go 1.16.2 downloaded from golang.org) and it compiles: go install lxdops.go
I use -static because I often compile on alpine and run on debian. This wouldn’t work with dynamic libraries.
In the instructions I put the same flags that I use.
I’ll check the warnings that you are getting.
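
One way that should avoid those warnings entirely is to build with cgo disabled, which makes the os/user and net packages fall back to their pure-Go implementations (assuming nothing in lxdops actually needs cgo):

  CGO_ENABLED=0 go install lxdops.go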

@simos, I tried it on a brand new Ubuntu 20.04 minimal system with snap go, and updated the build instructions as you mentioned (almost).
I also had to: sudo apt install git gcc libc6-dev

I chose to use my own domain for my packages, instead of github’s, which complicates go get slightly, so for now you need to get my repositories explicitly.
If you do “go get github.com/melato/lxdops” and “go get melato.org/lxdops” you will have cloned the same repository in two different places.

I have updated lxdops to v2, which uses standard cloud-config files for internal container configuration instead of the previous custom mechanisms. I’ve updated the documentation and examples on github.

Also, I’ve improved the compilation procedure by using go modules correctly (I hope). I still use my own domain for packages, but I now run my own go-get server at my domain, so the go compiler should know to get the code from github even though the import path does not have github in it.

See also How could immutable images for container creation work? - #5 by votsalo