Trying out `shiftfs`

What’s shiftfs

shiftfs is a kernel filesystem which was initially developed by James Bottomley, then stabilized and extended by @sforshee and @brauner.
It acts like an overlay on top of an existing mount on the host, which can then be mounted inside the container with uid/gid shifted on all filesystem operations. This allows unprivileged containers to be created and started instantly, as no costly filesystem remapping is needed at creation or startup.

In tests on fast systems backed by NVMe storage, we’ve seen the typical `lxc launch` time for an Ubuntu 18.04 image go from around 2.5s down to just 500ms. For systems using hard disks, the difference should be far more noticeable, possibly saving tens of seconds.

shiftfs will also allow for multiple containers that use non-overlapping maps (security.idmap.isolated=true) to share custom storage volumes, which is currently impossible.

Where can it be found

At this time, shiftfs isn’t mainline. It’s included in the Ubuntu kernel starting with 5.0 and some other Linux distributions may pick it up. It should also be reasonably easy to package as a DKMS package for distributions that ship a suitably recent kernel.

Ubuntu users have it in their kernel out of the box; this can be confirmed with `modinfo shiftfs`.
As we had to fix a few early issues, you should make sure that your kernel is fully up to date and that you have booted into it.
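As a rough sketch of that check, the script below compares the running kernel release against 5.0 and then asks `modinfo` for the module. The `kernel_at_least` helper is a name made up for this example. The version comparison is only a hint, since a 5.0 kernel from another distribution may not carry shiftfs at all.

```shell
#!/bin/sh
# kernel_at_least RELEASE MAJOR MINOR (hypothetical helper, not part of LXD)
# Succeeds when RELEASE (e.g. "5.0.0-17-generic") is at least MAJOR.MINOR.
kernel_at_least() {
    release=$1 want_major=$2 want_minor=$3
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%.*}             # second version component
    minor=${minor%%[!0-9]*}       # drop any non-numeric suffix
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

if kernel_at_least "$(uname -r)" 5 0; then
    echo "running kernel is 5.0 or newer"
else
    echo "running kernel is older than 5.0"
fi

# modinfo exits non-zero when the module is unknown to the running kernel.
if command -v modinfo >/dev/null 2>&1 && modinfo shiftfs >/dev/null 2>&1; then
    echo "shiftfs module found"
else
    echo "shiftfs module not found (or modinfo unavailable)"
fi
```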

Limitations

The only limitation we’re currently aware of is that shiftfs prevents the use of overlayfs inside the container. This may break things for users running Docker in their containers. If this applies to you, you should stay away from shiftfs for now.

How can it be used

The easiest way is with the LXD snap, version 3.12 or higher. We currently have it disabled out of the box because of the limitation above, but you can opt into it if that doesn’t apply to you.

To turn it on, do:

  • sudo snap set lxd shiftfs.enable=true
  • sudo systemctl reload snap.lxd.daemon

You can then run:

  • lxc info

And should see:

    shiftfs: "true"

Under the kernel_features header. If you see it, then you’ve got it enabled.
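For scripting, the same check could look roughly like this. The `shiftfs_enabled` function is a name invented for this sketch; it only looks for the `shiftfs: "true"` line on standard input and does not verify that the line sits under `kernel_features`.

```shell
#!/bin/sh
# shiftfs_enabled (hypothetical helper): reads `lxc info` output on stdin
# and succeeds if it contains a line of the form: shiftfs: "true"
shiftfs_enabled() {
    grep -Eq '^[[:space:]]*shiftfs:[[:space:]]*"true"[[:space:]]*$'
}

# Only query LXD when the lxc client is actually installed.
if command -v lxc >/dev/null 2>&1; then
    if lxc info | shiftfs_enabled; then
        echo "shiftfs is enabled"
    else
        echo "shiftfs is not enabled"
    fi
fi
```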

After that, any newly created container will use shiftfs.
To convert an existing container, the easiest way is to temporarily make it privileged and then switch it back:

  • lxc config set NAME security.privileged true
  • lxc restart NAME
  • lxc config unset NAME security.privileged
  • lxc restart NAME
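If you have several containers to convert, the four steps above could be wrapped in a small function. This is only a sketch: `convert_to_shiftfs`, `run` and the `DRY_RUN` variable are names made up here, and the default is to print the commands rather than execute them.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# convert_to_shiftfs NAME: same four steps as listed above, stopping at the
# first command that fails.
convert_to_shiftfs() {
    name=$1
    run lxc config set "$name" security.privileged true &&
        run lxc restart "$name" &&
        run lxc config unset "$name" security.privileged &&
        run lxc restart "$name"
}

# "mycontainer" is a placeholder; pass your own container names.
for container in mycontainer; do
    convert_to_shiftfs "$container"
done
```

Each container is restarted twice, so expect a short interruption per container.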

Feedback

Any feedback on shiftfs would be appreciated and we will be updating the limitations section above as we become aware of new issues.

Once all of them are addressed, we will slowly start rolling it out by default to our users as this is a significant performance improvement and completely avoids the rather complex and risky remapping logic that we have to use today.


I looked at it for my 18.04 LTS and all I found is a PPA. It’s not exactly ‘out of the box’ IMO. According to this there may be some hope for August 2019.

Linux 5.0 is currently available in Ubuntu 19.04 or newer. Indeed, it will reach the latest LTS at Ubuntu 18.04.3 (1st Aug).


Here are some benchmarks, on Ubuntu 19.10 (latest daily).

Before enabling shiftfs

myusername@myusername-desktop:~$ time lxc launch ubuntu:18.04 mycontainer
Creating mycontainer
Starting mycontainer

real	0m2,551s
user	0m0,074s
sys	0m0,033s
myusername@myusername-desktop:~$ time lxc launch ubuntu:18.04 mycontainer2
Creating mycontainer2
Starting mycontainer2

real	0m2,550s
user	0m0,071s
sys	0m0,034s
myusername@myusername-desktop:~$ time lxc launch ubuntu:18.04 mycontainer3
Creating mycontainer3
Starting mycontainer3

real	0m2,718s
user	0m0,069s
sys	0m0,036s
myusername@myusername-desktop:~$ 

After enabling shiftfs

myusername@myusername-desktop:~$ time lxc launch ubuntu:18.04 mycontainer
Creating mycontainer
Starting mycontainer

real	0m0,513s
user	0m0,070s
sys	0m0,041s
myusername@myusername-desktop:~$ time lxc launch ubuntu:18.04 mycontainer2
Creating mycontainer2
Starting mycontainer2

real	0m0,473s
user	0m0,066s
sys	0m0,031s
myusername@myusername-desktop:~$ time lxc launch ubuntu:18.04 mycontainer3
Creating mycontainer3
Starting mycontainer3

real	0m0,550s
user	0m0,072s
sys	0m0,027s
myusername@myusername-desktop:~$ 

This shows that the creation time went from about 2.5s down to about 500ms (SSD disk, using ZFS over a loop file).
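To put a single number on each set of three runs, the `real` values can be averaged. The sketch below assumes the `0mS,mmms` format shown above (note the decimal comma from the locale); `to_seconds` and `mean` are helper names made up for this example.

```shell
#!/bin/sh
# to_seconds VALUE: convert a `time` value such as "0m2,551s" to seconds.
to_seconds() {
    printf '%s\n' "$1" | LC_ALL=C awk '{
        gsub(",", ".")          # normalise the decimal comma
        sub("s$", "")           # drop the trailing unit
        split($0, parts, "m")   # parts[1] = minutes, parts[2] = seconds
        printf "%.3f\n", parts[1] * 60 + parts[2]
    }'
}

# mean VALUE...: arithmetic mean of several `time` values, in seconds.
mean() {
    for value in "$@"; do
        to_seconds "$value"
    done | LC_ALL=C awk '{ sum += $1 } END { printf "%.3f\n", sum / NR }'
}

mean 0m2,551s 0m2,550s 0m2,718s   # without shiftfs, prints 2.606
mean 0m0,513s 0m0,473s 0m0,550s   # with shiftfs, prints 0.512
```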

Using lxd-benchmark but no shiftfs

$ lxd.benchmark launch --count 10 --parallel 10 ubuntu:18.04
Test environment:
  Server backend: lxd
  Server version: 3.14
  Kernel: Linux
  Kernel architecture: x86_64
  Kernel version: 5.0.0-17-generic
  Storage backend: zfs
  Storage version: 0.7.12-1ubuntu5
  Container backend: lxc
  Container version: 3.1.0

Test variables:
  Container count: 10
  Container mode: unprivileged
  Startup mode: normal startup
  Image: ubuntu:18.04
  Batches: 1
  Batch size: 10
  Remainder: 0

[Jul  2 12:41:37.107] Found image in local store: 6ae1c6e92017402f1aee655fa8d785ee9d2337a3369d76115cecad5e7a303e07
[Jul  2 12:41:37.107] Batch processing start
[Jul  2 12:41:46.664] Processed 10 containers in 9.557s (1.046/s)
[Jul  2 12:41:46.664] Batch processing completed in 9.557s

Using lxd-benchmark and shiftfs

$ lxd.benchmark launch --count 10 --parallel 10 ubuntu:18.04
Test environment:
  Server backend: lxd
  Server version: 3.14
  Kernel: Linux
  Kernel architecture: x86_64
  Kernel version: 5.0.0-17-generic
  Storage backend: zfs
  Storage version: 0.7.12-1ubuntu5
  Container backend: lxc
  Container version: 3.1.0

Test variables:
  Container count: 10
  Container mode: unprivileged
  Startup mode: normal startup
  Image: ubuntu:18.04
  Batches: 1
  Batch size: 10
  Remainder: 0

[Jul  2 12:38:16.553] Found image in local store: 6ae1c6e92017402f1aee655fa8d785ee9d2337a3369d76115cecad5e7a303e07
[Jul  2 12:38:16.553] Batch processing start
[Jul  2 12:38:18.353] Processed 10 containers in 1.800s (5.557/s)
[Jul  2 12:38:18.353] Batch processing completed in 1.800s

This was launching 10 Ubuntu containers in parallel. The time spent went from about 9.5s down to 1.8s.
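Expressed as a ratio, using the two batch times from the logs above (`speedup` is just a helper name for this sketch):

```shell
#!/bin/sh
# speedup BEFORE AFTER: how many times faster AFTER is than BEFORE.
# LC_ALL=C keeps awk on a decimal point regardless of the system locale.
speedup() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", before / after }'
}

speedup 9.557 1.800   # batch of 10 containers, prints 5.31
```

So on this machine the parallel batch finished a little over five times faster with shiftfs enabled.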


The 5.0 kernel is available for Ubuntu 18.04 users as linux-generic-hwe-18.04-edge.

That edge kernel will then get promoted to just linux-generic-hwe-18.04 with the next point release of Ubuntu 18.04.