Restarting a CentOS6 container makes host filesystems read-only

Related to:

OS: Debian 10
Problem:

# lxc-create -t download -n test-1 -B loop --fssize 2G --fstype ext4 -- -d centos -r 6 -a amd64 ; lxc-start test-1 ; lxc-stop test-1 ; lxc-start test-1

lxc-start: test-1: lxccontainer.c: wait_on_daemonized_start: 842 Received container state "ABORTING" instead of "RUNNING"
lxc-start: test-1: tools/lxc_start.c: main: 330 The container failed to start
lxc-start: test-1: tools/lxc_start.c: main: 333 To get more details, run the container in foreground mode
lxc-start: test-1: tools/lxc_start.c: main: 336 Additional information can be obtained by setting the --logfile and --logpriority options

All of this happens because devpts on the host becomes read-only. You can fix it by running mount -t devpts -o remount,gid=5,mode=620 devpts /dev/pts after each container stop, but that is a dirty workaround. I know the Proxmox project has somehow “fixed” this: there you can start and stop containers as many times as you want without remounting anything.
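The workaround can be made a little less blunt by remounting only when devpts has actually gone read-only. This is a minimal sketch, not a real fix: the function names devpts_is_ro and remount_devpts are made up for illustration, and the check assumes the usual /proc/self/mounts layout (mount point in field 2, options in field 4). The mount command itself is the one quoted above.

```shell
# Hypothetical helper: succeed if /dev/pts is mounted read-only.
# $1 is a mounts table to inspect (defaults to /proc/self/mounts).
devpts_is_ro() {
    # Later entries shadow earlier ones, so keep the last /dev/pts line.
    awk '$2 == "/dev/pts" { opts = $4 }
         END { exit ((("," opts ",") ~ /,ro,/) ? 0 : 1) }' "${1:-/proc/self/mounts}"
}

# Hypothetical helper: apply the remount from the post only when needed.
remount_devpts() {
    if devpts_is_ro; then
        mount -t devpts -o remount,gid=5,mode=620 devpts /dev/pts
    fi
}

# Usage (as root, after stopping a container):
#   lxc-stop test-1; remount_devpts
```

Nothing is mounted when the block is merely sourced; the remount only happens when remount_devpts is called.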
As far as I can tell, Proxmox uses AppArmor to prevent the remount of devpts. In the logs I see:

Aug  4 23:33:56 test kernel: [11149.862155] audit: type=1400 audit(1596540836.368:77): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-101_</var/lib/lxc>" name="/dev/" pid=21831 comm="mount" flags="ro, remount"
Aug  4 23:33:56 test kernel: [11149.865306] audit: type=1400 audit(1596540836.368:78): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-101_</var/lib/lxc>" name="/proc/sys/net/" pid=21833 comm="mount" flags="ro, remount"
Aug  4 23:33:56 test kernel: [11149.867154] audit: type=1400 audit(1596540836.372:79): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-101_</var/lib/lxc>" name="/proc/sys/" pid=21834 comm="mount" flags="ro, remount"
Aug  4 23:33:56 test kernel: [11149.868787] audit: type=1400 audit(1596540836.372:80): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-101_</var/lib/lxc>" name="/proc/sysrq-trigger" pid=21835 comm="mount" flags="ro, remount"
Aug  4 23:33:56 test kernel: [11149.874872] audit: type=1400 audit(1596540836.380:81): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-101_</var/lib/lxc>" name="/sys/devices/virtual/net/" pid=21838 comm="mount" flags="ro, remount"
Aug  4 23:33:56 test kernel: [11149.876520] audit: type=1400 audit(1596540836.380:82): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-101_</var/lib/lxc>" name="/sys/devices/virtual/net/" pid=21839 comm="mount" flags="ro, remount"
Aug  4 23:33:56 test kernel: [11149.878365] audit: type=1400 audit(1596540836.384:83): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-101_</var/lib/lxc>" name="/proc/cpuinfo" pid=21840 comm="mount" flags="ro, remount"
Aug  4 23:33:56 test kernel: [11149.879960] audit: type=1400 audit(1596540836.384:84): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-101_</var/lib/lxc>" name="/proc/diskstats" pid=21841 comm="mount" flags="ro, remount"
Aug  4 23:33:56 test kernel: [11149.881547] audit: type=1400 audit(1596540836.384:85): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-101_</var/lib/lxc>" name="/proc/loadavg" pid=21842 comm="mount" flags="ro, remount"
Aug  4 23:33:56 test kernel: [11149.883390] audit: type=1400 audit(1596540836.388:86): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-101_</var/lib/lxc>" name="/proc/meminfo" pid=21843 comm="mount" flags="ro, remount"
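To see at a glance which mounts a container profile was denied, lines like the ones above can be reduced to profile, target and flags. A small sketch, assuming the audit messages keep exactly the field order shown here; the function name denied_mounts is hypothetical.

```shell
# Hypothetical helper: read kernel log lines on stdin and print
# "<profile> <target> <flags>" for every denied AppArmor mount operation.
denied_mounts() {
    sed -n 's/.*apparmor="DENIED" operation="mount".* profile="\([^"]*\)" name="\([^"]*\)".* flags="\([^"]*\)".*/\1 \2 \3/p'
}

# Usage: dmesg | denied_mounts
```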

Unfortunately, I have no idea how to configure AppArmor. Does anyone know how to avoid this behavior, or how to set up AppArmor to prevent it?

To prevent this behavior, we should use AppArmor to forbid mounting /proc and /sys. Here is how it is done on Proxmox:

--- /etc/apparmor.d/abstractions/lxc/container-base	2019-04-14 16:46:47.000000000 +0300
+++ /etc/apparmor.d/abstractions/lxc/container-base	2020-08-13 14:49:17.025216151 +0300
@@ -82,7 +82,6 @@
   deny mount fstype=debugfs -> /var/lib/ureadahead/debugfs/,
   mount fstype=proc -> /proc/,
   mount fstype=sysfs -> /sys/,
-  mount options=(rw, nosuid, nodev, noexec, remount) -> /sys/,
   deny /sys/firmware/efi/efivars/** rwklx,
   deny /sys/kernel/security/** rwklx,
   mount options=(move) /sys/fs/cgroup/cgmanager/ -> /sys/fs/cgroup/cgmanager.lower/,
@@ -91,6 +90,11 @@
   # deny reads from debugfs
   deny /sys/kernel/debug/{,**} rwklx,
 
+  # prevent rw mounting of /sys, because that allows changing its global permissions
+  deny mount -> /proc/,
+  deny mount -> /sys/,
+#  mount options=(rw, nosuid, nodev, noexec, remount) -> /sys/,
+

Then you must (!) clear all the AppArmor caches, either with apparmor_parser --purge-cache or, as I did, with rm -rf /var/cache/lxc/apparmor/* /var/cache/apparmor/*, because restarting or reloading AppArmor (even rebooting the server) for some reason does not do it. I think this is why I lost a week of my life :’( After that, reload the rules with systemctl reload apparmor.service. I also added apparmor.raw entries to the container configs:

lxc.apparmor.raw = deny mount -> /proc/,
lxc.apparmor.raw = deny mount -> /sys/,

This is probably optional, though.
Also remove lxc.apparmor.profile = generated everywhere, OR keep it but then remove lxc.apparmor.allow_nesting = 1: used together, these two options somehow make AppArmor useless…
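Since that combination is easy to leave behind in an old config, a quick check can flag it. A minimal sketch: the function name is hypothetical, and the example path in the usage comment assumes the test-1 container lives under /var/lib/lxc.

```shell
# Hypothetical helper: succeed if the config file sets both
# "lxc.apparmor.profile = generated" and "lxc.apparmor.allow_nesting = 1".
has_conflicting_apparmor_opts() {
    grep -Eq '^[[:space:]]*lxc\.apparmor\.profile[[:space:]]*=[[:space:]]*generated[[:space:]]*$' "$1" &&
    grep -Eq '^[[:space:]]*lxc\.apparmor\.allow_nesting[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}

# Usage (path is an assumption):
#   has_conflicting_apparmor_opts /var/lib/lxc/test-1/config && echo "remove one of the two options"
```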
After all that, you can restart containers as many times as you want without remounting anything.
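For reference, the cache-clearing and reload steps described above can be collected in one place. This is only a sketch of what worked in this post: the function names and the DRY_RUN switch are made up for illustration, and the cache paths are the ones given in the post.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: drop cached profiles, then reload AppArmor (needs root).
refresh_apparmor() {
    # Cache paths as given in the post; sh -c keeps the globs unexpanded until run.
    run sh -c 'rm -rf /var/cache/lxc/apparmor/* /var/cache/apparmor/*' || return 1
    run systemctl reload apparmor.service
}

# Usage: DRY_RUN=1 refresh_apparmor   # preview the commands
#        refresh_apparmor             # apply them
```

Previewing with DRY_RUN=1 first makes it easy to confirm which paths will be removed before anything is deleted.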