Suddenly, after 7 days of uptime, I started getting “Error: mkdir /var/snap/lxd/common/lxd/shmounts/test: read-only file system” when starting a container.
~$ lxc launch c8 test --target lxd11
Creating test
Starting test
Error: mkdir /var/snap/lxd/common/lxd/shmounts/test: read-only file system
Try `lxc info --show-log lxd:test` for more info
The container log is empty, which is kinda logical, since it never got as far as setting the container up:
~$ lxc info --show-log lxd:test
Name: test
Status: STOPPED
Type: container
Architecture: x86_64
Location: lxd11
Created: 2023/02/09 16:00 UTC
Log:
The LXD log is empty too (i.e. nothing in there that hasn't been happening regularly for months). I didn’t have lxcfs.debug or daemon.debug set to true at the time, though; I do have them enabled now.
So I had a look around and, indeed, something has remounted /var/snap/lxd/common/lxd/shmounts and lxcfs read-only:
[root@lxd11 ~]# nsenter -a -t $(pgrep daemon.start)
-bash-5.0# mount | fgrep shmounts
tmpfs on /var/snap/lxd/common/shmounts type tmpfs (rw,relatime,size=1024k,mode=711)
lxcfs on /var/snap/lxd/common/shmounts/lxcfs type fuse.lxcfs (ro,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
tmpfs on /var/snap/lxd/common/shmounts/instances type tmpfs (ro,relatime,size=100k,mode=711)
If I remount them rw, all is good (as far as I can tell so far):
-bash-5.0# mount /var/snap/lxd/common/shmounts/lxcfs -oremount,rw
-bash-5.0# mount /var/snap/lxd/common/shmounts/instances -oremount,rw
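To spot the read-only state without eyeballing the mount output, something like the following could be run periodically inside the LXD mount namespace (e.g. after the nsenter above). This is only a rough sketch: the function name ro_shmounts is made up for illustration, and it assumes the mount table is in /proc/mounts format (device, mount point, type, options), defaulting to /proc/self/mounts.

```shell
#!/bin/sh
# ro_shmounts (hypothetical helper): print every mount point whose path
# contains "shmounts" and whose option list includes "ro".
# Argument 1 (optional): a file in /proc/mounts format; defaults to
# /proc/self/mounts, i.e. the mount table of the current mount namespace.
ro_shmounts() {
    awk '$2 ~ /shmounts/ {
        # Field 4 is the comma-separated option list, e.g. "ro,relatime".
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] == "ro") {
                print $2
                break
            }
        }
    }' "${1:-/proc/self/mounts}"
}
```

Calling ro_shmounts with no arguments prints nothing while everything is still rw, so any output (logged together with a timestamp) would narrow down when the remount happens.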
This has happened twice already, on two different servers from the same cluster.
Any ideas how to debug this further?
LXD is the latest, 5.10-b392610.