Error: Failed to create file, read-only file system

I’ve been googling this error for a while, and there are indeed some topics that describe it, but after reading all of them I am even more confused :expressionless:

When I try to start any of the three containers I have, I get the following error:

root@brix:/# lxc start kubemaster
Error: Failed to create file "/var/snap/lxd/common/lxd/virtual-machines/kubemaster/backup.yaml": open /var/snap/lxd/common/lxd/virtual-machines/kubemaster/backup.yaml: read-only file system
Try `lxc info --show-log kubemaster` for more info

The lxc info --show-log kubemaster output really gives me no clue about what is happening, but as someone pointed out in some of the other posts here, the problem might be a lack of free space, and indeed, the loop devices are 100% full:

root@brix:/# df -H
Filesystem      Size  Used Avail Use% Mounted on
udev            4.1G     0  4.1G   0% /dev
tmpfs           815M  1.4M  814M   1% /run
/dev/sda4       112G   27G   79G  26% /
tmpfs           4.1G     0  4.1G   0% /dev/shm
tmpfs           5.3M     0  5.3M   0% /run/lock
tmpfs           4.1G     0  4.1G   0% /sys/fs/cgroup
/dev/sda1       536M  8.3M  528M   2% /boot/efi
/dev/sda2       5.3G   22M  5.0G   1% /home
/dev/loop1       71M   71M     0 100% /snap/lxd/20326
/dev/loop2       74M   74M     0 100% /snap/lxd/19647
/dev/loop0       59M   59M     0 100% /snap/core18/2066
/dev/loop3       59M   59M     0 100% /snap/core18/2074
/dev/loop4       34M   34M     0 100% /snap/snapd/12398
/dev/loop5       34M   34M     0 100% /snap/snapd/12159
tmpfs           1.1M     0  1.1M   0% /var/snap/lxd/common/ns
tmpfs           815M     0  815M   0% /run/user/1000

lsblk gives the following output:

root@brix:/# lsblk
NAME                                                 MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0                                                  7:0    0  55.4M  1 loop /snap/core18/2066
loop1                                                  7:1    0  67.6M  1 loop /snap/lxd/20326
loop2                                                  7:2    0  70.4M  1 loop /snap/lxd/19647
loop3                                                  7:3    0  55.5M  1 loop /snap/core18/2074
loop4                                                  7:4    0  32.3M  1 loop /snap/snapd/12398
loop5                                                  7:5    0  32.3M  1 loop /snap/snapd/12159
loop6                                                  7:6    0  18.6G  0 loop
├─lvm_default-LXDThinPool_tmeta                      253:0    0     1G  0 lvm
│ └─lvm_default-LXDThinPool-tpool                    253:2    0  16.6G  0 lvm
│   ├─lvm_default-LXDThinPool                        253:3    0  16.6G  1 lvm
│   ├─lvm_default-virtual--machines_kubemaster.block 253:4    0  18.6G  0 lvm
│   ├─lvm_default-virtual--machines_kubemaster       253:5    0    96M  0 lvm
│   ├─lvm_default-virtual--machines_node1.block      253:6    0  37.3G  0 lvm
│   ├─lvm_default-virtual--machines_node2.block      253:7    0  37.3G  0 lvm
│   ├─lvm_default-virtual--machines_node1            253:8    0    96M  0 lvm
│   └─lvm_default-virtual--machines_node2            253:9    0    96M  0 lvm
└─lvm_default-LXDThinPool_tdata                      253:1    0  16.6G  0 lvm
  └─lvm_default-LXDThinPool-tpool                    253:2    0  16.6G  0 lvm
    ├─lvm_default-LXDThinPool                        253:3    0  16.6G  1 lvm
    ├─lvm_default-virtual--machines_kubemaster.block 253:4    0  18.6G  0 lvm
    ├─lvm_default-virtual--machines_kubemaster       253:5    0    96M  0 lvm
    ├─lvm_default-virtual--machines_node1.block      253:6    0  37.3G  0 lvm
    ├─lvm_default-virtual--machines_node2.block      253:7    0  37.3G  0 lvm
    ├─lvm_default-virtual--machines_node1            253:8    0    96M  0 lvm
    └─lvm_default-virtual--machines_node2            253:9    0    96M  0 lvm
sda                                                    8:0    0 119.2G  0 disk
├─sda1                                                 8:1    0   512M  0 part /boot/efi
├─sda2                                                 8:2    0     5G  0 part /home
├─sda3                                                 8:3    0     8G  0 part [SWAP]
└─sda4                                                 8:4    0 105.8G  0 part /

Questions:

Why are loop devices 1-5 so small, only 32-70 megabytes?
Is the space on loop6 dynamic?
If the space on loop6 is dynamic, why aren't loops 1-5 dynamic as well?
I currently have 2 snapshots for each container, so 6 snapshots altogether; is this problem related to the number of snapshots?
What is the best way to approach and solve this problem and get the containers started?

Can you show the output of sudo lvs and sudo vgs please?

Sure:

root@brix:/# lvs
  LV                                            VG          Attr       LSize   Pool        Origin                                        Data%  Meta%  Move Log Cpy%Sync Convert
  LXDThinPool                                   lvm_default twi-aotz--  16.62g                                                           98.83  2.52                            
  virtual-machines_kubemaster                   lvm_default Vwi-aotz-k  96.00m LXDThinPool                                               9.11                                   
  virtual-machines_kubemaster-kubemaster1       lvm_default Vri---tz-k  96.00m LXDThinPool virtual-machines_kubemaster                                                          
  virtual-machines_kubemaster-kubemaster1.block lvm_default Vri---tz-k <18.63g LXDThinPool                                                                                      
  virtual-machines_kubemaster-kubemaster2       lvm_default Vri---tz-k  96.00m LXDThinPool virtual-machines_kubemaster                                                          
  virtual-machines_kubemaster-kubemaster2.block lvm_default Vri---tz-k <18.63g LXDThinPool virtual-machines_kubemaster.block                                                    
  virtual-machines_kubemaster.block             lvm_default Vwi-aotz-k <18.63g LXDThinPool virtual-machines_kubemaster-kubemaster1.block 25.14                                  
  virtual-machines_node1                        lvm_default Vwi-aotz-k  96.00m LXDThinPool                                               9.05                                   
  virtual-machines_node1-node1                  lvm_default Vri---tz-k  96.00m LXDThinPool virtual-machines_node1                                                               
  virtual-machines_node1-node1--2               lvm_default Vri---tz-k  96.00m LXDThinPool virtual-machines_node1                                                               
  virtual-machines_node1-node1--2.block         lvm_default Vri---tz-k  37.25g LXDThinPool virtual-machines_node1.block                                                         
  virtual-machines_node1-node1.block            lvm_default Vri---tz-k  37.25g LXDThinPool                                                                                      
  virtual-machines_node1.block                  lvm_default Vwi-aotz-k  37.25g LXDThinPool virtual-machines_node1-node1.block            12.02                                  
  virtual-machines_node2                        lvm_default Vwi-aotz-k  96.00m LXDThinPool                                               8.98                                   
  virtual-machines_node2-node2                  lvm_default Vri---tz-k  96.00m LXDThinPool virtual-machines_node2                                                               
  virtual-machines_node2-node2--2               lvm_default Vri---tz-k  96.00m LXDThinPool virtual-machines_node2                                                               
  virtual-machines_node2-node2--2.block         lvm_default Vri---tz-k  37.25g LXDThinPool virtual-machines_node2.block                                                         
  virtual-machines_node2-node2.block            lvm_default Vri---tz-k  37.25g LXDThinPool                                                                                      
  virtual-machines_node2.block                  lvm_default Vwi-aotz-k  37.25g LXDThinPool virtual-machines_node2-node2.block            12.70                                  
root@brix:/# vgs
  VG          #PV #LV #SN Attr   VSize  VFree
  lvm_default   1  19   0 wz--n- 18.62g    0 
root@brix:/#

The loop devices are the LXD snap itself; they are read-only volumes, so it’s correct that they report 100% usage. It’d be a bug if they weren’t.

The unexpected read-only error could be coming from a kernel error. What’s the output of dmesg?
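
Roughly, something like the following should confirm the read-only squashfs snap mounts and pull the relevant kernel messages out of dmesg (the grep pattern is just an example of what to look for):

# Snap packages are mounted as read-only squashfs images, so 100% usage is expected
findmnt -t squashfs -o TARGET,SOURCE,FSTYPE,OPTIONS

# Scan the kernel log for I/O and filesystem errors (with human-readable timestamps)
sudo dmesg -T | grep -iE 'i/o error|read-only|jbd2|ext4'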

Disk failure confirmed.
Somehow I subconsciously decided to omit this fact, even though I had checked dmesg, because the disk is relatively new.

Hopefully I can (learn how to) export and import the VMs to the new disk.
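
For reference, the export/import flow I’m hoping to use looks roughly like this (untested on my side, and assuming the failing pool stays readable long enough to stream the data out):

# On the failing host: export each instance to a tarball
# (--instance-only would skip the snapshots, if this LXD version supports it)
lxc export kubemaster kubemaster.tar.gz
lxc export node1 node1.tar.gz
lxc export node2 node2.tar.gz

# On the machine with the new disk: import the tarballs back
lxc import kubemaster.tar.gz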

dmesg errors:

Jul  1 13:19:16 brix kernel: [66671.001588] Buffer I/O error on device dm-9, logical block 10241
Jul  1 13:19:22 brix kernel: [66676.868411] JBD2: Detected IO errors while flushing file data on dm-5-8
Jul  1 16:14:33 brix kernel: [ 9762.421975] Buffer I/O error on dev dm-5, logical block 539, lost async page write
Jul  2 15:44:49 brix kernel: [94377.289883] Buffer I/O error on dev dm-8, logical block 539, lost async page write

These dm devices are indeed related to the LXD containers:

root@brix:/# dmsetup deps -o devname /dev/dm-*
/dev/dm-0: 1 dependencies       : (loop6)
/dev/dm-1: 1 dependencies       : (loop6)
/dev/dm-2: 2 dependencies       : (lvm_default-LXDThinPool_tdata) (lvm_default-LXDThinPool_tmeta)
/dev/dm-3: 1 dependencies       : (lvm_default-LXDThinPool-tpool)
/dev/dm-4: 1 dependencies       : (lvm_default-LXDThinPool-tpool)
/dev/dm-5: 1 dependencies       : (lvm_default-LXDThinPool-tpool)
/dev/dm-6: 1 dependencies       : (lvm_default-LXDThinPool-tpool)
/dev/dm-7: 1 dependencies       : (lvm_default-LXDThinPool-tpool)
/dev/dm-8: 1 dependencies       : (lvm_default-LXDThinPool-tpool)
/dev/dm-9: 1 dependencies       : (lvm_default-LXDThinPool-tpool)
root@brix:/#
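
In case it helps someone else map the dm-N names, the /dev/mapper entries are just symlinks to the dm-N nodes, so listing them shows which LV each dm device belongs to:

# Map dm-N numbers back to LV names (the mapper entries are symlinks to ../dm-N)
ls -l /dev/mapper/

# Or ask device-mapper directly for the name to major:minor mapping
sudo dmsetup ls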

After the disk check I got another error:

root@brix:/# lxc start kubemaster
Error: virtiofsd failed to bind socket within 10s
Try `lxc info --show-log kubemaster` for more info
root@brix:/# lxc info --show-log kubemaster
Name: kubemaster
Location: none
Remote: unix://
Architecture: x86_64
Created: 2021/06/22 10:03 UTC
Status: Stopped
Type: virtual-machine
Profiles: default
Pid: 2802
Resources:
  Processes: 0
  Disk usage:
    root: 4.29GB
Snapshots:
  kubemaster1 (taken at 2021/06/24 18:48 UTC) (stateless)
  kubemaster2 (taken at 2021/06/30 18:26 UTC) (stateless)
Error: open /var/snap/lxd/common/lxd/logs/kubemaster/qemu.log: no such file or directory
root@brix:/#

Here is some additional information from SMART:

root@brix:/# smartctl -A /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-77-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 4
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   ---    Old_age   Always       -       9412
 12 Power_Cycle_Count       0x0032   100   100   ---    Old_age   Always       -       2046
170 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       0
171 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       0
172 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       0
173 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       24
174 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       117
178 Used_Rsvd_Blk_Cnt_Chip  0x0032   100   100   ---    Old_age   Always       -       0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   100   100   010    Pre-fail  Always       -       100
184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   ---    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   063   063   ---    Old_age   Always       -       37 (Min/Max 21/63)
199 UDMA_CRC_Error_Count    0x0032   100   100   ---    Old_age   Always       -       0
233 Media_Wearout_Indicator 0x0033   094   100   001    Pre-fail  Always       -       15725736
234 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       8014
241 Total_LBAs_Written      0x0030   253   253   ---    Old_age   Offline      -       5376
242 Total_LBAs_Read         0x0030   253   253   ---    Old_age   Offline      -       3945
249 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       2904
root@brix:/#
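
If anyone else ends up here with a similar suspicion, a long SMART self-test plus the full report should give a clearer picture (attribute meanings vary by vendor, so take the raw values with a grain of salt):

# Start a long (extended) self-test; results appear in the self-test log when it finishes
sudo smartctl -t long /dev/sda

# Full report: attributes, error log and self-test results
sudo smartctl -a /dev/sda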

Thank you for your support. Keep up the good work!