LVM loopback out of disk space

Hello, I’m running some containers with an LVM loopback storage backend. I noticed that when I use up all the disk space inside an LXC container, the whole LVM loopback gets corrupted in some way (“Bad message” errors and dm-setup issues in dmesg). I’m using LVM because of Docker (the ZFS storage backend does not seem to work with it), but I’m considering moving over to btrfs. Any suggestions on this topic? Thanks

Hi,

Can you provide the LXD version you are running, and the output of lxc storage list and lvs please.

Can you also provide the error messages you refer to as well please.

Inside the LXC container I get, for example:

rm: cannot remove './1424246857/p9040035.jpg': Bad message

on the LXD host:

lxd --version
3.0.3

lxc storage list and lvs can’t be executed anymore at this point, sorry

What happens when you run those commands, what error do you get?

Error: Get http://unix.socket/1.0: EOF

The thing is: if we reach the disk size limit, can this corrupt LVM? If so, it might be better to move to btrfs, which perhaps handles such cases better. Just an idea, thanks

You mentioned device-mapper errors, I’d like to see them. Also, the lvs command isn’t related to LXD, so it should run even if LXD won’t start. What error do you get when you run lvs?
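For reference, the LVM and device-mapper state can be inspected independently of LXD. A minimal diagnostic sketch (the volume group name "default" is taken from the error messages later in this thread; everything else is standard LVM2/util-linux tooling):

```shell
# Inspect LVM state without going through LXD:
sudo pvs          # physical volumes -- the loopback file should appear here
sudo vgs          # volume groups -- "default" is the pool's VG in this setup
sudo lvs          # logical volumes, one per container
sudo dmsetup ls   # active device-mapper targets backing those volumes
sudo losetup -a   # confirms the loop device backing the pool file is attached
```

If the loop device is gone (e.g. after a reboot, before LXD has reattached the pool image), vgs/lvs will report the volume group as not found, which matches the daemon error below.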

lxd.service - LXD - main daemon
   Loaded: loaded (/lib/systemd/system/lxd.service; indirect; vendor preset: enabled)
   Active: activating (start-post) (Result: exit-code) since Tue 2020-03-24 11:08:01 UTC; 3min 4s ago
     Docs: man:lxd(1)
  Process: 1943 ExecStart=/usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log (code=exited, status=1/FAILURE)
  Process: 1924 ExecStartPre=/usr/lib/x86_64-linux-gnu/lxc/lxc-apparmor-load (code=exited, status=0/SUCCESS)
 Main PID: 1943 (code=exited, status=1/FAILURE); Control PID: 1944 (lxd)
    Tasks: 6
   CGroup: /system.slice/lxd.service
           └─1944 /usr/lib/lxd/lxd waitready --timeout=600

Mar 24 11:08:01 lxd-01 systemd[1]: lxd.service: Service hold-off time over, scheduling restart.
Mar 24 11:08:01 lxd-01 systemd[1]: lxd.service: Scheduled restart job, restart counter is at 4.
Mar 24 11:08:01 lxd-01 systemd[1]: Stopped LXD - main daemon.
Mar 24 11:08:01 lxd-01 systemd[1]: Starting LXD - main daemon...
Mar 24 11:08:01 lxd-01 lxd[1943]: t=2020-03-24T11:08:01+0000 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored."
Mar 24 11:08:11 lxd-01 lxd[1943]: t=2020-03-24T11:08:11+0000 lvl=eror msg="Failed to start the daemon: could not activate volume group \"default\":   Volume group \"default\" not found\n  Ca
Mar 24 11:08:11 lxd-01 lxd[1943]: Error: could not activate volume group "default":   Volume group "default" not found
Mar 24 11:08:11 lxd-01 lxd[1943]:   Cannot process volume group default
Mar 24 11:08:11 lxd-01 systemd[1]: lxd.service: Main process exited, code=exited, status=1/FAILURE

lxd init also does not work anymore, so I purged LXD and am trying to start over from scratch with btrfs.

When I do head -c 10GB /dev/urandom > test.txt inside the container to fill up the whole disk space on btrfs, the rest of the system (including the LXD host) looks fine; I just get an out-of-disk-space notice inside the container, as it should be, and no dm-setup issues like with LVM. I still need to double-check this test with LVM as well.
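The steps above can be collected into a reproducer sketch. This assumes a loopback LVM pool and a container named c1 (both names are placeholders, not from the original report):

```shell
# Reproducer sketch: fill a container's rootfs on a loopback LVM pool.
lxc storage create lvm lvm                  # loop-file-backed LVM pool
lxc launch ubuntu:18.04 c1 -s lvm           # container on that pool

# Inside the container, write past the pool's available space
# (GNU head accepts size suffixes like 10GB):
lxc exec c1 -- sh -c 'head -c 10GB /dev/urandom > /root/test.txt'

# On the host, check the kernel log for dm / ext4 / journal I/O errors:
dmesg | grep -E 'dm-|EXT4-fs|JBD2'
```

Repeating the same fill step on a btrfs pool (lxc storage create btrfs btrfs) should only produce an out-of-space error inside the container, per the observation above.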

Yes if you can get a reproducer with steps then that would be great to diagnose and fix the issue. Thanks

this is with LVM:

Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.310882] Buffer I/O error on device dm-6, logical block 1390596
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.312657] Buffer I/O error on device dm-6, logical block 1390597
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.314489] Buffer I/O error on device dm-6, logical block 1390598
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.316356] Buffer I/O error on device dm-6, logical block 1390599
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.318215] Buffer I/O error on device dm-6, logical block 1390600
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.320366] Buffer I/O error on device dm-6, logical block 1390601
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.322347] EXT4-fs warning (device dm-6): ext4_end_bio:323: I/O error 3 writing to inode 403042 (offset 4823449600 size 8388608 starting block 1391600)
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.322677] EXT4-fs warning (device dm-6): ext4_end_bio:323: I/O error 3 writing to inode 403042 (offset 4840226816 size 6676480 starting block 1395696)
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.322829] EXT4-fs warning (device dm-6): ext4_end_bio:323: I/O error 3 writing to inode 403042 (offset 4840226816 size 6676480 starting block 1396208)
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.322968] EXT4-fs warning (device dm-6): ext4_end_bio:323: I/O error 3 writing to inode 403042 (offset 4840226816 size 6676480 starting block 1396304)
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.323001] EXT4-fs warning (device dm-6): ext4_end_bio:323: I/O error 3 writing to inode 403042 (offset 4823449600 size 8388608 starting block 1392112)
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.323183] EXT4-fs warning (device dm-6): ext4_end_bio:323: I/O error 3 writing to inode 403042 (offset 4823449600 size 8388608 starting block 1392624)
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.323380] EXT4-fs warning (device dm-6): ext4_end_bio:323: I/O error 3 writing to inode 403042 (offset 4831838208 size 8388608 starting block 1393136)
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.323565] EXT4-fs warning (device dm-6): ext4_end_bio:323: I/O error 3 writing to inode 403042 (offset 4831838208 size 8388608 starting block 1393648)
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.323724] EXT4-fs warning (device dm-6): ext4_end_bio:323: I/O error 3 writing to inode 403042 (offset 4831838208 size 8388608 starting block 1394160)
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.337864] JBD2: Detected IO errors while flushing file data on dm-6-8
Mar 24 11:49:30 floads-prod-lxd-01 kernel: [  398.355874] JBD2: Detected IO errors while flushing file data on dm-6-8

After a reboot the containers don’t come up again either.

So this is using a normal LVM storage pool on a loopback image, i.e. lxc storage create lvm lvm?
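For comparison, both pool types in question can be created as loopback-backed pools. A sketch (the size value is illustrative, not from the original thread):

```shell
# Loop-file-backed pools; first argument is the pool name, second the driver.
lxc storage create lvm lvm size=30GB       # LVM pool on a sparse loop file
lxc storage create btrfs btrfs size=30GB   # btrfs pool, as tested above

# Verify the pools and their drivers:
lxc storage list
```

With the LVM driver each container rootfs becomes its own logical volume, so filling one exhausts space in the shared thin pool; with btrfs the containers are subvolumes of a single filesystem, which surfaces ENOSPC to the writer instead of I/O errors at the block layer.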

Thanks
Tom

Exactly. btrfs does not seem to be affected by this issue.