Hi,
I am looking for advice on how to recover my system. btrfs-transacti uses 100% CPU, and the containers and lxc commands are unresponsive. I can access the host, although reboot and shutdown don’t work (they hang with a “soft lockup” error). I have experienced this issue in the past (the cause is a snapshot of a container running home-assistant), and waiting it out worked: after a few hours the process would finish, and I would be able to work normally again. This time, I have left the process running for 24 hours with no success.
The system is Ubuntu 17.04 with a mixed disk structure (an LVM RAID array for the OS, XFS RAID 1 for some media storage, and 2 disks formatted with Btrfs). This was set up while I was learning (quite a while ago), so I can’t confirm the storage setup is proper; the goal was to have the containers on a btrfs storage pool. My fstab mounts a UUID at /media/btrfs, which is where I think I pointed the storage pool.
I’ve hard reset the system. When it boots, btrfs-transacti starts again on its own. The lxc list command works and shows all the containers as stopped. If I try to start one, the command hangs, and I can no longer run any lxc commands.
Any thoughts on how to get my containers working again? Thanks in advance!
Some log info:
dmesg:
[ 13.856265] Btrfs loaded, crc32c=crc32c-generic
[ 27.140411] blk_update_request: I/O error, dev fd0, sector 0
[ 27.144318] floppy: error -5 while reading block 0
[ 27.355362] BTRFS: device fsid 416c708e-381b-45a9-85a3-f8461fb16e26 devid 2 transid 1329485 /dev/sdd
[ 27.360963] BTRFS: device fsid 416c708e-381b-45a9-85a3-f8461fb16e26 devid 1 transid 1329485 /dev/sdc
…
[ 40.790528] BTRFS info (device sdc): disk space caching is enabled
[ 40.790530] BTRFS info (device sdc): has skinny extents
…
[ 103.503035] BTRFS info (device sdc): The free space cache file (16135487488) is invalid. skip it
[ 504.322564] perf: interrupt took too long (2520 > 2500), lowering kernel.perf_event_max_sample_rate to 79250
[ 675.721644] perf: interrupt took too long (3151 > 3150), lowering kernel.perf_event_max_sample_rate to 63250
[ 972.851764] perf: interrupt took too long (3948 > 3938), lowering kernel.perf_event_max_sample_rate to 50500
[ 1593.358776] perf: interrupt took too long (4936 > 4935), lowering kernel.perf_event_max_sample_rate to 40500
ps fauxx
root 2194 0.0 0.0 0 0 ? S Apr10 0:01 \_ [btrfs-cleaner]
root 2195 98.9 0.0 0 0 ? R Apr10 647:10 \_ [btrfs-transacti]
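
For anyone who wants to run the same check on their own host, here is a minimal sketch that filters ps output for btrfs kernel threads above a CPU threshold. The function name busy_btrfs_threads, the 50% default threshold, and the assumed column order (user, PID, %CPU first, bracketed thread name somewhere on the line) are my own assumptions for illustration, not part of any tool.

```python
import re

def busy_btrfs_threads(ps_output, min_cpu=50.0):
    """Return (pid, cpu_percent, name) for btrfs kernel threads at or above min_cpu.

    Assumes `ps aux`-style columns: USER PID %CPU ... with the kernel
    thread name in square brackets somewhere on the line.
    """
    hits = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or truncated line
        match = re.search(r"\[(btrfs-[^\]]+)\]", line)
        if match is None:
            continue  # not a btrfs kernel thread
        try:
            pid = int(fields[1])
            cpu = float(fields[2])
        except ValueError:
            continue  # header line or unexpected layout
        if cpu >= min_cpu:
            hits.append((pid, cpu, match.group(1)))
    return hits

# The two lines from the ps output above, used as sample input.
sample = """\
root 2194 0.0 0.0 0 0 ? S Apr10 0:01 _ [btrfs-cleaner]
root 2195 98.9 0.0 0 0 ? R Apr10 647:10 _ [btrfs-transacti]
"""
print(busy_btrfs_threads(sample))
```

On the sample above, only the btrfs-transacti thread (PID 2195) passes the threshold; btrfs-cleaner is idle.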