A few weeks ago, I had lots of partially corrupted files on my pool. I explained everything in detail here:
I never found out how it happened in the first place, but after migrating all my containers to a new server, everything was fine for nearly 2 months.
Yesterday, I applied APT updates to all my containers and VMs, and then rebooted the hosts (without powering off the containers beforehand).
Everything seemed fine, but then a data import to Elasticsearch failed because of an Input/output error on PostgreSQL files…
And here we go again:
root@kokoro ~# zpool status -v
  pool: zpool-lxd
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 0h22m with 0 errors on Sun Apr 14 00:46:21 2019
config:

        NAME         STATE     READ WRITE CKSUM
        zpool-lxd    DEGRADED     0     0   173
          sda3       DEGRADED     0     0   346  too many errors

errors: Permanent errors have been detected in the following files:

        /var/snap/lxd/common/lxd/storage-pools/zpool-lxd/containers/elasticsearch/rootfs/usr/share/kibana/src/legacy/core_plugins/kibana/public/discover/components/field_chooser/lib/detail_views/string.html
        /var/snap/lxd/common/lxd/storage-pools/zpool-lxd/containers/elasticsearch/rootfs/usr/share/kibana/node_modules/@babel/core/node_modules/lodash/xorWith.js
        /var/snap/lxd/common/lxd/storage-pools/zpool-lxd/containers/elasticsearch/rootfs/usr/share/kibana/node_modules/graphql-extensions/node_modules/core-js/modules/_object-dp.js
        /var/snap/lxd/common/lxd/storage-pools/zpool-lxd/containers/postgresql/rootfs/var/lib/postgresql/11/main/base/49368/73649
        /var/snap/lxd/common/lxd/storage-pools/zpool-lxd/containers/postgresql/rootfs/var/lib/postgresql/11/main/base/49368/49505
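To see at a glance which containers are affected, the error list can be grouped by container name. This is only a rough sketch, not an official LXD or ZFS tool: the function name `corrupted_files_by_container` is made up for illustration, and it assumes the paths follow the LXD snap layout seen in the output above (`.../containers/<name>/rootfs/<path>`) and contain no spaces.

```python
import re
from collections import defaultdict

# Assumption: every affected path looks like
#   .../containers/<container-name>/rootfs/<path-inside-container>
# as in the output above. Paths containing whitespace are not handled.
CONTAINER_PATH = re.compile(
    r"/\S*?/containers/(?P<name>[^/\s]+)/rootfs(?P<inner>/\S+)"
)

MARKER = "Permanent errors have been detected in the following files:"


def corrupted_files_by_container(status_text):
    """Group the files listed after the 'Permanent errors' line by container.

    `status_text` is the text printed by `zpool status -v`. Returns a dict
    mapping container name to the list of paths inside that container.
    """
    _, found, tail = status_text.partition(MARKER)
    if not found:
        # No permanent-error section in this output.
        return {}
    grouped = defaultdict(list)
    for match in CONTAINER_PATH.finditer(tail):
        grouped[match.group("name")].append(match.group("inner"))
    return dict(grouped)
```

Fed the output above, this should give one entry for `elasticsearch` (three Kibana files) and one for `postgresql` (two files under `base/49368`), which makes it easier to decide what to reinstall and what to restore from backup.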
I assume this is related to the reboot, but I’m not sure.
This is extremely worrying, and if it’s not my fault then there is a serious issue with LXD or ZFS… I don’t know what I’m going to do. I guess I’ll end up using the
I don’t expect to recover the files (they probably only have a few corrupted blocks), but I really don’t want this to happen again, so I’ll take any help on this.