I have a pair of LXD servers set up with a number of containers on server A which refresh to server B every night using:
lxc copy --refresh container serverb:container --storage=default --mode=relay
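A nightly job of this kind can be sketched roughly as below. This is an illustration only, not the exact script in use: the function name refresh_all and the LXC and REMOTE variables are assumptions made for the example, and only the lxc copy flags are taken from the command above.

```shell
#!/bin/sh
# Hypothetical nightly refresh loop. LXC, REMOTE and refresh_all are
# illustrative names, not part of LXD itself.
LXC="${LXC:-lxc}"            # allows substituting a stub for dry runs
REMOTE="${REMOTE:-serverb}"  # name of the destination remote

refresh_all() {
    failed=""
    for name in "$@"; do
        # Same flags as the nightly command above.
        if ! "$LXC" copy --refresh "$name" "$REMOTE:$name" --storage=default --mode=relay; then
            failed="$failed $name"
        fi
    done
    # Report which containers failed and signal it through the exit status.
    if [ -n "$failed" ]; then
        echo "failed:$failed" >&2
        return 1
    fi
    return 0
}
```

Calling it as `refresh_all gogs librenms media` keeps going after a failed copy, so one bad send does not hide the others, and the list of failures ends up on stderr.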
They have been functioning well for a considerable time (the lxd team has helped in the past with previous problems - thanks!).
Over the last few days (I’m guessing coinciding with the release of 5.7?) I’ve suddenly started getting what appear to be random send failures. There doesn’t seem to be any pattern to which containers get hit.
Here are a few from last night…
Error: Error transferring instance data: Failed sending volume gogs/2022-10-27-daily:/: Btrfs send failed: [signal: killed write unix /var/snap/lxd/common/lxd/unix.socket->@: write: broken pipe] (At subvol /var/snap/lxd/common/lxd/storage-pools/srv/containers-snapshots/gogs/2022-10-27-daily
Error: Error transferring instance data: Failed sending volume librenms/2022-10-26-daily:/: Btrfs send failed: [signal: killed write unix /var/snap/lxd/common/lxd/unix.socket->@: write: broken pipe] (At subvol /var/snap/lxd/common/lxd/storage-pools/srv/containers-snapshots/librenms/2022-10-26-daily
Error: Error transferring instance data: Failed sending volume media/2022-10-27-daily:/: Btrfs send failed: [signal: killed write unix /var/snap/lxd/common/lxd/unix.socket->@: write: broken pipe] (At subvol /var/snap/lxd/common/lxd/storage-pools/srv/containers-snapshots/media/2022-10-27-daily
Error: Error transferring instance data: Failed sending volume unifi/2022-10-26-daily:/: Btrfs send failed: [signal: killed write unix /var/snap/lxd/common/lxd/unix.socket->@: write: broken pipe] (At subvol /var/snap/lxd/common/lxd/storage-pools/srv/containers-snapshots/unifi/2022-10-26-daily
Error: Error transferring instance data: Failed sending volume wireguard/2022-10-27-daily:/: Btrfs send failed: [signal: killed write unix /var/snap/lxd/common/lxd/unix.socket->@: write: broken pipe] (At subvol /var/snap/lxd/common/lxd/storage-pools/srv/containers-snapshots/wireguard/2022-10-27-daily
Those were mostly from the snapshots, but I’m also getting the same errors on the containers themselves.
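To look for a pattern in which volumes fail, the volume names can be pulled out of the error output. The helper below is a small sketch; the name failed_volumes is made up for this example, and it assumes the errors follow the "Failed sending volume NAME:/:" shape shown above.

```shell
# failed_volumes: read error lines on stdin and print the volume name
# (e.g. "gogs/2022-10-27-daily") from each "Failed sending volume" line.
# Lines that do not match are ignored.
failed_volumes() {
    sed -n 's/.*Failed sending volume \([^:]*\):.*/\1/p'
}
```

Piping saved error output through `failed_volumes | sort | uniq -c` would then show how often each volume fails across several nights.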
Just to check it wasn’t some form of random corruption I removed one of the containers completely from the destination server…
lxc delete serverb:jitsi
Stopped it locally and attempted to copy it across fresh…
lxc stop jitsi
lxc copy jitsi serverb:jitsi --storage=default --mode=relay
It ran for a while then…
Error: Error transferring instance data: Failed sending volume jitsi/2022-09-01-monthly:/: Btrfs send failed: [signal: killed write unix /var/snap/lxd/common/lxd/unix.socket->@: write: broken pipe] (At subvol /var/snap/lxd/common/lxd/storage-pools/srv/containers-snapshots/jitsi/2022-09-01-monthly
)
The log for every container I’ve checked so far shows the same entries…
lxc info jitsi --show-log
.
.
Log:
lxc jitsi 20221027163309.418 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc jitsi 20221027163309.418 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc jitsi 20221027163309.420 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc jitsi 20221027163309.420 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc jitsi 20221027163309.421 WARN cgfsng - ../src/src/lxc/cgroups/cgfsng.c:fchowmodat:1611 - No such file or directory - Failed to fchownat(40, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
Fetching /var/snap/lxd/common/lxd/logs/lxd.log shows lots of instances of the following…
time="2022-10-27T16:05:52+01:00" level=error msg="Failed extracting TCP connection from remote connection" err="Connection is not a net.TCPConn"
and
time="2022-10-27T16:06:26+01:00" level=error msg="Migration failed on source" err="Failed sending volume jitsi/2022-09-01-monthly:/: Btrfs send failed: [signal: killed write unix /var/snap/lxd/common/lxd/unix.socket->@: write: broken pipe] (At subvol /var/snap/lxd/common/lxd/storage-pools/srv/containers-snapshots/jitsi/2022-09-01-monthly\n)" instance=jitsi project="{{map[features.images:true features.networks:true features.profiles:true features.storage.buckets:true features.storage.volumes:true] Default LXD project} default []}"
Both machines are Ubuntu 20.04.5 LTS running LXD 5.7 on top of Btrfs.
Any help would be appreciated…