RBD issue when issuing LXC move

This is a 3-node LXD cluster installed via snap, version 3.8. In this case, the prometh container has been stopped and I’m just trying to rename it. I made a copy of prometh called prometh-tmp and was able to move it to another node with no issues. Any guidance to clear this up?

root@lxd2-a:/home/choyle# lxc move prometh prometh-bak
Error: Failed to run: rbd --id admin --cluster ceph --pool rbd unmap container_prometh: rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy
root@lxd2-a:/home/choyle#

Also tried

root@lxd2-a:/dev/rbd/rbd# rbd unmap container_prometh
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy

This suggests the rbd device is still active, most likely because of a mount of some kind.

If you can figure out the rbd device number, you could then grep for it in /proc/*/mountinfo, which would tell you which processes are keeping the mount active and preventing it from getting unmapped.
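For anyone who wants to script that check, here is a minimal sketch, assuming a POSIX shell with awk available. The function name holders_of and the PROC_ROOT override are hypothetical, made up for illustration; they are not part of LXD or Ceph. It compares the device path exactly against the mount-source field, so /dev/rbd2 does not also match /dev/rbd20.

```shell
# Print the PIDs whose mountinfo lists the given device as a mount source.
# PROC_ROOT is a hypothetical override for testing; it defaults to /proc.
holders_of() {
    dev="$1"
    root="${PROC_ROOT:-/proc}"
    for f in "$root"/[0-9]*/mountinfo; do
        [ -r "$f" ] || continue
        # After the "-" separator come: fstype, mount source, super options.
        if awk -v d="$dev" '
            { for (i = 1; i < NF; i++) if ($i == "-" && $(i + 2) == d) found = 1 }
            END { exit !found }
        ' "$f" 2>/dev/null; then
            pid="${f%/mountinfo}"
            echo "${pid##*/}"
        fi
    done
}
```

Usage would look like "holders_of /dev/rbd2", run as root so that every process's mountinfo is readable.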

I used “rbd showmapped” to get the rbd#, so I grepped

root@lxd2-a:/dev/rbd/rbd# grep rbd2 /proc/*/mountinfo
/proc/2256841/mountinfo:187 713 250:32 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/prometh rw,relatime - ext4 /dev/rbd2 rw,discard,stripe=1024,data=ordered
/proc/2257062/mountinfo:187 713 250:32 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/prometh rw,relatime - ext4 /dev/rbd2 rw,discard,stripe=1024,data=ordered
/proc/2257748/mountinfo:187 713 250:32 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/prometh rw,relatime - ext4 /dev/rbd2 rw,discard,stripe=1024,data=ordered
/proc/2798897/mountinfo:476 812 250:32 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/prometh rw,relatime - ext4 /dev/rbd2 rw,discard,stripe=1024,data=ordered
/proc/3874784/mountinfo:476 812 250:32 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/prometh rw,relatime - ext4 /dev/rbd2 rw,discard,stripe=1024,data=ordered

Any guidance on which process to kill without impacting something else?

Unlikely to be a process, more likely to just be a mount namespace reference, so doing something like:

  • nsenter -t 2256841 -m -- umount /var/snap/lxd/common/lxd/storage-pools/remote/containers/prometh

Should succeed, and running your grep again should return nothing, letting you unmap the rbd device.
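Since the grep output lists several PIDs, the same umount may need repeating for more than one of them. A minimal sketch of that, assuming a POSIX shell: print_umount_cmds is a hypothetical helper name, and it only prints the nsenter commands so they can be reviewed before anything is run as root. It also assumes the mount path contains no whitespace.

```shell
# Read PIDs on stdin and print one nsenter/umount command per PID.
# Nothing is executed here; the output is meant to be reviewed first.
print_umount_cmds() {
    mnt="$1"    # mount point to release, assumed to contain no whitespace
    while read -r pid; do
        [ -n "$pid" ] || continue    # skip blank lines
        printf 'nsenter -t %s -m -- umount %s\n' "$pid" "$mnt"
    done
}
```

One way to feed it would be: grep -l ' /dev/rbd2 ' /proc/[0-9]*/mountinfo | cut -d/ -f3 | print_umount_cmds /var/snap/lxd/common/lxd/storage-pools/remote/containers/prometh. PIDs that share a mount namespace will likely report that the path is not mounted once the first umount in that namespace has succeeded, which should be harmless.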

I pushed a change to the LXD snap last week which should help prevent such issues in the future, at least for the cases that we fully understand now.

Sorry for the necro but I’m having the exact same issue.

# grep "rbd1 rw" /proc/*/mountinfo 2>/dev/null
/proc/3517630/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3517655/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3517692/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3537155/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3537175/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3554782/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3555055/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3555152/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3555169/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3556119/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3556346/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3556408/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3556722/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3556809/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3556919/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16
/proc/3556973/mountinfo:360 348 252:16 / /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us rw,relatime shared:76 - ext4 /dev/rbd1 rw,discard,stripe=16

Sadly for me the nsenter command does not appear to be doing anything.

# nsenter -t 3517630 -m - umount /var/snap/lxd/common/lxd/storage-pools/remote/containers/among-us
nsenter: failed to execute -: No such file or directory

Any other way to solve this?

Remove the lone dash between -m and umount (or use a double dash, --, instead). nsenter is treating the single dash as the command to run.

Thanks that did the trick!

Is there anything you know of that could trigger this? I seem to hit it consistently, so perhaps there is something I should not be doing.