Ceph rbd exclusive-lock

joostn · November 23, 2021, 9:45am

Hi,

First of all thanks for creating lxd, it’s awesome!!

I have set up a 3-node lxd cluster with a ceph storage backend. I’ve been trying to get automatic failover by running a simple script that checks ‘lxc cluster ls’. Any containers on a stopped host will be in Error state, and my script then just does lxc move && lxc start.
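For readers who want to picture the approach, here is a minimal sketch of such a script. It is not the actual script from this post. The function names (`plan_failover`, `run_failover`) and the round-robin choice of target are made up for illustration, and it assumes the JSON output of `lxc cluster list` and `lxc list` carries the `server_name`, `status`, `name` and `location` fields. It does no fencing at all.

```python
import json
import subprocess


def plan_failover(members, instances):
    """Pair each instance stranded on an offline member with an online member.

    `members` and `instances` are the parsed JSON lists from
    `lxc cluster list --format json` and `lxc list --format json`.
    Returns a list of (instance_name, target_member) tuples.
    """
    online = sorted(m["server_name"] for m in members if m.get("status") == "Online")
    offline = {m["server_name"] for m in members if m.get("status") == "Offline"}
    if not online:
        return []
    stranded = [
        i["name"]
        for i in instances
        if i.get("status") == "Error" and i.get("location") in offline
    ]
    # Spread the stranded instances over the online members (illustrative policy).
    return [(name, online[k % len(online)]) for k, name in enumerate(stranded)]


def lxc_json(*args):
    """Run an lxc subcommand with JSON output and return the parsed result."""
    result = subprocess.run(
        ["lxc", *args, "--format", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def run_failover():
    """One pass: move and start every instance stuck on an offline member."""
    members = lxc_json("cluster", "list")
    instances = lxc_json("list")
    for name, target in plan_failover(members, instances):
        # Equivalent of: lxc move <name> --target <member> && lxc start <name>
        subprocess.run(["lxc", "move", name, "--target", target], check=True)
        subprocess.run(["lxc", "start", name], check=True)
```

Calling `run_failover()` periodically (for example from cron) would reproduce the behaviour described above, including its weakness: a member that only looks offline to the cluster is not stopped from writing.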

This mostly works, but I’ve ended up with a corrupted filesystem in several containers. I’ve not been able to reproduce the problem, but I can imagine this happens when the lxd cluster communication fails while the individual nodes can still access Ceph. Without proper fencing, two nodes may write to the same rbd.

I’m new to ceph, but I noticed it has an exclusive-locks feature:
https://docs.ceph.com/en/latest/rbd/rbd-exclusive-locks/

but lxd doesn’t seem to enable it on the container’s rbd image:

root@miles:~# rbd info lxdpool/container_fileserver
    rbd image 'container_fileserver':
            size 19 GiB in 4769 objects
            order 22 (4 MiB objects)
            snapshot_count: 21
            id: 15452f1b586be
            block_name_prefix: rbd_data.15452f1b586be
            format: 2
            features: layering
            op_features:
            flags:
            create_timestamp: Tue Nov  2 11:46:36 2021
            access_timestamp: Tue Nov  2 11:46:36 2021
            modify_timestamp: Tue Nov  2 11:46:36 2021
            parent: lxdpool/zombie_image_7e68080daefdc36d8d7448a29f37bacd9f933f5c99b8556138796ecd38e7f91c_ext4@readonly
            overlap: 9.3 GiB

I can enable it manually:
rbd feature enable lxdpool/container_fileserver exclusive-lock
and this doesn’t seem to affect lxd.
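To check which images already have the feature, the plain-text `rbd info` output shown above can be parsed. This is only a sketch: the helper names (`rbd_features`, `enable_exclusive_lock_cmd`) are hypothetical, and it assumes the `features:` line is a comma-separated list as in the output above.

```python
def rbd_features(info_text):
    """Return the feature names listed on the `features:` line of `rbd info` output."""
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Match the exact key so that `op_features:` is not picked up.
        if sep and key == "features":
            return [f.strip() for f in value.split(",") if f.strip()]
    return []


def enable_exclusive_lock_cmd(image):
    """Build the command used above to enable exclusive-lock on one image."""
    return ["rbd", "feature", "enable", image, "exclusive-lock"]


sample = """rbd image 'container_fileserver':
        size 19 GiB in 4769 objects
        format: 2
        features: layering
        op_features:
        flags:
"""

if "exclusive-lock" not in rbd_features(sample):
    print(" ".join(enable_exclusive_lock_cmd("lxdpool/container_fileserver")))
```

The command list can be passed to `subprocess.run` once the output looks right. Whether the feature is enough to prevent two nodes from writing is exactly the open question here.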

Would this prevent potential problems? Shouldn’t lxd enable the exclusive-lock feature on new containers?

Thanks
Joost

joostn · November 23, 2021, 9:58am

BTW your outgoing mail server doesn’t have PTR records, which may cause forum notifications to bounce:

Nov 22 13:38:15 incomingmail postfix/smtpd[718970]: NOQUEUE: reject: RCPT from unknown[2602:fc62:a:1003:216:3eff:fea3:3fe]: 450 4.7.25 Client host rejected: cannot find your hostname, [2602:fc62:a:1003:216:3eff:fea3:3fe]; from=noreply@discuss.linuxcontainers.org to=joost@xxxxx proto=ESMTP helo=<postfix03.core.dcmtl.stgraber.net>

I’ve whitelisted it, but you might want to know!

tomp · November 23, 2021, 12:28pm

@stgraber any thoughts on both the ceph and PTR issue?

stgraber · November 23, 2021, 3:00pm

Ah yeah, I was rolling out the new LXD DNS feature to handle those records but hit a snag. Just put manual records back in place for now, should be live in the next hour.