LXD/CEPH integration fails to provide a valid IP address

I have an LXD cluster up and running, with containers that get a proper IP address from DHCP. They are empty test containers, each created with a command like:

lxc launch ubuntu:j c2204-14

root@node14:~# lxc cluster list
+--------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
|  NAME  |          URL           |      ROLES       | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATE  |      MESSAGE      |
+--------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| node14 | https://10.3.4.14:9443 |                  | x86_64       | default        |             | ONLINE | Fully operational |
+--------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| node15 | https://10.3.4.15:9443 | database-leader  | x86_64       | default        |             | ONLINE | Fully operational |
|        |                        | database         |              |                |             |        |                   |
+--------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| node16 | https://10.3.4.16:9443 | database-standby | x86_64       | default        |             | ONLINE | Fully operational |
+--------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| node17 | https://10.3.4.17:9443 | database         | x86_64       | default        |             | ONLINE | Fully operational |
+--------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| node18 | https://10.3.4.18:9443 | database         | x86_64       | default        |             | ONLINE | Fully operational |
+--------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+
| node19 | https://10.3.4.19:9443 | database-standby | x86_64       | default        |             | ONLINE | Fully operational |
+--------+------------------------+------------------+--------------+----------------+-------------+--------+-------------------+

root@node14:~# lxc list
+----------+---------+-------------------+------+-----------+-----------+----------+
|   NAME   |  STATE  |       IPV4        | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c1804-14 | RUNNING | 10.3.4.192 (eth0) |      | CONTAINER | 0         | node14   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-14 | RUNNING | 10.3.4.176 (eth0) |      | CONTAINER | 0         | node14   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-15 | RUNNING | 10.3.4.179 (eth0) |      | CONTAINER | 0         | node15   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-16 | RUNNING | 10.3.4.191 (eth0) |      | CONTAINER | 0         | node16   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-17 | RUNNING | 10.3.4.181 (eth0) |      | CONTAINER | 0         | node17   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-18 | RUNNING | 10.3.4.182 (eth0) |      | CONTAINER | 0         | node18   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-19 | RUNNING | 10.3.4.185 (eth0) |      | CONTAINER | 0         | node19   |
+----------+---------+-------------------+------+-----------+-----------+----------+

I can reboot normally and everything is OK: the containers keep getting their IP addresses from DHCP.

Then I install some Ceph packages for the LXD/Ceph deployment:

pdsh -R ssh -w 10.3.4.[14-19] apt install -y podman python3 cephadm ceph-common ceph-volume ceph-osd ceph-iscsi

After a reboot, the containers don’t get any IP address.

root@node14:~# lxc list
+----------+---------+------+------+-----------+-----------+----------+
|   NAME   |  STATE  | IPV4 | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+----------+---------+------+------+-----------+-----------+----------+
| c1804-14 | RUNNING |      |      | CONTAINER | 0         | node14   |
+----------+---------+------+------+-----------+-----------+----------+
| c2204-14 | RUNNING |      |      | CONTAINER | 0         | node14   |
+----------+---------+------+------+-----------+-----------+----------+
| c2204-15 | RUNNING |      |      | CONTAINER | 0         | node15   |
+----------+---------+------+------+-----------+-----------+----------+
| c2204-16 | RUNNING |      |      | CONTAINER | 0         | node16   |
+----------+---------+------+------+-----------+-----------+----------+
| c2204-17 | RUNNING |      |      | CONTAINER | 0         | node17   |
+----------+---------+------+------+-----------+-----------+----------+
| c2204-18 | RUNNING |      |      | CONTAINER | 0         | node18   |
+----------+---------+------+------+-----------+-----------+----------+
| c2204-19 | RUNNING |      |      | CONTAINER | 0         | node19   |
+----------+---------+------+------+-----------+-----------+----------+

If I roll back to a previous snapshot and reboot:

root@node14:~# pdsh -R ssh -w 10.3.4.[14-19] zfs rollback rpool/ROOT/zfsroot@20230119-154205-lxd \&\& reboot

I get IP addresses again!!

root@node14:~# lxc list
+----------+---------+-------------------+------+-----------+-----------+----------+
|   NAME   |  STATE  |       IPV4        | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c1804-14 | RUNNING | 10.3.4.192 (eth0) |      | CONTAINER | 0         | node14   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-14 | RUNNING | 10.3.4.176 (eth0) |      | CONTAINER | 0         | node14   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-15 | RUNNING | 10.3.4.179 (eth0) |      | CONTAINER | 0         | node15   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-16 | RUNNING | 10.3.4.191 (eth0) |      | CONTAINER | 0         | node16   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-17 | RUNNING | 10.3.4.181 (eth0) |      | CONTAINER | 0         | node17   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-18 | RUNNING | 10.3.4.182 (eth0) |      | CONTAINER | 0         | node18   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-19 | RUNNING | 10.3.4.185 (eth0) |      | CONTAINER | 0         | node19   |
+----------+---------+-------------------+------+-----------+-----------+----------+
root@node14:~#

Then I installed the Ceph software on only one host, and voilà, only the containers on that host don’t get any IP addresses:

root@node14:~# lxc list
+----------+---------+-------------------+------+-----------+-----------+----------+
|   NAME   |  STATE  |       IPV4        | IPV6 |   TYPE    | SNAPSHOTS | LOCATION |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c1804-14 | RUNNING |                   |      | CONTAINER | 0         | node14   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-14 | RUNNING |                   |      | CONTAINER | 0         | node14   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-15 | RUNNING | 10.3.4.179 (eth0) |      | CONTAINER | 0         | node15   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-16 | RUNNING | 10.3.4.191 (eth0) |      | CONTAINER | 0         | node16   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-17 | RUNNING | 10.3.4.181 (eth0) |      | CONTAINER | 0         | node17   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-18 | RUNNING | 10.3.4.182 (eth0) |      | CONTAINER | 0         | node18   |
+----------+---------+-------------------+------+-----------+-----------+----------+
| c2204-19 | RUNNING | 10.3.4.185 (eth0) |      | CONTAINER | 0         | node19   |
+----------+---------+-------------------+------+-----------+-----------+----------+
root@node14:~# 

Any clue on how to look into this?

These are bare-metal hosts deployed with MAAS, running the latest Ubuntu 22.04, updated and upgraded with apt today before installing the Ceph-related packages.

The LXD snap is updated to the latest version as of today, 5.10.

I’ve tried both orders: first installing the Ceph packages and then LXD, and first LXD and then the Ceph packages. Either way, the LXD containers fail to get an IP address.

root@node14:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy
root@node14:~#

root@node14:~# uname -a
Linux node14 5.15.0-58-generic #64-Ubuntu SMP Thu Jan 5 11:43:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
root@node14:~#

root@node14:~# apt update && apt upgrade -y
Hit:1 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:2 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:3 http://archive.ubuntu.com/ubuntu jammy-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
1 package can be upgraded. Run 'apt list --upgradable' to see it.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
#
# News about significant security updates, features and services will
# appear here to raise awareness and perhaps tease /r/Linux ;)
# Use 'pro config set apt_news=false' to hide this and future APT news.
#
The following packages have been kept back:
  update-notifier-common
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
root@node14:~#

I suspect it’s because Ceph now installs Docker, and Docker modifies the system firewall in a way that blocks DHCP for LXD containers.

See https://linuxcontainers.org/lxd/docs/master/howto/network_bridge_firewalld/#prevent-issues-with-lxd-and-docker
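A quick way to test this suspicion on an affected host is to look at the policy of the iptables FORWARD chain, since Docker is known to set it to DROP. What follows is only a minimal sketch: the helper name `forward_policy` is made up for illustration, and it assumes the host exposes its rules through the `iptables` command.

```shell
#!/bin/sh
# Sketch: print the FORWARD chain policy found in `iptables -S FORWARD` output.
# The rule listing is read from stdin, so saved output can be checked as well.
forward_policy() {
    # The policy line has the form: -P FORWARD DROP
    awk '$1 == "-P" && $2 == "FORWARD" { print $3; exit }'
}

# Usage on a host (needs root):
#   iptables -S FORWARD | forward_policy
```

If this prints DROP on the hosts where containers lose their addresses and ACCEPT on the others, that points at the firewall change, and the linked page covers the available workarounds.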

I tried with both podman and docker, with the same behavior. Thanks, I’ll have a look at that.

As I said, I tried with both podman and docker. But as @tomp says, this command:

pdsh -R ssh -w 10.3.4.[14-19] apt install -y podman python3 cephadm ceph-common ceph-volume ceph-osd ceph-iscsi

also installs docker by default, so both podman and docker were installed.

pdsh -R ssh -w 10.3.4.[14-19] apt remove -y docker.io

solved the issue. I’m staying with podman for cephadm.
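To see in advance whether a set of packages would pull in Docker, apt's simulation mode can be filtered for `docker.io`. This is a minimal sketch: the helper name `pulls_in_docker` is made up for illustration, and it relies on the `Inst <package> (...)` lines that `apt-get -s install` normally prints.

```shell
#!/bin/sh
# Sketch: succeed if simulated apt output (read from stdin) would install docker.io.
pulls_in_docker() {
    # `apt-get -s install ...` prints one "Inst <package> (...)" line per package.
    grep -q '^Inst docker\.io '
}

# Usage on a node (simulation mode installs nothing):
#   apt-get -s install cephadm ceph-common | pulls_in_docker && echo "would install docker.io"
```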

In another environment of mine, I had installed podman beforehand, so docker was probably never installed.

Thanks
