LXD containers not getting IP address

Hi everyone,

I already opened a question on askubuntu, but since I am getting no reply, I thought that this might be a better place to ask the experts!

Basically, I am on a bare-metal (desktop) Ubuntu 16.04 and followed this tutorial to get started with LXD. My question on askubuntu contains some more detail, which I won’t repeat here to keep this post short.

Since I posted on askubuntu, I have run some more experiments. My suspicion is that it has something to do with my firewall. However, I tried every combination I could think of, got an IP in one case, and could never reproduce that lucky situation.

This is how my firewall looks at the moment: it basically contains only the rules generated by lxd-bridge:

# Generated by iptables-save v1.6.0 on Tue May  1 14:49:13 2018
*mangle
:PREROUTING ACCEPT [904:1109612]
:INPUT ACCEPT [904:1109612]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [624:48362]
:POSTROUTING ACCEPT [633:50692]
-A POSTROUTING -o lxdbr0 -p udp -m udp --dport 68 -m comment --comment "managed by lxd-bridge" -j CHECKSUM --checksum-fill
COMMIT
# Completed on Tue May  1 14:49:13 2018
# Generated by iptables-save v1.6.0 on Tue May  1 14:49:13 2018
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [41:3699]
:POSTROUTING ACCEPT [39:3586]
-A POSTROUTING -s 10.12.8.0/24 ! -d 10.12.8.0/24 -m comment --comment "managed by lxd-bridge" -j MASQUERADE
COMMIT
# Completed on Tue May  1 14:49:13 2018
# Generated by iptables-save v1.6.0 on Tue May  1 14:49:13 2018
*filter
:INPUT ACCEPT [904:1109612]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [624:48362]
-A INPUT -i lxdbr0 -p tcp -m tcp --dport 53 -m comment --comment "managed by lxd-bridge" -j ACCEPT
-A INPUT -i lxdbr0 -p udp -m udp --dport 53 -m comment --comment "managed by lxd-bridge" -j ACCEPT
-A INPUT -i lxdbr0 -p udp -m udp --dport 67 -m comment --comment "managed by lxd-bridge" -j ACCEPT
-A FORWARD -o lxdbr0 -m comment --comment "managed by lxd-bridge" -j ACCEPT
-A FORWARD -i lxdbr0 -m comment --comment "managed by lxd-bridge" -j ACCEPT
COMMIT
# Completed on Tue May  1 14:49:13 2018

After clearing the firewall a few times, and restarting all lxd services, I could get this:

+-------+---------+--------------------+------+------------+-----------+
| NAME  |  STATE  |        IPV4        | IPV6 |    TYPE    | SNAPSHOTS |
+-------+---------+--------------------+------+------------+-----------+
| test  | RUNNING | 10.12.8.107 (eth0) |      | PERSISTENT | 0         |
+-------+---------+--------------------+------+------------+-----------+
| test2 | RUNNING |                    |      | PERSISTENT | 0         |
+-------+---------+--------------------+------+------------+-----------+

Of course both containers have the same configuration and profile, and it is funny to me that one is “privileged” by getting an IP and the other is not.

It would be great if someone could give some low-level hint on how to debug this.

Hi!

I notice (from the additional info in the askubuntu question) that the nictype in the profile is ‘macvlan’. Such a setting makes the container try to get an IP address from your LAN, through the host’s ‘eth0’ interface.
The defaults should be nictype: bridged, parent: lxdbr0.
See, for example, at
https://blog.simos.info/how-to-make-your-lxd-container-get-ip-addresses-from-your-lan/
for the defaults.
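
To see at a glance which of the two settings a profile is using, a small filter like the one below may help. It is only a sketch: `nic_settings` is a made-up helper name, not an LXD command, and it simply picks the `nictype` and `parent` lines out of whatever YAML it is given on stdin.

```shell
#!/bin/sh
# nic_settings: hypothetical helper (not part of LXD).
# Reads profile or container YAML on stdin and prints every
# "nictype:" and "parent:" line it finds, one per line.
nic_settings() {
    awk '$1 == "nictype:" || $1 == "parent:" { print $1, $2 }'
}

# Usage on a host with LXD installed (profile name "default" assumed):
#   lxc profile show default | nic_settings
```

For the default private bridge, the expected result would be `nictype: bridged` followed by `parent: lxdbr0`.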

Note that the common issue in other reports is that a container may not get configured to get a private IP address automatically from LXD’s internal DHCP server.

Simos, thank you for your quick reply.

I have since experimented with different settings in the profile and finally settled on these, which are the only ones I can get an IP with (sometimes):

config:
  environment.http_proxy: ""
  user.network_mode: link-local
description: Default LXD profile
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
name: default
used_by: []

If I am not mistaken, these are pretty much the default settings by LXD.

> Note that the common issue in other reports is that a container may not get configured to get a private IP address automatically from LXD’s internal DHCP server.

Can you suggest what is the “common solution”? :)
The test with the macvlan nic was exactly to address this, but I am probably missing some other config.

Full disclosure: I am not trying to make the containers available from LAN, nor from the Internet, which I have seen many questions for. I am perfectly happy with a “local” private network as long as

  • the container and the host can talk to each other;
  • the container can initiate downloads from the Internet (e.g. apt upgrade);
  • bonus if the containers can talk to each other (which should be the case, as far as I understand).

The default configuration should work all the time, and if there are any situations where it randomly does not work, then it’s a bug that must be resolved.

There are cases where it does not work, and these go as entries to a Troubleshooting section.
For example, if you create a container before you run lxd init, then lxdbr0 does not exist, and that container will not be able to get proper network configuration.

In the local private network (nictype: bridged, parent: lxdbr0), the containers and the host can all talk to each other. The containers can talk between themselves using hostnames like container1.lxd, container2.lxd and so on. The host cannot use those hostnames (i.e. cannot ping container1.lxd from the host) unless you do some configuration on the host (DNS service). Personally, I would avoid setting up services on the host, and just get any Internet connection directed to a container. For example, have a reverse proxy in a container, and then that container will direct the connections to the appropriate website.

In the local private network (nictype: bridged, parent: lxdbr0), the containers have access to the Internet and can make downloads.

Can you show:

  • lxc config show --expanded test
  • lxc config show --expanded test2
  • lxc exec test -- ps fauxww
  • lxc exec test2 -- ps fauxww
  • dmesg
  • uname -a

Thank you both for your messages: I continued my experiments and I have some more info to share.

> The default configuration should work all the time, and if there are any situations where it randomly does not work, then it’s a bug that must be resolved.

I am starting to think that this is the case: I now have two “real” (bare-metal) machines configured identically, my workstation and my laptop. Both show the same behavior: containers do not get IPs.

As a test, on the workstation I installed LXD 3.0 from snap by following these instructions (thank you, stgraber), and everything works out of the box. Containers do get an IP, and they can access the internet without problems.

> For example, if you create a container before you run lxd init, then lxdbr0 does not exist, and that container will not be able to get proper network configuration.

I tend to exclude cases like this: I followed the instructions in the tutorial very carefully, and never used containers before. I even followed the instructions here to restart the procedure a few times.

> Can you show:

Since the snap install works flawlessly, that is enough for my “practical purposes”. However, I think that my system is “pretty standard” (Ubuntu 16.04 with default LXD installation), so I will keep my laptop on the “old” LXD in order to be able to share data with you, and maybe track down some bug. Here is the info you requested (from the laptop):

The problem:

matteo@matteo-laptop:~$ lxc list
+-------+---------+------+------+------------+-----------+
| NAME  |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
+-------+---------+------+------+------------+-----------+
| test  | RUNNING |      |      | PERSISTENT | 0         |
+-------+---------+------+------+------------+-----------+
| test2 | RUNNING |      |      | PERSISTENT | 0         |
+-------+---------+------+------+------------+-----------+

Container config (note that I reverted the experiments with macvlan and am using bridged now, but that does not help):

matteo@matteo-laptop:~$ lxc config show --expanded test
architecture: x86_64
config:
  environment.http_proxy: ""
  user.network_mode: link-local
  volatile.base_image: 353b1a2c367ec983fd9d1532171618cd967e96d77a06f6b6e024c39ec010e8d7
  volatile.eth0.hwaddr: 00:16:3e:1a:04:ba
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":165536,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":165536,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":165536,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":165536,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.power: RUNNING
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

matteo@matteo-laptop:~$ lxc config show --expanded test2
architecture: x86_64
config:
  environment.http_proxy: ""
  user.network_mode: link-local
  volatile.base_image: 353b1a2c367ec983fd9d1532171618cd967e96d77a06f6b6e024c39ec010e8d7
  volatile.eth0.hwaddr: 00:16:3e:0c:6d:d3
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":165536,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":165536,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":165536,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":165536,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.power: RUNNING
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

Processes:

matteo@matteo-laptop:~$ lxc exec test -- ps fauxww
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       367  0.0  0.0  37760  2076 ?        Rs+  08:10   0:00 ps fauxww
root         1  0.1  0.0  37516  4148 ?        Ss   07:59   0:00 /sbin/init
root        52  0.0  0.0  41724  1816 ?        Ss   07:59   0:00 /lib/systemd/systemd-udevd
root        59  0.0  0.0  35272  4612 ?        Ss   07:59   0:00 /lib/systemd/systemd-journald
root       285  0.0  0.0  20096  1544 ?        Ss   07:59   0:00 /lib/systemd/systemd-logind
root       286  0.0  0.0  27728  1752 ?        Ss   07:59   0:00 /usr/sbin/cron -f
daemon     287  0.0  0.0  26044  1460 ?        Ss   07:59   0:00 /usr/sbin/atd -f
root       288  0.0  0.0  65508  3680 ?        Ss   07:59   0:00 /usr/sbin/sshd -D
syslog     290  0.0  0.0 186896  2320 ?        Ssl  07:59   0:00 /usr/sbin/rsyslogd -n
root       291  0.0  0.0 274488  3980 ?        Ssl  07:59   0:00 /usr/lib/accountsservice/accounts-daemon
message+   293  0.0  0.0  42888  2420 ?        Ss   07:59   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       304  0.0  0.1 289364 20028 ?        Ssl  07:59   0:00 /usr/lib/snapd/snapd
root       311  0.0  0.0 277176  4184 ?        Ssl  07:59   0:00 /usr/lib/policykit-1/polkitd --no-debug
root       347  0.0  0.0  14472  1388 console  Ss+  07:59   0:00 /sbin/agetty --noclear --keep-baud console 115200 38400 9600 linux

matteo@matteo-laptop:~$ lxc exec test2 -- ps fauxww
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       410  0.0  0.0  37760  2108 ?        R    08:14   0:00 ps fauxww
root         1  0.0  0.0  37528  4168 ?        Ss   08:07   0:00 /sbin/init
root        53  0.0  0.0  35272  4560 ?        Ss   08:07   0:00 /lib/systemd/systemd-journald
root        59  0.0  0.0  41724  1872 ?        Ss   08:07   0:00 /lib/systemd/systemd-udevd
daemon     286  0.0  0.0  26044  1440 ?        Ss   08:07   0:00 /usr/sbin/atd -f
root       289  0.0  0.0  20096  1540 ?        Ss   08:07   0:00 /lib/systemd/systemd-logind
message+   292  0.0  0.0  42888  2320 ?        Ss   08:07   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       297  0.0  0.1 222164 20508 ?        Ssl  08:07   0:00 /usr/lib/snapd/snapd
root       298  0.0  0.0  65508  3484 ?        Ss   08:07   0:00 /usr/sbin/sshd -D
root       299  0.0  0.0 272868  3936 ?        Ssl  08:07   0:00 /usr/lib/accountsservice/accounts-daemon
root       300  0.0  0.0  26068  1656 ?        Ss   08:07   0:00 /usr/sbin/cron -f
syslog     301  0.0  0.0 186896  2368 ?        Ssl  08:07   0:00 /usr/sbin/rsyslogd -n
root       319  0.0  0.0 277176  4192 ?        Ssl  08:07   0:00 /usr/lib/policykit-1/polkitd --no-debug
root       338  0.0  0.0  12840  1276 console  Ss+  08:07   0:00 /sbin/agetty --noclear --keep-baud console 115200 38400 9600 linux

Kernel ring buffer (only the last 60 lines, to keep it short):

matteo@matteo-laptop:~$ dmesg | tail -n60
[   23.086051] audit: type=1400 audit(1525334382.221:30): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/bin/lxc-start" pid=2897 comm="apparmor_parser"
[   23.109252] audit: type=1400 audit(1525334382.245:31): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default" pid=2904 comm="apparmor_parser"
[   23.109281] audit: type=1400 audit(1525334382.245:32): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-cgns" pid=2904 comm="apparmor_parser"
[   23.109302] audit: type=1400 audit(1525334382.245:33): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-with-mounting" pid=2904 comm="apparmor_parser"
[   23.109321] audit: type=1400 audit(1525334382.245:34): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="lxc-container-default-with-nesting" pid=2904 comm="apparmor_parser"
[   23.328789] audit: type=1400 audit(1525334382.465:35): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-test_</var/lib/lxd>" pid=2989 comm="apparmor_parser"
[   23.439892] eth0: renamed from mcBC61SU
[   23.466854] device lxdbr0 entered promiscuous mode
[   24.491475] audit: type=1400 audit(1525334383.629:36): apparmor="STATUS" operation="profile_load" label="lxd-test_</var/lib/lxd>//&:lxd-test_<var-lib-lxd>://unconfined" name="/usr/bin/lxc-start" pid=3215 comm="apparmor_parser"
[   24.493367] audit: type=1400 audit(1525334383.629:37): apparmor="STATUS" operation="profile_load" label="lxd-test_</var/lib/lxd>//&:lxd-test_<var-lib-lxd>://unconfined" name="/usr/lib/snapd/snap-confine" pid=3217 comm="apparmor_parser"
[   24.493391] audit: type=1400 audit(1525334383.629:38): apparmor="STATUS" operation="profile_load" label="lxd-test_</var/lib/lxd>//&:lxd-test_<var-lib-lxd>://unconfined" name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=3217 comm="apparmor_parser"
[   24.495235] audit: type=1400 audit(1525334383.633:39): apparmor="STATUS" operation="profile_load" label="lxd-test_</var/lib/lxd>//&:lxd-test_<var-lib-lxd>://unconfined" name="/usr/lib/lxd/lxd-bridge-proxy" pid=3216 comm="apparmor_parser"
[   27.163652] Loading iSCSI transport class v2.0-870.
[   27.442281] usb 1-1: new low-speed USB device number 5 using xhci_hcd
[   27.574961] usb 1-1: New USB device found, idVendor=046d, idProduct=c069
[   27.574970] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[   27.574977] usb 1-1: Product: USB Laser Mouse
[   27.574982] usb 1-1: Manufacturer: Logitech
[   27.575328] usb 1-1: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
[   27.604064] usbcore: registered new interface driver usbhid
[   27.604073] usbhid: USB HID core driver
[   27.612999] input: Logitech USB Laser Mouse as /devices/pci0000:00/0000:00:14.0/usb1/1-1/1-1:1.0/0003:046D:C069.0002/input/input16
[   27.667238] hid-generic 0003:046D:C069.0002: input,hidraw1: USB HID v1.10 Mouse [Logitech USB Laser Mouse] on usb-0000:00:14.0-1/input0
[   30.356687] acer_wmi: Unknown function number - 6 - 1
[  251.159833] usb 1-6: new high-speed USB device number 6 using xhci_hcd
[  251.365367] usb 1-6: New USB device found, idVendor=1058, idProduct=0820
[  251.365378] usb 1-6: New USB device strings: Mfr=1, Product=2, SerialNumber=5
[  251.365384] usb 1-6: Product: My Passport 0820
[  251.365390] usb 1-6: Manufacturer: Western Digital
[  251.365395] usb 1-6: SerialNumber: 575841314539344637435741
[  251.386600] usb-storage 1-6:1.0: USB Mass Storage device detected
[  251.386668] scsi host4: usb-storage 1-6:1.0
[  251.386805] usbcore: registered new interface driver usb-storage
[  251.388607] usbcore: registered new interface driver uas
[  252.384693] scsi 4:0:0:0: Direct-Access     WD       My Passport 0820 1012 PQ: 0 ANSI: 6
[  252.385185] scsi 4:0:0:1: Enclosure         WD       SES Device       1012 PQ: 0 ANSI: 6
[  252.386910] sd 4:0:0:0: Attached scsi generic sg3 type 0
[  252.387386] scsi 4:0:0:1: Attached scsi generic sg4 type 13
[  252.387680] sd 4:0:0:0: [sdc] Spinning up disk...
[  253.391826] .ready
[  258.810806] sd 4:0:0:0: [sdc] 3906963456 512-byte logical blocks: (2.00 TB/1.82 TiB)
[  258.811430] sd 4:0:0:0: [sdc] Write Protect is off
[  258.811439] sd 4:0:0:0: [sdc] Mode Sense: 47 00 10 08
[  258.812012] sd 4:0:0:0: [sdc] No Caching mode page found
[  258.812023] sd 4:0:0:0: [sdc] Assuming drive cache: write through
[  258.818118]  sdc: sdc1 sdc2 sdc3 sdc4
[  258.818906] ses 4:0:0:1: Attached Enclosure device
[  258.820393] sd 4:0:0:0: [sdc] Attached SCSI disk
[  505.599694] audit_printk_skb: 27 callbacks suppressed
[  505.599696] audit: type=1400 audit(1525334864.733:49): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxd-test2_</var/lib/lxd>" pid=4559 comm="apparmor_parser"
[  505.702781] eth0: renamed from mc7CY9QP
[  506.175754] audit: type=1400 audit(1525334865.309:50): apparmor="STATUS" operation="profile_load" label="lxd-test2_</var/lib/lxd>//&:lxd-test2_<var-lib-lxd>://unconfined" name="/usr/lib/lxd/lxd-bridge-proxy" pid=4760 comm="apparmor_parser"
[  506.207618] audit: type=1400 audit(1525334865.341:51): apparmor="STATUS" operation="profile_load" label="lxd-test2_</var/lib/lxd>//&:lxd-test2_<var-lib-lxd>://unconfined" name="/usr/sbin/tcpdump" pid=4761 comm="apparmor_parser"
[  506.260901] audit: type=1400 audit(1525334865.393:52): apparmor="STATUS" operation="profile_load" label="lxd-test2_</var/lib/lxd>//&:lxd-test2_<var-lib-lxd>://unconfined" name="/sbin/dhclient" pid=4759 comm="apparmor_parser"
[  506.261158] audit: type=1400 audit(1525334865.397:53): apparmor="STATUS" operation="profile_load" label="lxd-test2_</var/lib/lxd>//&:lxd-test2_<var-lib-lxd>://unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=4759 comm="apparmor_parser"
[  506.261394] audit: type=1400 audit(1525334865.397:54): apparmor="STATUS" operation="profile_load" label="lxd-test2_</var/lib/lxd>//&:lxd-test2_<var-lib-lxd>://unconfined" name="/usr/lib/NetworkManager/nm-dhcp-helper" pid=4759 comm="apparmor_parser"
[  506.261621] audit: type=1400 audit(1525334865.397:55): apparmor="STATUS" operation="profile_load" label="lxd-test2_</var/lib/lxd>//&:lxd-test2_<var-lib-lxd>://unconfined" name="/usr/lib/connman/scripts/dhclient-script" pid=4759 comm="apparmor_parser"
[  506.362421] audit: type=1400 audit(1525334865.497:56): apparmor="STATUS" operation="profile_load" label="lxd-test2_</var/lib/lxd>//&:lxd-test2_<var-lib-lxd>://unconfined" name="lxc-container-default" pid=4758 comm="apparmor_parser"
[  506.362757] audit: type=1400 audit(1525334865.497:57): apparmor="STATUS" operation="profile_load" label="lxd-test2_</var/lib/lxd>//&:lxd-test2_<var-lib-lxd>://unconfined" name="lxc-container-default-cgns" pid=4758 comm="apparmor_parser"
[  506.363055] audit: type=1400 audit(1525334865.497:58): apparmor="STATUS" operation="profile_load" label="lxd-test2_</var/lib/lxd>//&:lxd-test2_<var-lib-lxd>://unconfined" name="lxc-container-default-with-mounting" pid=4758 comm="apparmor_parser"

Kernel version:

matteo@matteo-laptop:~$ uname -a
Linux matteo-laptop 4.4.0-121-generic #145-Ubuntu SMP Fri Apr 13 13:47:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Totally willing to share more if you are interested! Meanwhile, thanks for the help!

Ok, so LXD thinks that you never configured networking on this system.
Did you run lxd init and configure networking with it?

What’s the output of ifconfig lxdbr0?

The issue you’re having is because of this config key:

  user.network_mode: link-local

This key is set by default until you have properly configured your network. It instructs any new container not to attempt DHCP and to simply boot with no network configured.

You should properly configure your bridge using dpkg-reconfigure -p medium lxd, then try to launch new containers and see if those get connectivity (you should delete your existing ones, as that tends to be easier than manually rewriting /etc/network/interfaces inside them).
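
As a rough sketch of how one might check whether a profile still carries that key, and clear it if so: `has_link_local` below is a hypothetical helper (not an LXD command) that only inspects YAML on stdin, and the profile name `default` is an assumption. Clearing the key by hand is meant as a complement to reconfiguring the bridge, not a replacement for it, and it would only affect containers launched afterwards.

```shell
#!/bin/sh
# has_link_local: hypothetical helper (not part of LXD).
# Succeeds (exit 0) if the YAML on stdin sets
# user.network_mode to link-local, fails otherwise.
has_link_local() {
    grep -Eq '^[[:space:]]*user\.network_mode:[[:space:]]*link-local[[:space:]]*$'
}

# Usage on a host with LXD installed (profile name "default" assumed):
#   if lxc profile show default | has_link_local; then
#       lxc profile unset default user.network_mode
#   fi
```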

> Did you run lxd init and configure networking with it?

Yes, I am 100% sure about this


> What’s the output of ifconfig lxdbr0?

matteo@matteo-laptop:~$ ifconfig lxdbr0
lxdbr0    Link encap:Ethernet  HWaddr 3a:84:94:b2:91:20  
          inet addr:10.91.84.1  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::3884:94ff:feb2:9120/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:44 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:4547 (4.5 KB)

> You should properly configure your bridge using dpkg-reconfigure -p medium lxd, then try to launch new containers and see if those get connectivity

I did as you suggested, and I reconfigured the bridge using the whiptail interface (which I recall doing a few times already). To go extra-safe, I rebooted the laptop after this, and then did the following:

List existing containers, out of curiosity. This hangs:

matteo@matteo-laptop:~$ lxc list
^C

Deleted existing containers, and made a new one:

matteo@matteo-laptop:~$ lxc delete test --force
matteo@matteo-laptop:~$ lxc delete test2 --force
matteo@matteo-laptop:~$ lxc launch ubuntu:x test3
Creating test3
Starting test3
matteo@matteo-laptop:~$ lxc list
+-------+---------+------+------+------------+-----------+
| NAME  |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
+-------+---------+------+------+------------+-----------+
| test3 | RUNNING |      |      | PERSISTENT | 0         |
+-------+---------+------+------+------------+-----------+

Then checked the default profile. link-local is still there, so I replaced it with an empty value "". Then retried a new container:

matteo@matteo-laptop:~$ lxc profile edit default
matteo@matteo-laptop:~$ lxc delete test3 --force
matteo@matteo-laptop:~$ lxc launch ubuntu:x test3
Creating test3
Starting test3
matteo@matteo-laptop:~$ lxc list
+-------+---------+------+------+------------+-----------+
| NAME  |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
+-------+---------+------+------+------------+-----------+
| test3 | RUNNING |      |      | PERSISTENT | 0         |
+-------+---------+------+------+------------+-----------+

At this point my guess is that dpkg-reconfigure did not reconfigure correctly, but I might be wrong. I checked the profile on the “desktop” machine (the one that now works with LXD 3.0), and it indeed looks different:

matteo@matteo-desktop:~$ lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: lxd4
    type: disk
name: default
used_by:
- /1.0/containers/centos6
- /1.0/containers/saturne

And the config of one container, in case it might be useful:

matteo@matteo-desktop:~$ lxc config show --expanded saturne
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 16.04 LTS amd64 (release) (20180427)
  image.label: release
  image.os: ubuntu
  image.release: xenial
  image.serial: "20180427"
  image.version: "16.04"
  raw.idmap: both 1000 1000
  volatile.base_image: 353b1a2c367ec983fd9d1532171618cd967e96d77a06f6b6e024c39ec010e8d7
  volatile.eth0.hwaddr: 00:16:3e:15:c6:86
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000},{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":1000,"Maprange":1},{"Isuid":true,"Isgid":true,"Hostid":1001001,"Nsid":1001,"Maprange":999998999}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000},{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":1000,"Maprange":1},{"Isuid":true,"Isgid":true,"Hostid":1001001,"Nsid":1001,"Maprange":999998999}]'
  volatile.last_state.power: RUNNING
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: lxd4
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

I am not an expert here, so this may be all out to lunch. :)

On a working LXD system, when I first copy in a new container, no IP address shows in lxc list. (I generally copy an existing container under a new name because it’s an easy way to get things quickly to where I want them.) If I then stop the container with lxc stop test3 (in your case) and start it again with lxc start test3, an IP address is listed. This may not work for you if there are other issues, but in my opinion it would not hurt to try.

Can you post ps fauxw | grep dnsmasq and netstat -lnp | grep 53?

Your symptoms are the same as if you had a conflicting DHCP/DNS server running on your system.
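
A plain grep for 53 also matches port 5353 and unrelated lines, and it leaves out the DHCP server port (67). A slightly stricter filter could look like the sketch below. `dns_dhcp_listeners` is a made-up name, and the column positions assume the usual `netstat -lnp` layout, where the local address is the fourth field and the PID/program is the last one.

```shell
#!/bin/sh
# dns_dhcp_listeners: hypothetical helper.
# Reads `netstat -lnp` output on stdin and prints protocol, local
# address and PID/program for sockets bound to exactly port 53 (DNS)
# or port 67 (DHCP server). Port 5353 (mDNS) is not matched.
dns_dhcp_listeners() {
    awk '$1 ~ /^(tcp|udp)/ && $4 ~ /:(53|67)$/ { print $1, $4, $NF }'
}

# Usage on the host:
#   sudo netstat -lnp | dns_dhcp_listeners
```

More than one program bound to the same address and port, or an unexpected program on the bridge address, would be the thing to look for.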

Here: on the (non-working) laptop first:

matteo@matteo-laptop:~$ ps fauxw | grep dnsmasq
nobody    4220  0.0  0.0  52868  4180 ?        S    22:06   0:00  \_ /usr/sbin/dnsmasq --no-resolv --keep-in-foreground --no-hosts --bind-interfaces --pid-file=/var/run/NetworkManager/dnsmasq.pid --listen-address=127.0.1.1 --cache-size=0 --conf-file=/dev/null --proxy-dnssec --enable-dbus=org.freedesktop.NetworkManager.dnsmasq --conf-dir=/etc/NetworkManager/dnsmasq.d
lxd       2572  0.0  0.0  52868   408 ?        S    22:05   0:00 dnsmasq -s lxd -S /lxd/ -u lxd --strict-order --bind-interfaces --pid-file=/run/lxd-bridge//dnsmasq.pid --dhcp-no-override --except-interface=lo --interface=lxdbr0 --dhcp-leasefile=/var/lib/lxd-bridge//dnsmasq.lxdbr0.leases --dhcp-authoritative --listen-address 10.91.84.1 --dhcp-range 10.91.84.2,10.91.84.254 --dhcp-lease-max=252
matteo@matteo-laptop:~$ sudo netstat -lnp | grep 53
[sudo] password for matteo:
tcp        0      0 127.0.1.1:53            0.0.0.0:*               LISTEN      4220/dnsmasq
tcp        0      0 10.91.84.1:53           0.0.0.0:*               LISTEN      2572/dnsmasq
tcp6       0      0 fe80::b4f8:feff:fe86:53 :::*                    LISTEN      2572/dnsmasq
udp        0      0 0.0.0.0:5353            0.0.0.0:*                           2273/avahi-daemon:
udp        0      0 127.0.1.1:53            0.0.0.0:*                           4220/dnsmasq
udp        0      0 10.91.84.1:53           0.0.0.0:*                           2572/dnsmasq
udp6       0      0 :::5353                 :::*                                2273/avahi-daemon:
udp6       0      0 fe80::b4f8:feff:fe86:53 :::*                                2572/dnsmasq
unix  2      [ ACC ]     SEQPACKET  LISTENING     13353    1/init              /run/udev/control

And on the (working) desktop:

matteo@matteo-desktop:~$ ps fauxw | grep dnsmasq
nobody    5052  0.0  0.0  52872  2928 ?        S    12:34   0:01  \_ /usr/sbin/dnsmasq --no-resolv --keep-in-foreground --no-hosts --bind-interfaces --pid-file=/var/run/NetworkManager/dnsmasq.pid --listen-address=127.0.1.1 --cache-size=0 --conf-file=/dev/null --proxy-dnssec --enable-dbus=org.freedesktop.NetworkManager.dnsmasq --conf-dir=/etc/NetworkManager/dnsmasq.d
lxd       6371  0.0  0.0  49984  2440 ?        S    12:35   0:01 dnsmasq --strict-order --bind-interfaces --pid-file=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.pid --except-interface=lo --interface=lxdbr0 --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.30.164.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-range 10.30.164.2,10.30.164.254,1h -s lxd -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.raw -u lxd
matteo@matteo-desktop:~$ sudo netstat -lnp | grep 53
[sudo] password for matteo:
tcp        0      0 10.30.164.1:53          0.0.0.0:*               LISTEN      6371/dnsmasq
tcp        0      0 127.0.1.1:53            0.0.0.0:*               LISTEN      5052/dnsmasq
tcp6       0      0 fe80::c836:86ff:fe4b:53 :::*                    LISTEN      6371/dnsmasq
udp        0      0 0.0.0.0:5353            0.0.0.0:*                           4545/avahi-daemon:
udp        0      0 10.30.164.1:53          0.0.0.0:*                           6371/dnsmasq
udp        0      0 127.0.1.1:53            0.0.0.0:*                           5052/dnsmasq
udp6       0      0 :::5353                 :::*                                4545/avahi-daemon:
udp6       0      0 fe80::c836:86ff:fe4b:53 :::*                                6371/dnsmasq
unix  2      [ ACC ]     STREAM     LISTENING     49976    8053/dbus-daemon    @/tmp/dbus-3AbJiqvg4d

I do see two DNS processes here, but one seems to belong to NetworkManager, so it is probably not an issue. Furthermore, the same situation appears on both machines.

Sorry for the long delay, I’ve been traveling back home and then was off yesterday


That all looks good to me. Did you confirm that you have a DHCP client running inside one of the affected containers?

So far things that have been checked are:

  • firewalling
  • dnsmasq running
  • container/profile config

Do not worry at all! Here, it seems like the DHCP client is running, but still no IP


matteo@matteo-laptop:~$ lxc exec test3 -- ps fauxww
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       233  0.0  0.0  37760  2032 ?        Rs+  14:27   0:00 ps fauxww
root         1  0.6  0.0  37392  3884 ?        Ss   14:24   0:00 /sbin/init
root        57  0.0  0.0  41724  1840 ?        Ss   14:24   0:00 /lib/systemd/systemd-udevd
root        58  0.0  0.0  35272  4368 ?        Ss   14:24   0:00 /lib/systemd/systemd-journald
root       150  0.0  0.0   4412  1216 ?        Ss   14:24   0:00 /sbin/ifup -a --read-environment
root       212  0.0  0.0   4504   664 ?        S    14:24   0:00  \_ /bin/sh -c /sbin/dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases -I -df /var/lib/dhcp/dhclient6.eth0.leases eth0 ?
root       213  0.0  0.0  15996  2516 ?        S    14:24   0:00      \_ /sbin/dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases -I -df /var/lib/dhcp/dhclient6.eth0.leases eth0

> So far things that have been checked are:

Yes, I think we went through all of those, and even tried multiple combinations. I have no clue what the problem can be.

Can you run tcpdump -ni lxdbr0 on the host and see if you get the DHCP requests there?

Looks like the requests are there: two MAC addresses, since I now have two test containers (with the same problem)

matteo@matteo-laptop:~$ sudo tcpdump -ni lxdbr0
[sudo] password for matteo: 
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lxdbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
19:18:15.807692 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:c6:e1:65, length 300
19:18:24.427846 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:c6:e1:65, length 300
19:18:25.927684 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:6f:3b:f6, length 300
19:18:37.803704 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:c6:e1:65, length 300
19:18:40.309507 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:6f:3b:f6, length 300
19:18:40.344268 IP6 fe80::c846:67ff:fe11:7f4.5353 > ff02::fb.5353: 0 [2q] PTR (QM)? _ipp._tcp.local. PTR (QM)? _ipps._tcp.local. (45)
19:18:52.737047 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:6f:3b:f6, length 300
19:18:56.607963 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:c6:e1:65, length 300
19:19:06.198260 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:c6:e1:65, length 300
19:19:06.412253 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:6f:3b:f6, length 300
^C
10 packets captured
10 packets received by filter
0 packets dropped by kernel

OK, so the requests reach the bridge but somehow never make it to the dnsmasq process listening on that interface.
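To make the "requests but no replies" pattern easier to see in a longer capture, the tcpdump lines can be tallied per client MAC. This is only a minimal sketch: `summarize_dhcp` is a hypothetical helper name, and it assumes the one-line tcpdump format shown above, where replies would appear as `BOOTP/DHCP, Reply`.

```shell
# summarize_dhcp: read `tcpdump -n` lines on stdin, count DHCP requests per
# client MAC, and count how many DHCP replies were seen in total.
# (Hypothetical helper, not part of LXD or tcpdump.)
summarize_dhcp() {
    awk '
        /BOOTP\/DHCP, Request from/ {
            # The MAC is the word right after "from", followed by a comma.
            for (i = 1; i < NF; i++)
                if ($i == "from") { mac = $(i + 1); sub(/,$/, "", mac); req[mac]++ }
        }
        /BOOTP\/DHCP, Reply/ { replies++ }
        END {
            for (mac in req) printf "%s requests=%d\n", mac, req[mac]
            printf "replies=%d\n", replies
        }
    '
}

# Possible usage on the host (capture 20 DHCP packets, then summarize):
#   sudo tcpdump -l -ni lxdbr0 -c 20 port 67 or port 68 | summarize_dhcp
```

A result with `replies=0` next to non-zero request counts matches what the capture above shows: the containers keep asking, and nothing answers on the bridge.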


What does iptables -L -n -v show? I'm mostly wondering whether we see anything hitting any of the rules.

Another place where packets could go missing is ebtables. Does ebtables -L show anything?
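For the "is anything hitting the rules" question, the packet counters for the DHCP rules can be pulled out of the iptables listing. A rough sketch, assuming the usual `iptables -L -n -v` column layout (pkts, bytes, target, prot, ...); `rule_hits` is a made-up helper name:

```shell
# rule_hits PORT: read `iptables -L -n -v` output on stdin and print, for
# every udp rule matching source or destination PORT, the chain it belongs
# to and its packet counter. (Hypothetical helper for illustration.)
rule_hits() {
    awk -v port="$1" '
        # Remember which chain the following rules belong to.
        $1 == "Chain" { chain = $2; next }
        # Rule lines: field 1 is the packet counter, field 4 the protocol.
        $4 == "udp" && ($0 ~ ("udp (dpt|spt):" port "( |$)")) {
            printf "%s pkts=%s\n", chain, $1
        }
    '
}

# Possible usage on the host (67 is the DHCP server port):
#   sudo iptables -L -n -v | rule_hits 67
```

If the INPUT counter for port 67 grows while the containers retry but the OUTPUT counter does not, the requests are reaching the host and no replies are leaving it.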

My turn to apologize for the delay, got busy at work.

Usually, I have a fairly complicated iptables setup, so that was my first thought.
I did test both with and without it. Here is the current configuration:

matteo@matteo-laptop:~$ sudo iptables -L -n -v
Chain INPUT (policy ACCEPT 585 packets, 178K bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     tcp  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:53 /* generated for LXD network lxdbr0 */
  667 41414 ACCEPT     udp  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:53 /* generated for LXD network lxdbr0 */
   93 30504 ACCEPT     udp  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:67 /* generated for LXD network lxdbr0 */

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
42909   63M ACCEPT     all  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdbr0 */
27531 1636K ACCEPT     all  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdbr0 */

Chain OUTPUT (policy ACCEPT 1701 packets, 505K bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     tcp  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            tcp spt:53 /* generated for LXD network lxdbr0 */
  666 66271 ACCEPT     udp  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            udp spt:53 /* generated for LXD network lxdbr0 */
   89 29792 ACCEPT     udp  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            udp spt:67 /* generated for LXD network lxdbr0 */

And:

matteo@matteo-laptop:~$ sudo iptables-save
# Generated by iptables-save v1.6.0 on Sat May 12 22:54:55 2018
*filter
:INPUT ACCEPT [585:177535]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [1701:504634]
-A INPUT -i lxdbr0 -p tcp -m tcp --dport 53 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A INPUT -i lxdbr0 -p udp -m udp --dport 53 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A INPUT -i lxdbr0 -p udp -m udp --dport 67 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A FORWARD -o lxdbr0 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A FORWARD -i lxdbr0 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A OUTPUT -o lxdbr0 -p tcp -m tcp --sport 53 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A OUTPUT -o lxdbr0 -p udp -m udp --sport 53 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
-A OUTPUT -o lxdbr0 -p udp -m udp --sport 67 -m comment --comment "generated for LXD network lxdbr0" -j ACCEPT
COMMIT
# Completed on Sat May 12 22:54:55 2018
# Generated by iptables-save v1.6.0 on Sat May 12 22:54:55 2018
*nat
:PREROUTING ACCEPT [1272:243229]
:INPUT ACCEPT [599:56703]
:OUTPUT ACCEPT [16135:1434723]
:POSTROUTING ACCEPT [14400:924802]
-A POSTROUTING -s 10.91.84.0/24 ! -d 10.91.84.0/24 -m comment --comment "generated for LXD network lxdbr0" -j MASQUERADE
COMMIT
# Completed on Sat May 12 22:54:55 2018
# Generated by iptables-save v1.6.0 on Sat May 12 22:54:55 2018
*mangle
:PREROUTING ACCEPT [293879:213268939]
:INPUT ACCEPT [223439:148312522]
:FORWARD ACCEPT [70440:64956417]
:OUTPUT ACCEPT [232436:40810365]
:POSTROUTING ACCEPT [301140:105256821]
-A POSTROUTING -o lxdbr0 -p udp -m udp --dport 68 -m comment --comment "generated for LXD network lxdbr0" -j CHECKSUM --checksum-fill
COMMIT
# Completed on Sat May 12 22:54:55 2018

Uhmm, ebtables? Never heard of it.


matteo@matteo-laptop:~$ sudo ebtables -L
sudo: ebtables: command not found

SOLVED: see the end of this post.

Sorry for bringing up this old thread, but I seem to have the exact same problem


The server had been running fine for 5 months. This morning I launched a fresh ubuntu:18.04 container, and it took about 10 minutes to create. Then, 10 or more minutes after printing the creation message, it printed "starting container xxxxx" but never finished.

lxc list showed the container as existing and running, but it had no IP.
I decided to reboot the server, and now none of the running containers have an IP.

When I run lxc start xxxxx, I get no immediate response. After letting it run for 3-4 minutes, lxc list shows the container as running, but with no IP.

I’m inclined to think something broke on my network.

SOLUTION: My ZFS pool ran out of space. Removing a few unused containers and restarting brought everything back to normal.
I then created a new, larger pool, moved the containers to it, and deleted the old pool.

NOTE to the LXD developers: (@stgraber)
Consider adding a warning when ZFS is about to run out of storage.
And/or
Consider adding a warning when a new container creation could endanger the entire server if ZFS is about to run out of storage.
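Until something like that exists, a small check on the host can flag a pool that is filling up. This is just a sketch under a few assumptions: `check_pool_capacity` is a made-up name, the 80% threshold is arbitrary, and the input is expected to be the tab-separated "name, capacity" pairs that `zpool list -H -o name,capacity` prints.

```shell
# check_pool_capacity THRESHOLD: read "name<TAB>capacity" lines on stdin
# and print a warning for every pool whose used capacity (in percent) is
# at or above THRESHOLD. (Hypothetical helper, not an LXD feature.)
check_pool_capacity() {
    threshold="$1"
    while IFS="$(printf '\t')" read -r name cap; do
        cap="${cap%\%}"                  # strip a trailing "%" if present
        case "$cap" in
            ''|*[!0-9]*) continue ;;     # skip pools with no numeric value
        esac
        if [ "$cap" -ge "$threshold" ]; then
            echo "WARNING: pool $name is ${cap}% full"
        fi
    done
}

# Possible usage on the host, e.g. from cron:
#   zpool list -H -o name,capacity | check_pool_capacity 80
```

Running it before launching new containers would have shown the full pool here, well before every container lost its IP.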
