SOLVED: Static IPv4 address just disappeared from container?

I set up a container over a year ago with a static IP address:

$ lxc network attach lxdbr0 atom eth0
$ lxc config device set atom eth0 ipv4.address 10.248.83.4

[pgoetz@erap-atx pkg]$ ip addr
...
4: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 2e:29:bb:97:24:52 brd ff:ff:ff:ff:ff:ff
    inet 10.248.83.1/24 scope global lxdbr0
        valid_lft forever preferred_lft forever

This was working at the time I set it up (I ran updates from inside the container), but now the static IP address is just gone:

[pgoetz@erap-atx pkg]$ lxc exec atom -- bash
root@atom:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
        valid_lft forever preferred_lft forever
7: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:03:f6:fd brd ff:ff:ff:ff:ff:ff link-netnsid 0

I’m running this on Arch Linux, and between the time this was known to work and now, LXD migrated from the user-maintained AUR package to the official repository, so there were a couple of automatic updates:

[pgoetz@erap-atx pkg]$ pwd
/var/cache/pacman/pkg
[pgoetz@erap-atx pkg]$ ls lxd*
lxd-3.21-1-x86_64.pkg.tar.zst  lxd-4.0.0-1-x86_64.pkg.tar.zst
lxd-3.21-2-x86_64.pkg.tar.zst  lxd-4.2-1-x86_64.pkg.tar.zst

I’m guessing the automatic upgrade to version 4.x somehow broke the networking? Does anyone have insight into what happened, or the best way to fix it?

Can you please show the output of lxc config show atom --expanded, and also confirm the following:

  1. That your container’s internal network config includes DHCP on the eth0 interface.
  2. That your host isn’t running a firewall (or a piece of software, such as docker, that configures a firewall that blocks DHCP).
  3. That dnsmasq is running and listening on lxdbr0.
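A quick way to check items 2 and 3 from the host could look like this (a sketch, not taken from the thread; assumes iproute2’s `ss` is installed, which is standard on Arch and Ubuntu):

```shell
# Item 2: list firewall rules that could be dropping DHCP (UDP 67/68)
sudo iptables -L -n -v

# Item 3: confirm a dnsmasq process exists and is bound to the DHCP port
pgrep -a dnsmasq
sudo ss -ulpn '( sport = :67 )'
```

If the `ss` output shows nothing on port 67, LXD’s dnsmasq isn’t serving DHCP at all, which would explain the missing address.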

Hi -

Thanks for the response.

[pgoetz@erap-atx ~]$ lxc config show atom --expanded
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 18.04 LTS amd64 (release) (20190604)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20190604"
  image.version: "18.04"
  security.privileged: "true"
  volatile.base_image: c234ecee3baaee25db84af8e3565347e948bfceb3bf7c820bb1ce95adcffeaa8
  volatile.eth0.host_name: veth98d85ed2
  volatile.eth0.hwaddr: 00:16:3e:03:f6:fd
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[]'
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
devices:
  atomport8084:
    connect: tcp:127.0.0.1:80
    listen: tcp:0.0.0.0:8084
    type: proxy
  eth0:
    ipv4.address: 10.248.83.4
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

The container IP address is statically assigned:

$ lxc config device set atom eth0 ipv4.address 10.248.83.4

So I’m not sure DHCP is relevant? All the firewall rules are generated by LXD:

[root@erap-atx ~]# iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:53 /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:53 /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:67 /* generated for LXD network lxdbr0 */

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdbr0 */
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdbr0 */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp spt:53 /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp spt:53 /* generated for LXD network lxdbr0 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp spt:67 /* generated for LXD network lxdbr0 */

dnsmasq is not running on the host system, as this system uses systemd-resolved.

The static IP allocation just tells LXD to create a static DHCP reservation in its DHCP server (dnsmasq, which it starts when the LXD daemon starts). The instance itself still needs to perform a DHCP request to get the static IP allocated. Additionally, if you don’t see dnsmasq running on your LXD host, then something is likely preventing our dnsmasq from starting, and that is why your instance is not getting a response to its DHCP request.
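For reference, that reservation is just a standard dnsmasq dhcp-host entry keyed on the container’s MAC address. On a non-snap install it lives in a per-instance hosts file under LXD’s network state directory (the exact path and format below are assumptions for illustration, not taken from this thread):

```
# Hypothetical per-instance dnsmasq hosts entry for "atom":
# MAC from volatile.eth0.hwaddr, mapped to the configured static IP
00:16:3e:03:f6:fd,10.248.83.4
```

This is why the container still shows no address when dnsmasq is down: the reservation exists, but nothing is answering the DHCP request that would hand it out.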

Check for other services listening on the DNS or DHCP ports on your host, such as systemd-resolved, and if needed, configure them to not listen on the lxdbr0 interface, which should then allow the LXD dnsmasq service to start.
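One way to find a conflicting listener is to ask who already owns the DNS and DHCP ports (a sketch; `ss` is from iproute2, and root is needed to see other users’ process names):

```shell
# Show anything bound to the DNS (53) or DHCP (67) UDP ports,
# including the owning process
sudo ss -ulpn '( sport = :53 or sport = :67 )'
```

Note that systemd-resolved’s stub listener normally binds only 127.0.0.53:53, which usually does not conflict with dnsmasq binding to lxdbr0; a system-wide dnsmasq instance or another DHCP server listening on all interfaces would.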

Hi Tom -

Thanks so much for your insightful help with this problem. I ran updates on this system (updating to LXD 4.3) and rebooted, and now networking is working again. Indeed, dnsmasq was not running previously but is running now, so maybe the service just died, or some misconfiguration in a previous package version caused it to die. (It’s an Arch Linux system, and about 100 packages got updated, since I don’t do this as frequently as one probably should.) Not worth tracking down, but it was helpful to learn how container networking is implemented. Marked this topic as SOLVED.

It may be that another service is preventing dnsmasq from starting, and that it only starts intermittently because the two services are racing each other to grab the port on lxdbr0.

Well, this container has been serving a web app software chain (that only runs on Ubuntu 14.04) for a couple of years without previous issues. I’m guessing some post-install script from an intermediate update caused the service to fail, but I will monitor the situation. Systemd should make it easy to determine whether you’re right. Thanks.
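Monitoring this via systemd could look something like the following (a sketch; it assumes the Arch package’s unit is named lxd.service and that dnsmasq runs as a child of the LXD daemon, so its messages land in that unit’s journal):

```shell
# Watch the LXD unit's journal since boot for dnsmasq start/exit messages
journalctl -u lxd -b | grep -i dnsmasq

# And confirm the process is still alive after the services settle
pgrep -a dnsmasq
```

If dnsmasq dies again after an update, the journal should show whether it failed to bind the port or exited for another reason.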