Unknown new routes added after startup break DNS

ipsilondev · June 6, 2018, 7:01pm

Hi everyone!

First post here. I'm coming to you after hours of struggling with this, and I'm really clueless about the root of the problem.

Setup:

Host: Ubuntu 18.04
Containers: Ubuntu artful
LXC version: 3.0.0
Kernel: 4.15
systemd-resolved: v234
Network config:
USE_LXC_BRIDGE="true"
LXC_BRIDGE="lxcbr0"
LXC_ADDR="10.0.3.1"
LXC_NETMASK="255.255.255.0"
LXC_NETWORK="10.0.3.0/24"
LXC_DHCP_RANGE="10.0.3.2,10.0.3.254"
LXC_DHCP_MAX="253"
LXC_DHCP_CONFILE=/etc/lxc/dhcp.conf

On every container (a different IP per container, of course):

lxc.net.0.type = veth
lxc.net.0.link = lxcbr0
lxc.net.0.flags = up
lxc.net.0.ipv4.address = 10.0.3.101/24
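
For anyone comparing the two configs above, here is a minimal Python sketch that checks how the static container address relates to the dnsmasq pool on the host. The values are copied from this post; the helper names (`dhcp_range_size`, `in_dhcp_range`) are made up for illustration. It only does the address arithmetic and does not claim to explain the routing problem.

```python
import ipaddress

# Values copied from the host config quoted in this post.
LXC_ADDR = "10.0.3.1"
LXC_NETWORK = "10.0.3.0/24"
LXC_DHCP_RANGE = "10.0.3.2,10.0.3.254"
LXC_DHCP_MAX = 253


def dhcp_range_size(dhcp_range):
    """Number of addresses in a 'start,end' range, both ends included."""
    start, end = (ipaddress.ip_address(p) for p in dhcp_range.split(","))
    return int(end) - int(start) + 1


def in_dhcp_range(addr, dhcp_range):
    """True if addr lies inside the 'start,end' DHCP pool."""
    start, end = (ipaddress.ip_address(p) for p in dhcp_range.split(","))
    return start <= ipaddress.ip_address(addr) <= end


network = ipaddress.ip_network(LXC_NETWORK)

# The bridge address belongs to the bridge network.
print(ipaddress.ip_address(LXC_ADDR) in network)
# The pool size matches LXC_DHCP_MAX.
print(dhcp_range_size(LXC_DHCP_RANGE) == LXC_DHCP_MAX)
# The static container address 10.0.3.101 sits inside the DHCP pool.
print(in_dhcp_range("10.0.3.101", LXC_DHCP_RANGE))
```

All three checks hold for these values, so the statically configured address is also one that dnsmasq is allowed to hand out.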

Problem:
The problem is that after, let's say, 30 minutes, my containers lose the ability to resolve DNS names. I have been looking at the logs (including the systemd-resolved logs), but to make it short: I realized that 2 more routes are added to the routing table, and that is what screws up the connection to the DNS on the host.

Routing table right after startup, with everything working:

root@video:/# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.3.1        0.0.0.0         UG    100    0        0 eth0
0.0.0.0         10.0.3.1        0.0.0.0         UG    100    0        0 eth0
10.0.3.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0

After 30 minutes or so:

root@video:/# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.3.1        0.0.0.0         UG    100    0        0 eth0
0.0.0.0         10.0.3.1        0.0.0.0         UG    100    0        0 eth0
10.0.3.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0
10.0.3.1        0.0.0.0         255.255.255.255 UH    100    0        0 eth0
10.0.3.1        0.0.0.0         255.255.255.255 UH    100    0        0 eth0
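
To make the difference between the two tables explicit, here is a small Python sketch that parses two `route -n` dumps and lists the entries that only appear in the second one. The function names (`parse_routes`, `added_routes`) are hypothetical, and the parser assumes the 8-column layout shown above; it is a diffing aid, not a diagnosis.

```python
def parse_routes(route_output):
    """Return (destination, gateway, genmask, flags, iface) tuples from `route -n` text."""
    routes = []
    for line in route_output.splitlines():
        fields = line.split()
        # Data rows have 8 columns and start with a dotted-quad destination;
        # the two header lines do not match and are skipped.
        if len(fields) == 8 and fields[0][0].isdigit():
            dest, gateway, genmask, flags, _metric, _ref, _use, iface = fields
            routes.append((dest, gateway, genmask, flags, iface))
    return routes


def added_routes(before, after):
    """Routes present in `after` but not in `before`, with duplicates collapsed."""
    seen = set(parse_routes(before))
    new = []
    for route in parse_routes(after):
        if route not in seen and route not in new:
            new.append(route)
    return new


# The two tables from this post.
BEFORE = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.3.1        0.0.0.0         UG    100    0        0 eth0
0.0.0.0         10.0.3.1        0.0.0.0         UG    100    0        0 eth0
10.0.3.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0
"""
AFTER = BEFORE + """\
10.0.3.1        0.0.0.0         255.255.255.255 UH    100    0        0 eth0
10.0.3.1        0.0.0.0         255.255.255.255 UH    100    0        0 eth0
"""

for route in added_routes(BEFORE, AFTER):
    print(route)
```

For these two dumps the only new entry is the host route (flags `UH`, netmask 255.255.255.255) to 10.0.3.1 on eth0, which shows up twice in the second table.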

Connectivity is up, though: I can ping or connect to any IP, whether local, on the host, or on the internet. I could reproduce this in a freshly installed container too. Another weird thing is that, just by chance and without changing anything, one of the containers went a whole day without this happening. Deleting those last 2 routes restores DNS resolution.

So I have no idea where else to look, or what is triggering the addition of those routes. The host runs nothing except LXC and a bind9 (which is bound only to the public IP), and like I said, this also happens on a fresh container with nothing in it.

Any ideas, or pointers to where I could look for more info, are highly appreciated!

Thanks.

ipsilondev · June 7, 2018, 12:41pm

So, an update: I ended up commenting out:

LXC_DHCP_CONFILE=/etc/lxc/dhcp.conf

and leaving the fixed IP configured in every container, and it works (lxc-info shows me 2 IPs assigned, but in the end the fixed one is working, so I don't care).

Is this a bug? Why do the docs say that LXC_DHCP_CONFILE is needed to define fixed IPs in DHCP? On top of that, it is actually screwing up the routing (even with no fixed IPs defined, LXC_DHCP_CONFILE still screws up the routing).

If anyone can clarify that, I will submit an issue.

stgraber · June 7, 2018, 11:18pm

What do you have in your /etc/lxc/dhcp.conf?

That file is loaded as configuration by dnsmasq, so it can do a lot of different things.

ipsilondev · June 8, 2018, 12:14am

Just the IP definitions:

dhcp-host=mysql,10.0.3.100
dhcp-host=ipsilondev,10.0.3.101
dhcp-host=text2v,10.0.3.102
dhcp-host=video,10.0.3.103
dhcp-host=email,10.0.3.104
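
For cross-checking these reservations against the container configs, here is a minimal Python sketch that reads the `dhcp-host=name,ip` lines above into a mapping. It assumes only this simple two-field form (dnsmasq accepts richer `dhcp-host` syntax that this does not handle), and the helper names are hypothetical.

```python
import ipaddress

# The reservations quoted in this post.
DHCP_CONF = """\
dhcp-host=mysql,10.0.3.100
dhcp-host=ipsilondev,10.0.3.101
dhcp-host=text2v,10.0.3.102
dhcp-host=video,10.0.3.103
dhcp-host=email,10.0.3.104
"""


def parse_dhcp_hosts(conf_text):
    """Map hostname -> IPv4Address for simple 'dhcp-host=name,ip' lines."""
    hosts = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line.startswith("dhcp-host="):
            continue  # ignore any other dnsmasq options
        name, addr = line[len("dhcp-host="):].split(",", 1)
        hosts[name.strip()] = ipaddress.ip_address(addr.strip())
    return hosts


def names_for(hosts, addr):
    """Hostnames whose reservation equals addr."""
    target = ipaddress.ip_address(addr)
    return [name for name, ip in hosts.items() if ip == target]


def outside_network(hosts, network):
    """Hostnames whose reserved address is not inside the given network."""
    net = ipaddress.ip_network(network)
    return [name for name, ip in hosts.items() if ip not in net]


hosts = parse_dhcp_hosts(DHCP_CONF)
print(names_for(hosts, "10.0.3.101"))          # which name reserves the static address
print(outside_network(hosts, "10.0.3.0/24"))   # reservations outside LXC_NETWORK
```

With this file, 10.0.3.101 is reserved for `ipsilondev` and every reservation falls inside 10.0.3.0/24, so each container's static `lxc.net.0.ipv4.address` can be compared against the name dnsmasq expects for it.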