Domain resolution in Debian 10 container with routed NIC

This is a follow-up to this issue: "routed" LXC containers with public IP in Vagrant. I realized that I was accidentally using the “Ubuntu” image. My intention was to use Debian 10, though, and after changing to the “Debian 10 (cloud)” image, my setup broke.

Specifically: there is something different in Debian when it comes to routing and domain resolution.

The Debian 10 container doesn’t connect to the internet by default, but I can make it work by setting up NAT on the host with:

iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
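
For reference, standard iptables commands can be used to inspect or remove that rule again (assuming it was added exactly as above):

# list the current NAT POSTROUTING rules in rule-spec form
iptables -t nat -S POSTROUTING

# delete the masquerade rule again, using the same arguments it was added with
iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE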

Does anyone know why this wasn’t required for the Ubuntu container? I could learn something new.

What still doesn’t work is domain resolution. As you can see below, I have 8.8.8.8 defined as a nameserver in my cloud-init configuration but it doesn’t work:

root@mail-server:~# ping google.com
ping: google.com: Temporary failure in name resolution

There is nothing suspicious in /var/log/cloud-init-output.log.

In /etc/network/interfaces.d/50-cloud-init I have:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address 192.168.7.201/32
    dns-nameservers 8.8.8.8
    post-up route add default gw 169.254.0.1 || true
    pre-down route del default gw 169.254.0.1 || true

/etc/resolv.conf contains only comments.
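
A quick way to separate a pure name-resolution problem from a connectivity one, using only standard tools (assuming they are present in the image):

# raw connectivity check, bypassing DNS entirely
ping -c 1 8.8.8.8

# name resolution through the libc resolver, the same path ping uses
getent hosts google.com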

The host runs Ubuntu 20.04. The container is Debian 10.
LXC version: 4.0.4

My configuration:

config:
  core.https_address: '[::]:8443'
  core.trust_password: true
networks: []
storage_pools:
- config:
    source: /home/luken/lxd-storage-pools
  description: ""
  name: default
  driver: dir
profiles:
- config:
    user.user-data: |
      #cloud-config
      users:
        - name: luken
          gecos: ''
          primary_group: luken
          groups: "sudo"
          shell: /bin/bash
          sudo: ALL=(ALL) NOPASSWD:ALL
          ssh_authorized_keys:
           - <REDACTED>
  description: Default LXD profile
  devices:
    root:
      path: /
      pool: default
      type: disk
  name: default
- config:
    user.network-config: |
      #cloud-config
      version: 2
      ethernets:
        eth0:
          addresses:
          - 192.168.7.201/32
          nameservers:
            addresses:
            - 8.8.8.8
          routes:
          - to: 0.0.0.0/0
            via: 169.254.0.1
            on-link: true
  description: Mail Server LXD profile
  devices:
    eth0:
      ipv4.address: 192.168.7.201
      nictype: routed
      parent: eth1
      type: nic
  name: mail-server

I mean, I KNOW how to make it work (I could add nameserver 8.8.8.8 to /etc/resolv.conf or something), but something is wrong if cloud-init cannot handle it, so instead of hacking together a workaround, I (and probably other lxc users) would rather know how to make cloud-init work correctly, or understand why it fails here.

Any help would be appreciated :slightly_smiling_face:

Added:

I also noticed a weird thing on the guest:

root@mail-server:/etc/resolvconf/resolv.conf.d# systemctl --failed
  UNIT                          LOAD   ACTIVE SUB    DESCRIPTION
● networking.service            loaded failed failed Raise network interfaces
● systemd-journald-audit.socket loaded failed failed Journal Audit Socket

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

2 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

root@mail-server:/etc/resolvconf/resolv.conf.d# systemctl status networking
● networking.service - Raise network interfaces
   Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sun 2021-01-24 12:07:03 UTC; 7min ago
     Docs: man:interfaces(5)
  Process: 89 ExecStart=/sbin/ifup -a --read-environment (code=exited, status=1/FAILURE)
 Main PID: 89 (code=exited, status=1/FAILURE)

Jan 24 12:07:03 mail-server systemd[1]: Starting Raise network interfaces...
Jan 24 12:07:03 mail-server ifup[89]: RTNETLINK answers: File exists
Jan 24 12:07:03 mail-server ifup[89]: ifup: failed to bring up eth0
Jan 24 12:07:03 mail-server systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
Jan 24 12:07:03 mail-server systemd[1]: networking.service: Failed with result 'exit-code'.
Jan 24 12:07:03 mail-server systemd[1]: Failed to start Raise network interfaces.

This is a bare-bones testing configuration, so I think it’s worth figuring out what’s going on here, as it may be blocking a lot of people from enjoying lxc, because it looks like even Debian 10 doesn’t work out of the box without issues here. :confused:

Yes, it seems that Debian’s cloud-init renderer is objecting to the fact that the IP address is already configured on the interface.

I’m not sure if there is something in cloud-init to instruct the renderer to remove IPs, or to ignore them if they are already present.

Does it work if you remove the IP inside the container after boot manually and then try starting the network service?

As for the NAT rule, the routed NIC doesn’t affect the host’s NAT rules, so if you need NAT then you need to add that yourself (although that does beg the question of why you need the routed NIC type at all, as its primary purpose is to expose the container’s IP onto the network without NAT).

Does it work if you remove the IP inside the container after boot manually and then try starting the network service?

If I remove the IP, restart networking, and restore the routes manually, everything works like before, and the networking service now runs correctly.
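
For anyone following along, the manual steps I mean are roughly the following (addresses as in the profile above; the exact route commands may vary):

# drop the address that LXD pre-configured on the interface
ip addr del 192.168.7.201/32 dev eth0

# let ifupdown bring the interface up itself
systemctl restart networking

# restore the link-local gateway routes in case they were lost
ip route replace 169.254.0.1 dev eth0 scope link
ip route replace default via 169.254.0.1 dev eth0 onlink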

As for the NAT rule, the routed NIC doesn’t affect the host’s NAT rules, so if you need NAT then you need to add that yourself (although that does beg the question of why you need the routed NIC type at all, as its primary purpose is to expose the container’s IP onto the network without NAT).

I’m pretty sure that when I used the “Ubuntu” image in the container, I didn’t have to set up NAT on the host to be able to ping the internet from the container - that’s what I really don’t understand - why do I need to do that on Debian 10? Shouldn’t the routing set up by LXD take care of that? Do you have any idea what’s going on here?

We may actually be pretty close to a working Debian 10 configuration, which is great, but it’s also a bit sad that there are no instructions for setting up one of the most popular server distros as a container pretty much anywhere on the internet yet.

I realized one more important thing: despite not being able to ping 8.8.8.8 from my guest, I CAN ping other devices in my local network. So the packets actually are routed outside of my host. I’m now convinced that I’m missing something quite simple, but my networking knowledge is still pretty limited. :confused:

Ah interesting, but let’s back up for a moment and review what you are trying to achieve.

The default networking mode in LXD is to set up a virtual bridge (or switch) called lxdbr0 that provides a private DHCP and DNS server (via dnsmasq). Containers then use the bridged NIC device in the default profile and are allocated either a dynamic or static IP via DHCP. The lxdbr0 network also sets up NAT rules to allow the containers to get outbound external access by ‘hiding’ behind the host’s IP and MAC address.
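
For comparison, the NIC device in the stock default profile usually looks something like this (the exact keys can vary between LXD versions):

devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic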

This is usually enough to get up and running. But sometimes you might want the containers to actually ‘join’ the external network as if they were real machines on the network. In this case using NAT is not desirable, as it masks all of the containers behind the host’s IP.

In this case there are a few options available to get the containers to join the external network without NAT.

  1. Convert your host’s external interface into a switch/bridge (e.g. br0) and move the host’s IP addressing to that interface and add the host’s external interface to the br0 bridge. Then have LXD containers also connect to that bridge, effectively joining them to the external network at layer 2. You can do this using lxc config device add <instance> eth0 nic nictype=bridged parent=br0. Your containers would then rely on DHCP and DNS services from the external network as if they were real machines, and each would have their own MAC address on the network.
  2. Converting your host’s external interface and addressing to a bridge can be complex, especially if being done on a remote machine, as it can involve losing network access. Instead another option is to use the macvlan NIC type with a parent of the external interface. This achieves a similar solution to the external bridge above, without setting up another bridge. However it comes with a restriction that the containers are not allowed to communicate with the host. This may or may not be a problem depending on your use case.
  3. Some external networks do not allow a single host to use more than 1 MAC address per port. In this case, using an external bridge or macvlan is not possible because each instance would have its own MAC address. In these cases we can use either the routed or ipvlan NIC types. The latter has a similar restriction to macvlan in that it prevents communication with the host. But both of these types allow specific designated IPs to be advertised as owned by the host’s MAC address on the external network and then routed into instances without needing to use NAT (a minimal routed-NIC sketch follows this list).
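
As a minimal sketch of option 3 with a routed NIC, using the interface and address from this thread (adjust parent and ipv4.address to your environment):

# attach a routed NIC: the host answers ARP for 192.168.7.201 on eth1
# and routes that address into the instance, with no NAT involved
lxc config device add mail-server eth0 nic \
    nictype=routed \
    parent=eth1 \
    ipv4.address=192.168.7.201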

So with that background, you can perhaps see why I am a little confused why you are enabling NAT on your LXD host, as the primary reason to use routed NICs is when you don’t want to use NAT.

The way that routed and ipvlan advertise their IPs onto the network is using proxy ARP/NDP (also called Neighbour Proxy): it effectively has the host claim ownership of the IPs on the external network, which is why instances using these NIC types need to have static IPs assigned, and why, as a convenience, LXD preconfigures the IPs on the interfaces (as DHCP won’t work).
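
On the host, this mechanism can be inspected with standard tools, for example (assuming eth1 is the parent interface, as in the profile above):

# proxy ARP entries the host is answering for
ip neigh show proxy

# per-interface proxy_arp setting on the external interface
sysctl net.ipv4.conf.eth1.proxy_arp

# the route the host uses to reach the container's address
ip route get 192.168.7.201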

The vast majority of users will use DHCP with bridged or macvlan networking.
Different distros use different networking configuration systems with their own subtle behaviours, and using cloud-init introduces further restrictions on top.

In this case it appears that Debian’s network setup doesn’t like it if IPs are preconfigured, and cloud-init (to my knowledge anyway) doesn’t offer a way to clear the IPs before running the network setup. However perhaps there is a pre-script which can be run to achieve this.
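
As an untested sketch of such a pre-script, a cloud-init bootcmd could flush the pre-configured addresses; whether it actually runs early enough relative to networking.service would need testing:

#cloud-config
bootcmd:
  # flush anything LXD pre-configured on eth0 so ifup can apply
  # its own configuration cleanly (sketch only, not verified)
  - [ sh, -c, "ip addr flush dev eth0 || true" ]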

Another option would be to extend LXD with an option to not preconfigure the IPs on the interface when passing it into the instance, and leave it up to the network configuration inside the instance.

One thing to point out, though, is that, as we’ve seen, unlike Ubuntu with netplan, Debian does not clear the IPs on the interfaces before setting them up. This means that in actual fact we do not need cloud-init to configure networking; it’s already set up. All we need to do is set the DNS servers. So if cloud-init can be configured to just set up DNS servers, then that should suffice.

So, assuming that NAT isn’t actually desired and is disabled: if you can still ping other devices on the network and they can ping the container’s IP as well, that means that both the proxy ARP and the static routes are working.

If at that point you cannot ping 8.8.8.8, then the next step is to try pinging the network’s default gateway and make sure (using tcpdump) that the packets from the container are leaving the host’s external port.
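
Concretely, something along these lines on the host should show whether the ICMP packets leave the external interface (assuming eth1 is the external interface, as in this thread):

# watch ICMP traffic to/from 8.8.8.8 on the host's external interface;
# the source address shows whether NAT has rewritten it
tcpdump -ni eth1 icmp and host 8.8.8.8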

Assuming you can resolve the other issue on your network (pinging 8.8.8.8), a way to automate this via cloud-init, working around Debian’s network setup restrictions, is as follows:

Don’t add the address info in the cloud-init network config, just disable DHCP, and instead manually apply the desired nameservers to /etc/resolv.conf using the user-data:

lxc profile show routed
config:
  user.network-config: |
    #cloud-config
    version: 2
    ethernets:
        eth0:
          dhcp4: false
          dhcp6: false
          routes:
          - to: 0.0.0.0/0
            via: 169.254.0.1
            on-link: true
  user.user-data: |
    #cloud-config
    bootcmd:
      - rm -f /etc/resolv.conf
      - echo "nameserver 8.8.8.8" > /etc/resolv.conf
description: Default LXD profile
devices:
  eth0:
    ipv4.address: 192.168.1.201
    name: eth0
    nictype: routed
    parent: enp3s0
    type: nic
name: routed
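
For completeness, the profile above would then be applied next to the default profile at launch time, something like the following (the image alias here is an assumption on my part):

lxc launch images:debian/10/cloud mail-server --profile default --profile routed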

@simos I’m not sure if you’re interested in updating https://blog.simos.info/how-to-get-lxd-containers-get-ip-from-the-lan-with-routed-network/ but just letting you know how to get this working for Debian.

Thank you for these very detailed explanations, I really appreciate them. I don’t think that disabling DHCP is required for Debian, because it’s just not enabled by default. We can probably get rid of the “user.network-config” part of the profile configuration completely.

So with that background, you can perhaps see why I am a little confused why you are enabling NAT on your LXD host, as the primary reason to use routed NICs is when you don’t want to use NAT.

Oh, I didn’t intend to enable NAT; it was just an observation that enabling NAT made outgoing connections work. I didn’t know whether this information would be useful or not, as I’m still wrapping my head around how exactly this setup works.

It seems the only missing part is why my outgoing packets are not routed to my network’s gateway. What is really funny is that I can ping my network’s gateway (192.168.7.1) from the container, yet I cannot ping 8.8.8.8 for some reason. It MIGHT be some misconfiguration on my network’s part, but I’m not sure of this yet. Any ideas/suggestions about what it could be would be welcome. I’ll keep diagnosing this, and if I finally find a working setup, I’ll describe it here along with the working configuration.

I’ve found that the LXD Debian images have DHCP enabled by default, and while it cannot succeed and will leave the preconfigured IPs and routes intact, it does delay the boot and prevent lxc shell <instance> from working for a minute or so after starting the instance.

I’d look at using tcpdump when pinging 8.8.8.8 to check which interface the packets are going out of, check the source address hasn’t been mangled by NAT, and that return packets are arriving.

Thanks, I updated the post with the LXD profile that is suitable for Debian.


@tomp
When I add one more route on the host, ip route add default via 192.168.7.1, pointing at my gateway, then internet access from the container is restored.

These are the routes on the host that are set up by default after starting the container:

default via 10.0.2.2 dev eth0 proto dhcp src 10.0.2.15 metric 100 
10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15 
10.0.2.2 dev eth0 proto dhcp scope link src 10.0.2.15 metric 100 
192.168.7.0/24 dev eth1 proto kernel scope link src 192.168.7.200 
192.168.7.201 dev vethc5cebe03 scope link
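
For the record, a way to check which of these routes the host actually picks, both for its own traffic and for traffic forwarded from the container (the veth name is the one shown above):

# route the host itself would use for 8.8.8.8
ip route get 8.8.8.8

# route used for traffic arriving from the container's address
ip route get 8.8.8.8 from 192.168.7.201 iif vethc5cebe03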

Could it be that there is something wrong in how lxd is setting up routes on the host?

And these are routes in the container:

default via 169.254.0.1 dev eth0
169.254.0.1 dev eth0 scope link

@simos There is one error in this configuration: /etc/resolv.conf shouldn’t be modified like that, because it will be overwritten when the resolvconf service is restarted. A better way would be to use:

  bootcmd:
    - echo 'nameserver 8.8.8.8' > /etc/resolvconf/resolv.conf.d/tail
    - systemctl restart resolvconf

The routes in the container are correct. The routed NIC type sets up a link-local default route just to get the traffic from the container to the host.

After that it isn’t LXD’s responsibility anymore, and it’s up to the host’s routing table to route traffic as needed (it assumes that the host is set up to have external connectivity, like lxdbr0 does).

I’m a bit confused as to why you seem to have a default route on the host to 10.0.2.2 via dev eth0. If that isn’t your gateway, then that is likely causing the issue, as your LXD containers are being published on eth1.

Although that would explain why enabling NAT helps: if traffic is going out of eth0, then it’s likely the gateway at 10.0.2.2 won’t know how to route the return traffic for 192.168.7.0/24 (and thus hiding it behind the host’s IP on eth0 helps).

Ooooh, I think everything just fell into place. As I mentioned in the post referenced at the beginning, I’m running this in Vagrant, so it probably has its own network set up, and 10.0.2.2 is its default gateway. Interesting.


You should decide how you want your network routed, and then if you need to enable NAT, only do it on the external interface and not the internal one.
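
If NAT does turn out to be needed in this Vagrant setup, a rule scoped more narrowly than the blanket masquerade from earlier in the thread could look like this (interface and address as used above, adjust as needed):

# only NAT the container's traffic as it leaves the Vagrant uplink,
# leaving the 192.168.7.0/24 side untouched
iptables -t nat -A POSTROUTING -s 192.168.7.201/32 -o eth0 -j MASQUERADE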

Yeah, I will look into how networking in Vagrant/VirtualBox works and try to write up full instructions on how to make it work correctly in Vagrant, because it’s often a good place to test things like lxc/lxd. Thank you for your help, I hope I will be able to contribute back. :slight_smile:

Thanks! :slight_smile:

@tomp Just FYI I described the final working setup here: https://serverfault.com/questions/1047365/no-network-connectivity-in-the-lxc-container-set-up-in-the-routed-mode/1051267#1051267 I hope it will be helpful for someone.
