3.19 and Routed networking mode configuration example needed

These are the results:

root@copark:~# tcpdump -l -nn -i bond-wan host 200.119.xxx.xxx
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond-wan, link-type EN10MB (Ethernet), capture size 262144 bytes


root@copark:~# sysctl net.ipv4.conf.bond-wan.forwarding
net.ipv4.conf.bond-wan.forwarding = 1

So it looks like you may have an upstream firewall in place too.

I would expect to see an inbound ARP request or the ICMP packets themselves arriving at the bond-wan interface.

E.g. see my local test: 192.168.1.201 is my container’s IP, and the host’s external interface is enp3s0.

I then ping from a different PC on the network to 192.168.1.201, and can see the ARP who-has request arriving, and then proxy ARP replies and the ICMP packets start flowing.

sudo tcpdump -l -nn -i enp3s0 host 192.168.1.201
17:48:01.220214 ARP, Request who-has 192.168.1.201 tell 192.168.1.2, length 46
17:48:01.220250 ARP, Reply 192.168.1.201 is-at 44:8a:5b:25:54:d8, length 28
17:48:08.665486 ARP, Request who-has 192.168.1.201 tell 192.168.1.2, length 46
17:48:08.725433 ARP, Reply 192.168.1.201 is-at 44:8a:5b:25:54:d8, length 28
17:48:08.725632 IP 192.168.1.2 > 192.168.1.201: ICMP echo request, id 19542, seq 0, length 64
17:48:08.725713 IP 192.168.1.201 > 192.168.1.2: ICMP echo reply, id 19542, seq 0, length 64

If you are not seeing anything at all, then it suggests something upstream is filtering out requests.

If you were just seeing the ARP who-has requests coming in and no response, we could start to think that proxy ARP isn’t working, but in this case it’s not the problem.

Hi Thomas

I’m trying to find out what happened to my server, but in the meantime: do you know what I can do in my firewall, when it is activated, so that both machines (host and container) are reachable? I can’t leave my server without a firewall for long.

On the other hand, I want to tell you that I pinged the container’s public IP from my laptop while running tcpdump on the host side (that was my mistake in the previous test), and these were the results:

darwin@Darwins-MBP ~ % ping -c8 200.119.xxx.xxx
PING 200.119.xxx.xxx (200.119.xxx.xxx): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
Request timeout for icmp_seq 4
Request timeout for icmp_seq 5
Request timeout for icmp_seq 6
root@test:~# tcpdump -l -nn -i bond-wan host 200.119.xxx.xxx
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond-wan, link-type EN10MB (Ethernet), capture size 262144 bytes
13:35:04.415980 IP 162.204.xx.xx > 200.119.xxx.xxx: ICMP echo request, id 16393, seq 0, length 64
13:35:05.419637 IP 162.204.xx.xx > 200.119.xxx.xxx: ICMP echo request, id 16393, seq 1, length 64
13:35:06.422769 IP 162.204.xx.xx > 200.119.xxx.xxx: ICMP echo request, id 16393, seq 2, length 64
13:35:07.426262 IP 162.204.xx.xx > 200.119.xxx.xxx: ICMP echo request, id 16393, seq 3, length 64
13:35:08.430814 IP 162.204.xx.xx > 200.119.xxx.xxx: ICMP echo request, id 16393, seq 4, length 64
13:35:09.433627 IP 162.204.xx.xx > 200.119.xxx.xxx: ICMP echo request, id 16393, seq 5, length 64
13:35:10.437840 IP 162.204.xx.xx > 200.119.xxx.xxx: ICMP echo request, id 16393, seq 6, length 64
13:35:11.439606 IP 162.204.xx.xx > 200.119.xxx.xxx: ICMP echo request, id 16393, seq 7, length 64

Hi Thomas,

I kept working on this issue and was able to figure it out, so I want to share the solution with you. Let me know if this is the best way.

I did the following on the host side:

Step 1. Edit sysctl.conf

root@copark:~# nano /etc/sysctl.conf 
net.ipv4.ip_forward=1
net.ipv4.conf.bond-wan.forwarding=1
net.ipv4.conf.all.proxy_arp=1

Step 2. Activate it

root@copark:~# sysctl -p
net.ipv4.ip_forward = 1
net.ipv4.conf.bond-wan.forwarding = 1
net.ipv4.conf.all.proxy_arp = 1

Now I’m focused on finding a way to set up my firewall to permit access when it’s activated. Any advice is welcome.


Some of those sysctls are checked at container start, and the container will refuse to start otherwise.

It would be interesting to see which one of those sysctls actually fixed it. The proxy ARP one should not be needed as LXD creates manual proxy neighbor entries you can see with:

ip neigh show proxy
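For reference, an entry of the same form can also be added by hand if one is missing (a sketch using the interface and masked address from this thread):

ip neigh add proxy 200.119.xxx.xxx dev bond-wan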

The sysctl that made the container reachable from the internet was net.ipv4.conf.all.proxy_arp = 1.

I was able to configure the firewall so that the container has internet access and is reachable. However, I’m facing another issue with the firewall, since it’s interrupting communication from the host to the container and vice versa.

Can you suggest a way to overcome this issue?

Finally I got the solution. I had to implement a workaround in my firewall policies and write a script that rewrites the rules for the virtual ethernet interface vethxxxxxxx, since its name changes every time the container reboots (a sketch follows).
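A minimal sketch of such a script, assuming the container is named Meet and a simple iptables FORWARD rule (the rule itself is only an example):

#!/bin/sh
# Look up the host-side veth name LXD assigned to the container's eth0 on this boot.
VETH="$(lxc config get Meet volatile.eth0.host_name)"
# Drop any stale rule, then re-add it against the fresh interface name.
iptables -D FORWARD -o "$VETH" -j ACCEPT 2>/dev/null
iptables -A FORWARD -o "$VETH" -j ACCEPT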

Do you know if there is a way to make volatile.eth0.host_name: vethxxxxxxx permanent?

Yes, you can make that persistent by setting the host_name property on the NIC.

See https://linuxcontainers.org/lxd/docs/master/instances#nictype-routed

E.g.

lxc config device set <container> eth0 host_name=myct
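With a fixed name in place, firewall rules can reference the interface directly instead of being rewritten after every reboot, e.g. (hypothetical rules using the myct name from above):

iptables -A FORWARD -i myct -j ACCEPT
iptables -A FORWARD -o myct -j ACCEPT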

Hi Thomas,

Everything works as expected. Thank you again for your excellent support and input.

Hi Everyone,

This time I want to ask how to configure the network interface on Ubuntu 16.04 correctly.

I tried to configure the container as follows.

root@copark:~# lxc config device override Meet eth0 ipv4.address=PUBLIC-IP

root@Ubuntu16:~# nano /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
        address [Public-IP]
        netmask [xxx.xxx.xxx.xxx]
        gateway 169.254.0.1
        dns-nameservers 8.8.8.8 8.8.4.4

source /etc/network/interfaces.d/*.cfg

At first it appeared to be fine, but when I checked the container I noticed that the public IP was duplicated:

root@copark:~# lxc list
+----------+---------+------------------------------+------+-----------+-----------+
|   NAME   |  STATE  |             IPV4             | IPV6 |   TYPE    | SNAPSHOTS |
+----------+---------+------------------------------+------+-----------+-----------+
| Meet     | RUNNING | PUBLIC-IP (eth0)             |      | CONTAINER | 1         |
|          |         | PUBLIC-IP (eth0)             |      |           |           |
|          |         | 172.18.0.1 (br-a058d481dd5f) |      |           |           |
|          |         | 172.17.0.1 (docker0)         |      |           |           |
+----------+---------+------------------------------+------+-----------+-----------+

Can you suggest a way to fix this misconfiguration?

Can you show the output of ‘ip a’ inside the container?

This is the complete output.

root@Meet:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0@if57: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 36:20:d5:ff:4d:74 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet PUBLIC-IP/32 brd 255.255.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet PUBLIC-IP/16 brd xxx.xxx.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::3420:d5ff:feff:4d74/64 scope link 
       valid_lft forever preferred_lft forever
3: br-a058d481dd5f: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:8a:82:bd:0d brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 brd 172.18.255.255 scope global br-a058d481dd5f
       valid_lft forever preferred_lft forever
    inet6 fe80::42:8aff:fe82:bd0d/64 scope link 
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:66:a6:5d:32 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
6: veth2016691@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-a058d481dd5f state UP group default 
    link/ether 96:97:a2:f0:16:88 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::9497:a2ff:fef0:1688/64 scope link 
       valid_lft forever preferred_lft forever
8: veth0b506da@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-a058d481dd5f state UP group default 
    link/ether 8e:42:7c:1e:30:ca brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::8c42:7cff:fe1e:30ca/64 scope link 
       valid_lft forever preferred_lft forever

I suspect you have specified an incorrect subnet mask in your /etc/network/interfaces file; try changing it to 255.255.255.255 (i.e. a /32 address) rather than a /16. But you have hidden it, so I can’t be sure.
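For clarity, the corrected stanza would look like this (a sketch keeping the placeholders from your file):

auto eth0
iface eth0 inet static
        address [Public-IP]
        netmask 255.255.255.255
        gateway 169.254.0.1
        dns-nameservers 8.8.8.8 8.8.4.4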

After I changed the subnet mask to 255.255.255.255 the duplicate public IP was fixed, but I cannot access the internet. When I check /etc/resolv.conf I don’t see a DNS server.

root@Meet:~# ping google.com
ping: unknown host google.com


root@Meet:~# ip r
default via 169.254.0.1 dev eth0 
169.254.0.1 dev eth0  scope link 
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1 linkdown 
172.24.0.0/16 dev br-97de16545f97  proto kernel  scope link  src 172.24.0.1 


root@Meet:~# cat /etc/resolv.conf 
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN

If I check the networking service I see the following errors:

root@Meet:~# systemctl status networking.service 
ā— networking.service - Raise network interfaces
   Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
  Drop-In: /run/systemd/generator/networking.service.d
           └─50-insserv.conf-$network.conf
   Active: failed (Result: exit-code) since Wed 2020-04-22 05:57:05 UTC; 1min 30s ago
     Docs: man:interfaces(5)
  Process: 164 ExecStart=/sbin/ifup -a --read-environment (code=exited, status=1/FAILURE)
  Process: 132 ExecStartPre=/bin/sh -c [ "$CONFIGURE_INTERFACES" != "no" ] && [ -n "$(ifquery --read-environment --list --exclude=lo)" ] && udevadm settle (cod
 Main PID: 164 (code=exited, status=1/FAILURE)

Apr 22 05:57:05 Meet systemd[1]: Starting Raise network interfaces...
Apr 22 05:57:05 Meet ifup[164]: RTNETLINK answers: File exists
Apr 22 05:57:05 Meet ifup[164]: Failed to bring up eth0.
Apr 22 05:57:05 Meet systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
Apr 22 05:57:05 Meet systemd[1]: Failed to start Raise network interfaces.
Apr 22 05:57:05 Meet systemd[1]: networking.service: Unit entered failed state.
Apr 22 05:57:05 Meet systemd[1]: networking.service: Failed with result 'exit-code'.

The only way I found to fix it was to flush eth0:

root@Meet:~# sudo ip addr flush dev eth0
root@Meet:~# systemctl restart networking.service

root@Meet:~# ip r
default via 169.254.0.1 dev eth0 onlink 
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1 linkdown 
172.24.0.0/16 dev br-97de16545f97  proto kernel  scope link  src 172.24.0.1 

root@Meet:~# cat /etc/resolv.conf                
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 8.8.8.8
nameserver 8.8.4.4

root@Meet:~# ping -c 4 google.com
PING google.com (172.217.172.14) 56(84) bytes of data.
64 bytes from 02s09-in-f14.1e100.net (172.217.172.14): icmp_seq=1 ttl=55 time=1.48 ms
64 bytes from 02s09-in-f14.1e100.net (172.217.172.14): icmp_seq=2 ttl=55 time=1.53 ms
64 bytes from 02s09-in-f14.1e100.net (172.217.172.14): icmp_seq=3 ttl=55 time=1.88 ms
64 bytes from 02s09-in-f14.1e100.net (172.217.172.14): icmp_seq=4 ttl=55 time=1.59 ms

--- google.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 1.481/1.624/1.889/0.165 ms

Do you know what I can do to come up with a better solution?

I suspect it is failing because LXD adds the IPs to the interface and then your /etc/network/interfaces tries to add the same IP.

Is it possible not to use /etc/network/interfaces to set IPs, and use it only for the DNS servers?

Also, if possible, you could use netplan, as that removes any existing IPs first.

I found the solution by setting just the DNS servers.

Therefore I would like to share what I did on Ubuntu 16.04. First of all, delete the IP configuration in /etc/network/interfaces:

root@Meet:~# nano /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback

source /etc/network/interfaces.d/*.cfg

Then assign the static DNS servers by adding the following lines:

root@Meet:~# nano /etc/resolvconf/resolv.conf.d/head
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 8.8.8.8
nameserver 8.8.4.4

After that, save the changes and restart resolvconf.service or reboot the system.

root@Meet:~# systemctl restart resolvconf.service

root@Meet:~# reboot

Finally, when you check the /etc/resolv.conf file in the container, the nameserver entries should be stored there permanently.

root@Meet:~# cat /etc/resolv.conf 
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 8.8.8.8
nameserver 8.8.4.4

Thank you Thomas for your support!


Sharing my PoC with Hetzner private networking.

This is the Hetzner Cloud VM private (internal) networking case. The task: assign a private IP to a container so that it is accessible from other VMs or servers via vSwitch.

This assumes you already have vSwitch and internal networking set up, and that your VM’s private IP is 192.168.2.2. Cloud subnet: 192.168.2.0/23; global private subnet with the dedicated servers: 192.168.0.0/20.

Add another internal IP via the control panel; it’s buried under Networking → Subnets, then in the list of VMs → the triple-dot (…) menu → add alias IP, something like that.

In my case I chose 192.168.2.101 as the internal IP.

<pretty standard lxd init; for the bridge it’s fine to answer yes>

Ensure that you have forwarding on: net.ipv4.ip_forward = 1 (LXD with the default bridge/NAT will do it for you). A quick check is shown below.
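This uses the same sysctl checked earlier in the thread:

sysctl net.ipv4.ip_forward
# should print: net.ipv4.ip_forward = 1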

Configuring the container of interest; run on the host:

lxc launch ubuntu:focal c1
lxc stop c1
lxc config edit c1

Change the devices section to include the network part (by default the devices section is empty: devices: {}).

devices:
  eth0:
    ipv4.address: 192.168.2.101
    name: eth0
    nictype: routed
    type: nic

On the host, add a NAT rule so the cloud subnet can reach the internet (traffic destined for the internal 192.168.0.0/20 range is excluded from masquerading):

iptables -t nat -A POSTROUTING -s 192.168.2.0/23 ! -d 192.168.0.0/20 -m comment --comment "NAT for internal network" -j MASQUERADE

Let’s make the changes in the container: run lxc shell c1 to get inside and configure your network. A sample netplan /etc/netplan/50-cloud-init.yaml:

root@c1:~# cat /etc/netplan/50-cloud-init.yaml
# This file is generated from information provided by the datasource.  Changes
# to it will not persist across an instance reboot.  To disable cloud-init's
# network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: no
      dhcp6: no
      addresses:
        - 192.168.2.101/32
      routes:
        - on-link: true
          to: 0.0.0.0/0
          via: 169.254.0.1

      nameservers:
        addresses:
          - 213.133.100.100
          - 213.133.99.99
          - 213.133.98.98

Save and apply: netplan apply.

Now you should be able to:

  • ping 192.168.2.101 (the container itself)
  • ping 192.168.2.2 (your host’s private IP)
  • ping any other existing host in your private net range, like 192.168.0.12
  • ping 8.8.8.8 or any other public IP

If any of the above is not working, you have some issue with the setup, NAT, or other parts of the system. If all is good, congrats: use networking as usual for LXD containers.


@tomp I’ve tried 3.19 and Routed networking mode configuration example needed - #7 by tomp, but without much success: the network inside the container was not configured after start; it had no IP at all. Maybe it’s related to cloud-init somehow; I didn’t debug further. With a hand-crafted netplan configuration things worked fine (so I’m using an “intnet” profile in my own docs now, not reflecting it here to keep things simple).

That post will only work if the container doesn’t have its own network configuration trying to do DHCP, which will first wipe the static config from the interface passed in from LXD.

There is a guide on how to get this to work with cloud-init here: How to get LXD containers get IP from the LAN with routed network

Or just doing it statically as you do works fine too.
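If you go the static route, cloud-init’s network configuration can also be disabled so it does not touch the interface again; a minimal sketch, using the file path given in the header of 50-cloud-init.yaml above:

# Stop cloud-init from managing the network configuration inside the container.
cat > /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg <<EOF
network: {config: disabled}
EOF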

Thanks for explicitly pointing that out; it was my guess, but I had no time for more tests, as I needed to move a database from another server :slight_smile:

Will check later and update to make all steps clear in one place.