Can't get IPv6 Netmask 64 to work (no NAT, should be end to end)

So here is my output again.

root@c1:~# ip -6 r
1111:aaaa:3004:9978::2 dev eth1 proto kernel metric 256 pref medium
fe80::/64 dev eth1 proto kernel metric 256 pref medium
default via fe80::1 dev eth1 metric 1024 pref medium

Pinging Cloudflare's DNS works. Pinging my container's address from an online tool also works.

root@c1:~# ping 2606:4700:4700::1111
PING 2606:4700:4700::1111(2606:4700:4700::1111) 56 data bytes
64 bytes from 2606:4700:4700::1111: icmp_seq=1 ttl=61 time=7.44 ms
64 bytes from 2606:4700:4700::1111: icmp_seq=2 ttl=61 time=7.15 ms
64 bytes from 2606:4700:4700::1111: icmp_seq=3 ttl=61 time=7.20 ms
64 bytes from 2606:4700:4700::1111: icmp_seq=4 ttl=61 time=7.18 ms
64 bytes from 2606:4700:4700::1111: icmp_seq=5 ttl=61 time=7.16 ms
64 bytes from 2606:4700:4700::1111: icmp_seq=6 ttl=61 time=7.14 ms

--- 2606:4700:4700::1111 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5007ms
rtt min/avg/max/mdev = 7.144/7.216/7.440/0.141 ms

But dig does not work:

root@c1:~# dig @2606:4700:4700::1111 www.google.com

; <<>> DiG 9.11.3-1ubuntu1.11-Ubuntu <<>> @2606:4700:4700::1111 www.google.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

Good, so now you should use tcpdump or similar to check the outgoing traffic from your host, see where it is going, and check whether you are getting DNS response packets back or not. It may be a firewall somewhere.
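For example, a minimal capture on the host while re-running dig would look something like this (a sketch; replace eth0 with your host's external interface name):

# On the host: watch IPv6 DNS packets entering/leaving the external interface
tcpdump -l -nn -i eth0 'ip6 and port 53'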


I've been at it quite some time and can't quite figure out how to use tcpdump correctly.
I went into the container and ran this:

root@c1:/etc# dnsmasq

dnsmasq: failed to create listening socket for port 53: Address already in use

And this:

root@c1:/etc# netstat -anlp | grep -w LISTEN
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      161/systemd-resolve 
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      205/sshd            
tcp6       0      0 :::22                   :::*                    LISTEN      205/sshd

Is that to be expected?

Here’s how I would do it:

First, a refresh of my test config:

lxc network show lxdbr1 (I’m using lxdbr1 for IPv4 only):

lxc network show lxdbr1
config:
  ipv4.address: 10.0.171.1/24
  ipv4.nat: "true"
  ipv6.address: none
  ipv6.dhcp: "false"
  ipv6.nat: "true"
description: ""
name: lxdbr1
type: bridge
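For reference, a network with this configuration can be created in one step, roughly:

lxc network create lxdbr1 ipv4.address=10.0.171.1/24 ipv4.nat=true ipv6.address=none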

I’ve created a test container called crouted and added 2 NICs, one (eth0) bridged to lxdbr1 (for IPv4), and the other (eth1) routed to parent enp3s0 (my external interface on the host).
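For reference, NIC devices like these can be added with commands along the following lines (a sketch; the address and parent interface are from my test setup and will differ for you):

lxc config device add crouted eth0 nic network=lxdbr1 name=eth0
lxc config device add crouted eth1 nic nictype=routed parent=enp3s0 ipv6.address=2a02:nnn:76f4:1::1234

The resulting container config: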

architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 18.04 LTS amd64 (release) (20200317)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20200317"
  image.type: squashfs
  image.version: "18.04"
  volatile.base_image: 98e43d99d83ef1e4d0b28a31fc98e01dd98a2dbace3870e51c5cb03ce908144b
  volatile.eth0.hwaddr: 00:16:3e:ec:e2:b5
  volatile.eth1.hwaddr: 00:16:3e:83:d2:60
  volatile.eth1.name: eth1
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000}]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: STOPPED
devices:
  eth0:
    name: eth0
    network: lxdbr1
    type: nic
  eth1:
    ipv6.address: 2a02:nnn:76f4:1::1234
    nictype: routed
    parent: enp3s0
    type: nic
ephemeral: false
profiles:
- default
stateful: false
description: ""

I start the container, and check the IPs and routes have taken effect:

lxc start crouted
Wait a couple of seconds.
lxc ls crouted
+---------+---------+--------------------+------------------------------+-----------+-----------+
|  NAME   |  STATE  |        IPV4        |             IPV6             |   TYPE    | SNAPSHOTS |
+---------+---------+--------------------+------------------------------+-----------+-----------+
| crouted | RUNNING | 10.0.171.52 (eth0) | 2a02:nnn:76f4:1::1234 (eth1) | CONTAINER | 0         |
+---------+---------+--------------------+------------------------------+-----------+-----------+

Check routes inside container:

lxc exec crouted -- ip r
default via 10.0.171.1 dev eth0 proto dhcp src 10.0.171.52 metric 100 
10.0.171.0/24 dev eth0 proto kernel scope link src 10.0.171.52 
10.0.171.1 dev eth0 proto dhcp scope link src 10.0.171.52 metric 100 
lxc exec crouted -- ip -6 r
2a02:nnn:76f4:1::1234 dev eth1 proto kernel metric 256 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev eth1 proto kernel metric 256 pref medium
default via fe80::1 dev eth1 metric 1024 pref medium

Check ping to external addresses:

 lxc exec crouted -- ping 8.8.8.8 -c 5
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=57 time=23.9 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=57 time=23.8 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=57 time=23.8 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=57 time=24.0 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=57 time=23.7 ms

--- 8.8.8.8 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4007ms
rtt min/avg/max/mdev = 23.725/23.885/24.045/0.141 ms
lxc exec crouted -- ping 2606:4700:4700::1111 -c 5
PING 2606:4700:4700::1111(2606:4700:4700::1111) 56 data bytes
64 bytes from 2606:4700:4700::1111: icmp_seq=1 ttl=59 time=23.4 ms
64 bytes from 2606:4700:4700::1111: icmp_seq=2 ttl=59 time=23.4 ms
64 bytes from 2606:4700:4700::1111: icmp_seq=3 ttl=59 time=23.3 ms
64 bytes from 2606:4700:4700::1111: icmp_seq=4 ttl=59 time=23.3 ms
64 bytes from 2606:4700:4700::1111: icmp_seq=5 ttl=59 time=23.3 ms

--- 2606:4700:4700::1111 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4005ms
rtt min/avg/max/mdev = 23.325/23.378/23.465/0.200 ms

Now, check DNS resolution manually using the dig tool rather than relying on the systemd resolver (which you have shown is listening on 127.0.0.53:53 inside your container):
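If you do want to see what that stub resolver is configured to forward to, on Ubuntu 18.04 the following should show its upstream DNS servers (on newer releases the equivalent is resolvectl status):

lxc exec crouted -- systemd-resolve --status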

Test using external IPv6 resolver:

lxc exec crouted -- dig @2606:4700:4700::1111 www.linuxcontainers.org

; <<>> DiG 9.11.3-1ubuntu1.11-Ubuntu <<>> @2606:4700:4700::1111 www.linuxcontainers.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6433
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1452
;; QUESTION SECTION:
;www.linuxcontainers.org.	IN	A

;; ANSWER SECTION:
www.linuxcontainers.org. 884	IN	CNAME	rproxy.stgraber.org.
rproxy.stgraber.org.	884	IN	A	149.56.148.5

;; Query time: 23 msec
;; SERVER: 2606:4700:4700::1111#53(2606:4700:4700::1111)
;; WHEN: Thu Mar 19 08:54:12 UTC 2020
;; MSG SIZE  rcvd: 98

Test using external IPv4 resolver:

lxc exec crouted -- dig @8.8.8.8 www.linuxcontainers.org

; <<>> DiG 9.11.3-1ubuntu1.11-Ubuntu <<>> @8.8.8.8 www.linuxcontainers.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.linuxcontainers.org.	IN	A

;; ANSWER SECTION:
www.linuxcontainers.org. 899	IN	CNAME	rproxy.stgraber.org.
rproxy.stgraber.org.	899	IN	A	149.56.148.5

;; Query time: 430 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Thu Mar 19 08:54:58 UTC 2020
;; MSG SIZE  rcvd: 98

If these didn’t work, I would then need to check where the packets are getting lost, using tcpdump.

First, let’s review how the packets traverse from the container to the host, and how the response comes back:

  1. User runs dig to 2606:4700:4700::1111 inside the container; this means UDP packets with a destination port of 53 will leave the container via eth1, destined for the default gateway address fe80::1.
  2. Both bridged and routed NIC types make use of a Linux network concept called “veth pairs”: the OS creates a pair of virtual Ethernet devices, and any packet that goes in one end comes out of the other. One end of the pair is left on the host and the other end is moved into the container. In this way eth0 and eth1 in the container are connected to their respective pair ends on the host. You can see which are the respective host-side veth interfaces by running lxc info crouted

e.g.

 lxc info crouted
Name: crouted
Location: none
Remote: unix://
Architecture: x86_64
Created: 2020/03/19 08:43 UTC
Status: Running
Type: container
Profiles: default
Pid: 14160
Ips:
  lo:	inet	127.0.0.1
  lo:	inet6	::1
  eth0:	inet	10.0.171.52	veth575cd614
  eth0:	inet6	fe80::216:3eff:feec:e2b5	veth575cd614
  eth1:	inet6	2a02:nnn:76f4:1::1234	veth9142f1c9
  eth1:	inet6	fe80::f813:d4ff:fe89:c57f	veth9142f1c9

You can see that eth0 has a host-side interface called veth575cd614 and eth1 has a host-side interface called veth9142f1c9.

  3. For bridged NIC types, LXD ‘connects’ the host-side veth interface to the parent LXD bridge (in the case of my container’s eth0 this is lxdbr1). For routed NIC types, LXD does not connect the host-side veth interface to anything, and just leaves it ‘connected’ to the host like any other interface.

We can see this by running on the host:

ip a show dev veth9142f1c9
30: veth9142f1c9@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fe:d4:d1:5e:1c:82 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::1/128 scope link 
       valid_lft forever preferred_lft forever
    inet6 fe80::fcd4:d1ff:fe5e:1c82/64 scope link 
       valid_lft forever preferred_lft forever

LXD has added an IPv6 address of fe80::1/128 to the host-side interface of the veth pair.
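LXD also adds a static route on the host for the container’s IPv6 address, pointing down this veth interface. You can check it with something like:

ip -6 route show dev veth9142f1c9

which should list a host route for 2a02:nnn:76f4:1::1234 (exact metrics and flags will vary).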

  4. So packets leaving eth1 in the container destined for fe80::1 will arrive at the host end of the veth pair. After that it is up to the host to “route” the packets where they need to go.

So an expected packet flow for these UDP packets would be:

  • Leave container’s eth1 for fe80::1
  • Arrive at host’s veth interface.
  • Host routes packets out to the Internet via your host’s external interface, in my case enp3s0 (check ip r on host to see your default gateway).
  • Response packets arrive from Internet back at your host’s external interface destined for the container’s IPv6 address.
  • Your host sees the static route LXD adds for the container’s IPv6 address, and sends the response packets down the host-side veth interface.
  • Response packets appear in the container at eth1.
  5. So we can see there are several places we can ‘attach’ a tcpdump session to track the flow of these packets in and out. I would suggest:
  • enp3s0 on the host (your external interface).
  • veth9142f1c9 on the host (the host-side end of the veth pair).
  • eth1 inside the container.

Let’s set up tcpdump on enp3s0 and then, in a separate window, run dig inside the container again.

On the host:

tcpdump -l -nn -i enp3s0 host 2a02:nnn:76f4:1::1234 and port 53

Now run the dig command:

lxc exec crouted -- dig @2606:4700:4700::1111 www.linuxcontainers.org

The tcpdump results should show:

sudo tcpdump -l -nn -i enp3s0 host 2a02:nnn:76f4:1::1234 and port 53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp3s0, link-type EN10MB (Ethernet), capture size 262144 bytes
09:13:06.114853 IP6 2a02:nnn:76f4:1::1234.54264 > 2606:4700:4700::1111.53: 6205+ [1au] A? www.linuxcontainers.org. (64)
09:13:06.138527 IP6 2606:4700:4700::1111.53 > 2a02:nnn:76f4:1::1234.54264: 6205$ 2/0/1 CNAME rproxy.stgraber.org., A 149.56.148.5 (98)

This shows outbound DNS request packets leaving with a source address of 2a02:nnn:76f4:1::1234 going to 2606:4700:4700::1111 and the query for A? www.linuxcontainers.org..

Then the response packet coming back with an answer of CNAME rproxy.stgraber.org., A 149.56.148.5.

But this only shows us that the response packets made it back to the host’s external interface.

Let’s re-run the test, but now with tcpdump running on veth9142f1c9:

sudo tcpdump -l -nn -i veth9142f1c9 host 2a02:nnn:76f4:1::1234 and port 53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth9142f1c9, link-type EN10MB (Ethernet), capture size 262144 bytes
09:15:57.342239 IP6 2a02:nnn:76f4:1::1234.45415 > 2606:4700:4700::1111.53: 17864+ [1au] A? www.linuxcontainers.org. (64)
09:15:57.366240 IP6 2606:4700:4700::1111.53 > 2a02:nnn:76f4:1::1234.45415: 17864$ 2/0/1 CNAME rproxy.stgraber.org., A 149.56.148.5 (98)

Great, so we can see that the host is correctly routing the response packets that are coming in on enp3s0 down veth9142f1c9.

Finally, let’s check that the packets are arriving at the container’s eth1 interface:

sudo lxc exec crouted -- tcpdump -l -nn -i eth1 host 2a02:nnn:76f4:1::1234 and port 53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
09:18:39.547749 IP6 2a02:nnn:76f4:1::1234.33122 > 2606:4700:4700::1111.53: 4365+ [1au] A? www.linuxcontainers.org. (64)
09:18:39.571926 IP6 2606:4700:4700::1111.53 > 2a02:nnn:76f4:1::1234.33122: 4365$ 2/0/1 CNAME rproxy.stgraber.org., A 149.56.148.5 (98)

Success! The packets have now been confirmed arriving at the container.

So this is how you can break down the problem to see where the issue lies. Hope that helps.


Thanks to your steps and instructions I was able to get it working.

It turns out that enabling port 53 in ufw was not enough, as the packets were still blocked. Completely disabling ufw did the trick. Later I figured out that packet forwarding also needs to be enabled in ufw: https://help.ubuntu.com/lts/serverguide/firewall.html

sudo nano /etc/default/ufw

Make this change:

DEFAULT_FORWARD_POLICY="ACCEPT"
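Reloading ufw afterwards applies the new default policy:

sudo ufw reload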

Thank you very much for all the help!
Can I at least buy you a coffee or something?

Excellent, glad you got it working :slight_smile:

@stgraber has posted some info on donating to Ubuntu’s community fund if you would like to make a donation.


Hey @tomp, thanks again for everything!!

I have since learned that if I want to limit ingress and egress of a container, I need to use a bridge for that? So how about a managed bridge: is it impossible to have the IPv6 /64 on a managed bridge? Or how about one bridge for IPv4 (leaving the IPv4 /32 address on eth0 of the host) and another for IPv6 that gives each container its IPv6 address automatically?

Should I make a new thread for this for better search engine visibility?

Yes, you could create a new bridge manually (using netplan, for instance), e.g. call it br0, connected to your host’s external interface. You’d need to ensure that the host’s current static IPs are moved from the host’s interface to the new bridge interface, otherwise they will stop working.

Then you could add a new NIC device to your containers (in addition to the private IPv4 one) that connects directly to the new bridge, e.g. lxc config device add <container> eth1 nic nictype=bridged parent=br0

This would allow you to also use the limits settings on those devices, and they would be directly connected to the host’s external network.
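For example, a sketch of setting such limits on that device (placeholder values):

lxc config device set <container> eth1 limits.ingress 50Mbit
lxc config device set <container> eth1 limits.egress 50Mbit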

However, IIRC your ISP does not run an IPv6 router advertisement service, so your containers would not be able to auto-configure their IPv6 addresses, and you’d need to configure them internally using netplan inside the container.

Also worth noting that your ISP may enforce that only a single MAC address can be present on each network port; if they do this, then bridging will not work.

If this is the case, then you’d need to use the original approach you linked to and use a private managed bridge, and then use the proxy ndp daemon to advertise your container’s IPv6 addresses onto the external interface.
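For reference, a minimal ndppd configuration for that approach would look something like this (a sketch; eth0 and the /64 prefix are assumptions based on your earlier output, so check the ndppd documentation for your version):

# /etc/ndppd.conf
proxy eth0 {
    rule 1111:aaaa:3004:9978::/64 {
        static
    }
}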


Trying to set the right netplan config on the host with this:

network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      match:
        macaddress: 12:49:56:3f:4e:37
      dhcp4: no
      dhcp6: no
  bridges:
    br0:
      interfaces: [eth0]
      dhcp4: no
      dhcp6: no
      addresses:
        - 111.12.100.70/32
        - 1111:aaaa:3004:9978::1/64
      routes:
        - on-link: true
          to: 0.0.0.0/0
          via: 111.12.100.1
      gateway6: fe80::1
      nameservers:
        addresses:
          - 111.12.100.11
          - 111.12.100.10
          - 1111:aaaa::2:53
          - 1111:aaaa::1:53

But I get locked out after rebooting or applying the netplan config.
Any tips?

Or, getting back to routed: can I route to a bridge instead of to the container, and then connect the containers to that bridge?

I have since learned that my host does indeed use MAC filtering, so the unmanaged bridge br0 is out, I guess.

What is the easiest way to have traffic go through a bridge from the containers so I can limit the network and still work on a “restricted” host like mine?

So you could try the original approach you were attempting with the NDP proxy daemon (ndppd) and see if you can get that working.

But also, I don’t see any reason why we couldn’t add the limit support we have for bridged NICs to routed NICs, so I’ll add that to our ideas board for the future.


Yeah, but the guy states that there is a bug in netplan around on-link (whatever that is :smiley:), so it won’t work for IPv6 with netplan, and my attempts to get rid of netplan and go back to ifupdown were unsuccessful. I will give it a try anyway; it won’t be the first suicide mission I go on. :smiley:

Would p2p work with MAC address filtering on the host? It is also veth-based, so it could work?

And I am happy that I could at least give you an idea; thanks for considering it!

Good thing you sent me on that suicide mission, as it worked out after all @tomp!

So here is what is, in my opinion, the simplest and most feature-rich approach, as I can limit egress and ingress on the network.

Setting up Netplan

$macaddress, $ipv6address, $ipv4address and $ipv4gateway have to be set/changed to your addresses. Also, eth0 (my default physical interface) may have a different name on your system.

cat > /etc/netplan/01-netcfg.yaml <<EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      match:
        macaddress: $macaddress
      addresses:
        - $ipv4address/32
        - $ipv6address/128
      routes:
        - to: ::/0
          via: fe80::1
        - to: 0.0.0.0/0
          via: $ipv4gateway
          on-link: true
      nameservers:
        search: [ invalid ]
        addresses:
          - 1.1.1.1 # These four entries are Cloudflare's DNS
          - 1.0.0.1
          - 2606:4700:4700::1111
          - 2606:4700:4700::1001
EOF
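Then apply it. netplan try is the safer first step, since it rolls back automatically if connectivity breaks and you don't confirm:

netplan try
netplan apply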

Setting up the Kernel NDP proxying and forwarding

cat >>/etc/sysctl.conf <<EOF
net.ipv6.conf.all.forwarding=1
net.ipv6.conf.eth0.forwarding=1
net.ipv6.conf.all.proxy_ndp=1
net.ipv6.conf.eth0.proxy_ndp=1
EOF

Also make sure IPv6 is not disabled in this file.
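To load these immediately, without waiting for the reboot below:

sysctl -p /etc/sysctl.conf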

UFW Change - If UFW is used
nano /etc/default/ufw

Make this change: DEFAULT_FORWARD_POLICY="ACCEPT"

Then do a reboot.

Install and set up LXD
When initializing LXD after the install, set the IPv6 /64 range as the lxdbr0 IPv6 address. If LXD is already installed, you can run:

lxc network set lxdbr0 ipv6.address $ipv6address/64

This way the containers will get an IPv6 address from lxdbr0.

Also the following options should be set:

lxc network set lxdbr0 ipv6.dhcp false
lxc network set lxdbr0 ipv6.nat false
lxc network set lxdbr0 ipv6.routing true

The IPv4 settings can be left alone and stay on NAT.

Run a Linux Container and enjoy
lxc launch ubuntu:18.04 c1
Enjoy a container with a universally routable IPv6 address.
To get the address, run lxc list.

Special Thanks
This would not have been possible without the help and tutorials of Thomas Parrott @tomp and Ryan Young @yoryan. Thank you both very very much!

Glad to hear you got it working how you wanted it. Was that still running ndppd by the way (it wasn’t in your setup steps)?


No, just the kernel NDP proxying features that you told me about before and that Ryan also mentioned in his tutorial; that’s how I was able to put 2 and 2 together. :slight_smile:
Thanks again you guys! Especially you Thomas!

Do you think I should put this small tutorial up on askubuntu?

Ah, that’s interesting. Even with the NDP proxy sysctls activated, the kernel still needs static routes (as generated by the routed NIC type) for proxying to work. If it’s working without those, or without ndppd, then it suggests your ISP is routing your /64 subnet directly to your host rather than expecting NDP resolution to take place. In this way your host is just doing the router part of the job and does not need to proxy NDP as well.
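One way to check which situation applies is to watch the host's external interface for neighbour solicitations coming from the ISP's router (a sketch; eth0 is assumed to be the external interface, and the byte-offset filter assumes packets without IPv6 extension headers):

tcpdump -l -nn -i eth0 'icmp6 and ip6[40] == 135'

If solicitations arrive for individual container addresses, NDP proxying (via ndppd or static proxy entries) is needed; if they never appear for container addresses, the /64 is most likely being routed straight to the host.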

We have a Tutorials section in this forum (although of course we would be happy for you to put a tutorial up on askubuntu too). If you post a tutorial there, we could link to it from our Tutorials section as well.

Ok, so the prerequisites for our tutorial are:

  • Having a /64 IPv6 subnet
  • The ISP routes the /64 subnet directly to the host (if not, the NDP proxy daemon ndppd has to be used, see here)
  • Running Ubuntu 18.04 and LXD 4.0

Anything I forgot?

Sounds good.


I don’t have access to create Tutorial topics.

If you post it as a normal new post, I’ll move it into Tutorials for you.
