IPv6 without NAT inside LXC container

I have another idea which is pretty “weird” but in some senses simpler.

Rather than trying to spread the /64 addressing across two interfaces, I’ve tested the following approach, which works (a rough command sketch follows the list):

  1. Set up lxcbr0 with the 2a00:1234:1:5678::1/64 address and remove the 2a00:1234:1:5678::1/128 address from the host’s eth0 interface (so that it’s left with only its link-local address and no global addresses).
  2. Ensure eth0 on the host doesn’t accept router advertisements.
  3. You should still be able to set up the static route to 2a00:1234:1::1 and the default IPv6 route via it, as before.
  4. Add an IP neigh proxy entry for your lxcbr0 IPv6 address on the host’s eth0.
  5. Check you can ping your default gateway and an external IPv6 address (the host should use the IPv6 address on the lxcbr0 interface as the source address of packets leaving eth0).
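
Roughly, the commands for those steps would look something like this (only a sketch, assuming eth0 as the upstream interface, lxcbr0 as the bridge and the example addresses from this thread; adjust to your setup):

# move the global address from eth0 to lxcbr0
ip -6 addr del 2a00:1234:1:5678::1/128 dev eth0
ip -6 addr add 2a00:1234:1:5678::1/64 dev lxcbr0

# stop eth0 accepting router advertisements
sysctl -w net.ipv6.conf.eth0.accept_ra=0

# static route to the gateway, then the default route via it
ip -6 route add 2a00:1234:1::1 dev eth0
ip -6 route add default via 2a00:1234:1::1 dev eth0

# proxy NDP for lxcbr0's address on eth0 (proxy entries are only used when
# proxy_ndp is enabled; IPv6 forwarding also needs to be on so the host can
# route for the containers)
sysctl -w net.ipv6.conf.eth0.proxy_ndp=1
ip -6 neigh add proxy 2a00:1234:1:5678::1 dev eth0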

Now for the containers:

  1. They should be able to ping lxcbr0’s IPv6 address normally.
  2. You will need to add static IP neigh proxy entries for each container’s IPv6 address (or automate it) on the LXD host’s eth0 interface, as in the sketch below.
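
For example, on the host, something like this per container (a sketch; the address here is only illustrative, substitute each container’s real global address):

ip -6 neigh add proxy 2a00:1234:1:5678:216:3eff:fe94:4d2f dev eth0

This is what the ip neigh show proxy output from my test lab below reflects.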

This would allow full communication (it works in my test lab).

LXD host:

ip a
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:16:3e:36:04:d4 brd ff:ff:ff:ff:ff:ff
    inet 10.128.213.2/24 brd 10.128.213.255 scope global enp5s0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe36:4d4/64 scope link 
       valid_lft forever preferred_lft forever
3: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:12:c7:8f brd ff:ff:ff:ff:ff:ff
    inet 10.237.24.1/24 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet6 fd42:b545:2e58:ec06::2/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe12:c78f/64 scope link 
       valid_lft forever preferred_lft forever

ip -6 r
::1 dev lo proto kernel metric 256 pref medium
fd42:b545:2e58:ec06::1 dev enp5s0 metric 1024 pref medium
fd42:b545:2e58:ec06::/64 dev lxdbr0 proto kernel metric 256 pref medium
fe80::/64 dev enp5s0 proto kernel metric 256 pref medium
fe80::/64 dev lxdbr0 proto kernel metric 256 pref medium
default via fd42:b545:2e58:ec06::1 dev enp5s0 metric 1024 pref medium

ip neigh show proxy
fd42:b545:2e58:ec06::2 dev enp5s0  proxy
fd42:b545:2e58:ec06:216:3eff:fe94:4d2f dev enp5s0  proxy

lxc network show lxdbr0
config:
  ipv4.address: 10.237.24.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:b545:2e58:ec06::2/64
  ipv6.nat: "false"
description: ""
name: lxdbr0
type: bridge

lxc ls
+------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| NAME |  STATE  |        IPV4         |                     IPV6                      |   TYPE    | SNAPSHOTS |
+------+---------+---------------------+-----------------------------------------------+-----------+-----------+
| c1   | RUNNING | 10.237.24.80 (eth0) | fd42:b545:2e58:ec06:216:3eff:fe94:4d2f (eth0) | CONTAINER | 0         |
+------+---------+---------------------+-----------------------------------------------+-----------+-----------+

Container:

lxc exec c1 -- ip a
4: eth0@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:94:4d:2f brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.237.24.80/24 brd 10.237.24.255 scope global dynamic eth0
       valid_lft 2890sec preferred_lft 2890sec
    inet6 fd42:b545:2e58:ec06:216:3eff:fe94:4d2f/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 3146sec preferred_lft 3146sec
    inet6 fe80::216:3eff:fe94:4d2f/64 scope link 
       valid_lft forever preferred_lft forever

lxc exec c1 -- ip -6 r
fd42:b545:2e58:ec06::/64 dev eth0 proto ra metric 100 expires 3130sec pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
default via fe80::216:3eff:fe12:c78f dev eth0 proto ra metric 100 expires 1330sec mtu 1500 pref medium

Hi @tomp, thanks for your follow-up. To answer your first post, here’s the view inside the container:

# ip -6 r
2a00:1234:1:5678::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
default via fe80::216:3eff:fea7:1758 dev eth0 metric 1024 pref medium
# ip -6 n
fe80::216:3eff:fea7:1758 dev eth0 lladdr 00:16:3e:a7:17:58 router REACHABLE

I tried setting the container’s default gateway to the global address of lxcbr0 but it doesn’t make a difference one way or the other.
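
For reference, that test was roughly the following inside the container (a sketch, using lxcbr0’s global address from above):

# ip -6 route replace default via 2a00:1234:1:5678::1 dev eth0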

Here’s some additional information which clearly illustrates the issue.

When I ping from the host, NS messages to the gateway come from the global IP:

# tcpdump ip6
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:06:33.926756 IP6 2a00:1234:1:5678::1 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2a00:1234:1::1, length 32
10:06:33.929886 IP6 2a00:1234:1::1 > 2a00:1234:1:5678::1: ICMP6, neighbor advertisement, tgt is 2a00:1234:1::1, length 32
10:06:33.929917 IP6 2a00:1234:1:5678::1 > ams17s12-in-x0e.1e100.net: ICMP6, echo request, id 52543, seq 0, length 64
10:06:33.962581 IP6 ams17s12-in-x0e.1e100.net > 2a00:1234:1:5678::1: ICMP6, echo reply, id 52543, seq 0, length 64
10:06:34.926947 IP6 2a00:1234:1:5678::1 > ams17s12-in-x0e.1e100.net: ICMP6, echo request, id 52543, seq 1, length 64
10:06:34.959682 IP6 ams17s12-in-x0e.1e100.net > 2a00:1234:1:5678::1: ICMP6, echo reply, id 52543, seq 1, length 64
10:06:35.927166 IP6 2a00:1234:1:5678::1 > ams17s12-in-x0e.1e100.net: ICMP6, echo request, id 52543, seq 2, length 64
10:06:35.959820 IP6 ams17s12-in-x0e.1e100.net > 2a00:1234:1:5678::1: ICMP6, echo reply, id 52543, seq 2, length 64
# ip -6 n show dev eth0
2a00:1234:1::1 dev eth0 lladdr 3c:61:04:a4:1f:7c router REACHABLE
...

If I ping from inside the LXC container, they come from the link-local address. This is the tcpdump (taken on the host):

# tcpdump ip6
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
08:41:55.229077 IP6 fe80::216:3cff:fea8:db1b > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2a00:1234:1::1, length 32
08:41:56.305550 IP6 fe80::216:3cff:fea8:db1b > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2a00:1234:1::1, length 32
08:41:57.345548 IP6 fe80::216:3cff:fea8:db1b > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2a00:1234:1::1, length 32
# ip -6 n show dev eth0
2a00:1234:1::1  router FAILED
...

Interesting idea! I’m trying it out. The gateway is not responding to NS messages from lxcbr0. Possible that the provider does some type of MAC filtering? Hmm. I’ll work on it.

So this looks like the host is losing the neighbour address of the default gateway.

If your provider’s router is filtering messages with a link-local source address, that may explain why it’s being ignored when you remove the global address from the eth0 interface.

Your provider shouldn’t really be filtering link-local packets; they are critical for IPv6 operation.

If you use ndppd and can get it to respond statically for all IPs in the /64 with a global address, perhaps it will work.
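
A minimal ndppd.conf along these lines, statically answering for the whole /64 on eth0, is what I have in mind (a sketch only, using the prefix from your setup; making it answer from a global source address is the part I’m less sure about):

proxy eth0 {
    rule 2a00:1234:1:5678::/64 {
        static
    }
}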

I’m thinking they don’t filter link-local but might be filtering based on MAC. Another idea might be to make an ifup script which pulls the MAC of the gateway and then manually adds it to the static neighbor cache, i.e. bypassing NDP.
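
Something along these lines, maybe (an untested sketch; the gateway, interface and fallback MAC are just the values from the examples above):

#!/bin/sh
# hypothetical ifup hook: grab the gateway's MAC from the neighbor cache if
# it has been learned, otherwise fall back to a configured value, and pin it
# as a permanent entry so reaching the gateway no longer depends on NDP
GW=2a00:1234:1::1
DEV=eth0
FALLBACK_MAC=3c:61:04:a4:1f:7c

MAC=$(ip -6 neigh show "$GW" dev "$DEV" | awk '{for (i=1; i<NF; i++) if ($i == "lladdr") print $(i+1)}')
ip -6 neigh replace "$GW" lladdr "${MAC:-$FALLBACK_MAC}" dev "$DEV" nud permanent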

Edited to add:
Yeah, if I add it as a permanent neighbor, i.e.

# ip -6 neigh del 2a00:1234:1::1 dev eth0
# ip -6 neigh add 2a00:1234:1::1 lladdr 3c:61:04:a4:1f:7c dev eth0
# ip -6 n show dev eth0
2a00:1234:1::1 dev eth0 lladdr 3c:61:04:a4:1f:7c PERMANENT

then (at least in the very initial test) it seems to work. But wow, what an ugly hack.

What about the ndppd dev version that responds with a global address?

Oh, but that’s your host soliciting for the router, so ndppd wouldn’t help with that.

I would be opening a support ticket with my provider around now :grinning:

Perhaps they can route you a separate /64 rather than the on-link one, as that is the proper way of doing it anyway.

Yeah, I put in a support ticket and they said to set the neighbor manually, as I wrote in the previous post. :man_shrugging:

I think the source address selection is outside of LXC, and I can’t find a good reliable way to override it, so that might be the end of the line.

What I will perhaps consider is creating a second interface in the LXC containers: keeping the veth for IPv4 (and communication with the host) and adding a second macvlan interface for IPv6.

Both a bit ugly, but should work.
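
For the record, the macvlan side of that would be something like this in the container config (a sketch with assumed names, using LXC 3.x style keys; eth0 is the host’s upstream interface):

lxc.net.1.type = macvlan
lxc.net.1.macvlan.mode = bridge
lxc.net.1.link = eth0
lxc.net.1.name = eth1
lxc.net.1.flags = up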

Yes that would work too.

It’s a shame providers don’t deploy IPv6 properly.

Yes. As it has been explained to me, it seems to be a SolusVM/Virtualizor limitation rather than their own ignorance. But thank you for your ideas and suggestions @tomp.