Floating IP between two containers

I am replacing OpenVZ with LXD on many nodes. I have a setup that has been discussed here before:

lxdbr0 -> 192.168.122.1

Each container gets its own static IP with 192.168.122.1 as its outbound route, and on the host node I use ipv4.address and ipv4.routes to attach the IP to the container. If I move the container, the route goes with it. All good so far!
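
For reference, the per-container setup looks roughly like this (the container name, device and addresses below are placeholders, not our real values):

    # Sketch only: bridged NIC with a static IP plus an extra routed IP.
    lxc config device add c1 eth0 nic nictype=bridged parent=lxdbr0 \
        ipv4.address=192.168.122.10 ipv4.routes=203.0.113.10/32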

New wrinkle: I have some containers that use Keepalived to pass a floating IP between two containers on different hosts. In OpenVZ, I have a script that connects to the host (192.168.122.1 in this case) and adds the IP address to the container as a venet address. It is automatically picked up by the host and announced as a kernel route to the network.
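
For context, the OpenVZ side of that is essentially a one-liner (CTID and address are made up):

    # On the OpenVZ host: add the floating IP as a venet address to container 101.
    # The matching kernel route shows up on the host automatically.
    vzctl set 101 --ipadd 203.0.113.50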

In LXD, I can add the IP address inside the container; my problem is routing. If I connect to the host and add a static route, the route stays up even if the IP moves to another container (on another host) and/or the original container goes down. The same happens if I add the IP to ipv4.routes in the container’s config.

I realize I could put FRR or Zebra inside the container and announce the IP that way, but I was hoping to avoid that and handle it on the host.

Any ideas?

Hi @seanfulton

Could you clarify why you are using static routes on the host rather than just bridging your containers onto the main physical network (which would then allow your containers to assign their own IPs)? Are you routing to a different subnet inside the container?

Can you provide a bit more info about your network setup please?

Thanks

Hi Tom, thanks for the reply.

Each server has 2 NICs on two different LAN segments. So bridging to a NIC would mean having to bridge to both NICs and set up routing inside the container. Each LAN has its own subnet, and the containers are all in a third subnet.

The way it is set up now, lxdbr0 is a bridge, but it doesn’t really do anything except provide a connection point for the containers to the host and the network. All containers use 192.168.122.1 as their default route out, so they can be migrated from host to host without having to be reconfigured.

I am not saying this is the best way, but this is what I figured out at the time we started down this road.

sean

@seanfulton I see, that makes sense.

So OpenVZ uses static routes and proxy ARP to advertise the IPs out onto the wider network at layer 2.

You could add a script on the host that does something similar to adding a venet IP in OpenVZ, i.e. it would:

  • Add a proxy ARP entry using ip neigh add proxy ...
  • Add a static route for the IP via lxdbr0
  • Add the IP inside the container using lxc exec ... ip addr add ...

These rules would stay in place as long as the lxdbr0 interface is up though, which is perhaps not ideal. You would need something to remove them when the IP is moved elsewhere, along the lines of the sketch below.
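
A very rough sketch of what such a script might look like (the VIP, interface and container names are placeholders, and it assumes something like Keepalived’s notify scripts calls it with "up" or "down"):

    #!/bin/sh
    # Sketch only: VIP, interface and container names are placeholders.
    # "up" is called when this host's container should own the floating IP,
    # "down" when the IP moves elsewhere (e.g. via Keepalived notify scripts).
    VIP="203.0.113.50"
    EXT_IF="eth0"    # host interface facing the wider network
    CT="c1"          # container that should hold the VIP

    case "$1" in
      up)
        ip neigh add proxy "$VIP" dev "$EXT_IF"   # answer ARP for the VIP
        ip route add "$VIP/32" dev lxdbr0         # route it towards the bridge
        lxc exec "$CT" -- ip addr add "$VIP/32" dev eth0
        ;;
      down)
        lxc exec "$CT" -- ip addr del "$VIP/32" dev eth0
        ip route del "$VIP/32" dev lxdbr0
        ip neigh del proxy "$VIP" dev "$EXT_IF"
        ;;
    esac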

Yes, that is exactly what I was going to do, and that is exactly the problem.

With OpenVZ, the routes go away when the IP goes away, either through ifdown or a container reboot.

Anyone have any ideas? Maybe there is a different way to do this?

So there is an open issue to add LXC’s existing routed network support to LXD (https://github.com/lxc/lxd/issues/6175). Like OpenVZ, this uses proxy ARP and static routes on the host to allow a container to “appear” on one of the host’s specific interfaces without using a bridge (but, unlike IPVLAN, it actually uses the host’s routing table to reach destinations like other containers or other networks).

Once this is done you could do away with your lxdbr0 and the static routes entirely and just use static IPs and routed network mode (which will then set up what you need). It will naturally remove the proxy ARP entries and static routes on container stop.
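
Once it lands, the configuration might look something like this (container name, parent interface and address are placeholders, and the exact keys could still change):

    # Hypothetical example of the planned routed NIC type.
    lxc config device add c1 eth0 nic nictype=routed parent=eth0 \
        ipv4.address=203.0.113.50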

The only problem is that in its initial implementation it will be leveraging LXC’s underlying routed network mode, and so will not support hot-plugging additional IPs without a reboot of the container.

However, if there were a request to add hot-pluggable routed IPs, this might be something we could add in a future cycle.

If you want to do it the proper way, use FRR on the host and in the containers and announce the same /32 host routes (loopbacks) from the containers to the bridge; then you get BGP anycast with ECMP, which should balance between containers.
Keepalived sounds more like a layer 2 hack. Why mess around with floating IPs when it’s easy to do this stuff at layer 3 these days? You use floating IPs / VRRP etc. when the device is not able to run BGP, which everything is nowadays, in Linux land anyway.
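
For example (made-up ASNs and addresses, and assuming FRR with bgpd enabled), each container would announce the shared /32 loopback roughly like this:

    # Run inside each container; ASNs, neighbour and VIP are made up.
    ip addr add 203.0.113.50/32 dev lo
    vtysh \
      -c 'configure terminal' \
      -c 'router bgp 65001' \
      -c 'neighbor 192.168.122.1 remote-as 65000' \
      -c 'address-family ipv4 unicast' \
      -c 'network 203.0.113.50/32' \
      -c 'end' -c 'write memory'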

Cheers!
Jon.

Thanks for the feedback, but that will not work for our use case. We’re not trying to load balance between two containers.

These are software load balancers, and Keepalived has built-in health checks to determine whether the load balancers are functioning correctly. If there is an error with the service, Keepalived can shift the IP to the other load balancer, which will take over.

BGP can’t do that.

OK, I probably misunderstood the original setup then, but it is still interesting to me. I normally see the software load balancers (HAProxy or nginx?) doing the health checks themselves against some backend; if an HAProxy instance dies, it stops advertising its address and you use the other one.
I also forgot to say: active/active balancing across your load balancers makes more sense than active/passive. The load balancers should normally be stateless, in that a request can go to either, so you can scale out sideways.

Cheers!
Jon.