The listen config currently creates hairpin rules nicely. But there is one typical scenario where the rules generated do not quite suffice. Consider the following scenario:
One server with one public IP
Container 1: a reverse proxy with a listen config for ports 80 and 443 on the public IP
Containers 2 and 3: application servers that are normally accessed through the reverse proxy
When the application servers want to address each other in a location-agnostic manner, they should use the FQDN, which resolves to the public IP. The current NAT rules translate the destination of packets sent from an application container to the reverse proxy's local IP, but the reply is not SNAT'ed or MASQUERADEd back, resulting in a timeout.
Affected examples include federated cloud applications such as Nextcloud as well as Nextcloud’s integration with Collabora Office. Basically this applies to any application that wants to talk over a REST HTTPS API with another one when it happens to be on the same server.
There are obviously a ton of workarounds possible (IPv6, split horizon DNS, application intelligence, manually added NAT rules), so this question is really about having something elegant that integrates well with LXD. That way, it would not require a lot of orchestration as the container landscape evolves.
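For reference, the "manually added NAT rules" workaround typically boils down to a single MASQUERADE rule on the host. A minimal sketch, assuming a hypothetical lxdbr0 subnet of 10.0.0.0/24 and the reverse proxy at 10.0.0.2 (adjust to your actual addressing):

```shell
# Hairpin fix: when a container on lxdbr0 reaches the proxy through the
# DNAT'ed public IP, rewrite the source address so replies flow back
# through the host instead of going container-to-container directly.
sudo iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -d 10.0.0.2 \
    -p tcp -m multiport --dports 80,443 -j MASQUERADE
```

The downside is exactly the orchestration problem described above: this rule has to be maintained by hand as containers and addresses change.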
Listen rules can match on ports, but since the corresponding SNAT rules cannot, they cannot be generated 1:1 from the listen rules. The only theoretical approach I've come up with so far is SNAT based on connection marking. Have any other approaches already been explored in LXD? Are there any solutions for LXD implemented outside of LXD?
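A sketch of the connection-marking idea, using the same hypothetical addressing (public IP 91.190.196.250, bridge subnet 10.0.0.0/24, proxy at 10.0.0.2). This is purely illustrative, not what LXD generates:

```shell
# Mark hairpin connections at DNAT time, where a port match is available...
sudo iptables -t nat -A PREROUTING -s 10.0.0.0/24 -d 91.190.196.250 \
    -p tcp --dport 443 -j CONNMARK --set-mark 0x1
sudo iptables -t nat -A PREROUTING -s 10.0.0.0/24 -d 91.190.196.250 \
    -p tcp --dport 443 -j DNAT --to-destination 10.0.0.2
# ...then SNAT in POSTROUTING based on the mark alone, no port match needed.
sudo iptables -t nat -A POSTROUTING -m connmark --mark 0x1 -j MASQUERADE
```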
What are you referring to when you say "listen rules"? This isn't a concept I am familiar with in LXD.
Can you show the output of sudo iptables-save (or sudo nft list ruleset if you're using nftables), along with a reproducer command that isn't working (e.g. curl) and lxc config show <instance> --expanded for the relevant instances?
Sorry, I meant the port forwarding function of LXD's proxy device. For some reason I remembered it as "listen", which is its first config key. Probably because, the way we use it, it is not really acting as a proxy.
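For context, the device in question is configured along these lines (instance name "web" and device name "myport443" are hypothetical; with nat=true the instance needs a static IP on its NIC):

```shell
# NAT-mode proxy device: LXD installs DNAT rules on the host rather than
# running a userspace proxy process, hence "not really a proxy".
lxc config device add web myport443 proxy \
    listen=tcp:91.190.196.250:443 \
    connect=tcp:0.0.0.0:443 \
    nat=true
```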
The relevant bits from iptables-save -t nat on one host:
From outside, it works as expected. But from any host on lxdbr0's subnet, it times out, since the reply packet does not return from 91.190.196.250. As such, it is a hairpin NAT case.
Yes, loading br_netfilter and restarting the instance will unlock that functionality.
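A sketch of what that looks like in practice: load the module now, make it persist across reboots, then restart the affected instance (the file name and the instance name "web" are arbitrary):

```shell
# Load the bridge netfilter module immediately...
sudo modprobe br_netfilter
# ...arrange for it to be loaded on every boot...
echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf
# ...and restart the instance so LXD regenerates its rules.
lxc restart web
```

Note the caveat in the next reply: with br_netfilter loaded, the host's firewall rules can start applying to intra-bridge traffic.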
The reason we never load the br_netfilter module is because it will then potentially apply the system’s existing firewall rules to intra-bridge traffic, which may cause unexpected disruption, depending on the rules in place.
When the br_netfilter module is loaded while a container is running, its existing hairpin NAT setup breaks (at least within a few days). However, the new rules that rely on br_netfilter do not take effect until the container is rebooted.
I noticed this when certain quite specific PHP routines started failing with a timeout. They tried to fetch an HTTPS resource that happened to be hosted on the same container.
Got a similar issue involving br_netfilter, but it is related to a network forward's listen address:
level=warning msg="IPv4 bridge netfilter not enabled. Instances using the bridge will not be able to connect to the forward listen IPs" driver=bridge err="br_netfilter kernel module not loaded" network=lxdbr0 project=default