Reasoning behind in/out/fwd netfilter rules

To me the fwd, in and out chains look kinda useless since they use an accept policy.
That means that accepting certain packets explicitly doesn’t have any effect.
So is there a reason why they exist?

That’s what they currently look like on Arch Linux with LXD 5.2:

table inet lxd {
        chain pstrt.lxdbr0 {
                type nat hook postrouting priority srcnat; policy accept;
                ip saddr 10.149.19.0/24 ip daddr != 10.149.19.0/24 masquerade
                ip6 saddr fd42:8ec3:2d17:407a::/64 ip6 daddr != fd42:8ec3:2d17:407a::/64 masquerade
        }

        chain fwd.lxdbr0 {
                type filter hook forward priority filter; policy accept;
                ip version 4 oifname "lxdbr0" accept
                ip version 4 iifname "lxdbr0" accept
                ip6 version 6 oifname "lxdbr0" accept
                ip6 version 6 iifname "lxdbr0" accept
        }

        chain in.lxdbr0 {
                type filter hook input priority filter; policy accept;
                iifname "lxdbr0" tcp dport 53 accept
                iifname "lxdbr0" udp dport 53 accept
                iifname "lxdbr0" icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
                iifname "lxdbr0" udp dport 67 accept
                iifname "lxdbr0" icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
                iifname "lxdbr0" udp dport 547 accept
        }

        chain out.lxdbr0 {
                type filter hook output priority filter; policy accept;
                oifname "lxdbr0" tcp sport 53 accept
                oifname "lxdbr0" udp sport 53 accept
                oifname "lxdbr0" icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
                oifname "lxdbr0" udp sport 67 accept
                oifname "lxdbr0" icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, echo-request, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
                oifname "lxdbr0" udp sport 547 accept
        }
}

If we dropped all unmatched traffic, that would also drop traffic that is allowed in other tables. Unfortunately this is how nftables works: a packet has to be accepted in every table that hooks into its path for it to pass, whereas it only has to be dropped in a single table to be blocked.
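
The behaviour described above can be modelled in a few lines of Python. This is only a toy sketch of how verdicts combine across base chains on one hook, not nftables code: the `Rule`, `BaseChain` and `hook_verdict` names and the `host_forward` chain are made up for illustration, and packets are plain dicts.

```python
from dataclasses import dataclass, field
from typing import Callable, List

ACCEPT, DROP = "accept", "drop"


@dataclass
class Rule:
    matches: Callable[[dict], bool]  # predicate over a packet (a plain dict here)
    verdict: str


@dataclass
class BaseChain:
    name: str
    priority: int
    policy: str
    rules: List[Rule] = field(default_factory=list)

    def evaluate(self, packet: dict) -> str:
        # The first matching rule decides; otherwise the chain policy applies.
        for rule in self.rules:
            if rule.matches(packet):
                return rule.verdict
        return self.policy


def hook_verdict(chains: List[BaseChain], packet: dict) -> str:
    # Base chains on the same hook run in ascending priority order.
    for chain in sorted(chains, key=lambda c: c.priority):
        if chain.evaluate(packet) == DROP:
            return DROP  # a drop is final, whatever other chains say
    return ACCEPT  # an accept only ends evaluation of that one chain


# LXD-style forward chain: explicit accept plus an accept policy.
lxd = BaseChain("fwd.lxdbr0", 0, ACCEPT,
                [Rule(lambda p: p["iifname"] == "lxdbr0", ACCEPT)])
packet = {"iifname": "lxdbr0"}

# A hypothetical host firewall chain with a drop policy and no matching rule:
# the packet is dropped whether that chain runs before or after LXD's.
for prio in (-10, 10):
    firewall = BaseChain("host_forward", prio, DROP)
    print(prio, hook_verdict([lxd, firewall], packet))
```

In this model the accept in `fwd.lxdbr0` never rescues the packet from the other chain's drop policy, which is why a default-drop policy in LXD's own table would also block traffic that other tables allow.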

See discussion here for more detail

To actually drop traffic you can use our network ACL feature.

Correct me if I’m wrong, but the way I understand it, since the policy of LXD’s chains is accept, all the other rules have no effect and we could remove them without any change in behavior. That doesn’t mean we should switch to drop, just that those rules could be removed because they don’t do anything.
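
One way to sanity-check that claim is to simulate a single chain and compare verdicts with and without its accept rules. The sketch below is a simplified model, not something LXD or nftables provides: the function names are made up, only two of the `in.lxdbr0` rules are modelled, and side effects such as counters are ignored.

```python
def chain_verdict(rules, policy, packet):
    """Return the verdict of the first matching rule, else the chain policy."""
    for matches, verdict in rules:
        if matches(packet):
            return verdict
    return policy


def accept_rules_are_redundant(rules, policy, packets):
    """True if removing every accept rule leaves all sample verdicts unchanged."""
    stripped = [(m, v) for m, v in rules if v != "accept"]
    return all(
        chain_verdict(rules, policy, p) == chain_verdict(stripped, policy, p)
        for p in packets
    )


# Two of the in.lxdbr0 rules from the ruleset above (DNS and DHCP over UDP).
rules = [
    (lambda p: p["iifname"] == "lxdbr0" and p["proto"] == "udp" and p["dport"] == 53, "accept"),
    (lambda p: p["iifname"] == "lxdbr0" and p["proto"] == "udp" and p["dport"] == 67, "accept"),
]
packets = [
    {"iifname": "lxdbr0", "proto": "udp", "dport": 53},
    {"iifname": "lxdbr0", "proto": "tcp", "dport": 22},
    {"iifname": "eth0", "proto": "udp", "dport": 53},
]

print(accept_rules_are_redundant(rules, "accept", packets))  # policy accept
print(accept_rules_are_redundant(rules, "drop", packets))    # policy drop
```

With an accept policy the explicit accepts change nothing for these packets; they only start to matter once the chain's policy (or a later rule) drops traffic.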

That being said, IMO using LXD on its own without an additional firewall is pretty bad because it both enables IP forwarding and sets the forward policy to accept, effectively turning your device into a router.

And if you decide to use a firewall, the only rules that have an effect are the NAT rules. If the firewall has a drop chain, it doesn’t matter whether it has a lower or a higher priority than LXD’s chain: LXD’s rules need to be copied into the firewall configuration to have any effect, meaning there’s effectively no difference between enabling and disabling LXD’s internal firewall.

For that reason I’d suggest letting LXD’s firewall use a default-drop chain to make it secure by default. If the user has their own firewall they can disable LXD’s and copy its rules into the firewall configuration.

On top of all that, LXD could try to be a better citizen than e.g. Docker (which is REALLY bad when it comes to that) and provide integrations with firewalls so you don’t have to hardcode IP addresses hoping that they don’t change. Ideally the interface would be generic enough that we don’t have to support every firewall on earth explicitly, e.g. by calling an external script where the user can react to changes in LXD’s network setup.

The alternative would be to do it the other way around and let users subscribe to firewall changes through the LXD server API. That should be made as easy as possible though because most people don’t want to compile code to do scripting on their server.

Again, correct me if I’m wrong.

You are not wrong. Those rules, apart from the SNAT rule, are not really doing anything at this time, by themselves. Really they are a skeleton ruleset used to allow managed bridge services if the user enables the ACL feature (How to configure network ACLs - LXD documentation).

There doesn’t appear to be consensus yet on how different applications should interact when creating nftables rules. The table namespace feature is great for simplifying rule management for an application, but the way nftables applies a drop rule from any table, even if the traffic is allowed in another table, is awkward, as it undoes some of the benefits that table namespaces could otherwise have offered. This is different from when LXD uses xtables (iptables/ip6tables/ebtables), where there are standard chains that we can add rules to so that our traffic policy interacts with other applications’ rules.

This can be disabled by setting ipv4.routing=false on the managed bridge (see Bridge network - LXD documentation).

That would be very disruptive for existing users, and even if we only applied it on newly created networks, it would complicate getting started quickly with LXD (as a novice user would immediately have to think about their outbound internet policy). We wouldn’t be able to add a default drop policy to the entire machine as it would potentially disconnect any remote users or interfere with other applications, so at best we could add a default drop rule to traffic to/from the lxdbr0 interface, which wouldn’t improve security of enabling router mode for the other interfaces on the system anyway.

We don’t go in for external hooks with LXD, as our experience with LXC indicated that adding hook support made it hard to understand how everyone uses the application and made changing things risky, as it would be bound to break someone’s integration workflow. Also, in a cluster environment it’s not always clear where the hook should be run. We do have a REST API you can subscribe to in order to get events, although I’m not sure what specific events you would be interested in.
One thing that does come to mind is something @stgraber wrote as a proof of concept for our BGP integration, which uses the event stream to monitor for instances starting and then reads their config to get their IP in order to advertise them via BGP.

See https://github.com/stgraber/lxd-bgp that may provide some inspiration.

Hopefully in the future we will see some consensus around how applications should coexist when using nftables, or perhaps its behavior will be changed to allow an allow rule to apply irrespective of a drop rule in another table.

We wouldn’t be able to add a default drop policy to the entire machine as it would potentially disconnect any remote users or interfere with other applications, so at best we could add a default drop rule to traffic to/from the lxdbr0 interface, which wouldn’t improve security of enabling router mode for the other interfaces on the system anyway.

Right, but couldn’t we just drop forwarded traffic instead? That would still allow all other traffic, just like without a firewall, but would basically undo enabling forwarding (except for lxdbr0).

We do have a REST API you can subscribe to in order to get events, although I’m not sure what specific events you would be interested in.

A list of LXD-managed interfaces with all their addresses. As you can see in my OP, there are basically just two important rules, in the postrouting chain, which masquerade all traffic that wants to leave lxdbr0. Writing them requires knowing the addresses that were generated when the interface was created, though.
Sure, usually you’d just set up all your bridges once, copy the addresses to your firewall and never touch it again, but it’d obviously be easier if you could just modify networks using the LXD CLI however you want and have your firewall adjust automatically.
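
As a rough sketch of the "firewall adjusts automatically" idea, the masquerade rules from the original post can be derived from a bridge address with Python’s standard ipaddress module. The `masquerade_rule` helper is hypothetical, and the host addresses in the example (`.1` and `::1`) are assumptions chosen so that the result falls in the subnets shown in the ruleset above.

```python
import ipaddress


def masquerade_rule(bridge_address: str) -> str:
    """Build an nftables masquerade rule for a bridge address in CIDR form."""
    # ip_interface accepts a host address with prefix length and exposes
    # the enclosing network, e.g. 10.149.19.1/24 -> 10.149.19.0/24.
    net = ipaddress.ip_interface(bridge_address).network
    family = "ip" if net.version == 4 else "ip6"
    return f"{family} saddr {net} {family} daddr != {net} masquerade"


# Assumed bridge addresses inside the subnets from the original post.
for address in ("10.149.19.1/24", "fd42:8ec3:2d17:407a::1/64"):
    print(masquerade_rule(address))
```

A firewall generator could call something like this whenever the bridge addresses change, instead of hardcoding the subnets.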

Hopefully in the future we will see some consensus around how applications should coexist when using nftables, or perhaps its behavior will be changed to allow an allow rule to apply irrespective of a drop rule in another table.

Is that an issue on LXD’s side or on the nftables side? Sounds more like an nftables limitation to me.

It’s an nftables behaviour. It’s not clear yet what the correct way of approaching multiple applications managing the firewall is.

That’s what Docker does (see LXD and Docker Firewall Redux - How to deal with FORWARD policy set to drop) and it causes no end of confusion and problem reports on these forums. We’re certainly not keen to add system-wide default drop rules for any sort of traffic, due to the potential for unexpectedly blocking other applications’ traffic.

Do you mean the equivalent of lxc network ls or lxc ls?

Yes, but there’d have to be an event for network configuration changes so you can update it accordingly.
While writing this answer I started looking into how you’d do that, and unfortunately the REST API documentation seems to be pretty bad when it comes to events.
There’s no mention of which event types exist, so you have to check the code. And even then it’s not clear what they mean. My current assumption after reading the code for 10 minutes is that a network change would trigger an operation event with some otherwise unspecified payload, probably in a format similar to what you’d send to start that operation.

You can use lxc monitor to subscribe to events (and get an idea of the type of API requests being made).

E.g.

lxc monitor --loglevel=info --pretty --type=lifecycle
INFO   [2022-07-18T10:11:27+01:00] Action: network-updated, Source: /1.0/networks/lxdbr0, Requestor: unix/user (@) 

This shows when the lxdbr0 network was updated, at which point you can pull its latest config/info.
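
A script reacting to these events could look roughly like the sketch below. It assumes the events arrive as one JSON object per line (for example from lxc monitor with a JSON output format, which is an assumption here) and that each lifecycle event carries `metadata.action` and `metadata.source` fields mirroring the Action and Source values in the output above. The `updated_networks` helper is made up for illustration.

```python
import json
from typing import Iterable, Iterator


def updated_networks(lines: Iterable[str]) -> Iterator[str]:
    """Yield the name of each network reported in a network-updated event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        event = json.loads(line)
        if event.get("type") != "lifecycle":
            continue
        meta = event.get("metadata") or {}
        # Assumed shape: source is an API path such as /1.0/networks/lxdbr0,
        # possibly followed by a query string.
        source = meta.get("source", "").split("?", 1)[0]
        if meta.get("action") == "network-updated" and source.startswith("/1.0/networks/"):
            yield source.rsplit("/", 1)[-1]


sample = [
    '{"type": "lifecycle", "metadata": {"action": "network-updated", "source": "/1.0/networks/lxdbr0"}}',
    '{"type": "lifecycle", "metadata": {"action": "instance-started", "source": "/1.0/instances/c1"}}',
]
print(list(updated_networks(sample)))
```

In a real setup the lines would come from the event stream rather than a list, and each yielded name would trigger a fetch of that network’s current config followed by a firewall reload.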

The events documentation is here Events - LXD documentation with a list of the event types.

And recently they got their own constants in the api package too api package - github.com/lxc/lxd/shared/api - Go Packages

Oh nice, so the documentation is there and we just have to make it easier to find.
Through Google I came to this page
Neither the linked Go nor the Python documentation lists any event types. The Python implementation at least has some constants in its code.
But the Python doc also links to GitHub, which doesn’t have any description of the event types either.

Also the doc is 404ing right now :grin:

Either way, I think with some additional cross-links the documentation will be way easier to find (I can make a PR for that).
Also, while we’ve now come to a conclusion on how such a solution could easily be implemented using the REST API, it also looks like most people (including me) might never need it because they probably rarely change the bridge config.

Thanks, I’ve asked @ru-fu to take a look at fixing/improving the cross links.
We’re in the process of redoing our documentation, so some of the structure has been changing.
