Reasoning behind in/out/fwd netfilter rules

Oh nice so the documentation is there so we just have to make it easier to find.
Through Google I came to this page
Neither the linked Go nor the Python documentation lists any event types. The Python implementation at least has some constants in its code.
But also, the python doc links to GitHub which also doesn’t have any description of the event types.

Also the doc is 404ing right now :grin:

Either way, I think with some additional cross-links the documentation will be way easier to find (I can make a PR for that).
Also, while we’ve now come to a conclusion on how such a solution could be easily implemented using the REST API, it also looks like most people (including me) might never need it because they probably rarely change the bridge config.

Thanks, I’ve asked @ru-fu to take a look at fixing/improving the cross links.
We’re in the process of redoing our documentation, so some of the structure has been changing.


I was just about to create a new topic with the same questions as @m1cha, but luckily I found this thread, so I don’t need to explain the whole issue with LXD’s nft tables.

But it bugs me slightly to see @tomp’s statement.

Hopefully in the future we will see some consensus around how applications should coexist when using nftables, or perhaps its behavior will be changed to allow an allow rule to apply irrespective of a drop rule in another table.

Although it’s an understandable limitation of nftables, this comment seems to indicate there will be no solution unless someone else solves the problem. This will leave LXD in an awkward situation as nftables usage increases.

Expected behavior

It should be possible to obtain a hardened instance firewall without crippling LXD. It should be possible to forward only packets for LXD-managed networks while denying everything else. It should also be possible to harden the input rules so that only a few things are accepted, with a few extra LXD-generated rules added on top.

How it works currently

To give instances internet access, you need to allow all forwarded packets unless you configure firewall rules by hand. Similar issues apply to input and output rules.

The issue

It seems the main issue is that lxd is using a base chain instead of a regular chain. Indeed the base chain is useless if there is another base chain using the same hook which rejects packets, as was the case in this thread, for example.

A secondary base chain would only make sense for adding drop verdict statements. As is, a secondary base chain with only accept rules is pointless.
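As a minimal sketch of that point (the table names host_fw and app_allow are invented for illustration and are not what LXD generates): both chains below attach to the input hook, so a packet has to pass both. An accept only ends evaluation of the chain it occurs in, and the packet then moves on to the next base chain on the same hook, whereas a drop is final.

```nft
# Hypothetical host firewall: default drop on input.
table inet host_fw {
	chain input {
		type filter hook input priority filter; policy drop;
		ct state established,related accept
		iifname "lo" accept
		tcp dport 22 accept
	}
}

# Hypothetical second table that only contains accept rules.
table inet app_allow {
	chain input {
		# Lower priority value, so this chain is evaluated first.
		type filter hook input priority filter - 10; policy accept;
		# Accepted here, but the packet still traverses host_fw's input
		# chain afterwards and is dropped by the policy there.
		iifname "lxdbr0" udp dport 53 accept
	}
}
```

With this ruleset, a DNS query arriving on lxdbr0 is accepted in app_allow and then dropped in host_fw, because no rule there matches it.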

A possible solution

Make it possible to change the behaviour of the LXD-generated firewall rules so that they are contained in a regular chain and reached via a jump from an existing base chain.

Just as there are network configuration options which control a few aspects of how these rules are added, how about adding a bit more so the user can configure them correctly?

Keep the current behaviour as the default so there is no change for anyone who doesn’t run into these issues. As @m1cha was mentioning, add an ipv4.nft.forward.policy configuration option so users can change the default policy of these rules.

It would also be nice to be able to add the chains as regular and not base chains and be able to add something like a vmap to jump from another base chain into the correct lxd regular chain.

Something like ip4.nft.input.type=jump, ip4.nft.input.table=filter, ip4.nft.input.chain=INPUT, ip4.nft.input.vmap_line=5 would produce a regular chain with the input rules and add a jump statement to the INPUT chain in the filter table on line 5. Here I’m guessing ip4.nft.input.type could be either base, jump or goto. Both jump and goto would create a custom regular input chain, and the vmap would either jump or goto the correct chain at the specified line. The default value would be base.

This is basically how the LXD xtables driver works. It injects rules into the main base chains (or uses its own chains with jump rules from the main chains).

This comes with its own set of problems, as now multiple applications will be managing the same ruleset and potentially affecting each other’s rules due to ordering or default policy (e.g. LXD and Docker Firewall Redux - How to deal with FORWARD policy set to drop). There are numerous examples of problems like this in the forums.

It would be good if nftables provided a way to state that an accept in a base chain is final, so that it couldn’t then be dropped by other base chains (the way a drop is already final for nftables chains). Then we would just need to control the priority LXD uses for the netfilter hooks in its base chains (which could be a setting) to ensure it’s ordered how the user wants.

Otherwise one of the main benefits of nftables (the use of separate table namespaces) which allows for isolated rules for each application is lost and we’re back to trying to order the rules correctly by controlling the start up order of each application (which we’ve seen can then break if you reload them in a different order later).

The rules LXD adds are only to allow instances access to the managed bridge (lxdbr0) services (such as DNS, DHCP, ping) and for SNAT to the external interfaces. It doesn’t add drop/reject rules (with the exception of the ACL feature).

So with the nftables driver, these default accept rules are really only effective to provide instances with access to the managed services when the LXD ACL is enabled (as that adds a default drop rule for lxdbr0 traffic).

My suggestion right now would be to ensure that the manual rules you add affect only traffic on non-LXD managed interfaces (as LXD will only add rules for its own interfaces), without adding a default drop/reject for all interfaces (i.e. add a default drop/reject rule for all interfaces except lxdbr0).
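A minimal sketch of what that could look like, assuming a single managed bridge named lxdbr0 and a host table called host_fw (both the table name and the SSH rule are just placeholders for whatever the host already allows):

```nft
# Hypothetical host firewall that leaves lxdbr0 traffic to LXD's own table.
table inet host_fw {
	chain input {
		# Policy stays accept; the catch-all drop is an explicit last rule
		# that skips the LXD bridge.
		type filter hook input priority filter; policy accept;
		ct state established,related accept
		iifname "lo" accept
		tcp dport 22 accept
		iifname != "lxdbr0" drop
	}

	chain forward {
		type filter hook forward priority filter; policy accept;
		ct state established,related accept
		# Drop forwarded packets that neither enter nor leave via lxdbr0.
		iifname != "lxdbr0" oifname != "lxdbr0" drop
	}
}
```

The trade-off is that this chain no longer restricts anything arriving on lxdbr0, so limiting what instances can reach on the host would be left to other rules, such as the LXD ACL feature.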

Or you can turn off the LXD firewall rules (ipv{n}.firewall=false) entirely and manage them centrally via whichever firewall configuration software you are using (this is what I do, as I prefer to have the firewall policy for a system managed centrally).

It’s a tricky one for sure. Whichever way we do it, LXD rules will potentially (likely, in my experience) be affected by rules added by other applications/systems. I’m not against using non-base chains and then adding jump rules into the main base chains. Although there isn’t, as far as I know, a standard set of base chains, except those added by the nftables iptables shim commands.

Indeed it would not be bad, but I’m not sure this problem should be ignored and considered as just a case of wishful thinking. I think nftables has this behavior not as a bug, but by design. Since it seems LXD is developed to ease the usage of containers, it should adapt to how nftables actually works to help end users run their containers/vms. And I don’t think the solution it currently provides is really helpful.

To explain what I mean, I want to consider a situation which I believe the LXD team wants to be able to solve. I also do not wish to consider the case in which the netfilter team will change their mind on how their policy

LXD use case

Set up a hardened server which runs LXD containers/vms which can access the internet. By a hardened server, I mean one with a set of firewall accept rules and a drop policy for input and forward hooks, at least. I want to consider the control obtained by ipv{n}.firewall=false as was mentioned.

Solution 1: ipv{n}.firewall=true

In this case, if another chain applies a drop policy on the input and forward hooks, my containers will basically have no connectivity, so it doesn’t make sense to use the LXD rules in the first place.

The other solution would be to change the drop policy in my other base chain, which ranges from undesirable to unacceptable, depending on how important your data on that server is.

I guess this solution could be discarded, as both options are unacceptable. This leaves our only real solution to be the following one.

Solution 2: ipv{n}.firewall=false

Ideally, I would copy the current tables and use those. But what if I want to add another managed LXD network? I would have to manually add rules for every other network and research how to do it properly. In this case, in what way is LXD making my life easier? Although I can finally get control over the firewall rules, I thought the purpose of LXD was to be better than using LXC and having to configure everything yourself.

Conclusion

Solution 1 is unacceptable while solution 2 is having to do everything yourself (which is contrary to the automation LXD should provide).

I can agree adding these rules automatically has its downsides as well, but in my initial post, I proposed a configurable solution. Give the user the control to choose to add a regular chain. It would not be default, but the user can choose to change this behaviour.

The only concern which would remain is what you mentioned that “there is no standard set of base chains”. Similar to what I mentioned above, it would be nice if the user could then choose this behaviour. It would not be automatic. There would be 3 configuration options for the user to choose for each base hook.

  1. Should the firewall rules be placed in a base chain or in a regular chain?

  2. If the user chooses a regular chain, then they must also specify which base chain should be used for the jump. This should not be auto-detected, as that comes with natural problems.

  3. At which line should the jump rule be added? Maybe make this default to first or last rule so the user only needs to change this if necessary.

I’ve seen how LXD may even add one base chain for every managed network. As I mentioned before, all that would be needed would be a vmap jump line to choose the appropriate chain, and there would even be fewer interface checks per packet.

Current input chains for two networks

table inet lxd {
	chain in.lxdbr0 {
		type filter hook input priority filter; policy accept;
		iifname "lxdbr0" tcp dport 53 accept
		iifname "lxdbr0" udp dport 53 accept
		iifname "lxdbr0" icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
		iifname "lxdbr0" udp dport 67 accept
		iifname "lxdbr0" icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
		iifname "lxdbr0" udp dport 547 accept
	}

	chain in.lxdcustombr0 {
		type filter hook input priority filter; policy accept;
		iifname "lxdcustombr0" tcp dport 53 accept
		iifname "lxdcustombr0" udp dport 53 accept
		iifname "lxdcustombr0" icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
		iifname "lxdcustombr0" udp dport 67 accept
		iifname "lxdcustombr0" icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
		iifname "lxdcustombr0" udp dport 547 accept
	}
}

How it would look with this alternate configuration

Set the following configuration variables:

  • ip4.nft.input.type=jump
  • ip4.nft.input.table=filter
  • ip4.nft.input.chain=INPUT
  • ip4.nft.input.insert_position=first

Which would produce:

table ip filter {
	chain INPUT {
		type filter hook input priority filter; policy drop;
		iifname vmap { "lxdbr0" : jump inet lxd in.lxdbr0, "lxdcustombr0" : jump inet lxd in.lxdcustombr0 }
	}
}

table inet lxd {
	chain in.lxdbr0 {
		tcp dport 53 accept
		udp dport 53 accept
		icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
		udp dport 67 accept
		icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
		udp dport 547 accept
	}
	chain in.lxdcustombr0 {
		tcp dport 53 accept
		udp dport 53 accept
		icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
		udp dport 67 accept
		icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
		udp dport 547 accept
	}
}

I can’t guarantee those rules would compile perfectly since I edited them by hand, but they should be understandable. In this case we would get the correct behavior of accepting that traffic, and there would even be fewer checks since the check for iifname was already made in the vmap entry. If none of those rules matched, evaluation would just return to the next rule in the base chain and continue from there.
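For reference, a variant that should load as-is is sketched below. It differs from the hand-edited version in one respect: nft resolves jump and goto targets only within the table that contains the rule, so the regular chains are placed in the same table as the base chain. The table name filter and the chain names lxd_in_lxdbr0 and lxd_in_lxdcustombr0 are placeholders, and the rule list is shortened to the DNS and DHCP ports.

```nft
table inet filter {
	# Regular chains: no hook, so they are only evaluated when jumped to.
	chain lxd_in_lxdbr0 {
		tcp dport 53 accept
		udp dport { 53, 67, 547 } accept
	}

	chain lxd_in_lxdcustombr0 {
		tcp dport 53 accept
		udp dport { 53, 67, 547 } accept
	}

	chain input {
		type filter hook input priority filter; policy drop;
		ct state established,related accept
		iifname "lo" accept
		# One lookup picks the per-bridge chain; unmatched packets return
		# here and eventually hit the drop policy.
		iifname vmap { "lxdbr0" : jump lxd_in_lxdbr0, "lxdcustombr0" : jump lxd_in_lxdcustombr0 }
	}
}
```

Because the accept now happens inside the only base chain on the hook that has a drop policy, it is not overridden afterwards, provided no other table hooks input with its own drop.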

I think regular chains can have policies too. It would also be possible to harden the LXD firewall rules by adding a drop policy without affecting traffic on networks not managed by LXD, since that traffic would never be sent to the regular chains by the vmap entry.

My main point was that doing it like you propose is how we do it for xtables, and that also has a different set of problems, but the solution is the same: modify the system firewall rules manually.

Arguably making the xtables and nftables drivers the same could be an approach we take, but we would still encounter issues with ordering conflicts with other firewalls and applications (e.g. Docker).

Your point about having a setting which places the rules in a certain place in the ruleset doesn’t address the ordering issues when applications that modify the firewall are restarted in an order different to the boot time order and then apply their rules in a new order.

I agree having different behavior is confusing. But as someone who regularly has to support problems with the xtables approach, it doesn’t feel like that is much better, to be honest.

I wonder if we shouldn’t just remove the service accept rules (leaving the SNAT rule) for nftables. Over time, as people switch to nftables, LXD would then no longer be adding those rules at all. We would then document which services and rules LXD requires.

They can have a default rule at the bottom. But we already have support for this via LXD’s ACL feature, which adds a configurable default action.
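A minimal sketch of such a default rule, using made-up table and chain names: a regular chain takes no policy statement, so a bare verdict as the last rule plays that role for whatever traffic was jumped into it.

```nft
table inet example {
	chain lxd_in_lxdbr0 {
		tcp dport 53 accept
		udp dport { 53, 67, 547 } accept
		# Catch-all for anything from lxdbr0 that was not accepted above.
		drop
	}

	chain input {
		type filter hook input priority filter; policy accept;
		# Only lxdbr0 traffic reaches the regular chain, so the final drop
		# does not affect other interfaces.
		iifname "lxdbr0" jump lxd_in_lxdbr0
	}
}
```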


I edited my previous post and added information on how the configuration options could look, including an option to choose an insert position. The user can control that to work around issues between different applications, but it would ultimately be configurable.

The point is that it would probably still be better than both solutions 1 and 2 above.

If the user could configure LXD to use regular chains, those accept rules would actually be used and would matter.

An accept would be, as you mentioned previously, final for whatever chain it was inserted in.

I saw your updated example, thanks, it makes sense.

Yep, we have something similar for xtables, ipv4.nat.order, but it only affects the NAT rule.

Adding ordering generally would only fix issues between applications if they start in a predictable order and are not reloaded after boot time.

Here is @stgraber’s post about this

This is why we are hesitant to change the approach at the moment until we can see how other firewall systems approach coexistence when using native nftables (rather than just calling the iptables shim).

I had proposed something similar to your suggestion previously

I can definitely see there might still be issues between different applications, but wouldn’t that problem still be smaller than not being able to have a hardened server or having to add all firewall rules by hand?

I have no clue how to solve the ordering issue between different applications, but I still think my proposed solution solves some very problematic issues and doesn’t assume the behavior of other base chains. The correct base chain to jump from would be given by the user, and the user can set that up and administer it as they wish.

With my proposed solution, other than the ordering between rules from different applications, is there any other issue?

I chatted with @stgraber about this and we don’t want to make changes until we understand which direction the wider community is going in with regard to coexisting firewall rulesets. I am thinking of posting to the nftables mailing list to ask about the thinking behind the nftables table namespaces, given that a drop/reject in any of them prevents the accept rules in the others from taking effect, and whether they have any recommendations.


If you do, please post a link to the archive of that thread so we can follow along.


Thanks for the feedback!