[SOLVED] Setting up router container to manage all networking

I think the summary is great. I didn’t know about Claude; I’ll have to try that too, as I’ve been using others. I can’t say whether it’s an acceptable practice. As a software engineer, or an engineer in general, I prefer to have a single source of truth - how do you organize or structure your data and information accordingly? If it’s just another post, it will get lost in the mix and become stale.

I’m running this on Gentoo. I got stuck because I’m trying not to disrupt my live network :(. That said, I would like to resolve this because it should be close to working.

Yeah, those are my motivations; maintenance is the biggest, though. Prior to this setup, I had a dedicated router, and for upgrades, having to physically walk over, turn on the monitor … was slow and required more equipment. Now I do all of my upgrades from a single location with a huge monitor rather than a tiny one. I’m all for exercise, but that “process” was not effective or efficient.

I think it is conceptually possible to test this behind my existing network without impacting it. I have two identical machines: the one I’m replacing is presently running FreeBSD with this setup using jails, and the replacement is running Gentoo Linux. I think all I should have to do is use two different subnets, one for my live network and one for my test network. Then, when the test one works, I can move it to primary and swap the subnet.

With that, yeah, I’m up for testing, but as I indicated earlier, I generally want to do it when nobody else is going to complain if I break something.

You can use containers to simulate what the AI said.
Create an incus project for isolation, and switch to that project for further testing.
All you need is one bridge and two containers: one to hand out an IP (DHCP server) and one to acquire an IP (DHCP client).
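For reference, the project setup might look something like this (the project name routertest is just an example, not from the post above):

```shell
# Create an isolated project so test instances and profiles
# don't mix with your default ones (project name is arbitrary)
incus project create routertest
incus project switch routertest

# When finished testing, switch back:
# incus project switch default
```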


The DHCP server can use incusbr0 as its WAN (simulating the ISP router). The AI said to create the bridge inside the container, but I don’t know how to expose a virtual port to other containers, so I suggest creating another bridge on the host and pointing the gateway at the DHCP server. Don’t use veth; use the bridge approach described here instead. Ensure dnsmasq is properly configured and running.
The bridge doesn’t need to bind any physical NIC. Let’s name it br1.
Create a profile called bridge for easy management.

config: {}
description: Let instances use br1
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br1
    type: nic
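One way to apply the YAML above, assuming it is saved as bridge.yaml (the filename is just an example):

```shell
# Create an empty profile, then load the YAML shown above into it
incus profile create bridge
incus profile edit bridge < bridge.yaml

# Verify the eth0 device was added
incus profile show bridge
```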

Create a test container called ct and make sure it’s working fine.
Now let’s start the test. Add the profile to ct:
incus profile add ct bridge
Then restart ct’s networking.
If ct gets an IP from the DHCP server, you’ve succeeded.
If not, follow the AI’s debugging suggestions.
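If ct doesn’t get a lease, watching DHCP traffic on the host bridge is one quick way to narrow things down (bridge name br1 as above):

```shell
# Watch DHCP requests/replies crossing the host bridge.
# If you see DISCOVER but no OFFER, the server side is the problem;
# if you see nothing at all, the client isn't reaching the bridge.
tcpdump -ni br1 port 67 or port 68
```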

A few thoughts.

  1. Personally I’m very wary of using AI to answer technical questions.
  2. The idea of using Linux as a router has been around for a loooong time, plenty of worked examples out there.
  3. A bare bones Linux “router” routes traffic between two different subnets using ip forwarding and masquerading.
  4. A real internet facing “router” typically has one interface for the WAN side and at least one other for the LAN side which is connected to a switch for other LAN hosts to use.
  5. You’d expect a real internet facing “router” to act as a “firewall” (maybe iptables based) and provide both DHCP and DNS services to a LAN.
  6. The virtual equivalent of a real “router” can be an instance configured with two nic devices combined with an incus host “bridge” which acts as a switch that both the LAN side of the “router” instance and other LAN side instances can connect to. In its simplest form, there is NO internal bridge in the “router” instance.
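The bare-bones routing in point 3 boils down to two pieces: enabling IP forwarding and masquerading outbound traffic. A minimal iptables sketch (interface name and subnet are placeholders, not taken from the post):

```shell
# Enable IPv4 forwarding for this boot (persist via sysctl.conf if desired)
sysctl -w net.ipv4.ip_forward=1

# Rewrite the source address of LAN traffic leaving via the WAN interface
# (eth0 = WAN side, 192.168.100.0/24 = LAN subnet; adjust to your setup)
iptables -t nat -A POSTROUTING -s 192.168.100.0/24 -o eth0 -j MASQUERADE
```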

Parts of the AI-derived scheme shown above appear to be incorrect. You only need to pass one physical nic to the “router” instance - the WAN interface (although this could be via a bridge created on the incus host).

For example, part of a “router” container config:

devices:
  eth0:
    nictype: physical
    parent: enp11s0
    type: nic
  eth1:
    nictype: bridged
    parent: br2
    type: nic

The public network is on eth0 and the private network on eth1.

In my example, the incus host interface enp11s0 is actually connected to an upstream router running DHCP on 192.168.20.x, and the incus host bridge “br2” is non-persistent and was created using ip link commands. The bridge “br2” has no ip and no real members.
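The non-persistent bridge mentioned above can be created with something like:

```shell
# Create an empty bridge on the incus host: no IP address, no physical
# members. It disappears on reboot unless recreated.
ip link add br2 type bridge
ip link set br2 up
```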

On the incus host:

root@debincus-vm:~# ip link show br2
13: br2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e1:da:81:1d:2b brd ff:ff:ff:ff:ff:ff
root@debincus-vm:~#

The network within my example “router” container is (note: there is no bridge here):

root@debincus-vm:~# incus shell d12-c5
root@d12-c5:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:1e:09:41 brd ff:ff:ff:ff:ff:ff
    inet 192.168.20.49/24 metric 1024 brd 192.168.20.255 scope global dynamic eth0
       valid_lft 5854sec preferred_lft 5854sec
    inet6 fe80::5054:ff:fe1e:941/64 scope link
       valid_lft forever preferred_lft forever
18: eth1@if19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 10:66:6a:a5:47:9a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.100.1/24 brd 192.168.100.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::1266:6aff:fea5:479a/64 scope link
       valid_lft forever preferred_lft forever
root@d12-c5:~#
root@d12-c5:~# cat /var/lib/misc/dnsmasq.leases
1747955743 10:66:6a:7f:8e:21 192.168.100.50 d12-1 01:10:66:6a:7f:8e:21
1747969771 00:16:3e:8b:cf:af 192.168.100.59 testvm 01:00:16:3e:8b:cf:af
root@d12-c5:~# 

Forwarding, masquerading and basic dnsmasq within the “router” container are as per the AI scheme above.
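For completeness, a minimal dnsmasq configuration inside the “router” container might look like this. The values match the example addresses shown above, but treat it as a sketch rather than the author’s exact config:

```
# /etc/dnsmasq.conf (inside the router container)
interface=eth1              # serve DHCP/DNS on the LAN side only
dhcp-range=192.168.100.50,192.168.100.150,12h
dhcp-option=option:router,192.168.100.1
dhcp-option=option:dns-server,192.168.100.1
```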

Part of a LAN side container config:

devices:
  eth0:
    nictype: bridged
    parent: br2
    type: nic

Network within a LAN side container:

root@d12-1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
25: eth0@if26: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 10:66:6a:7f:8e:21 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.100.50/24 metric 1024 brd 192.168.100.255 scope global dynamic eth0
       valid_lft 41885sec preferred_lft 41885sec
    inet6 fe80::1266:6aff:fe7f:8e21/64 scope link
       valid_lft forever preferred_lft forever
root@d12-1:~#
root@d12-1:~# resolvectl status --no-pager
Global
       Protocols: +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 25 (eth0)
    Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.20.1
       DNS Servers: 192.168.20.1
root@d12-1:~#

root@d12-1:~# traceroute 192.168.0.55
traceroute to 192.168.0.55 (192.168.0.55), 30 hops max, 60 byte packets
 1  d12-c5 (192.168.100.1)  0.085 ms  0.021 ms  0.029 ms
 2  192.168.20.1 (192.168.20.1)  1.217 ms  1.103 ms  1.008 ms
 3  192.168.0.55 (192.168.0.55)  1.034 ms  1.011 ms  1.275 ms
root@d12-1:~#

In my case the incus host address is 192.168.0.55.

So, now every instance whose device eth0 is bridged to incus host bridge “br2” will be on the private network while the “router” container is running.


@stgraber, may we use the incus lab to vet, test and document this scenario?

Thanks everyone for taking the time and effort to discuss this topic!

I wanted to clean up my posts. It was confusing me.

I provisioned a new container with these commands. It is also worth mentioning that the new container is not using a template and thus not inheriting the default network, which I think was causing issues (for me at least).

incus launch images:gentoo/openrc test
incus config device add test wan nic nictype=physical parent=enp2s0 name=wan
incus config device add test lan nic nictype=physical parent=eno1 name=lan
incus config device add test p2p nic nictype=p2p host_name=veth name=veth

# reminder that this is on the host
ifconfig veth 10.130.0.100/24 up

incus shell test

# in test instance
emerge --sync
emerge net-misc/bridge-utils -q
emerge net-analyzer/tcpdump -q

dhcpcd wan

brctl addbr bridge
brctl addif bridge lan
brctl addif bridge veth

ifconfig lan 10.130.0.1/24 up
ifconfig bridge 10.130.0.101/24 up

# my actual install from earlier was using nftables, but just to test a barebones install
iptables -t nat -A POSTROUTING -s 10.130.0.0/24 -o wan -j MASQUERADE

ping -c 1 10.130.0.100 # returns successfully

exit

# from the host
ping -c 1 10.130.0.1 # returns successfully
ping -c 1 10.130.0.101 # returns successfully

The networking config looks a bit different from what I had earlier, and that is why it was so confusing. It was pulling in configuration from a template that I somehow could not remove.

I’m not certain anymore that this is the solution, as this is the same incarnation I had tried earlier - it worked, then quit working. The only difference appears to be that it is not inheriting from a template. I will reprovision my instance entirely, with all the tools, to see if that sorts it out.