How can I use BGP with an Incus cluster?

I have just started learning about Incus and BGP. I have watched the LXD and BGP videos on the YouTube channel, but I now have a situation with 3 Incus clusters that all use the exact same IP address ranges. When instances access the network, BGP cannot tell which cluster they belong to unless I manually assign IP addresses, but I would like to use Incus' own network functions for automatic allocation.
In addition, there is a separate Incus cluster with multiple hosts in it. When using BGP, it is similarly impossible to tell which host an instance is running on without manually assigning an IP address.
Those are the two issues.
I hope to get some help, thank you very much.

When using a bridge network for the BGP session, if there are 3 Incus servers in one cluster,
the IPv4 address of the network is the same on each server by default.
You can see this with incus network list.
For example, 10.10.10.0/24: this subnet ends up as the advertised prefix on the BGP router.
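
For reference, a minimal sketch of the configuration that leads to this situation; the addresses and ASNs here are hypothetical, not taken from my actual setup:

incus config set core.bgp_asn 65536                                  # local ASN for the built-in BGP server
incus config set core.bgp_address :179                               # listen address for BGP sessions
incus config set core.bgp_routerid 192.168.1.21                      # per-server router ID (each member uses its own)
incus network set incusbr0 bgp.peers.router.address 192.168.1.1      # the FRR/router peer
incus network set incusbr0 bgp.peers.router.asn 65540

With this in place, every cluster member carrying incusbr0 advertises the bridge's ipv4.address subnet (the 10.10.10.0/24 above) to that peer.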

So what confuses me is how to make FRR (the BGP router) forward traffic to the correct Incus server. Of course, this issue does not exist on a single Incus node. Do I have to make the bridge network address explicitly different on each host?

In fact, I think it's good to have the same IP address for the bridge network on every host,
because it lets me keep an instance's IP address unchanged when migrating it.

Responding to this older topic as it came up on GitHub.

So basically, if you're running a cluster with local bridges on all servers and you want the instances to be reachable from the outside with traffic going to the correct server through BGP, you won't be able to do that for the instance's traditional IP acquired through DHCP.

Incus doesn’t really know what those IP addresses are, they’re handled separately by dnsmasq and Incus just queries them when prompted.
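
If it helps to see what dnsmasq is tracking on a given network, the leases can be listed (a general pointer, not specific to this setup):

incus network list-leases incusbr0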

For such a scenario what you should try (I haven't had a need for it myself) is to keep the bridges themselves on distinct NAT-ed RFC1918-type subnets, but then give an ipv4.routes.external entry on the NIC for the IP address that should be advertised to your router over BGP.

You can then put that address on the instance’s network interface and now moving the instance around will just have whatever server is running the instance do the advertising of that address.
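
A rough sketch of that approach, with a hypothetical instance name c1 and a hypothetical routed address 192.0.2.10 (use incus config device set instead of override if eth0 is defined directly on the instance rather than coming from a profile):

incus config device override c1 eth0 ipv4.routes.external=192.0.2.10/32

Then add 192.0.2.10/32 to c1's network interface from inside the instance; whichever cluster member runs c1 advertises that /32 to the BGP peers.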

It “may” actually work to set both ipv4.address and ipv4.routes.external to the same address on the instance, making sure that the instance address itself is the same regardless of the host while also getting you a /32 advertisement for that address.

Thanks @stgraber

If I understand correctly, that architecture is impossible with clustered Incus, right? It would only work if the Incus servers don't know anything about each other?

Why wouldn’t that work in a cluster?

You seem to already have each server advertising the same /24, so using ipv4.routes.external on the instance NIC to advertise a more specific /32 should be fine.

Because the cluster subnet is the same for every cluster member, isn't it?

Yes.

Use incusbr0 with say 10.123.123.1/24 as its ipv4.address.
All servers will advertise the /24 over BGP.

Then pick an instance and set its ipv4.address to say 10.123.123.10 and its ipv4.routes.external also to 10.123.123.10. Now the server that runs that instance will also announce a more specific 10.123.123.10/32 route.
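
As a command, that could look roughly like this (NAME being the instance, with eth0 coming from a profile as in the examples further down):

incus config device override NAME eth0 ipv4.address=10.123.123.10 ipv4.routes.external=10.123.123.10/32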


All cluster members start announcing 10.123.123.10/32, so it's the same problem as announcing the /24.

Ah, except that in this situation, that would be considered a bug and so something we can fix :slight_smile:

Just to triple check, can you show:

  • incus network show NAME
  • incus config show --expanded NAME
  • incus query /internal/debug/bgp

Oh yeah, sure.

Network settings:

config:
  bgp.peers.debug.address: 192.168.20.51
  bgp.peers.debug.asn: "64512"
  bgp.peers.mikrotik.address: 192.168.20.1
  bgp.peers.mikrotik.asn: "65540"
  ipv4.address: 10.35.28.1/24
  ipv4.nat: "false"
  ipv4.routes: 10.35.28.218/32
  ipv6.address: none
description: ""
name: incusbr0
type: bridge
used_by:
- /1.0/instances/demo
- /1.0/profiles/default
- /1.0/profiles/packer
managed: true
status: Created
locations:
- mini2
- mini5
- mini4
- lb
- mini1
- mini3
- worker1
project: default

Demo VM config:

architecture: x86_64
config:
  cluster.evacuate: auto
  image.architecture: amd64
  image.description: Ubuntu noble amd64 (cloud) (20251126_07:42)
  image.name: ubuntu-noble-amd64-cloud-20251126_07:42
  image.os: ubuntu
  image.release: noble
  image.serial: "20251126_07:42"
  image.variant: cloud
  limits.cpu: "2"
  limits.memory: 2GiB
  migration.stateful: "true"
  volatile.base_image: 1b73acb7de5d67e5635eb0f212617008619deca8e1796bc8bd794c505fe0f1a0
  volatile.cloud-init.instance-id: 62c68d32-e1f9-45f1-8988-b213ccfb51e3
  volatile.eth0.host_name: tap4a4427d4
  volatile.eth0.hwaddr: 10:66:6a:ca:06:b2
  volatile.last_state.power: RUNNING
  volatile.last_state.ready: "false"
  volatile.uuid: 25c88423-4857-4529-a11c-fe2de0316f4a
  volatile.uuid.generation: 25c88423-4857-4529-a11c-fe2de0316f4a
  volatile.vm.definition: pc-q35-10.1
  volatile.vm.rtc_adjustment: "-1"
  volatile.vm.rtc_offset: "1"
  volatile.vsock_id: "1502800757"
devices:
  eth0:
    network: incusbr0
    type: nic
  root:
    path: /
    pool: local
    size: 20GiB
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

All 6 servers in my homelab announce the same subnet, but the extra route is missing right now and I don't know why.

sudo incus query /internal/debug/bgp
{
	"peers": [
		{
			"address": "192.168.20.1",
			"asn": 65540,
			"count": 1,
			"holdtime": 0,
			"password": ""
		},
		{
			"address": "192.168.20.51",
			"asn": 64512,
			"count": 1,
			"holdtime": 0,
			"password": ""
		}
	],
	"prefixes": [
		{
			"nexthop": "0.0.0.0",
			"owner": "network_2",
			"prefix": "10.35.28.0/24"
		}
	],
	"server": {
		"address": "0.0.0.0:179",
		"asn": 65536,
		"router_id": "192.168.20.21",
		"running": true
	}
}

The main problem is in the logic of changing the settings of the cluster-wide bridge when we want to announce a single /32 for a single cluster member.

Every 2.0s: ./gobgp global rib                               worker1.lan: Thu Nov 27 18:20:39 2025

   Network              Next Hop             AS_PATH              Age        Attrs
*  10.35.28.0/24        192.168.20.21        65536                00:06:02   [{Origin: i}]
Every 2.0s: ./gobgp neighbor                                 worker1.lan: Thu Nov 27 18:20:51 2025

Peer              AS  Up/Down State       |#Received  Accepted
192.168.20.1   65112    never Active      |        0         0
192.168.20.21  65536 00:06:14 Establ      |        1         1
192.168.20.51  65536    never Active      |        0         0
192.168.20.122 62343    never Active      |        0         0

I mean

Every 2.0s: ./gobgp global rib                               worker1.lan: Thu Nov 27 18:24:33 2025

   Network              Next Hop             AS_PATH              Age        Attrs
*  10.35.28.0/24        192.168.20.22        65536                00:00:17   [{Origin: i}]
*  10.35.28.0/24        192.168.20.23        65536                00:00:17   [{Origin: i}]
*  10.35.28.0/24        192.168.20.21        65536                00:00:13   [{Origin: i}]
Every 2.0s: ./gobgp neighbor                                 worker1.lan: Thu Nov 27 18:24:39 2025

Peer              AS  Up/Down State       |#Received  Accepted
192.168.20.21  65536 00:00:19 Establ      |        1         1
192.168.20.22  65536 00:00:23 Establ      |        1         1
192.168.20.23  65536 00:00:23 Establ      |        1         1

What's the difference if we announce a single /32 from all peers? Something like this:

Every 2.0s: ./gobgp global rib                               worker1.lan: Thu Nov 27 18:24:33 2025

   Network              Next Hop             AS_PATH              Age        Attrs
*  10.35.28.11/32        192.168.20.22        65536                00:00:17   [{Origin: i}]
*  10.35.28.11/32        192.168.20.23        65536                00:00:17   [{Origin: i}]
*  10.35.28.11/32        192.168.20.21        65536                00:00:13   [{Origin: i}]

with the same problem as described above.

Sorry, English is not my first language ^___^"

Yeah, that’s wrong.

Try:

  • incus network unset NAME ipv4.routes
  • incus config device set NAME eth0 ipv4.routes.external=10.35.28.218/32
That gives an error:
Error: Device from profile(s) cannot be modified for individual instance. Override device or modify profile instead

Okay, wait a second, I'll recreate the VM without the profile :slight_smile:

That’s fine, you can just do:

incus config device override NAME eth0 ipv4.routes.external=10.35.28.218/32
incus config device override debug eth0 ipv4.routes.external=10.35.28.218/32
Device eth0 overridden for debug

Query output:

{
	"peers": [
		{
			"address": "192.168.20.1",
			"asn": 65540,
			"count": 1,
			"holdtime": 0,
			"password": ""
		},
		{
			"address": "192.168.20.51",
			"asn": 64512,
			"count": 1,
			"holdtime": 0,
			"password": ""
		}
	],
	"prefixes": [
		{
			"nexthop": "0.0.0.0",
			"owner": "network_2",
			"prefix": "10.35.28.0/24"
		}
	],
	"server": {
		"address": "0.0.0.0:179",
		"asn": 65536,
		"router_id": "192.168.20.21",
		"running": true
	}
}

incus config show --expanded debug

architecture: x86_64
config:
  cluster.evacuate: auto
  image.architecture: amd64
  image.description: Ubuntu noble amd64 (cloud) (20251126_07:42)
  image.name: ubuntu-noble-amd64-cloud-20251126_07:42
  image.os: ubuntu
  image.release: noble
  image.serial: "20251126_07:42"
  image.variant: cloud
  limits.cpu: "2"
  limits.memory: 2GiB
  migration.stateful: "true"
  volatile.base_image: 8d3e6ff9770f2d2da831982c06967600e963ca0eb58353b2bf1f6c3b8b736b24
  volatile.cloud-init.instance-id: 517f6db5-b2df-4f1b-a90b-5f5a092822c2
  volatile.eth0.host_name: tap1367dc15
  volatile.eth0.hwaddr: 10:66:6a:25:38:bb
  volatile.last_state.power: RUNNING
  volatile.last_state.ready: "false"
  volatile.uuid: eefc17b4-cd07-4feb-942b-79c23c664df2
  volatile.uuid.generation: eefc17b4-cd07-4feb-942b-79c23c664df2
  volatile.vm.definition: pc-q35-10.1
  volatile.vm.rtc_adjustment: "-1"
  volatile.vm.rtc_offset: "0"
  volatile.vsock_id: "1410789558"
devices:
  eth0:
    ipv4.routes.external: 10.35.28.218/32
    network: incusbr0
    type: nic
  root:
    path: /
    pool: local
    size: 20GiB
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

Technically, if we can do something about announcing a /32 per cluster member, Incus will be a perfect solution for small/medium sized clusters or a homelab.

Maybe after a VM/container is created, Incus could listen to the events queue and announce/remove the primary IP for BGP?
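
For what it's worth, those lifecycle events can already be watched from the outside today, something like (a sketch, not an existing Incus BGP feature):

incus monitor --type=lifecycle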

Okay, so that looks like a bug; this should normally cause whatever server the instance is running on to advertise the /32 so long as the instance is running.
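
One thing to double-check, since the /32 is only expected to be announced by the cluster member actually hosting the instance: run the debug query on that member specifically, e.g.:

incus list debug -c nL                  # confirm which member hosts the instance
sudo incus query /internal/debug/bgp    # run on that member and look for an instance-owned prefix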

Hmm, it seems to work fine for me here.

bash-5.2# incus list
+------+---------+-----------------------+------------------------------------------------+-----------+-----------+----------+
| NAME |  STATE  |         IPV4          |                      IPV6                      |   TYPE    | SNAPSHOTS | LOCATION |
+------+---------+-----------------------+------------------------------------------------+-----------+-----------+----------+
| c1   | RUNNING | 10.167.140.119 (eth0) | fd42:c9f0:6f69:913f:1266:6aff:fe7c:3c50 (eth0) | CONTAINER | 0         | incus02  |
+------+---------+-----------------------+------------------------------------------------+-----------+-----------+----------+
bash-5.2# incus config show c1
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Debian trixie amd64 (20251127_05:24)
  image.name: debian-trixie-amd64-default-20251127_05:24
  image.os: debian
  image.release: trixie
  image.serial: "20251127_05:24"
  image.variant: default
  volatile.base_image: e7c966a015d58b5d3dc880e76bdb63583dbd316d50290e6c22caafa5a731414f
  volatile.cloud-init.instance-id: 5c2a0f4e-3251-404e-b2e5-bcc29a212159
  volatile.eth0.host_name: veth8d2d381e
  volatile.eth0.hwaddr: 10:66:6a:7c:3c:50
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
  volatile.uuid: 3cc3cf4a-3736-473e-b95c-73a976a87310
  volatile.uuid.generation: 3cc3cf4a-3736-473e-b95c-73a976a87310
devices:
  eth0:
    ipv4.routes.external: 10.167.140.119/32
    network: incusbr0
    type: nic
ephemeral: false
profiles:
- default
stateful: false
description: ""
bash-5.2# incus query /internal/debug/bgp
{
	"peers": [],
	"prefixes": [
		{
			"nexthop": "0.0.0.0",
			"owner": "instance_6_eth0",
			"prefix": "10.167.140.119/32"
		}
	],
	"server": {
		"address": ":149",
		"asn": 65001,
		"router_id": "10.0.0.102",
		"running": true
	}
}