Hi, I’ve been having very intermittent behaviour in my environment where connections drop abruptly. While troubleshooting I’ve realised I’m not confident in my network setup: I understand the netplan configuration, so I can configure the host, and I can configure the Incus instances, but I’m not sure about the interaction between the two, so I wanted some advice. Apologies upfront for what are basic questions.
The idea is that I may need to attach different instances to different VLANs.
If I now attach an instance, either an LXC container or a VM, to a bridge, will the instance need to be configured to tag its traffic for that VLAN? Ideally I would just give the instances simple netplans where they use their interface as-is (type nic, nictype bridged), with the VLAN tag being applied by the underlying host according to the bridge. Looking at the configuration now, though, I don’t see any reason that would actually happen, so I assume I’m just dropping untagged frames from the instance onto the bridge and they’re being pushed out onto the wire in the default VLAN.
What is the correct practice here? I don’t think I should be creating all the VLAN interfaces on the host first (enp1s0.10 etc.) and creating bridges on top of those, since that doesn’t match any of the examples in the netplan docs, but at this point I’m doubting everything. Can anyone assist?
Or, more generically - what is best practice for configuring Incus on top of a host with multiple VLANs?
I had a similar problem configuring VLANs in incus recently (essentially, I wanted to use the same NIC for both Ceph and incus traffic, which should be isolated from each other). As I figured out, there are two options:
1. You manage the bridges yourself in your netplan config, as you did (see the netplan sketch just below this list).
2. You let the bridges be managed by incus (by creating a “network”).
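For Option 1 the host side would look roughly like this; a minimal sketch, assuming the physical NIC is enp1s0, VLAN ID 10 and bridge name br1 as in the examples below (adjust names and addressing to your setup):

network:
  version: 2
  ethernets:
    enp1s0:
      dhcp4: false          # uplink carries only tagged traffic
  vlans:
    vlan10:                 # tagged sub-interface, VLAN 10 on enp1s0
      id: 10
      link: enp1s0
  bridges:
    br1:                    # instances attach here via nictype: bridged
      interfaces: [vlan10]

Because br1 sits on top of vlan10, anything an instance drops onto br1 untagged leaves enp1s0 tagged with VLAN 10, which is the “host applies the tag” behaviour you were after.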
In both cases you should then simply connect bridges to the network devices of containers, e.g., using profiles:
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br1
    type: nic
Then you configure the network device within the container as usual, e.g., using netplan.
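Inside the container the netplan can then stay trivial; a sketch, assuming DHCP is available on that VLAN (use static addressing otherwise):

network:
  version: 2
  ethernets:
    eth0:
      dhcp4: true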
If you go with Option 2, you should not define the bridges in the netplan config of the host, only the VLANs. Then, when creating the network, set the key bridge.external_interfaces to the corresponding VLAN interface, e.g.:
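Something along these lines (vlan10 being the VLAN interface defined in the host netplan; add ipv4.address/ipv6.address options as needed):

$ incus network create br1 bridge.external_interfaces=vlan10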
Then the bridge br1 will be created and managed by incus.
Note that in this case the VLANs should be unconfigured, that is, they should not have any IP addresses assigned (including IPv6 addresses); otherwise incus will refuse to create the bridge. In my case, I had to add accept-ra: false to the netplan config for the VLANs to prevent the router from assigning IPv6 addresses to them.
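Concretely, the VLAN stanza of the host netplan ends up looking roughly like this (a sketch, assuming the parent NIC is enp1s0):

  vlans:
    vlan10:
      id: 10
      link: enp1s0
      dhcp4: false       # no addresses at all on the VLAN itself
      accept-ra: false   # stop the router from handing it an IPv6 address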
Unfortunately, there appears to be a bug in LXD whereby, after restarting LXD, the managed interfaces are created in the wrong order, which results in the bridge br1 not being configured properly because the VLAN interface vlan10 is missing at that point. (I just verified that incus is affected by the same bug.)
Oh, it appears that “physical networks” do not work the way I originally thought.
Creating a physical network in incus does not actually create a network interface.
$ incus network list
+-----------------+----------+---------+---------------+------+-------------+---------+---------+
|      NAME       |   TYPE   | MANAGED |     IPV4      | IPV6 | DESCRIPTION | USED BY |  STATE  |
+-----------------+----------+---------+---------------+------+-------------+---------+---------+
...
+-----------------+----------+---------+---------------+------+-------------+---------+---------+
| vlan10          | physical | YES     |               |      |             | 0       | CREATED |
+-----------------+----------+---------+---------------+------+-------------+---------+---------+
$ ip a | grep vlan10
[shows nothing]
Consequently, attaching vlan10 as an external interface of the bridge (silently) fails.
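One way to see it (a sketch of the check rather than the literal output; br1 being the incus-managed bridge):

$ ip link show master br1
[vlan10 never shows up as a port of br1]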
So, interfaces managed by incus are probably not supposed to be used as external interfaces of bridges.
However, for some reason, if the physical network is named like <VLAN-link>.<VLAN-id>, e.g., enp1s0.10 for the example above, the interface on the host actually is created and can subsequently be used for the bridge (but there is still the problem with the order in which the interfaces come up after a restart).
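In other words, something like this (a sketch; double-check the exact keys against the incus network documentation, but parent and vlan are what I would expect here):

$ incus network create enp1s0.10 --type=physical parent=enp1s0 vlan=10
$ ip a | grep enp1s0.10
[the interface now exists on the host and can be referenced in bridge.external_interfaces]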
Hmmm. I’ve been doing your option 1 for a while, but I’m still not sure that the VLAN tagging is working, and I still have intermittent drops. That may be unrelated, but I do want to eliminate as many possibilities as I can.
In desperation I did configure source routing for the VLANs on the underlying hosts, but so far the situation has not improved.
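For what it’s worth, one way to check whether the tags actually reach the wire is to sniff the physical uplink with link-level headers enabled; a sketch, assuming the uplink is enp1s0 and the VLAN in question is 10:

$ sudo tcpdump -e -nn -i enp1s0 vlan 10
# -e prints the Ethernet header, so only 802.1Q frames tagged with VLAN 10 match;
# if instance traffic shows up here, the host-side tagging is working, and if it
# only appears in a capture without the "vlan 10" filter, it is leaving untagged.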