Placing containers in vlans with gvrp

I’m fairly new to lxd, so it’s possible that I am missing something obvious here. If so, my apologies; I would very much appreciate any pointers.

The Setup

I have a number of Raspberry Pi 3s and 4s running Ubuntu 20.10 (mix of 64 and 32 bit) attached to a VLAN-capable hardware switch.
Each Pi is a member of a single lxd cluster and uses a dedicated VLAN interface for clustering (i.e. each Pi has an eth0 in the default VLAN with a DHCP-provided IP and a vlan2 interface in VLAN 2 with a hard-coded IP from 10.0.0.0/24).

The Plan

I would like to be able to launch lxc containers on any host through lxd clustering with sets of these containers being placed in dedicated VLANs so as to create a number of isolated networking environments for them.

For example, I might have containers foo, bar and baz running on pi1 and pi2 sharing VLAN ID 5 and at the same time also have containers bam and bat, running on pi2 and pi3 sharing a separate VLAN 6:

container   lxd host   vlan id
foo         pi1        5
bar         pi1        5
baz         pi2        5
bam         pi2        6
bat         pi3        6

The whole setup should be automated and reproducible.

What I’ve done so far

I’ve tried to avoid using an overlay network because of the added complexity and because I figure VLANs should be capable of providing what I need.

I have managed to achieve the desired setup by creating a dedicated network interface on the host and then attaching it to each container:

ubuntu@pi2$ sudo ip link add link eth0 lxdvlan5 type vlan id 5 gvrp on # gvrp so that pi advertises the new vlan id to the switch

ubuntu@pi2$ sudo ip link set dev lxdvlan5 up

ubuntu@pi2$ lxc init --target pi2 ubuntu:20.04 vlantestubuntu
ubuntu@pi2$ lxc config device add vlantestubuntu lxdvlan5 nic nictype=physical parent=lxdvlan5 name=eth0
ubuntu@pi2$ lxc start vlantestubuntu

This works:

  • Each container created in this way will have an eth0 that is in the correct VLAN
  • The host does not see these interfaces while the containers are running
  • GVRP advertisements are made to the switch, thus containers on different Pis can talk to each other on VLAN 5
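The manual steps above could be wrapped in a small per-host helper. This is only a minimal sketch: `vlan_nic_commands` is a hypothetical function name, the parent interface is assumed to be eth0, and the `lxdvlan<ID>` naming simply follows the example above. It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Print the host-side and lxc commands needed to give one container a
# dedicated VLAN interface with GVRP enabled (hypothetical helper).
# Arguments: <container> <vlan id> [parent interface, default eth0]
vlan_nic_commands() (
    container=$1
    vlan_id=$2
    parent=${3:-eth0}
    ifname="lxdvlan${vlan_id}"

    # Create the VLAN interface; "gvrp on" asks the kernel to advertise
    # the VLAN ID to the switch.
    echo "ip link add link ${parent} name ${ifname} type vlan id ${vlan_id} gvrp on"
    echo "ip link set dev ${ifname} up"
    # Hand the interface to the container as its eth0.
    echo "lxc config device add ${container} ${ifname} nic nictype=physical parent=${ifname} name=eth0"
)

# Example: print the commands for the test container in VLAN 5.
vlan_nic_commands vlantestubuntu 5
```

The printed lines can then be run by hand on the target host. This does not remove the need to run ip commands on that host, which is the problem described next.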

The problem

The above has the slight wrinkle that it requires dedicated ip commands to be run on the host after the container has been created. It thus can’t be automated using lxc on a single cluster node alone.

I have looked at the networking documentation for lxd at https://lxd.readthedocs.io/en/latest/networks/ but have found it to be quite sparse.

Is it possible to achieve what I’m after using lxd alone and if so, can anyone give me some pointers on how to do it?

n.b: I am hoping to avoid using an overlay network if possible because of the added complexity and the fact that VLANs provide enough isolation by themselves.

Hi,

I will say from the outset that I am not familiar with gvrp so I will not be able to advise on that, and from what I know LXD has no mechanism currently for advertising VLANs via GVRP (not that it couldn’t potentially be added).

Also, regarding documentation for instance NICs, you may find the NIC device documentation more useful than the managed network page: https://linuxcontainers.org/lxd/docs/master/instances#type-nic

If you’re able to set up the VLANs you need in advance on your switch and then make sure that they are all trunked to each Pi host, then they will be available for use without needing GVRP.

If that is acceptable, then you may find the macvlan NIC with a vlan property set would be sufficient.

E.g.

lxc config device add <instance> <device name> nic nictype=macvlan name=<interface name> parent=<host interface> vlan=<vLAN ID>

This will create a VLAN interface on top of the host interface (if not already present) and then a macvlan interface on top of that, which will then be passed into the instance.

This will also allow multiple instances per host VLAN interface.

See https://linuxcontainers.org/lxd/docs/master/instances#nic-macvlan
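As a rough illustration for the container table earlier in the thread, the command above could be generated per container. This is a sketch under assumptions: `macvlan_commands` is a hypothetical helper, the host interface is assumed to be eth0 on every Pi, and the device name `eth0` is an arbitrary choice. It only prints the commands.

```shell
#!/bin/sh
# Print one "lxc config device add" command per container, using the
# macvlan NIC type with the vlan property (hypothetical helper).
# Reads lines of "<container> <vlan id>" on stdin.
# Argument: [parent interface, default eth0]
macvlan_commands() (
    parent=${1:-eth0}
    while read -r container vlan_id; do
        # Skip blank lines.
        [ -n "$container" ] || continue
        echo "lxc config device add ${container} eth0 nic nictype=macvlan name=eth0 parent=${parent} vlan=${vlan_id}"
    done
)

# Container-to-VLAN mapping taken from the table earlier in the thread.
macvlan_commands eth0 <<'EOF'
foo 5
bar 5
baz 5
bam 6
bat 6
EOF
```

Because several instances can share one host VLAN interface with macvlan, foo and bar on pi1 would both end up on the same VLAN 5 interface.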

GVRP is used to dynamically advertise to the switch what VLAN IDs are used by devices behind each port.
Without it, each VLAN ID and port membership need to be configured in advance (and by hand) on the switch itself (in my case via a crappy web interface :frowning: ).

Since the Linux kernel has support for GVRP, I was hoping there might be a way to hand off that responsibility to the kernel? But as you say, macvlan doesn’t seem to have that capability…?

Yes it probably is possible but not something we automate at this point.

I was suggesting that you just set up all of the VLANs to go to all of your Pis as trunked ports, then they can be used on-demand by the containers on each host.

So in your table above, you would set up VLANs 5 and 6 to go to pi1, pi2 and pi3. They can then be used ad hoc by containers as needed. I’m not sure of the scale at which you’re proposing to use VLANs. If it was a handful then it would be doable, but if there are lots of different groups then it would be time-consuming to set up.

When specifying the vlan property on the macvlan NIC we effectively run this command:

ip link add link <parent> name <parent>.<vlan> type vlan id <vlan>

See https://github.com/lxc/lxd/blob/master/lxd/device/nic_macvlan.go#L128
And https://github.com/lxc/lxd/blob/3e3f3fe032fc536876bb265db19dcb0fbfddb5bf/lxd/network/network_utils.go#L1026-L1052

So if we also added a gvrp config property that was a boolean, then if it was true we could then run:

ip link add link <parent> name <parent>.<vlan> type vlan id <vlan> gvrp on

This should be a relatively small change. Please could you open a feature issue here https://github.com/lxc/lxd/issues so we have somewhere to record the request. Thanks
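To make the proposed behaviour concrete, here is a shell sketch of the logic (not the actual LXD code, which is written in Go). `vlan_link_command` is a hypothetical name, and the third argument stands in for the proposed boolean gvrp property.

```shell
#!/bin/sh
# Build the "ip link add" command for a VLAN interface on top of a parent,
# appending "gvrp on" only when the boolean argument is "true".
# Arguments: <parent> <vlan id> [gvrp: true|false, default false]
vlan_link_command() (
    parent=$1
    vlan_id=$2
    gvrp=${3:-false}

    cmd="ip link add link ${parent} name ${parent}.${vlan_id} type vlan id ${vlan_id}"
    if [ "$gvrp" = "true" ]; then
        cmd="${cmd} gvrp on"
    fi
    echo "$cmd"
)

vlan_link_command eth0 5        # current behaviour, no GVRP
vlan_link_command eth0 5 true   # with the proposed gvrp property set
```

Defaulting the flag to false keeps the existing behaviour unchanged for anyone who does not set the property.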

I come from a Networking Engineering background and GVRP is something I thought was from the past (I had to dust off a few cobwebs to remember it). I haven’t touched anything like that or dynamic vlans (or dreaded Cisco VTP) in years. I would look into using overlay networks or just script this. It seems a lot of effort for a corner case / old dying / dead protocol.
I’m trying to figure out what dynamically setting up the VLANs between the switch and the Pis actually gives you. Is this for some kind of mobility, e.g. you wish to move the Pis to different switches in different office locations and have them spin up on the right VLANs without having to log in and reprogram the switch each time?
My last use of dynamic vlans was for workers to be able to move to different floors in the building and end up on the correct vlan. Usually involved some kind of radius server look ups based on the device mac address and the username etc.

Cheers! :sunglasses:
Jon.

I don’t know about dead, but I certainly appreciate that the use of GVRP is somewhat of a corner case.

My use case is this:
I’m building a home lab of sorts, using some Pis on a common switch.
The switch I’m using had to meet a number of requirements, among them the ability to

  • provide PoE
  • turn PoE on and off remotely for individual ports
  • support VLANs
  • be affordable XD

After some research I settled on a Netgear GS510, but I’ve found that many switches in my price range that support the above handle VLANs somewhat primitively.
With the GS510, individual ports can’t easily be made to receive all VLANs; instead, each port has to be manually configured as a member of any given VLAN, unless GVRP is used, in which case the membership is managed dynamically.

Most of the switches I looked at seem to behave in similar ways, but also all support GVRP, so this change appears worthwhile to me as long as it is small and otherwise benign.

I have now created https://github.com/lxc/lxd/issues/8318 for this feature request.

We have a branch that adds GVRP support, are you able to help us test it if I provide a binary and show you how to side load it into your snap (keeping in mind that this will be equivalent to latest/edge snap channel and may prevent you downgrading back to the previous branch due to DB changes)?

I’d be delighted to take it for a spin.

My lxd cluster is comprised of Pis and currently entirely expendable, so no worries about downgrading.
Is there a particular order in which the nodes need to be upgraded in order for any DB changes to be consistent? Currently the DB is distributed across three members.

I will also need some hints around dealing with snap. I’m fairly new to it compared to apt.

Thank you very much for this, though. I’m looking forward to seeing how it flies. :slight_smile:

Great, here are the steps I’ve used on a fresh machine running Ubuntu Focal with a physical interface enp1s0f0 connected to the switch. I set up LXD without using a cluster for simplicity.

Install LXD and sideload GVRP version of lxd binary:

sudo apt install snapd -y
sudo snap install lxd --edge # or sudo snap refresh lxd --channel=latest/edge
sudo lxd init --auto
sudo mv lxd.gvrp /var/snap/lxd/common/lxd.debug
sudo systemctl reload snap.lxd.daemon

Create container and add physical NIC using a VLAN enabled parent:

lxc init images:ubuntu/focal c1
lxc config device add c1 eth0 nic nictype=physical parent=enp1s0f0 vlan=1234
lxc start c1

# Confirm physical VLAN 1234 parent has been passed in (existing functionality):
lxc exec c1 -- ip -d l show eth0
26: eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6c:b3:11:1c:09:52 brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 minmtu 0 maxmtu 65535 
    vlan protocol 802.1Q id 1234 <REORDER_HDR> addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

Now enable GVRP on NIC and check it has been applied:

lxc stop c1
lxc config device set c1 eth0 gvrp=true
lxc start c1
lxc exec c1 -- ip -d l show eth0
27: eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6c:b3:11:1c:09:52 brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 minmtu 0 maxmtu 65535 
    vlan protocol 802.1Q id 1234 <REORDER_HDR,GVRP> addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
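If you want to script that check rather than read the output by eye, something along these lines might do. It is a sketch only: `has_gvrp_flag` is a hypothetical helper, and it assumes the flag appears inside the angle brackets on the "vlan protocol" line, as in the output above.

```shell
#!/bin/sh
# Succeed if the "ip -d link show" output on stdin lists GVRP among the
# VLAN flags, e.g. "<REORDER_HDR,GVRP>" (hypothetical helper).
has_gvrp_flag() {
    grep -Eq 'vlan protocol [^ ]+ id [0-9]+ <[^>]*GVRP[^>]*>'
}

# Example with a line shaped like the output above.
sample='    vlan protocol 802.1Q id 1234 <REORDER_HDR,GVRP> addrgenmode eui64'
if printf '%s\n' "$sample" | has_gvrp_flag; then
    echo "GVRP enabled"
else
    echo "GVRP not enabled"
fi
```

Against a real container it could be used as `lxc exec c1 -- ip -d l show eth0 | has_gvrp_flag`.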

The GVRP version of LXD is here: https://tomp.uk/files/lxd.gvrp

@stgraber just pointed out to me that you will need an ARM build of the binary, as that earlier link is an x86_64 build. Do you have an x86_64 machine to test it with? If not, I will see about getting an ARM build.

I’m afraid I currently do not have an x86 setup to test, no.
Since this test will rely to some extent on the hardware being attached to a GVRP-capable switch, I’d appreciate something that can run on my Pis:

ubuntu@friednoodle01:~$ uname -a
Linux friednoodle01 5.8.0-1010-raspi #13-Ubuntu SMP PREEMPT Wed Dec 9 17:19:55 UTC 2020 armv7l armv7l armv7l GNU/Linux

I’m going to try it on two of @stgraber’s lab machines that are now connected to a GARP-enabled switch.

We confirmed it works and registers on the switch using GARP, so I’m just running it through the automated tests and then it will be ready to merge.

This is now merged and will be available in the edge snap shortly.