I have a machine that is directly connected to a scientific instrument that generates data with 2x25GE interfaces. I’d like to setup containers on the machine so the different people who use the instrument can have their own environment and I’m using macvlan to share the interfaces. I have one container setup but it doesn’t see any data. On the host I can see the data with tcpdump and the destination mac address and IP on the packets is correct. So I’m guessing that macvlan isn’t picking the packets up and routing them to the container? (Ubuntu 18.04 with lxd 3.20 from snap)
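For context, the data NICs are attached to the container as macvlan devices roughly like this (the container name, host interface names, and in-container names below are placeholders for my setup):

```shell
# Sketch: attach both 25GE host interfaces to a container as macvlan NICs.
# "user1", "ens1f0"/"ens1f1" and "data0"/"data1" are placeholder names.
lxc config device add user1 data0 nic nictype=macvlan parent=ens1f0 name=data0
lxc config device add user1 data1 nic nictype=macvlan parent=ens1f1 name=data1

# Confirm the devices took effect:
lxc config show user1 --expanded
```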
The issue you are probably running into is a limitation of the macvlan driver itself: a macvlan interface can never talk to its parent interface, so the host and its containers will not see each other's traffic.
Please can you show your network config by pasting the output of:
lxc config show <instance name> --expanded
How are you configuring the networking inside the instance? Does it use DHCP to get its IP, and is that working OK? Is there any other node on the network, apart from the host (which you can't reach over macvlan) and the equipment in question? If so, can you ping it?
Also, you mention there are 2 interfaces; how are these configured? Are they separately named, or teamed somehow? Does the host that runs the container also have 2 NICs?
There are 3 interfaces on the host: 1x 1GE and 2x 25GE. The 1 Gig is used for control; that is a normal DHCP network and works well on the host and in the containers.
The 2x 25G links carry the 2 unidirectional UDP data streams from the device. The devices are directly connected; there are no switches or other devices on that network. We have static IPs set up.
It seems that tcpdump doesn't work in the container at all: it doesn't find any interfaces, doesn't see any packets, and doesn't seem to run normally. After ditching tcpdump I've found that I am getting data on one channel in the container. Now I need to work out what's wrong with the other.
Are there any debug tools or info for macvlan? I’d really like to know if it’s just dropping packets somewhere.
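For anyone else looking, the standard kernel counters are the obvious first stop; a sketch, with my interface names assumed:

```shell
# Per-interface statistics; the RX "dropped" column counts packets the
# kernel discarded on that interface (interface names are placeholders).
ip -s link show data0
ip -s link show data1

# Kernel-wide drop/error counters; nstat prints deltas between runs,
# so run it, wait, and run it again while traffic is flowing.
nstat -az | grep -Ei 'drop|err'
```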
Thanks. That is quite a curious setup. Using the same subnet on each link can cause problems: packets for one link may flow down the other, which would likely cause problems with MACVLAN, since it will only accept packets arriving at the correct interface.
Unless you have a good reason for doing otherwise, I recommend a smaller subnet for each link. If they are truly point-to-point then a /30 will suffice, which means there are just 2 usable IPs in the subnet.
E.g.
10.50.5.0/30 would give you 10.50.5.1 on one end and 10.50.5.2 on the other.
10.50.6.0/30 would give you 10.50.6.1 on one end and 10.50.6.2 on the other.
Does the host actually need IPs on the links that are connected to the equipment? (Not that this would be an issue, but you'd just need to make the subnets larger.)
This way you’d end up with 2 non-overlapping routes in the device and the container:
10.50.5.0/30 dev data0 proto kernel scope link src 10.50.5.2
10.50.6.0/30 dev data1 proto kernel scope link src 10.50.6.2
Then packets would go down the correct link to reach 10.50.5.1 or 10.50.6.1 respectively.
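A minimal sketch of that addressing on the container side, assuming the data interfaces are named data0 and data1:

```shell
# Assign one /30 per point-to-point link (names/addresses are examples).
ip addr add 10.50.5.2/30 dev data0
ip addr add 10.50.6.2/30 dev data1

# The kernel creates the connected routes automatically; verify:
ip route show
# Expect one "proto kernel scope link" route per interface.
```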
Yeah, it's a pretty broken setup, but I don't know what to do about it. I need to use the host interfaces right now, because tcpdump doesn't work in the containers, so I can't see the network traffic any other way. The traffic is a pair of UDP streams, but on both the host and the container I can only receive one of them. I can see both streams in tcpdump, but one is getting dropped by the host. The behaviour is at least the same now for the container, so it's not a macvlan issue.
I chose 10.50.5.1/16 and 10.50.5.129/16 for the container so that both interfaces are on the same subnet as the device. I don't know which subnet the device thinks it's on.
Is there any way to find out why the packets sent to the other interface are dropped? There seem to be very few tools for debugging these problems.
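One mechanism that does silently drop packets in this kind of multi-homed, single-subnet setup is reverse-path filtering: with strict rp_filter, the kernel discards packets that arrive on an interface it would not itself use to reach the source address. A sketch for checking it (interface names are mine):

```shell
# Check the current reverse-path filter settings (0=off, 1=strict, 2=loose).
sysctl net.ipv4.conf.all.rp_filter
sysctl net.ipv4.conf.data0.rp_filter
sysctl net.ipv4.conf.data1.rp_filter

# As a test, loosen it on the data interfaces (needs root):
# sysctl -w net.ipv4.conf.data0.rp_filter=2
# sysctl -w net.ipv4.conf.data1.rp_filter=2
```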
10.50.5.0/30 would give you 10.50.5.1 on one end and 10.50.5.2 on the other.
10.50.6.0/30 would give you 10.50.6.1 on one end and 10.50.6.2 on the other.
I can't do this, as the device uses the same IP address for both interfaces.
Would 10.50.5.0/29 work, with 10.50.5.1 and 10.50.5.2 at the host and 10.50.5.3 at the device?
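A sketch of what I mean, though my assumption is that the two connected routes would then overlap, so the kernel would still pick just one interface to reach 10.50.5.3:

```shell
# 10.50.5.0/29 gives usable hosts .1 through .6 (names are placeholders).
ip addr add 10.50.5.1/29 dev data0
ip addr add 10.50.5.2/29 dev data1

# Both interfaces now have a connected route to the same /29; check which
# one the kernel would actually use to reach the device:
ip route get 10.50.5.3
```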
I think from previous postings that tcpdump can be coerced to work with some AppArmor magicking, but if you want a quick result you can use tshark, which has escaped AppArmor's attentions up to this point (I hope).
Yes, I've used tcpdump inside containers before. You can also use its line-buffered mode, which has worked around AppArmor issues in the past: tcpdump -l
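E.g. a line-buffered capture on one of the data interfaces (interface name is a placeholder):

```shell
# -l line-buffers stdout, -n skips name resolution; filter to UDP only.
tcpdump -l -n -i data0 udp
```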