How to debug macvlan network interfaces

I have a machine that is directly connected to a scientific instrument that generates data over 2x 25GbE interfaces. I’d like to set up containers on the machine so that the different people who use the instrument can each have their own environment, and I’m using macvlan to share the interfaces. I have one container set up, but it doesn’t see any data. On the host I can see the data with tcpdump, and the destination MAC address and IP on the packets are correct. So I’m guessing that macvlan isn’t picking the packets up and delivering them to the container? (Ubuntu 18.04 with LXD 3.20 from snap)
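For reference, I attached the interfaces roughly like this (a sketch from memory; the container name is a placeholder, and the parents are the two 25GbE ports):

```
# Attach each 25GbE host interface to the container as a macvlan NIC.
lxc config device add <container> data0 nic nictype=macvlan parent=ens6f0
lxc config device add <container> data1 nic nictype=macvlan parent=ens6f1
```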

Any advice on how to debug this?

The issue that you are probably running into is from the macvlan driver that is in use (as with unRAID’s br0): the host and macvlan containers will never be able to talk to each other.

This isn’t a host -> container network problem, but a remote -> container one.

Please can you show your network config by pasting the output of:

```
lxc config show <instance name> --expanded
```

How are you configuring your networking inside the instance? Does it use DHCP to get its IP, and is that working OK? Is there any other node on the network, apart from the host (which you can’t reach with macvlan) and the equipment in question? If so, can you ping it?
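For example (the instance name and peer IP are placeholders):

```
# Check what addresses the instance actually has.
lxc exec <instance name> -- ip a

# Try reaching another node on the network; the host itself is not
# reachable over macvlan, so pick a different machine.
lxc exec <instance name> -- ping -c 3 <peer ip>
```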

Thanks
Tom

Also, you mention there are 2 interfaces: how are these configured? Are they separately named, or teamed somehow? Does the host that runs the container also have 2 NICs?
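If you’re not sure, something like this will show whether a link is enslaved to a bond/team (interface name is a placeholder):

```
# '-d' prints detailed link info; a bonded/teamed NIC shows a master
# device in this output.
ip -d link show <nic>
```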

```
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 18.04 LTS amd64 (release) (20191021)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20191021"
  image.type: squashfs
  image.version: "18.04"
  raw.idmap: |-
    uid 1002 1001
    gid 1002 1001
  security.privileged: "true"
  volatile.base_image: d6f281a2e523674bcd9822f3f61be337c51828fb0dc94c8a200ab216d12a0fff
  volatile.data0.host_name: mac189991b6
  volatile.data0.hwaddr: 00:16:3e:68:c3:81
  volatile.data0.last_state.created: "false"
  volatile.data1.host_name: macc323f2f3
  volatile.data1.hwaddr: 00:16:3e:93:f7:14
  volatile.data1.last_state.created: "false"
  volatile.eth0.host_name: mace954f242
  volatile.eth0.hwaddr: 00:16:3e:a7:13:09
  volatile.eth0.last_state.created: "false"
  volatile.idmap.base: "0"
  volatile.idmap.current: '[]'
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
devices:
  data0:
    name: data0
    nictype: macvlan
    parent: ens6f0
    type: nic
  data1:
    name: data1
    nictype: macvlan
    parent: ens6f1
    type: nic
  eth0:
    name: eth0
    nictype: macvlan
    parent: enp3s0f2
    type: nic
  fastdisk:
    path: /fastdisk
    source: /fastdisk
    type: disk
  pulsedata:
    path: /e3d
    source: /bigdisk/e3d
    type: disk
  root:
    path: /
    pool: bigdisk
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

There are 3 interfaces on the host: 1x 1GbE and 2x 25GbE. The 1GbE interface is used for control; that is a normal DHCP network and it works well both on the host and in the containers.

The 2x 25GbE links carry the two unidirectional UDP data streams from the device. The devices are directly connected, with no switches or other devices on that network, and we have static IPs set up.

It seems that tcpdump doesn’t work in the container at all: it doesn’t find any interfaces, doesn’t see any packets, and generally doesn’t run normally. After ditching tcpdump I’ve found that I am getting data on one channel in the container; I now need to find out what’s wrong with the other.

Are there any debug tools or documentation for macvlan? I’d really like to know whether it’s just dropping packets somewhere.
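So far the only generic things I’ve found are the per-link counters and the macvlan mode details, e.g.:

```
# RX/TX statistics, including drop counters, for the parent interface.
ip -s link show ens6f0

# '-d' shows the macvlan mode (bridge/vepa/private) of the host-side
# container interface (mac189991b6 in my case, from the config above).
ip -d link show mac189991b6
```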

I’ve edited your post using the three backticks format to help with readability.

The issue I can see straight away is that you don’t have any NICs in your container.

You can see this in the devices section of your config: there is no device with a type of `nic`.

How did you add the macvlan NIC?
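Normally you would add one with something like this (the instance and parent names here are placeholders):

```
# Attach a macvlan NIC backed by a host interface to an instance.
lxc config device add <instance> eth0 nic nictype=macvlan parent=<host nic>
```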

Did you run the `lxc config show` command with the `--expanded` flag, btw?

Ah, my mistake, I hadn’t run it with the expanded option. I’ve edited my original post and hopefully kept the new formatting.

OK cool, can you show the output of `lxc exec <container> -- ip a` please.

Also `lxc exec <container> -- ip r`.

Can I also see the output of `ip a` and `ip r` on the host too, please.

Hi Tom,

Container `ip a`:

```
62: data0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9600 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:68:c3:81 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.50.5.1/16 brd 10.50.255.255 scope global data0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe68:c381/64 scope link
       valid_lft forever preferred_lft forever
63: data1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9600 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:93:f7:14 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.50.5.129/16 brd 10.50.255.255 scope global data1
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe93:f714/64 scope link
```

Container `ip r`:

```
10.50.0.0/16 dev data0 proto kernel scope link src 10.50.5.1
10.50.0.0/16 dev data1 proto kernel scope link src 10.50.5.129
```

Host `ip a`:

```
2: ens6f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9600 qdisc mq state UP group default qlen 1000
    link/ether 98:03:9b:1b:33:a0 brd ff:ff:ff:ff:ff:ff
    inet 10.50.1.129/25 brd 10.50.1.255 scope global ens6f0
       valid_lft forever preferred_lft forever
    inet6 fe80::9a03:9bff:fe1b:33a0/64 scope link
       valid_lft forever preferred_lft forever
3: ens6f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9600 qdisc mq state UP group default qlen 1000
    link/ether 98:03:9b:1b:33:a1 brd ff:ff:ff:ff:ff:ff
    inet 10.50.1.1/25 brd 10.50.1.127 scope global ens6f1
       valid_lft forever preferred_lft forever
    inet6 fe80::9a03:9bff:fe1b:33a1/64 scope link
       valid_lft forever preferred_lft forever
```

Host `ip r`:

```
10.50.1.0/25 dev ens6f1 proto kernel scope link src 10.50.1.1
10.50.1.128/25 dev ens6f0 proto kernel scope link src 10.50.1.129
```

The target device at the end of the links has IP 10.50.1.10 for both interfaces.

Thanks. That is quite a curious setup, using the same subnet on each link. I can see that causing problems: it potentially means that packets for one link will flow down the other, which is likely to break with macvlan, as macvlan will only accept packets arriving at the correct interface.

Unless you have a good reason for doing so, I recommend a separate, smaller subnet for each link. If they are truly point-to-point then a /30 will suffice, which means there are just 2 usable IPs in the subnet.

E.g.:

10.50.5.0/30 would give you 10.50.5.1 on one end and 10.50.5.2 on the other.
10.50.6.0/30 would give you 10.50.6.1 on one end and 10.50.6.2 on the other.

Does the host actually need IPs on the links that are connected to the equipment? (Not that this would be an issue; you’d just need to make the subnets larger.)

This way you’d end up with 2 non-overlapping routes in the device and the container:

```
10.50.5.0/30 dev data0 proto kernel scope link src 10.50.5.2
10.50.6.0/30 dev data1 proto kernel scope link src 10.50.6.2
```

Then packets would go down the correct link to reach 10.50.5.1 or 10.50.6.1 respectively.
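Inside the container that would look something like this (a sketch using the data0/data1 device names from your config; the .2 addresses are just an example allocation):

```
# Drop the overlapping /16 addresses and assign one /30 per link.
ip addr flush dev data0
ip addr add 10.50.5.2/30 dev data0

ip addr flush dev data1
ip addr add 10.50.6.2/30 dev data1
```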

Yeah, it’s a pretty broken setup, but I don’t know what to do about it. I need to use the host interfaces right now because tcpdump doesn’t work in the containers, so I can’t see the network traffic any other way. The network traffic is a UDP stream, but on both the host and the container I can only receive one of the streams. I can see both streams of data in tcpdump, but one is getting dropped by the host. At least the behaviour is now the same for the container, so it’s not a macvlan issue.

I chose 10.50.5.1/16 and 10.50.5.129/16 for the container so that both interfaces are on the same subnet as the device. I don’t know which subnet the device thinks it’s on.

Is there any way to find out why the packets sent to the other interface are dropped? There seem to be very few tools for debugging these problems.
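The closest I’ve got is watching the kernel counters while the stream runs, and checking reverse-path filtering, which in strict mode silently drops packets that arrive on an interface the routing table wouldn’t use to reach the source. A sketch of what I’m looking at:

```
# Link-level drop counters for both data interfaces.
ip -s link show ens6f0
ip -s link show ens6f1

# UDP-level receive errors (e.g. buffer overruns).
netstat -su

# rp_filter=1 (strict) drops packets arriving on the "wrong" interface;
# log_martians makes those drops visible in the kernel log.
sysctl net.ipv4.conf.all.rp_filter
sysctl -w net.ipv4.conf.all.log_martians=1
```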

> 10.50.5.0/30 would give you 10.50.5.1 on one end and 10.50.5.2 on the other.
> 10.50.6.0/30 would give you 10.50.6.1 on one end and 10.50.6.2 on the other.

I can’t do this, as the device uses the same IP address for both interfaces.

Would

10.50.5.0/29 with 10.50.5.1 and 10.50.5.2 at the host and 10.50.5.3 at the device.

be any better?
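If my subnet maths is right, that would lay out as follows (a sketch; the host interface names are from the `ip a` output above):

```
# 10.50.5.0/29 spans 10.50.5.0-10.50.5.7: network .0, broadcast .7,
# and six usable addresses, so all three endpoints fit in one subnet.
ip addr add 10.50.5.1/29 dev ens6f0   # host, link 0
ip addr add 10.50.5.2/29 dev ens6f1   # host, link 1
# The device would keep 10.50.5.3 on both of its interfaces.
```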

Can you elaborate on what you mean by seeing both streams on the host? Are they coming to/from the right IP on the right interface?

I think from previous postings that tcpdump can be coerced into working with some AppArmor magic, but if you want a quick result you can use tshark, which has escaped AppArmor’s attentions up to this point (I hope).
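E.g. something like:

```
# Capture a handful of packets inside the container with tshark; the
# interface and capture filter are just examples.
lxc exec <container> -- tshark -i data0 -c 10 udp
```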


Yes, I’ve used tcpdump inside containers before. You can also use its line-buffered mode, which has worked around AppArmor issues in the past: `tcpdump -l`.

https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1641236/comments/7
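Something like this (interface name is a placeholder):

```
# '-l' makes tcpdump line-buffered, which has helped when AppArmor
# interferes with its normal output.
lxc exec <container> -- tcpdump -l -nn -i <interface>
```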

On the host, if I run

```
tcpdump -i ens6f(0|1)
```

I can see both streams of data:

```
tcpdump: listening on ens6f0, link-type EN10MB (Ethernet), capture size 262144 bytes
16:17:59.010974 IP (tos 0xb8, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 9586)
    10.50.1.10.4369 > 10.50.5.1.5000: UDP, length 9558
tcpdump -i ens6f1 -nn -s0 -v -e
tcpdump: listening on ens6f1, link-type EN10MB (Ethernet), capture size 262144 bytes
16:19:27.316820 0e:50:c2:5d:21:2f > 00:16:3e:93:f7:14, ethertype IPv4 (0x0800), length 9600: (tos 0xb8, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 9586)
    10.50.1.10.4369 > 10.50.5.129.5000: UDP, length 9558
```