Hi, I have been testing a scenario in which an LXD container is set up to simply route/forward IP packets from one interface out of another. We would like the LXD container to act logically as a virtual router, with the goal of the highest throughput possible. Packet-per-second throughput through the LXD container has been pretty bad (~40k pps) due to packet drops showing on the vtap/veth interface connecting an OVS bridge up to the LXD container. It seems as if network packet processing for the LXD container is being restricted somehow.

I have dug through the cgroups config and tried applying 'limits.ingress: 10Gbit' to my LXD profile, but it does not seem to help. softirq processing by the LXD host seems to remain around the equivalent of one core's utilization (aggregate, spread across multiple cores), and judging by the drops under /proc/net/softnet_stat, the processing queue is not getting serviced quickly enough. CPU utilization inside the LXD container is basically nil, and the LXD host is moderate on the cores doing the softirq processing. Is there some other inherent network I/O restriction placed on an LXD container, or possibly imposed by cgroups?
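For reference, the drop counters I am reading are the second column of /proc/net/softnet_stat (hex values, one row per CPU). A small sketch to decode them into something readable (the field layout is the one documented for the kernel's softnet stats: field 1 = packets processed, field 2 = packets dropped when the per-CPU backlog queue was full):

```shell
#!/bin/sh
# Decode /proc/net/softnet_stat: one row per CPU, hex fields.
# Field 1 = packets processed by that CPU's softirq handler.
# Field 2 = packets dropped because the per-CPU backlog queue
#           (net.core.netdev_max_backlog) was already full.
cpu=0
while read -r processed dropped rest; do
    printf 'cpu%-3s processed=%d dropped=%d\n' "$cpu" "0x$processed" "0x$dropped"
    cpu=$((cpu + 1))
done < /proc/net/softnet_stat
```

Non-zero values in the dropped column, concentrated on one or two CPUs, match the "one core's worth of softirq" symptom above.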
Topology:
iperf3 |------------| (ovs-br1)----(LXD-container)----(ovs-br2) |-------------| iperf3
       traffic1                   LXD server (UUT)                  traffic2
       1.1.1.2            1.1.1.1                2.2.2.1            2.2.2.2
Secondary question: where would the 'limits.egress' LXC config show up within the cgroups hierarchy?
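On that secondary question: I have not been able to find anything matching the limit under the cgroup tree, so my working assumption is that LXD enforces limits.egress/limits.ingress with tc on the host-side veth rather than via cgroups. A sketch of how one could check (container name c1 from the profile above; whether the host-side interface is recorded under volatile.eth0.host_name is my assumption):

```shell
#!/bin/sh
# Assumption: LXD applies NIC limits as tc qdiscs/filters on the
# host-side veth, not in the cgroup hierarchy.
VETH=$(lxc config get c1 volatile.eth0.host_name)   # host-side peer of the container's eth0
tc qdisc show dev "$VETH"            # egress shaping would appear here
tc filter show dev "$VETH" ingress   # ingress policing would appear here
```

If the limits show up as qdiscs/filters there, that would explain why nothing appears under /sys/fs/cgroup.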
We are ultimately using the OpenStack "VLAN provider" model to set this up, which basically bridges traffic into the LXD host and then wires the LXD container up via vtap interfaces. I see the same behavior with both Ubuntu 16.04 and 18.04 as the LXD host (LXD 'Version: 2.0.11-0ubuntu1~16.04.4' and 'Version: 3.0.1-0ubuntu1~18.04.1' respectively), under the more simplified scenario shown above. Below is the basic profile being used.
root@compute3:/sys/fs/cgroup/cpu/lxc/c3# lxc profile show dual
config: {}
description: Default LXD profile
devices:
  eth0:
    limits.egress: 10Gbit
    limits.ingress: 10Gbit
    nictype: bridged
    parent: ovs-br1
    type: nic
  eth1:
    limits.egress: 10Gbit
    limits.ingress: 10Gbit
    nictype: bridged
    parent: ovs-br2
    type: nic
  root:
    path: /
    pool: default
    type: disk
name: dual
used_by:
- /1.0/containers/c1
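In case it is relevant to anyone reproducing this: the knobs that interact with the softnet_stat drops are the per-CPU backlog size and RPS on the host-side veth. A sketch only; "vethXYZ" and the CPU mask "f" (CPUs 0-3) are placeholders, not my actual config:

```shell
#!/bin/sh
# Placeholders: "vethXYZ" = host-side veth of the container NIC,
# mask "f" = CPUs 0-3. Adjust both for the actual host.
# Current per-CPU backlog limit (softnet_stat drops mean it overflowed):
sysctl net.core.netdev_max_backlog
# Spread receive softirq work for the veth across several CPUs (RPS):
echo f > /sys/class/net/vethXYZ/queues/rx-0/rps_cpus
# Raise the backlog so bursts are less likely to be dropped:
sysctl -w net.core.netdev_max_backlog=16384
```

Neither setting lifts a hard per-container cap if one exists, which is really what I am trying to find out.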
Any help or suggestions would be greatly appreciated!