LXD 3.13 has been released

Great news!

I had a look at the network_ipvlan LXC feature. In LXD 3.13 (channel: candidate),

$ lxc info 
...
  kernel_version: 4.15.0-48-generic
  lxc_features:
    mount_injection_file: "true"
    network_gateway_device_route: "false"
    network_ipvlan: "false"
    network_l2proxy: "false"
    seccomp_notify: "false"
...
$ 

It is a feature that is not enabled in my case. Is it user-configurable?

https://github.com/lxc/lxd/blob/master/lxd/daemon.go#L580

It appears it is not user-configurable. As mentioned in the announcement, it relates to the version of liblxc that is bundled in the snap package.

The master branch of liblxc knows about network_ipvlan.

But what version of liblxc does the 3.13 LXD snap package have?

$ snap run --shell lxd
bash-4.3$ ls -l /snap/lxd/current/lib/liblxc*
lrwxrwxrwx 1 0 0      15 May  9 17:12 /snap/lxd/current/lib/liblxc.so.1 -> liblxc.so.1.5.0
-rwxr-xr-x 1 0 0 1068656 May  9 17:29 /snap/lxd/current/lib/liblxc.so.1.5.0
-rwxr-xr-x 1 0 0   80688 May  9 17:29 /snap/lxd/current/lib/liblxcfs.so
bash-4.3$ 

The listing suggests a liblxc 1.5.0 version, yet there is no network_ipvlan string in liblxc.so.1.5.0.
How does this library relate to the lxc/lxc repository on GitHub? I could not find a relevant 1.5.0 tag in the source code repository.
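
In case someone wants to reproduce the check, something like the following should work (the path is the snap's bundled liblxc shown above; an empty result means the extension string is absent):

$ strings /snap/lxd/current/lib/liblxc.so.1.5.0 | grep network_ipvlan
$ 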

I have used the snap package from the candidate channel, which bundles liblxc at tag lxc-3.1.0. That is not recent enough to have the network_ipvlan goodness.

But what snap package channel is recent enough to have the git version of liblxc? Is it edge?

It is edge. Does edge have a recent enough LXD that includes IPVLAN support?

$ snap info lxd
...
channels:
  stable:        3.12        2019-04-16 (10601) 56MB -
  candidate:     3.13        2019-05-09 (10732) 56MB -
  beta:          ↑                                   
  edge:          git-566ee20 2019-05-09 (10738) 56MB -
...

There is a commit, 566ee20. Is that recent enough to have IPVLAN support in LXD? Here are the commits on GitHub, although Discourse does not show a good snapshot of the page.

So, we could switch to the edge snap channel of LXD and experience the full goodness of the new features. But that would be utterly inappropriate for the stability of the system. After all, would things break in LXD if you switch back and forth between the stable and edge channels? The LXD version is almost the same, but liblxc differs quite a bit; almost six months of changes to the code.

So, what do we do? You know what we do, but let’s first capture the error message you get when you try IPVLAN on a liblxc that is not recent enough. They say it is good for SEO.

$ lxc launch ubuntu:18.04 mycontainer --profile default --profile ipvlan
Creating mycontainer
Error: Failed container creation: Create container: Create LXC container: Initialize LXC: LXC is missing one or more API extensions: network_ipvlan, network_l2proxy, network_gateway_device_route

Let’s switch the snap package of LXD to the edge channel.

$ snap switch --channel edge lxd
"lxd" switched to the "edge" channel
$ snap refresh
lxd (edge) git-566ee20 from Canonical✓ refreshed

Will it work now?

$ lxc launch ubuntu:18.04 mycontainer --profile default --profile ipvlan
Creating mycontainer
Starting mycontainer
$ lxc list mycontainer
+-------------+---------+------+------+------------+-----------+
|    NAME     |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
+-------------+---------+------+------+------------+-----------+
| mycontainer | RUNNING |      |      | PERSISTENT | 0         |
+-------------+---------+------+------+------------+-----------+

No IP address from the LAN. What went wrong? Isn’t IPVLAN supposed to let the container get an IP address automatically from the LAN? Probably not, considering that it operates at Layer 3 (not Layer 2, as macvlan does). Scratch that then; we start over again.

To cut this short, you need to tell LXD the IP address for the container (ipv4.address=...). Then, LXD will be able to set up what is needed. And you need to supply the container with the DNS server settings, because without DNS, cloud-init takes a long time to complete the boot sequence (and to create the ubuntu account).

In a nutshell,

  1. You need to get LXD to set up the IP address for the container, because that’s the way IPVLAN works.
  2. You do not get a DNS server autoconfigured, so you need to configure it in some way, such as with cloud-init from a LXD profile.
  3. You do not need to (cannot?) add a default route. LXD/ipvlan does that for you. See below what the default route looks like (a sample profile covering points 1 and 2 is sketched after this output).
ubuntu@mycontainer:~$ route 
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         0.0.0.0         0.0.0.0         U     0      0        0 eth0
ubuntu@mycontainer:~$ ping -c 1 www.google.com
PING www.google.com (216.58.198.4) 56(84) bytes of data.
64 bytes from mil04s03-in-f4.1e100.net (216.58.198.4): icmp_seq=1 ttl=54 time=76.2 ms

--- www.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 76.275/76.275/76.275/0.000 ms
ubuntu@mycontainer:~$ 
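
To make points 1 and 2 concrete, here is a minimal sketch of what such a profile could look like. The parent interface (enp3s0), the container address (192.168.1.200) and the nameserver (1.1.1.1) are placeholder values, and the exact cloud-init/netplan syntax may need adjusting for your image.

$ lxc profile show ipvlan
config:
  user.network-config: |
    version: 2
    ethernets:
      eth0:
        dhcp4: false
        nameservers:
          addresses: [1.1.1.1]
description: IPVLAN example profile (placeholder values)
devices:
  eth0:
    ipv4.address: 192.168.1.200
    nictype: ipvlan
    parent: enp3s0
    type: nic
name: ipvlan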

@simos correct, the LXC version in the stable snap is 3.1 (as lxc info should show as driver_version), you need current master to get IPVLAN which the LXD edge snap has.
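
(For reference, the bundled liblxc version can be verified from lxc info; the output below is a sketch of what the stable/candidate snap would be expected to show, and the exact value depends on the snap revision.)

$ lxc info | grep driver
  driver: lxc
  driver_version: 3.1.0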

And indeed, IPVLAN doesn’t have a default gateway in the normal sense of the term, so that’s configured for you. DNS is up to you to configure though; you should be able to use network-config (netplan) for that, possibly by adding your DNS config to the loopback device so that netplan doesn’t attempt to mess with the ipvlan device?


I tried with network-config on eth0 to set the nameserver and it worked well.

As is, I use a LXD profile per ipvlan container, because each profile needs to specify a unique LAN IP address.

I noticed that the ipvlan container cannot communicate with the host, just as is the case with macvlan.

Good to hear simos.

You can also use a shared profile and add individual ipvlan NICs to a container.
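
(For example, something along these lines adds a per-container ipvlan NIC; the device name, parent interface and address are placeholders to adapt to your setup.)

$ lxc config device add mycontainer eth0 nic nictype=ipvlan parent=enp3s0 ipv4.address=192.168.1.201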

Yes, ipvlan, like macvlan, stops containers and the host communicating. You can add an ipvlan interface to the host though to overcome this, I believe.
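
(An untested sketch of that idea, with placeholder names and addresses: create an ipvlan interface on the host on the same parent, give it an address, and route the container’s address through it. The ipvlan mode has to match the one LXD uses on that parent; l3s is assumed here.)

$ sudo ip link add ipvlan0 link enp3s0 type ipvlan mode l3s
$ sudo ip addr add 192.168.1.250/32 dev ipvlan0
$ sudo ip link set ipvlan0 up
$ sudo ip route add 192.168.1.200/32 dev ipvlan0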

Have the changes to the network limits affected how limits are applied to individual containers? It appears our limits have suddenly stopped working, so we’re trying to determine the root cause. Currently we apply them by overwriting the devices config for the container.

Can you give an example of your config, and a bit more info on what you think isn’t working or has changed?

The network limits shouldn’t have changed; there were some tests added as part of 3.13 to ensure behaviour remained consistent.

Sure thing!

We currently update the devices value of the LXD container. The Ruby code looks something like this (we set the network and disk limits here):

# config = get the LXD container's config using the LXD API
config.devices = {
  eth0: {
    nictype: :bridged,
    parent: :lxdbr0,
    type: :nic,
    'limits.ingress' => '25Mbit',
    'limits.egress' => '5Mbit'
  },
  root: {
    path: '/',
    pool: 'default',
    size: '2048MB',
    type: :disk,
    'limits.read' => '10MB',
    'limits.write' => '10MB'
  }
}
# save the LXD container's config using the LXD API

This allows us to set dynamic network and disk limits based on the instance size we’ve selected. Previously both were working, but now only the disk one is.

Now, we’re only able to set it on the profile level, and only if the profile has that device, which is just our default profile. We do it by running this on the host machine:

lxc profile device set default eth0 limits.ingress 50000000

Note that the 50000000 could also be 50Mbit, but for some reason, this also didn’t appear to be working. We haven’t gone back and tried to reproduce that yet, however.

The instance launches with the eth0 interface already attached thanks to the default profile, so ideally we can continue to alter the limits for that device once it’s already been attached.

You’re correct that you can only set limits on the device at the profile level when the device is being added to the container as part of the profile. If you add a standalone device to a container then you can specify limits on a per-container basis.
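
(As an illustration, adding a standalone NIC with its own limits directly to a container could look like this; the device name, parent and limit values are only examples.)

$ lxc config device add ct1 eth0 nic nictype=bridged parent=lxdbr0 limits.ingress=25Mbit limits.egress=15Mbit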

I’m not understanding what the issue is that you’re having, though. Do you get an error compared to pre-3.13, or is it that the limits do not apply?

Thanks
Tom

Ah, thanks for clarifying.

It’s that the limits do not apply. I know for sure they used to because we tested it thoroughly when we added them originally, but realized just this week they were no longer being used. I saw some changes to network limits mentioned in the 3.13 changelog which is why I brought it up here.

It sounds like our best bet would be to either keep applying more general limits on the default profile or remove the eth0 device from the default profile and add it on a per container basis.

I wonder if perhaps the default profile previously didn’t have the eth0 device added to it. If that’s the case, the code above would have added it on a per-container basis, and it sounds like the limit would then have been included.

Thanks for your help! I’ll do a bit more experimentation now that I know limits can only be applied when the device is originally attached.

From what I understand, @saulcosta has the eth0 device in his default profile with some initial limits; those limits then get overridden on a per-container basis by adding an eth0 device local to the container.

In this case I would certainly expect the limit defined on the eth0 device on the container to be the effective traffic limit.

There shouldn’t be any need for @saulcosta to alter his default profile here; adding an eth0 device directly to the container should take precedence and immediately change the limit on the running container.

Am I missing something?

Yes, that should be fine; there is a test for that scenario.

@stgraber just checked doing a non-hotplug variant too and it appears to work fine. tc shows the limits applied.

@stgraber that was my expectation as well, and how it previously seemed to behave. Here are some more details on the configuration we currently have.

The default profile (applicable part):

devices:
  eth0:
    limits.egress: 20Mbit
    limits.ingress: 50Mbit
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk

And here are the starting devices for the container, which does have the default profile:

root:
  limits.read: 10MB
  limits.write: 10MB
  path: /
  pool: default
  size: 354MB
  type: disk

When running a speedtest in that container, I get:

workspace $ speedtest
Retrieving speedtest.net configuration...
Testing from Google Cloud (34.66.108.68)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by Kansas Research and Education Network (Wichita, KS) [43.14 km]: 34.618 ms
Testing download speed................................................................................
Download: 45.26 Mbit/s
Testing upload speed................................................................................................
Upload: 19.99 Mbit/s

This makes sense, given the limits on the default profile.

I then can apply the network limits using the approach in the Ruby code above, which does apply this config to the container:

eth0:
  limits.egress: 15Mbit
  limits.ingress: 25Mbit
  nictype: bridged
  parent: lxdbr0
  type: nic
root:
  limits.read: 10MB
  limits.write: 10MB
  path: /
  pool: default
  size: 361MB
  type: disk

However, the speedtest still uses the limits from the default profile:

workspace $ speedtest
Retrieving speedtest.net configuration...
Testing from Google Cloud (34.66.108.68)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by Cox - Wichita (Wichita, KS) [43.14 km]: 26.761 ms
Testing download speed................................................................................
Download: 43.89 Mbit/s
Testing upload speed................................................................................................
Upload: 20.06 Mbit/s

speedtest is a utility that can be installed with sudo pip install speedtest-cli.

Can you apply a container level limit and then check the settings are applied:

 sudo tc class show dev $(lxc config get test1 volatile.eth0.host_name)

I couldn’t get that command to work with my container name, but here’s the only output from the config for that container that includes volatile.eth0:

volatile.eth0.hwaddr: 00:16:3e:0c:92:f9
volatile.eth0.name: eth0

Can you try running ip link on the host and looking for the parent interfaces whose names start with "veth"?

If you run sudo tc class show dev X where X is each parent veth device name before the “@” sign, then you should be able to see which interfaces have what limits applied.
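
(Hypothetical example of what that looks like; the veth name below is made up and will differ on your host.)

$ ip link
...
42: veth1a2b3c@if41: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...
$ sudo tc class show dev veth1a2b3c
class htb 1:10 root prio rate 25Mbit ceil 25Mbit burst 1600b cburst 1600b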

We need to ascertain whether the issue is with limits settings not being applied to the OS or whether the OS isn’t restricting them.

For testing purposes, if you set a host_name property on one of your containers then you’ll be better able to identify its peer on the host.

lxc config device set ct1 eth0 host_name myct1

sudo tc class show dev myct1

Looks like the limit is indeed being applied. sudo tc class show dev myct1 returns:

class htb 1:10 root prio rate 25Mbit ceil 25Mbit burst 1600b cburst 1600b

OK good, I’m not going mad then. 🙂

I just tested on my Ubuntu 18.04 machine with a bridged network and a 2Mbit ingress limit and speedtest-cli shows:

sudo tc class show dev $(lxc config get test1 volatile.eth0.host_name)
class htb 1:10 root prio rate 2Mbit ceil 2Mbit burst 1600b cburst 1600b 
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by CloudConnX (Eastbourne) [72.01 km]: 15.249 ms
Testing download speed................................................................................
Download: 1.58 Mbit/s
Testing upload speed................................................................................................
Upload: 18.45 Mbit/s

So it does work as far as I can tell.

What OS & kernel are you using btw?