LXD 3.13 has been released

Introduction

The LXD team is very excited to announce the release of LXD 3.13!

This is another very exciting LXD release, packed with useful features and a lot of bugfixes and performance improvements!

The latest addition to the LXD team, @tomp, has been busy improving the LXD networking experience, with quite a few new features and bugfixes already making it into this release.

We’ve also gotten all the plumbing needed for system call interception done and in place in this release, currently handling mknod on supported systems.

Cluster users will enjoy this release too, thanks to scaling improvements that reduce the load on the leader and improve container copies and migration, especially on CEPH clusters.

Enterprise users will like the addition of Role Based Access Control through the external Canonical RBAC service, making it possible to control permissions to individual projects on your LXD servers and assign roles to your users and groups.

And we’ve even managed to get quotas working for the dir storage backend at last, thanks to the addition of filesystem project quotas in recent kernels.

Enjoy!

New features

Cluster: Improved heartbeat interval

In a LXD cluster, the current leader periodically sends a heartbeat to all other cluster members. The main purpose of this is to detect offline cluster members, marking them as offline in the database so that queries no longer block on them. A secondary use for those heartbeats is to refresh the list of database nodes.

Previously, this was done every 4s with all cluster members being contacted at the same time, resulting in spikes in CPU and network traffic, especially on the current cluster leader.

LXD 3.13 changes that by bumping the interval to 10s and by adding randomization to the timing of the heartbeats so that not all cluster members are contacted at the same time. Extra logic was also added to detect cluster members that get added during a heartbeat run.
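
The effect of the randomization can be pictured with a small shell sketch. This is not LXD's actual code (LXD is written in Go); the function name heartbeat_offsets and the member names are made up for illustration. It assigns each cluster member a random offset within one interval, so that members are contacted at different moments rather than all at once.

```shell
# Hypothetical sketch, not LXD's implementation: spread heartbeats over one
# interval instead of contacting every member at the same instant.
# usage: heartbeat_offsets INTERVAL_MS MEMBER...
heartbeat_offsets() {
    interval_ms=$1
    shift
    # A single awk process draws one offset in [0, interval_ms) per member
    # and prints "member offset_ms" lines.
    printf '%s\n' "$@" | awk -v max="$interval_ms" '
        BEGIN { srand() }
        NF { printf "%s %d\n", $1, int(rand() * max) }'
}

# Example: heartbeat_offsets 10000 node1 node2 node3
```

With a 10s interval, each member ends up with its own delay somewhere between 0 and 9999 ms, which is what flattens the CPU and network spikes described above.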

Cluster: Internal container copy

LXD 3.13 now properly implements one-step container copies, similar to how you would normally copy a container on a standalone LXD instance. Prior to this, the client had to know whether to perform a copy (if staying on the same cluster member) or a migration (if going to another cluster member). This is now all handled internally.

A side benefit of this fix is that all CEPH copies are now near instantaneous on clusters as those do not require any migration at all.

Initial syscall interception support

LXD 3.13, when combined with a 5.0 or higher kernel as well as the very latest libseccomp and liblxc, can now intercept and mediate system calls in userspace.

For this first pass, we have focused on mknod, implementing a basic allow list of devices which can now be created by unprivileged containers.

It will take a little while before this feature can be commonly used as we will need an upstream release of both libseccomp and liblxc and are waiting for further improvements to the feature in the kernel too.

We will be building upon this capability to allow specific filesystems to be mounted inside unprivileged containers in the future as well as allow things like kernel module loading and more (all will require opt-in from the administrator).

Role Based Access Control (RBAC)

Users of the Canonical RBAC service can now integrate LXD with it.

LXD will register all its projects with RBAC, allowing administrators to assign roles to users/groups for specific projects or for the entire LXD instance.

Currently this includes the following permissions:

  • Full administrative access to LXD
  • Management of containers (creation, deletion, re-configuration, …)
  • Operation of containers (start/stop/restart, exec, console, …)
  • Management of images (creation, deletion, aliases, …)
  • Management of profiles (creation, deletion, re-configuration, …)
  • Management of the project itself (re-configuration)
  • Read-only access (view everything tied to a project)

This gets us one step closer to being able to run a shared LXD cluster with unprivileged users being able to run containers on it without concerns of them escalating their privileges.

IPVLAN support

LXD can now make use of the recent implementation of ipvlan in LXC.
When running a suitably recent version of LXC, IPVLAN can now be configured in LXD through a nic device:

  • Setting the nictype property to ipvlan
  • Setting the parent property to the expected outgoing device
  • For IPv4, setting ipv4.address to the desired address
  • For IPv6, setting ipv6.address to the desired address

Here is an example of it in action:

stgraber@castiana:~$ lxc init ubuntu:18.04 ipvlan
Creating ipvlan
stgraber@castiana:~$ lxc config device add ipvlan eth0 nic nictype=ipvlan parent=wlan0 ipv4.address=172.17.0.100 ipv6.address=2001:470:b0f8:1000:1::100
Device eth0 added to ipvlan
stgraber@castiana:~$ lxc start ipvlan
stgraber@castiana:~$ lxc exec ipvlan bash
root@ipvlan:~# ifconfig 
eth0: flags=4291<UP,BROADCAST,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet 172.17.0.100  netmask 255.255.255.255  broadcast 255.255.255.255
        inet6 2001:470:b0f8:1000:1::100  prefixlen 128  scopeid 0x0<global>
        inet6 fe80::28:f800:12b:bdf8  prefixlen 64  scopeid 0x20<link>
        ether 00:28:f8:2b:bd:f8  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 5 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@ipvlan:~# ip -4 route show
default dev eth0 

root@ipvlan:~# ip -6 route show
2001:470:b0f8:1000:1::100 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
default dev eth0 metric 1024 pref medium

root@ipvlan:~# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=57 time=14.4 ms
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 14.476/14.476/14.476/0.000 ms

root@ipvlan:~# ping6 -n 2607:f8b0:400b:800::2004
PING 2607:f8b0:400b:800::2004(2607:f8b0:400b:800::2004) 56 data bytes
64 bytes from 2607:f8b0:400b:800::2004: icmp_seq=1 ttl=57 time=21.2 ms
--- 2607:f8b0:400b:800::2004 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 21.245/21.245/21.245/0.000 ms
root@ipvlan:~# 

Quota support on dir storage backend

Support for the project quota feature of recent Linux kernels has been added.

When the backing filesystem for a dir type storage pool is suitably configured, container quotas can now be set as with other storage backends and disk usage is also properly reported.

stgraber@castiana:~$ sudo truncate -s 10G /tmp/ext4.img
stgraber@castiana:~$ sudo mkfs.ext4 /tmp/ext4.img 
mke2fs 1.44.6 (5-Mar-2019)
Discarding device blocks: done                            
Creating filesystem with 2621440 4k blocks and 655360 inodes
Filesystem UUID: d8ab56d9-1e84-40ee-921a-c68c06ad6625
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done     
stgraber@castiana:~$ sudo tune2fs -O project -Q prjquota /tmp/ext4.img 
tune2fs 1.44.6 (5-Mar-2019)

stgraber@castiana:~$ sudo mount -o prjquota /tmp/ext4.img /mnt/
stgraber@castiana:~$ sudo rmdir /mnt/lost+found/
stgraber@castiana:~$ lxc storage create mnt dir source=/mnt
Storage pool mnt created

stgraber@castiana:~$ lxc launch ubuntu:18.04 c1 -s mnt
Creating c1
Starting c1
stgraber@castiana:~$ lxc exec c1 -- df -h
Filesystem                                           Size  Used Avail Use% Mounted on
/var/lib/lxd/storage-pools/mnt/containers/c1/rootfs  9.8G  742M  8.6G   8% /
none                                                 492K     0  492K   0% /dev
udev                                                 7.7G     0  7.7G   0% /dev/tty
tmpfs                                                100K     0  100K   0% /dev/lxd
tmpfs                                                100K     0  100K   0% /dev/.lxd-mounts
tmpfs                                                7.8G     0  7.8G   0% /dev/shm
tmpfs                                                7.8G  152K  7.8G   1% /run
tmpfs                                                5.0M     0  5.0M   0% /run/lock
tmpfs                                                7.8G     0  7.8G   0% /sys/fs/cgroup

stgraber@castiana:~$ lxc config device set c1 root size 1GB
stgraber@castiana:~$ lxc exec c1 -- df -h
Filesystem                                           Size  Used Avail Use% Mounted on
/var/lib/lxd/storage-pools/mnt/containers/c1/rootfs  954M  706M  249M  74% /
none                                                 492K     0  492K   0% /dev
udev                                                 7.7G     0  7.7G   0% /dev/tty
tmpfs                                                100K     0  100K   0% /dev/lxd
tmpfs                                                100K     0  100K   0% /dev/.lxd-mounts
tmpfs                                                7.8G     0  7.8G   0% /dev/shm
tmpfs                                                7.8G  152K  7.8G   1% /run
tmpfs                                                5.0M     0  5.0M   0% /run/lock
tmpfs                                                7.8G     0  7.8G   0% /sys/fs/cgroup

stgraber@castiana:~$ lxc info c1
Name: c1
Location: none
Remote: unix://
Architecture: x86_64
Created: 2019/05/09 16:09 UTC
Status: Running
Type: persistent
Profiles: default
Pid: 10096
Ips:
  eth0:	inet	10.166.11.38	vethKM0DFY
  eth0:	inet6	2001:470:b368:4242:216:3eff:fe4b:2c3	vethKM0DFY
  eth0:	inet6	fe80::216:3eff:fe4b:2c3	vethKM0DFY
  lo:	inet	127.0.0.1
  lo:	inet6	::1
Resources:
  Processes: 24
  Disk usage:
    root: 739.77MB
  CPU usage:
    CPU usage (in seconds): 7
  Memory usage:
    Memory (current): 104.91MB
    Memory (peak): 229.67MB
  Network usage:
    lo:
      Bytes received: 1.23kB
      Bytes sent: 1.23kB
      Packets received: 12
      Packets sent: 12
    eth0:
      Bytes received: 480.35kB
      Bytes sent: 27.21kB
      Packets received: 332
      Packets sent: 277

Routes on container NIC devices

New ipv4.routes and ipv6.routes options on the nic devices make it possible to tie a particular route to a specific container, making it follow the container as it’s moved between hosts.

This will usually be a better option than using the similarly named key on the network itself.
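
As a rough sketch of how such a route might be attached, the helper below wraps lxc config device add, using the same key=value syntax as the IPVLAN example earlier in this post. The helper name add_routed_nic, the bridged NIC type, the parent lxdbr0 and the example subnet are all assumptions for illustration; adjust them to your setup.

```shell
# Hypothetical helper: add a container-local bridged NIC carrying a route.
# usage: add_routed_nic CONTAINER DEVICE PARENT SUBNET
add_routed_nic() {
    container=$1 device=$2 parent=$3 subnet=$4
    # Pick the IPv4 or IPv6 key depending on whether the subnet contains a colon.
    case $subnet in
        *:*) key=ipv6.routes ;;
        *)   key=ipv4.routes ;;
    esac
    lxc config device add "$container" "$device" nic \
        nictype=bridged parent="$parent" "$key=$subnet"
}

# Example (documentation-range subnet, not a real network):
#   add_routed_nic c1 eth0 lxdbr0 198.51.100.0/28
```

Because the route lives on the container's own device rather than on the network, it stays with the container's configuration when the container is moved.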

Configurable NAT source address

New ipv4.nat.address and ipv6.nat.address properties on LXD networks now make it possible to override the outgoing IP address for a particular bridge.
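
A minimal sketch of setting these properties with lxc network set follows. The helper name set_nat_source and the example addresses are hypothetical; the chosen address presumably also has to be one the host can actually send traffic from.

```shell
# Hypothetical helper: override the outgoing NAT address of a LXD bridge.
# usage: set_nat_source BRIDGE IPV4_ADDRESS [IPV6_ADDRESS]
set_nat_source() {
    bridge=$1
    lxc network set "$bridge" ipv4.nat.address "$2"
    # The IPv6 override is optional.
    if [ -n "${3:-}" ]; then
        lxc network set "$bridge" ipv6.nat.address "$3"
    fi
}

# Example (documentation-range addresses):
#   set_nat_source lxdbr0 192.0.2.10 2001:db8::10
```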

LXC features exported in API

Similar to what was done in the previous release with kernel features, specific LXC features which LXD can use when present are now exported by the LXD API so that clients can check which advanced features to expect on the target.

  lxc_features:
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    seccomp_notify: "true"

Bugs fixed

  • client: Consider volumeOnly option when migrating
  • client: Copy volume config and description
  • client: Don’t crash on missing stdin
  • client: Fix copy from snapshot
  • client: Fix copying between two unix sockets
  • doc: Adds missing packages to install guide
  • doc: Correct host_name property
  • doc: Update storage documentation
  • i18n: Update translations from weblate
  • lxc/copy: Don’t strip volatile keys on refresh
  • lxc/utils: Updates progress to stop outputting if msg is longer than window
  • lxd/api: Rename alias* commands to imageAlias*
  • lxd/api: Rename apiProject* to project*
  • lxd/api: Rename certificateFingerprint* to certficate*
  • lxd/api: Rename operation functions for consistency
  • lxd/api: Rename serverResources to api10Resources
  • lxd/api: Rename snapshotHandler to containerSnapshotHandler
  • lxd/api: Replace Command with APIEndpoint
  • lxd/api: Sort API commands list
  • lxd/candid: Cleanup config handling
  • lxd/certificates: Make certificate add more robust
  • lxd/certificates: Port to APIEndpoint
  • lxd/cluster: Avoid panic in Gateway
  • lxd/cluster: Fix race condition during join
  • lxd/cluster: Port to APIEndpoint
  • lxd/cluster: Use current time for hearbeat
  • lxd/cluster: Workaround new raft logging
  • lxd/containers: Avoid costly storage calls during snapshot
  • lxd/containers: Change disable_ipv6=1 to accept_ra=0 on host side interface
  • lxd/containers: Don’t fail on old libseccomp
  • lxd/containers: Don’t needlessly mount snapshots
  • lxd/containers: Early check for running container refresh
  • lxd/containers: Fix bad operation type
  • lxd/containers: Fix profile snapshot settings
  • lxd/containers: Moves network limits to network up hook
  • lxd/containers: Only run network up hook for nics that need it
  • lxd/containers: Optimize snapshot retrieval
  • lxd/containers: Port to APIEndpoint
  • lxd/containers: Remove unused arg from network limits function
  • lxd/containers: Speed up simple snapshot list
  • lxd/daemon: Port to APIEndpoint
  • lxd: Don’t allow remote access to internal API
  • lxd: Fix volume migration with snapshots
  • lxd: Have Authenticate return the protocol
  • lxd: More reliably grab interface host name
  • lxd: Port from HasApiExtension to LXCFeatures
  • lxd: Rename parseAddr to proxyParseAddr
  • lxd: Use idmap.Equals
  • lxd/db: Fix substr handling for containers
  • lxd/db: Parent filter for ContainerList
  • lxd/db/profiles: Fix cross-project updates
  • lxd/db: Properly handle unsetting keys
  • lxd/event: Port to APIEndpoint
  • lxd/images: Fix project handling on copy
  • lxd/images: Fix simplestreams cache expiry
  • lxd/images: Port to APIEndpoint
  • lxd/images: Properly handle invalid protocols
  • lxd/images: Replicate images to the right project
  • lxd/internal: Port to APIEndpoint
  • lxd/migration: Fix feature negotiation
  • lxd/network: Filter leases by project
  • lxd/network: Fix DNS records for projects
  • lxd/network: Port to APIEndpoint
  • lxd/operation: Port to APIEndpoint
  • lxd/patches: Fix LVM VG name
  • lxd/profiles: Optimize container updates
  • lxd/profiles: Port to APIEndpoint
  • lxd/projects: Port to APIEndpoint
  • lxd/proxy: Correctly handle unix: path rewriting with empty bind=
  • lxd/proxy: Don’t wrap string literal
  • lxd/proxy: Fix goroutine leak
  • lxd/proxy: Handle mnts for abstract unix sockets
  • lxd/proxy: Make helpers static
  • lxd/proxy: Make logfile close on exec
  • lxd/proxy: Only attach to mntns for unix sockets
  • lxd/proxy: Retry epoll on EINTR
  • lxd/proxy: Use standard macros on exit
  • lxd/proxy: Validate the addresses
  • lxd/resource: Port to APIEndpoint
  • lxd/storage: Don’t hardcode default project
  • lxd/storage: Fix error message on differing maps
  • lxd/storage: Handle XFS with leftover journal entries
  • lxd/storage: Port to APIEndpoint
  • lxd/storage/btrfs: Don’t make ro snapshots when unpriv
  • lxd/storage/ceph: Don’t mix stderr with json
  • lxd/storage/ceph: Fix snapshot of running containers
  • lxd/storage/ceph: Fix snapshot of running xfs/btrfs
  • lxd/storage/ceph: Fix UUID re-generation
  • lxd/storage/ceph: Only rewrite UUID once
  • lxd/sys: Cleanup State struct
  • scripts/bash: Add bash completion for profile/container device get, set, unset
  • shared: Add StringMapHasStringKey helper function
  • shared: Fix $SNAP handling under new snappy
  • shared: Fix Windows build
  • shared/idmap: Add comparison function
  • shared/netutils: Adapt to kernel changes
  • shared/netutils: Add AbstractUnixReceiveFdData()
  • shared/netutils: Export peer link id in getifaddrs
  • shared/netutils: Handle SCM_CREDENTIALS when receiving fds
  • shared/netutils: Move network cgo to shared/netutils
  • shared/netutils: Move send/recv fd functions
  • shared/network: Fix reporting of down interfaces
  • shared/network: Get HostName field when possible
  • shared/osarch: Add i586 to arch aliases
  • tests: Extend migration tests
  • tests: Handle built-in shiftfs
  • tests: Updates config tests to use host_name for nic tests

Try it for yourself

This new LXD release is already available for you to try on our demo service.

Downloads

The release tarballs can be found on our download page.

Great news!

I had a look at the network_ipvlan LXC feature. In LXD 3.13 (channel: candidate),

$ lxc info 
...
  kernel_version: 4.15.0-48-generic
  lxc_features:
    mount_injection_file: "true"
    network_gateway_device_route: "false"
    network_ipvlan: "false"
    network_l2proxy: "false"
    seccomp_notify: "false"
...
$ 

The feature is not enabled in my case. Is it user-configurable?

It appears it is not user-configurable. As mentioned in the announcement, it relates to the version of liblxc that is bundled in the snap package.

The master branch of liblxc knows about network_ipvlan:

But what version of liblxc does the 3.13 LXD snap package have?

$ snap run --shell lxd
bash-4.3$ ls -l /snap/lxd/current/lib/liblxc*
lrwxrwxrwx 1 0 0      15 May  9 17:12 /snap/lxd/current/lib/liblxc.so.1 -> liblxc.so.1.5.0
-rwxr-xr-x 1 0 0 1068656 May  9 17:29 /snap/lxd/current/lib/liblxc.so.1.5.0
-rwxr-xr-x 1 0 0   80688 May  9 17:29 /snap/lxd/current/lib/liblxcfs.so
bash-4.3$ 

The file name suggests liblxc version 1.5.0, and there is no network_ipvlan string in liblxc.so.1.5.0.
How does this library relate to the repository https://github.com/lxc/lxc ? I could not find a relevant 1.5.0 tag in the source code repository.

I have used the snap package from the candidate channel, which bundles the lxc version tagged lxc-3.1.0. It is not recent enough to have the network_ipvlan goodness.

But what snap package channel is recent enough to have the git version of liblxc? Is it edge?

It is edge. Does edge have a recent enough LXD that includes IPVLAN support?

$ snap info lxd
...
channels:
  stable:        3.12        2019-04-16 (10601) 56MB -
  candidate:     3.13        2019-05-09 (10732) 56MB -
  beta:          ↑                                   
  edge:          git-566ee20 2019-05-09 (10738) 56MB -
...

There is a commit, 566ee20. Is that recent enough to have IPVLAN support in LXD? Here are the commits, https://github.com/lxc/lxd/commits/master and discourse does not show a good snapshot of the page.

So, we could switch to the edge snap channel of LXD and experience the full goodness of the new features. But that would be utterly inappropriate for the stability of the system: would things break in LXD if you switch back and forth between the stable and edge channels? The LXD version is almost the same, but liblxc differs quite a bit, by almost six months of changes to the code.

So, what do we do? You know what we do, but let’s first capture the error message you get when you try IPVLAN on a liblxc that is not recent enough. They say it is good for SEO.

$ lxc launch ubuntu:18.04 mycontainer --profile default --profile ipvlan
Creating mycontainer
Error: Failed container creation: Create container: Create LXC container: Initialize LXC: LXC is missing one or more API extensions: network_ipvlan, network_l2proxy, network_gateway_device_route

Let’s switch the snap package of LXD to the edge channel.

$ snap switch --channel edge lxd
"lxd" switched to the "edge" channel
$ snap refresh
lxd (edge) git-566ee20 from Canonical✓ refreshed

Will it work now?

$ lxc launch ubuntu:18.04 mycontainer --profile default --profile ipvlan
Creating mycontainer
Starting mycontainer
$ lxc list mycontainer
+-------------+---------+------+------+------------+-----------+
|    NAME     |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
+-------------+---------+------+------+------------+-----------+
| mycontainer | RUNNING |      |      | PERSISTENT | 0         |
+-------------+---------+------+------+------------+-----------+

No IP address from the LAN. What went wrong? Isn’t IPVLAN supposed to let the container get the IP address automatically from the LAN? Probably not, considering that it is Layer 3 (unlike macvlan, which is Layer 2). Scratch that then, we start over again.

To cut this short, you need to tell LXD the IP address for the container (ipv4.address=...). Then, LXD will be able to set up what is needed. You also need to give the container its DNS server settings, because without DNS, cloud-init takes a long time to complete the bootup sequence (and create the ubuntu account).

In a nutshell,

  1. You need to get LXD to set up the IP address for the container, because that’s the way IPVLAN works.
  2. You do not get a DNS server autoconfigured, so you need to configure it in some way, such as with cloud-init from a LXD profile.
  3. You do not need to (cannot?) add a default route. LXD/ipvlan does that for you. See below what the default route looks like.
ubuntu@mycontainer:~$ route 
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         0.0.0.0         0.0.0.0         U     0      0        0 eth0
ubuntu@mycontainer:~$ ping -c 1 www.google.com
PING www.google.com (216.58.198.4) 56(84) bytes of data.
64 bytes from mil04s03-in-f4.1e100.net (216.58.198.4): icmp_seq=1 ttl=54 time=76.2 ms

--- www.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 76.275/76.275/76.275/0.000 ms
ubuntu@mycontainer:~$ 

@simos correct, the LXC version in the stable snap is 3.1 (as lxc info should show as driver_version), you need current master to get IPVLAN which the LXD edge snap has.

And indeed, IPVLAN doesn’t have a default gateway in the normal sense of the term, so that’s configured for you. DNS is up to you to configure, though. You should be able to use network-config (netplan) for that, possibly by adding your DNS config to the loopback device so that netplan doesn’t attempt to mess with the ipvlan device.

I tried with network-config on eth0 to set the nameserver and it worked well.

As is, I use a LXD profile per ipvlan container, because each profile needs to specify a unique LAN IP address.

I noticed that the ipvlan container cannot communicate with the host, which follows the case with macvlan.

Good to hear simos.

You can also use a shared profile and add individual ipvlan NICs to a container.

Yes, ipvlan, like macvlan, stops containers and the host from communicating. I believe you can add an ipvlan interface to the host to overcome this, though.

Have the changes to the network limits affected how limits are applied to individual containers? It appears our limits have suddenly stopped working, so we’re trying to determine the root cause. Currently we apply them by overwriting the devices config for the container.

Can you give an example of your config, and a bit more info on what you think isn’t working or has changed?

The network limits shouldn’t have changed. Some tests were added as part of 3.13 to ensure behaviour remained consistent:

Sure thing!

We currently update the devices value of the LXD container. The Ruby code looks something like this (we set the network and disk limits here):

# config = get the LXD container's config using the LXD API
config.devices = {
  eth0: {
    nictype: :bridged,
    parent: :lxdbr0,
    type: :nic,
    'limits.ingress' => '25Mbit',
    'limits.egress' => '5Mbit'
  },
  root: {
    path: '/',
    pool: 'default',
    size: '2048MB',
    type: :disk,
    'limits.read' => '10MB',
    'limits.write' => '10MB'
  }
}
# save the LXD container's config using the LXD API

This allows us to set dynamic network and disk limits based on the instance size we’ve selected. Previously both were working, but now only the disk one is.

Now, we’re only able to set it on the profile level, and only if the profile has that device, which is just our default profile. We do it by running this on the host machine:

lxc profile device set default eth0 limits.ingress 50000000

Note that the 50000000 could also be 50Mbit, but for some reason, this also didn’t appear to be working. We haven’t gone back and tried to reproduce that yet, however.

The instance launches with the eth0 interface already attached thanks to the default profile, so ideally we can continue to alter the limits for that device once it’s already been attached.

You’re correct that you can only set limits on the device at the profile level when the device is being added to the container as part of the profile. If you add a standalone device to a container then you can specify limits on a per-container basis.

I’m not understanding what the issue is that you’re having though, do you get an error compared to pre-3.13? Or is it that the limits do not apply?

Thanks
Tom

Ah, thanks for clarifying.

It’s that the limits do not apply. I know for sure they used to because we tested it thoroughly when we added them originally, but realized just this week they were no longer being used. I saw some changes to network limits mentioned in the 3.13 changelog which is why I brought it up here.

It sounds like our best bet would be to either keep applying more general limits on the default profile or remove the eth0 device from the default profile and add it on a per container basis.

I wonder if perhaps previously the default profile didn’t have the eth0 device added to it. If that’s the case, the code above would have added it on a per container basis, where it sounds like it would have included the limit.

Thanks for your help! I’ll do a bit more experimentation now that I know limits can only be applied when the device is originally attached.

From what I understand, @saulcosta has the eth0 device in his default profile with some initial limits; those limits then get overridden on a per-container basis by adding an eth0 device local to the container.

In this case I would certainly expect the limit defined on the eth0 device on the container to be the effective traffic limit.

There shouldn’t be any need for @saulcosta to alter his default profile here. Adding an eth0 device directly to the container should take precedence and immediately change the limit on the running container.

Am I missing something?

Yes that should be fine, there is a test for that scenario here:

@stgraber just checked doing a non-hotplug variant too and it appears to work fine. tc shows the limits applied.

@stgraber that was my expectation as well and how it previously seemed to be performing. Here’re some more details on the configuration we currently have.

The default profile (applicable part):

devices:                                                   
  eth0:                                                    
    limits.egress: 20Mbit                                  
    limits.ingress: 50Mbit                                 
    nictype: bridged                                       
    parent: lxdbr0                                         
    type: nic                                              
  root:                                                    
    path: /                                                
    pool: default                                          
    type: disk

And here are the starting devices for the container, which does have the default profile:

root:
  limits.read: 10MB
  limits.write: 10MB
  path: /
  pool: default
  size: 354MB
  type: disk

When running a speedtest in that container, I get:

workspace $ speedtest
Retrieving speedtest.net configuration...
Testing from Google Cloud (34.66.108.68)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by Kansas Research and Education Network (Wichita, KS) [43.14 km]: 34.618 ms
Testing download speed................................................................................
Download: 45.26 Mbit/s
Testing upload speed................................................................................................
Upload: 19.99 Mbit/s

This makes sense, given the limits on the default profile.

I then can apply the network limits using the approach in the Ruby code above, which does apply this config to the container:

eth0:
  limits.egress: 15Mbit
  limits.ingress: 25Mbit
  nictype: bridged
  parent: lxdbr0
  type: nic
root:
  limits.read: 10MB
  limits.write: 10MB
  path: /
  pool: default
  size: 361MB
  type: disk

However, the speedtest still uses the limits from the default profile:

workspace $ speedtest
Retrieving speedtest.net configuration...
Testing from Google Cloud (34.66.108.68)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by Cox - Wichita (Wichita, KS) [43.14 km]: 26.761 ms
Testing download speed................................................................................
Download: 43.89 Mbit/s
Testing upload speed................................................................................................
Upload: 20.06 Mbit/s

speedtest is a utility that can be installed with sudo pip install speedtest-cli.

Can you apply a container level limit and then check the settings are applied:

 sudo tc class show dev $(lxc config get test1 volatile.eth0.host_name)

I couldn’t get that command to work with my container name, but here’s the only output from the config for that container that includes volatile.eth0:

volatile.eth0.hwaddr: 00:16:3e:0c:92:f9
volatile.eth0.name: eth0

Can you try running ip link on the host and look for the parent interfaces starting with “veth”?

If you run sudo tc class show dev X where X is each parent veth device name before the “@” sign, then you should be able to see which interfaces have what limits applied.
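
The two steps above can be combined in a short loop. This is a sketch under the assumption that ip link prints interface lines in its usual "N: name@ifM: <FLAGS> ..." form; the function name veth_names is made up for illustration.

```shell
# Hypothetical helper: print host-side veth interface names found in
# `ip link` output on stdin, with the "@ifN" peer suffix stripped.
veth_names() {
    awk -F': ' '$2 ~ /^veth/ { sub(/@.*/, "", $2); print $2 }'
}

# Example, run on the host:
#   ip link | veth_names | while read -r dev; do
#       echo "== $dev"; sudo tc class show dev "$dev"
#   done
```

An interface that prints no class line most likely has no limit applied to it.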

We need to ascertain whether the issue is with limits settings not being applied to the OS or whether the OS isn’t restricting them.

For testing purposes, if you set a host_name property on one of your containers, you’ll be better able to identify its peer on the host.

lxc config device set ct1 eth0 host_name myct1

sudo tc class show dev myct1

Looks like the limit is indeed being applied. sudo tc class show dev myct1 returns:

class htb 1:10 root prio rate 25Mbit ceil 25Mbit burst 1600b cburst 1600b