I’m new to LXD and have been playing around with it to see what it can do and how it works. Lately I’ve been experimenting with ephemeral containers, and I’ve noticed that continually destroying and creating a lot of containers gradually increases LXD memory usage over time. Here’s exactly what I’m doing:
On a fresh boot I run the following on the host to spin up 25 Ubuntu 16.04 containers: for i in {1..25}; do echo -n "btest$i: "; lxc launch -p default -p bridgeprofile ubuntu:16.04 btest$i; done
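As a quick sanity check that all 25 actually came up (assuming nothing else on the host is named btest*):

lxc list -c n | grep -c btest   # should print 25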
I have a cron job that runs once every minute that randomly stops one of the containers and starts a new one with the same name (crontab entry below). Here’s the script that does that:
#!/bin/bash
# Pick one of the 25 containers at random
num=$((1 + RANDOM % 25))
echo -n "Container btest$num ip address: "
lxc info btest$num | grep "eth0" | grep -w "inet" | cut -f 3
echo "Stopping btest$num"
lxc stop btest$num
# Relaunch under the same name as an ephemeral container
lxc launch --ephemeral -p default -p bridgeprofile ubuntu:16.04 btest$num
sleep 10 # Wait to get an IP address
echo -n "Container btest$num new ip address: "
lxc info btest$num | grep "eth0" | grep -w "inet" | cut -f 3
echo
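For completeness, the crontab entry is nothing fancy; the script path is just where I happened to save it:

* * * * * /home/jeff/restart-random.sh >> /home/jeff/restart-random.log 2>&1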
The containers themselves aren’t doing anything other than running the OS at this point. I’ve noticed that when I first create all the containers, system memory usage is ~1GB. After running for 24+ hours, memory usage on the system has doubled or tripled, hitting 2-3GB used.
My expectation was that memory usage would stay relatively consistent rather than grow like this, since I have the same number of containers running the whole time. Am I missing something that’s eating this memory, or is there a possible memory leak somewhere? Like I said, I’m new to this, so I’m assuming I’m missing something basic here.
It looks to me like it’s the LXD process, not disk caching. I started out just using free to check it, and now I’m using htop to watch live. I also just noticed I have a bunch of what appear to be stuck lxc exec processes left over from me checking container uptimes periodically. I’m going to try killing these and see if that fixes the issue.
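For anyone following along, I found and killed them with something like:

pgrep -af "lxc exec"   # list the stuck processes
pkill -f "lxc exec"    # kill them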
I was able to kill all the lxc exec processes, but memory usage is still the same. Here’s the output of sudo ps aux --sort rss: https://pastebin.com/raw/ppvf1ZLu
Edit: Also, here’s what free is showing:
jeff@ubuntu-lxd:~$ free -m
              total        used        free      shared  buff/cache   available
Mem:           7976        3474         257         171        4244        3969
Swap:          4095         257        3838
At first I was only tracking the “used” column, so I’m not entirely sure what the memory growth over time has been for any particular process. This evening I’ll try shutting everything down, rebooting, and kludging together some tracking scripts to try to replicate this and get more info.
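For now the rough idea is something like this (untested sketch, paths are just mine):

#!/bin/bash
# Log a timestamp, total RSS of the lxd daemon (KiB), and overall
# "used" from free (MiB) once a minute, so growth can be graphed later.
LOG=/home/jeff/memlog.csv
while true; do
    ts=$(date +%s)
    lxd_rss=$(ps -C lxd -o rss= | awk '{s+=$1} END {print s}')
    used=$(free -m | awk '/^Mem:/ {print $3}')
    echo "$ts,$lxd_rss,$used" >> "$LOG"
    sleep 60
done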
Your ps aux output returns 1.7 GB of RSS in total. Quite a lot, but not 2-3 GB. You should probably track that figure rather than free, which is a notoriously unreliable tool. And LXD itself and the containers are not taking a lot.
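(If it’s not obvious where that figure comes from: just sum the RSS column, e.g.

ps aux --sort -rss | awk 'NR>1 {s+=$6} END {printf "%.1f GB\n", s/1024/1024}'

keeping in mind that summing RSS double-counts pages shared between processes, so it overstates real usage.)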
A nitpick: when you say the containers are doing nothing, don’t confuse a complex OS like Ubuntu 16.04 with something like Alpine. When you start a fresh Ubuntu 16.04, it does a lot of things (cloud-init, automatic updates).
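Easy to see for yourself in the first minute or two after a launch, with something like:

lxc exec btest1 -- ps aux                                 # cloud-init, apt, etc. busy at work
lxc exec btest1 -- systemctl list-units --state=running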
To be candid, I have never looked at the global memory stats in htop. Once I realized how much effort goes on at the kernel level to optimize memory usage (all sorts of deduplication) at the expense of naive accounting, I lost interest in global counters. The main characteristic of htop’s global counters is that they are unreadable by default with their nasty coloring; I prefer free. Both have only one counter that unequivocally means the kernel is in trouble with memory allocation: swap. If swap goes over 25% and keeps growing, the system is in trouble. If your test leads to swapping, something is wrong. If not, the memory stats prove nothing.
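So if you want one cheap signal to watch during your test, watch swap, e.g.:

free -m | awk '/^Swap:/ {printf "swap used: %d%%\n", ($2 ? 100*$3/$2 : 0)}'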