On one of my lxd servers (standalone) I’ve noticed high load from the lxd process, in the range of 60% of one core (OK, it is only an Atom C2358, but still). An almost identical twin machine shows no such load.
Stracing the process reveals tons of reads on an inotify watch and not much else. Further stracing reveals that it has a watch on /dev/net and is receiving updates for /dev/net/tun. That is the difference between the machines: the one with the higher load runs an openvpn process on the host (not in a container), using a tun device.
Why does lxd hold an inotify watch on /dev/net/tun and can it be turned off somehow? I’m not really planning on using tun devices in containers there in the near future.
No, unless it is happening by default, definitely not. If I’m interpreting the inotify-info above correctly, the watch is on the directory /dev/net and tun just happens to be in it.
I have just tested: this happens even without any containers running. It is enough to have a tunnel running outside of the container, and lxd will use tons of CPU just processing the /dev/net/tun device “changes”.
What sense does it make and is there any way to turn it off?
We have a component called DevMonitor, which enables fanotify on /dev. I can’t see any way to disable a particular watch on /dev/net/tun. But in any case, it looks like a serious problem that you see so many events for this device and that lxd spends so much CPU time processing them.
Can you provide some details about the workload on the tun device? Do you have just one OpenVPN server instance on the host with heavy traffic on it? Or do you have several OpenVPN servers with many clients? I’m asking because we will probably need a local reproduction of this problem.
The tun device is used by only one openvpn server, running outside of lxd on physical hardware.
How lightly loaded it is is a matter of interpretation. It only serves two lines of 100 Mbps each, but on the other hand, the CPU in the machine is nothing special either.
What I have empirically observed is that lxd uses roughly half as much CPU as the openvpn process, but that is highly unscientific:
PID TID S CPU COMMAND-LINE
6855 - S 80% /usr/sbin/openvpn …
282494 - S 41% lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
4010401 - R 10% atop
Those are the only 3 processes taking more than 1% of the CPU.
Yes, it stops using basically any CPU and there are no reads of the inotify watch for /dev/net/tun (actually on /dev/net). lxd itself never reads directly from /dev/net/tun, as far as I can see.
There is no change in CPU load if I reload with systemctl reload snap.lxd.daemon. If the VPN was on, it continues doing the same inotify event processing; if it was off, it doesn’t, as there are no other files in /dev/net on my system except /dev/net/tun.
Well, /dev/net/tun is a networking device and openvpn is writing to it all the time, so if the watch is watching for modify events, there will be an event every millisecond or whatever the minimum resolution of those events is.
The question is, does it really need to be a modify watch?
I ran some local tests with an OpenVPN client (not a server) and a small test program:
/* based on https://man7.org/linux/man-pages/man7/inotify.7.html */
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>
#include <string.h>

/* Read all available inotify events from the file descriptor 'fd'.
   wd is the table of watch descriptors for the directories in argv.
   argc is the length of wd and argv.
   argv is the list of watched directories.
   Entry 0 of wd and argv is unused. */
static void
handle_events(int fd, int *wd, int argc, char* argv[])
{
    /* Some systems cannot read integer variables if they are not
       properly aligned. On other systems, incorrect alignment may
       decrease performance. Hence, the buffer used for reading from
       the inotify file descriptor should have the same alignment as
       struct inotify_event. */
    static int modify_count = 0;   /* running total of IN_MODIFY events */
    char buf[4096]
        __attribute__ ((aligned(__alignof__(struct inotify_event))));
    const struct inotify_event *event;
    ssize_t len;

    /* Loop while events can be read from inotify file descriptor. */
    for (;;) {
        /* Read some events. */
        len = read(fd, buf, sizeof(buf));
        if (len == -1 && errno != EAGAIN) {
            perror("read");
            exit(EXIT_FAILURE);
        }

        /* If the nonblocking read() found no events to read, then
           it returns -1 with errno set to EAGAIN. In that case,
           we exit the loop. */
        if (len <= 0)
            break;

        /* Loop over all events in the buffer. */
        for (char *ptr = buf; ptr < buf + len;
                ptr += sizeof(struct inotify_event) + event->len) {
            event = (const struct inotify_event *) ptr;

            /* Print event type. */
            if (event->mask & IN_MODIFY)
                printf("IN_MODIFY #%d: ", modify_count++);
            if (event->mask & IN_ATTRIB)
                printf("IN_ATTRIB: ");
            if (event->mask & IN_OPEN)
                printf("IN_OPEN: ");
            if (event->mask & IN_CLOSE_NOWRITE)
                printf("IN_CLOSE_NOWRITE: ");
            if (event->mask & IN_CLOSE_WRITE)
                printf("IN_CLOSE_WRITE: ");

            /* Print the name of the watched directory. */
            for (int i = 1; i < argc; ++i) {
                if (wd[i] == event->wd) {
                    printf("%s/", argv[i]);
                    break;
                }
            }

            /* Print the name of the file. */
            if (event->len)
                printf("%s", event->name);

            /* Print type of filesystem object. */
            if (event->mask & IN_ISDIR)
                printf(" [directory]\n");
            else
                printf(" [file]\n");
        }
    }
}

int
main(int argc, char* argv[])
{
    char buf;
    int fd, i, poll_num;
    int *wd;
    nfds_t nfds;
    struct pollfd fds[2];

    if (argc < 2) {
        printf("Usage: %s PATH [PATH ...]\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    printf("Press ENTER key to terminate.\n");

    /* Create the file descriptor for accessing the inotify API. */
    fd = inotify_init1(IN_NONBLOCK);
    if (fd == -1) {
        perror("inotify_init1");
        exit(EXIT_FAILURE);
    }

    /* Allocate memory for watch descriptors. */
    wd = calloc(argc, sizeof(int));
    if (wd == NULL) {
        perror("calloc");
        exit(EXIT_FAILURE);
    }

    /* Mark directories for events:
       - entry created, deleted or moved in or out
       - entry attributes changed or contents modified
       - watched directory itself moved or deleted */
    for (i = 1; i < argc; i++) {
        wd[i] = inotify_add_watch(fd, argv[i],
                IN_MOVED_TO | IN_MOVED_FROM | IN_CREATE | IN_ATTRIB |
                IN_MODIFY | IN_MOVE_SELF | IN_DELETE | IN_DELETE_SELF);
        if (wd[i] == -1) {
            fprintf(stderr, "Cannot watch '%s': %s\n",
                    argv[i], strerror(errno));
            exit(EXIT_FAILURE);
        }
    }

    /* Prepare for polling. */
    nfds = 2;
    fds[0].fd = STDIN_FILENO;   /* Console input */
    fds[0].events = POLLIN;
    fds[1].fd = fd;             /* Inotify input */
    fds[1].events = POLLIN;

    /* Wait for events and/or terminal input. */
    printf("Listening for events.\n");
    while (1) {
        poll_num = poll(fds, nfds, -1);
        if (poll_num == -1) {
            if (errno == EINTR)
                continue;
            perror("poll");
            exit(EXIT_FAILURE);
        }

        if (poll_num > 0) {
            if (fds[0].revents & POLLIN) {
                /* Console input is available. Empty stdin and quit. */
                while (read(STDIN_FILENO, &buf, 1) > 0 && buf != '\n')
                    continue;
                break;
            }

            if (fds[1].revents & POLLIN) {
                /* Inotify events are available. */
                handle_events(fd, wd, argc, argv);
            }
        }
    }

    printf("Listening for events stopped.\n");

    /* Close inotify file descriptor. */
    close(fd);
    free(wd);
    exit(EXIT_SUCCESS);
}
Then I connected to my VPN server and ran a speed test. This resulted in very heavy CPU load, and ~280K IN_MODIFY events were generated.
Thanks, we would expect a high number of modify events, but I wonder why @monstermunchkin set it up this way; perhaps there was a particular need. Or hopefully we can relax the event types we collect.
Because it’s not just /dev/net/tun that would be affected by this.
I wonder if the modify event only needs to be established on directories in order to detect new sub files created.
As far as I can see, the fsnotify Go package gives us no choice here.
IN_MODIFY is used to detect writes. But we are not interested in writes; we only want to subscribe to CREATE and DELETE events. However, the fsnotify code enables all flags unconditionally (fsnotify/backend_inotify.go at main · fsnotify/fsnotify · GitHub).
Ah, so it’s a limitation of the package used. Please can you log this here Issues · lxc/lxd · GitHub so we can investigate whether we can work around it, make a change upstream, or change the package used. Thanks!
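To illustrate the point at the raw inotify level, here is a minimal sketch, not LXD or fsnotify code. The helpers `drain` and `count_events` and the temp directory name are hypothetical, made up for this example. It watches a scratch directory with a caller-supplied mask, writes to a file that already exists there, then creates and deletes a second file, and counts the events delivered. The queue is drained after every operation because inotify coalesces identical unread events, which would otherwise hide how many writes happened.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: read and count every event currently queued on the
   nonblocking inotify descriptor 'ifd'. */
static int drain(int ifd)
{
    char buf[4096]
        __attribute__ ((aligned(__alignof__(struct inotify_event))));
    int count = 0;
    ssize_t len;

    while ((len = read(ifd, buf, sizeof(buf))) > 0) {
        for (char *p = buf; p < buf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *) p;
            count++;
            p += sizeof(*ev) + ev->len;
        }
    }
    return count;
}

/* Hypothetical helper: watch a fresh temp directory with 'mask', write one
   byte to an existing file 'writes' times, then create and delete a second
   file. Returns the number of events seen, or -1 on setup failure. */
static int count_events(uint32_t mask, int writes)
{
    char dir[] = "/tmp/inotify-demo-XXXXXX";
    char path[64], extra[64];
    int count = 0;

    assert(writes >= 0);
    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof(path), "%s/data", dir);
    snprintf(extra, sizeof(extra), "%s/extra", dir);

    /* Create the data file before adding the watch, so its creation is
       not counted. */
    int file = open(path, O_CREAT | O_WRONLY, 0600);
    int ifd = inotify_init1(IN_NONBLOCK);
    if (file == -1 || ifd == -1 || inotify_add_watch(ifd, dir, mask) == -1)
        return -1;

    /* Each write produces IN_MODIFY only if the mask asks for it. */
    for (int i = 0; i < writes; i++) {
        if (write(file, "x", 1) != 1)
            return -1;
        count += drain(ifd);
    }
    close(file);

    /* Creating and deleting an entry produces IN_CREATE and IN_DELETE. */
    int created = open(extra, O_CREAT | O_WRONLY, 0600);
    if (created == -1)
        return -1;
    close(created);
    count += drain(ifd);
    unlink(extra);
    count += drain(ifd);

    /* Clean up the scratch directory. */
    close(ifd);
    unlink(path);
    rmdir(dir);
    return count;
}
```

Calling `count_events(IN_CREATE | IN_DELETE, 50)` should report only the two create/delete events, while adding `IN_MODIFY` to the mask should add one event per write. That is the same effect as a busy /dev/net/tun inside a watched /dev/net, just at a much smaller scale.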