LXD daemon high CPU usage processing inotify watch on /dev/net/tun

On one of my standalone LXD servers I’ve noticed high load from the lxd process, around 60% of one core (OK, it is only an Atom C2358, but still). An almost identical twin machine shows no such load.

Stracing the process reveals tons of reads on an inotify watch and not much else. Further stracing reveals that it has a watch on /dev/net and is receiving updates for /dev/net/tun. That is the difference between the machines: the one with the higher load runs an openvpn process on the host (not in a container), which uses a tun device.

Why does lxd hold an inotify watch on /dev/net/tun and can it be turned off somehow? I’m not really planning on using tun devices in containers there in the near future.


$ sudo /tmp/inotify-info lxd | fgrep 0:5
       90 [0:5] /dev/input/
       96 [0:5] /dev/char/
      190 [0:5] /dev/bus/
      191 [0:5] /dev/bus/usb/
      192 [0:5] /dev/bus/usb/001/
      194 [0:5] /dev/bus/usb/002/
      196 [0:5] /dev/bus/usb/003/
      199 [0:5] /dev/bsg/
      218 [0:5] /dev/block/
      220 [0:5] /dev/disk/
      221 [0:5] /dev/disk/by-path/
      223 [0:5] /dev/disk/by-id/
      238 [0:5] /dev/disk/by-partuuid/
      241 [0:5] /dev/disk/by-label/
      245 [0:5] /dev/disk/by-uuid/
      321 [0:5] /dev/md/
      384 [0:5] /dev/net/
      388 [0:5] /dev/mapper/
      390 [0:5] /dev/vfio/
      396 [0:5] /dev/snd/
      408 [0:5] /dev/input/by-path/
      415 [0:5] /dev/dri/
      433 [0:5] /dev/dri/by-path/
      536 [0:5] /dev/input/by-id/

$ sudo strace -fp 3384458  -e read=27
...
[pid 3965240] read(27,  <unfinished ...>
[pid 3384472] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 3965240] <... read resumed>"\31\0\0\0\2\0\0\0\0\0\0\0\20\0\0\0tun\0\0\0\0\0\0\0\0\0\0\0\0\0", 65536) = 32
 | 00000  19 00 00 00 02 00 00 00  00 00 00 00 10 00 00 00  ................ |
 | 00010  74 75 6e 00 00 00 00 00  00 00 00 00 00 00 00 00  tun............. |
[pid 3384472] <... nanosleep resumed>NULL) = 0
....

$ sudo ls -al /proc/3384458/fd/27
lr-x------ 1 root root 64 Nov 14 17:00 /proc/3384458/fd/27 -> anon_inode:inotify
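
For reference, the 32 bytes returned by that read can be decoded as a struct inotify_event. This is only a minimal sketch, assuming a little-endian host like the one the strace came from; decode_event and strace_bytes are names made up for illustration, and the bytes are copied from the hex dump above.

```c
#include <assert.h>
#include <string.h>
#include <sys/inotify.h>

/* Decode the first inotify_event in a raw buffer, e.g. bytes copied from
   strace output. Returns 0 on success, -1 if the buffer is too short.
   decode_event() is a hypothetical helper, not part of any library. */
int
decode_event(const unsigned char *buf, size_t buflen,
             struct inotify_event *hdr, char *name, size_t namelen)
{
    assert(buf != NULL && hdr != NULL && name != NULL && namelen > 0);

    if (buflen < sizeof(*hdr))
        return -1;

    /* memcpy avoids alignment problems with arbitrary byte buffers. */
    memcpy(hdr, buf, sizeof(*hdr));
    if (buflen < sizeof(*hdr) + hdr->len)
        return -1;

    /* The name field is NUL-padded up to hdr->len bytes. */
    size_t n = hdr->len < namelen - 1 ? hdr->len : namelen - 1;
    memcpy(name, buf + sizeof(*hdr), n);
    name[n] = '\0';
    return 0;
}

/* The 32 bytes from the read(27, ...) call; the rest of the array is zero. */
const unsigned char strace_bytes[32] = {
    0x19, 0x00, 0x00, 0x00,   /* wd     */
    0x02, 0x00, 0x00, 0x00,   /* mask   */
    0x00, 0x00, 0x00, 0x00,   /* cookie */
    0x10, 0x00, 0x00, 0x00,   /* len    */
    't', 'u', 'n', 0          /* name, padded with NULs */
};
```

Decoded this way, the event has wd 25, mask 0x2 (IN_MODIFY), cookie 0, len 16 and name "tun", i.e. a modify event for the entry tun inside the watched directory.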

Any thoughts, anyone?

Are you passing /dev/net/tun into any of your instances?

No, unless it is happening by default, definitely not. If I’m interpreting the inotify-info above correctly, the watch is on the directory /dev/net and tun is just accidentally there :wink:

I have just tested: this happens even without any containers running. It is enough to have a tunnel running outside of the container, and lxd will use tons of CPU just processing the /dev/net/tun device “changes”.

What sense does it make and is there any way to turn it off?

We have a component called DevMonitor that enables fanotify on /dev. I can’t see any way to disable the particular watch on /dev/net/tun. Either way, it looks like a serious problem that this device generates so many events and that lxd burns so much CPU time processing them.

Can you provide some details about the workload on the tun device? Do you have just one OpenVPN server instance on the host with heavy traffic on it? Or do you have several OpenVPN servers with many clients? I’m asking because we will probably need to reproduce this problem locally.

If you reload LXD do you see the same reads on /dev/net/tun?

What is inotify-info, and can we use it locally?

Any thoughts @monstermunchkin ?

inotify-info is mikesart/inotify-info on GitHub, a Linux inotify info reporting app.

The tun device is used by only one openvpn server, running outside of lxd on physical hardware.

How lightly loaded it is is a matter of interpretation: it only serves 2 lines at 100Mbps each, but on the other hand the CPU in the machine is nothing special either.

What I have observed empirically is that lxd uses roughly half as much CPU as the openvpn process, but that is highly unscientific :slight_smile:

PID     TID S  CPU COMMAND-LINE
6855      - S  80% /usr/sbin/openvpn …
282494    - S  41% lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
4010401   - R  10% atop

Those are the only 3 processes taking more than 1% of the CPU.

Thanks. Next questions:

  • If you stop OpenVPN on the server does LXD stop using as much CPU (and do you see the reads of /dev/net/tun reduce)?
  • If you reload LXD (with OpenVPN running) does it continue to read from /dev/net/tun?

Yes, it stops using basically any CPU and there are no more reads on the inotify watch for /dev/net/tun (actually on /dev/net). As far as I can see, lxd itself never reads directly from /dev/net/tun.

No change in CPU load if I reload with systemctl reload snap.lxd.daemon. If the VPN was on, lxd continues doing the same inotify event processing; if it was off, it doesn’t, as there are no other files in /dev/net on my system except /dev/net/tun.

Thanks. I wonder if something that OpenVPN is doing is causing lots of events to be sent to the /dev/net listener.

Well, /dev/net/tun is a networking device and openvpn is writing to it all the time, so if the watch is set up for modify events, there will be an event every millisecond or whatever the minimum resolution of those events is.

The question is, does it really need to be a modify watch?

Over to you @monstermunchkin, any ideas what’s happening here and whether we can relax the watches on those devices?

I’ve performed some local tests with an OpenVPN client (not server) and a small test program:

/* based on https://man7.org/linux/man-pages/man7/inotify.7.html */
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>
#include <string.h>

/* Read all available inotify events from the file descriptor 'fd'.
    wd is the table of watch descriptors for the directories in argv.
    argc is the length of wd and argv.
    argv is the list of watched directories.
    Entry 0 of wd and argv is unused. */

static void
handle_events(int fd, int *wd, int argc, char* argv[])
{
    /* Some systems cannot read integer variables if they are not
        properly aligned. On other systems, incorrect alignment may
        decrease performance. Hence, the buffer used for reading from
        the inotify file descriptor should have the same alignment as
        struct inotify_event. */
    static int i = 0;
    char buf[4096]
        __attribute__ ((aligned(__alignof__(struct inotify_event))));
    const struct inotify_event *event;
    ssize_t len;

    /* Loop while events can be read from inotify file descriptor. */

    for (;;) {

        /* Read some events. */

        len = read(fd, buf, sizeof(buf));
        if (len == -1 && errno != EAGAIN) {
            perror("read");
            exit(EXIT_FAILURE);
        }

        /* If the nonblocking read() found no events to read, then
            it returns -1 with errno set to EAGAIN. In that case,
            we exit the loop. */

        if (len <= 0)
            break;

        /* Loop over all events in the buffer. */

        for (char *ptr = buf; ptr < buf + len;
                ptr += sizeof(struct inotify_event) + event->len) {

            event = (const struct inotify_event *) ptr;

            /* Print event type. */

            if (event->mask & IN_MODIFY)
                printf("IN_MODIFY: %d ", i++);
            if (event->mask & IN_ATTRIB)
                printf("IN_ATTRIB: ");
            if (event->mask & IN_OPEN)
                printf("IN_OPEN: ");
            if (event->mask & IN_CLOSE_NOWRITE)
                printf("IN_CLOSE_NOWRITE: ");
            if (event->mask & IN_CLOSE_WRITE)
                printf("IN_CLOSE_WRITE: ");

            /* Print the name of the watched directory. */

            for (int i = 1; i < argc; ++i) {
                if (wd[i] == event->wd) {
                    printf("%s/", argv[i]);
                    break;
                }
            }

            /* Print the name of the file. */

            if (event->len)
                printf("%s", event->name);

            /* Print type of filesystem object. */

            if (event->mask & IN_ISDIR)
                printf(" [directory]\n");
            else
                printf(" [file]\n");
        }
    }
}

int
main(int argc, char* argv[])
{
    char buf;
    int fd, i, poll_num;
    int *wd;
    nfds_t nfds;
    struct pollfd fds[2];

    if (argc < 2) {
        printf("Usage: %s PATH [PATH ...]\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    printf("Press ENTER key to terminate.\n");

    /* Create the file descriptor for accessing the inotify API. */

    fd = inotify_init1(IN_NONBLOCK);
    if (fd == -1) {
        perror("inotify_init1");
        exit(EXIT_FAILURE);
    }

    /* Allocate memory for watch descriptors. */

    wd = calloc(argc, sizeof(int));
    if (wd == NULL) {
        perror("calloc");
        exit(EXIT_FAILURE);
    }

    /* Watch each path with the same event mask that fsnotify uses:
        moves, creation, attribute changes, modification and deletion. */

    for (i = 1; i < argc; i++) {
        wd[i] = inotify_add_watch(fd, argv[i],
                                  IN_MOVED_TO | IN_MOVED_FROM | IN_CREATE |
                                  IN_ATTRIB | IN_MODIFY | IN_MOVE_SELF |
                                  IN_DELETE | IN_DELETE_SELF);
        if (wd[i] == -1) {
            fprintf(stderr, "Cannot watch '%s': %s\n",
                    argv[i], strerror(errno));
            exit(EXIT_FAILURE);
        }
    }

    /* Prepare for polling. */

    nfds = 2;

    fds[0].fd = STDIN_FILENO;       /* Console input */
    fds[0].events = POLLIN;

    fds[1].fd = fd;                 /* Inotify input */
    fds[1].events = POLLIN;

    /* Wait for events and/or terminal input. */

    printf("Listening for events.\n");
    while (1) {
        poll_num = poll(fds, nfds, -1);
        if (poll_num == -1) {
            if (errno == EINTR)
                continue;
            perror("poll");
            exit(EXIT_FAILURE);
        }

        if (poll_num > 0) {

            if (fds[0].revents & POLLIN) {

                /* Console input is available. Empty stdin and quit. */

                while (read(STDIN_FILENO, &buf, 1) > 0 && buf != '\n')
                    continue;
                break;
            }

            if (fds[1].revents & POLLIN) {

                /* Inotify events are available. */

                handle_events(fd, wd, argc, argv);
            }
        }
    }

    printf("Listening for events stopped.\n");

    /* Close inotify file descriptor. */

    close(fd);

    free(wd);
    exit(EXIT_SUCCESS);
}

Here I’ve tried to set up inotify the same way it is done in LXD (fsnotify/backend_inotify.go at main · fsnotify/fsnotify · GitHub).

Then I connected to my VPN server and ran a speed test. This resulted in ultra-heavy CPU load, and ~280K IN_MODIFY events were generated.

Thanks. We would expect a large number of modify events, but I wonder why @monstermunchkin set it up this way; perhaps there was a particular need. Otherwise, hopefully we can relax the event types we collect.

Because it’s not just /dev/net/tun that would be affected by this.

I wonder if the modify event only needs to be established on directories in order to detect new sub files created.

As far as I can see, the fsnotify Go package gives us no choice here.
IN_MODIFY is used to detect writes, but we are not interested in writes; we only want to subscribe to CREATE and DELETE events. The fsnotify code, however, enables all flags unconditionally (fsnotify/backend_inotify.go at main · fsnotify/fsnotify · GitHub):

	var flags uint32 = unix.IN_MOVED_TO | unix.IN_MOVED_FROM |
		unix.IN_CREATE | unix.IN_ATTRIB | unix.IN_MODIFY |
		unix.IN_MOVE_SELF | unix.IN_DELETE | unix.IN_DELETE_SELF

Ah, so it’s a limitation of the package used. Could you please log this at Issues · lxc/lxd · GitHub so we can investigate whether we can work around it, get a change made upstream, or switch to a different package? Thanks!

What host OS version are you using? Aside from the bug in fsnotify, LXD should try to use fanotify if available.

Please can you show “lxc info” output?