Lxd daemon high cpu usage processing inotify watch on /dev/net/tun

Aleks · November 14, 2022, 4:10pm

On one of my lxd servers (standalone) I’ve noticed high load on the lxd process, in the range of 60% of one core (ok, it is only an Atom C2358, but still). An almost identical twin machine has no load at all.

Stracing the process reveals tons of reads on an inotify watch and not much else. Further stracing shows that it has a watch on /dev/net and is receiving updates for /dev/net/tun. That is the difference between the machines: the one with the higher load runs an openvpn process on the host (not in a container), using a tun device.

Why does lxd hold an inotify watch on /dev/net/tun and can it be turned off somehow? I’m not really planning on using tun devices in containers there in the near future.


$ sudo /tmp/inotify-info lxd | fgrep 0:5
       90 [0:5] /dev/input/
       96 [0:5] /dev/char/
      190 [0:5] /dev/bus/
      191 [0:5] /dev/bus/usb/
      192 [0:5] /dev/bus/usb/001/
      194 [0:5] /dev/bus/usb/002/
      196 [0:5] /dev/bus/usb/003/
      199 [0:5] /dev/bsg/
      218 [0:5] /dev/block/
      220 [0:5] /dev/disk/
      221 [0:5] /dev/disk/by-path/
      223 [0:5] /dev/disk/by-id/
      238 [0:5] /dev/disk/by-partuuid/
      241 [0:5] /dev/disk/by-label/
      245 [0:5] /dev/disk/by-uuid/
      321 [0:5] /dev/md/
      384 [0:5] /dev/net/
      388 [0:5] /dev/mapper/
      390 [0:5] /dev/vfio/
      396 [0:5] /dev/snd/
      408 [0:5] /dev/input/by-path/
      415 [0:5] /dev/dri/
      433 [0:5] /dev/dri/by-path/
      536 [0:5] /dev/input/by-id/

$ sudo strace -fp 3384458  -e read=27
...
[pid 3965240] read(27,  <unfinished ...>
[pid 3384472] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 3965240] <... read resumed>"\31\0\0\0\2\0\0\0\0\0\0\0\20\0\0\0tun\0\0\0\0\0\0\0\0\0\0\0\0\0", 65536) = 32
 | 00000  19 00 00 00 02 00 00 00  00 00 00 00 10 00 00 00  ................ |
 | 00010  74 75 6e 00 00 00 00 00  00 00 00 00 00 00 00 00  tun............. |
[pid 3384472] <... nanosleep resumed>NULL) = 0
....

$ sudo ls -al /proc/3384458/fd/27
lr-x------ 1 root root 64 Nov 14 17:00 /proc/3384458/fd/27 -> anon_inode:inotify
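
For anyone reading the hex dump above: the 32 bytes returned by read() are one struct inotify_event record (a 16-byte header followed by a NUL-padded name). Below is a minimal C sketch that decodes exactly those bytes, assuming a little-endian host. The names dump and decode_event are hypothetical, made up for this illustration, and are not part of LXD or inotify-info.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/inotify.h>

/* The 32 bytes from the strace hex dump, copied verbatim. */
static const unsigned char dump[32] = {
    0x19, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
    0x74, 0x75, 0x6e, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* Hypothetical helper: copy the fixed-size header out of a raw buffer
   and point *name at the NUL-padded name that follows it.
   Returns 0 on success, -1 if the buffer is too short. */
static int
decode_event(const unsigned char *buf, size_t buflen,
             struct inotify_event *hdr, const char **name)
{
    assert(buf != NULL && hdr != NULL && name != NULL);

    if (buflen < sizeof(*hdr))
        return -1;

    /* memcpy avoids any alignment assumptions about buf. */
    memcpy(hdr, buf, sizeof(*hdr));

    if (buflen < sizeof(*hdr) + hdr->len)
        return -1;

    *name = hdr->len ? (const char *) (buf + sizeof(*hdr)) : "";
    return 0;
}
```

Decoded this way, the record has wd 25, mask 0x2 (IN_MODIFY), cookie 0, a name length of 16 and the name tun, i.e. a modify event for the entry tun in the watched directory.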

Aleks · November 16, 2022, 7:21am

any thoughts anyone?

tomp · November 16, 2022, 8:22am

Are you passing /dev/net/tun into any of your instances?

Aleks · November 16, 2022, 8:55am

No, unless it is happening by default, definitely not. If I’m interpreting the inotify-info output above correctly, the watch is on the directory /dev/net and tun just happens to be in there.

Aleks · November 23, 2022, 10:51am

I have just tested this: it happens even without any containers running. It is enough to have a tunnel running on the host, outside of any container, and lxd uses tons of CPU just processing the /dev/net/tun device “changes”.

What sense does it make and is there any way to turn it off?

amikhalitsyn · November 23, 2022, 12:50pm

We have a component called DevMonitor, which enables fanotify on /dev. I can’t see any way to disable a particular watch such as the one covering /dev/net/tun. In any case, it looks like a serious problem that you see so many events for this device and that lxd spends so much CPU time processing them.

Can you provide some details about the workload on the tun device? Do you have just one OpenVPN server instance on the host with heavy traffic on it, or several OpenVPN servers with many clients? I’m asking because we will probably need to reproduce this problem locally.

tomp · November 23, 2022, 12:52pm

If you reload LXD do you see the same reads on /dev/net/tun?

tomp · November 23, 2022, 12:52pm

What is this and can we use it locally?

tomp · November 23, 2022, 12:53pm

Any thoughts @monstermunchkin ?

Aleks · November 23, 2022, 1:21pm

inotify-info is mikesart/inotify-info on GitHub, a Linux inotify info reporting app.

tun device is used by only one openvpn server running outside of lxd on a physical hardware.

How lightly loaded it is is a matter of interpretation: it only serves two lines of 100 Mbps each, but on the other hand the CPU in the machine is nothing special either.

What I have observed empirically is that lxd uses roughly half as much CPU as the openvpn process, but that is highly unscientific.

    PID TID S  CPU COMMAND-LINE
   6855   - S  80% /usr/sbin/openvpn …
 282494   - S  41% lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
4010401   - R  10% atop

those are the only 3 processes taking more than 1% of the CPU.

tomp · November 23, 2022, 1:22pm

Thanks. Next questions:

If you stop OpenVPN on the server does LXD stop using as much CPU (and do you see the reads of /dev/net/tun reduce)?
If you reload LXD (with OpenVPN running) does it continue to read from /dev/net/tun?

Aleks · November 23, 2022, 1:38pm

Yes, it stops using basically any CPU and there are no more reads from the inotify watch for /dev/net/tun (actually the watch is on /dev/net). lxd itself never reads directly from /dev/net/tun, as far as I can see.

There is no change in CPU load if I reload with systemctl reload snap.lxd.daemon. If the vpn was on, it continues doing the same inotify event processing; if it was off, it doesn’t, as there are no other files in /dev/net on my system except /dev/net/tun.

tomp · November 23, 2022, 1:41pm

Thanks. I wonder if something that OpenVPN is doing is causing lots of events to be sent to the /dev/net listener.

Aleks · November 23, 2022, 2:29pm

Well, /dev/net/tun is a networking device and openvpn is writing to it all the time, so if the watch covers modify events, there will be an event every millisecond or whatever the minimum resolution of those events is.

the question is, does it really need to be a modify watch?

tomp · November 23, 2022, 2:35pm

Over to you @monstermunchkin, any ideas what’s happening here and whether we can relax the watches on those devices?

amikhalitsyn · November 23, 2022, 7:19pm

I’ve performed some local tests with an OpenVPN client (not server) and a small test program:

/* based on https://man7.org/linux/man-pages/man7/inotify.7.html */
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>
#include <string.h>

/* Read all available inotify events from the file descriptor 'fd'.
    wd is the table of watch descriptors for the directories in argv.
    argc is the length of wd and argv.
    argv is the list of watched directories.
    Entry 0 of wd and argv is unused. */

static void
handle_events(int fd, int *wd, int argc, char* argv[])
{
    /* Some systems cannot read integer variables if they are not
        properly aligned. On other systems, incorrect alignment may
        decrease performance. Hence, the buffer used for reading from
        the inotify file descriptor should have the same alignment as
        struct inotify_event. */
    static int i = 0;
    char buf[4096]
        __attribute__ ((aligned(__alignof__(struct inotify_event))));
    const struct inotify_event *event;
    ssize_t len;

    /* Loop while events can be read from inotify file descriptor. */

    for (;;) {

        /* Read some events. */

        len = read(fd, buf, sizeof(buf));
        if (len == -1 && errno != EAGAIN) {
            perror("read");
            exit(EXIT_FAILURE);
        }

        /* If the nonblocking read() found no events to read, then
            it returns -1 with errno set to EAGAIN. In that case,
            we exit the loop. */

        if (len <= 0)
            break;

        /* Loop over all events in the buffer. */

        for (char *ptr = buf; ptr < buf + len;
                ptr += sizeof(struct inotify_event) + event->len) {

            event = (const struct inotify_event *) ptr;

            /* Print event type. */

            if (event->mask & IN_MODIFY)
                printf("IN_MODIFY (%d): ", i++);
            if (event->mask & IN_ATTRIB)
                printf("IN_ATTRIB: ");
            if (event->mask & IN_OPEN)
                printf("IN_OPEN: ");
            if (event->mask & IN_CLOSE_NOWRITE)
                printf("IN_CLOSE_NOWRITE: ");
            if (event->mask & IN_CLOSE_WRITE)
                printf("IN_CLOSE_WRITE: ");

            /* Print the name of the watched directory. */

            for (int i = 1; i < argc; ++i) {
                if (wd[i] == event->wd) {
                    printf("%s/", argv[i]);
                    break;
                }
            }

            /* Print the name of the file. */

            if (event->len)
                printf("%s", event->name);

            /* Print type of filesystem object. */

            if (event->mask & IN_ISDIR)
                printf(" [directory]\n");
            else
                printf(" [file]\n");
        }
    }
}

int
main(int argc, char* argv[])
{
    char buf;
    int fd, i, poll_num;
    int *wd;
    nfds_t nfds;
    struct pollfd fds[2];

    if (argc < 2) {
        printf("Usage: %s PATH [PATH ...]\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    printf("Press ENTER key to terminate.\n");

    /* Create the file descriptor for accessing the inotify API. */

    fd = inotify_init1(IN_NONBLOCK);
    if (fd == -1) {
        perror("inotify_init1");
        exit(EXIT_FAILURE);
    }

    /* Allocate memory for watch descriptors. */

    wd = calloc(argc, sizeof(int));
    if (wd == NULL) {
        perror("calloc");
        exit(EXIT_FAILURE);
    }

    /* Mark directories with the same event mask that fsnotify uses:
        create, delete, move, attribute change and modify. */

    for (i = 1; i < argc; i++) {
        wd[i] = inotify_add_watch(fd, argv[i],
                                    IN_MOVED_TO | IN_MOVED_FROM | IN_CREATE | IN_ATTRIB | IN_MODIFY | IN_MOVE_SELF | IN_DELETE | IN_DELETE_SELF);
        if (wd[i] == -1) {
            fprintf(stderr, "Cannot watch '%s': %s\n",
                    argv[i], strerror(errno));
            exit(EXIT_FAILURE);
        }
    }

    /* Prepare for polling. */

    nfds = 2;

    fds[0].fd = STDIN_FILENO;       /* Console input */
    fds[0].events = POLLIN;

    fds[1].fd = fd;                 /* Inotify input */
    fds[1].events = POLLIN;

    /* Wait for events and/or terminal input. */

    printf("Listening for events.\n");
    while (1) {
        poll_num = poll(fds, nfds, -1);
        if (poll_num == -1) {
            if (errno == EINTR)
                continue;
            perror("poll");
            exit(EXIT_FAILURE);
        }

        if (poll_num > 0) {

            if (fds[0].revents & POLLIN) {

                /* Console input is available. Empty stdin and quit. */

                while (read(STDIN_FILENO, &buf, 1) > 0 && buf != '\n')
                    continue;
                break;
            }

            if (fds[1].revents & POLLIN) {

                /* Inotify events are available. */

                handle_events(fd, wd, argc, argv);
            }
        }
    }

    printf("Listening for events stopped.\n");

    /* Close inotify file descriptor. */

    close(fd);

    free(wd);
    exit(EXIT_SUCCESS);
}

Here I’ve tried to set up inotify the same way it is done in LXD (see backend_inotify.go on the main branch of fsnotify/fsnotify on GitHub).

Then I connected to my VPN server and ran a speed test. This resulted in ultra-heavy load on the CPU, and ~280K IN_MODIFY events were generated.

tomp · November 23, 2022, 7:24pm

Thanks, we would expect a high amount of modify events, but I wonder why @monstermunchkin set it up this way, perhaps there was a particular need. Or hopefully we can relax the event types we collect.

Because it’s not just /dev/net/tun that would be affected by this.

I wonder if the modify event only needs to be established on directories in order to detect new sub files created.

amikhalitsyn · November 23, 2022, 7:33pm

As far as I can see, the fsnotify golang package gives us no choice here.
IN_MODIFY is used to detect writes, but we are not interested in writes; we only want to subscribe to CREATE and DELETE events. The fsnotify code, however, enables all flags unconditionally (see backend_inotify.go on the main branch of fsnotify/fsnotify on GitHub):

	var flags uint32 = unix.IN_MOVED_TO | unix.IN_MOVED_FROM |
		unix.IN_CREATE | unix.IN_ATTRIB | unix.IN_MODIFY |
		unix.IN_MOVE_SELF | unix.IN_DELETE | unix.IN_DELETE_SELF
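
To illustrate the difference at the raw inotify level, here is a minimal C sketch that watches a directory with only IN_CREATE | IN_DELETE. It is not fsnotify’s API and not how LXD sets up its watches; the helpers watch_create_delete, drain_events and run_demo are hypothetical names, and the scratch directory under /tmp is an assumption made for the demo.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: watch a directory only for entries appearing
   or disappearing. IN_MODIFY is left out, so writes to existing
   entries (such as /dev/net/tun) are not reported. */
static int
watch_create_delete(int fd, const char *dir)
{
    return inotify_add_watch(fd, dir, IN_CREATE | IN_DELETE);
}

/* Hypothetical helper: read all pending events from a non-blocking
   inotify descriptor. Returns the number of events, or -1 on error. */
static int
drain_events(int fd)
{
    char buf[4096]
        __attribute__ ((aligned(__alignof__(struct inotify_event))));
    int count = 0;

    assert(fd >= 0);

    for (;;) {
        ssize_t len = read(fd, buf, sizeof(buf));
        if (len == -1 && errno != EAGAIN)
            return -1;
        if (len <= 0)
            break;

        for (char *ptr = buf; ptr < buf + len; ) {
            const struct inotify_event *ev =
                (const struct inotify_event *) ptr;
            count++;
            ptr += sizeof(struct inotify_event) + ev->len;
        }
    }
    return count;
}

/* Hypothetical demo: create a file named "tun" in a scratch directory,
   write to it 100 times, delete it, and return the number of events
   the narrow watch delivered (or -1 on setup failure). */
static int
run_demo(void)
{
    char dir[] = "/tmp/inotify-sketch-XXXXXX";
    char path[64];
    int fd, count;
    FILE *f;

    if (mkdtemp(dir) == NULL)
        return -1;

    fd = inotify_init1(IN_NONBLOCK);
    if (fd == -1)
        return -1;
    if (watch_create_delete(fd, dir) == -1) {
        close(fd);
        return -1;
    }

    snprintf(path, sizeof(path), "%s/tun", dir);
    f = fopen(path, "w");
    if (f == NULL) {
        close(fd);
        return -1;
    }
    for (int i = 0; i < 100; i++) {
        fputc('x', f);
        fflush(f);          /* force one write() per iteration */
    }
    fclose(f);
    remove(path);

    count = drain_events(fd);
    close(fd);
    rmdir(dir);
    return count;
}
```

With this mask, run_demo() should see two events (one create, one delete) regardless of how many writes happen in between, whereas a mask that includes IN_MODIFY also reports the writes.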

tomp · November 23, 2022, 7:46pm

Ah, so it’s a limitation of the package used. Please can you log this in the lxc/lxd issue tracker on GitHub so we can investigate whether we can work around it, make a change upstream, or change the package used. Thanks!

tomp · November 23, 2022, 7:59pm

What host OS version are you using? Aside from the bug in fsnotify, LXD should try to use fanotify if available.

Please can you show “lxc info” output?