LXD 3.0.1 with kernel 4.18

Hi all,

Running Ubuntu 18.04, I upgraded the kernel to 4.18rc2 from http://kernel.ubuntu.com/~kernel-ppa/mainline/. However, since the upgrade it has been impossible to launch LXD containers.
The error message shown by lxc info --show-log is:

Name: kali
Remote: unix://
Architecture: x86_64
Created: 2018/05/04 05:30 UTC
Status: Stopped
Type: persistent
Profiles: default

Log:

lxc kali 20180630004334.373 ERROR    lxc_utils - utils.c:open_devnull:1753 - Permission denied - Can't open /dev/null
lxc kali 20180630004334.373 ERROR    lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
lxc kali 20180630004334.398 ERROR    lxc_container - lxccontainer.c:wait_on_daemonized_start:834 - Received container state "ABORTING" instead of "RUNNING"
lxc kali 20180630004334.398 ERROR    lxc_start - start.c:__lxc_start:1887 - Failed to spawn container "kali"
lxc 20180630004334.411 WARN     lxc_commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_state"

The LXD installation is 3.0.1 from the Ubuntu repos. Everything worked normally with older mainline kernels, up to and including 4.17.x.

Thanks in advance for any solution or workaround, if there is one. Unfortunately, both the default Ubuntu 4.15 kernel and the 4.17 mainline kernels have other issues on my machine, so upgrading to 4.18 is pretty much a necessity.

Jacob

Can you post grep results on lxd? That might help with this.

That’s a known issue with the 4.18 kernel which affects LXC, LXD, systemd and pretty much every piece of software that uses user namespaces.

I’d recommend downgrading to another kernel for now.
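For anyone unsure whether a failed start is this particular issue, here is a rough Python sketch that scans the output of lxc info --show-log for the signature seen in the log above (an ERROR from open_devnull with "Permission denied"). The line layout in the regular expression is only inferred from the excerpts in this thread, and the function name has_devnull_signature is made up for illustration; it is not part of LXC or LXD.

```python
import re

# Layout assumed from the log excerpts in this thread (not an official format):
# lxc [container] <timestamp> <LEVEL> <subsystem> - <file>:<func>:<line> - <message>
LOG_LINE = re.compile(
    r"^lxc\s+(?:(?P<container>\S+)\s+)?(?P<timestamp>\d{14}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<subsystem>\S+)\s+-\s+"
    r"(?P<location>\S+)\s+-\s+(?P<message>.*)$"
)


def has_devnull_signature(log_text):
    """Return True if the log contains the open_devnull permission error."""
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # header lines such as "Name: kali" are skipped
        if (
            match["level"] == "ERROR"
            and "open_devnull" in match["location"]
            and "Permission denied" in match["message"]
        ):
            return True
    return False


SAMPLE = """Status: Stopped
lxc kali 20180630004334.373 ERROR    lxc_utils - utils.c:open_devnull:1753 - Permission denied - Can't open /dev/null
lxc 20180630004334.411 WARN     lxc_commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_state"
"""

if __name__ == "__main__":
    print(has_devnull_signature(SAMPLE))
```

A match only suggests the same symptom; other causes of a /dev/null permission error would look identical in the log.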


Thanks Stephane. Not to derail this thread with an aside, but can you point me towards any kernel net discussions about how the 4.18 namespace issue is progressing? Much appreciated.

https://lists.linuxfoundation.org/pipermail/containers/2018-June/thread.html#39174 has the relevant thread


Thanks very much. Resubscribed to the containers list.

The patch required to make LXC work again is available in git master and has been backported to the 3.0.0 stable branch.

Thanks for the quick fix! Which git repo can I pick it up from? On https://github.com/lxc/lxd the latest patch available is 6160132 from 3 days ago, which doesn’t seem to be related to this.

I think it’s this one. https://github.com/lxc/lxd/pull/4704

I’ve rebuilt LXD 3.0.1 with this patch applied but it still doesn’t work.

The required fix is in liblxc, not in LXD. There’s also the possibility that the 4.18 kernel patch causing this will be reverted. This is something we are currently discussing.

Can you please point me to the patch? The commits in the lxc repo for the past few days seem to be all about test cleanups.

It works! Thank you so much for your help. This saves the day for me :slight_smile:

Whew… This thread just saved me a heap of time debugging…

What’s the recommended solution for this issue on Ubuntu 18.04? Is it still to avoid 4.18.x kernels?

Help!! I just ran into this issue today! LXD/LXC has been running perfectly for me for many months in a row, and today I ran into this issue. I am on Ubuntu 16.04.4.
root@localhost:/var/log/apt# uname -a
Linux localhost 4.18.8-x86_64-linode117 #1 SMP PREEMPT Tue Sep 18 18:48:25 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

I don’t know how the kernel got to 4.18.8 between last night and this afternoon!

Here are the errors:

root@localhost:/var/log/apt# lxc info --show-log LPC1
Name: LPC1
Remote: unix://
Architecture: x86_64
Created: 2018/08/04 23:47 UTC
Status: Stopped
Type: persistent
Profiles: default

Log:

lxc LPC1 20181015181038.960 ERROR lxc_utils - utils.c:open_devnull:1753 - Permission denied - Can't open /dev/null
lxc LPC1 20181015181038.960 ERROR lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
lxc LPC1 20181015181039.202 ERROR lxc_container - lxccontainer.c:wait_on_daemonized_start:834 - Received container state "ABORTING" instead of "RUNNING"
lxc LPC1 20181015181039.205 ERROR lxc_start - start.c:__lxc_start:1887 - Failed to spawn container "LPC1"
lxc 20181015181039.299 WARN lxc_commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_state"

This is critical to me and what I am doing! How can I patch this? I don’t understand how to use the above fix.

Thanks,

Ray

It’s the Linode kernel. Something happened and you got upgraded to that kernel. It probably does not have all the features that LXD needs.

See the Linode Docs guide "Access an Apache Web Server Inside a LXD Container" on how to switch to the stock Ubuntu Linux kernel.

@Rayj, @simos’s suggestion worked for me. I used the guide below to walk me through changing the kernel. I selected “Grub 2”, rebooted the Linode, and started the LXC containers with no problem.

Here is the guide: https://www.linode.com/docs/tools-reference/custom-kernels-distros/custom-compiled-kernel-debian-ubuntu/#configure-the-linode

Thanks @simos

Please read the whole thread. It links to a patch, included in lxc 3.0.2, which handles newer kernels that have different mknod() semantics. :slight_smile:
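To make that rule of thumb concrete, here is a small Python sketch that flags the combination discussed in this thread: a 4.18 or newer kernel together with a liblxc older than 3.0.2. Both thresholds are taken only from the posts above, and the helper names leading_version and likely_affected are hypothetical. A distro kernel may carry a revert or backport, so treat the result as a hint rather than a diagnosis.

```python
import re


def leading_version(text):
    """Parse the leading 'X.Y' or 'X.Y.Z' of a version string into a tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError("no version number found in %r" % text)
    # A missing patch level (e.g. "4.18rc2") is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def likely_affected(kernel_release, liblxc_version):
    """True if the kernel is >= 4.18 and liblxc is < 3.0.2.

    Assumption from this thread only: 4.18 introduced the behaviour change
    and lxc 3.0.2 contains the fix.
    """
    return (
        leading_version(kernel_release) >= (4, 18, 0)
        and leading_version(liblxc_version) < (3, 0, 2)
    )


if __name__ == "__main__":
    import platform

    # platform.release() returns the same string as `uname -r`.
    # "3.0.1" is a placeholder; substitute the liblxc version on your host.
    print(likely_affected(platform.release(), "3.0.1"))
```

For example, the Linode kernel string shown earlier, 4.18.8-x86_64-linode117, combined with liblxc 3.0.1 would be flagged, while the same kernel with 3.0.2 would not.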