LXC Container is not starting on Android Q kernel 4.19.78+

Hello,

I am trying to launch an LXC container with lxc-start, using a BusyBox rootfs, and the launch terminates with the following error:
start - start.c:lxc_spawn:1737 - Invalid argument - Failed to clone a new set of namespaces

Below is my setup:
lxc-start version : 3.0.4
lxcfs version : 3.0.0
Android Q kernel : 4.19.78+ #1 SMP PREEMPT Tue Jun 2 14:27:08 UTC 2020 aarch64 GNU/Linux
Arch : aarch64

Below is the log output when I start the container on Android Q kernel 4.19.78, where it fails:

lxc-start busybox 20200421134559.497 ERROR lxc_start - lxc_start.c:main:289 - No container config specified
lxc-start mybusybox 20200421134640.962 INFO lxccontainer - lxccontainer.c:do_lxcapi_start:971 - Set process title to [lxc monitor] /var/lib/lxc mybusybox
lxc-start mybusybox 20200421134640.964 INFO lsm - lsm.c:lsm_init:50 - LSM security driver nop
lxc-start mybusybox 20200421134640.966 DEBUG terminal - terminal.c:lxc_terminal_peer_default:676 - No such device - The process does not have a controlling terminal
lxc-start mybusybox 20200421134640.967 INFO start - start.c:lxc_init:919 - Container “mybusybox” is initialized
lxc-start mybusybox 20200421134640.970 INFO cgfsng - cgfsng.c:cgfsng_monitor_create:1403 - The monitor process uses “lxc.monitor/mybusybox” as cgroup
lxc-start mybusybox 20200421134640.973 INFO cgfsng - cgfsng.c:cgfsng_payload_create:1468 - The container process uses “lxc.payload/mybusybox” as cgroup
lxc-start mybusybox 20200421134640.974 ERROR start - start.c:lxc_spawn:1737 - Invalid argument - Failed to clone a new set of namespaces

lxc-start mybusybox 20200421134640.974 DEBUG network - network.c:lxc_delete_network:3308 - Deleted network devices
lxc-start mybusybox 20200421134640.974 ERROR start - start.c:__lxc_start:2019 - Failed to spawn container “mybusybox”
lxc-start mybusybox 20200421134640.976 DEBUG lxccontainer - lxccontainer.c:wait_on_daemonized_start:839 - First child 2992 exited
lxc-start mybusybox 20200421134640.976 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:851 - Received container state “ABORTING” instead of “RUNNING”
lxc-start mybusybox 20200421134640.976 ERROR lxc_start - lxc_start.c:main:329 - The container failed to start
lxc-start mybusybox 20200421134640.976 ERROR lxc_start - lxc_start.c:main:332 - To get more details, run the container in foreground mode
lxc-start mybusybox 20200421134640.976 ERROR lxc_start - lxc_start.c:main:335 - Additional information can be obtained by setting the --logfile and --logpriority options

Below is the log output when I start the container on Android Q kernel 4.19.62, where it works:

lxc-start mybusybox 20200522083930.134 INFO lxccontainer - lxccontainer.c:do_lxcapi_start:971 - Set process title to [lxc monitor] /var/lib/lxc mybusybox
lxc-start mybusybox 20200522083930.135 INFO lsm - lsm.c:lsm_init:50 - LSM security driver nop
lxc-start mybusybox 20200522083930.139 DEBUG terminal - terminal.c:lxc_terminal_peer_default:676 - No such device - The process does not have a controlling terminal
lxc-start mybusybox 20200522083930.140 INFO start - start.c:lxc_init:919 - Container “mybusybox” is initialized
lxc-start mybusybox 20200522083930.142 DEBUG cgfsng - cgfsng.c:cg_legacy_handle_cpuset_hierarchy:612 - “cgroup.clone_children” was already set to “1”
lxc-start mybusybox 20200522083930.143 INFO cgfsng - cgfsng.c:cgfsng_monitor_create:1403 - The monitor process uses “lxc.monitor/mybusybox” as cgroup
lxc-start mybusybox 20200522083930.146 DEBUG cgfsng - cgfsng.c:cg_legacy_handle_cpuset_hierarchy:612 - “cgroup.clone_children” was already set to “1”
lxc-start mybusybox 20200522083930.147 INFO cgfsng - cgfsng.c:cgfsng_payload_create:1468 - The container process uses “lxc.payload/mybusybox” as cgroup
lxc-start mybusybox 20200522083930.154 ERROR start - start.c:proc_pidfd_open:1607 - Function not implemented - Failed to send signal through pidfd
lxc-start mybusybox 20200522083930.154 INFO start - start.c:lxc_spawn:1750 - Cloned CLONE_NEWNS
lxc-start mybusybox 20200522083930.154 INFO start - start.c:lxc_spawn:1750 - Cloned CLONE_NEWPID
lxc-start mybusybox 20200522083930.154 INFO start - start.c:lxc_spawn:1750 - Cloned CLONE_NEWUTS
lxc-start mybusybox 20200522083930.155 INFO start - start.c:lxc_spawn:1750 - Cloned CLONE_NEWIPC
lxc-start mybusybox 20200522083930.155 INFO start - start.c:lxc_spawn:1750 - Cloned CLONE_NEWNET

Below are the namespace limits on 4.19.78:

/proc/sys/user# cat max_cgroup_namespaces
15075
/proc/sys/user# cat max_ipc_namespaces
15075
/proc/sys/user# cat max_mnt_namespaces
15075
/proc/sys/user# cat max_net_namespaces
15075
/proc/sys/user# cat max_pid_namespaces
15075
/proc/sys/user# cat max_user_namespaces
15075
/proc/sys/user# cat max_uts_namespaces
15075

Notes:

I can see some differences between kernel 4.19.78 and 4.19.62 with respect to pidfd support. This may help with further analysis.
I also compared the lxc-checkconfig output on the 4.19.78 kernel with the output on 4.19.62. It is identical except for the following parameter:
CONFIG_FHANDLE: missing
However, I do not see how this option is relevant to the issue we are facing.

Please review this and let me know the possible causes for this. If you need any more details, please let me know.

Thank you.
-Swapnil

Hello,
I have an update on this issue.
On the 4.19.78 kernel I tried a sample program that creates a child with the clone() function call, using the following flags:
CLONE_NEWNS | CLONE_NEWPID | CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWNET
It created the child successfully.

As soon as I added the CLONE_PIDFD flag to the flags above, it failed to create the child.
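
The test described above can be reproduced with a small probe along the following lines. This is a minimal sketch, not the original sample program: `probe_clone`, `struct probe_result`, `child_fn` and `PROBE_STACK_SIZE` are hypothetical names, and the fallback definition of `CLONE_PIDFD` is only there for older headers. The `pidfd_init` parameter sets the value stored in the pidfd slot before the call. As far as I know, the first mainline implementation of CLONE_PIDFD (Linux 5.2) returned EINVAL when that slot was not zero; whether the Android 4.19.78 backport behaves the same way is an assumption worth checking, not something established in this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000 /* value from the kernel UAPI headers */
#endif

#define PROBE_STACK_SIZE (64 * 1024)

/* Outcome of one probe (hypothetical type, for illustration only). */
struct probe_result {
    int err;   /* 0 on success, otherwise the errno set by clone() */
    int pidfd; /* value found in the pidfd slot after the call */
};

static int child_fn(void *arg)
{
    (void)arg;
    return 0; /* the child exits immediately */
}

/*
 * Clone a short-lived child with `flags` and reap it.
 * `pidfd_init` is written to the pidfd slot before calling clone().
 */
static struct probe_result probe_clone(int flags, int pidfd_init)
{
    struct probe_result res = { .err = 0, .pidfd = pidfd_init };
    char *stack;
    pid_t pid;

    /* The low byte of the flags is the exit signal; SIGCHLD is added here. */
    assert((flags & CSIGNAL) == 0);

    stack = malloc(PROBE_STACK_SIZE);
    if (!stack) {
        res.err = ENOMEM;
        return res;
    }

    /* The stack grows downwards on aarch64 and x86, so pass its top. */
    pid = clone(child_fn, stack + PROBE_STACK_SIZE, flags | SIGCHLD,
                NULL, &res.pidfd);
    if (pid < 0) {
        res.err = errno;
    } else {
        waitpid(pid, NULL, 0);
        /* Close the pidfd if the kernel wrote one; the number in
         * res.pidfd is kept for reporting only. */
        if ((flags & CLONE_PIDFD) && res.pidfd >= 0 &&
            res.pidfd != pidfd_init)
            close(res.pidfd);
    }

    /* Without CLONE_VM the child runs on its own copy of the stack. */
    free(stack);
    return res;
}
```

Calling `probe_clone()` with the five CLONE_NEW* flags, and then with `CLONE_PIDFD` added, once with `pidfd_init` set to 0 and once set to -1, and comparing the `err` fields would show whether the flag alone or the initial slot value triggers the failure. The namespace flags require root (CAP_SYS_ADMIN).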

To confirm that CLONE_PIDFD could be the cause in LXC, I removed the CLONE_PIDFD flag from the following call in the lxc_spawn function of start.c (shown here as it is before the change):

lxc_raw_clone_cb(do_start, handler,
                 CLONE_PIDFD | handler->ns_on_clone_flags, &handler->pidfd);

Observation:
With this change, the container no longer reported the error and went on to set up the container environment as described in the config file. Finally, it started the init process of the Android user space inside the container.

Notes:
I have not tested this workaround thoroughly yet; I will do that as a next step. I think CLONE_PIDFD plays a role here, but I am not sure whether this workaround has side effects on the behaviour of the LXC container or of the OS hosted inside it.
I would like to hear your thoughts on this observation and its possible causes.
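
A less invasive variant of the workaround might be to keep CLONE_PIDFD and retry without it only when the kernel rejects the call. The following is a standalone sketch of that idea using the glibc clone() wrapper. It is not LXC code: `clone_pidfd_fallback`, `noop_child` and `FALLBACK_STACK_SIZE` are hypothetical names, and retrying on EINVAL is an assumption based on the error seen in the log above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000 /* value from the kernel UAPI headers */
#endif

#define FALLBACK_STACK_SIZE (64 * 1024)

/* Trivial child used for trying the wrapper out. */
static int noop_child(void *arg)
{
    (void)arg;
    return 0;
}

/*
 * Try clone() with CLONE_PIDFD first; if the kernel answers EINVAL,
 * retry once without it. Returns the child's PID or -1 with errno set.
 * *pidfd is set to -1 when no pidfd was obtained.
 */
static pid_t clone_pidfd_fallback(int (*fn)(void *), void *arg,
                                  int ns_flags, int *pidfd)
{
    char *stack, *top;
    pid_t pid;
    int saved_errno;
    int fd = 0; /* start from zero in case the kernel checks the slot */

    /* The low byte of the flags is the exit signal; SIGCHLD is added here. */
    assert((ns_flags & CSIGNAL) == 0);

    stack = malloc(FALLBACK_STACK_SIZE);
    if (!stack) {
        errno = ENOMEM;
        return -1;
    }
    top = stack + FALLBACK_STACK_SIZE; /* the stack grows downwards */

    pid = clone(fn, top, ns_flags | CLONE_PIDFD | SIGCHLD, arg, &fd);
    if (pid < 0 && errno == EINVAL)
        pid = clone(fn, top, ns_flags | SIGCHLD, arg);
    saved_errno = errno;

    /* A slot that is still 0 is treated as "no pidfd". This is ambiguous
     * only if descriptor 0 was closed before the call. */
    *pidfd = (pid > 0 && fd > 0) ? fd : -1;

    /* Without CLONE_VM the child runs on its own copy of the stack. */
    free(stack);
    errno = saved_errno;
    return pid;
}
```

The caller still has to reap the child, for example with `waitpid()`, and close the returned pidfd when it is not -1. If the EINVAL comes from the namespace flags rather than from CLONE_PIDFD, the retry fails with the same error, so the fallback does not hide unrelated problems.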

Thank you.
-Swapnil