Environment: Linux buildroot 5.10.131
LXC version: lxc-4.0.12
I tried to run an unprivileged container started by a non-root user and ran into some permission issues. While the network was being set up, I saw these error messages:
lxc-start test_unpri 19700101000048.176 WARN start - start.c:lxc_spawn:1835 - Operation not permitted - Failed to allocate new network namespace id
lxc-start test_unpri 19700101000048.179 INFO network - network.c:lxc_create_network_unpriv_exec:2949 - Execing lxc-user-nic create /usr/var/lib/lxc_unpri/default/.local/share/lxc test_unpri 6855 veth lxcbr0 (null)
lxc-start test_unpri 19700101000048.276 ERROR network - network.c:lxc_create_network_unpriv_exec:2977 - lxc-user-nic failed to configure requested network: cmd/lxc_user_nic.c: 474: instantiate_veth - Operation not permitted - Failed to create veth1000_MD05-veth1000_MD05p
This is a missing-capability issue: the operation requires cap_net_admin, cap_sys_admin, …, etc.
After I added the missing capabilities, most of the permission issues went away, but one permission error remained:
lxc-start test_unpri 19700101000138.372 INFO network - network.c:lxc_create_network_unpriv_exec:2949 - Execing lxc-user-nic create /usr/var/lib/lxc_unpri/default/.local/share/lxc test_unpri 6849 veth lxcbr0 (null)
lxc-start test_unpri 19700101000138.591 ERROR network - network.c:lxc_create_network_unpriv_exec:2977 - lxc-user-nic failed to configure requested network: cmd/lxc_user_nic.c: 886: lxc_secure_rename_in_ns - Operation not permitted - Failed to setns() to original network namespace of PID 3
This seemed really weird to me: setns() requires cap_sys_admin, and I had already added cap_sys_admin, so I didn't understand why the capability had been lost.
I investigated and found that the kernel can clear a process's capabilities when the process changes its UIDs (see capabilities(7)). In my case the call is setresuid().
See lxc-4.0.12/src/lxc/cmd/lxc_user_nic.c:
static char *lxc_secure_rename_in_ns(int pid, char *oldname, char *newname,
                                     int *container_veth_ifidx)
{
    ...
    ret = setresuid(ruid, ruid, 0); // ++TY: capabilities are cleared when setresuid() is called
    if (ret < 0) {
        CMD_SYSERROR("Failed to drop privilege by setting effective user id and real user id to %d, and saved user ID to 0\n", ruid);
        /*
         * It's ok to jump to do_full_cleanup here since setresuid()
         * will succeed when trying to set real, effective, and saved
         * to values they currently have.
         */
        goto out_setns;
    }
    ...
out_setns:
    ret = setns(ofd, CLONE_NEWNET); // ++TY: and it finally fails here for lack of cap_sys_admin
    if (ret < 0)
        return cmd_error_errno(NULL, errno, "Failed to setns() to original network ...");
}
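One way to confirm where the capability disappears is to query the effective set just before and just after each setresuid() call. The following is only a minimal sketch: has_effective_cap() is a hypothetical debugging helper, not an LXC function, and it assumes the Linux kernel headers are available so that the raw capget system call can be used without libcap.

```c
#include <linux/capability.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical debugging helper (not part of LXC).
 * Returns 1 if `cap` is in the calling thread's effective set,
 * 0 if it is not, and -1 if the argument is out of range or capget fails.
 */
static int has_effective_cap(int cap)
{
	struct __user_cap_header_struct hdr;
	struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

	if (cap < 0 || cap >= 32 * _LINUX_CAPABILITY_U32S_3)
		return -1;

	memset(&hdr, 0, sizeof(hdr));
	memset(data, 0, sizeof(data));
	hdr.version = _LINUX_CAPABILITY_VERSION_3;
	hdr.pid = 0; /* 0 means the calling thread */

	if (syscall(SYS_capget, &hdr, data) < 0)
		return -1;

	/* Each array element carries 32 capability bits. */
	return (int)((data[cap / 32].effective >> (cap % 32)) & 1u);
}
```

Logging has_effective_cap(CAP_SYS_ADMIN) around each UID change in lxc_secure_rename_in_ns() should show which call actually drops the capability.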
If I add prctl(PR_SET_KEEPCAPS, prctl_arg(1)) before any setresuid() call, I keep all capabilities and can launch my unprivileged container successfully, but I am not sure whether this is a proper fix.
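For reference, the workaround can be sketched as a small standalone helper. This is an illustration under assumptions, not the LXC code: enable_keepcaps() is a hypothetical name, and the sketch only sets the "keep capabilities" flag and reads it back. According to prctl(2) and capabilities(7), the flag is reset on execve(), and even with it set the effective set is still cleared when the effective UID changes from 0 to a nonzero value, so it may not cover every privilege-dropping path.

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper (not part of LXC).
 * Sets the calling thread's "keep capabilities" flag so that the
 * permitted set survives a switch of all UIDs to nonzero values.
 * Returns 0 if the flag is set afterwards, -1 otherwise.
 */
static int enable_keepcaps(void)
{
	if (prctl(PR_SET_KEEPCAPS, 1L, 0L, 0L, 0L) < 0)
		return -1;

	/* PR_GET_KEEPCAPS returns the current value of the flag (0 or 1). */
	return prctl(PR_GET_KEEPCAPS, 0L, 0L, 0L, 0L) == 1 ? 0 : -1;
}
```

The helper would have to be called before the first setresuid() in the process, and its return value checked, since the call fails if the flag has been locked through the securebits.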
Could anyone give me any thoughts or suggestions?
Thanks in advance,
TengYang