I’ve installed Ubuntu 18.04 with LXD 3.0.1 from packages. I’ve created an unprivileged container and want to set net.ipv4.ping_group_range inside it. This used to work on Ubuntu 16.04:
static int ipv4_ping_group_range(struct ctl_table *table, int write,
                                 void __user *buffer,
                                 size_t *lenp, loff_t *ppos)
{
    ...
    if (write && ret == 0) {
        low = make_kgid(user_ns, urange[0]);
        high = make_kgid(user_ns, urange[1]);
        if (!gid_valid(low) || !gid_valid(high) ||
            (urange[1] < urange[0]) || gid_lt(high, low)) {
            low = make_kgid(&init_user_ns, 1);
            high = make_kgid(&init_user_ns, 0);
        }
        set_ping_group_range(table, low, high);
    }
    return ret;
}
So what this means is that if either the minimum or maximum GID value in the specified range is not valid inside the user namespace, the kernel will (silently) set the sysctl’s value to the range “1 0” from the init user namespace (IMO, it should return an error in this situation).
After the write has silently failed and you read back the sysctl value, the kernel does something silly by reporting that the min and max values of the GID range are the overflow gid (DEFAULT_OVERFLOWGID in the source code) since the actual sysctl value doesn’t map to a valid GID range inside the container. This is why you see 65534 65534 when reading the sysctl from inside the 18.04 container.
I suspect that in your 16.04 container, 429296729 is a valid GID, and that your 18.04 container is configured differently, so that 2000000 is not a valid GID inside the container.
@stgraber’s request for the gid_map contents will give us useful information. Please include the gid_map contents from both containers.
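Once you have the gid_map contents, checking whether a particular GID is mapped is mechanical: each line is "inside outside count", and a GID is valid inside the container if it falls in [inside, inside + count) for some line. A minimal sketch of that check follows; the function name gid_is_mapped is hypothetical, and the caller is assumed to have already read the file contents (e.g. from /proc/self/gid_map inside the container) into a NUL-terminated string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Returns true if `gid` (as seen inside the namespace) is covered by any
 * "inside outside count" line in `map_text`, which holds the text of a
 * gid_map file. Parsing stops at the first line that does not match.
 */
static bool gid_is_mapped(const char *map_text, unsigned long gid)
{
    assert(map_text != NULL);

    const char *p = map_text;
    unsigned long inside, outside, count;
    int consumed = 0;

    /* %lu skips leading whitespace and newlines; %n records how far we got. */
    while (sscanf(p, "%lu %lu %lu%n", &inside, &outside, &count, &consumed) == 3) {
        if (gid >= inside && gid - inside < count)
            return true;
        p += consumed;
    }
    return false;
}
```

For example, with an assumed map of "0 100000 65536", gid_is_mapped reports false for 2000000, while with a count of 1000000000 it reports true. The outside (host) start value does not affect the result.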
My guess is that your 16.04 system is using the snap and your 18.04 system is using the deb, which would explain the difference in range size.
On your 18.04 system you could edit /etc/subuid and /etc/subgid and bump the range size from 65536 to 1000000000, which would then match the snap setup (this will require a restart of the LXD daemon).
I’ve submitted an upstream kernel fix for this issue that would have made the situation easier for @oms-kauz to debug by making it clear that the sysctl value being written was invalid: