Incus OCI routed/ipvlan network issue

Hello,

I ran into an issue while configuring a second network interface. I started with nictype macvlan, which works perfectly but has the drawback that other containers can’t reach it. So I tried alternatives like routed and ipvlan, which both failed in a similar way:

routed device:
devices:
  eth1:
    ipv4.address: 111.222.333.444
    name: eth1
    nictype: routed
    parent: eth0
    type: nic

lxc s6demo 20240808082515.229 WARN     idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:165 - newuidmap binary is missing
lxc s6demo 20240808082515.229 WARN     idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:171 - newgidmap binary is missing
lxc s6demo 20240808082515.229 WARN     idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:165 - newuidmap binary is missing
lxc s6demo 20240808082515.229 WARN     idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:171 - newgidmap binary is missing
lxc s6demo 20240808082515.692 ERROR    utils - ../src/lxc/utils.c:run_buffer:571 - Script exited with status 1
lxc s6demo 20240808082515.692 ERROR    start - ../src/lxc/start.c:lxc_spawn:1917 - Failed to run lxc.hook.start-host
lxc s6demo 20240808082515.705 WARN     network - ../src/lxc/network.c:lxc_delete_network_priv:3674 - Failed to rename interface with index 0 from "eth0" to its initial name "vethcd1c6e5e"
lxc s6demo 20240808082515.713 WARN     network - ../src/lxc/network.c:lxc_delete_network_priv:3674 - Failed to rename interface with index 0 from "eth1" to its initial name "vethe1bfef2f"
lxc s6demo 20240808082515.713 ERROR    lxccontainer - ../src/lxc/lxccontainer.c:wait_on_daemonized_start:837 - Received container state "ABORTING" instead of "RUNNING"
lxc s6demo 20240808082515.714 ERROR    start - ../src/lxc/start.c:__lxc_start:2114 - Failed to spawn container "s6demo"
lxc s6demo 20240808082515.714 WARN     start - ../src/lxc/start.c:lxc_abort:1037 - No such process - Failed to send SIGKILL via pidfd 17 for process 976083
lxc 20240808082515.972 ERROR    af_unix - ../src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20240808082515.972 ERROR    commands - ../src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_init_pid"

ipvlan device:
devices:
  eth1:
    ipv4.address: 111.222.333.444
    name: eth1
    nictype: ipvlan
    parent: eth0
    type: nic

lxc s6demo 20240808083435.130 WARN     idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:165 - newuidmap binary is missing
lxc s6demo 20240808083435.130 WARN     idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:171 - newgidmap binary is missing
lxc s6demo 20240808083435.131 WARN     idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:165 - newuidmap binary is missing
lxc s6demo 20240808083435.131 WARN     idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:171 - newgidmap binary is missing
lxc s6demo 20240808083435.511 ERROR    utils - ../src/lxc/utils.c:run_buffer:571 - Script exited with status 1
lxc s6demo 20240808083435.511 ERROR    start - ../src/lxc/start.c:lxc_spawn:1917 - Failed to run lxc.hook.start-host
lxc s6demo 20240808083435.519 WARN     network - ../src/lxc/network.c:lxc_delete_network_priv:3674 - Failed to rename interface with index 0 from "eth0" to its initial name "vethe472eb2b"
lxc s6demo 20240808083435.523 ERROR    lxccontainer - ../src/lxc/lxccontainer.c:wait_on_daemonized_start:837 - Received container state "ABORTING" instead of "RUNNING"
lxc s6demo 20240808083435.524 ERROR    start - ../src/lxc/start.c:__lxc_start:2114 - Failed to spawn container "s6demo"
lxc s6demo 20240808083435.524 WARN     start - ../src/lxc/start.c:lxc_abort:1037 - No such process - Failed to send SIGKILL via pidfd 17 for process 983306
lxc 20240808083435.888 ERROR    af_unix - ../src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20240808083435.888 ERROR    commands - ../src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_init_pid"

The exact same device setup works with debian/12 or other distribution images as a container. Bridged obviously works :wink:

Has anyone tested this or have an idea how to fix it?

It will most likely need some special logic to not attempt to run our DHCP client when using routed or ipvlan as those are obviously incompatible with it.
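As a rough illustration of that kind of special-casing, here is a minimal Python sketch. It is not Incus code: the function name `needs_dhcp_client` and the dict layout (which simply mirrors the device config posted above) are assumptions made for the example.

```python
# Hypothetical sketch, not the actual Incus implementation.
# NIC types this thread describes as incompatible with a DHCP client.
STATIC_ONLY_NICTYPES = {"routed", "ipvlan"}


def needs_dhcp_client(device: dict) -> bool:
    """Return True if starting a DHCP client for this device makes sense."""
    if device.get("type") != "nic":
        # Not a network interface at all, nothing to configure.
        return False
    return device.get("nictype") not in STATIC_ONLY_NICTYPES


# Same keys as the eth1 device from the original post.
eth1 = {
    "type": "nic",
    "nictype": "routed",
    "parent": "eth0",
    "name": "eth1",
    "ipv4.address": "111.222.333.444",
}
print(needs_dhcp_client(eth1))
```

With a check like this, a routed or ipvlan device would be skipped while a macvlan or bridged device would still get DHCP.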

Thanks for looking into it @stgraber.

It will probably take a while until a fix is available?

For now I can probably work around it by converting the OCI image into a regular container.

It looks like we are one step forward but also one step back :thinking:

I installed 6.4 and tested again. I’m now able to start an OCI container with ipvlan as a second interface, which is great. The side effect is that resolv.conf is now empty, and you can’t edit it because it is bind-mounted.

It looks like this needs some more thought to find a better solution?

We may be able to unmount /etc/resolv.conf on failure, will have to investigate.

There are a couple of options…

It makes sense to unmount if DHCP fails or you have defined a static IP and you have multiple interfaces configured.

In my case I have two interfaces and I would prefer to keep the default eth0 (Incus DHCP) version of resolv.conf. Having a flag or option for this would be great.

Being able to supply resolv.conf as a parameter or file for OCI containers would also be nice. The same applies to defining static IPs. That isn’t straightforward for some OCI images, as you need to know which package to install etc.

Again, I appreciate your efforts and enjoy using incus.

The DHCP client only handles eth0 so /etc/resolv.conf would only get unmounted if we failed to do DHCP on that.

Understood, but why was it empty after I added a second device using ipvlan?

It behaves differently if the second device is a macvlan. In that case it wasn’t overwritten, but I haven’t checked that with 6.4. I’ll try to get this tested later this week.

My guess is that ipvlan must set up a default route which then prevents our DHCP client from setting its own default route and causes it to fail before the DNS is configured.
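One way to check this guess would be to look at which default routes already exist inside the container before DHCP runs. The sketch below parses text in the format printed by `ip -4 route show`; the helper name `parse_default_routes` and the sample output are made up for illustration and do not come from a real container.

```python
# Hypothetical helper: list the default routes found in `ip route` output.
def parse_default_routes(ip_route_output: str) -> list[dict]:
    routes = []
    for line in ip_route_output.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] != "default":
            continue  # only default routes are of interest here
        route = {"dev": None, "via": None}
        # Walk consecutive token pairs, e.g. ("dev", "eth1").
        for key, value in zip(tokens[1:], tokens[2:]):
            if key in route:
                route[key] = value
        routes.append(route)
    return routes


# Illustrative sample only: a default route on eth1 and none on eth0.
SAMPLE = """\
default dev eth1
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.5
"""

for route in parse_default_routes(SAMPLE):
    print(route["dev"], route["via"])
```

If a default route on eth1 shows up before eth0 has one, that would be consistent with the explanation above, though it would not prove it.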