OCI Docker app can't ping other containers in network by hostname

So I'm trying to run guacd (from guacamole/guacd) on my cluster network, and the app is running with an address it gets from the OVN backend. I've noticed in the past that, for some reason, I can't ping the other containers on the same OVN network when they are running as the Docker version.

Meaning that if I try to run ping kcisne04 from guacd-1, it simply won't work and says ping: bad address. I also can't ping guacd-1 from kcisne04 by hostname. But if I create kcisne04-2, I have no issue running ping from either container to the other (i.e. I can run ping kcisne04 from kcisne04-2 without issue, and vice versa)…

The Docker container can ping websites (e.g. google.com), and it can ping other containers if I provide their IP address. I would like to not have to provide the IP address, and I'm curious whether there's anything I can do to fix this?
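
A quick way to narrow the failure down from inside the failing container (the IP address is hypothetical; substitute kcisne04's real one):

ping -c 1 10.5.129.20   # by IP: works, so routing is fine
ping -c 1 kcisne04      # by name: fails with "ping: bad address"
nslookup kcisne04       # shows which nameserver is queried and what it returns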

I'm also using the newest version of Incus as of 06/21/2024… I just updated the cluster today.

Also of note, it looks like the Docker image uses Alpine Linux.

What do you have in /etc/resolv.conf inside the container?
Do you get the same behavior if you ping kcisne04.incus instead of kcisne04?

@stgraber I get the same behavior if I append .incus, too.

My resolv.conf has …

nameserver 1.1.1.1
search incus

@stgraber I see that I also can't ping my other containers by hostname if I run any regular non-Docker containers with Alpine as the base image.

Is there something I can do to fix this? Is Alpine missing something?
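
One Alpine-specific detail that may matter here: Alpine uses musl libc, whose resolver queries all configured nameservers in parallel and takes the first answer, unlike glibc, which tries them in order. To separate the libc resolver path from the server itself, something like this (10.5.129.1 is a hypothetical OVN gateway address):

getent hosts kcisne04                # goes through the musl resolver via /etc/resolv.conf
nslookup kcisne04.incus 10.5.129.1   # queries one specific server directly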

I’m seeing the same thing on my OVN cluster with Incus 6.13 on Debian 12.

I was reproducing the issue fairly consistently, but I had to move on to other things, so I just reverted to Macvlan. What I found was that the DNS interception for the .incus domain seemed to stop working after containers were restarted. This is what I tried a few times (roughly sketched in commands after the list):

  1. Build a new OVN uplink and managed network.
  2. Launch a few containers and check that nslookup or ping works on the .incus domain.
  3. Restart the containers.
  4. Check again: now NXDOMAIN is returned for all .incus names by the upstream DNS server.
  5. Delete the containers and the OVN managed network (leaving the uplink) and build a new one.
  6. Launch containers; DNS works again.
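
A rough sketch of that sequence with the incus CLI (names and values are hypothetical, not my exact commands):

# 1-2: build an uplink and an OVN network, then two test containers
# (plus ipv4.gateway / ipv4.ovn.ranges on the uplink for your subnet)
incus network create UPLINK --type=physical parent=eno1
incus network create ovntest --type=ovn network=UPLINK
incus launch images:debian/12 c1 --network ovntest
incus launch images:debian/12 c2 --network ovntest
incus exec c1 -- getent hosts c2.incus    # resolves at this point

# 3-4: restart, then the same lookup comes back NXDOMAIN
incus restart c1 c2
incus exec c1 -- getent hosts c2.incus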

I checked as much as I could to make sure OVN was working: SB/NB DBs and OVS tunnels were up across all the hosts, and the northbound DB DNS table had all the entries. I just ran out of things to check and had to move on. The one thing that did stick out was that I'm using the Debian 12 distro packages for OVN, which appear to be a few versions behind. I tried to find pre-built packages for Debian, but they seemed pretty hard to come by, and I didn't have time to build my own.
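
For reference, these are the kinds of checks I mean, all standard OVN/OVS tooling:

ovn-nbctl list DNS   # northbound DNS table: should have records for each instance
ovn-nbctl show       # logical switches and routers
ovn-sbctl show       # chassis and port bindings
ovs-vsctl show       # Geneve tunnels and BFD state between hosts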

Your instance clearly uses a public DNS server, which has no idea about local instance names. What does the config of the OVN network look like? You might need to specify the correct DNS…

@osch

Here is the network I made from the uplink…

And here is my UPLINK for that network…

What's odd is that I do have 1.1.1.1 as DNS on the uplink, but it works fine for my other containers…

@localgrp did you install via the Zabbly GitHub method? @stgraber is this a known issue on Debian?

EDIT: I just changed the DNS to my UPLINK address and added DNS 1.1.1.1 on my firewall (which is acting as my router/fw/dns)… I'm using split DNS, and I set it up so that on that interface (which is a VLAN) the DNS is 1.1.1.1 on my firewall. Again, my regular containers have no trouble pinging each other, but these Docker OCI containers still won't work.

4.0.4.1 is my router

That is correct, and it confirms that your OVN will take the settings from the UPLINK, in this case 1.1.1.1 as DNS, which, as stated, has no idea about the internal Incus instance names.

If no DNS is defined on your OVN network, it will use the default from your UPLINK. If you want DNS resolution for both internal and public names, you need to configure your OVN network to use the local DNS (dnsmasq) that Incus starts on your host. Just point it at your host IP.

What you need to do is change the DNS server for your OVN network and point it at your host IP, so that it can resolve Incus hosts and public hosts at the same time. Check out the dns.nameservers setting on the OVN network.
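
For example, assuming the network names shown later in this thread (the secureAI OVN network in the research-labs project) and a hypothetical host IP of 4.0.2.5:

incus network set secureAI dns.nameservers=4.0.2.5 --project research-labs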

Also review your incusbr0 network settings, which use exactly the same DNS.

There is nothing wrong with Zabbly or Debian. It is all about the settings you need to configure. By default, Incus uses incusbr0, which handles this automatically, but OVN requires manual tweaks to work exactly the same way.

@osch I don't use incusbr0 at all; everything is on OVN networks, which sit on VLANs.

I have Mellanox cards, so I have my network set up like so…

# Setup OVS MGMT Layer
ip link set eth0 up
ip link set enp65s0f0v0 name ovs-mgmt
ip addr add 4.0.0.5/24 brd + dev ovs-mgmt
ip link set ovs-mgmt up

# Setup Linstor MGMT Layer
ip link set eth1 up
ip link set enp65s0f0v1 name linstor-mgmt
ip addr add 4.0.1.5/24 brd + dev linstor-mgmt
ip link set linstor-mgmt up

# Setup Incus MGMT Layer
ip link set eth2 up
ip link set enp65s0f0v2 name incus-mgmt
ip addr add 4.0.2.5/24 brd + dev incus-mgmt
ip link set incus-mgmt up

# Setup admin_labs interface
ip link set eth3 up
ip link set enp65s0f0v3 name admin-labs
ip link set admin-labs up

# Setup research_labs interface
ip link set eth4 up
ip link set enp65s0f0v4 name research-labs
ip link set research-labs up

These are set up like this on all 3 nodes I use in my cluster…

This is my ovs-vsctl show output:

root@r620:/home/mihai# ovs-vsctl show
aeb3c900-2927-4cc2-803c-c3a51255d0da
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port ovn-6880a7-0
            Interface ovn-6880a7-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="4.0.0.10"}
                bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
        Port veth799b26f2
            Interface veth799b26f2
        Port patch-br-int-to-incus-net9-ls-ext-lsp-provider
            Interface patch-br-int-to-incus-net9-ls-ext-lsp-provider
                type: patch
                options: {peer=patch-incus-net9-ls-ext-lsp-provider-to-br-int}
        Port ovn-9f0c53-0
            Interface ovn-9f0c53-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="4.0.0.8"}
        Port patch-br-int-to-incus-net8-ls-ext-lsp-provider
            Interface patch-br-int-to-incus-net8-ls-ext-lsp-provider
                type: patch
                options: {peer=patch-incus-net8-ls-ext-lsp-provider-to-br-int}
        Port br-int
            Interface br-int
                type: internal
        Port ovn-0a8972-0
            Interface ovn-0a8972-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="4.0.0.9"}
        Port vethe64ac66a
            Interface vethe64ac66a
        Port ovn-2139ff-0
            Interface ovn-2139ff-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="4.0.0.5"}
                bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
        Port ovn-19b0bc-0
            Interface ovn-19b0bc-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="4.0.0.7"}
    Bridge incusovn2
        Port incusovn2
            Interface incusovn2
        Port patch-incus-net8-ls-ext-lsp-provider-to-br-int
            Interface patch-incus-net8-ls-ext-lsp-provider-to-br-int
                type: patch
                options: {peer=patch-br-int-to-incus-net8-ls-ext-lsp-provider}
        Port patch-incus-net9-ls-ext-lsp-provider-to-br-int
            Interface patch-incus-net9-ls-ext-lsp-provider-to-br-int
                type: patch
                options: {peer=patch-br-int-to-incus-net9-ls-ext-lsp-provider}
        Port research-labs
            Interface research-labs
    ovs_version: "3.1.0"
root@r620:/home/mihai#

@osch I changed my dns.nameservers to point at my router, which is 4.0.4.1, and it still isn't working.

If this is my OVN network…

Should the dns.nameservers on this network be the volatile.network.ipv4.address (i.e. 4.0.4.3)?
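
One way to answer that empirically is to query each candidate address directly from inside a container (10.5.129.1 is the network's internal ipv4.address, 4.0.4.3 its volatile uplink address, both from the configs below):

nslookup kcisne04.incus 10.5.129.1
nslookup kcisne04.incus 4.0.4.3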

Check whether you have any dnsmasq process running on your host. I guess not, as you don't have any incusbr0 configured. As far as I know, OVN doesn't spin up a DNS server, so you might need to provide your own.
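
A quick way to check, with standard tools on the host:

pgrep -af dnsmasq        # any dnsmasq processes running at all?
ss -ulpn | grep ':53'    # anything listening on the DNS port?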

But why does it work with regular non-OCI containers? Could it be something else?

It doesn't work with any OCI containers or any of the Alpine images…

Right, OCI and Alpine…

I remember having some issues with them at some stage. By any chance, do you have static/stateless IPs configured? That caused me a lot of trouble: containers didn't get any IPs, DNS failed sometimes, etc. As soon as I changed OVN to not use stateless/static IPs, it all went away.

Haven’t really revisited it since than but it could be also completely unrelated to your issue.

Did you check what kind of DNS your non-Alpine, non-OCI containers receive during startup? They shouldn't be getting 1.1.1.1 if they are able to resolve their Incus names…

@osch
I'm not sure if I have stateless IPs; I haven't set that up… How would I be able to tell?

If you can’t remember forget about it. Not requires to worry about.

So, interestingly, I took the dns.nameservers off the OVN network and just left it on the uplink…

My UPLINK looks like this…


config:
  dns.nameservers: 4.0.4.1
  ipv4.gateway: 4.0.4.1/28
  ipv4.ovn.ranges: 4.0.4.2-4.0.4.14
  volatile.last_state.created: "false"
description: ""
name: UPLINK_research-labs
type: physical
used_by:
- /1.0/networks/secureAI?project=research-labs
- /1.0/networks/bomdas?project=research-labs
managed: true
status: Created
locations:
- supermicro
- r620
- gigabyte
project: default

And my OVN network, created with the uplink as its parent, looks like this…

config:
  bridge.mtu: "1500"
  ipv4.address: 10.5.129.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:cb3c:a9b4:7b5c::1/64
  ipv6.nat: "true"
  network: UPLINK_research-labs
  volatile.network.ipv4.address: 4.0.4.3
description: ""
name: secureAI
type: ovn
used_by:
- /1.0/instances/ajoshi16?project=research-labs
- /1.0/instances/caring-wasp?project=research-labs
- /1.0/instances/guacamole-secureAI?project=research-labs
- /1.0/instances/test?project=research-labs
managed: true
status: Created
locations:
- gigabyte
- supermicro
- r620
project: research-labs

But now I don't have any outside WAN connectivity… I can ping internally, though.
My firewall is set to 1.1.1.1 for its DNS servers, but Incus doesn't seem to pick that up, so there's no name resolution for things like google.com.

This is also my resolv.conf from a regular non-OCI container:

# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0 trust-ad
search .

And here is my resolv.conf from an OCI container:

root@caring-wasp:~# cat /etc/resolv.conf
nameserver 4.0.4.1
domain incus
search incus
nameserver fd42:cb3c:a9b4:7b5c::1
search incus

How about setting this to 4.0.4.1, 1.1.1.1? Not 100% sure it actually works, but it's worth a try.
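
If you try that, dns.nameservers takes a comma-separated list, so on the uplink it would look something like this (untested on my side):

incus network set UPLINK_research-labs dns.nameservers=4.0.4.1,1.1.1.1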

Run resolvectl status in your container to see the real DNS servers behind the systemd stub resolver.
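
For example, inside the container:

resolvectl status   # per-link upstream DNS servers currently in use
resolvectl dns      # compact listing of the DNS servers per link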