Internal DNS with OVN

I’m experimenting with OVN networks on my Incus lab.

The box has a single VLAN aware Ethernet port for data.

That port is used by an uplink network:

incus network show uplink14
config:
  dns.nameservers: 172.29.14.1
  ipv4.gateway: 172.29.14.1/24
  ipv4.ovn.ranges: 172.29.14.128-172.29.14.250
  parent: enp2s0
  vlan: "14"
  volatile.last_state.created: "true"
description: VLAN 14 uplink
name: uplink14
type: physical
used_by:
- /1.0/networks/ovn-test?project=test
managed: true
status: Created
locations:
- none
project: default

I use OpenTofu to create an OVN network connected to uplink14. I’m using OVN because I want the networks to be part of a project, and apparently a bridge can only be part of the default project:

incus --project test network show ovn-test
config:
  bridge.mtu: "1500"
  dns.domain: test.internal.
  dns.search: test.internal
  ipv4.address: 10.225.114.1/24
  ipv4.dhcp: "true"
  ipv4.nat: "true"
  ipv6.address: none
  network: uplink14
  volatile.network.ipv4.address: 172.29.14.128
description: OVN for test project
name: ovn-test
type: ovn
used_by:
- /1.0/instances/a1?project=test
- /1.0/instances/a2?project=test
- /1.0/instances/test?project=test
- /1.0/profiles/default?project=test
managed: true
status: Created
locations:
- none
project: test
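
For anyone wanting to reproduce this without OpenTofu, the hand-rolled equivalent should be roughly the following (a sketch based on the config shown above; the exact flags and keys may differ slightly):

incus network create ovn-test --project test --type ovn \
    network=uplink14 \
    ipv4.address=10.225.114.1/24 ipv4.nat=true ipv6.address=none \
    dns.domain=test.internal dns.search=test.internal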

Finally, OpenTofu created a profile that uses the above OVN network, plus an OCI Alpine Linux container to test from.

I’ve also manually added a couple of Alpine system containers to the project to help with testing: a1 and a2.
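
For anyone following along, there is nothing special about how they were created; something along these lines works (the image alias here is just an example):

incus launch images:alpine/edge a1 --project test
incus launch images:alpine/edge a2 --project test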

Running ovn-sbctl list DNS shows that the hosts are registered as records:

_uuid               : a3f6e0cc-12f8-4c55-bc7b-f611878f56af
datapaths           : [cca6963f-f06f-4b56-b642-cb60c913cad8]
external_ids        : {dns_id="02730897-93fe-4301-9507-4a55eabfd03f"}
records             : {a1.test.internal.="10.225.114.2"}

_uuid               : 61a7d1e2-ce89-49df-a0ba-04cd5653e07f
datapaths           : [cca6963f-f06f-4b56-b642-cb60c913cad8]
external_ids        : {dns_id="5752559d-d97a-4bff-ae65-b92c446ef52c"}
records             : {a2.test.internal.="10.225.114.4"}
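
For completeness, the matching Northbound entries can be listed the same way (I’m assuming the standard OVN table and column names here):

ovn-nbctl list DNS
ovn-nbctl list Logical_Switch | grep -E "^name|dns_records"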

The problem is that I can’t resolve those names from anywhere. The containers don’t resolve those names, and neither does the Incus host itself.

Where am I supposed to be able to resolve these?

Thanks in advance.


The Incus host not being able to resolve them is normal; containers on that network not being able to resolve them isn’t quite as normal.

Though it also depends on exactly what the container does. The way OVN DNS works is that it intercepts UDP DNS requests headed to the DNS server and answers them itself for the records it has in the DNS Southbound table.

So if you’re doing DNS-over-HTTPS or even DNS-over-TCP, that may not get the expected DNS records.
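
A quick way to rule that out is to force a plain UDP query (and compare with TCP) from inside the container; a rough sketch, assuming dig is available (bind-tools on Alpine):

incus exec --project test a1 -- dig +notcp +short a1.test.internal @172.29.14.1
incus exec --project test a1 -- dig +tcp +short a1.test.internal @172.29.14.1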

We also had a bug that would prevent some records from making it into the Southbound database, but @presztak fixed that earlier this week. In your case, you don’t seem to be hitting that problem (yet).

Thanks for the prompt response.

The containers are just basic Alpine Linux containers at the moment, while I figure this out. They would be using basic UDP/53 for name resolution.

Their /etc/resolv.conf has:

search test.internal
nameserver 172.29.14.1

Presumably handed to them by the OVN DHCP server.
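
I assume what the OVN DHCP server hands out can be confirmed from the Northbound database with something like this (standard table name, with the values in the options column):

ovn-nbctl list DHCP_Options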

Your explanation of how it works is my (limited) understanding too, from Google. I was wondering what the next step for debugging this would be.

I can imagine this interception of UDP/53 not being that easy to investigate with standard Linux tools.

I’d probably use incus info NAME to see what the host side device is (veth) and then tcpdump that while running a DNS query inside the guest. That should confirm that the DNS query is headed to the expected DNS server, over UDP and that the record being queried matches the one in the Southbound DB.
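
In concrete terms, something along these lines (sketch only; the veth name will differ per instance):

incus info --project test a1
# replace vethXXXXXXXX with the host-side interface shown above, then query from inside the guest
tcpdump -ni vethXXXXXXXX udp port 53
incus exec --project test a1 -- nslookup a1.test.internal 172.29.14.1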

tcpdump (on a different interface to the one shown by incus info --project test a1) gives:

tcpdump  -nvvv -i any port 53
08:06:18.358750 veth1143b02a P   IP (tos 0x0, ttl 64, id 40297, offset 0, flags [DF], proto UDP (17), length 62)
    10.225.114.3.54551 > 172.29.14.1.53: [bad udp cksum 0x373e -> 0x8e44!] 37362+ A? a1.test.internal. (34)
08:06:18.360440 enp2s0.14 Out IP (tos 0x0, ttl 63, id 40297, offset 0, flags [DF], proto UDP (17), length 62)
    172.29.14.128.54551 > 172.29.14.1.53: [udp sum ok] 37362+ A? a1.test.internal. (34)
08:06:18.360447 enp2s0 Out IP (tos 0x0, ttl 63, id 40297, offset 0, flags [DF], proto UDP (17), length 62)
    172.29.14.128.54551 > 172.29.14.1.53: [udp sum ok] 37362+ A? a1.test.internal. (34)
08:06:18.361356 enp2s0 P   IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 137)
    172.29.14.1.53 > 172.29.14.128.54551: [udp sum ok] 37362 NXDomain q: A? a1.test.internal. 0/1/0 ns: . [56m13s] SOA a.root-servers.net. nstld.verisign-grs.com. 2025100500 1800 900 604800 86400 (109)
08:06:18.361356 enp2s0.14 P   IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 137)
    172.29.14.1.53 > 172.29.14.128.54551: [udp sum ok] 37362 NXDomain q: A? a1.test.internal. 0/1/0 ns: . [56m13s] SOA a.root-servers.net. nstld.verisign-grs.com. 2025100500 1800 900 604800 86400 (109)
08:06:18.362192 veth1143b02a Out IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto UDP (17), length 137)
    172.29.14.1.53 > 10.225.114.3.54551: [udp sum ok] 37362 NXDomain q: A? a1.test.internal. 0/1/0 ns: . [56m13s] SOA a.root-servers.net. nstld.verisign-grs.com. 2025100500 1800 900 604800 86400 (109)

So, it’s submitting the request to the upstream DNS server and not intercepting it.

I tried setting the log levels on OVN with ovn-appctl vlog/set dns_resolve:file:dbg and monitored /var/log/ovn/ovn-controller.log, but that didn’t show anything related to DNS. The logs do show many warnings similar to:

2025-10-05T06:40:59.434Z|14049398|patch|WARN|Bridge 'incusovn143' not found for network 'uplink14'
2025-10-05T06:40:59.434Z|14049399|patch|WARN|Bridge 'incusovn145' not found for network 'uplink14'

But I don’t know if that has any relevance.

I did the same with ovs-appctl but those logs were empty.
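
In case I was simply using the wrong module name, I believe the available log modules and their current levels can be listed with:

ovn-appctl -t ovn-controller vlog/list | grep -i dns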

Definitely looks like this should work… Clearly northd did its job since the records went from the Northbound to the Southbound database. Debugging ovn-controller is a bit trickier, as it turns those Southbound entries into local flow rules, so it’s not as easy to track things down at that level, unless you have a system running very few workloads where going through the OVS database is possible.
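
If you want to poke at that layer, grepping the Southbound logical flows should at least show whether the DNS interception stages exist for your switch (stage names from memory, so treat this as a sketch):

ovn-sbctl lflow-list | grep -iE "dns_lookup|dns_response"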

There is nothing other than these test Alpine instances on the OVN networks. The Incus node does run other workloads, but they’re not on OVN networks.

I’ve had a go at tracing, but to be honest, I don’t know what I’m doing:

ovn-nbctl show
switch ffe13614-fba2-44c4-9adf-e2d5d5698d1a (incus-net202-ls-ext)
    port incus-net202-ls-ext-lsp-router
        type: router
        router-port: incus-net202-lr-lrp-ext
    port incus-net202-ls-ext-lsp-provider
        type: localnet
        addresses: ["unknown"]
switch 7bbcbc46-b8bc-4d89-8df5-0c399efe4fd7 (incus-net202-ls-int)
    port incus-net202-ls-int-lsp-router
        type: router
        router-port: incus-net202-lr-lrp-int
    port incus-net202-instance-1d5bbbbc-81db-4221-8a0b-b17b2b3fb974-eth0
        addresses: ["10:66:6a:5b:6b:c3 dynamic"]
    port incus-net202-instance-e2ba1dc4-c448-4cf7-991f-f33e6dbcf65b-eth0
        addresses: ["10:66:6a:31:cb:e2 10.225.114.3"]
    port incus-net202-instance-5dffebf7-3f93-4e91-9f97-4935b2a1b6c1-eth0
        addresses: ["10:66:6a:80:a8:e0 dynamic"]
router 805c47fb-af5a-46f3-852f-7723396e14dd (incus-net202-lr)
    port incus-net202-lr-lrp-int
        mac: "10:66:6a:fb:ec:1b"
        networks: ["10.225.114.1/24"]
    port incus-net202-lr-lrp-ext
        mac: "10:66:6a:fb:ec:1b"
        networks: ["172.29.14.128/24"]
    nat 270ffe38-10be-4ed0-94dd-1e8f3c42c25f
        external ip: "172.29.14.128"
        logical ip: "10.225.114.0/24"
        type: "snat"

Using the MAC and IP from the container to grab the port from the output above, I followed with:

ovn-trace --minimal 'inport == "incus-net202-instance-e2ba1dc4-c448-4cf7-991f-f33e6dbcf65b-eth0" && eth.src == 10:66:6a:31:cb:e2 && eth.dst == ff:ff:ff:ff:ff:ff && ip4.dst == 172.29.14.1 && udp.dst == 53'

I received:

2025-10-05T20:49:44Z|00001|ovntrace|WARN|outport == "incus-net202-ls-int-lsp-router" && ip6.dst == <nil> && (udp.dst == 53 || tcp.dst == 53): parsing expression failed (Syntax error at `<' expecting constant.)
# udp,reg14=0x5,vlan_tci=0x0000,dl_src=10:66:6a:31:cb:e2,dl_dst=ff:ff:ff:ff:ff:ff,nw_src=0.0.0.0,nw_dst=172.29.14.1,nw_tos=0,nw_ecn=0,nw_ttl=0,nw_frag=no,tp_src=0,tp_dst=53
*** dns_lookup action not implemented;
output("incus-net202-instance-1d5bbbbc-81db-4221-8a0b-b17b2b3fb974-eth0");
output("incus-net202-instance-5dffebf7-3f93-4e91-9f97-4935b2a1b6c1-eth0");

Which I initially thought was the culprit. However, reading the OVN source suggests that the “not implemented” message is actually just the ovn-trace utility stating a fact about itself.

The WARN message is also odd and, more worryingly, it is always there. If I change tcp.dst to another port (e.g. 443) I still receive the same message. A red herring, or something to investigate?

It turns out that the issue was caused by having set the net.ipv6.conf.all.disable_ipv6 = 1 kernel parameter with sysctl. I removed this, rebooted, deleted and recreated the OVN networks (and the physical uplink network), and it started to work.
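
For anyone hitting the same thing, checking and reverting the setting is simple:

# 1 means IPv6 is disabled host-wide
sysctl net.ipv6.conf.all.disable_ipv6
# re-enable it (and drop the line from /etc/sysctl.conf or /etc/sysctl.d/ so it sticks)
sysctl -w net.ipv6.conf.all.disable_ipv6=0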

@stgraber - would this be worthy of a bug report? Arguably, it’s either a bug in the OVN implementation, if it should work with IPv6 disabled, or a bug in the documentation, which should state that IPv6 cannot be disabled.