Systemd-resolved/dnsmasq infinite loop

Hi,
I'm seeing a very weird, but fortunately reproducible, problem:

host centos.spd.co.il

causes an infinite request loop between systemd-resolved and LXD’s dnsmasq. So far I see this problem only with this specific CentOS domain. It happens whether I run the command on the host or in a container.
I have an Ubuntu 18.04 server with LXD 3.0.1.
systemd-resolved is configured to resolve the ‘lxd’ domain via the lxdbr0 interface. For example:

systemd-resolve --interface lxdbr0 --set-dns 10.159.65.1 --set-domain "lxd"
systemd-resolve --status lxdbr0
Link 4 (lxdbr0)
    Current Scopes: DNS
     LLMNR setting: yes
  MulticastDNS setting: no
  DNSSEC setting: no
      DNSSEC supported: no
      DNS Servers: 10.159.65.1
      DNS Domain: lxd

The problem doesn’t happen when I disable DNS for the lxdbr0 interface or define the domain as ‘~lxd’ (a routing-only domain).

I ran the systemd-resolved service in debug mode and found the following:

systemd-resolved[12879]: Processing query...
systemd-resolved[12879]: Got DNS stub UDP query packet for id 27109
systemd-resolved[12879]: Looking up RR for 183.128-255.163.199.212.in-addr.arpa IN PTR.
systemd-resolved[12879]: Cache miss for 183.128-255.163.199.212.in-addr.arpa IN PTR
systemd-resolved[12879]: Transaction 43470 for <183.128-255.163.199.212.in-addr.arpa IN PTR> scope dns on lxdbr0/*.
systemd-resolved[12879]: Using feature level UDP+EDNS0 for transaction 43470.
systemd-resolved[12879]: Using DNS server 10.159.65.1 for transaction 43470.
systemd-resolved[12879]: Sending query packet with id 43470.
systemd-resolved[12879]: Positive cache hit for 183.128-255.163.199.212.in-addr.arpa IN PTR
systemd-resolved[12879]: Transaction 54638 for <183.128-255.163.199.212.in-addr.arpa IN PTR> on scope dns on eno1/* now complete with <success> from cache
systemd-resolved[12879]: Freeing transaction 43470.
systemd-resolved[12879]: Freeing transaction 54638.
systemd-resolved[12879]: Sending response packet with id 27109 on interface 1/AF_INET.
systemd-resolved[12879]: Processing query...
systemd-resolved[12879]: Got DNS stub UDP query packet for id 59106
systemd-resolved[12879]: Looking up RR for 183.128-255.163.199.212.in-addr.arpa IN PTR.
systemd-resolved[12879]: Cache miss for 183.128-255.163.199.212.in-addr.arpa IN PTR
systemd-resolved[12879]: Transaction 54384 for <183.128-255.163.199.212.in-addr.arpa IN PTR> scope dns on lxdbr0/*.
systemd-resolved[12879]: Using feature level UDP+EDNS0 for transaction 54384.
systemd-resolved[12879]: Using DNS server 10.159.65.1 for transaction 54384.
systemd-resolved[12879]: Sending query packet with id 54384.
systemd-resolved[12879]: Positive cache hit for 183.128-255.163.199.212.in-addr.arpa IN PTR
systemd-resolved[12879]: Transaction 11243 for <183.128-255.163.199.212.in-addr.arpa IN PTR> on scope dns on eno1/* now complete with <success> from cache
systemd-resolved[12879]: Freeing transaction 54384.
systemd-resolved[12879]: Freeing transaction 11243.
systemd-resolved[12879]: Sending response packet with id 59106 on interface 1/AF_INET.

As you can see, the loop is related to resolving PTR records.
Any idea how to solve this problem? I have many CentOS 7 containers, and when I run ‘yum update’ inside one of them the problem starts…
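
For anyone who wants to check whether they are hitting the same loop, here is a rough sketch that counts how often each record is looked up in a captured debug log. The function name count_lookups is made up for this example, and it assumes the log lines have the same "Looking up RR for ..." format as the excerpt above:

```shell
# Hypothetical helper: read systemd-resolved debug output on stdin and
# print "<count> <record>" for every record that was looked up.
# Assumes the "Looking up RR for <name> IN <type>." line format shown above.
count_lookups() {
    awk '/Looking up RR for/ {
             sub(/.*Looking up RR for /, "")   # keep only the record part
             n[$0]++
         }
         END { for (k in n) print n[k], k }'
}

# Small demo using two lines taken from the log excerpt above.
printf '%s\n' \
    'systemd-resolved[12879]: Looking up RR for 183.128-255.163.199.212.in-addr.arpa IN PTR.' \
    'systemd-resolved[12879]: Looking up RR for 183.128-255.163.199.212.in-addr.arpa IN PTR.' \
    | count_lookups
```

On a live system the input could come from something like "journalctl -u systemd-resolved | count_lookups". A single record whose count keeps climbing is a strong hint of the loop.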

Thank you,
Leonid

Hi,
We found that the problem doesn’t happen if we define the “~lxd” domain instead of “lxd” for the lxdbr0 interface:
systemd-resolve --interface lxdbr0 --set-dns 10.159.65.1 --set-domain "~lxd"

Unfortunately, systemd-resolve also puts “~lxd” as a search domain in the /run/systemd/resolve/stub-resolv.conf file. As a result, container names can no longer be resolved on the host by their short names. For example, "host <cont_name>" does not resolve, and the container's FQDN has to be used instead: "host <cont_name>.lxd".
To fix this, we replaced the /etc/resolv.conf symbolic link with a regular file that contains the same data as /run/systemd/resolve/stub-resolv.conf, but with ‘lxd’ instead of ‘~lxd’:
nameserver 127.0.0.53
search lxd example.com
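
In case it helps, the file content can be produced with a small filter instead of editing by hand. This is only a sketch: the function name strip_routing_markers is made up for the example, and it assumes the only change needed is dropping the "~" prefix on the search line.

```shell
# Hypothetical helper: read resolv.conf-style text on stdin and remove the
# "~" routing-only marker from domains on the "search" line.
# All other lines are passed through unchanged.
strip_routing_markers() {
    sed '/^search[[:space:]]/ s/~//g'
}

# Demo with content shaped like the stub-resolv.conf described above.
printf 'nameserver 127.0.0.53\nsearch ~lxd example.com\n' | strip_routing_markers
```

On the host it could be used roughly as "strip_routing_markers < /run/systemd/resolve/stub-resolv.conf > /tmp/resolv.conf", then removing the /etc/resolv.conf symlink and copying the new file into place as root. Keep in mind that a regular file is no longer updated by systemd-resolved, so it has to be regenerated if the search domains change.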

Leonid