LXD dnsmasq rejecting queries

notetienne · December 23, 2021, 1:21am

Hi,

I’ve got LXD installed as a snap. I’ve also got a local dnsmasq installation that only listens on specific interfaces, excluding lxdbr0 (the default bridge interface).

When launching LXC containers, LXD’s dnsmasq correctly assigns IP addresses, but fails to resolve any DNS queries.

My network configuration (lxc network show lxdbr0):

config:
  ipv4.address: 10.204.141.1/24
  ipv4.dhcp: "true"
  ipv4.firewall: "false"
  ipv4.nat: "true"
  ipv6.address: fd42:87ee:235a:9776::1/64
  ipv6.firewall: "false"
  ipv6.nat: "true"
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/instances/webserver
- /1.0/profiles/default
managed: true
status: Created
locations:
- none

The DNS servers assigned to the LXC container (as reported by resolvectl) are:

  Current DNS Server: fe80::216:3eff:fe4b:3dd6
         DNS Servers: 10.204.141.1            
                      fe80::216:3eff:fe4b:3dd6

$ dig @10.204.141.1 google.com
; <<>> DiG 9.16.1-Ubuntu <<>> @10.204.141.1 google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 41226
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com.			IN	A

;; Query time: 0 msec
;; SERVER: 10.204.141.1#53(10.204.141.1)
;; WHEN: Thu Dec 23 00:26:04 UTC 2021
;; MSG SIZE  rcvd: 39

We can see that the status is REFUSED. I’ve got no firewall, and I flushed the iptables rules to be sure.

The LXD dnsmasq is running (sudo lsof -i -P -n | grep :53):

dnsmasq   1087079 dnsmasq    4u  IPv4  971159      0t0  UDP 192.168.0.108:53 
dnsmasq   1087079 dnsmasq    5u  IPv4  971160      0t0  TCP 192.168.0.108:53 (LISTEN)
dnsmasq   1087079 dnsmasq    6u  IPv4  971161      0t0  UDP 127.0.0.1:53 
dnsmasq   1087079 dnsmasq    7u  IPv4  971162      0t0  TCP 127.0.0.1:53 (LISTEN)
dnsmasq   1087079 dnsmasq    8u  IPv6  971163      0t0  UDP [fe80::6da4:88ab:f911:52e9]:53 
dnsmasq   1087079 dnsmasq    9u  IPv6  971164      0t0  TCP [fe80::6da4:88ab:f911:52e9]:53 (LISTEN)
dnsmasq   1087079 dnsmasq   10u  IPv6  971165      0t0  UDP [2002:a00:3d:1:b746:64a8:5acd:5caa]:53 
dnsmasq   1087079 dnsmasq   11u  IPv6  971166      0t0  TCP [2002:a00:3d:1:b746:64a8:5acd:5caa]:53 (LISTEN)
dnsmasq   1087079 dnsmasq   12u  IPv6  971167      0t0  UDP [2002:a00:3d:1:38e6:47f3:cacb:43d3]:53 
dnsmasq   1087079 dnsmasq   13u  IPv6  971168      0t0  TCP [2002:a00:3d:1:38e6:47f3:cacb:43d3]:53 (LISTEN)
dnsmasq   1087079 dnsmasq   14u  IPv6  971169      0t0  UDP [::1]:53 
dnsmasq   1087079 dnsmasq   15u  IPv6  971170      0t0  TCP [::1]:53 (LISTEN)
dnsmasq   1110924     lxd    8u  IPv4  983695      0t0  UDP 10.204.141.1:53 
dnsmasq   1110924     lxd    9u  IPv4  983696      0t0  TCP 10.204.141.1:53 (LISTEN)
dnsmasq   1110924     lxd   10u  IPv6  983697      0t0  UDP [fd42:87ee:235a:9776::1]:53 
dnsmasq   1110924     lxd   11u  IPv6  983698      0t0  TCP [fd42:87ee:235a:9776::1]:53 (LISTEN)

Any help would be welcome. I’ve been digging into it for a whole day and I’m simply out of ideas. Thanks

notetienne · December 23, 2021, 1:27am

In case it’s useful, here’s what ps aux shows for the dnsmasq process started by LXD:

lxd      1110924  0.0  0.0   7212  3728 ?        Ss   17:45   0:02 dnsmasq --keep-in-foreground --strict-order --bind-interfaces --except-interface=lo --pid-file= --no-ping --interface=lxdbr0 --dhcp-rapid-commit --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.204.141.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-range 10.204.141.2,10.204.141.254,1h --listen-address=fd42:87ee:235a:9776::1 --enable-ra --dhcp-range ::,constructor:lxdbr0,ra-stateless,ra-names -s lxd --interface-name _gateway.lxd,lxdbr0 -S /lxd/ --conf-file=/var/snap/lxd/common/lxd/networks/lxdbr0/dnsmasq.raw -u lxd -g lxd

stgraber · December 23, 2021, 3:05am

Maybe run tcpdump -ni lxdbr0 port 53 to see the actual request being made, and then tcpdump -ni lo port 53 to see whether the request gets forwarded to your local dnsmasq?

I believe dnsmasq usually just relays through whatever is in /etc/resolv.conf, so in your case, LXD’s dnsmasq will likely be talking to your system’s dnsmasq.
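
A quick way to check that theory is to list the nameserver entries that /etc/resolv.conf actually contains. This is only a minimal sketch: the helper names list_nameservers and loopback_nameservers are made up for illustration, and the snap-confined dnsmasq may not see exactly the same resolv.conf as the host does.

```shell
# List the upstream resolvers named in a resolv.conf-style file.
# Defaults to /etc/resolv.conf; pass another path to inspect a different file.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Print only the entries that point back at the local host (127.x.x.x or ::1),
# i.e. the ones that depend on a resolver listening on loopback.
loopback_nameservers() {
    list_nameservers "$1" | grep -E '^(127\.|::1$)' || true
}

# Hypothetical usage on the host:
#   list_nameservers
#   loopback_nameservers
```

An empty list, or entries that nothing is listening on, would leave a forwarding dnsmasq without a usable upstream.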

notetienne · December 23, 2021, 3:31am

Thanks for your quick reply.

Running tcpdump -ni lxdbr0 port 53 and executing dig @10.204.141.1 google.com in the container, I get:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lxdbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
22:29:41.955723 IP 10.204.141.222.56484 > 10.204.141.1.53: 59624+ [1au] A? google.com. (51)
22:29:41.955839 IP 10.204.141.1.53 > 10.204.141.222.56484: 59624 Refused$ 0/0/1 (39)

Running tcpdump -ni lo port 53 doesn’t show any result when executing dig @10.204.141.1 google.com.

stgraber · December 23, 2021, 5:09am

Hmm, maybe try tcpdump -ni any port 53 ?

notetienne · December 23, 2021, 5:51pm

That’s what it gives for a simple query:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
12:42:06.571487 IP 10.204.141.222.33599 > 10.204.141.1.53: 14579+ [1au] A? google.com. (51)
12:42:06.571487 IP 10.204.141.222.33599 > 10.204.141.1.53: 14579+ [1au] A? google.com. (51)
12:42:06.571678 IP 10.204.141.1.53 > 10.204.141.222.33599: 14579 Refused$ 0/0/1 (39)
12:42:06.571693 IP 10.204.141.1.53 > 10.204.141.222.33599: 14579 Refused$ 0/0/1 (39)

That’s all for the query I made in the container.

As I said, the query is REFUSED by the LXD dnsmasq instance. According to RFC 1035, Section 4.1.1, RCODE 5 means:

Refused - The name server refuses to perform the specified operation 
for policy reasons.  For example, a name server may not wish to
provide the information to the particular requester, or a name server 
may not wish to perform a particular operation (e.g., zone transfer) 
for particular data.

I suspect the LXD dnsmasq is not aware of any upstream DNS server (that would be the local one). Any idea how I could see the status of the snap dnsmasq?
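
To narrow down which resolver is doing the refusing, it can help to compare the status each one reports for the same query. A small sketch, assuming dig is available on the host; dig_status is a hypothetical helper name, and 192.168.0.108 is the host dnsmasq address taken from the lsof output earlier in the thread:

```shell
# Read dig output on stdin and print only the response status
# (NOERROR, REFUSED, SERVFAIL, ...) taken from the ->>HEADER<<- line.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p'
}

# Hypothetical usage, run on the host:
#   dig @10.204.141.1 google.com | dig_status     # LXD's dnsmasq
#   dig @192.168.0.108 google.com | dig_status    # the host's own dnsmasq
```

If the host instance answers and only the LXD instance refuses, that would be consistent with the LXD instance lacking a usable upstream.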

stgraber · December 23, 2021, 5:59pm

dnsmasq will log to syslog so you may want to check that.
As for what it uses for upstream servers, I believe it inspects your system’s /etc/resolv.conf.

You can change that, though, by setting raw.dnsmasq to any dnsmasq config you want, so you can use it to point to different upstream servers.
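
As a rough sketch of what that could look like: the helper below (a made-up name, dnsmasq_upstream_config) just builds server= lines, which is dnsmasq’s option for naming an upstream resolver. The address in the usage comment is an assumption based on the host dnsmasq listener shown earlier in the thread; substitute whatever upstream applies.

```shell
# Build a raw.dnsmasq value with one "server=" line per upstream resolver
# passed as an argument.
dnsmasq_upstream_config() {
    for server in "$@"; do
        printf 'server=%s\n' "$server"
    done
}

# Hypothetical usage (requires LXD):
#   lxc network set lxdbr0 raw.dnsmasq "$(dnsmasq_upstream_config 192.168.0.108)"
#   lxc network get lxdbr0 raw.dnsmasq
# Adding a "no-resolv" line to the value would additionally tell dnsmasq
# not to read /etc/resolv.conf at all.
```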

notetienne · December 24, 2021, 12:37am

Thanks a lot! After playing with /etc/resolv.conf and restarting the lxd service, it seems to work.

jadjay · July 8, 2022, 12:11pm

Hi,
Would you be so kind as to share the steps you took?

I’m hitting the same problem, with no obvious solution: we have the same configuration on other LXD servers, and those don’t return “Refused”.

Regards,

jadjay · July 8, 2022, 12:31pm

These are the typical logs I get:

Jul  8 14:29:51 contXXXX-01 dnsmasq[2565]: query[A] XXXXX.fr from 172.16.103.XXXX
Jul  8 14:29:51 contXXXX-01 dnsmasq[2565]: config error is REFUSED

notetienne · July 8, 2022, 3:22pm

I will try to look at my notes. Sadly, I’m away from my computer and won’t be able to look at what caused the problem until at least 22:00 GMT (18:00 EST).

It would help to know which distro version your servers are running. Also, assuming you’re using a custom DNS server, how did you configure it?

I don’t remember the problem precisely, but I think it came from setting a custom upstream DNS server in dnsmasq on Ubuntu and disabling systemd-resolved. Not that this is wrong in itself, but it’s pretty fragile. Resolving DNS has become such a complicated task on GNU/Linux today, with too many solutions depending on context.

I ended up still using dnsmasq on the local machine, but with some tricks I’ll have to look up.

Sincerely,

notetienne · July 8, 2022, 4:57pm

The logs you show seem to originate from the container.

I don’t think this is the same architecture I had. What I had was dnsmasq running on the bare-metal host. LXD already runs its own dnsmasq service to provide DNS configuration to containers. However, LXD’s dnsmasq couldn’t communicate correctly with the one on the bare-metal host. The containers I used didn’t have dnsmasq installed.

jadjay · September 27, 2022, 1:16pm

Thanks for your messages,

I figured it out by creating a service that configures systemd-resolve and sets a few things:

it declares the domain and DNS server on each of LXD’s network interfaces
the service is launched when those interfaces are set up (dependencies)
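
For anyone landing here later, the per-interface step can be sketched roughly as below. This is an illustration under assumptions, not the exact service described above: it uses resolvectl (the newer name for the systemd-resolve CLI), configure_link_dns is a made-up helper name, and the interface, address and domain in the usage comment are borrowed from the first post in this thread.

```shell
# Declare a DNS server and a routing domain on one network interface
# through systemd-resolved.
#   $1 = interface, $2 = DNS server address, $3 = domain
configure_link_dns() {
    resolvectl dns "$1" "$2"
    # The "~" prefix marks the domain as routing-only: names under it are
    # sent to this link's DNS server without becoming a search domain.
    resolvectl domain "$1" "~$3"
}

# Hypothetical usage (as root, once the bridge exists):
#   configure_link_dns lxdbr0 10.204.141.1 lxd
```

The dependency part would then live in the systemd unit that calls this, for example by binding the unit to the interface’s device unit so it only runs once the bridge is up.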

Regards,

notetienne · September 27, 2022, 1:58pm

Happy to see everything is fine now!

As I said, DNS management is too complicated nowadays!