[LXD] Built-in DNS server

stgraber · August 31, 2021, 4:39am


Project	LXD
Status	Implemented
Author(s)	@stgraber
Approver(s)	@stgraber @tomp @sdeziel
Release	LXD 4.20
Internal ID	LX006

Abstract

Implement a DNS server built into LXD which will offer AXFR (zone transfer) of auto-generated DNS zones including forward and reverse records for all instances running in LXD.

Rationale

When operating a LXD cluster that runs multiple projects and a variety of instances across a large set of networks, having valid forward and reverse records for all your instances can be quite important. It’s something that public clouds have always provided (albeit with limited customization), and it matters not only for ease of access to the instance but also to avoid hosted services getting flagged as potential spam due to lacking reverse DNS records.

The intent here is for every address that LXD itself manages, and which isn’t hidden behind an internal NAT, to have valid forward and reverse DNS records.
LXD will then serve those zones for zone transfer to the operator’s production DNS servers.

Specification

Design

A new config key core.dns_address will be introduced to instruct LXD to listen for DNS traffic. This will enable a built-in authoritative DNS server in the LXD daemon which will listen on both UDP and TCP ports.

This DNS server will be authoritative only and will be intended to mostly serve zone transfer requests to an external DNS infrastructure.

LXD will track DNS zones on a per-project basis (as part of the network feature), but DNS zone names will have to be globally unique.

Networks can then be tied to DNS zones for both forward and reverse records. Doing so will populate the selected zones with records for each instance. This will only happen for instance addresses which aren’t behind NAT. It will also be restricted to addresses which are known by LXD directly through its current lease records.

Automatic records are expected for:

Gateway for IPv4 or IPv6 subnets on a LXD managed network when they’re not using NAT
Forward and reverse records for IPv4 and IPv6 addresses of instances when not from a NAT-ed subnet and when the address can be determined from a lease or static record or be derived from the MAC address of the instance (EUI64)
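As an illustration of the EUI64 case above, here is a minimal Python sketch of how an IPv6 address can be derived from an instance's MAC address and turned into a PTR owner name. The function names eui64_address and ptr_name, the prefix and the MAC are all made up for the example; this is not taken from the LXD code base.

```python
import ipaddress


def eui64_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Derive the modified EUI-64 address for a MAC inside a /64 prefix."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("EUI-64 derivation needs a /64 prefix")

    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")

    # Flip the universal/local bit, then insert ff:fe in the middle.
    octets[0] ^= 0x02
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])

    return network[int.from_bytes(interface_id, "big")]


def ptr_name(address: str) -> str:
    """Return the fully qualified PTR owner name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(address).reverse_pointer + "."


addr = eui64_address("2001:db8::/64", "00:16:3e:12:34:56")
print(addr)                    # 2001:db8::216:3eff:fe12:3456
print(ptr_name("192.0.2.10"))  # 10.2.0.192.in-addr.arpa.
```

The same ptr_name helper works for the IPv6 address, which is where the dns.zone.reverse.ipv6 zone would come in.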

Config changes

core.dns_address in local server configuration (DNS listener address)
dns.zone.forward in network configuration (name of zone to use for A/AAAA records)
dns.zone.reverse.ipv4 in network configuration (name of zone to use for IPv4 PTR records)
dns.zone.reverse.ipv6 in network configuration (name of zone to use for IPv6 PTR records)
restricted.networks.zones in project configuration (comma-separated list of DNS zones that the project can manage; sub-zones will be allowed)
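To make the sub-zone rule for restricted.networks.zones concrete, here is a small Python sketch of one possible check. The matching semantics (case-insensitive, label-aligned suffix match, trailing dot ignored) and the function name zone_allowed are assumptions for illustration; the spec only states that sub-zones will be allowed.

```python
def zone_allowed(zone: str, restricted: str) -> bool:
    """Return True if zone equals, or is a sub-zone of, an entry in the list."""
    name = zone.rstrip(".").lower()

    for entry in restricted.split(","):
        allowed = entry.strip().rstrip(".").lower()
        if not allowed:
            continue  # Skip empty entries such as a trailing comma.

        # Match on a label boundary so "badexample.net" isn't accepted
        # for "example.net".
        if name == allowed or name.endswith("." + allowed):
            return True

    return False


print(zone_allowed("lxd.example.net", "example.net, example.org"))  # True
print(zone_allowed("badexample.net", "example.net"))                # False
```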

API changes

/1.0/network-zones (GET, POST)
/1.0/network-zones/<name> (GET, PUT, PATCH, DELETE)

The zone struct will contain:

Name
Description
Config (key=value map)
UsedBy (networks using the zone)

Initial config keys for the zone will be:

peers.NAME.address
peers.NAME.key
user.*

The peers keys will be used to configure AXFR authentication with remote DNS servers. Initially, each zone will be expected to have this configured, either to a server the project admin controls or to a standard set of values provided by the LXD cluster operator to interact with a central upstream DNS server.
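For a feel of how the peers.NAME.* keys could be consumed, here is a hedged Python sketch that groups them per peer. The lowercase JSON field names, the peer name ns1, the address, the placeholder key value and the helper group_peers are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical zone payload; the field names mirror the struct above but
# their exact JSON spelling is an assumption.
zone = {
    "name": "lxd.example.net",
    "description": "Instances in the default project",
    "config": {
        "peers.ns1.address": "192.0.2.53",
        "peers.ns1.key": "placeholder-secret",
        "user.owner": "ops",
    },
}


def group_peers(config: dict) -> dict:
    """Collect peers.NAME.address and peers.NAME.key into one dict per peer."""
    peers = defaultdict(dict)

    for key, value in config.items():
        parts = key.split(".")
        if len(parts) == 3 and parts[0] == "peers" and parts[2] in ("address", "key"):
            peers[parts[1]][parts[2]] = value

    return dict(peers)


print(group_peers(zone["config"]))
# {'ns1': {'address': '192.0.2.53', 'key': 'placeholder-secret'}}
```

Keys under user.* are left alone, since they're free-form.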

CLI changes

lxc network zone list
lxc network zone create
lxc network zone delete
lxc network zone show
lxc network zone edit
lxc network zone set
lxc network zone unset

Database changes

network_zones table to track the zones (id, project_id, name, description)
network_zones_config table to track additional configuration (id, zone_id, key, value)
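A rough sketch of what those two tables could look like, using Python's sqlite3 module with an in-memory database. Only the column names come from the list above; the column types, the UNIQUE constraints (the one on name reflects the globally unique zone names) and the cascade on delete are assumptions.

```python
import sqlite3

# Column names follow the spec; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE network_zones (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    project_id INTEGER NOT NULL,
    name TEXT NOT NULL UNIQUE,
    description TEXT NOT NULL DEFAULT ''
);
CREATE TABLE network_zones_config (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    zone_id INTEGER NOT NULL REFERENCES network_zones (id) ON DELETE CASCADE,
    key TEXT NOT NULL,
    value TEXT NOT NULL,
    UNIQUE (zone_id, key)
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database holding the two zone tables."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # Needed for ON DELETE CASCADE.
    db.executescript(SCHEMA)
    return db


db = open_db()
db.execute(
    "INSERT INTO network_zones (project_id, name, description) VALUES (?, ?, ?)",
    (1, "lxd.example.net", "example zone"),
)
print(db.execute("SELECT id, name FROM network_zones").fetchall())
```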

DNS behavior

The DNS server will act as an AXFR source for those zones it manages.

As LXD can’t easily know when a zone will be changing, the zone serial will be the current time (YYYYMMDDHHSS) and the TTL will be set to 60s with the zone expiry set to a day. This may change in the future as we may develop ways to track changes to the zones.

Initially, LXD will not initiate any outgoing zone transfers; it’s expected that the external DNS servers will initiate the zone transfer.

At this stage, there won’t be support for custom DNS records in those zones.
Should there be a duplicate record (instance connected twice to the same network for example), the DNS record will simply round-robin.
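One way to picture the round-robin behavior for duplicate records is the Python sketch below. It assumes that round-robin means returning every address for the name and rotating the order between answers; the spec doesn't say how it's done, and the class RoundRobinRecords, the record name and the addresses are illustrative only.

```python
from collections import defaultdict, deque


class RoundRobinRecords:
    """Keep every value for a (name, type) pair and rotate them per answer."""

    def __init__(self):
        self._records = defaultdict(deque)

    def add(self, name: str, rtype: str, value: str) -> None:
        values = self._records[(name.lower(), rtype)]
        if value not in values:
            values.append(value)

    def answer(self, name: str, rtype: str) -> list:
        values = self._records.get((name.lower(), rtype))
        if not values:
            return []

        result = list(values)
        values.rotate(-1)  # The next answer starts with the following value.
        return result


records = RoundRobinRecords()
records.add("c1.lxd.example.net.", "A", "192.0.2.10")
records.add("c1.lxd.example.net.", "A", "192.0.2.20")

print(records.answer("c1.lxd.example.net.", "A"))  # ['192.0.2.10', '192.0.2.20']
print(records.answer("c1.lxd.example.net.", "A"))  # ['192.0.2.20', '192.0.2.10']
```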

Upgrade handling

This is an optional additional feature; no behavior changes will occur on upgrade.

tomp · August 31, 2021, 1:50pm

I’m not sure if you’re seeking review of this doc yet, but could you expand this section to give more info on how you see these commands working? I can’t visualise it yet.

tomp · August 31, 2021, 1:52pm

If there are multiple networks assigned to the same zone (is this possible?), what will the gateway names be?

tomp · August 31, 2021, 1:57pm

Will this DNS server only serve AXFR requests, or also A/AAAA and PTR requests for the published names? The abstract suggests this to be the case, but the “mostly” here puts doubt into my mind.

stgraber · August 31, 2021, 1:58pm

I expect we’ll encode the network name in the record.

stgraber · August 31, 2021, 2:00pm

We’ll let it answer normal queries too, just no recursion. This should come in handy for those who want to integrate with resolved or similar.

sdeziel · August 31, 2021, 3:30pm

As you know, IP ACLs are pretty common to restrict who can request XFRs. TSIG would be a nice addition IMHO.

That’s indeed going to be convenient.

To reduce the amplification factor, I think the “ANY” query type should be handled according to RFC 8482. The HINFO way would be best IMHO as it would reduce the cost of responding.

tomp · September 1, 2021, 8:14am

Something I thought about after our discussion on this yesterday @stgraber :

Although it’s not possible to have two instances with the same name connected to different networks in the same zone and same project (because instance names are unique per project), it would be possible to have multiple networks in the same project (and zone) and have a single instance connected to both of them via multiple NICs.

In that case we should think about and make clear what IP(s) the DNS name will point to - either we pick the first one (using some indicator for “first”) or we round-robin them?

stgraber · September 2, 2021, 12:26am

I’d round robin in those cases.

stgraber · October 12, 2021, 9:12pm

@sdeziel @tomp

I did a few edits to narrow the scope and clarify exactly what we’ll be providing.
This is ready for your review.

tomp · October 13, 2021, 11:09am

If multiple networks are assigned to the same zone, what gateway DNS records are generated, will they encode the network name?

stgraber · October 13, 2021, 2:04pm

For the gateway, I may use a similar trick to what I did for uplinks, effectively use something like lxdbr0.gw.example.net where example.net is the zone here.

tomp · October 13, 2021, 2:22pm

Sounds good to me. Thanks

sdeziel · October 13, 2021, 5:35pm

Looks good to me, thanks.