Serving DNS over OVN networks and accessing the instances from the hosts

Disabling NAT and then adding a route on your LXD host via the OVN router’s external IP on the uplink would be a supported approach. Running sudo ip address add 10.201.159.2/24 dev ovn-access as above creates a route on your LXD host to 10.201.159.0/24 anyway, so it’s no different in that respect (you’ll still need to avoid overlapping internal OVN subnets).
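For illustration only, with made-up addresses: if the OVN network’s internal subnet were 10.201.159.0/24 and the OVN router’s external IP on the uplink were 10.201.158.10, the route would look something like:

# both addresses are hypothetical examples
sudo ip route add 10.201.159.0/24 via 10.201.158.10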

You keep saying “uplink”. What does it mean to you?


The network property of the OVN network, the bridge in your case.
This is also the terminology we use in the docs: OVN network - LXD documentation

Yes, I do get the route you mention, but I also get an interface that lets me run, for instance, a dnsmasq that will serve DNS on the host. AFAIK this is not provided by LXD (yet?).

Oh right I thought we were talking about accessing Ansible.

It’s the other way around: the instances do not access Ansible; Ansible (running on the host) accesses the instances via SSH. (I tried using the LXD Ansible connector but it didn’t work; also, SSH is still useful in case we have to go in and debug something.)


Will you be running a dnsmasq process per OVN network?

one dnsmasq per cluster, yes, plus an extra one for general DNS resolution for all clusters.

host <-> lxd-provision + general DNS <-> cluster <-> private network + private DNS

I hope that makes sense. The left side should be routable in both directions and stays up indefinitely. For the right side I don’t care about routing, and it is set up and torn down by CI.

Now that I think of it, maybe I can run the private dnsmasq in the same network namespace as the instances. That way I won’t need an interface in the host namespace. But then I wouldn’t know how to connect to the OVN switches.
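For the host-side variant, a minimal sketch of a per-cluster dnsmasq is below (the interface name and the record are placeholders):

# ovn-access and the address record are placeholders; run one instance per cluster
dnsmasq --no-daemon \
    --interface=ovn-access \
    --bind-interfaces \
    --no-resolv \
    --address=/node1.cluster1.internal/10.201.159.10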

My hardware network consists of a single NIC on each host. Following your suggestions and after learning about netplan try, I managed to create a bridge lan0 on top of it without losing SSH (\o/!!!), and configured the host’s single IP there. I guess this is my “physical uplink” network.
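For reference, the netplan config ends up looking roughly like this (the gateway and DNS addresses below are placeholders, not my real ones); netplan try applies it with an automatic rollback if you lose connectivity:

network:
  version: 2
  ethernets:
    enp2s0: {}
  bridges:
    lan0:
      interfaces: [enp2s0]
      addresses: [10.130.40.81/22]
      routes:
        - to: default
          via: 10.130.40.1          # placeholder gateway
      nameservers:
        addresses: [10.130.40.1]    # placeholder DNS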

Then I created the physical network for OVN as follows:

# hosts is an array of cluster member names; define the per-member config first
for host in "${hosts[@]}"; do
    lxc network create ovn-overlay --type=physical --target="$host" parent=lan0
done

# now put a cluster layer on top
lxc network create ovn-overlay --type=physical

lxc network set ovn-overlay ipv4.ovn.ranges=10.0.0.1-10.254.254.254

And my provision network on top of it:

lxc network create lxd-provision --type=ovn ipv4.dhcp=true ipv4.nat=true network=ovn-overlay

You can see I didn’t understand anything about ipv4.ovn.ranges at this point. This is what I get:

cloudian@uk-lxd1:~$ lxc network show ovn-overlay
config:
  ipv4.ovn.ranges: 10.0.0.1-10.254.254.254
  volatile.last_state.created: "false"
description: ""
name: ovn-overlay
type: physical
used_by:
- /1.0/networks/lxd-provision
managed: true
status: Created
locations:
- uk-lxd1
- uk-lxd2.cloudian.com
cloudian@uk-lxd1:~$ lxc network show lxd-provision
config:
  bridge.mtu: "1442"
  ipv4.address: 10.138.38.1/24
  ipv4.dhcp: "true"
  ipv4.nat: "true"
  ipv6.address: fd42:d6a0:aebf:2a0::1/64
  ipv6.nat: "true"
  network: ovn-overlay
description: ""
name: lxd-provision
type: ovn
used_by: []
managed: true
status: Created
locations:
- uk-lxd1
- uk-lxd2.cloudian.com

Because you said in IRC:

I added that detection so if it detects the physical uplink interface is a bridge it will instead connect a port to it

So we get

cloudian@uk-lxd1:~$ brctl show lan0
bridge name     bridge id               STP enabled     interfaces
lan0            8000.001fe241a6c5       no              enp2s0
                                                        lxdovn2a

and

332: lxdovn2b@lxdovn2a: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP group default qlen 1000
    link/ether 46:3f:48:d1:79:b8 brd ff:ff:ff:ff:ff:ff
333: lxdovn2a@lxdovn2b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master lan0 state UP group default qlen 1000
    link/ether ce:7b:ce:ac:e1:c7 brd ff:ff:ff:ff:ff:ff

So that seems to be a veth pair. This is the ovn-overlay network, not lxd-provision.
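(For what it’s worth, the link type can be confirmed from the detailed output:)

ip -d link show lxdovn2a   # the detail line reports the link type as veth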

each OVN network you create is made up of (amongst other things) a virtual OVN router and a virtual OVN switch.

cloudian@uk-lxd1:~$ sudo ovn-nbctl show
switch 867302e9-bcfc-4ee3-b935-2b2dbb63a180 (lxd-net25-ls-int)
    port lxd-net25-ls-int-lsp-router
        type: router
        router-port: lxd-net25-lr-lrp-int
    port lxd-provision
        addresses: ["dynamic"]
router fc6149f6-1cf3-47ba-83ab-46ddc513c137 (lxd-net25-lr)
    port lxd-net25-lr-lrp-int
        mac: "00:16:3e:e9:5f:fc"
        networks: ["10.138.38.1/24", "fd42:d6a0:aebf:2a0::1/64"]

This is the lxd-provision network created as above. BTW, it’s a shame that OVN networks have names that do not resemble the ones we set on the LXD side. I can only identify which is which because of the IPs they get assigned.

The second port on the switch, named lxd-provision, is in fact my handmade port built as shown previously.

The OVN router has 2 virtual ports on it; one connected to the uplink network (lxdbr0) and one connected to the virtual OVN switch

I see only one port.

The virtual OVN router then provides (optionally) NAT, DHCP and IPv6 SLAAC services to the OVN switch, as well as routing of course. This allows the instances connected to the virtual OVN switch to be configured with IPs and access the external uplink network NATted to the OVN router’s external address.

Because I have only one port, I have seen nothing of this.
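(For anyone checking the same thing, what the virtual router provides can be inspected with standard ovn-nbctl subcommands, using the names from the output above:)

sudo ovn-nbctl lr-nat-list lxd-net25-lr   # NAT rules configured on the virtual router
sudo ovn-nbctl dhcp-options-list          # DHCP option sets served to the switch ports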

In that case you could run a small LXD container running dnsmasq attached to each OVN network. That would be a supported way to do it (and LXD would manage the ports).
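A minimal sketch of that approach, assuming an Ubuntu image and a hypothetical container name dns1:

lxc launch ubuntu:22.04 dns1 --network lxd-provision
lxc exec dns1 -- apt-get install -y dnsmasq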

However, one issue that we’ve not covered yet, and which I think would be a problem for you with both the above solution and your workaround port, is that LXD’s OVN router DHCP server doesn’t currently allow you to manually configure the DNS servers it advertises. It always intercepts DNS requests and forwards them to the uplink network’s specified DNS servers (via the dns.nameservers setting). Thus, if you’ve got a DNS server running inside the OVN network (either via an instance or via a manual port), you’d have to configure the other instances to use that DNS server manually.
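So the only DNS-related knob at the moment is the uplink network’s dns.nameservers setting, along the lines of (the resolver address is just an example):

lxc network set ovn-overlay dns.nameservers=10.130.40.53   # example resolver IP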

I do think that as it stands the OVN network setup doesn’t quite cover all of your requirements.

I would encourage you to post feature requests for customisable DHCP DNS server addresses and the ability for OVN networks to have their own custom internal DNS records (which would avoid the need for a manual DNS server in the first place) over at Issues · lxc/incus · GitHub so we can track and prioritise. Thanks

We’re hacking around this by configuring dhclient to ignore the advertised DNS servers and, because of the routing issues, the advertised routers too. Ansible FTW :slight_smile:


Just in case someone wants to do the same:

interface "provision" {
    # OVN DHCP server sends us a router and no DNS server, so we hardcode them locally
    supersede routers {{ default_gateway }};

    # use private DNS servers for this cluster
    supersede domain-name-servers {{ dns_servers | join(', ') }};
}
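
Here default_gateway and dns_servers are Ansible template variables, and "provision" is the interface name inside the instance; the snippet gets rendered into the instance’s dhclient configuration (on Debian/Ubuntu that’s typically /etc/dhcp/dhclient.conf).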

I think the problem here is that you’ve not specified the subnet or gateway on the physical uplink network via the ipv4.gateway setting.

cloudian@uk-lxd1:~$ lxc network show ovn-overlay
config:
  ipv4.ovn.ranges: 10.0.0.1-10.254.254.254
  volatile.last_state.created: "false"
description: ""
name: ovn-overlay
type: physical
used_by:
- /1.0/networks/lxd-provision
managed: true
status: Created
locations:
- uk-lxd1
- uk-lxd2.cloudian.com

I would expect to see an entry here something like ipv4.gateway: 10.0.0.1/24, and then the ipv4.ovn.ranges setting would need to be reduced in size to not include the gateway address and not expand beyond the specified subnet.

It would also need to be a reachable address on the uplink network (could be the address on the bridge itself).

Can you show ip a on the LXD host?

$ ip a
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master lan0 state UP group default qlen 1000
    link/ether 00:1f:e2:41:a6:c5 brd ff:ff:ff:ff:ff:ff
7: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 3a:ce:9a:03:0c:60 brd ff:ff:ff:ff:ff:ff
8: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether f2:62:c9:6c:99:49 brd ff:ff:ff:ff:ff:ff
335: lxd-provision: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 46:1e:43:8a:26:03 brd ff:ff:ff:ff:ff:ff
    inet 10.138.38.2/24 scope global lxd-provision
       valid_lft forever preferred_lft forever
    inet6 fd42:d6a0:aebf:2a0:bcd7:8e5c:fd51:b35/64 scope global temporary dynamic 
       valid_lft 528193sec preferred_lft 9432sec
    inet6 fd42:d6a0:aebf:2a0:10e7:2f77:e28d:168c/64 scope global temporary deprecated dynamic 
       valid_lft 442155sec preferred_lft 0sec
    inet6 fd42:d6a0:aebf:2a0:441e:43ff:fe8a:2603/64 scope global dynamic mngtmpaddr 
       valid_lft forever preferred_lft forever
    inet6 fe80::441e:43ff:fe8a:2603/64 scope link 
       valid_lft forever preferred_lft forever
159: lan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:1f:e2:41:a6:c5 brd ff:ff:ff:ff:ff:ff
    inet 10.130.40.81/22 brd 10.130.43.255 scope global lan0
       valid_lft forever preferred_lft forever
    inet6 fe80::21f:e2ff:fe41:a6c5/64 scope link 
       valid_lft forever preferred_lft forever

I removed uninteresting entries, including ovn-access, which I think is a leftover from previous attempts.

So the uplink network is lan0, which has the address and subnet 10.130.40.81/22, so we would expect that to be used by the OVN routers as the external gateway address.

So you need to specify that in ovn-overlay (not a particularly accurate name btw, as it’s not the overlay network; ovn-uplink would be more accurate IMHO) using ipv4.gateway=10.130.40.81/22. You would then need to ensure that ipv4.ovn.ranges doesn’t overlap with 10.130.40.81, and is in the same subnet as the gateway.
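Something along these lines (the range below is just an example; pick addresses inside 10.130.40.0/22 that are not otherwise in use):

lxc network set ovn-overlay ipv4.gateway=10.130.40.81/22
# example range only: must stay inside the /22 and must not include 10.130.40.81
lxc network set ovn-overlay ipv4.ovn.ranges=10.130.40.200-10.130.40.220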

See the allocation logic here:

As an aside, what is the lxd-provision interface on your LXD host?

Well, to me OVN is an overlay (a layer on top; but remember that I’m not a native speaker, so my semantics might be off) over the underlying TCP/IP/Ethernet network.

A LXD OVN network can be connected to an existing managed Bridge network or Physical network to gain access to the wider network. By default, all connections from the OVN logical networks are NATed to an IP allocated from the uplink network.

Do you notice that this paragraph mentions the ‘uplink network’ without actually defining it? This is part of the disconnect I have: ‘uplink network’ has no meaning in my vocabulary.

The physical network type connects to an existing physical network, which can be a network interface or a bridge, and serves as an uplink network for OVN.

So the physical network is describing a physical network (duh) to LXD without actually building anything. Is it just so LXD can consider it managed?

So OVN needs IPs on the same network as the host it runs on. This is the first time I’ve understood that. Can’t it just reuse the host’s IP?

I just reread this. It makes no sense to me. 10.130.40.81 is definitely within 10.0.0.2-10.254.254.254. What am I missing?

Oh, so the ‘uplink network’ is where the default gateway resides? Is that all?

Yes, that was my mistake. Basically, ipv4.ovn.ranges is used by LXD to pick an IP on the uplink network for each OVN network’s router. So it must be in the same subnet as the ipv4.gateway setting, and not overlap with the gateway IP itself.

The term “uplink network” is based on existing networking terminology, where an “uplink” is a port on a router or switch that connects to an external network.

See What is an Uplink?