OVN Load-balancers Not Responding

Hello LXD Fam!

I’m having a problem with OVN load balancers where I can’t seem to hit the endpoint from within OVN-controlled networks. I can hit the load balancer just fine from outside the cluster. Is this by design? Did I miss something? It feels like I might need to add a route somewhere. I am advertising the network the load balancer IP is on via BGP, and I would have expected that to take care of it.

I’m also using peering relationships between OVN networks, which could maybe be the problem. I read somewhere that peering is a more efficient way of moving traffic from one OVN network to another in LXD.

Wondering if anyone has had the same issue.

Please can you provide the full config of your OVN networks, uplink networks, peering relationships and load balancers/forwards, along with an example container config and the command you are trying to run that does and doesn’t work, to give a better picture of your setup.

Thanks

Sure

Network that the load balancer is on

config:
  bridge.mtu: "1442"
  dns.zone.forward: lxd.external-services.thelabs.online
  ipv4.address: 192.168.51.1/24
  ipv4.nat: "false"
  ipv6.address: fd42:1f7:e6a7:dd60::1/64
  ipv6.nat: "true"
  network: UPLINK
  volatile.network.ipv4.address: 192.168.20.32
description: ""
name: external-services
type: ovn
used_by:
- /1.0/networks/internal-services?project=internal-services
- /1.0/networks/media-services?project=media-services
- /1.0/instances/red-zone-caddy?project=external-services
- /1.0/instances/vpn-server?project=external-services
- /1.0/instances/blue-zone-caddy?project=external-services
- /1.0/instances/thebox-rebase?project=external-services
managed: true
status: Created
locations:
- raijin
- nox
- baku
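
For reference, an OVN network with this config can be created with roughly the following (a sketch only; which project the network itself lives in isn’t shown above, so no --project flag is included):

lxc network create external-services --type=ovn \
    network=UPLINK \
    ipv4.address=192.168.51.1/24 \
    ipv4.nat=false \
    ipv6.nat=true \
    dns.zone.forward=lxd.external-services.thelabs.online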

Load Balancer config

description: ""
config: {}
backends:
- name: red-zone-caddy
  description: caddy reverse proxy
  target_port: 80,443
  target_address: 192.168.51.7
- name: blue-zone-caddy
  description: caddy reverse proxy
  target_port: 80,443
  target_address: 192.168.51.6
ports:
- description: http and https ports
  protocol: tcp
  listen_port: 80,443
  target_backend:
  - red-zone-caddy
  - blue-zone-caddy
listen_address: 192.168.51.100
location: ""
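
For reference, a load balancer matching this config can be built with roughly these commands (assuming the 192.168.51.100 listen address sits on the external-services network shown above):

# create the listener, attach both caddy backends, then expose 80 and 443 to them
lxc network load-balancer create external-services 192.168.51.100
lxc network load-balancer backend add external-services 192.168.51.100 red-zone-caddy 192.168.51.7 80,443
lxc network load-balancer backend add external-services 192.168.51.100 blue-zone-caddy 192.168.51.6 80,443
lxc network load-balancer port add external-services 192.168.51.100 tcp 80,443 red-zone-caddy,blue-zone-caddy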

Uplink Network

config:
  bgp.peers.edgerouter.address: 192.168.20.1
  bgp.peers.edgerouter.asn: "65200"
  dns.nameservers: 192.168.20.2
  ipv4.gateway: 192.168.20.254/24
  ipv4.ovn.ranges: 192.168.20.30-192.168.20.40
  ipv4.routes: 192.168.48.0/22
  volatile.last_state.created: "false"
description: ""
name: UPLINK
type: physical
used_by:
- /1.0/networks/media-services?project=media-services
- /1.0/networks/external-services?project=external-services
- /1.0/networks/internal-services?project=internal-services
- /1.0/networks/internal-service?project=internal-services
managed: true
status: Created
locations:
- raijin
- nox
- baku
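
For reference, a clustered physical uplink like this is normally defined once per cluster member and then instantiated with the shared config; a rough sketch (the parent interface name below is a placeholder):

# define the uplink on each member, pointing at its physical interface
lxc network create UPLINK --type=physical parent=eth1 --target=raijin
lxc network create UPLINK --type=physical parent=eth1 --target=nox
lxc network create UPLINK --type=physical parent=eth1 --target=baku
# then instantiate it with the shared settings
lxc network create UPLINK --type=physical \
    ipv4.gateway=192.168.20.254/24 \
    ipv4.ovn.ranges=192.168.20.30-192.168.20.40 \
    ipv4.routes=192.168.48.0/22 \
    dns.nameservers=192.168.20.2 \
    bgp.peers.edgerouter.address=192.168.20.1 \
    bgp.peers.edgerouter.asn=65200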

Peering relationships

description: ""
config: {}
name: external-to-internal
target_project: internal-services
target_network: internal-services
status: Created
used_by: []
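
For reference, a peering like this needs to be created from both sides before it becomes active, along the lines of (peer names here are examples, and I’m assuming each network lives in the project of the same name, so adjust the project parts accordingly):

lxc network peer create external-services external-to-internal internal-services/internal-services
lxc network peer create internal-services internal-to-external external-services/external-services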

BGP info

{
	"peers": [
		{
			"address": "192.168.20.1",
			"asn": 65200,
			"count": 1,
			"holdtime": 0,
			"password": ""
		}
	],
	"prefixes": [
		{
			"nexthop": "192.168.20.30",
			"owner": "network_11",
			"prefix": "192.168.49.0/24"
		},
		{
			"nexthop": "192.168.20.32",
			"owner": "network_15",
			"prefix": "192.168.51.0/24"
		},
		{
			"nexthop": "192.168.20.31",
			"owner": "network_10",
			"prefix": "192.168.50.0/24"
		}
	],
	"server": {
		"address": "192.168.20.53:179",
		"asn": 64512,
		"router_id": "192.168.20.53",
		"running": true
	}
}

OVN cluster info

Name: OVN_Northbound
Cluster ID: 7202 (7202b2bf-a6e9-4bfa-ac7f-54e3faaaa23e)
Server ID: 6c25 (6c256d4c-7b5c-4766-b5a2-dc7d9355944d)
Address: tcp:192.168.20.50:6643
Status: cluster member
Role: leader
Term: 13390
Leader: self
Vote: self

Last Election started 80126448 ms ago, reason: leadership_transfer
Last Election won: 80126434 ms ago
Election timer: 1000
Log: [5690, 5692]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->6dae ->c334 <-6dae <-c334
Disconnections: 49
Servers:
    6dae (6dae at tcp:192.168.20.52:6643) next_index=5692 match_index=5691 last msg 305 ms ago
    6c25 (6c25 at tcp:192.168.20.50:6643) (self) next_index=5691 match_index=5691
    c334 (c334 at tcp:192.168.20.53:6643) next_index=5692 match_index=5691 last msg 305 ms ago
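
For reference, cluster status output in this format comes from querying the northbound DB, e.g. (the control socket path depends on how OVN is installed):

sudo ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound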

Let me know if there is anything else you might need.

Hey @tomp, just bumping this to see if you have any ideas.

Thanks

I’ve just tried this in what I believe to be a similar setup:

  • 2x OVN networks connected to same uplink.
  • Both networks peered together.
  • A container running a TCP service (nginx) connected to first network.
  • A network forward on first network for port 80 on uplink forwarding to container with TCP service.
  • 2x containers with a client command (curl), one connected to first network, one connected to second network.
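
Something along these lines reproduces that setup (network, instance and image names plus the listen address are placeholders, not the exact commands I used):

# two OVN networks on the same uplink, peered in both directions
lxc network create ovn1 --type=ovn network=UPLINK
lxc network create ovn2 --type=ovn network=UPLINK
lxc network peer create ovn1 ovn1-to-ovn2 default/ovn2
lxc network peer create ovn2 ovn2-to-ovn1 default/ovn1
# nginx container on the first network, with a network forward on the uplink pointing at it
lxc launch ubuntu:22.04 web --network ovn1
lxc exec web -- apt-get install -y nginx
lxc network forward create ovn1 10.0.0.1
lxc network forward port add ovn1 10.0.0.1 tcp 80 <IP of web container>
# one client container per network; the curl from c2 is the case that fails
lxc launch ubuntu:22.04 c1 --network ovn1
lxc launch ubuntu:22.04 c2 --network ovn2
lxc exec c2 -- curl http://10.0.0.1/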

I found I can connect to the service via the network forward listen IP from the container connected to the same network as the TCP service. But I found I could not connect to the network forward listen IP from the container connected to the second OVN network that peers with the first.

So it looks like a possible bug in LXD or OVN.

The issue appears to be with traffic emerging from the second OVN virtual router and then looping back into the network forward listen address on the first OVN virtual router.

Using tcpdump I can see the SYN requests coming out of the 2nd virtual router and onto the uplink network.

But it looks like OVN then tries to reply internally and the traffic flow doesn’t succeed.
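
For example, a capture along these lines on the uplink interface shows those SYNs (the interface name is whatever the uplink’s parent is on the host, and 10.0.0.1 is the forward listen address used in the example below):

sudo tcpdump -nni <uplink interface> 'tcp[tcpflags] & tcp-syn != 0 and dst host 10.0.0.1'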

Adding a static route on the 2nd OVN router, pointing the forward’s listen address at the first OVN router’s IP on the uplink network, then makes it work.

E.g. where the network forward listen address is 10.0.0.1 and the first OVN network’s external IP on the uplink is 10.21.203.11:

sudo ovn-nbctl lr-route-add lxd-net21-lr 10.0.0.1/32 10.21.203.11
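
The added route should then show up in the router’s static routes:

sudo ovn-nbctl lr-route-list lxd-net21-lr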

Please can you log a bug over at Issues · lxc/lxd · GitHub

Thanks

Hey @tomp !!

Thank you very much for chasing this down. I will log a bug and reference this thread. I appreciate your help.

Thank you
