Incus containers unable to ping each other

So I’ve started migrating from LXD to Incus in our Terraform modules: GitHub - upmaru/terraform-aws-instellar: Terraform module for bootstrapping LXD cluster for using with https://instellar.app. I’m trying to set up 2 machines clustered using the bridge network. I booted up 2 containers (Alpine 3.18) and tried to get them to ping each other, but I can’t seem to get it to work. I’ve read the documentation about ufw and ran all the commands, and nothing seems to be working. When I used LXD, it came with Fan networking that worked out of the box. Let me know where I’m going wrong here.

I’m setting this up on AWS Ubuntu 22.04 VMs.
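For reference, the ufw rules the Incus firewall documentation describes for the default incusbr0 bridge are along these lines:

```shell
# Allow container-to-host traffic (DHCP/DNS) on the Incus bridge
sudo ufw allow in on incusbr0

# Allow forwarded traffic in and out of the bridge (container-to-container
# and container-to-internet)
sudo ufw route allow in on incusbr0
sudo ufw route allow out on incusbr0
```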

$ sudo nft list ruleset
table inet incus {
	chain pstrt.incusbr0 {
		type nat hook postrouting priority srcnat; policy accept;
		ip saddr 10.199.238.0/24 ip daddr != 10.199.238.0/24 masquerade
		ip6 saddr fd42:6115:687d:79b7::/64 ip6 daddr != fd42:6115:687d:79b7::/64 masquerade
	}

	chain fwd.incusbr0 {
		type filter hook forward priority filter; policy accept;
		ip6 version 6 oifname "incusbr0" accept
		ip6 version 6 iifname "incusbr0" accept
	}

	chain in.incusbr0 {
		type filter hook input priority filter; policy accept;
		iifname "incusbr0" tcp dport 53 accept
		iifname "incusbr0" udp dport 53 accept
		iifname "incusbr0" icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
		iifname "incusbr0" udp dport 547 accept
	}

	chain out.incusbr0 {
		type filter hook output priority filter; policy accept;
		oifname "incusbr0" tcp sport 53 accept
		oifname "incusbr0" udp sport 53 accept
		oifname "incusbr0" icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, parameter-problem, echo-request, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert, mld2-listener-report } accept
		oifname "incusbr0" udp sport 547 accept
	}
}
table ip filter {
}
table ip6 filter {
}

I’ve also tried configuring resolvectl based on the documentation

resolvectl status incusbr0
Link 3 (incusbr0)
    Current Scopes: DNS
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 10.199.238.1
       DNS Servers: 10.199.238.1
        DNS Domain: \047var\047lib\047incus

Here is the incusbr0 network config:

config:
  ipv4.address: 10.199.238.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:6115:687d:79b7::1/64
  ipv6.nat: "true"
description: ""
name: incusbr0
type: bridge
used_by:
- /1.0/instances/test-01
- /1.0/instances/test-02
- /1.0/profiles/default
managed: true
status: Created
locations:
- ip-172-31-26-149
- incus-exp-02

I have also made sure to allow all traffic between the 2 nodes in the AWS security group. If the containers are on the same host, they can see / ping each other.

After doing some further reading, I’m starting to understand that incus by default doesn’t support cross-host networking.

I will need to set up Open vSwitch (vxlan | gre) on the hosts to enable this support.

@stgraber this is quite a large gap in ease of use between LXD and Incus when it comes to setting up cross-host networking. OVN is too heavy for what I need, which is just simple cross-host networking between containers.

Could you please offer some guidance? I’m happy to contribute to the Incus documentation as well once I can figure out the whole thing.

Update on this: I just tried MicroOVN and it was super easy to set up. I’ll try to set up Incus using MicroOVN.


The cross-host networking story in LXD was using a custom kernel feature called the Ubuntu FAN. Unfortunately this was never upstreamed and Canonical had no intention to turn that code into something upstreamable.

So it was basically only available on Ubuntu systems running an Ubuntu kernel. As Incus did away with any such platform lock-in, this is no longer an option.

OVN is definitely the most flexible option and a lot more useful as with it your addresses stay consistent across hosts, making it possible to re-shuffle workloads or perform cluster evacuations during maintenance.

It’s possible, though not trivial, to get MicroOVN working with Incus. The main difficulty comes from it running a built-in Open vSwitch daemon.

I’ll post instructions on how to do that when I’m back on a computer.

We’ve been doing a bunch of OVN related work in Incus which will make it much easier to integrate with MicroOVN in the near future (by not needing access to any of the OVS/OVN CLI tools).


Thank you @stgraber We’ve been building a product based on LXD/Incus for the past year and we’re close to launching. However, your announcement that images: support will be dropped for LXD is pretty much a deal breaker for us, since our customers’ clusters set up using LXD would stop working in April / May 2024.

We’d prefer to resolve this by switching our terraform module to incus before we do a full public launch, because migrating 10-20 clusters to incus is much easier than supporting our customers migrating 500 clusters to incus.

We’ve managed to build a deployment platform based on LXD and one primary reason for choosing LXD was this ease of setting up cross-node networking.

We’re happy to switch to Incus. We’re provisioning LXD (soon to be Incus) using Terraform modules, which means Incus will be used by a lot of people. We intend it to be an alternative to k8s, basically. We’ve built an engine on top of it to make it even easier to run than k8s. Users do zero config; all they do is click a few buttons in the wizard and deploy their Rails / Django / Phoenix / Go / etc. apps.

We really would like to use Incus moving forward, provided this cross-node networking is resolved easily. We use the API quite intensively, and since Incus is 100% API-compatible with LXD, the switch is straightforward except for this cross-node networking issue.

Not using the Ubuntu FAN is probably for the best then. If you’re running other people’s workloads, it’s probably a good thing to be able to create separate networks, put ACLs on them, and move workloads within your cluster without all the addresses changing.

So for MicroOVN and Incus, what you need installed is:

  • openvswitch-switch (for ovs-vsctl command)
  • ovn-common (for ovn-nbctl and ovn-sbctl)
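On Ubuntu, installing those two packages is a one-liner (package names as found in the Ubuntu archive):

```shell
# Install the OVS/OVN client tools that Incus needs
sudo apt install openvswitch-switch ovn-common
```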

However, you need to make sure that Open vSwitch doesn’t actually start, so you’ll want to do:

  • systemctl disable ovs-vswitchd openvswitch-switch

With that done, you still need to make ovs-vsctl work; the way I’ve done it is with:

[Service]
ExecStartPost=-/usr/bin/mkdir -p /run/openvswitch
ExecStartPost=-/usr/bin/mkdir -p /var/snap/microovn/common/run/switch/
ExecStartPost=-/usr/bin/umount -l /run/openvswitch/
ExecStartPost=-/usr/bin/mount -o bind /var/snap/microovn/common/run/switch/ /run/openvswitch/

Which you can add as an override on snap.microovn.switch.service with systemctl edit snap.microovn.switch.
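If you need to apply that override non-interactively (e.g. from Terraform or cloud-init), writing the drop-in file by hand should be equivalent — this is a sketch, not something from the original instructions:

```shell
# Create the systemd drop-in that bind-mounts MicroOVN's OVS socket
# directory over /run/openvswitch, so the host's ovs-vsctl can find it
sudo mkdir -p /etc/systemd/system/snap.microovn.switch.service.d
sudo tee /etc/systemd/system/snap.microovn.switch.service.d/override.conf <<'EOF'
[Service]
ExecStartPost=-/usr/bin/mkdir -p /run/openvswitch
ExecStartPost=-/usr/bin/mkdir -p /var/snap/microovn/common/run/switch/
ExecStartPost=-/usr/bin/umount -l /run/openvswitch/
ExecStartPost=-/usr/bin/mount -o bind /var/snap/microovn/common/run/switch/ /run/openvswitch/
EOF

# Reload systemd and restart the MicroOVN switch service to apply it
sudo systemctl daemon-reload
sudo systemctl restart snap.microovn.switch
```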

With that done and MicroOVN or the system restarted, ovs-vsctl show should now work properly.

And you can finally configure Incus to use MicroOVN with:

. /var/snap/microovn/common/data/ovn.env 
incus config set network.ovn.northbound_connection="${OVN_NB_CONNECT}"
cat /var/snap/microovn/common/data/pki/client-cert.pem | incus config set network.ovn.client_cert -
cat /var/snap/microovn/common/data/pki/client-privkey.pem | incus config set network.ovn.client_key -
cat /var/snap/microovn/common/data/pki/cacert.pem | incus config set network.ovn.ca_cert -

That will result in your OVN connection string and certificates being loaded into the Incus config (requires Incus 0.4 or higher) which you can confirm with incus config show.

Our ongoing work to move to a pure-Go OVSDB client will eventually make it so you don’t need to install those two packages and don’t need to put that override in place to make ovs-vsctl work.

But there’s a lot of code that remains to be ported over to the new client so it’ll probably be Incus 0.6 or so before we’re done with this. Until then you need that small workaround to get things going with MicroOVN.


The FAN networking worked for us because customers run their own cluster, which means they’re running their apps on their own cluster inside their VPC on their own cloud.

That’s why it was fine and we didn’t need all the advanced features of OVN. I was wondering if it would be possible for you to re-enable Fan networking until all this OVN work has been resolved in 0.6.

I’ll read through this and try to make sense of it, and see if I can codify it into our Terraform module. We need something repeatable that we can do via Terraform, since clusters are set up in an automated / repeatable manner.

Sadly, no. If the Ubuntu FAN were just a kernel feature that we could detect, maybe we’d have kept it around (though not likely, as we really don’t want anything to be vendor-dependent). The other side of the Ubuntu FAN problem is that, because it’s not in the mainline kernel, it also requires a patched userspace, in the form of iproute2.

iproute2 (mostly known for the ip command) is a pretty critical core package, and having to require folks run a patched up version of this isn’t really something we’re comfortable with.
(The iproute2 contained in the LXD snap package obviously does have that change).

Ok I’ll try the workaround.

Also, I saw that in the Incus documentation you didn’t use MicroOVN, but manually set up OVN. Is the reason for that because we currently require this workaround?

Moving forward, what is the long-term recommended approach? Is MicroOVN recommended, or the manual setup route? We’re only looking at MicroOVN because it’s simple to set up.

Thank you for your response.

Ideally, in the future we’d just refer to upstream OVN documentation on how to install OVN rather than distro-specific instructions like we have today.

MicroOVN should be perfectly fine to use for Ubuntu users, so long as you understand how snaps refresh and make sure to either disable auto-refreshes or to plan maintenance time for those refreshes to happen (as OVN may go down during that time).
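Disabling auto-refreshes can be done with snapd’s hold feature (available in recent snapd releases); something like:

```shell
# Hold automatic refreshes for the microovn snap indefinitely
sudo snap refresh --hold microovn

# ...or only defer them until a planned maintenance window
sudo snap refresh --hold=24h microovn
```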

With Terraform and Ansible support for Incus having improved significantly over the past couple of weeks, I’d actually like to see a solid Ansible Incus playbook that can be used to deploy clusters with Ceph and OVN (optionally for those two), making things generally more reproducible and reliable.


So I followed the instructions you provided and tried setting up the network using OVN. Everything went OK until I ran this:

incus network create my-ovn --type=ovn

Basically after running that line the connection froze and then I got disconnected from the node.

I’m trying this on Amazon EC2 running inside a VPC. I guess OVN doesn’t work with Amazon’s VPC? The status check on the Amazon dashboard then failed.

Does this basically mean OVN is a no-go for my use case? It means I have to continue to use LXD; my only other alternative is to set up my own image server, which may be our only choice at this point.

As much as I’d like to use Incus, I think removing Fan networking basically erased the advantage of Incus over something like Kubernetes, which requires an overlay network. The Fan network offered simplicity and it basically worked everywhere: I’ve tried it on AWS, GCP, DO, Hetzner Cloud, etc., and it just worked. I think dropping support for Fan networking breaks the simplicity our setup offers.

Though I understand your concerns with Fan networking, for our use case — where customers run their own infra / networking and everything is isolated to their infrastructure — Fan networking makes a lot of sense given its simplicity.

That’s almost certainly a setup mistake that caused the physical NIC to be used as an uplink or something like this.

In isolated environments like the public cloud, you usually want to create a traditional bridge as your uplink and then use that as the backing for the OVN networks.
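On a standalone node, that could look roughly like this (names and address ranges are illustrative, matching the working example later in this thread; on a cluster you additionally need the per-member --target creation steps first):

```shell
# Plain managed bridge acting as the OVN uplink, with an address range
# reserved for OVN routers via ipv4.ovn.ranges
incus network create UPLINK --type=bridge \
    ipv4.address=10.123.123.1/24 ipv4.nat=true \
    ipv4.ovn.ranges=10.123.123.100-10.123.123.250 \
    ipv6.address=none

# OVN network backed by that uplink; instances attach to this network
incus network create my-ovn --type=ovn network=UPLINK
```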

I’m on vacation in Europe so I don’t exactly have time to sort out an AWS EC2 account to test this, but with just a basic setup of 3 VMs without any kind of fancy dedicated network for the uplink, this works fine here:

That’s a basic cluster of 3 machines on Incus 0.4 stable along with a matching MicroOVN cluster:

root@ovn-test01:~# microovn status
MicroOVN deployment summary:
- ovn-test01 (10.178.240.117)
  Services: central, chassis, switch
- ovn-test02 (10.178.240.224)
  Services: central, chassis, switch
- ovn-test03 (10.178.240.182)
  Services: central, chassis, switch
root@ovn-test01:~# incus cluster list
+------------+-----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
|    NAME    |             URL             |      ROLES      | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATE  |      MESSAGE      |
+------------+-----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| ovn-test01 | https://10.178.240.117:8443 | database-leader | x86_64       | default        |             | ONLINE | Fully operational |
|            |                             | database        |              |                |             |        |                   |
+------------+-----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| ovn-test02 | https://10.178.240.224:8443 | database        | x86_64       | default        |             | ONLINE | Fully operational |
+------------+-----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| ovn-test03 | https://10.178.240.182:8443 | database        | x86_64       | default        |             | ONLINE | Fully operational |
+------------+-----------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
root@ovn-test01:~# ovs-vsctl show
4f93b55e-de95-419a-b157-007b765f59f8
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port br-int
            Interface br-int
                type: internal
        Port ovn-ovn-te-1
            Interface ovn-ovn-te-1
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.178.240.182"}
        Port ovn-ovn-te-0
            Interface ovn-ovn-te-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.178.240.224"}
    ovs_version: "2.17.8"
root@ovn-test01:~# incus config show
config:
  cluster.https_address: 10.178.240.117:8443
  core.https_address: 10.178.240.117:8443
  network.ovn.ca_cert: |
    -----BEGIN CERTIFICATE-----
    MIIB+zCCAYGgAwIBAgIQfNgJqevusQuFvfhkyC9UBzAKBggqhkjOPQQDAzA/MREw
    DwYDVQQKEwhNaWNyb09WTjEUMBIGA1UECxMLTWljcm9PVk4gQ0ExFDASBgNVBAMT
    C01pY3JvT1ZOIENBMB4XDTIzMTIyNjIzMzE0OVoXDTMzMTIyMzIzMzE0OVowPzER
    MA8GA1UEChMITWljcm9PVk4xFDASBgNVBAsTC01pY3JvT1ZOIENBMRQwEgYDVQQD
    EwtNaWNyb09WTiBDQTB2MBAGByqGSM49AgEGBSuBBAAiA2IABEEkthA/9jeexgxh
    3/sFg8/noQHUmfvU4seFpxUGqGK9ByXVahvva5EKJVXCzKby51bOSblb3/HZut5c
    dF3snijkdtqUSvo7uM2WPwOHIVkK9GvX89j4bA5k41KyIadd3qNCMEAwDgYDVR0P
    AQH/BAQDAgEGMA8GA1UdEwEB/wQFMAMBAf8wHQYDVR0OBBYEFIaphP9hoQXAb4xN
    ICAoL23VOwRKMAoGCCqGSM49BAMDA2gAMGUCMQCBnYXQ6p7I/3/nRQpqqRaGyjcT
    RhNQ91i4N/yJTkQVrdyeZ0YPxZwj7HJQvj0FUMYCMGexbl4hX6z3j/YDJqtVJ2rg
    HqnAHSMi7LNBH0AbZv+njAMfgOzOLlgbzAXPyNCdJQ==
    -----END CERTIFICATE-----
  network.ovn.client_cert: |
    -----BEGIN CERTIFICATE-----
    MIIB8zCCAXqgAwIBAgIQXW3bpOXd6gJ5aKvsSNNZrDAKBggqhkjOPQQDAzA/MREw
    DwYDVQQKEwhNaWNyb09WTjEUMBIGA1UECxMLTWljcm9PVk4gQ0ExFDASBgNVBAMT
    C01pY3JvT1ZOIENBMB4XDTIzMTIyNjIzMzE0OVoXDTI1MTIyNTIzMzE0OVowOTER
    MA8GA1UEChMITWljcm9PVk4xDzANBgNVBAsTBmNsaWVudDETMBEGA1UEAxMKb3Zu
    LXRlc3QwMTB2MBAGByqGSM49AgEGBSuBBAAiA2IABEf5vO+gq3Iq9fCixmO/nvKL
    0ozhTaiC2CSncaiarfdGPvp1LMFMqq3oe3oR986yJ347QVowZRtGYg+ImKkj8pjS
    21RepM5iHVZF2QUU/zSGOceWTG+lYytkP4Mlop+i0aNBMD8wDgYDVR0PAQH/BAQD
    AgPoMAwGA1UdEwEB/wQCMAAwHwYDVR0jBBgwFoAUhqmE/2GhBcBvjE0gICgvbdU7
    BEowCgYIKoZIzj0EAwMDZwAwZAIwVqRUpI1FLlomr5oPRUjr5o93dmL2U6banDdm
    C3x2ikzIO3pPChDngI8rgBIPCLXlAjA9eJFn8Ep3kvoaxONbW070RMo/WnnTyLFZ
    bvhvZl4296zKMoyiNHGWw3SMhP2KuIA=
    -----END CERTIFICATE-----
  network.ovn.client_key: |
    -----BEGIN EC PRIVATE KEY-----
    MIGkAgEBBDCdbhKWQXSKpMM86BjCuew1/Jvfbb/xvrIW8I3EskWfJha2BuMieZyb
    eG34cwMH5NigBwYFK4EEACKhZANiAARH+bzvoKtyKvXwosZjv57yi9KM4U2ogtgk
    p3Gomq33Rj76dSzBTKqt6Ht6EffOsid+O0FaMGUbRmIPiJipI/KY0ttUXqTOYh1W
    RdkFFP80hjnHlkxvpWMrZD+DJaKfotE=
    -----END EC PRIVATE KEY-----
  network.ovn.northbound_connection: ssl:10.178.240.117:6641,ssl:10.178.240.224:6641,ssl:10.178.240.182:6641
root@ovn-test01:~# incus network create UPLINK --type=bridge --target ovn-test01
Network UPLINK pending on member ovn-test01
root@ovn-test01:~# incus network create UPLINK --type=bridge --target ovn-test02
Network UPLINK pending on member ovn-test02
root@ovn-test01:~# incus network create UPLINK --type=bridge --target ovn-test03
Network UPLINK pending on member ovn-test03
root@ovn-test01:~# incus network create UPLINK --type=bridge ipv4.address=10.123.123.1/24 ipv4.nat=true ipv4.dhcp.ranges=10.123.123.10-10.123.123.50 ipv4.ovn.ranges=10.123.123.100-10.123.123.250 ipv6.address=none
Network UPLINK created
root@ovn-test01:~# incus network create my-ovn --type=ovn
Error: Failed creating pending network for member "ovn-test01": Network is not in pending state
root@ovn-test01:~# incus network delete my-ovn
Network my-ovn deleted
root@ovn-test01:~# incus network create my-ovn --type=ovn
Network my-ovn created
root@ovn-test01:~# incus launch images:alpine/edge a1 --network my-ovn
Creating a1
Starting a1                                   
root@ovn-test01:~# incus launch images:alpine/edge a2 --network my-ovn
Creating a2
Starting a2                               
root@ovn-test01:~# incus launch images:alpine/edge a3 --network my-ovn
Creating a3
Starting a3                               
root@ovn-test01:~# incus list
+------+---------+--------------------+----------------------------------------------+-----------+-----------+------------+
| NAME |  STATE  |        IPV4        |                     IPV6                     |   TYPE    | SNAPSHOTS |  LOCATION  |
+------+---------+--------------------+----------------------------------------------+-----------+-----------+------------+
| a1   | RUNNING | 10.96.216.2 (eth0) | fd42:24a0:146:16b9:216:3eff:fe41:a383 (eth0) | CONTAINER | 0         | ovn-test01 |
+------+---------+--------------------+----------------------------------------------+-----------+-----------+------------+
| a2   | RUNNING | 10.96.216.3 (eth0) | fd42:24a0:146:16b9:216:3eff:fe28:4429 (eth0) | CONTAINER | 0         | ovn-test02 |
+------+---------+--------------------+----------------------------------------------+-----------+-----------+------------+
| a3   | RUNNING | 10.96.216.4 (eth0) | fd42:24a0:146:16b9:216:3eff:fe99:615 (eth0)  | CONTAINER | 0         | ovn-test03 |
+------+---------+--------------------+----------------------------------------------+-----------+-----------+------------+
root@ovn-test01:~# incus exec a1 -- ping 10.96.216.4
PING 10.96.216.4 (10.96.216.4): 56 data bytes
64 bytes from 10.96.216.4: seq=0 ttl=64 time=1.290 ms
64 bytes from 10.96.216.4: seq=1 ttl=64 time=0.751 ms
^C
--- 10.96.216.4 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.751/1.020/1.290 ms
root@ovn-test01:~# incus exec a2 -- ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 56 data bytes
64 bytes from 1.1.1.1: seq=0 ttl=55 time=28.548 ms
64 bytes from 1.1.1.1: seq=1 ttl=55 time=13.620 ms
^C
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 13.620/21.084/28.548 ms
root@ovn-test01:~# 

Hey thank you for the reply! I will do another attempt. This helps a lot!

So I tried this out, and I was able to get further. Here is what I have (I’m trying all this on a single node first)

I created a container and assigned the my-ovn network to it. I get no IP address when I run incus list:

+---------+---------+------+------+-----------+-----------+--------------+
|  NAME   |  STATE  | IPV4 | IPV6 |   TYPE    | SNAPSHOTS |   LOCATION   |
+---------+---------+------+------+-----------+-----------+--------------+
| test-01 | RUNNING |      |      | CONTAINER | 0         | incus-exp-01 |
+---------+---------+------+------+-----------+-----------+--------------+
| test-02 | RUNNING |      |      | CONTAINER | 0         | incus-exp-01 |
+---------+---------+------+------+-----------+-----------+--------------+

However, when I run incus config show test-01 I get:

incus config show test-01
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Alpine 3.18 amd64 (20231226_13:00)
  image.os: Alpine
  image.release: "3.18"
  image.requirements.secureboot: "false"
  image.serial: "20231226_13:00"
  image.type: squashfs
  image.variant: default
  volatile.base_image: 085372ea353dcf972c083d3d0daa46228568abda4752635fbf0052a8dc9dafa2
  volatile.cloud-init.instance-id: bec1ae5d-7532-43ff-a22f-e1f82694c89a
  volatile.eth0.host_name: vethe7ef5d76
  volatile.eth0.hwaddr: 00:16:3e:c4:32:ac
  volatile.eth0.last_state.ip_addresses: 10.68.9.2,fd42:8510:cb48:6882:216:3eff:fec4:32ac
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: bb51f5b3-1f2f-425d-8c52-6d74591748c7
  volatile.uuid.generation: bb51f5b3-1f2f-425d-8c52-6d74591748c7
devices:
  eth0:
    name: eth0
    network: my-ovn
    type: nic
ephemeral: false
profiles:
- default
stateful: false
description: ""

Another thing: I could run incus network attach-profile my-ovn default eth0, however when creating the container it gave me an error:

incus launch images:alpine/3.18 test-01
Creating test-01
Starting test-01
Error: Failed to start device "eth0": Parent device 'my-ovn' doesn't exist
Try `incus info --show-log local:test-01` for more info

Here is the incus profile show default

config: {}
description: Default Incus profile
devices:
  eth0:
    nictype: macvlan
    parent: my-ovn
    type: nic
  root:
    path: /
    pool: local
    type: disk
name: default
used_by:
- /1.0/instances/test-02

I changed the profile devices to this:

config: {}
description: Default Incus profile
devices:
  eth0:
    name: eth0
    network: my-ovn
    type: nic
  root:
    path: /
    pool: local
    type: disk
name: default
used_by:
- /1.0/instances/test-01

Basically, I modified the network device; it could then launch the container, but still no IP was assigned to the container.
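For what it’s worth, the same profile change can be made non-interactively instead of editing the YAML by hand (a sketch, using the device name eth0 from the profile above):

```shell
# Swap the macvlan NIC for one attached to the managed OVN network
incus profile device remove default eth0
incus profile device add default eth0 nic network=my-ovn name=eth0
```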

Here is config for the UPLINK network

config:
  ipv4.address: 10.1.0.1/24
  ipv4.dhcp.ranges: 10.1.0.10-10.1.0.50
  ipv4.nat: "true"
  ipv4.ovn.ranges: 10.1.0.100-10.1.0.250
  ipv6.address: none
description: ""
name: UPLINK
type: bridge
used_by:
- /1.0/networks/my-ovn
managed: true
status: Created
locations:
- incus-exp-01

Here is the config for the my-ovn network

config:
  bridge.mtu: "1500"
  ipv4.address: 10.68.9.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:8510:cb48:6882::1/64
  ipv6.nat: "true"
  network: UPLINK
  volatile.network.ipv4.address: 10.1.0.100
description: ""
name: my-ovn
type: ovn
used_by:
- /1.0/instances/test-01
- /1.0/profiles/default
managed: true
status: Created
locations:
- incus-exp-01

sudo ovs-vsctl show
d6cca92a-7270-48d3-b2be-82ce2f09bcb6
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port br-int
            Interface br-int
                type: internal
        Port veth0e2fd700
            Interface veth0e2fd700
    Bridge incusovn1
        Port incusovn1
            Interface incusovn1
                type: internal
        Port incusovn1b
            Interface incusovn1b
    ovs_version: "2.17.8"

UPDATE: I just re-did the setup again

I ran incus network list-allocations and saw this

+------------------------+--------------------------------------------+----------+------+-------------------+
|        USED BY         |                  ADDRESS                   |   TYPE   | NAT  | HARDWARE ADDRESS  |
+------------------------+--------------------------------------------+----------+------+-------------------+
| /1.0/networks/UPLINK   | 172.123.123.1/24                           | network  | true |                   |
+------------------------+--------------------------------------------+----------+------+-------------------+
| /1.0/networks/my-ovn   | 10.39.18.1/24                              | network  | true |                   |
+------------------------+--------------------------------------------+----------+------+-------------------+
| /1.0/networks/my-ovn   | fd42:bb1a:1acf:49aa::1/64                  | network  | true |                   |
+------------------------+--------------------------------------------+----------+------+-------------------+
| /1.0/instances/test-02 | 10.39.18.2/32                              | instance | true | 00:16:3e:80:86:47 |
+------------------------+--------------------------------------------+----------+------+-------------------+
| /1.0/instances/test-02 | fd42:bb1a:1acf:49aa:216:3eff:fe80:8647/128 | instance | true | 00:16:3e:80:86:47 |
+------------------------+--------------------------------------------+----------+------+-------------------+

Which is odd, because based on this it’s getting the IP 10.39.18.2, but when I’m in the container I can’t access the internet, nor does the IP show up in incus list.

Hi @stgraber I tried this little experiment on a different cloud provider (DigitalOcean) and it worked in one go. There wasn’t much to do beyond your instructions. Once you ship Incus 0.6, it’ll basically be as seamless as Fan networking. So it seems something is up with AWS; I’ll dig around a little deeper to see what’s going on.

Here is the output on DigitalOcean:

ovs-vsctl show
b2d890e2-5de2-4621-a5fb-a31f9edb3396
    Bridge incusovn1
        Port incusovn1
            Interface incusovn1
                type: internal
        Port incusovn1b
            Interface incusovn1b
        Port patch-incus-net3-ls-ext-lsp-provider-to-br-int
            Interface patch-incus-net3-ls-ext-lsp-provider-to-br-int
                type: patch
                options: {peer=patch-br-int-to-incus-net3-ls-ext-lsp-provider}
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port br-int
            Interface br-int
                type: internal
        Port veth23e4e71d
            Interface veth23e4e71d
        Port patch-br-int-to-incus-net3-ls-ext-lsp-provider
            Interface patch-br-int-to-incus-net3-ls-ext-lsp-provider
                type: patch
                options: {peer=patch-incus-net3-ls-ext-lsp-provider-to-br-int}
    ovs_version: "2.17.8"

The AWS version looks very different from this.

Maybe an AWS security group that’s blocking the Geneve tunnel traffic on the VPC? Or some firewall policies blocking the OVN database traffic?
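For reference, Geneve encapsulation runs over UDP port 6081, and the OVN northbound/southbound databases default to TCP 6641 and 6642. A security-group rule allowing the tunnel traffic between cluster members could look like this (the group ID is a placeholder):

```shell
# Allow Geneve (UDP 6081) between instances in the same security group
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol udp --port 6081 \
    --source-group sg-0123456789abcdef0
```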

I’m still looking into it. I’ve opened up all the ports on the security group. I’ll let you know what I find.

@stgraber So I managed to get it working on AWS. However I haven’t figured out why my previous attempts failed.

The only thing I changed in this last attempt was that I created my own VPC / subnet / route table etc… from scratch, instead of using the default one that came with the AWS account. I created a VPC with the CIDR 10.0.0.0/16 and subnets with 10.0.0.0/24 and 10.0.16.0/24.

The default one that came with AWS had a CIDR block of 172.31.0.0/16 and subnets with CIDRs of 172.31.16.0/20, 172.31.32.0/20, and 172.31.0.0/20.

No matter what I did the instance created in the default VPC / subnets just would not work.

I checked the MicroOVN logs (on the broken instance, the one created in the default VPC) and found this:

2023-12-27T14:36:56Z ovn-controller[15070]: ovs|01645|main|INFO|OVNSB commit failed, force recompute next time.
2023-12-27T14:37:01Z ovn-controller[15070]: ovs|01646|main|INFO|OVNSB commit failed, force recompute next time.
2023-12-27T14:37:03Z ovn-controller[15070]: ovs|01647|ovsdb_idl|WARN|Dropped 19 log messages in last 57 seconds (most recently, 2 seconds ago) due to excessive rate
2023-12-27T14:37:03Z ovn-controller[15070]: ovs|01648|ovsdb_idl|WARN|transaction error: {"details":"Transaction causes multiple rows in \"Encap\" table to have identical values (geneve and \"172.31.60.164\") for index on columns \"type\" and \"ip\".  First row, with UUID d71c98b2-f507-4de4-a6a1-ec8aee8c71dc, was inserted by this transaction.  Second row, with UUID 9c23ca33-479a-424a-b35d-6defb93865f1, existed in the database before this transaction and was not modified by the transaction.","error":"constraint violation"}
2023-12-27T14:37:03Z ovn-controller[15070]: ovs|01649|main|INFO|OVNSB commit failed, force recompute next time.
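One way to inspect those duplicate tunnel endpoints is to list the Encap table in the OVN southbound database (this assumes MicroOVN exposes an ovn-sbctl wrapper command, which may vary by version):

```shell
# List registered tunnel endpoints; two rows sharing the same
# (type, ip) pair trigger the constraint violation seen in the log above
microovn.ovn-sbctl list Encap
```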

Anyway, the lesson here is: don’t use the default VPC / subnets; create the network from scratch (which we do by default anyway through the Terraform module) and everything should work “out of the box”. (The log above suggests two chassis ended up registering the same Geneve tunnel IP, 172.31.60.164, in the default VPC.) I’m going to mark this as “another mystery of AWS” and call it a day.