Hi, I have a two-node cluster that just updated itself … I noticed when I started getting errors in my backup (!) … It would seem that if I create a new network (called, say, “public”) in the GUI, it is created in an ERRORED state. There are no errors in the log file that I can see.
Not only that, the GUI then fails to delete the network when asked.
Dropping to the CLI, it also shows the network in the errored state, but allows me to delete it. If I then run:
incus network create public --target=rpi1
incus network create public --target=rad
incus network create public
It works quite happily.
I’ve repeated a couple of times with different network names, same issue.
It’s about creation of non-OVN networks in a cluster environment. It looks like it’s not attempting the per-target creation and goes straight to creating the global item, which triggers some kind of failure.
Yes, I’m no longer using OVN anywhere. Regular networks, default settings.
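For reference, the CLI sequence that works earlier in the thread (one create per cluster member with --target, then a final create without it) can be wrapped in a small script. This is only a sketch: the function names and the DRY_RUN switch are made up for illustration, and by default it prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the two-phase network creation used on the CLI in this thread.
# DRY_RUN is a hypothetical switch for this example: 1 (default) prints the
# commands, 0 executes them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: create_cluster_network <network-name> <member> [<member> ...]
create_cluster_network() {
    name="$1"
    shift
    # Phase 1: define the network on each cluster member.
    for member in "$@"; do
        run incus network create "$name" --target="$member" || return 1
    done
    # Phase 2: create the network cluster-wide.
    run incus network create "$name"
}
```

Calling create_cluster_network public rpi1 rad prints the same three commands shown above; setting DRY_RUN=0 would actually run them.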
@oddjobz I wasn’t able to reproduce the issue on my end, so it doesn’t seem to be a general UI problem. That said, could you please provide a bit more detail?
It would be helpful if you could open your browser’s Developer Tools and check the Console tab for any errors. Also, take a look at the Network tab and, in particular, check the following requests:
/1.0/networks?project=<project>&target=rpi1
/1.0/networks?project=<project>&target=rad
/1.0/networks?project=<project>
Do any of these requests fail or return something other than a 2xx status? If so, could you share the response details?
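If it is easier than digging through the Network tab, the same three paths can be generated for any project and member list. A minimal sketch, where the function name network_list_paths is just an illustration:

```shell
#!/bin/sh
# Print the API paths listed above: one per cluster member (with target=)
# followed by the cluster-wide one.
# Usage: network_list_paths <project> <member> [<member> ...]
network_list_paths() {
    project="$1"
    shift
    for member in "$@"; do
        printf '/1.0/networks?project=%s&target=%s\n' "$project" "$member"
    done
    printf '/1.0/networks?project=%s\n' "$project"
}
```

Assuming the incus query subcommand is available on your client, each printed path could then be passed to it (quoted, because of the ? and & characters) to see the raw response outside the browser.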
Ok, so trying it again now, I get this in the JS console;
index-CfMUciCt.js:93 
GET https://rpi1:8443/1.0/networks/t?project=default 404 (Not Found)
index-CfMUciCt.js:93 
GET https://rpi1:8443/1.0/networks/te?project=default 404 (Not Found)
index-CfMUciCt.js:93 
GET https://rpi1:8443/1.0/networks/tes?project=default 404 (Not Found)
index-CfMUciCt.js:93 
GET https://rpi1:8443/1.0/networks/test?project=default 404 (Not Found)
index-CfMUciCt.js:93 
GET https://rpi1:8443/1.0/networks/test?project=default 404 (Not Found)
index-CfMUciCt.js:93 
GET https://rpi1:8443/1.0/networks/test?project=default 404 (Not Found)
useNetworks-4mRAJCTt.js:1 
POST https://rpi1:8443/1.0/networks?project=default net::ERR_NETWORK_CHANGED
useNetworks-4mRAJCTt.js:1 
GET https://rpi1:8443/1.0/networks/test?project=default net::ERR_NETWORK_CHANGED
useNetworks-4mRAJCTt.js:1 
Uncaught (in promise) TypeError: Failed to fetch
at f (useNetworks-4mRAJCTt.js:1:692)
at useNetworks-4mRAJCTt.js:1:1769
Leaves me with a green spinner on the button, i.e. doesn’t complete.
If I reload the GUI I see;
Connections
test
General
Type: Bridge
Description: -
IPv4 address: 10.99.223.1/24
IPv6 address: fd42:7156:3344:41f9::1/64
ACLs: -
RX: 0 B (0 packets)
TX: 0 B (0 packets)
Status: Errored
@oddjobz What’s strange is that I don’t see a request with the target parameter from you. But even if no such request was sent, you should still get:
Error: Network not pending on any node (use --target <node> first)
or, if the request was sent to only one server:
Error: Network not defined on nodes: <node1>, <node2>
So I’m still having trouble reproducing this error. Could you also provide the version of the UI you’re using?
So just to update: this issue is coming from my original cluster, which inadvertently upgraded to 6.14. I have a “new” production cluster, with all machines installed on 6.14 from scratch, which seems to work OK.
I also see that you’re receiving net::ERR_NETWORK_CHANGED, which can be caused by a broken DNS configuration. Can you try connecting using the IP address?
Mm, OK, so a constant ping from the machine running Chrome during the process shows no packet loss …
Two issues here:
This browser error is unlikely to be the reason the network create (and delete) is failing.
Although this “looks” like a network issue, again that seems unlikely.
The network create is performed server-side, so any communication with the browser should be irrelevant in this context, other than to update the browser. So while there is a problem with the communication, it looks to be a separate issue.
Most of the references to this error indicate a physical networking issue. However, the browser in question is used ‘heavily’ for maybe 12 hours per day performing many tasks, so there is no generic problem with it. The Incus server is a production, internet-facing server running 4 live containers … generally speaking it seems to have no operational problems.
The Incus UI on the box generally works fine. It “only” errors in this instance, when trying to create or delete a network. When you do a deep-dive into the error, it seems it can also be caused by the server “aggressively closing the socket”. Given the process is failing server-side, an aggressive socket close would seem to be a potential side-effect of the problem.
The first thing I wonder about is that the server is generating a network in an errored state (for whatever reason) yet has produced no entries in the Incus log file. Is there an easy way to raise the log level on the server? (Or is there somewhere else I should be looking for logs?)