SSL handshake networking issue with container

So today I logged into my machine to do some maintenance work, and I noticed that my containers are struggling to establish any SSL connections: the SSL handshake is failing.

HTTP traffic works and I can ping servers just fine. I think this just started with LXD 3.16.

Initially, after disabling LXD with sudo snap disable lxd and re-enabling it with sudo snap enable lxd, SSL traffic works for a short amount of time and then stops again. Any clue on how I would go about resolving this?

I tried with both an Ubuntu Disco and an Alpine 3.10 container, getting the same results.

On the host I’m able to access all SSL traffic just fine, it’s just inside the container that it’s failing.

So that’s traffic coming from inside the containers?
Is it an MTU issue? That would explain differing behavior based on packet size.

Yes it’s outgoing traffic originating from inside the container.

I have no clue. I can just describe the problem. If you give me some commands I can try it out. How would I go about changing the MTU?

You’d want to ping some external server and try to see if packets are getting dropped when using the full 1500 MTU.

Something like this for example:

And see if all 4 of them make it through fine.
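The commands themselves didn't survive in this copy of the thread; here is a sketch of the kind of don't-fragment ping test meant above (the target host and the exact payload sizes are assumptions, with the payload arithmetic shown explicitly):

```shell
# Max ICMP payload = MTU minus 28 bytes (20-byte IPv4 header + 8-byte ICMP header)
MTU=1500
echo $((MTU - 28))    # payload that fills a 1500 MTU
MTU=1460
echo $((MTU - 28))    # payload that fills a 1460 MTU

# With -M do (don't fragment), pings at the larger size fail with
# "Message too long" / timeouts if the path MTU is below 1500
# (example target host, run from inside the container):
# ping -M do -s 1472 -c 1 linuxcontainers.org
# ping -M do -s 1432 -c 1 linuxcontainers.org
```

If the smaller size gets replies but the larger one doesn't, packets above the smaller size are being dropped somewhere on the path.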

The first 2 commands worked; the last 2 returned errors, as shown in the screenshot.

Right, so you’re on a system which is dealing with an MTU of 1460, apparently.

You probably should set your bridge MTU to 1460 to avoid such issues.

lxc network set lxdbr0 bridge.mtu 1460 and then restart your containers.

Getting this error
Error: Maximum MTU for a VXLAN FAN bridge is 1450

I’ve tried a few values: 1450 (the max), and, based on this article

https://cloud.google.com/vpn/docs/concepts/mtu-considerations

I also tried 1390. Neither seems to make any difference to the problem.

Here is the output of lxd info:

  driver: lxc
  driver_version: 3.2.1
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    shiftfs: "false"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 5.0.0-1013-gcp
  lxc_features:
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    seccomp_notify: "true"
  project: default
  server: lxd
  server_clustered: true
  server_name: artello-foundation-hv2v
  server_pid: 1155
  server_version: "3.16"
  storage: zfs
  storage_version: 0.7.12-1ubuntu5

Ah, I didn’t know you were on a fan bridge, so that does make things a bit different.
Try to unset the MTU property so things go back to the previous behavior:

  • lxc network unset lxdbr0 bridge.mtu
  • restart the container

Then show both:

  • ifconfig lxdbr0 from the host
  • ifconfig eth0 in the container

Normally the MTU on both should line up as it’s advertised by dnsmasq to the container.
If that’s the case, then the issue is fragmentation going out of the host, which can likely be worked around with a tiny bit of iptables MSS mangling.
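The "MSS mangling" mentioned here usually means clamping TCP MSS to the path MTU on forwarded traffic; a minimal sketch, assuming the standard netfilter TCPMSS target (the rule itself needs root, so it is shown commented, with the MSS arithmetic spelled out):

```shell
# MSS = MTU minus 40 bytes (20-byte IPv4 header + 20-byte TCP header)
MTU=1410
echo $((MTU - 40))    # the MSS a clamp would advertise for a 1410 MTU

# Rewrite the MSS option on forwarded SYN packets so TCP never builds
# segments larger than the discovered path MTU (run as root on the host):
# iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
#          -j TCPMSS --clamp-mss-to-pmtu
```

Because the MSS is negotiated at connection setup, clamping the SYNs is enough to keep both directions of the connection under the path MTU.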

Here is the ifconfig from the host:

And here is the one from inside the container:

Also, here is the output of my iptables:

zacksiri@artello-foundation-hv2v:~$ sudo iptables -t filter -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:53 /* generated for LXD network lxdfan0 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:53 /* generated for LXD network lxdfan0 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:67 /* generated for LXD network lxdfan0 */

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdfan0 */
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdfan0 */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp spt:53 /* generated for LXD network lxdfan0 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp spt:53 /* generated for LXD network lxdfan0 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp spt:67 /* generated for LXD network lxdfan0 */

I have already unset bridge.mtu and restarted the container. I’m curious how this could just start happening all of a sudden: it was working fine and then it just stopped.

So I did some experimenting.

You were right on the mark with the MTU stuff. Apparently the containers are all inheriting the wrong MTU configuration: they’re all using MTU 1500.

When I manually went inside the container and ran

ifconfig eth0 mtu 1410 # matching the MTU that ifconfig lxdfan0 shows on the host

all HTTPS traffic resumed working.

Now I just need to figure out how to set this config by default.

Could I do this via the profile?

To answer my own question: yes, all I had to do was add an MTU config to the nic device and it’s all done!
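For reference, a sketch of what that nic device config can look like in the profile (the device name eth0 and the parent lxdfan0 are assumptions based on the output above; edit with lxc profile edit default or set it with lxc profile device set):

```yaml
devices:
  eth0:
    type: nic
    nictype: bridged
    parent: lxdfan0
    mtu: "1410"
```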

Thanks @zacksiri

If this just started with LXD 3.16, I’ll check that the MTU inheritance behaviour is working as it should on that version.

Confirmed this is a bug in LXD 3.16 (although it appears to have existed in some form before that but only when devices were hotplugged into a container, rather than started on boot).

The fix got merged upstream and is being cherry-picked to stable; ETA is a couple of hours, at which point snap refresh lxd will include the fix.

Wonderful! Super happy with this project.