LXD 3.0.3: Failed to create proxy devices listening on host port: Failed setns to container user namespace: Invalid argument

Hello fellow LXC/LXD users!

Recently I started running into the following problem: adding a new proxy device to a running container, or starting a container that already has proxy devices attached, fails with the error messages shown below.

First a brief description of my setup:

  • Host OS: Ubuntu 18.04.5 LTS
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.5 LTS
Release:        18.04
Codename:       bionic
$ lxc info
config:
  core.https_address: 123.123.123.12:8443
  core.trust_password: true
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway                   
- file_get_symlink         
- network_leases
- unix_device_hotplug         
- storage_api_local_volume_handling
- operation_description  
- clustering                 
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- candid_authentication
- candid_config
- candid_config_key        
- usb_optional_vendorid      
api_status: stable      
api_version: "1.0"  
auth: trusted             
public: false              
auth_methods:
- tls           
environment:    
  addresses:              
  - 123.123.123.12:8443
  architectures:                
  - x86_64   
  - i686                    
  certificate: |    
    -----BEGIN CERTIFICATE-----
  (REDACTED)
    -----END CERTIFICATE-----
  certificate_fingerprint: fb96c36fa9943ecb90898f69c725be08d1f2c08195c0b9b4ac613b5aadc2add7
  driver: lxc                 
  driver_version: 3.0.3         
  kernel: Linux
  kernel_architecture: x86_64
  kernel_version: 4.15.0-139-generic
  server: lxd            
  server_pid: 2465
  server_version: 3.0.3
  storage: btrfs 
  storage_version: 4.15.1
  server_clustered: false
  server_name: caroline
  project: "" 
$ lxc version
Client version: 3.0.3
Server version: 3.0.3

Steps to reproduce:

  1. Create new container
  2. Add proxy device listening on host port
$ lxc launch ubuntu:20.04 proxy-test
$ lxc config device add proxy-test http proxy listen=tcp:0.0.0.0:8000 connect=tcp:127.0.0.1:80 bind=host
Error: Error occurred when starting proxy device: Failed to run: /usr/lib/lxd/lxd forkproxy 2465 tcp:0.0.0.0:8000 24302 tcp:127.0.0.1:80 /var/log/lxd/proxy-test/proxy.http.log /var/lib/lxd/devices/proxy-test/proxy.http:
$ lxc info --show-log proxy-test
Name: proxy-test
Remote: unix://
Architecture: x86_64
Created: 2021/03/23 11:13 UTC
Status: Running
Type: persistent
Profiles: default
Pid: 24302
Ips:
eth0: inet    10.73.42.161    veth70RX3J
eth0: inet6   fe80::216:3eff:fe22:bdff        veth70RX3J
lo:   inet    127.0.0.1
lo:   inet6   ::1
Resources:
Processes: 62
CPU usage:
CPU usage (in seconds): 14
Memory usage:
Memory (current): 380.96MB
Memory (peak): 425.89MB
Network usage:
eth0:
Bytes received: 22.91kB
Bytes sent: 35.19kB
Packets received: 320
Packets sent: 544
lo:
Bytes received: 4.41kB
Bytes sent: 4.41kB
Packets received: 47
Packets sent: 47

Log:

lxc proxy-test 20210323111332.853 WARN     conf - conf.c:lxc_setup_devpts:1616 - Invalid argument - Failed to unmount old devpts instance

$ sudo cat /var/log/lxd/proxy-test/proxy.http.log
Failed setns to container user namespace: Invalid argument
Broken pipe - Failed to send file descriptor
Error: Failed to send file descriptor via abstract unix socket
$ lxc config show proxy-test --expanded
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 18.04 LTS amd64 (release) (20210319)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20210319"
  image.version: "18.04"
  volatile.base_image: a1225cfdd3d11210f647fd457b610773c4e8f2304427c3b5283b639d7923c69f
  volatile.eth0.hwaddr: 00:16:3e:39:fa:46
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.power: RUNNING
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

I hope the information provided is sufficient for debugging the problem. If not, please ask :wink:

Best regards
Constantin

*push*

Trying to raise awareness for this topic…

Does anyone face the same issue?

Hi!

This looks to be the affected file, https://github.com/lxc/lxd/blob/stable-3.0/lxd/main_nsexec.go and the affected line (from the first error message): https://github.com/lxc/lxd/blob/stable-3.0/lxd/main_nsexec.go#L195

I tried to replicate the issue in an LXD VM with Ubuntu 18.04 but couldn’t. The proxy device command worked for me. Here are my steps (on the host I used LXD 4.x from the snap package, which has VM support).

$ lxc launch ubuntu:18.04 --vm myvm --profile default --profile vm   # requires a "vm" profile, see docs.
Creating myvm
Starting myvm
$ lxc console myvm
...
$ sudo lxd init    # configured to use btrfs storage pool as in your case. I used a loop device.
...
ubuntu@myvm:~$ lxc version
Client version: 3.0.3
Server version: 3.0.3
ubuntu@myvm:~$ lxc launch ubuntu:20.04 proxy-test
To start your first container, try: lxc launch ubuntu:18.04

Creating proxy-test
Starting proxy-test                         
ubuntu@myvm:~$ lxc config device add proxy-test http proxy listen=tcp:0.0.0.0:8000 connect=tcp:127.0.0.1:80 bind=host
Device http added to proxy-test
ubuntu@myvm:~$ lxc list proxy-test
+------------+---------+-----------------------+-----------------------------------------------+------------+-----------+
|    NAME    |  STATE  |         IPV4          |                     IPV6                      |    TYPE    | SNAPSHOTS |
+------------+---------+-----------------------+-----------------------------------------------+------------+-----------+
| proxy-test | RUNNING | 10.198.185.144 (eth0) | fd42:9a20:f2a9:cc46:216:3eff:fe88:f4f1 (eth0) | PERSISTENT | 0         |
+------------+---------+-----------------------+-----------------------------------------------+------------+-----------+
ubuntu@myvm:~$ lxc restart proxy-test
ubuntu@myvm:~$
Output of lxc info

ubuntu@myvm:~$ lxc info
config: {}
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- candid_authentication
- candid_config
- candid_config_key
- usb_optional_vendorid
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    REDACTED
    -----END CERTIFICATE-----
  certificate_fingerprint: b5a2019d224aac8d9e3d87961144107c84c5c0f9e8a6dfd7e7e8f8ec56857978
  driver: lxc
  driver_version: 3.0.3
  kernel: Linux
  kernel_architecture: x86_64
  kernel_version: 4.15.0-140-generic
  server: lxd
  server_pid: 1223
  server_version: 3.0.3
  storage: btrfs
  storage_version: 4.15.1
  server_clustered: false
  server_name: myvm
  project: ""
ubuntu@myvm:~$

You have described your environment quite well. I do not know whether running inside a VM keeps this issue from showing up. I am running Ubuntu 18.04 LTS as well, but switched to the snap package (hence, running LXD 4.12).

There has been a similar report affecting nearby code; it was found to be reproducible and has been fixed: https://github.com/lxc/lxd/issues/6112

The next step is to figure out whether this issue does happen on a pristine Ubuntu 18.04 or whether there has been some other configuration that is affecting LXD.
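
As a possible first check (a sketch, not a confirmed diagnosis): setns(2) documents EINVAL for a user namespace when the caller is already a member of that namespace, so it may be worth comparing the user namespace of the LXD daemon with that of the container's init process. The helper names and the PROC_ROOT override below are hypothetical and not part of LXD; the PIDs 2465 and 24302 are the ones that appear in the forkproxy command line of the error message.

```shell
#!/bin/sh
# Hypothetical helpers, not part of LXD.
# PROC_ROOT defaults to /proc; overriding it is only meant for testing.

# userns_of PID: print the user namespace link target of PID,
# e.g. "user:[4026531837]". Fails if the link cannot be read.
userns_of() {
    readlink "${PROC_ROOT:-/proc}/$1/ns/user"
}

# same_userns PID_A PID_B: return 0 if both PIDs share a user namespace,
# 1 if they differ, and 2 if either link could not be read.
same_userns() {
    a=$(userns_of "$1") || return 2
    b=$(userns_of "$2") || return 2
    [ "$a" = "$b" ]
}
```

Run as root (reading another process's namespace links needs privileges), calling same_userns 2465 24302 and inspecting $? would show whether the two processes share a user namespace. If they do, that would at least be consistent with the "Invalid argument" from setns; if they differ, the cause lies elsewhere.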


When you get the error, can you please provide the output of sudo ss -tlpn on your LXD host?

Also, is it possible to use the LXD 4.0 LTS release or higher? There have been some fixes to the proxy listening code that may be relevant here.
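
To make the ss check quicker to repeat, here is a minimal sketch that scans ss -tln style output for a given listening port, for example the 8000 used in the reproduction steps. The function name port_listening is hypothetical, and the sketch assumes the column layout that ss -tln prints (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port).

```shell
#!/bin/sh
# Hypothetical helper: reads `ss -tln`-style output on stdin and succeeds
# if any LISTEN socket has a local address ending in :PORT.
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            # The port is whatever follows the last colon, which also
            # works for IPv6 forms such as [::]:2233.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage would be along the lines of: sudo ss -tln | port_listening 8000 && echo "bound" || echo "not bound".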

Hello @simos and @tomp,

Thanks to both of you for your replies.

@simos
In contrast to your experiment, I worked with containers only when experiencing the problem and have not tried it with VMs yet.

Since the server was set up, there have been quite a few modifications to the initial configuration, and I may not be aware of all of them. But it seems to me that the error first occurred after a system software update and not after changes to the LXD configuration.

What is the preferred way to migrate from the deb package installation to the newer snap version?


@tomp
ss gives the following output, which is missing the ports that the configured proxy devices should be listening on:

➜  ~ sudo ss -tlpn
State  Recv-Q   Send-Q                           Local Address:Port      Peer Address:Port
LISTEN 0        32                                  10.73.42.1:53             0.0.0.0:*      users:(("dnsmasq",pid=2787,fd=7))
LISTEN 0        32                                   127.0.0.1:53             0.0.0.0:*      users:(("dnsmasq",pid=1783,fd=5))
LISTEN 0        32                                    10.0.3.1:53             0.0.0.0:*      users:(("dnsmasq",pid=1385,fd=7))
LISTEN 0        128                                    0.0.0.0:2233           0.0.0.0:*      users:(("sshd",pid=1750,fd=3))
LISTEN 0        128                             [WAN_IP_REDACTED]:8443           0.0.0.0:*      users:(("lxd",pid=2543,fd=22))
LISTEN 0        128                                  127.0.0.1:8125           0.0.0.0:*      users:(("netdata",pid=4383,fd=83))
LISTEN 0        128                                  127.0.0.1:8126           0.0.0.0:*      users:(("trace-agent",pid=1582,fd=8))
LISTEN 0        128                                    0.0.0.0:19999          0.0.0.0:*      users:(("netdata",pid=4383,fd=4))
LISTEN 0        128                                  127.0.0.1:5000           0.0.0.0:*      users:(("agent",pid=1581,fd=8))
LISTEN 0        128                                  127.0.0.1:5001           0.0.0.0:*      users:(("agent",pid=1581,fd=9))
LISTEN 0        128                                  127.0.0.1:6379           0.0.0.0:*      users:(("redis-server",pid=1731,fd=6))
LISTEN 0        128                                  127.0.0.1:6062           0.0.0.0:*      users:(("process-agent",pid=1626,fd=10))
LISTEN 0        32          [fe80::94d3:dfff:fe7f:2f81]%lxdbr0:53                [::]:*      users:(("dnsmasq",pid=2787,fd=9))
LISTEN 0        32                                       [::1]:53                [::]:*      users:(("dnsmasq",pid=1783,fd=7))
LISTEN 0        128                                       [::]:2233              [::]:*      users:(("sshd",pid=1750,fd=4))
LISTEN 0        128                                       [::]:19999             [::]:*      users:(("netdata",pid=4383,fd=5))
LISTEN 0        128                                      [::1]:6379              [::]:*      users:(("redis-server",pid=1731,fd=7))

Best regards,
Constantin

You should be able to do:

sudo snap install lxd --channel=4.0/stable
sudo lxd.migrate

See also Managing the LXD snap