Linux for Tegra (L4T) networking issues

Hey Forum – this feels like a basic question with a basic networking answer that I ought to be able to find with Google, but I’ve so far struck out.

I’m running Linux for Tegra (based on Ubuntu 18.04) on an NVIDIA Jetson TX2. I’m trying to use LXD containers for a few topics and am struggling with networking. Here are my setup steps:

sudo snap install lxd
sudo lxd init
[accept the defaults for all values]
lxc launch ubuntu:18.04 test
lxc ls

The output of lxc ls shows that my container has no IP address. If I get a shell via lxc exec test bash I’m unable to ping due to ping: socket: operation not permitted.

I can get around this by making the container privileged via lxc config set test security.privileged true and lxc restart test – now I get IP addresses and can ping.

I’m curious why this is – I’m new to LXD but have been using it extensively on Ubuntu 18.04/20.04 and have never needed to run a container as privileged before. Furthermore I’m having some other issues running snapcraft which I fear might be related.

Can you show output of the following commands please:

lxc network ls
lxc network show lxdbr0
lxc config show test --expanded

$ lxc network ls
+--------+----------+---------+-------------+---------+
|  NAME  |   TYPE   | MANAGED | DESCRIPTION | USED BY |
+--------+----------+---------+-------------+---------+
| eth0   | physical | NO      |             | 0       |
+--------+----------+---------+-------------+---------+
| l4tbr0 | bridge   | NO      |             | 0       |
+--------+----------+---------+-------------+---------+
| lxdbr0 | bridge   | YES     |             | 3       |
+--------+----------+---------+-------------+---------+
| rndis0 | physical | NO      |             | 0       |
+--------+----------+---------+-------------+---------+
| usb0   | physical | NO      |             | 0       |
+--------+----------+---------+-------------+---------+
| wlan0  | physical | NO      |             | 0       |
+--------+----------+---------+-------------+---------+
$ lxc network show lxdbr0
config:
  ipv4.address: 10.26.226.1/24
  ipv4.nat: "true"
  ipv6.address: none
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/instances/novel-amoeba
- /1.0/instances/snapbuilder
- /1.0/instances/snapcraft-pds
- /1.0/instances/test
managed: true
status: Created
locations:
- none
$ lxc config show test --expanded
architecture: aarch64
config:
  image.architecture: arm64
  image.description: ubuntu 18.04 LTS arm64 (release) (20200506)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20200506"
  image.type: squashfs
  image.version: "18.04"
  volatile.base_image: 5adf1912b637d1511279c0471ab56d3374c49f21774c094acd41330bbb9548e5
  volatile.eth0.host_name: veth5cabaa5c
  volatile.eth0.hwaddr: 00:16:3e:c0:25:6c
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""
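
For anyone reading along, the volatile.idmap.current value in that output can be decoded to see which host ID the container's root maps to. This is a minimal Python sketch, assuming the JSON layout shown above; the helper name host_id is made up for illustration and is not part of LXD.

```python
import json

# Value copied from volatile.idmap.current in the config output above.
IDMAP = (
    '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},'
    '{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
)


def host_id(idmap_json, container_id, kind="uid"):
    """Translate a container UID/GID to its host ID, or None if unmapped.

    Hypothetical helper: it only interprets the JSON fields visible in
    the output above (Isuid, Isgid, Hostid, Nsid, Maprange).
    """
    flag = "Isuid" if kind == "uid" else "Isgid"
    for entry in json.loads(idmap_json):
        offset = container_id - entry["Nsid"]
        if entry[flag] and 0 <= offset < entry["Maprange"]:
            return entry["Hostid"] + offset
    return None


if __name__ == "__main__":
    print(host_id(IDMAP, 0))            # container root -> 1000000
    print(host_id(IDMAP, 1000, "gid"))  # container GID 1000 -> 1001000
```

Root inside the container is host UID 1000000, so the container is unprivileged: its root user is an ordinary unprivileged user as far as the host is concerned.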

Hey @tomp any red flags in my config output?

No that all seems fine.

Can you add an IP manually inside the container and try pinging the bridge, e.g.

lxc exec test -- ip a add 10.26.226.2/24 dev eth0
lxc exec test -- ping 10.26.226.1

Assigning the IP address worked (and is reflected in lxc ls), but ping still fails with ping: socket: Operation not permitted.

This is consistent with my description – I suspect if I make the container privileged ping will work, but I also suspect I’ll get IP addresses from DHCP.

I wonder if there are some additional security settings on the host OS that prevent unprivileged users from sending raw packets (most likely used for ping and DHCP).

@stgraber @brauner is there anything you can think of that could cause this behavior?

Feels like it could be a kernel restriction or LSM maybe…

Anything suspicious in dmesg?

@brauner

Is ping either setuid, or does it have the CAP_NET_RAW capability set?

Running as root though

Hey just following up that I’m still interested in working through this topic – LXD on Linux for Tegra will be critical for building snaps targeting the platform.

Do let me know if there’s additional information I can collect!

Can we get a login to it?

I’ll do my best to set that up today. However in the recent past I was unsuccessful getting SSH ports forwarded through my Verizon FIOS router so I’m not optimistic.

@tomp – actually looks like ssh port forwarding to the device is working. No idea what was going on the last time I tried.

Anyway – how do you want to handle access? Maybe you can share a public key and I’ll create a sudoers account on the device for you?

FYI, as mentioned over on the snapcraft thread, I’ve managed to get snapcraft building snaps inside an LXD container on L4T through a combination of a privileged container, exclusion from AppArmor, and some other raw.lxc changes.

I’m not sure whether this is actually the solution to the core LXD problem on L4T, but I also don’t really care since my objective is just to build snaps inside LXD containers on this platform. I’m happy to finish helping debug if the core LXD team is interested to do so, though – let me know.

My ssh public key is here: https://launchpad.net/~tomparrott/+sshkeys

sudo root access would be needed.

Thanks

ssh uskellse@108.52.92.202 -p 55555 should get you in. Passwordless sudo enabled. Feel free to modify anything; no critical data or state is stored on the machine.

Please let me know when you’re done so I can lock everything back down… :smiley:

Thanks, so I took a look.

You’re running the 4.9.140-tegra kernel, which isn’t a standard Ubuntu kernel, so there could be something unusual in its configuration.

I did notice however that if I add an IP inside the container manually using:

ip a add 10.1.185.2/24 dev eth0

Then I can ping that IP from the LXD host successfully, which shows that the bridge and veth pair are working OK.

Additionally I can actually make DNS requests from inside the container to the LXD bridge, e.g.:

dig @10.1.185.1 www.google.com

; <<>> DiG 9.11.3-1ubuntu1.12-Ubuntu <<>> @10.1.185.1 www.google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49501
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.google.com.			IN	A

;; ANSWER SECTION:
www.google.com.		123	IN	A	172.217.7.228

So outbound traffic is allowed too.

Also, interestingly, if I remove the static IP and then run the dhclient command manually, DHCP succeeds and a dynamic IP and default route are configured.

This then allows me to do curl http://www.google.com successfully also.

So it seems that networking is up and running OK, but that any application that tries to access (I’m guessing) raw sockets is denied.

@stgraber @brauner is there anything that can block raw socket access running as root inside a user namespace?

Here is the relevant output of strace ping 10.1.185.1; it does look like raw sockets are blocked:

socket(AF_INET, SOCK_DGRAM, IPPROTO_ICMP) = -1 EACCES (Permission denied)
socket(AF_INET, SOCK_RAW, IPPROTO_ICMP) = -1 EPERM (Operation not permitted)
socket(AF_INET6, SOCK_DGRAM, IPPROTO_ICMPV6) = -1 EACCES (Permission denied)
socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6) = -1 EPERM (Operation not permitted)
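
To reproduce that result without ping, here is a minimal Python sketch that attempts the same four socket() calls and reports the errno name for each. The function names probe and probe_icmp are made up for illustration. As far as I know, the EACCES on the SOCK_DGRAM variants is governed by the net.ipv4.ping_group_range sysctl rather than by capabilities, so the EPERM on SOCK_RAW is the interesting part.

```python
import errno
import socket


def probe(family, sock_type, proto):
    """Try to open one socket; return 'ok' or the errno name of the failure."""
    try:
        with socket.socket(family, sock_type, proto):
            return "ok"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


def probe_icmp():
    """Attempt the four socket() calls seen in the strace output above."""
    attempts = {
        "AF_INET  SOCK_DGRAM ICMP": (socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_ICMP),
        "AF_INET  SOCK_RAW   ICMP": (socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP),
        "AF_INET6 SOCK_DGRAM ICMPV6": (socket.AF_INET6, socket.SOCK_DGRAM, socket.IPPROTO_ICMPV6),
        "AF_INET6 SOCK_RAW   ICMPV6": (socket.AF_INET6, socket.SOCK_RAW, socket.IPPROTO_ICMPV6),
    }
    return {label: probe(*args) for label, args in attempts.items()}


if __name__ == "__main__":
    for label, result in probe_icmp().items():
        print(f"{label}: {result}")
```

Run it as root inside the container. On an unaffected kernel I would expect the two SOCK_RAW lines to report ok for root in an unprivileged container.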

The process does have cap_net_raw:

capsh --print | grep cap_net_raw
Current: = cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read+ep
Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read
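
The same check can be done without capsh by decoding the CapEff mask from /proc/self/status. A minimal sketch, assuming a Linux /proc layout; has_cap and effective_caps are illustrative names. CAP_NET_RAW is bit 13 in the kernel's capability numbering. Note that this mask is relative to the process's own user namespace, so it can show the capability even when a check against the root namespace would fail.

```python
CAP_NET_RAW = 13  # bit number from include/uapi/linux/capability.h


def has_cap(mask_hex, cap_bit):
    """Return True if bit cap_bit is set in a hex capability mask."""
    return bool((int(mask_hex, 16) >> cap_bit) & 1)


def effective_caps(status_path="/proc/self/status"):
    """Read the CapEff hex mask of the current process, or None if absent."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return line.split()[1]
    return None


if __name__ == "__main__":
    mask = effective_caps()
    if mask is None:
        print("CapEff not found")
    else:
        print(f"CapEff={mask} cap_net_raw={has_cap(mask, CAP_NET_RAW)}")
```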

So I asked @stgraber whether he had any ideas on what could be causing the issue. He downloaded the Tegra custom kernel source and tracked the issue down to what appears to be a bug introduced in the custom kernel. When opening raw sockets, rather than checking the namespace capabilities (as the vanilla kernel does), it checks the global capabilities in the root namespace. Because the container is running unprivileged, it does not have the global CAP_NET_RAW capability, so the socket call fails.
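
To make the difference concrete, here is a toy Python model of the two kinds of check. It is not kernel code and does not reproduce the actual diff. It only mimics the idea that a namespaced check (ns_capable in the kernel) walks up from the target user namespace, while a global check (capable) always tests against the initial namespace. The classes, the open_raw_socket function and the "namespaced"/"global" labels are all invented for illustration, and the model ignores the rule that a namespace's owner gets capabilities over it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class UserNS:
    name: str
    parent: Optional["UserNS"] = None


@dataclass(eq=False)
class Process:
    userns: UserNS
    caps: frozenset


INIT_NS = UserNS("init")


def ns_capable(proc, target_ns, cap):
    """Simplified: walk from target_ns up to the root looking for proc's namespace."""
    ns = target_ns
    while ns is not None:
        if ns is proc.userns:
            return cap in proc.caps
        ns = ns.parent
    return False


def capable(proc, cap):
    """Global check: always tested against the initial user namespace."""
    return ns_capable(proc, INIT_NS, cap)


def open_raw_socket(proc, netns_owner, check):
    """Model the raw-socket permission test under either style of check."""
    if check == "namespaced":
        allowed = ns_capable(proc, netns_owner, "CAP_NET_RAW")
    else:
        allowed = capable(proc, "CAP_NET_RAW")
    return "ok" if allowed else "EPERM"


container_ns = UserNS("container", parent=INIT_NS)
container_root = Process(container_ns, frozenset({"CAP_NET_RAW"}))
host_root = Process(INIT_NS, frozenset({"CAP_NET_RAW"}))

if __name__ == "__main__":
    print(open_raw_socket(container_root, container_ns, "namespaced"))  # ok
    print(open_raw_socket(container_root, container_ns, "global"))      # EPERM
    print(open_raw_socket(host_root, INIT_NS, "global"))                # ok
```

In this model the last case corresponds to a privileged container, which has no separate user namespace. That would be consistent with security.privileged true working around the problem.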

I’m not sure if you can see this, but the diff is here: https://paste.ubuntu.com/p/rJZ8hfhFHD/

So this is a custom kernel issue and not something we can fix, I’m afraid.