tomp
(Thomas Parrott)
November 23, 2022, 8:10pm
23
Can you also look in /var/snap/lxd/common/lxd/logs/lxd.log
for a message like "Failed to initialize fanotify, falling back on fsnotify"?
Because if you can get your system to offer fanotify, you won't have the same issue, as the fanotify driver doesn't listen for MODIFY events.
		return nil
	}

	var err error
	d.fd, err = unix.FanotifyInit(unix.FAN_CLOEXEC|unix.FAN_REPORT_DFID_NAME, unix.O_CLOEXEC)
	if err != nil {
		return fmt.Errorf("Failed to initialize fanotify: %w", err)
	}

	err = unix.FanotifyMark(d.fd, unix.FAN_MARK_ADD|unix.FAN_MARK_FILESYSTEM, unix.FAN_CREATE|unix.FAN_DELETE|unix.FAN_ONDIR, unix.AT_FDCWD, d.prefixPath)
	if err != nil {
		_ = unix.Close(d.fd)
		return fmt.Errorf("Failed to watch directory %q: %w", d.prefixPath, err)
	}

	fd, err := unix.Open(d.prefixPath, unix.O_DIRECTORY|unix.O_RDONLY|unix.O_CLOEXEC, 0)
	if err != nil {
		_ = unix.Close(d.fd)
		return fmt.Errorf("Failed to open directory %q: %w", d.prefixPath, err)
	}
Aleks
(Aleksandar Ivanisevic)
November 23, 2022, 8:14pm
24
its debian 11.5 bullseye, kernel 5.10.0-19-amd64
lxc info:
config:
cluster.https_address: qbs01:8443
core.https_address: qbs01:8443
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
addresses:
- qbs01:8443
architectures:
- x86_64
- i686
certificate: |
-----BEGIN CERTIFICATE-----
MIICADCCAYegAwIBAgIRAKsjuP4UkKOM3yDYb//yNQkwCgYIKoZIzj0EAwMwMzEc
MBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzETMBEGA1UEAwwKcm9vdEBxYnMw
MTAeFw0yMjExMDMwODE5NDFaFw0zMjEwMzEwODE5NDFaMDMxHDAaBgNVBAoTE2xp
bnV4Y29udGFpbmVycy5vcmcxEzARBgNVBAMMCnJvb3RAcWJzMDEwdjAQBgcqhkjO
PQIBBgUrgQQAIgNiAAQxEUeY2Yyj4kPxZ8Lo95JhDE2DR+SHgx37o+S4UIKvlINV
hfUdL3PH7pFOow2Gn5LC3R3tM9e3AwnaB1ahAF+yJIUDVh1TFuApFc3JtXeShXTL
3oGunVbwYhaHIjcSSLWjXzBdMA4GA1UdDwEB/wQEAwIFoDATBgNVHSUEDDAKBggr
BgEFBQcDATAMBgNVHRMBAf8EAjAAMCgGA1UdEQQhMB+CBXFiczAxhwR/AAABhxAA
AAAAAAAAAAAAAAAAAAABMAoGCCqGSM49BAMDA2cAMGQCMDbVCeEsIQAWpX7GCFIe
IZ+840r/vapLX/WTZzj1+FaC4+dhDMOLa/4gce5ieQgtiwIwEW0RmGh8x3JaiPMT
MznMy4s+bS6HSuwDWicVLY83jUtJZsMg8TWeN3EyI1fh2XuZ
-----END CERTIFICATE-----
certificate_fingerprint: 84f825d06b5db6935feab6edfdd8ccd1633839c0ed90cd30946459837b7a427e
driver: lxc | qemu
driver_version: 5.0.1 | 7.1.0
firewall: nftables
kernel: Linux
kernel_architecture: x86_64
kernel_features:
idmapped_mounts: "false"
netnsid_getifaddrs: "true"
seccomp_listener: "true"
seccomp_listener_continue: "true"
shiftfs: "false"
uevent_injection: "true"
unpriv_fscaps: "true"
kernel_version: 5.10.0-19-amd64
lxc_features:
cgroup2: "true"
core_scheduling: "true"
devpts_fd: "true"
idmapped_mounts_v2: "true"
mount_injection_file: "true"
network_gateway_device_route: "true"
network_ipvlan: "true"
network_l2proxy: "true"
network_phys_macvlan_mtu: "true"
network_veth_router: "true"
pidfd: "true"
seccomp_allow_deny_syntax: "true"
seccomp_notify: "true"
seccomp_proxy_send_notify_fd: "true"
os_name: Debian GNU/Linux
os_version: "11"
project: default
server: lxd
server_clustered: false
server_event_mode: full-mesh
server_name: qbs01
server_pid: 4024973
server_version: "5.7"
storage: zfs
storage_version: 2.0.3-9
storage_supported_drivers:
- name: btrfs
version: 5.4.1
remote: false
- name: ceph
version: 15.2.16
remote: true
- name: cephfs
version: 15.2.16
remote: true
- name: cephobject
version: 15.2.16
remote: true
- name: dir
version: "1"
remote: false
- name: lvm
version: 2.03.07(2) (2019-11-30) / 1.02.167 (2019-11-30) / 4.43.0
remote: false
- name: zfs
version: 2.0.3-9
remote: false
Aleks
(Aleksandar Ivanisevic)
November 23, 2022, 8:20pm
25
Indeed, there is an error message in the log:
$ sudo fgrep notify /var/snap/lxd/common/lxd/logs/lxd.log
time="2022-11-23T14:33:41+01:00" level=warning msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to watch directory \"/dev\": no such device"
But fanotify seems to be compiled into the kernel, so why is LXD complaining?
$ sudo cat /boot/config-$(uname -r) | grep FANOTIFY
CONFIG_FANOTIFY=y
CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
sdeziel
(Simon Deziel)
November 23, 2022, 8:30pm
26
If /dev
doesn’t exist, falling back to fsnotify will probably not help, will it?
tomp
(Thomas Parrott)
November 23, 2022, 8:30pm
27
I can recreate the problem inside a LXD VM using images:debian/11
and /dev
does exist.
tomp
(Thomas Parrott)
November 23, 2022, 8:31pm
28
The error doesn’t mean /dev doesn’t exist; syscalls sometimes return errors like “not found” that suggest a different meaning when read out of context.
sdeziel
(Simon Deziel)
November 23, 2022, 8:33pm
29
FYI, I’m using Ubuntu on the host and in the containers and I used to get that error too:
root@log:~# grep 'Failed to initialize fanotify' /var/log/archives/2022/2022-*syslog
/var/log/archives/2022/2022-01-syslog:Jan 4 20:53:28 ocelot lxd.daemon[1603]: t=2022-01-04T20:53:28+0000 lvl=warn msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to watch directory \"/dev\": no such device"
/var/log/archives/2022/2022-01-syslog:Jan 4 22:45:09 ocelot lxd.daemon[1623]: t=2022-01-04T22:45:09+0000 lvl=warn msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to watch directory \"/dev\": no such device"
/var/log/archives/2022/2022-01-syslog:Jan 6 22:28:45 ocelot lxd.daemon[1626]: t=2022-01-06T22:28:45+0000 lvl=warn msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to watch directory \"/dev\": no such device"
/var/log/archives/2022/2022-01-syslog:Jan 9 20:05:47 xeon lxd.daemon[3892352]: t=2022-01-09T20:05:47+0000 lvl=warn msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to watch directory \"/dev\": no such device"
/var/log/archives/2022/2022-01-syslog:Jan 9 20:54:49 c2d lxd.daemon[541879]: t=2022-01-09T20:54:49+0000 lvl=warn msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to watch directory \"/dev\": no such device"
/var/log/archives/2022/2022-01-syslog:Jan 9 20:54:24 ocelot lxd.daemon[3789341]: t=2022-01-09T20:54:24+0000 lvl=warn msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to watch directory \"/dev\": no such device"
/var/log/archives/2022/2022-01-syslog:Jan 11 16:11:09 ocelot lxd.daemon[1613]: t=2022-01-11T16:11:09+0000 lvl=warn msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to watch directory \"/dev\": no such device"
/var/log/archives/2022/2022-01-syslog:Jan 13 17:42:13 ocelot lxd.daemon[1633]: t=2022-01-13T17:42:13+0000 lvl=warn msg="Failed to initialize fanotify, falling back on fsnotify" err="Failed to watch directory \"/dev\": no such device"
Those are from January and I was probably using 20.04 at that time, while I’m now using 22.04.
tomp
(Thomas Parrott)
November 23, 2022, 8:36pm
30
Yeah, so it seems like a kernel thing then. fanotify isn’t new, but LXD uses FAN_REPORT_DFID_NAME, which only became available in Linux 5.9. That explains Focal then.
https://man7.org/linux/man-pages/man2/fanotify_init.2.html
But that doesn’t explain Debian, which is using 5.10, unless the feature didn’t get enabled/included.
sdeziel
(Simon Deziel)
November 23, 2022, 8:40pm
31
Back in January, I was running 20.04 with the then current HWE kernel 5.11.0
tomp
(Thomas Parrott)
November 23, 2022, 8:41pm
32
I don’t know; it looks like we need something more recent.
@monstermunchkin may be able to shed some light on this as he added that functionality.
I’ll look into this. I don’t remember why FAN_REPORT_DFID_NAME had to be used.
amikhalitsyn
(Aleksandr Mikhalitsyn)
November 24, 2022, 9:33am
35
Most likely the ENODEV comes from here:
	int err;

	/*
	 * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
	 */
	err = vfs_get_fsid(path->dentry, fsid);
	if (err)
		return err;

	if (!fsid->val[0] && !fsid->val[1])
		return -ENODEV;

	/*
	 * Make sure path is not inside a filesystem subvolume (e.g. btrfs)
	 * which uses a different fsid than sb root.
	 */
	err = vfs_get_fsid(path->dentry->d_sb->s_root, &root_fsid);
	if (err)
		return err;

	if (root_fsid.val[0] != fsid->val[0] ||
	    root_fsid.val[1] != fsid->val[1])
		return -EXDEV;
It’s only required when one of the FAN_REPORT_FID | FAN_REPORT_DFID_NAME
bits is set.
It means that the filesystem mounted on /dev
does not support exportfs operations (the open_by_handle_at/name_to_handle_at syscalls).
@Aleks, @sdeziel, could you show the output of mount | grep "/dev"
from your systems?
tomp
(Thomas Parrott)
November 24, 2022, 9:39am
36
Yeah, I was thinking that on IRC last night with @stgraber, as /dev is devtmpfs.
amikhalitsyn
(Aleksandr Mikhalitsyn)
November 24, 2022, 9:55am
37
On my system /dev
is also devtmpfs, but fanotify works well.
The reason is that I have this patch in my kernel:
https://lore.kernel.org/all/20210322173944.449469-3-amir73il@gmail.com/
It enables the shmem/tmpfs/devtmpfs filesystems to report statfs f_fsid
appropriately.
tomp
(Thomas Parrott)
November 24, 2022, 9:56am
38
Right, so that is why we need a recent kernel to support fanotify, thanks.
tomp
(Thomas Parrott)
November 24, 2022, 10:15am
39
In your small test program above, what path were you using?
LXD appears to only add watchers on directories, so /dev/net and not directly on /dev/net/tun.
So I’m wondering why listening for MODIFY events on a directory also reports events when files inside that directory are written to.
Were you watching /dev/net/tun directly?
tomp
(Thomas Parrott)
November 24, 2022, 10:20am
40
Can you try to recreate the issue using LXD rather than your own test program, please?
I haven’t got OpenVPN set up, but I’ve just tried writing to a block device under /dev with fsnotify and I don’t see the same issue.
amikhalitsyn
(Aleksandr Mikhalitsyn)
November 24, 2022, 10:21am
41
Were you watching /dev/net/tun directly?
I’ve played with different options: /dev/net, /dev/net/tun. The same result.
Edit: /dev is not working with my simple program.
tomp
(Thomas Parrott)
November 24, 2022, 10:25am
42
Ah, I have logged the events and can now see writes to /dev/tty. So yes, I see the problem.
amikhalitsyn
(Aleksandr Mikhalitsyn)
November 24, 2022, 10:35am
43
So I’m wondering why listening for MODIFY events on a directory also reports events when files inside that directory are written to.
As far as I understand, if an inotify watch is set on a directory, it reports events for all files/directories inside it, but only one level deep. So a watch on /dev will not see /dev/net/tun; that’s why we recursively add watches for nested directories:
https://github.com/lxc/lxd/blob/master/lxd/fsmonitor/drivers/driver_fsnotify.go#L70