TCMC
June 11, 2023, 10:33pm
1
Hey everyone! I need some help configuring swap for LXD containers on Ubuntu 22.04.2 with Linux 5.19 and LXD 5.14. Is this handled exclusively through cgroup v2 now on the latest Ubuntu/Linux versions? If so, could someone guide me on the proper and easy way to get this done nowadays?
I’m curious about the current state of swap accounting in LXD/LXC with Ubuntu 22.04.2 and cgroup v2. I’ve come across various methods, but unfortunately I haven’t been able to make any of them work on a fresh installation.
To give you an idea, here’s an example of my LXD containers configuration:
```
config:
  limits.cpu: "16"
  limits.memory: 64GB
  limits.memory.enforce: hard
  limits.memory.swap: "false"
```
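(For reference, these limits can also be applied with `lxc config set`; `mycontainer` below is just a placeholder for the instance name.)
```
lxc config set mycontainer limits.cpu 16
lxc config set mycontainer limits.memory 64GB
lxc config set mycontainer limits.memory.enforce hard
lxc config set mycontainer limits.memory.swap false
```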
The issue is that when I check the container, there doesn’t seem to be any swap space available. It shows a size of zero.
I’ve tried using the `swapaccount` kernel boot option and experimenting with different values for the other options, but so far no luck getting swap working in the LXD containers.
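For what it’s worth, here is how I’ve been checking from the host what the container’s cgroup actually allows (a sketch assuming the snap LXD layout, where instance cgroups live under `lxc.payload.<name>`; `c1` is a placeholder for the container name):
```
# If these files exist, the kernel supports cgroup v2 swap accounting;
# "max" means unlimited, while "0" means swap is disabled for the container.
cat /sys/fs/cgroup/lxc.payload.c1/memory.swap.max
cat /sys/fs/cgroup/lxc.payload.c1/memory.swap.current

# Confirm swap accounting was not disabled on the kernel command line.
grep -o 'swapaccount=[01]' /proc/cmdline
```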
I’m curious why it’s not as straightforward as something like this (for example):
`limits.memory.swap: 32G`
Any insights or suggestions would be greatly appreciated!
References:
Hello:
Swap space on the host is ok:
```
manager@andromeda:~$ sudo swapon --show
NAME       TYPE      SIZE USED PRIO
/dev/zram0 partition   2G 256K    5
/dev/zram1 partition   2G 256K    5
/dev/zram2 partition   2G 256K    5
/dev/zram3 partition   2G   0B    5
/swapfile  file        2G   0B   -2
manager@andromeda:~$ free -h
       total  used  free  shared  buff/cache  available
Mem:     15G   11G  807M     53M        3.0G       3.5G
Swap:     9…
```
How to enable swap support in lxc? (actually, I see the same issue in LXD)
I have this in the LXC config: `lxc.cgroup2.memory.max = 2G`
and this in grub: `cgroup_enable=memory swapaccount=1`
uname = Linux LXCHOST 5.10.0-8-amd64 #1 SMP Debian 5.10.46-1 (2021-06-24) x86_64 GNU/Linux
```
root@lxc:~# free
        total    used     free  shared  buff/cache  available
Mem:  2097152  127228  1652728    1584      317196    1969924
Swap:       0       0        0
```
When checking swap in the container, it reports 0.
`swapaccount` and `cgroup_enable=memory` are active on the grub cmdline.
`/boot/config...` shows that `CONFIG_MEMCG_SWAP` is enabled in the kernel.
But here is the result when launching lxcfs:
```
Running constructor lxcfs_init to reload liblxcfs
mount namespace: 5
hierarchies:
0: fd: 6: cpuset,cpu,io,memory,hugetlb,pids,rdma
Kernel supports pidfds
Kernel does not support swap accounting
api_extensions:
- cgroups
- sys_cpu_online
- proc_cpuinfo
- proc_diskstats
- proc_loadavg
- proc_meminfo
- proc_stat
- proc_swaps
- proc_uptime
- shared_pidns
- cpuview_daemon
- loadavg_daemon
- pidfds
```
It tells me that the kernel does not support swap accounting.
Is that a problem with LXCFS?
# Required information
* Distribution:
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye
* More details
Linux gre 5.4.199-odroidxu4 #22.05.3 SMP PREEMPT Wed Jun 22 07:29:40 UTC 2022 armv7l
* The output of "lxc info" or if that fails:
```
/snap/bin/lxc info
config:
core.https_address: 192.168.0.220:8443
core.trust_password: true
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
addresses:
- 192.168.0.220:8443
architectures:
- armv7l
certificate: |
-----BEGIN CERTIFICATE-----
MIIB+zCCAYGgAwIBAgIRAPEb677D8ktTOR+FKWyT6IkwCgYIKoZIzj0EAwMwMTEc
...
o+v2Ap/P/pIp3/EFLauErVYDFkLgssfapBscjow3ug==
-----END CERTIFICATE-----
certificate_fingerprint: 50f66adbac44369300aa0c06654011ed860fd3226e6f2326d9ae656ee4802c97
driver: lxc
driver_version: 5.0.1
firewall: nftables
kernel: Linux
kernel_architecture: armv7l
kernel_features:
idmapped_mounts: "false"
netnsid_getifaddrs: "true"
seccomp_listener: "true"
seccomp_listener_continue: "false"
shiftfs: "false"
uevent_injection: "true"
unpriv_fscaps: "true"
kernel_version: 5.4.199-odroidxu4
lxc_features:
cgroup2: "true"
core_scheduling: "true"
devpts_fd: "true"
idmapped_mounts_v2: "true"
mount_injection_file: "true"
network_gateway_device_route: "true"
network_ipvlan: "true"
network_l2proxy: "true"
network_phys_macvlan_mtu: "true"
network_veth_router: "true"
pidfd: "true"
seccomp_allow_deny_syntax: "true"
seccomp_notify: "true"
seccomp_proxy_send_notify_fd: "true"
os_name: Debian GNU/Linux
os_version: "11"
project: default
server: lxd
server_clustered: false
server_event_mode: full-mesh
server_name: gre
server_pid: 1800
server_version: "5.4"
storage: btrfs
storage_version: 5.4.1
storage_supported_drivers:
- name: btrfs
version: 5.4.1
remote: false
- name: cephfs
version: 15.2.16
remote: true
- name: dir
version: "1"
remote: false
- name: lvm
version: 2.03.07(2) (2019-11-30) / 1.02.167 (2019-11-30) / 4.41.0
remote: false
- name: ceph
version: 15.2.16
remote: true
```
# Issue description
No swap inside my container.
```
/snap/bin/lxc exec yunh1 -- free -m
       total  used  free  shared  buff/cache  available
Mem:    1990    13  1969       7           7       1977
Swap:      0     0     0
```
# Steps to reproduce
Create a container; `free -h` in the container then shows no swap available.
# Information to attach
- [x] Any relevant kernel output (`dmesg`)
` Kernel command line: console=ttySAC2,115200n8 console=tty1 consoleblank=0 loglevel=1 root=UUID=50536e32-d033-4595-bfcd-113e6053042e rootfstype=btrfs rootwait rw smsc95xx.macaddr=00:xxx55 governor=performance hdmi_tx_amp_lvl=31 hdmi_tx_lvl_ch0=3 hdmi_tx_lvl_ch1=3 hdmi_tx_lvl_ch2=3 hdmi_tx_emp_lvl=6 hdmi_clk_amp_lvl=31 hdmi_tx_res=0 HPD=true vout=hdmi usb-storage.quirks=0x2537:0x1066:u,0x2537:0x1068:u swapaccount=1 cgroup_enable=memory cgroup_memory=1`
- [x] Container log (`lxc info NAME --show-log`)
```
/snap/bin/lxc info yunh1 --show-log
Name: yunh1
Status: RUNNING
Type: container
Architecture: armv7l
PID: 3597
Created: 2022/08/15 22:13 CEST
Last Used: 2022/08/15 23:20 CEST
Resources:
Processes: 6
CPU usage:
CPU usage (in seconds): 16
Memory usage:
Memory (current): 19.95MiB
Network usage:
eth0:
Type: broadcast
State: UP
Host interface: enx00xxx6b6
MAC address: 00:xxx:2e
MTU: 1500
Bytes received: 3.45kB
Bytes sent: 2.31kB
Packets received: 24
Packets sent: 23
IP addresses:
inet: 192.168.0.153/24 (global)
inet6: 2a01:xxx:7b2e/64 (global)
inet6: fe80:xxx:7b2e/64 (link)
lo:
Type: loopback
State: UP
MTU: 65536
Bytes received: 0B
Bytes sent: 0B
Packets received: 0
Packets sent: 0
IP addresses:
inet: 127.0.0.1/8 (local)
inet6: ::1/128 (local)
Log:
lxc yunh1 20220815212056.933 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc yunh1 20220815212056.934 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc yunh1 20220815212056.936 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc yunh1 20220815212056.937 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc yunh1 20220815212104.254 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc yunh1 20220815212104.255 WARN conf - ../src/src/lxc/conf.c:lxc_map_ids:3598 - newgidmap binary is missing
```
- [x] Container configuration (`lxc config show NAME --expanded`)
```
/snap/bin/lxc config show yunh1 --expanded
architecture: armv7l
config:
image.architecture: armhf
image.description: Debian buster armhf (20220815_05:25)
image.os: Debian
image.release: buster
image.serial: "20220815_05:25"
image.type: squashfs
image.variant: default
limits.memory.swap: "true"
security.syscalls.intercept.sysinfo: "true"
volatile.base_image: 888106799c2e7b4a3ed66b5a0a5e32e53a65bdadabaf25e03adf80f71fabb1a2
volatile.cloud-init.instance-id: 89bafb63-157e-4f67-b6b5-b1c188f3048a
volatile.eth0.host_name: mace6df26bd
volatile.eth0.hwaddr: 00:1xxx:2e
volatile.eth0.last_state.created: "false"
volatile.eth0.name: eth0
volatile.idmap.base: "0"
volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
volatile.last_state.power: RUNNING
volatile.uuid: d1606124-a0ca-4479-9e6f-2491e9a5d497
devices:
eth0:
nictype: macvlan
parent: enx001e063686b6
type: nic
root:
path: /
pool: default
type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""
```
- [ ] Main daemon log (at /var/log/lxd/lxd.log or /var/snap/lxd/common/lxd/logs/lxd.log)
- [ ] Output of the client with --debug
- [ ] Output of the daemon with --debug (alternatively output of `lxc monitor` while reproducing the issue)
Thanks in advance, cheers!
tomp
(Thomas Parrott)
June 12, 2023, 7:35am
2
Any thoughts on this one @amikhalitsyn ?
TCMC
June 30, 2023, 12:59pm
3
Hey guys, any tips about this?
This problem is the only blocker for me to move forward with a nice LXD project!
tomp
(Thomas Parrott)
July 3, 2023, 7:05am
4
Hi,
What is the reason you need to see the swap available inside the container?
I think this document may help to explain the reasons why swap allocation in containers is not straightforward:
https://github.com/lxc/lxcfs#swap-handling
But generally speaking, one should increase `limits.memory` to the overall amount of memory required, and then let the OS manage swapping centrally. However, you can use the other settings you mentioned to indicate to the OS the propensity for swapping the container should have.
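For concreteness, a sketch of that approach using the existing LXD config keys (the instance name `c1` is just a placeholder):
```
# Size the container for its overall memory need and let the host manage swap.
lxc config set c1 limits.memory 64GiB

# Allow the container's memory to be swapped out on the host...
lxc config set c1 limits.memory.swap true

# ...but make it one of the last things the host swaps
# (0-10, higher = less likely to be swapped to disk).
lxc config set c1 limits.memory.swap.priority 10
```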
We are also hitting this swap issue and it is very much a blocker for us. I will lay out the scenario in as much detail as possible.
Env:
Ubuntu 22.04 LXD host and containers
Using the Ubuntu 6.1.x mainline kernel
Hardware: Intel 13900K with 128 GB RAM
Swap is enabled on the host and set to 128 GB on a fast NVMe drive.
Usage:
We have one container per developer for a remote IDE (JetBrains Gateway).
Each container has access to a GPU, which can also do training/inference.
Issue:
Containers cannot see any swap, regardless of the toggles mentioned here or in other LXD posts/docs. The host also shows zero swap usage from containers.
As a result, we are getting Linux fork() failures due to memory pressure in the containers, even while some physical RAM is left and the full 128 GB of swap is available.
Whatever the developers are running (IDE, Python scripts, etc.) uses lots of RAM and/or allocates lots of virtual memory.
Without working swap, a fully functional host with 100% of its swap available is brought to a halt unless we kill some processes to relieve memory pressure. There are only 4 memory slots on this machine, so we can’t expand further.
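As a data point, here is how we have been checking from the host whether each container is actually allowed to swap, since `free` inside a container only reflects lxcfs’s view (a sketch assuming the snap LXD cgroup layout under `lxc.payload.*`):
```
# For each LXD container, show its host-side swap limit and current usage.
# A memory.swap.max of "0" means swap is disabled for that container.
for c in /sys/fs/cgroup/lxc.payload.*; do
  echo "${c##*/}: max=$(cat "$c/memory.swap.max") current=$(cat "$c/memory.swap.current")"
done
```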
paulocoghi
(Paulo Coghi)
September 9, 2023, 4:37pm
6
Using LXC (with cgroups v2) instead of LXD, at least the “Swap” line is printed when running `free` inside a container.
But it always shows a value of 0 in the `free` output, despite setting a non-zero value with `lxc-cgroup -n my-container memory.swap.high X`, where X could be anything you want, like 128G.
At least I can confirm cgroups v2 is “aware” of it, because running `lxc-cgroup -n my-container memory.swap.high` (to check the currently defined value) returns the correct, non-zero value.
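Note: under cgroup v2, `memory.swap.high` only throttles a cgroup once its swap usage crosses that threshold, while `memory.swap.max` is the hard cap. A sketch of setting the hard cap instead, plus a check of what lxcfs reports inside the container (same hypothetical container name):
```
# Hard-cap the container's swap instead of only setting the throttle threshold.
lxc-cgroup -n my-container memory.swap.max 128G

# What "free" reads inside the container is lxcfs's virtualized /proc/meminfo;
# SwapTotal/SwapFree stay at 0 there unless host swap accounting is available.
lxc-attach -n my-container -- grep -i swap /proc/meminfo
```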