How can I see available quota for an instance in the Incus Metrics?

Hi,

I want to track in Grafana how close an instance is to filling its assigned disk quota.
On a previous cluster (running LXD), I could easily track this with the query:

lxd_filesystem_size_bytes{mountpoint="/"} - lxd_filesystem_avail_bytes{mountpoint="/"}

This system used LVM for its storage pools, and the metrics for each instance accurately reflected how much space remained on that instance, making it simple to see whether an instance had run out of its assigned quota.

We now have a new cluster running Incus, and this time the storage backend is BTRFS. Running the same query as above (with the lxd_ prefix replaced by incus_), the graph is wildly off: each instance appears able to consume the entire disk, and the reported usage appears to be summed across all instances. I can confirm that disk quotas are working, as several instances have hit our lower limits and needed increases, but unfortunately we cannot track or alert on this as we could previously.

Is there a better way to track disk usage and available quota per instance in the metrics? And is this just a limitation of BTRFS, or does it also apply to any of the other storage backends?

Context:

  • incus v6.0
  • default storage pool is local disk on each cluster member
    • formatted with btrfs
    • incus has the entire disk.

If you are on BTRFS, Incus uses BTRFS subvolumes. So you can monitor these subvolumes directly with BTRFS’s own tools, without involving Incus.

Do you have any advice or suggestions for tools that could help with that?

Ideally I want something that can export this information to our monitoring system, so we can track the usage across our instances and clusters.

A quick search online tells me that I can run btrfs qgroup show -reF /path/to/Subvolume on a machine to see the current usage and limit of a subvolume (the command is listed on the ArchWiki BTRFS page for checking the quota of a subvolume).
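
As a rough sketch of how that output could be turned into a single number, the following reads the --raw (plain bytes) form of the same command and prints the percentage of the referenced-size limit in use. It assumes the first four columns are qgroup ID, referenced bytes, exclusive bytes and max referenced; that is an assumption about the btrfs-progs output layout and may differ between versions. qgroup_usage_percent is a made-up helper name, and the sample values are illustrative only.

```shell
#!/bin/bash
# Sketch: read `btrfs qgroup show -reF --raw <subvolume>` output on stdin and
# print "<qgroupid> <used_percent>" for every qgroup with a referenced limit.
# Assumed column order: qgroupid, referenced, exclusive, max_referenced, ...
qgroup_usage_percent() {
    awk '
        # Data rows start with an ID such as 0/257; header and separator rows
        # do not. Rows whose limit column reads "none" are skipped.
        $1 ~ /^[0-9]+\/[0-9]+$/ && $4 ~ /^[0-9]+$/ && $4 > 0 {
            printf "%s %.1f\n", $1, ($2 / $4) * 100
        }
    '
}

# Canned input for illustration (not taken from a real system):
qgroup_usage_percent <<'EOF'
Qgroupid    Referenced    Exclusive  Max referenced  Max exclusive   Path
--------    ----------    ---------  --------------  -------------   ----
0/257       2147483648   1073741824     10737418240           none   containers/c1
0/258        524288000    524288000            none           none   containers/c2
EOF
```

On a real host the input would come from something like sudo btrfs qgroup show -reF --raw /path/to/Subvolume | qgroup_usage_percent.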

But that doesn’t integrate nicely into any metrics system:

  • I’d have to add custom metrics exporters to every cluster member with BTRFS storage pools.
  • I’d have to adjust all dashboards to determine whether an instance uses a storage backend with usable disk usage reporting in the Incus metrics, or uses BTRFS and thus needs data from another source.
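
For the first point, one lower-effort option might be node_exporter's textfile collector rather than a dedicated exporter: a cron job or systemd timer writes a .prom file that node_exporter then exposes. The sketch below converts btrfs qgroup show -re --raw output into Prometheus text format. The metric names (btrfs_qgroup_referenced_bytes, btrfs_qgroup_limit_bytes) and the function name are invented for illustration, and it assumes a btrfs-progs version that prints a trailing Path column and subvolume paths without spaces or quotes.

```shell
#!/bin/bash
# Sketch: convert `btrfs qgroup show -re --raw <pool>` output (stdin) into
# Prometheus text exposition format (stdout).
# Metric names are hypothetical. Assumed columns:
#   qgroupid, referenced, exclusive, max_referenced, max_exclusive, path
qgroup_to_prom() {
    awk '
        # Only data rows (ID like 0/257) that include a path column.
        $1 ~ /^[0-9]+\/[0-9]+$/ && NF >= 6 {
            labels = sprintf("{qgroup=\"%s\",path=\"%s\"}", $1, $6)
            printf "btrfs_qgroup_referenced_bytes%s %s\n", labels, $2
            # Emit a limit only when one is set (the column reads "none" otherwise).
            if ($4 ~ /^[0-9]+$/)
                printf "btrfs_qgroup_limit_bytes%s %s\n", labels, $4
        }
    '
}
```

On a cluster member this could be run from a timer, piping sudo btrfs qgroup show -re --raw on the pool's mount point through qgroup_to_prom into a file in the textfile collector directory (write to a temporary file and rename it, so node_exporter never reads a half-written file). Dashboards would still need a second query for BTRFS-backed instances, so this only addresses the exporter half of the problem.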

I have similar requirements and I don’t think such a thing exists.

I am using container snapshots, which are backed by BTRFS subvolumes. When showing disk usage, I would like to group a running container and its snapshots into one qgroup. However, Incus does not help with anything like that, and BTRFS has no direct knowledge of that relationship. It might be worth writing some scripts for that.

Edit: I spent a few hours with ChatGPT to come up with this:

incus_qgroup.sh
#!/bin/bash

if [ "$#" -lt 1 ]; then
    echo "Usage: $0 <btrfs_path> [--debug]"
    exit 1
fi

declare -A instances_map
declare -A current_qgroup_map
declare -A targeted_qgroup_map
declare -A qgroup_size_map

btrfs_path="$1"
debug=false

if [ "$#" -eq 2 ] && [ "$2" == "--debug" ]; then
    debug=true
fi

output=$(sudo btrfs qgroup show -c "$btrfs_path")
lines=()

while IFS= read -r line; do
    lines+=("$line")
done <<< "$output"

for line in "${lines[@]}"; do
    read -r group_id referenced_size exclusive_size children path <<< "$line"

    # Check Incus containers, virtual machines, or custom volumes
    if [[ $path == @incus-*/containers* ]] || [[ $path == @incus-*/virtual-machines* ]] || [[ $path == @incus-*/custom* ]]; then
        IFS='/' read -r -a path_segments <<< "$path"
        instance_name=${path_segments[2]}

        if [[ -n ${instances_map[$instance_name]} ]]; then
            IFS=' ' read -r -a entries <<< "${instances_map[$instance_name]}"
        else
            entries=()
        fi

        entry="$group_id:$path:$referenced_size:$exclusive_size"

        # Put instance subvolume at the beginning of array
        if [[ ${path_segments[1]} == *-snapshots ]]; then
            entries+=("$entry")
        else
            entries=("$entry" "${entries[@]}")
        fi

        instances_map[$instance_name]="${entries[*]}"
    fi

    # Check incus qgroup
    if [[ $group_id == */0 ]]; then
        IFS=',' read -r -a groups <<< "$children"
        current_qgroup_map[$group_id]="${groups[*]}"
        qgroup_size_map[$group_id]="$referenced_size"
    fi
done

for instance in "${!instances_map[@]}"; do
    children=()
    IFS=' ' read -r -a entries <<< "${instances_map[$instance]}"
    for entry in "${entries[@]}"; do
        IFS=':' read -r group_id path <<< "$entry"
        children+=("$group_id")
    done

    first_child="${children[0]}"
    qgroup_id="${first_child#0/}/0"  # Convert 0/number to number/0
    qgroup_size="${qgroup_size_map[$qgroup_id]}"

    targeted_qgroup_map[$qgroup_id]="${children[*]}"

    echo "Instance: $instance [$qgroup_size]"
    for entry in "${entries[@]}"; do
        IFS=':' read -r group_id path referenced_size exclusive_size <<< "$entry"
        echo "  $group_id: $path [$referenced_size, $exclusive_size]"
    done
done

if [ "$debug" = true ]; then
    echo ""
    echo "Current Qgroup Map:"
    for qgroup in "${!current_qgroup_map[@]}"; do
        IFS=' ' read -r -a children <<< "${current_qgroup_map[$qgroup]}"
        echo "  $qgroup: ${children[*]}"
    done

    echo ""
    echo "Targeted Qgroup Map:"
    for qgroup in "${!targeted_qgroup_map[@]}"; do
        IFS=' ' read -r -a children <<< "${targeted_qgroup_map[$qgroup]}"
        echo "  $qgroup: ${children[*]}"
    done

    echo ""
    echo "BTRFS Commands:"
fi

btrfs_changed=false
# Remove parent qgroups that no longer map to any instance.
for qgroup in "${!current_qgroup_map[@]}"; do
    if [[ -z ${targeted_qgroup_map[$qgroup]} ]]; then
        IFS=' ' read -r -a children <<< "${current_qgroup_map[$qgroup]}"
        for child in "${children[@]}"; do
            sudo btrfs qgroup remove --no-rescan "$child" "$qgroup" "$btrfs_path"
            $debug && echo "  btrfs qgroup remove $child $qgroup $btrfs_path"
        done
        sudo btrfs qgroup destroy "$qgroup" "$btrfs_path" 2> /dev/null
        $debug && echo "  btrfs qgroup destroy $qgroup $btrfs_path"
        btrfs_changed=true
    fi
done
# Create missing parent qgroups and assign any children not yet attached.
for qgroup in "${!targeted_qgroup_map[@]}"; do
    if [[ -z ${current_qgroup_map[$qgroup]} ]]; then
        sudo btrfs qgroup create "$qgroup" "$btrfs_path"
        $debug && echo "  btrfs qgroup create $qgroup $btrfs_path"
    fi

    IFS=' ' read -r -a current_children <<< "${current_qgroup_map[$qgroup]}"
    IFS=' ' read -r -a targeted_children <<< "${targeted_qgroup_map[$qgroup]}"
    for child in "${targeted_children[@]}"; do
        # Space-padded match so that 0/25 does not match inside 0/257.
        if [[ ! " ${current_children[*]} " =~ " ${child} " ]]; then
            sudo btrfs qgroup assign --no-rescan "$child" "$qgroup" "$btrfs_path" 2> /dev/null
            $debug && echo "  btrfs qgroup assign $child $qgroup $btrfs_path"
            btrfs_changed=true
        fi
    done
done
# A single rescan at the end refreshes the accounting after all changes.
if [ "$btrfs_changed" = true ]; then
    sudo btrfs quota rescan "$btrfs_path"
    $debug && echo "  btrfs quota rescan $btrfs_path"
else
    $debug && echo "  (No commands executed)"
fi
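
To make the grouping idea in the script easier to follow in isolation, here is a cut-down, dry-run sketch of its core step for a single instance: derive the parent qgroup ID from the instance's own level-0 qgroup (0/N becomes N/0, the same convention the script uses) and print the create/assign commands without executing them. The function name, the example qgroup IDs and the path are hypothetical.

```shell
#!/bin/bash
# Dry-run sketch: print the btrfs commands that would group an instance's
# subvolume qgroup and its snapshot qgroups under one parent qgroup.
# Nothing is executed; the commands are only echoed.
# Usage: plan_instance_qgroup <btrfs_path> <instance_qgroup> [snapshot_qgroup...]
plan_instance_qgroup() {
    local btrfs_path=$1
    local instance_qgroup=$2
    shift 2
    # Same convention as the script above: 0/257 -> 257/0.
    local parent="${instance_qgroup#0/}/0"
    local child
    echo "btrfs qgroup create $parent $btrfs_path"
    for child in "$instance_qgroup" "$@"; do
        echo "btrfs qgroup assign --no-rescan $child $parent $btrfs_path"
    done
    # One rescan after all assignments, as in the full script.
    echo "btrfs quota rescan $btrfs_path"
}

# Example: instance subvolume 0/257 with two snapshot subvolumes.
plan_instance_qgroup /path/to/pool 0/257 0/300 0/301
```

After the rescan, the referenced size of the parent qgroup (257/0 in this example) is what the full script prints next to each instance name.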