Mounting a disk image inside an Incus CT?

Hi all,

I’m exploring whether Incus is a suitable solution for us. I’m currently running Incus 0.7 on Ubuntu 22.04.4 LTS and also on AlmaLinux 9.4 for evaluation.

One of our requirements for using Incus CTs for virtual hosting is having Linux user and group quota inside a Container. As far as I understand it, this may be a bit of an issue? I do have project quota enabled on the host node.

As a crutch I created a disk image inside an Incus container, formatted it with XFS and then tried to have systemd mount it with …

/bin/mount -o loop,usrquota,gquota /home.img /home

However, this throws:

mount: /mnt/: mount failed: Operation not permitted.

Even straight up mount attempts of the image (w/o usrquota,gquota) are not permitted due to the default security options. According to the Incus docs and this info from Git it should at least be possible to adjust the container security policies to allow the CT to use mount? But so far I can’t figure out the syntax to configure the Incus container with the right security switches to allow this.

So far I’ve tried various combinations of these options without luck:

incus config set <CT-NAME> security.nesting=true security.syscalls.intercept.mknod=true security.syscalls.intercept.setxattr=true security.syscalls.intercept.mount=true security.syscalls.intercept.bpf=true

But I still can’t mount the image in the container. Any idea what I might be missing or doing wrong?

Or does someone have another idea how to get user and group quotas working inside an Incus CT? Even if it’s a crutch or ugly, I’d be happy to give it a try nonetheless.

Many thanks!

Linux generally does not support using traditional user/group quotas inside of containers.

You should be able to have a filesystem on the host system that has user/group quotas enabled and configured, then pass that to the container and have them be respected, but the container will not be able to alter the quotas (even as root).
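
For example, something along these lines should do it (the device name and source path here are just placeholders):

# pass a directory on a quota-enabled host filesystem into the container as /home
incus config device add <CT-NAME> home-data disk source=/srv/quota-home/<CT-NAME> path=/home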

Many thanks for the answer. This is a bit disappointing, considering that containers on OpenVZ have had fully working quota support since 2005. But yeah. That’s a different story.

Our intent was to integrate Incus into our web hosting platform BlueOnyx, which would then need to set disk quotas via API or GUI from inside the CT. I guess that’s out the window then, unless we use VMs.

Just out of curiosity: Any hints on what I’m doing wrong with the attempts to mount an image inside an Incus container?

Most likely the issue is that your container doesn’t have any loop devices to map the file to.

Most likely the issue is that your container doesn’t have any loop devices to map the file to.

Indeed! Many thanks for pointing that out.

Try this:
incus config device add <CT-NAME> loop0 unix-block path=/dev/loop0
incus config device add <CT-NAME> loop-control unix-char path=/dev/loop-control
incus config set <CT-NAME> security.syscalls.intercept.mount true

For an ext4 filesystem:
incus config set <CT-NAME> security.syscalls.intercept.mount.allowed ext4

For a btrfs filesystem:
incus config set <CT-NAME> security.syscalls.intercept.mount.allowed btrfs
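
As far as I know the allowed key takes a comma-separated list of filesystems, so for an XFS image like yours it would presumably be:

incus config set <CT-NAME> security.syscalls.intercept.mount.allowed xfs

and then, inside the container, the original mount command can be retried:

/bin/mount -o loop,usrquota,gquota /home.img /home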

That doesn’t seem very robust; it depends on /dev/loop0 being available on the host at the time it’s needed, and not allocated to some other purpose.

Maybe an incus VM is the safest approach here? Given that you want to put a dedicated filesystem inside a block device anyway, the additional overhead of kvm and virtio should be relatively low.

Many thanks to @abdodz1234 for the suggested config changes. Those sure are helpful and I’ll give them a try, just to see if they work. Just in case I ever need this again.

@candlerb wrote:

Maybe an incus VM is the safest approach here?

Given the circumstances? Certainly. Yet it isn’t what our clients asked for and some grumbling ensued. Their primary usage would be CTs for webhosting, email, CardDAV/CalDAV, SQL, FTP and DNS using the BlueOnyx control panel, which is similar to cPanel, Plesk, DirectAdmin, ISPconfig and other such solutions.

When the idea is to pack as many of those hosting VPSes as possible onto a server, containers have way less overhead than VMs. So you can run more with less “iron”, cooling or electricity, which cuts operating costs and the initial investment considerably.

For this kind of “run of the mill” webhosting one must have the possibility to limit how much disk space an individual shared hosting account or user account inside an instance can consume. File system disk quotas for groups and users are the most convenient way to quickly check whether a large number of vsites (groups) and users are within their limits, and they also reliably prevent further disk space consumption once those limits are reached.

OpenVZ has had this for CTs since 2005, but as that’s essentially dead and fast approaching EOL, it’s no longer an option.

I spent the last couple of days overhauling BlueOnyx to work in the absence of file-system disk quotas for users and groups and to still enforce hard limits once vsites and users exceed their allowed disk quotas. So it now works in Incus CTs, too. With some caveats.

Such as: Quickly checking how much disk space a couple hundred groups and several hundred users consume w/o quotas is several orders of magnitude slower if you can’t use “repquota” or similar. Even if you use find/stat to quickly tally the used blocks.
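
Roughly along these lines, as a heavily simplified sketch of what I mean:

# tally used 512-byte blocks per UID under /home and print MiB
find /home -xdev -type f -printf '%U %b\n' | awk '{ b[$1] += $2 } END { for (u in b) printf "%s %.1f MiB\n", u, b[u]*512/1048576 }'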

Given that we don’t have much in the way of an alternative for CTs and seeing the great potential that Incus has? That will have to do.

According to Wikipedia, OpenVZ required a patched kernel to get the full feature set, and presumably those patches were never upstreamed.

Such as: Quickly checking how much disk space a couple hundred groups and several hundred users consume w/o quotas is several orders of magnitude slower if you can’t use “repquota” or similar.

Could you provide a filesystem from the host (as Stéphane suggested), with per-user quotas enabled, but set the quotas very high so enforcement never happens? Then you can still report on the usage.
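
On an XFS /home that could look something like this on the host (group name and limits are just examples):

# absurdly high hard limit, so usage is accounted but never enforced
xfs_quota -x -c 'limit -g bsoft=0 bhard=1000t site1' /home
# report group usage, human-readable
xfs_quota -x -c 'report -g -h' /home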

Otherwise you can look at FreeBSD jails, Solaris zones etc.

Hi Brian,

According to Wikipedia, OpenVZ required a patched kernel to get the full feature set, and presumably those patches were never upstreamed.

Probably, yeah. Or upstream rejected them for good reasons. I faintly recall that a few select features made it through and some didn’t, because the kernel implementation was bordering a bit on the obscene.

Could you provide a filesystem from the host (as Stéphane suggested), with per-user quotas enabled, but set the quotas very high so enforcement never happens? Then you can still report on the usage.

That is an interesting approach, which is indeed partially useful. Let’s see: I have an AlmaLinux 9 CT named “first” on an AlmaLinux 9 node running Incus 0.7. The storage driver is “dir” and it’s using a directory under /home on the node, where /home is mounted this way in /etc/fstab:

/dev/mapper/VolGroup00-home /home xfs defaults,gquota,uquota,pquota 0 0

Inside the AlmaLinux 9 CT on said node I get this:

[root@first ~]# df -h
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-home   30G  506M   30G   2% /
none                         492K  4.0K  488K   1% /dev
devtmpfs                     4.0M     0  4.0M   0% /dev/tty
tmpfs                        100K     0  100K   0% /dev/incus
tmpfs                        100K     0  100K   0% /dev/.incus-mounts
tmpfs                         32G     0   32G   0% /dev/shm
tmpfs                         13G  8.2M   13G   1% /run
tmpfs                        410M     0  410M   0% /run/user/0
[root@first ~]# mount
/dev/mapper/VolGroup00-home on / type xfs (rw,relatime,idmapped,attr2,inode64,logbufs=8,logbsize=32k,usrquota,prjquota,grpquota)
none on /dev type tmpfs (rw,relatime,size=492k,mode=755,uid=1000000,gid=1000000,inode64)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,relatime)
devtmpfs on /dev/fuse type devtmpfs (rw,nosuid,size=4096k,nr_inodes=8161030,mode=755,inode64)
devtmpfs on /dev/net/tun type devtmpfs (rw,nosuid,size=4096k,nr_inodes=8161030,mode=755,inode64)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/incus type tmpfs (rw,relatime,size=100k,mode=755,inode64)
tmpfs on /dev/.incus-mounts type tmpfs (rw,relatime,size=100k,mode=711,inode64)
none on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
lxcfs on /proc/cpuinfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/diskstats type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/loadavg type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/meminfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/slabinfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/stat type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/swaps type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/uptime type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /sys/devices/system/cpu type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
devtmpfs on /dev/full type devtmpfs (rw,nosuid,size=4096k,nr_inodes=8161030,mode=755,inode64)
devtmpfs on /dev/null type devtmpfs (rw,nosuid,size=4096k,nr_inodes=8161030,mode=755,inode64)
devtmpfs on /dev/random type devtmpfs (rw,nosuid,size=4096k,nr_inodes=8161030,mode=755,inode64)
devtmpfs on /dev/tty type devtmpfs (rw,nosuid,size=4096k,nr_inodes=8161030,mode=755,inode64)
devtmpfs on /dev/urandom type devtmpfs (rw,nosuid,size=4096k,nr_inodes=8161030,mode=755,inode64)
devtmpfs on /dev/zero type devtmpfs (rw,nosuid,size=4096k,nr_inodes=8161030,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=1000005,mode=620,ptmxmode=666,max=1024)
devpts on /dev/ptmx type devpts (rw,nosuid,noexec,relatime,gid=1000005,mode=620,ptmxmode=666,max=1024)
devpts on /dev/console type devpts (rw,nosuid,noexec,relatime,gid=1000005,mode=620,ptmxmode=666,max=1024)
none on /proc/sys/kernel/random/boot_id type tmpfs (ro,nosuid,nodev,noexec,relatime,size=492k,mode=755,uid=1000000,gid=1000000,inode64)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,uid=1000000,gid=1000000,inode64)
tmpfs on /run type tmpfs (rw,nosuid,nodev,size=13081700k,nr_inodes=819200,mode=755,uid=1000000,gid=1000000,inode64)
tracefs on /sys/kernel/debug/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=419428k,nr_inodes=104857,mode=700,uid=1000000,gid=1000000,inode64)

Attempts to read the quota information yield nothing via “repquota” and throw an error via xfs_quota:

[root@first ~]# /usr/sbin/repquota -a 
[root@first ~]# 
[root@first ~]# xfs_quota -x -c 'report -h' /
xfs_quota: cannot setup path for mount /: No such device or address

Which isn’t that big of a surprise as far as “repquota” goes, as no quotas specific to the CT have been set on the node aside from the generic project quota that limits this CT to a total of 30 GiB of usage. And “xfs_quota” seems to have an issue with the mountpoint and/or the absence of /etc/fstab.

There is a group “site1” configured in the quota of the node and a matching “group1” exists inside the CT. Yet no quota is reported for this, which may or may not be due to UID/GID-shifting?
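
On the node itself the equivalent report should work, something like the following against the node’s /home, though presumably I’d then have to look for the shifted GID (the CT is idmapped with a base of 1000000, judging by the mount options above) rather than the group name used inside the CT:

xfs_quota -x -c 'report -g -h' /home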

Being able to use the quota tools to report on user and group disk usage would be nice, but I think I can make do without them, thanks to our overhauled code in BlueOnyx. If the node’s quotas can be used in one way or another but can’t be modified from inside the CT, they’re not really useful to us.

Because in a typical scenario a Vsite gets created with a siteAdmin, either manually or via a provisioning tool. That Vsite is then handed off to the end user, who can use the GUI to create and modify further users and set their disk-quota allowance. This would need to feed back to an API on the node to append/adjust the disk quotas there as well.
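
In practice that callback would boil down to the node running something like this on behalf of the CT (group name and limit are made up; in reality it would presumably have to target the shifted GID):

xfs_quota -x -c 'limit -g bhard=5g site1-user22' /home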

That quickly gets complicated once CTs are moved from one node to another, or get backed up and restored, which may or may not happen during their lifetime. Having the CTs isolated, with no unnecessary callbacks to the node, is much preferable.

Otherwise you can look at FreeBSD jails, Solaris zones etc.

Yeah, that’s not really an option, even though FreeBSD jails sure are great. Our codebase and build environment are deeply rooted in the RedHat hellscape, meaning our OSes of choice are currently AlmaLinux 8 and 9 or RockyLinux 8 and 9. The build env for BlueOnyx spits out around 1200 RPMs all things said and done, and porting that to a non-RPM-based architecture is such a major undertaking that it ain’t funny.

I’ll try to make do without quotas, but I really do appreciate your input, as I still have much to learn about Incus. Many thanks!