Best practice for creating a data directory for MySQL

Guides that discuss using ZFS with MySQL / MariaDB always say you should create a directory to hold the MySQL data with recordsize=16k, to match MySQL’s block size.

I’ve created a Debian container with LXD and would like to run MySQL inside it, but the recordsize of the container in the ZFS pool is just at the default 128k. I would like to create another ZFS filesystem to just hold the MySQL data, so I can set that filesystem to have recordsize=16k.

I’m wondering what is the best practice way to do this with LXD?

Do I need to create a ZFS filesystem outside LXD’s ZFS dataset and share it with the container? Something like:

zfs create -p rpool/extracontainerdata/mycontainer/mysql/data

zfs set recordsize=16k rpool/extracontainerdata/mycontainer/mysql/data

lxc config device add mycontainer mysqldata disk source=/extracontainerdata/mycontainer/mysql/data path=/var/lib/mysql/data/

echo -en “uid 117 106\ngid 124 110” | lxc config set mycontainer raw.idmap -

It seems like this should work but seems a little clunky as this extra ZFS dataset lives outside the dataset that LXD manages. Is there a better way to do this where I can create the device so LXD itself manages it, just like it currently does with the main root filesystem for this container? Thanks.

HI @rosm

First of all, I hope my comments will help a bit.

zfsutils-linux(ubuntu) cannot be installed on containers in lxd.
This is because zfs is a kernel-based file system.

It is possible to save mysql data on the ext4 file system if there is no performance problem. With the syscall interception function, you can create, mount, and use an ext4 file system by putting the disk(not bind mount) in a container.

Thank you.

Yes, I create a separate ZFS filesystem for MySQL/MariaDB data, with recordsize=16k. I also create a separate zfs dataset for the Innodb log, with recordsize=128k (the zfs default).
I add this to /etc/mysql/my.cnf:

[mysqld]
skip-innodb_doublewrite
innodb_use_native_aio=0
innodb_use_atomic_writes=0

innodb_log_group_home_dir = /var/lib/mysql-log
bind-address = 0.0.0.0

/var/lib/mysql-log is where I mount the log filesystem.
I add these two external filesystems to a profile, which I then add to the container.

I don’t use an idmap. Instead, I change the ownership of the filesystem to match the uid/gid of the container’s mysql user. For my container, this is 1000111:1000115. The idmap may be better, because you can change it appropriate when moving the filesystem to a container with different mysql uid.

Using such external filesystems with containers has advantages and disadvantages. The disadvantage is that creating snapshots or copies of the container (e.g. for testing) requires more care, because you need to snapshot/clone the external filesystem too. The advantage is that you can manage the database data separately from the container OS, move it to another container, etc. I’ve been using external filesystems with several of my containers, and I try to thus separate the guest OS from my data.

1 Like

After creating the storage volume, set the necessary settings and attach it to the container, it will be better than the bind mount.

Please see the picture below.

@lbg74 I’m wondering what advantage do you see using a storage volume instead of an external filesystem as a bind mount? Thanks.

Please remember that my answer to your question is at a rudimentary level.

I would like @stgraber or @brauner to give a more professional answer.

The advantages of using a storage volume over the bind mount I think are:

  1. It is more secure in terms of security because it is not mounted as a file system on the host (because it is invisible…).
    –> Users cannot accidentally delete or change files.

  2. It is more convenient to share than bind mount when it needs to be shared by multiple containers for data or other reasons. (Including when the container is not in privileged mode, using shiftfs.)

  3. When doing snap, copy, or move of a container, there is less to worry about than bind mount.

I don’t know about the performance bind-mount vs storage volume.
(Somehow this seems to be the most important part.)

Thank you. @rosm

lxc storage volume create data nextcloud-data
zfs set recordsize=16k ssd/lxd/custom/database-data
lxc storage volume attach default database-data database /var/lib/mysql-data

lxc storage volume create data nextcloud-logs
zfs set recordsize=128k ssd/lxd/custom/database-logs
lxc storage volume attach default database-data database /var/lib/mysql-logs