ZFS vs EXT4 Performance on High Load with LXD-Benchmark

Yes. Expecting the same results without using the exact same hardware, software versions, etc. doesn't seem like a reasonable goal.

My equipment and some information

OS: Alpine Linux v3.17 x86_64
Host: ProLiant DL380p Gen8
Kernel: 5.15.79-0-lts
CPU: Intel Xeon E5-2667 v2 (32) @ 4.000GHz
Memory: 2129MiB / 257902MiB
Raid Mode: HBA
Disk: 12 × 6 TB SAS HDDs

Is this different from the setup in the video?

Certainly

I am testing on server hardware
But @stgraber is testing on his PC, if I'm not mistaken

If the hardware is different, do you still expect similar synthetic performance?

The point is that I am comparing the speed of ZFS and EXT4 on the same server with identical settings

Your results do not seem unexpected. You may be trying to find an issue where none exists.

So ZFS is expected to be slower than EXT4?

I would expect that running a synthetic benchmarking utility on different hardware and software versions than the ones in the demo video would give you this type of result. The benchmark utility doesn't demonstrate real-world performance differences between these filesystems.

I tried running ZFS and EXT4 in the following configurations:

- on a single disk
- on the RAID array
- in RAM (see the sketch below)

And in every case ZFS is slower than EXT4 with standard LXD settings
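For the "in RAM" case, one common way to get a throwaway pool is a file-backed vdev on tmpfs. This is an illustrative sketch, not necessarily how the test here was set up:

truncate -s 8G /dev/shm/ztest.img        # sparse file in tmpfs acts as the vdev
zpool create -f ztest /dev/shm/ztest.img # the pool then lives entirely in RAM
zpool destroy ztest                      # tear down after benchmarking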

I checked with both fio and lxd-benchmark
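Invocations along these lines would exercise both tools (illustrative only; the exact commands and parameters used in this thread aren't shown):

# fio: 8K random writes, roughly matching a database-style workload
fio --name=randwrite --ioengine=libaio --direct=1 --rw=randwrite \
    --bs=8k --size=2G --numjobs=4 --runtime=60 --time_based --group_reporting
# lxd-benchmark: batch container creation against the current storage pool
lxd-benchmark launch --count 20 images:debian/11
lxd-benchmark delete   # clean up the test containers afterwards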

I can re-run the tests right now with any other utility and provide any information you need, no problem

I really want to use ZFS but don’t know how to make it faster than EXT4

Regards.

Yes. You are clearly hoping that the synthetic benchmarks will show a better performance result for ZFS than for EXT4.

Do you know how to achieve faster performance?

For context: I will be running PostgreSQL in my containers on ZFS storage under very high load

Regards.

PS: Thanks for still helping me

Why not stage the system as you intend and run some real-world tests against it rather than synthetic benchmarks?

It’s not good that container creation is slower on ZFS

Okay, then I'll run the tests using the pgbench utility

Is this a good idea?

No. I am recommending that you configure the system as you intend to use it and run real-world tests against it to determine real-world performance.

Okay, I'm now setting up the system the way I would set it up in production, and then I'll measure how fast it works

I will use Debian 12 on the host, Debian 11 in the container, and PostgreSQL 15

I'll set it up now and report the results


There were problems installing Debian 12 on the server because, unfortunately, it has no native ZFS support

But I decided to run Alpine Linux from RAM and install LXD on it with standard settings
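On Alpine, a minimal LXD-with-ZFS setup looks roughly like this (a sketch assuming the stock apk packages; not the exact commands from this thread):

apk add zfs zfs-lts lxd   # zfs-lts provides the module for the lts kernel
modprobe zfs
rc-update add lxd
rc-service lxd start
lxd init --auto           # accept the standard settings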

I created a Debian 11 container for pgbench

I launched my ERP system and it works exactly the same as it did on EXT4, though with the caveat that this is not a high-load scenario, since I am the only user

To check high loads, I still had to run pgbench
Here are my stats on ZFS

root@pgbench:~# pgbench -h localhost -p 5432 -U postgres -c 50 -j 2 -P 60 -T 600 benchmark

Password:
pgbench (15.3)
starting vacuum...end.
progress: 60.0 s, 17607.9 tps, lat 2.744 ms stddev 1.433, 0 failed
progress: 120.0 s, 16685.8 tps, lat 2.911 ms stddev 1.139, 0 failed
progress: 180.0 s, 16754.9 tps, lat 2.899 ms stddev 1.611, 0 failed
progress: 240.0 s, 16348.3 tps, lat 2.972 ms stddev 1.150, 0 failed
progress: 300.0 s, 16373.4 tps, lat 2.968 ms stddev 1.666, 0 failed
progress: 360.0 s, 16450.3 tps, lat 2.954 ms stddev 1.257, 0 failed
progress: 420.0 s, 16272.8 tps, lat 2.986 ms stddev 1.737, 0 failed
progress: 480.0 s, 16109.5 tps, lat 3.018 ms stddev 1.167, 0 failed
progress: 540.0 s, 15348.4 tps, lat 3.171 ms stddev 2.173, 0 failed
progress: 600.0 s, 15340.2 tps, lat 3.173 ms stddev 1.875, 0 failed
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 150
query mode: simple
number of clients: 50
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 9797529
number of failed transactions: 0 (0.000%)
latency average = 2.975 ms
latency stddev = 1.554 ms
initial connection time = 100.341 ms
tps = 16331.009234 (without initial connection time)

Are you unable to configure the system as you intend and run real-world tests against it?

Sorry, it turns out I was testing in RAM again :)
OK, I have now created a pool on my disks like this:

bench:~# modprobe zfs
bench:~#    zpool create -f -o ashift=12 \
>        -O acltype=posixacl -O canmount=off -O compression=lz4 \
>        -O dnodesize=auto -O normalization=formD -O relatime=on -O xattr=sa \
>        -O recordsize=8K -O atime=off -O logbias=throughput \
data mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd mirror /dev/sde /dev/sdf mirror /dev/sdg /dev/sdh mirror /dev/sdi /dev/sdj mirror /dev/sdk /dev/sdl
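Handing an existing pool to LXD typically looks something like this (a sketch; the storage pool name "pgpool" and the exact commands are assumptions, since this step isn't shown in the thread):

lxc storage create pgpool zfs source=data        # attach the existing zpool "data"
lxc profile device set default root pool=pgpool  # route new containers to it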

System set up successfully
Here are the pgbench results when using the disks:

root@pgbench:~# pgbench -h localhost -p 5432 -U postgres -i -s 150 benchmark

Password:
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
15000000 of 15000000 tuples (100%) done (elapsed 29.26 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 41.73 s (drop tables 0.00 s, create tables 0.14 s, client-side generate 30.05 s, vacuum 0.77 s, primary keys 10.76 s).

root@pgbench:~# pgbench -h localhost -p 5432 -U postgres -c 50 -j 2 -P 60 -T 600 benchmark
Password:
pgbench (15.3)
starting vacuum...end.
progress: 60.0 s, 2156.3 tps, lat 23.016 ms stddev 9.696, 0 failed
progress: 120.0 s, 2125.8 tps, lat 23.379 ms stddev 10.477, 0 failed
progress: 180.0 s, 2082.9 tps, lat 23.866 ms stddev 10.236, 0 failed
progress: 240.0 s, 2067.1 tps, lat 24.054 ms stddev 11.318, 0 failed
progress: 300.0 s, 1526.7 tps, lat 32.653 ms stddev 24.898, 0 failed
progress: 360.0 s, 2005.5 tps, lat 24.821 ms stddev 11.331, 0 failed
progress: 420.0 s, 1956.7 tps, lat 25.460 ms stddev 10.749, 0 failed
progress: 480.0 s, 1698.5 tps, lat 29.348 ms stddev 11.020, 0 failed
progress: 540.0 s, 1842.1 tps, lat 27.051 ms stddev 12.325, 0 failed
progress: 600.0 s, 1888.4 tps, lat 26.363 ms stddev 14.288, 0 failed
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 150
query mode: simple
number of clients: 50
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 1161060
number of failed transactions: 0 (0.000%)
latency average = 25.720 ms
latency stddev = 13.184 ms
initial connection time = 89.406 ms
tps = 1935.208875 (without initial connection time)

The result is of course worse (because this test ran on the disks rather than in RAM), but this is the configuration I would use in a production environment
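For what it's worth, here are a few commonly cited tunings for PostgreSQL on ZFS that may narrow the gap; these are starting points to verify with pgbench, not guaranteed wins on this hardware:

# recordsize=8K (already set above) matches PostgreSQL's 8 kB pages
# in postgresql.conf, on a copy-on-write filesystem:
#   full_page_writes = off   # ZFS record writes are atomic, so torn-page protection is redundant
#   wal_recycle = off        # WAL segment recycling is counterproductive on CoW
#   wal_init_zero = off      # likewise for zero-filling new WAL segments
# a dedicated dataset keeps the tunings scoped to the database:
zfs create -o recordsize=8K -o logbias=throughput data/postgres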

Regards.