How to disable rsync compression for "lxc copy"? 100% cpu and slow for local network transfer

Over a 1Gb private local connection, on an AMD 3900X with a 2x NVMe RAID0 LVM-thin ext4 container, lxc copy is only able to transfer at about 40MB/s (~300Mbps), with rsync at 100% CPU on one core.
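As a quick sanity check on those numbers (bytes to bits is just a factor of 8, so 40MB/s is roughly the ~300Mbps observed):

```shell
# 40 MB/s expressed in megabits per second: multiply by 8
echo "$((40 * 8)) Mbps"   # prints "320 Mbps"
```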

Here is the rsync process spawned by lxc copy:

sh -c /snap/lxd/current/bin/lxd netcat @lxd/6a92977a-7acd-4f61-bd3f-df53e1a3a64b xxx localhost rsync --server -logD
tpXrSze.iLsfxC --compress-level=2 --delete --partial --numeric-ids . /tmp/foo

The two local servers can sustain a full 1Gbps transfer between them, and we believe the --compress-level=2 option is what is driving rsync to 100% CPU.

However, we have not found a way to disable compression, or to pass options to the rsync that lxc copy starts.

Any help is appreciated. Thanks.

Name    Version   Rev    Tracking       Publisher   Notes
core18  20200724  1885   latest/stable  canonical✓  base
lxd     4.6       17320  latest/stable  canonical✓  -
snapd   2.46.1    9279   latest/stable  canonical✓  snapd

It should be possible to introduce a new rsync.compression option on storage pools to disable transfer compression when copying to/from a particular pool.
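Assuming a pool named default, toggling such a per-pool key could look like this (key and pool names are illustrative, since the option doesn't exist yet):

```shell
# Disable rsync transfer compression on the "default" pool:
lxc storage set default rsync.compression false

# Confirm the current value:
lxc storage get default rsync.compression
```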

That would be great for a future release.

In the meantime, can we compile LXD with the compression args removed from its rsync code and place the custom LXD binary in /snap/lxd/current/bin? Is this the best way to run a custom LXD binary?

Yeah, a Github issue for it would be good. We’ll tag it as easy+hacktoberfest so hopefully someone picks it up as a way to start contributing to LXD :slight_smile:

To run an alternate LXD binary, we actually have a cleaner option which is to put it at /var/snap/lxd/common/lxd.debug. This will also cause a message to be logged on startup, reminding you that you’re not running the original binary.
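A sketch of that workflow, assuming you've already built a custom lxd binary in the current directory (the restart step is my assumption; a reload may also suffice):

```shell
# Install the custom binary as the snap's debug override:
cp ./lxd /var/snap/lxd/common/lxd.debug
chmod +x /var/snap/lxd/common/lxd.debug

# Restart the daemon so it picks up the override
# (it will log a reminder that a non-original binary is running):
systemctl restart snap.lxd.daemon
```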

Created the GitHub issue. Turns out my two servers had 10G NICs, and disabling compression on rsync improved lxc copy from 300Mbps to 3500Mbps. CPU went from 100% to 80%. :exploding_head:

2 Likes

@stgraber Have there been any rsync changes since Oct 2020? There appears to be a regression in rsync transfer speed by a factor of about 3.5x, even with rsync.compression set to false on both the source and destination storage pools.

Previously I got 3500Mbps on a 10G LAN; now I can only do 1000Mbps. Disk I/O is not the bottleneck: running multiple lxc copy operations in parallel, each instance still does 1000Mbps, so with 5 parallel operations I can reach 5000Mbps. That points to some per-transfer limitation in rsync, which is strange.

Ubuntu 20.04
Snap LXD 4.21
rsync version 3.2.3 protocol version 31

rsync --server -vlogDtpre.iLsfx --numeric-ids --devices --partial --sparse --xattrs --filter=-x security.selinux --delete . /var/snap/lxd/common/lxd/storage-pools/default2/containers/test/

The rsync packaged with LXD does not appear to have any CPU or OpenSSL optimizations. Is it okay for me to test by temporarily replacing the rsync binary in the /snap/lxd/22162/bin folder on both source and destination?

root@rat1:~# /snap/lxd/22162/bin/rsync --version
rsync  version 3.1.3  protocol version 31
Copyright (C) 1996-2018 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
    append, ACLs, xattrs, iconv, symtimes, prealloc
root@rat1:~# rsync --version
rsync  version 3.2.3  protocol version 31
Copyright (C) 1996-2020 by Andrew Tridgell, Wayne Davison, and others.
Web site: https://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, hardlink-specials, symlinks, IPv6, atimes,
    batchfiles, inplace, append, ACLs, xattrs, optional protect-args, iconv,
    symtimes, prealloc, stop-at, no crtimes
Optimizations:
    SIMD, asm, openssl-crypto
Checksum list:
    xxh128 xxh3 xxh64 (xxhash) md5 md4 none
Compress list:
    zstd lz4 zlibx zlib none

The /snap/lxd directories are read-only, so I was unable to test a modified rsync with the OpenSSL/CPU optimizations from stock Ubuntu 20.04.

@stgraber do we compile rsync in the snap?

No, we don’t, we include whatever comes from the base OS, so Ubuntu 20.04 these days.

You can’t overwrite the content but you can overmount it.
mount -o bind /some/path/to/rsync /snap/lxd/current/bin/rsync

@stgraber I tried to overmount as you described. The overmount works, but lxc copy (snap LXD) fails instantly, complaining about missing linked libraries. It appears that running a non-snap rsync from inside the snap breaks library resolution, since the libraries it links against can't be found in the snap's environment.
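One way to see why (a generic diagnostic, assuming the host rsync lives at /usr/bin/rsync): list the shared libraries the host binary expects, which generally won't resolve at those paths inside the snap's mount namespace:

```shell
# Show the dynamic library dependencies of the host rsync;
# these host paths are what the snap environment fails to provide:
ldd /usr/bin/rsync
```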

I've pretty much isolated the glass ceiling of lxc copy to rsync, and I just need a fast way to verify it.

  1. Ubuntu 22.04
  2. 5.19 custom kernel
  3. MTU 9000
  4. iperf3 verified 37Gb/s bidirectional over a 40G NIC.
  5. fio verified 1000MB/s+ read/write speed on local PCIe 4 NVMe.
  6. Snap LXD latest stable
  7. rsync.compression off on the LXC storage pools
  8. AMD Zen 3 CPU

Even with the above config, lxc copy of large database files between the two machines can only muster 3Gb/s max transfer speed, or around 375MB/s.

What is the easiest way for me to inject a distro-default (optimized) rsync into LXD for testing? I can compile LXD if need be.

Thanks.

For reference (ubuntu 22.04) using same two identical machines and same container files (dirfs):

  1. scp over the same link/machines consumes at most 4.3Gb/s of bandwidth, or ~520MB/s.

  2. rsync (rsync -avzh --progress) over the same link/machines consumes at most 4.6Gb/s, or ~575MB/s.

So the LXD-bundled rsync is ~1.6Gb/s slower than the 22.04 distro version, which is optimized with SIMD/asm and the OpenSSL library.

Maybe we should add a rsync.external option to the snap package?

@stgraber ?

1 Like

Nah, that’s starting to get a little too much if we make that flag for every single binary we use. We’ll be moving to core22 next cycle, that will take care of it.

2 Likes

Disabling rsync compression (now optional) increased transfer speed from 300Mb/s to 3Gb/s, and the pending move to core22 rsync will raise that from 3Gb/s to 4.6Gb/s (depending on CPU). That's great, but both are still well below the ceiling of 10G NICs and NVMe devices.

25G and 40G+ NICs are starting to proliferate, so perhaps the LXD v6 roadmap can include a customized rsync, or a replacement, that can break the 10Gb/s barrier (without compression). Fingers crossed.

There is perhaps another optimization that could double throughput beyond moving to the 22.04 release of rsync.

Below are tests using rsync on the same 22.04 Zen 3 systems with 40G NICs. Note that selecting aes128/aes256 (likely hardware-accelerated) gave a tremendous speedup. I'm not sure what the default encryption for rsync in current LXD is. Speeds are in bytes/s.

rsync  -a --progress -e "ssh -T -c aes128-gcm@openssh.com -o Compression=no -x" 

Max observed: 1.36GB/s

rsync  -a --progress -e "ssh -T -c aes256-gcm@openssh.com -o Compression=no -x" 

Max observed: 943MB/s

rsync  -a --progress -e "ssh -T -c chacha20-poly1305@openssh.com  -o Compression=no -x" 

Max observed: 595MB/s
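The three runs above can be scripted as a single loop to keep the comparison consistent (REMOTE and SRC are placeholders, not from the original tests):

```shell
# Benchmark rsync over several SSH ciphers; adjust REMOTE and SRC.
REMOTE="backup@10.0.0.2:/tmp/"
SRC="/srv/testfile"

for cipher in aes128-gcm@openssh.com aes256-gcm@openssh.com chacha20-poly1305@openssh.com; do
    echo "== $cipher =="
    rsync -a --progress -e "ssh -T -c $cipher -o Compression=no -x" "$SRC" "$REMOTE"
done
```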

A new LXD storage rsync.encryption option, perhaps? Combining the 22.04 build of rsync with encryption scheme selection would push LXD remote copy beyond 10Gb NICs for the first time (depending on CPU AES hardware acceleration support).

1 Like

LXD doesn’t use rsync over SSH though. It uses rsync over the HTTPS connection.

Hi,

I just tested the rsync.compression configuration option on a pool, and it works great.
The only thing is, I don't see it documented on the LXD docs web pages yet.
Shouldn't it be included?

Quinn