Performance improvement tips?

I have Kopia installed on a Ryzen 3700X system with all-SSD storage. I’m testing backup performance to a repository server over a 10 Gbit network. With the default settings, compression set to zstd in the policy, and a test folder of mostly large (1 GB+) files, I’m only able to achieve ~400-500 Mbit/s over SFTP or WebDAV (no SSL), running the official Docker image. Looking at CPU usage, Kopia only seems to use 1-2 cores, and not always at full utilization. Is there any way to parallelize the backup task to make better use of the available resources?

Thanks for the wonderful software and for any tips!

I’m not entirely sure here (I’m a new user myself), but you could be limited by your compression choice. You might try testing with s2-parallel-4 or s2-parallel-8.

I think the repository server itself may be a bottleneck; I don’t think it was ever pushed to 10 Gbit/s. Try using SFTP or WebDAV directly from the client. Basically:

kopia repository connect sftp

instead of

kopia repository connect server

Also you can try:

  • enabling/disabling compression (in Kopia 0.8 it happens client-side; in 0.9 it will be possible to compress server-side)
  • changing the number of parallel uploads (--parallel=X)
  • creating the repository with different hash/encryption algorithms, based on what works best (see kopia benchmark crypto --print-options)

When performing those tests, make sure to use a new repository each time, as caching and deduplication can have surprising effects on these kinds of benchmarks.
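As a rough sketch of that kind of test, the script below times one snapshot per --parallel value, each against a freshly created repository. It assumes a local filesystem repository as a stand-in (swap in your own sftp or webdav create command to measure the network path), and SRC, REPO_ROOT, the config file location, and the password are placeholders I made up, not values from this thread. It defaults to a dry run that only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch: time a snapshot at several --parallel values, each against a fresh
# repository so caching and deduplication cannot skew the numbers.
# SRC, REPO_ROOT, CFG and the password are hypothetical placeholders.
SRC="${SRC:-/data/test-folder}"
REPO_ROOT="${REPO_ROOT:-/tmp/kopia-bench}"
CFG="${CFG:-$REPO_ROOT/bench.config}"   # separate config so an existing connection is left alone
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the commands
export KOPIA_PASSWORD="${KOPIA_PASSWORD:-benchmark-only}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Every kopia call goes through the benchmark-only config file.
k() {
    run kopia --config-file="$CFG" "$@"
}

bench_parallel() {
    for p in "$@"; do
        repo="$REPO_ROOT/parallel-$p"
        run rm -rf "$repo"
        run mkdir -p "$repo"
        k repository create filesystem --path="$repo"
        k policy set --global --compression=zstd
        start=$(date +%s)
        k snapshot create "$SRC" --parallel="$p"
        end=$(date +%s)
        echo "parallel=$p elapsed=$((end - start))s"
        k repository disconnect
    done
}

bench_parallel 1 4 10
```

Dividing the size of the test folder by the elapsed time gives a throughput figure you can compare across runs. To compare compression settings, change the --compression value (for example to s2-parallel-4) and repeat.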

Also, Kopia does not currently support parallel uploads of individual large files, so a file that’s, say, 10 GB would be processed single-threaded, limiting throughput. If you have many such files in a single directory, uploads will be parallelized across them, but it’s not as efficient for a small number of very large files. There’s also no parallelization across directories, so lots of directories with very few files each will limit CPU usage. There are plans to improve all this in future versions.


Thank you, this has actually been very helpful, and --parallel=10 (just as a random starting point) seems to have greatly improved performance. Is there a way to set --parallel through the web interface or in a policy? Right now I’m passing it manually from the CLI, but I’d like it to be applied automatically through the policy in the Docker container that my instance of Kopia is actually running in.