My benchmark of kopia and bupstash

nh2 · July 1, 2021, 1:02am

I recently benchmarked current versions of kopia and bupstash for various scenarios to evaluate these tools.

Report: GitHub - nh2/bupstash-kopia-comparison: Evaluation benchmarks of backup tools bupstash and kopia

I will likely update the report a couple times as the tools improve over time.

For bupstash, I’ve implemented (but not benchmarked yet) a WIP for per-directory parallel stat()ing, since stat() is its current bottleneck for my use case.
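To illustrate what per-directory parallel stat()ing means, here is a minimal Python sketch. It is not bupstash's code (bupstash is written in Rust, and the WIP is not shown in this thread); the function names `stat_directory` and `parallel_stat_tree` and the worker count are assumptions made up for this example. Each directory is handled by one worker, so the stat() calls of different directories can overlap, which is where a high-latency networked file system would likely gain the most.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def stat_directory(path):
    """Stat every entry of a single directory.

    Returns (results, subdirs): results is a list of (path, size) pairs,
    subdirs lists the directories that still need to be visited.
    """
    results, subdirs = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            st = entry.stat(follow_symlinks=False)
            results.append((entry.path, st.st_size))
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
    return results, subdirs


def parallel_stat_tree(root, workers=8):
    """Walk the tree level by level, one worker task per directory."""
    all_results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = [root]
        while pending:
            next_level = []
            # All directories of the current level are stat()ed concurrently.
            for results, subdirs in pool.map(stat_directory, pending):
                all_results.extend(results)
                next_level.extend(subdirs)
            pending = next_level
    return sorted(all_results)
```

Calling `parallel_stat_tree("/some/dir", workers=16)` returns a sorted list of `(path, size)` pairs. A real implementation would also need error handling (e.g. for permission errors) and probably a work queue instead of level-by-level batches.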

For kopia, my main request is to make the number of threads it uses configurable at runtime (instead of having it hardcoded to 16), as my networked file system would benefit a lot from that. There are also a couple of issues I found (and linked).

If you find answers to the open questions in there (e.g. why my kopia run didn’t deduplicate the data within the first run on the “4 GB, small files” dataset), I would appreciate it if you answered them here or filed an issue in my report’s repo.

Thanks, and happy backuping!

nh2 · July 1, 2021, 1:06am

Btw, this discourse is configured such that

Sorry, new users can only put 2 links in a post.

which is somewhat annoying for technical work (e.g. linking to issues).

jkowalski · July 2, 2021, 2:16pm

Interesting report. Kopia generally deduplicates across “contents”, which are sections of large files between 1MB and 8MB; this keeps the number of entries in the index low. It also makes deduplication across small files ineffective, but those are typically compressible (esp. log/source files).
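A toy Python sketch of why section-level deduplication does little for small files. This is not Kopia's implementation: it assumes fixed-size sections and a plain SHA-256 index purely for illustration, and the names `split_into_sections`, `store` and `stored_bytes` are made up for this example. The point is only that a file smaller than one section is stored as a single unit, so it deduplicates only against a byte-identical file.

```python
import hashlib


def split_into_sections(data, size):
    """Cut data into sections of at most `size` bytes (fixed-size, an assumption)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def store(files, size):
    """Index every section by its hash; identical sections are stored once."""
    index = {}
    for data in files:
        for section in split_into_sections(data, size):
            index.setdefault(hashlib.sha256(section).hexdigest(), section)
    return index


def stored_bytes(index):
    """Total number of bytes kept after deduplication."""
    return sum(len(section) for section in index.values())


# A "large" file made of three identical sections is stored as one section.
large = b"A" * (3 * 1024)
assert stored_bytes(store([large], size=1024)) == 1024

# Two small files that differ in a single byte share nothing in the index.
small_a = b"x" * 99 + b"1"
small_b = b"x" * 99 + b"2"
assert stored_bytes(store([small_a, small_b], size=1024)) == 200
```

The section size of 1024 bytes is a toy value standing in for the 1MB to 8MB range mentioned above.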

If you can, try with:

$ kopia policy set --global --compression=zstd-fastest

or even better run:

$ kopia benchmark compression --data-file=sample-file.log

and pick the compression method that is fastest on your machine.

You have some excellent points about parallelism, memory consumption, etc. All things we should improve over time - I’d be happy to review PRs for those.

Definitely please file individual GH issues for proposed improvements. That would be highly appreciated.
