Kopia gzip compression

fortemezzo · September 7, 2022, 5:02pm

I was a bit surprised when I ran a compression benchmark on a very compressible log file (16 MB) on an older i7 with 2 cores and 4 threads:

     Compression                Compressed   Throughput   Allocs   Usage
------------------------------------------------------------------------------------------------
  0. s2-default                 1.3 MiB      2.1 GiB/s    944      18.5 MiB
  1. s2-parallel-4              1.3 MiB      2.1 GiB/s    940      18.5 MiB
  2. s2-parallel-8              1.3 MiB      1.9 GiB/s    993      29.6 MiB
  3. s2-better                  1.2 MiB      1.5 GiB/s    937      18.4 MiB
  4. pgzip-best-speed           1.7 MiB      644.1 MiB/s  1119     35.9 MiB
  5. zstd-fastest               757.8 KiB    575.1 MiB/s  5538     11.1 MiB
  6. zstd                       673.3 KiB    529.2 MiB/s  2807     20.1 MiB
  7. pgzip                      1.1 MiB      380.8 MiB/s  1126     36 MiB
  8. deflate-best-speed         1.7 MiB      352.2 MiB/s  33       5.6 MiB
  9. gzip-best-speed            1.7 MiB      218 MiB/s    39       5.9 MiB
 10. deflate-default            1.1 MiB      216.3 MiB/s  32       3.5 MiB
 11. zstd-better-compression    535.5 KiB    185.1 MiB/s  2938     38.6 MiB
 12. gzip                       1 MiB        84.3 MiB/s   37       3.2 MiB
 13. pgzip-best-compression     0.9 MiB      60.3 MiB/s   1154     38.1 MiB
 14. deflate-best-compression   0.9 MiB      23.5 MiB/s   33       3.5 MiB
 15. gzip-best-compression      0.9 MiB      19.3 MiB/s   36       3.2 MiB

pgzip is 4.5 times faster than gzip, but I only have 2 cores. I’d expect a factor of 2.5-3 maybe (hyper-threading would help a bit). So my guess is that the pgzip implementation is faster per core. Wouldn’t it be better to replace gzip with pgzip with a concurrency of 1?
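To make the arithmetic behind that factor explicit, here is a minimal sketch that recomputes the pgzip-to-gzip speedup at each level from the throughput column of the table above. The dictionary and function names are made up for illustration; only the numbers come from the benchmark.

```python
# Throughput in MiB/s, copied from the benchmark table above.
THROUGHPUT_MIB_S = {
    "gzip-best-speed": 218.0,
    "pgzip-best-speed": 644.1,
    "gzip": 84.3,
    "pgzip": 380.8,
    "gzip-best-compression": 19.3,
    "pgzip-best-compression": 60.3,
}


def pgzip_speedup(gzip_name: str) -> float:
    """Return how many times faster the pgzip variant is than the gzip one."""
    return THROUGHPUT_MIB_S["p" + gzip_name] / THROUGHPUT_MIB_S[gzip_name]


for name in ("gzip-best-speed", "gzip", "gzip-best-compression"):
    print(f"p{name} vs {name}: {pgzip_speedup(name):.2f}x")
# pgzip-best-speed vs gzip-best-speed: 2.95x
# pgzip vs gzip: 4.52x
# pgzip-best-compression vs gzip-best-compression: 3.12x
```

Only the default level shows the 4.5x gap; best-speed and best-compression come out at roughly 3x, close to what 2 cores with hyper-threading might give. The default gzip and pgzip rows also report slightly different compressed sizes (1 MiB vs 1.1 MiB), so the two may not be doing exactly the same work at that level.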

IcePic · September 19, 2022, 1:53pm

Is there actually any case where someone would want gzip at all?
For speed, s2 is faster by a huuuge margin; for overall performance, zstd is better even in zstd-better-compression mode, as seen above. The only reason to use gzip would be low memory usage, but that is mostly moot these days.
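That claim can be checked against the table above. The sketch below is only an illustration (the `ROWS` list and helper names are hypothetical, the figures are copied from the benchmark): for each gzip variant it lists the algorithms that are both strictly smaller and strictly faster, then checks whether any of them also uses less memory.

```python
KIB_PER_MIB = 1024
MIB_PER_GIB = 1024

# (name, compressed size in KiB, throughput in MiB/s, memory usage in MiB)
ROWS = [
    ("s2-default", 1.3 * KIB_PER_MIB, 2.1 * MIB_PER_GIB, 18.5),
    ("s2-parallel-4", 1.3 * KIB_PER_MIB, 2.1 * MIB_PER_GIB, 18.5),
    ("s2-parallel-8", 1.3 * KIB_PER_MIB, 1.9 * MIB_PER_GIB, 29.6),
    ("s2-better", 1.2 * KIB_PER_MIB, 1.5 * MIB_PER_GIB, 18.4),
    ("pgzip-best-speed", 1.7 * KIB_PER_MIB, 644.1, 35.9),
    ("zstd-fastest", 757.8, 575.1, 11.1),
    ("zstd", 673.3, 529.2, 20.1),
    ("pgzip", 1.1 * KIB_PER_MIB, 380.8, 36.0),
    ("deflate-best-speed", 1.7 * KIB_PER_MIB, 352.2, 5.6),
    ("gzip-best-speed", 1.7 * KIB_PER_MIB, 218.0, 5.9),
    ("deflate-default", 1.1 * KIB_PER_MIB, 216.3, 3.5),
    ("zstd-better-compression", 535.5, 185.1, 38.6),
    ("gzip", 1.0 * KIB_PER_MIB, 84.3, 3.2),
    ("pgzip-best-compression", 0.9 * KIB_PER_MIB, 60.3, 38.1),
    ("deflate-best-compression", 0.9 * KIB_PER_MIB, 23.5, 3.5),
    ("gzip-best-compression", 0.9 * KIB_PER_MIB, 19.3, 3.2),
]
BY_NAME = {row[0]: row for row in ROWS}


def dominators(target: str) -> list[str]:
    """Names of algorithms that are both smaller and faster than `target`."""
    _, size, speed, _ = BY_NAME[target]
    return [n for n, s, t, _ in ROWS if s < size and t > speed]


def dominators_with_less_memory(target: str) -> list[str]:
    """Subset of dominators() that also reports lower memory usage."""
    mem = BY_NAME[target][3]
    return [n for n in dominators(target) if BY_NAME[n][3] < mem]


for variant in ("gzip-best-speed", "gzip", "gzip-best-compression"):
    print(variant, "is beaten on size and speed by:", dominators(variant))
    print("  ...of which use less memory:", dominators_with_less_memory(variant))
```

On this particular log file every gzip variant is beaten on both size and speed by at least the three zstd modes, and none of the winners reports lower memory usage than gzip, which matches the point about memory being the only remaining argument.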

fortemezzo · September 19, 2022, 3:34pm

Good points. I ended up using zstd and I’m happy with the results.

Low memory usage can be important, though, when running Kopia on a single-board computer with limited memory (mine has 2 GB), but in my case that failed even without compression enabled.

IcePic · October 4, 2022, 2:40pm

Yeah, I don’t think the buffers used by compression will be the limiting factor in those cases, but rather the number of files in the filesystem, the golang footprint and so on.
