Deduplication and compression

koala · November 2, 2021, 10:14pm

Hello everybody,

I was a little confused that the compression setting is not specified for the whole repository but can instead be set per path.

I’m now wondering whether deduplication still works when some paths are backed up with compression and others without, and both contain the same files.

Kind regards!

stpr · November 3, 2021, 12:09am

Compression can (in a way) be enabled for the whole repository by setting it in the global policy, like so:

kopia policy set --compression=zstd --global

By default, all snapshots inherit from the global policy, so compression will be enabled for all backup items. You can of course override it for specific backup items.

As for deduplication and compression: a recent change to the repository format, introduced as part of v0.9, means that compression happens after hashing. In theory, deduplication should therefore not be affected by compression at all. It shouldn’t matter whether compression was enabled or which algorithm was used.
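
To illustrate why hashing before compressing keeps deduplication independent of the compression setting, here is a minimal Python sketch. It is not Kopia's actual code: `ToyStore`, `put`, and `get` are hypothetical names, `zlib` stands in for zstd, and a plain SHA-256 stands in for whatever hashing the repository really uses. It also ignores chunk splitting and encryption.

```python
import hashlib
import zlib


class ToyStore:
    """Toy content-addressed store (hypothetical, not Kopia's format).

    The content ID is the hash of the UNCOMPRESSED bytes.
    """

    def __init__(self):
        # content ID -> (was_compressed, stored payload)
        self.blobs = {}

    def put(self, data: bytes, compress: bool) -> str:
        # Hash the raw bytes first, so the ID does not depend on compression.
        content_id = hashlib.sha256(data).hexdigest()
        if content_id not in self.blobs:
            # Only new content is stored; compression is applied after hashing.
            payload = zlib.compress(data) if compress else data
            self.blobs[content_id] = (compress, payload)
        return content_id

    def get(self, content_id: str) -> bytes:
        compressed, payload = self.blobs[content_id]
        return zlib.decompress(payload) if compressed else payload


store = ToyStore()
data = b"same file contents " * 1000

# The same bytes backed up once with compression and once without.
id_a = store.put(data, compress=True)
id_b = store.put(data, compress=False)
```

In this sketch both calls return the same ID and only one blob is stored, because the ID is computed before compression. If the hash were taken over the compressed bytes instead, the two calls would produce different IDs and the data would be stored twice.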

koala · November 3, 2021, 8:29pm

Thanks for your reply @stpr!

The linked pull request seems to exactly address my questions!

I wonder, though, what is meant by:

“Also since compression will be done after hashing, it has to be done server-side, thus the bandwidth usage and CPU utilization between kopia client to kopia server may change.”

I was planning to use an S3 repository and (to my knowledge) no kopia server.

stpr · November 3, 2021, 11:23pm

Kopia has a ‘repository server’ feature targeted at multi-user scenarios, where multiple users share a single repository and the server provides some security features to prevent users from seeing each other’s contents. In such a setup, users do not access the underlying files directly but go through the server program, which takes care of authentication. This is typically not relevant for single-user backups, like perhaps yours, where you have direct access to the underlying storage.
