How does kopia confirm that a block already exists?

lhl7200121 · November 26, 2022, 12:01pm

Hi, as described in the documentation, kopia uses deduplication. What I am most interested in is how kopia decides, after a file is split into several blocks, whether one of those blocks (call it a1) is equal to a block produced from other files. In other words, how does kopia recognize that a new file's blocks are duplicates? Does kopia compare block a1 with all existing blocks, or does it use a data structure such as a Bloom filter to speed up duplicate detection?
I would be very grateful if you could answer this for me.

jkowalski · November 27, 2022, 3:01am

Kopia keeps an index of all content IDs that are in the repository. A content ID is basically a cryptographic hash of the content's data.

Bloom filters are not currently used. They may be used in the future, but mostly to speed up index lookups.
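
To illustrate the idea, here is a minimal Python sketch of deduplication through an index of content IDs. It is not Kopia's actual code: the `ContentStore` class, its `put` method, and the use of plain SHA-256 are assumptions chosen for illustration only. The point it shows is that checking for a duplicate is a single index lookup by hash, not a comparison against every stored block.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical, not Kopia's implementation)."""

    def __init__(self):
        # Index: content ID (hex digest) -> stored block bytes.
        self.index = {}

    def put(self, block):
        """Store a block unless its content ID is already in the index.

        Returns (content_id, is_new).
        """
        # Assumption: plain SHA-256 stands in for the repository's hash function.
        content_id = hashlib.sha256(block).hexdigest()
        is_new = content_id not in self.index  # one lookup, no block-by-block compare
        if is_new:
            self.index[content_id] = block
        return content_id, is_new


store = ContentStore()
id_a, new_a = store.put(b"block a1")    # first time this data is seen: stored
id_b, new_b = store.put(b"block a1")    # same bytes again: found in the index
id_c, new_c = store.put(b"other data")  # different bytes: new content ID
```

In this sketch the second `put` of identical bytes yields the same content ID and stores nothing new, so the index ends up with two entries for three writes.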

lhl7200121 · November 27, 2022, 4:21am

Thank you for your patient reply. I have learned a lot about kopia, especially in terms of chunking.
I also wonder whether identical parts of two files in kopia share the same block in the repository?
