Hi, as described in the documentation, kopia uses deduplication. What I would like to understand is how, after a file is split into several blocks, kopia decides whether one of those blocks (call it a1) is identical to a block produced from other files. In other words, how does kopia recognize that a new file's blocks are duplicates? Does kopia compare block a1 with all existing blocks, or does it use a data structure such as a Bloom filter to speed up duplicate detection?
I would be very grateful if you could answer this for me.
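Kopia keeps an index of all content IDs that are in the repository. A content ID is basically a cryptographic hash of the content's data, so checking whether a block is a duplicate is an index lookup on its ID rather than a comparison against every stored block.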
Bloom filters are not currently used. They may be added in the future, but mostly to speed up index lookups.
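To make the idea concrete, here is a minimal sketch of hash-indexed deduplication in Python. It is not kopia's actual code: the `ContentStore` class, the `split_fixed` helper, the use of plain SHA-256, and the fixed 4-byte chunk size are all assumptions chosen for illustration (kopia's own splitter and hash are configurable and work differently).

```python
import hashlib


class ContentStore:
    """Toy content-addressable store (hypothetical, for illustration only)."""

    def __init__(self):
        self.index = {}   # content ID -> stored bytes
        self.writes = 0   # how many chunks were actually stored

    def put(self, chunk: bytes) -> str:
        # The content ID is derived from the data itself.
        content_id = hashlib.sha256(chunk).hexdigest()
        # Duplicate check is a single index lookup, not a scan of all chunks.
        if content_id not in self.index:
            self.index[content_id] = chunk
            self.writes += 1
        return content_id


def split_fixed(data: bytes, size: int = 4):
    # Simplified fixed-size splitter, assumed here to keep the example short.
    return [data[i:i + size] for i in range(0, len(data), size)]


def snapshot_file(store: ContentStore, data: bytes, size: int = 4):
    # A "file" is recorded as the ordered list of its chunks' content IDs.
    return [store.put(chunk) for chunk in split_fixed(data, size)]


store = ContentStore()
file_a = snapshot_file(store, b"AAAABBBBCCCC")
file_b = snapshot_file(store, b"AAAAXXXXCCCC")
```

In this sketch the first and last chunks of the two files hash to the same IDs, so only four chunks are stored for six chunk references.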
Thank you for your patient reply. I have learned a lot about kopia, especially about chunking.
I also wonder whether identical parts of two files in kopia share the same block in the repository.