Bitrot and/or data corruption protection

Hi, I’m new to Kopia. I see it’s full of features and I’m seriously evaluating it as a replacement for Duplicacy, which I currently use.

Since Kopia is so feature-rich, does it also have a feature that protects backups from bitrot and/or data corruption? For example, suppose you run a restore and a file in the repository has been corrupted:

  1. The file will not be restored
  2. The file will be restored without any warning
  3. The file will be restored with a warning
  4. The file will be corrected and restored

Which one?

Moreover, if bitrot and/or data corruption is present in a repository and I synchronize it to another repository, will the data error be propagated to the target repository? Will I get any warning?

If you are using a cloud repository, I guess you have to trust your storage provider to take all the necessary measures against data corruption. But if you use your own local or SFTP repository, this possible issue should be taken into account. So, depending on the risk, I will decide whether to put the repository on a “heavy” but secure file system such as ZFS.

The answer is currently 1. Kopia verifies checksums on all reads, and that verification will fail in case of bitrot.
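
To illustrate why a checksum on every read leads to option 1, here is a minimal sketch of a content-addressed store that refuses to return damaged bytes. It is not Kopia's actual code: `VerifyingStore`, `CorruptBlobError`, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


class CorruptBlobError(Exception):
    """Raised when stored bytes no longer match the hash they were saved under."""


class VerifyingStore:
    """Toy content-addressed store (hypothetical): blobs are keyed by their SHA-256."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        blob_id = hashlib.sha256(data).hexdigest()
        self._blobs[blob_id] = data
        return blob_id

    def get(self, blob_id: str) -> bytes:
        data = self._blobs[blob_id]
        # Recompute the hash on every read; a mismatch means the bytes changed at rest.
        if hashlib.sha256(data).hexdigest() != blob_id:
            raise CorruptBlobError(f"checksum mismatch for blob {blob_id[:12]}")
        return data


store = VerifyingStore()
blob_id = store.put(b"important backup data")
assert store.get(blob_id) == b"important backup data"

# Simulate bitrot by flipping one bit of the stored bytes.
rotted = bytearray(store._blobs[blob_id])
rotted[0] ^= 0x01
store._blobs[blob_id] = bytes(rotted)

try:
    store.get(blob_id)
except CorruptBlobError as err:
    print("restore refused:", err)
```

A checksum like this can only detect the damage. It carries no redundancy, so it cannot repair the file, which is why correction (option 4) needs something extra.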

The current expectation is to have bitrot protection at the blob storage level (either cloud or ZFS/btrfs or similar) so Kopia can rely on data not becoming spontaneously mutated.

I personally use ZFS everywhere except on volatile data (DVR) that I don’t care about losing.

BTW, I think it would be totally possible to implement something like https://en.wikipedia.org/wiki/Reed–Solomon_error_correction at the File provider level (possibly SFTP and WebDAV too), so that even with Kopia on a non-fancy local filesystem you would get protection from bitrot.
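
To give a feel for how redundancy at the storage level could repair bitrot, here is a rough sketch. Note that it deliberately uses a single XOR parity block (RAID-5 style) instead of Reed-Solomon: it can rebuild only one damaged block per group, whereas Reed-Solomon generalizes the same idea to several. The function names are made up for the example, and it assumes the per-block digests are stored somewhere safe.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def protect(blocks):
    """Return (per-block SHA-256 digests, one parity block) for equal-length blocks."""
    digests = [hashlib.sha256(b).digest() for b in blocks]
    return digests, xor_blocks(blocks)


def repair(blocks, digests, parity):
    """Rebuild at most one damaged block; raise if the damage is beyond repair."""
    bad = [i for i, b in enumerate(blocks) if hashlib.sha256(b).digest() != digests[i]]
    if not bad:
        return list(blocks)
    if len(bad) > 1:
        raise ValueError("single XOR parity cannot repair more than one block")
    # XOR of all surviving blocks and the parity yields the missing block.
    survivors = [b for i, b in enumerate(blocks) if i != bad[0]]
    rebuilt = xor_blocks(survivors + [parity])
    if hashlib.sha256(rebuilt).digest() != digests[bad[0]]:
        raise ValueError("parity block is damaged too")
    fixed = list(blocks)
    fixed[bad[0]] = rebuilt
    return fixed


original = [b"AAAA", b"BBBB", b"CCCC"]
digests, parity = protect(original)

damaged = [b"AAAA", b"BxBB", b"CCCC"]  # one byte rotted in the second block
assert repair(damaged, digests, parity) == original
```

The cost is one extra block per group plus the digests, so the storage overhead depends on how many blocks share a parity block.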

I think that would be an awesome option to have. For instance, in my case, I plan to back up everything primarily to an external drive, and I do not plan to use fancy stuff like ZFS or RAID. It would be fair to say that a reasonable chunk of people fall into this category. So having a Reed-Solomon (RS) error correction option would give reasonable protection against low levels of data corruption.

It may be a good idea to also support option 3 (the file will be restored with a warning). This could be controlled with a CLI flag on the restore command.
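
A rough sketch of what such a switch could look like, just to make the idea concrete. This is purely hypothetical: `restore` and `allow_corrupt` are invented names for the example and do not correspond to any existing Kopia flag or API.

```python
import hashlib
import warnings


def restore(entries, allow_corrupt=False):
    """Restore (name, data, expected_sha256_hex) entries into a dict.

    Hypothetical policy switch: by default a checksum mismatch aborts the
    restore (option 1); with allow_corrupt=True the data is restored anyway
    and a warning is emitted (option 3).
    """
    restored = {}
    for name, data, expected in entries:
        if hashlib.sha256(data).hexdigest() != expected:
            if not allow_corrupt:
                raise OSError(f"{name}: checksum mismatch, refusing to restore")
            warnings.warn(f"{name}: checksum mismatch, restored contents may be damaged")
        restored[name] = data
    return restored


good = ("notes.txt", b"hello", hashlib.sha256(b"hello").hexdigest())
rotten = ("photo.raw", b"hellp", hashlib.sha256(b"hello").hexdigest())

# Lenient mode keeps going and warns about the damaged file.
files = restore([good, rotten], allow_corrupt=True)
assert files["photo.raw"] == b"hellp"
```

Keeping the strict behavior as the default would mean nobody gets damaged files silently. You would have to opt in to the lenient mode.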