A friend and I are trying out a bunch of backup tools (namely Duplicacy, Restic and Kopia) to find out which one meets our needs best (for personal use).
One test simulates the catastrophic scenario of bit rot / data rot / data degradation: we corrupt the repository by flipping a single bit in the data section of one of the blob files. On the positive side, Kopia performed really well, is really easy to use, and is able to recover all other files that are not affected by the corruption.
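For anyone who wants to reproduce the test, the corruption step can be sketched roughly as below. This is a minimal Python sketch and not part of Kopia; `flip_bit`, the byte offset and the throwaway demo file are all hypothetical and only stand in for "pick one blob file and flip one bit somewhere in its data section".

```python
import os
import tempfile


def flip_bit(path, byte_offset, bit_index=0):
    """Flip one bit of the file at `path` in place and return (old_byte, new_byte)."""
    if not 0 <= bit_index <= 7:
        raise ValueError("bit_index must be between 0 and 7")
    with open(path, "r+b") as f:
        f.seek(byte_offset)
        original = f.read(1)
        if len(original) != 1:
            raise ValueError("byte_offset is past the end of the file")
        flipped = original[0] ^ (1 << bit_index)  # XOR toggles exactly one bit
        f.seek(byte_offset)
        f.write(bytes([flipped]))
    return original[0], flipped


if __name__ == "__main__":
    # Demo on a throwaway file (hypothetical stand-in for a blob file).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"example blob payload")
    old, new = flip_bit(tmp.name, byte_offset=5, bit_index=3)
    print(f"byte 5 changed from {old:#04x} to {new:#04x}")
    os.remove(tmp.name)
```

Work on a copy of the repository when trying this, since the change is made in place.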
Unfortunately, we are not able to “repair” the repo into a state where at least new snapshots no longer rely on the corrupted blob and instead get a new blob created from the still-existing source data.
Is it possible to tell Kopia to detect and rewrite / replace corrupted or deleted blobs when the source files are still available?
Besides repository repair, we also tried deleting the blob and the associated content object using the blob delete and content remove commands. We then cleared the cache and created another snapshot, hoping that it would create a new blob for the picture that was corrupted in the backup. Unfortunately, this newly created backup still refers to the corrupted/deleted blob and fails to restore the image as well.
Some features are still missing here, so the UX is not optimal yet, but today you can remove the damaged content from the index (using kopia content rm <x>). This temporarily corrupts the repository a bit more, but you should then be able to create a snapshot of the file with --force-hash=100 to re-upload the original data, which will fix it. This works because Kopia uses content-addressable storage: the same data results in the same content IDs (within the same repository).
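To illustrate why re-uploading the same data repairs the dangling reference, here is a toy content-addressable store in Python. It is only a sketch of the idea, not Kopia's actual implementation: `TinyContentStore`, the keyed HMAC-SHA256 ID scheme and the `repo_key` parameter are assumptions chosen to mimic "same data, same ID, within the same repository".

```python
import hashlib
import hmac


class TinyContentStore:
    """Toy content-addressable store: the ID is derived from the data itself."""

    def __init__(self, repo_key):
        self._repo_key = repo_key  # assumed per-repository secret
        self._contents = {}

    def content_id(self, data):
        # Same key + same bytes always yields the same ID.
        return hmac.new(self._repo_key, data, hashlib.sha256).hexdigest()

    def put(self, data):
        cid = self.content_id(data)
        self._contents.setdefault(cid, data)  # already-known data is not stored twice
        return cid

    def remove(self, cid):
        self._contents.pop(cid, None)

    def get(self, cid):
        return self._contents[cid]  # raises KeyError if the content is missing


if __name__ == "__main__":
    store = TinyContentStore(b"repo-secret")
    photo = b"bytes of the original picture"
    snapshot = {"picture.jpg": store.put(photo)}

    # Drop the damaged content: the snapshot now points at nothing.
    store.remove(snapshot["picture.jpg"])

    # Re-adding the unchanged source data recreates the very same ID,
    # so the old snapshot entry resolves again.
    assert store.put(photo) == snapshot["picture.jpg"]
    assert store.get(snapshot["picture.jpg"]) == photo
```

As far as I understand it, this is also why --force-hash=100 matters: it makes Kopia re-hash every file rather than reuse what the previous snapshot recorded for files that look unchanged, so the missing content actually gets written again.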
There’s probably a more streamlined experience that could be built here.
Longer term we’ll probably have some redundancy codes to recover from bitrot automatically.
In the short term, the recommendation is to use managed cloud storage or a filesystem that handles bitrot, such as ZFS/BTRFS, and to perform regular scrubs.
Now there’s another issue: if the original file has been deleted, it is not possible to repair the repo using the steps above. The only way we could think of to get rid of the error messages was to delete all snapshots that refer to the corrupted data.
As snapshot verify only reveals the date of the latest affected snapshot, one would have to delete and verify over and over again until all broken snapshots are removed. For a large repo this would be extremely time consuming.
Is it possible to list all snapshots that use a specific content object?
Or is it possible to somehow “hide” the error for future invocations of snapshot verify?
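To make the first question concrete: what we have in mind is a reverse lookup from a content ID to the snapshots that reference it. The Python sketch below is purely conceptual. The snapshot names and content IDs are made up, and `snapshots_using` is a hypothetical helper; how to obtain the real snapshot-to-content mapping from Kopia is exactly what we are asking, so it is not shown.

```python
def snapshots_using(content_id, snapshot_contents):
    """Return the sorted names of all snapshots whose content set includes `content_id`."""
    return sorted(
        name for name, ids in snapshot_contents.items() if content_id in ids
    )


if __name__ == "__main__":
    # Hypothetical mapping: snapshot name -> set of content IDs it references.
    snapshot_contents = {
        "snap-1": {"a1", "b2"},
        "snap-2": {"b2", "c3"},
        "snap-3": {"c3"},
    }
    # Every snapshot that would need to be deleted if content "b2" is lost.
    print(snapshots_using("b2", snapshot_contents))  # ['snap-1', 'snap-2']
```

With such a list, all affected snapshots could be removed in one pass instead of repeating the delete-and-verify cycle.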