Protection from bitrot

I have experimented with this idea with other backup solutions. I think it would work well with Kopia.

The idea is to use PAR2 recovery files to add a layer of recoverability to all repo files. If a repo file is slightly damaged, PAR2 could repair it. Even 1%-10% of PAR2 recovery data adds quite a bit of protection.

If PAR2 capability is not directly added to Kopia, then it would be helpful to be able to execute a command/script after Kopia creates each repo file in the cache. The repo file would not be removed from cache until the command/script has completed.

One could decide to store these PAR2 files locally or in some location on the repo side.

While I could accomplish this now with PAR2 and scripts, the script would likely need to download each repo file, which is not efficient.

You can see the details of PAR2 here:

My current script solution has been this. Using PowerShell, I create PAR2 recovery files at 5% and I 7zip them into a single file whose name matches the Kopia repo file.

The PowerShell script will create/re-create the PAR2/7zip file for any repo file added or changed by Kopia.
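For anyone who wants to try the PAR2 step, here is a minimal sketch in Python (not the PowerShell script described above, which is not shown in this thread). It assumes the par2cmdline `par2` binary is on the PATH and uses its `-r` option to set the redundancy percentage. The function names and the `<repo file>.par2` naming are made up for illustration.

```python
import subprocess
from pathlib import Path


def par2_create_command(repo_file: Path, redundancy_percent: int = 5) -> list[str]:
    """Build a par2cmdline 'create' invocation for a single repo file.

    Hypothetical helper: the recovery set is named after the repo file
    so the two can be matched up later.
    """
    if not 1 <= redundancy_percent <= 100:
        raise ValueError("redundancy_percent must be between 1 and 100")
    return [
        "par2",
        "create",
        f"-r{redundancy_percent}",  # redundancy as a percentage of the source size
        str(repo_file) + ".par2",   # base name of the recovery set
        str(repo_file),             # the file to protect
    ]


def create_recovery_files(repo_file: Path, redundancy_percent: int = 5) -> None:
    """Run par2 and raise CalledProcessError if it exits with an error."""
    subprocess.run(par2_create_command(repo_file, redundancy_percent), check=True)
```

Calling `create_recovery_files(Path("some-repo-file"))` would then write the recovery set next to the repo file. Bundling the result with 7-Zip is left out here.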

This gives me confidence that a 5TB Kopia backup can be repaired in the event of some corruption.

3-2-1 backups are great, but on top of 3-2-1 my only alternative would be to have 2 cloud providers each holding the 5TB backup. I prefer having 1 cloud provider and some insurance.

I also scripted a check that a 7zip file exists for each Kopia repo file and that the 7zip file is newer than the Kopia repo file.

I’d like to have an option for Kopia to log the repo files it creates/updates to a separate file. Then I can more easily know which repo files need a PAR2 file created.
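A check like that could look roughly like the following Python sketch. It is only an illustration, not the actual script: the directory arguments, the `.7z` suffix and the function name are assumptions, and it walks the repo directory recursively in case the repo files sit in subdirectories.

```python
from pathlib import Path


def files_needing_recovery_data(
    repo_dir: Path, recovery_dir: Path, suffix: str = ".7z"
) -> list[Path]:
    """Return repo files whose recovery archive is missing or not newer.

    Assumed layout: the archive for <repo_dir>/<rel path> lives at
    <recovery_dir>/<rel path> + suffix.
    """
    stale = []
    for repo_file in sorted(p for p in repo_dir.rglob("*") if p.is_file()):
        rel = repo_file.relative_to(repo_dir)
        archive = recovery_dir / rel.parent / (rel.name + suffix)
        # An archive with the same or an older timestamp is treated as stale.
        if not archive.exists() or archive.stat().st_mtime <= repo_file.stat().st_mtime:
            stale.append(repo_file)
    return stale
```

Anything returned by this function is a candidate for having its PAR2/7zip file created or re-created.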

I think that covers me for now.

Having a PAR2 “Reed-Solomon” solution embedded into Kopia would be great though.

Kopia has been incredible. Awesome tool and really smart software development. I can keep my Kopia backup local, then PAR2 it all, and then rclone/robocopy it all to the cloud.

p.s. I moved to Kopia from CloudBerry Ultimate Backup.

I create PAR2 recovery files of 5% and I 7zip the files into a file

Don’t you just void the whole point of PAR2 by creating a single point of failure? If the 7z archive gets corrupted, then your PAR2 files won’t be available. For PAR2 to recover the content of a broken file, it must have access to the *.par2 files.

It’s a secondary level of protection. If any Kopia repo files are corrupted, then there is data loss.

Of course, in the best case I can do a full restore using an uncorrupted Kopia repo. My intent is to improve recoverability and repo repair. Except for ZFS, I don’t think any file system can really be trusted, including all cloud providers.

@RunDover can you share how you do this?

Just so you know, the latest version supports ECC, but only on new repositories. It’s still experimental, though.

That is excellent to hear!

I haven’t looked at the implementation, but it is using Reed-Solomon error correction. This is the same as my PAR2 solution.
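For anyone unfamiliar with how parity data can rebuild lost content, here is a toy Python illustration. To be clear, this is plain XOR parity, a much simpler relative of Reed-Solomon, and it is not how Kopia or PAR2 actually work. It can only rebuild one missing block whose position is known, whereas Reed-Solomon handles multiple damaged blocks. All names are made up for the example.

```python
def xor_parity(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    parity = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing block from the survivors and the parity."""
    # XOR-ing the parity with every surviving block cancels them out,
    # leaving exactly the block that was lost.
    return xor_parity(surviving + [parity])


data = [b"kopi", b"a-re", b"po!!"]
parity = xor_parity(data)

# Pretend the middle block was lost and rebuild it from the rest.
recovered = rebuild_missing([data[0], data[2]], parity)
assert recovered == data[1]
```

The cost here is one extra block for three data blocks. Reed-Solomon generalizes the idea so that the amount of recovery data (for example 5%) determines how much damage can be repaired.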

Built-in ECC will definitely be the way to go.

As for my solution, my script is not generic enough to share and would require a lot of changes for anyone else to use.
