Help with corrupt repo

Hi there,

I’m wondering if I can get some help recovering a seemingly corrupt repo, and possibly some advice on whether the way I’m running maintenance led to the corruption.

My backup script has this sequence of commands:

kopia repository connect filesystem ...
kopia snapshot create /path1
kopia snapshot create /path2
...
kopia snapshot expire --delete
kopia maintenance run --full --force
kopia repository sync-to ...
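The thread does not show whether this script stops when a step fails. As a sketch of one way to make sure a failed step never reaches the sync, the function below (`backup_paths` is a hypothetical name, not part of kopia) runs the snapshot, expire and maintenance commands from the sequence above and returns at the first failure. The connect and sync-to steps are left out because their arguments are elided here.

```shell
#!/bin/sh
# Hypothetical helper: snapshot each given path, then expire and run
# maintenance, returning non-zero as soon as any kopia command fails.
# Assumes the repository is already connected.
backup_paths() {
    for src in "$@"; do
        kopia snapshot create "$src" || {
            echo "snapshot create failed for $src" >&2
            return 1
        }
    done
    # Only reached if every snapshot succeeded.
    kopia snapshot expire --delete || return 1
    kopia maintenance run --full --force || return 1
}
```

A caller would chain the sync behind it with `&&`, for example `backup_paths /path1 /path2 && kopia repository sync-to ...`, so that the sync only runs after a clean pass.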

On a recent run, one of the snapshot create steps failed with this error:

ERROR unable to add xn25_04abe9d83a08a65c09bda238af191587-s91f147a798c55c73110-c1 to index blob cache: context canceled
ERROR failed to open repository: unable to create shared content manager: error loading indexes: error downloading indexes: error loading index blob xn25_30af0bb486ea90784518d7f408be9a12-s37430c3792a30207110-c1: error decrypting BLOB xn25_30af0bb486ea90784518d7f408be9a12-s37430c3792a30207110-c1: unable to decrypt content: cipher: message authentication failed
ERROR open repository: unable to open repository: unable to create shared content manager: error loading indexes: error downloading indexes: error loading index blob xn25_30af0bb486ea90784518d7f408be9a12-s37430c3792a30207110-c1: error decrypting BLOB xn25_30af0bb486ea90784518d7f408be9a12-s37430c3792a30207110-c1: unable to decrypt content: cipher: message authentication failed

Subsequently, any attempt to connect to the repo now fails with:

ERROR unable to add xs22_0d8f8797b3f05d4f46b95e9f0f5befe6-s533719395a86f109-c1 to index blob cache: context canceled
ERROR failed to open repository: unable to create shared content manager: error loading indexes: error downloading indexes: error loading index blob xn25_04abe9d83a08a65c09bda238af191587-s91f147a798c55c73110-c1: error decrypting BLOB xn25_04abe9d83a08a65c09bda238af191587-s91f147a798c55c73110-c1: unable to decrypt content: cipher: message authentication failed

From my command history, my backup script ran without issue last time, completing the snapshots, full maintenance and sync.

I’ve tried resyncing twice from my online backup. The first time, I simply reran the above script and ran into the same sequence of errors (i.e. able to connect, then failing on the same snapshot create step), leading me to believe the corruption happened sometime between the last successful snapshot and the sync.

After the second resync, I tried the following:

kopia repository connect filesystem
kopia maintenance run --full --force
kopia snapshot verify

Those commands completed without issue, but when I tried my usual backup script again, it too failed at the same snapshot create step.

Is there any advice on what I should try next to try to recover my repo?

Try removing that one blob (move it to another directory instead of deleting it, just in case).

Then run kopia index recover --parallel=10 --commit and see if that recovers your repository.
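As a minimal sketch of the "move it aside, don't delete it" step: the helper below (`quarantine_blob` is a hypothetical name, not part of kopia) moves one file into a holding directory and refuses to overwrite anything already there. It takes the on-disk path of the blob file as given; how a blob ID maps to a file path depends on the repository's directory layout, which this thread does not spell out.

```shell
#!/bin/sh
# Hypothetical helper: move one suspect blob file into a holding directory
# so it can be restored later if removing it does not help.
# Usage: quarantine_blob BLOB_FILE QUARANTINE_DIR
quarantine_blob() {
    blob_file=$1
    quarantine_dir=$2
    if [ ! -f "$blob_file" ]; then
        echo "not a regular file: $blob_file" >&2
        return 1
    fi
    mkdir -p "$quarantine_dir" || return 1
    target=$quarantine_dir/$(basename "$blob_file")
    # Never overwrite an earlier quarantined copy with the same name.
    if [ -e "$target" ]; then
        echo "already present in quarantine: $target" >&2
        return 1
    fi
    mv "$blob_file" "$target"
}
```

After moving the file, run the index recover command suggested above. If that does not help, the file can be moved back from the holding directory unchanged.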

If index recover fails, you may still have a decent chance of recovering most of the data, as long as only a few contents are damaged. Try kopia snapshot fix invalid-files.

Did you do kopia snapshot verify or kopia snapshot verify --verify-files-percent 100? The former does not really do much in terms of verifying if your blobs are valid.
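One way to use the stricter check as a gate in a backup script, as a sketch only: the function below (`verify_then_sync` is a hypothetical name) runs the full-percentage verify and calls sync-to only if verify exits successfully. It passes its own arguments through to `kopia repository sync-to` unchanged, so the destination stays whatever the script already uses. It can only catch what that verify command actually checks.

```shell
#!/bin/sh
# Hypothetical helper: verify all file contents, then sync only on success.
# Usage: verify_then_sync <arguments for "kopia repository sync-to">
verify_then_sync() {
    if ! kopia snapshot verify --verify-files-percent 100; then
        echo "snapshot verify failed; skipping sync" >&2
        return 1
    fi
    # Arguments are forwarded as-is to the sync command.
    kopia repository sync-to "$@"
}
```

Reading every file makes this much slower than a plain snapshot verify, so it may suit an occasional run better than every backup.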

Also, in case of corruption, you may want to check the health of the drives on your filesystem.

I ended up removing all the files under the xn25 directory before I was able to connect and run an index recover. That seems to have done the trick, and I was able to complete a round of snapshot create commands.

Thank you! In general, if I wanted to verify the integrity of the local repo before doing a sync, would a snapshot verify 100% be the only way?

Edit: found the answer to my question in this related thread

Thanks for the heads up. I wasn’t running with the verify files 100% option in my original script. Out of curiosity though, I tried running it now with the 100% parameter against my online backup with the known bad blobs, and it completed processing all the objects without error.

To be honest, I don’t quite understand how my repo became corrupt. My understanding is that the x files are indices, and somehow a bunch of index files in my repo got corrupted between being generated in memory and being stored on disk. I can’t tell if the blocks associated with those deleted indices were also corrupted, but I guess a future maintenance session will clear out those unreferenced blocks.

The drive the repo is on is reporting good health but I will heed your caution and move the file system to a different set of disks.

@jkowalski Is this intended behavior? Shouldn’t kopia snapshot verify --verify-files-percent 100 flag the corrupt blobs?