Best method to ensure valid snapshots: snapshot verify vs snapshot fix invalid-files

My understanding is that kopia snapshot verify --verify-files-percent 100 will download all files in a snapshot and ensure the backup of each file is valid. However, if the backup of a file is not valid, the verify command does nothing except notify you of the error. In contrast, kopia snapshot fix invalid-files --verify-files-percent 100 will remove backups of invalid source files from snapshots.

This leads me to an important question, which I hope someone can help with. Does fix invalid-files also verify if the backup of a source file is valid? In other words, is fix invalid-files doing a source to target check and removing all targets with an invalid source? Or is it verifying that all source files are properly backed up in a snapshot on the target?

My goal is to ensure that my source files have valid backups in target snapshots. Any tips on how to do that are greatly appreciated. Right now, I have kopia snapshot verify --verify-files-percent 100 running monthly, but I am still not sure what I would need to do if the verify command found errors. Should I be running kopia snapshot fix invalid-files --verify-files-percent 100 instead of kopia snapshot verify --verify-files-percent 100?


There are many verification methods, depending on what you need.

In the order of lowest- to highest-level:

  1. kopia content verify - will ensure that content manager index structures are correct and that every index entry is backed by an existing file

  2. kopia content verify --download-percent=10 - same as above, but will download 10% of random contents and ensure they can be decrypted properly

  3. kopia snapshot verify - will ensure that directory structures in the repository are consistent by walking all files and directories in snapshots from their roots and performing equivalent of kopia content verify on all contents required to restore each file, but does not download the file

  4. kopia snapshot verify --verify-files-percent 10 - same as #3 but will also download a random 10% of all files; this ensures that decryption and decompression work correctly.

  5. kopia snapshot fix invalid-files [--verify-files-percent] - performs exactly the same verification as kopia snapshot verify (3&4) for all practical purposes, but it will also write fixed directory entries and manifests.

As of today, it is not recommended to run snapshot fix automatically; run it only when 3 or 4 detects a problem.
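As a rough illustration of that advice, here is a minimal sketch of a wrapper that runs the checks from cheapest to most thorough and stops at the first failure, leaving any fix as a manual decision. It assumes kopia exits with a non-zero status when verification finds a problem; `verify_repository` and `KOPIA_BIN` are hypothetical names made up for this example, not part of kopia.

```shell
#!/bin/sh
# Sketch only. Assumption: kopia returns a non-zero exit status when
# verification fails. KOPIA_BIN is a hypothetical override for the binary path.
KOPIA_BIN="${KOPIA_BIN:-kopia}"

# verify_repository PERCENT
# Returns 0 if both checks pass, 1 if the content-level check fails,
# 2 if the snapshot-level check fails.
verify_repository() {
    percent="$1"

    # Level 1: index structures and the existence of backing blobs.
    "$KOPIA_BIN" content verify || return 1

    # Levels 3/4: walk every snapshot and download PERCENT% of the files.
    "$KOPIA_BIN" snapshot verify --verify-files-percent "$percent" || return 2

    return 0
}

# Example call (commented out so that sourcing this file has no side effects):
# verify_repository 1 || echo "verification failed, investigate before running snapshot fix" >&2
```

The wrapper deliberately never calls snapshot fix invalid-files itself; it only reports which level failed so that a person can decide what to do next.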

So where does the difference between 1/2 & 3/4 practically matter, you might ask? Imagine a world where some index blobs are deleted and the corresponding pack blobs are deleted too. In this case 1/2 will succeed (because index structures are still internally consistent), but if the blobs were needed for some snapshot, 3/4 will fail.

It might be worth adding that full maintenance (which happens automatically in both CLI and UI) does #3 every time as part of mark&sweep garbage collection, so technically running verification is optional today, as it will be happening regularly anyway. Paying attention to maintenance status is not optional in such cases, though, and it is quite hard to observe; this will be improved through a notification mechanism in future versions.

Again, all this depends on the type and value of data, tolerance for risk and cost of data storage, but I would generally recommend running something like this daily:

$ kopia snapshot verify --verify-files-percent 1

This will perform a full repository scrub every 100 days on average. With tons of data, perhaps decrease that to 0.3 to get a full scrub roughly every 300 days. Some folks may even want to perform test restores; again, it all depends on the use case.


Thanks, this is very helpful.

But why is snapshot fix not recommended to run automatically? If fix essentially does the same as verify except it also fixes errors that are found, what is the downside to running fix regularly? Is the fix process not foolproof?


Thank you very much for the explanation!

Does this mean that absolutely every file will be checked over 100 days?
I ask because, according to the help output, --verify-files-percent checks files randomly:

    --verify-files-percent=0   Randomly verify a percentage of files by
                               downloading them [0.0 .. 100.0]

No, there is no guarantee that all files will be checked unless you do --verify-files-percent=100. Doing --verify-files-percent=1 every day for 100 days will not check 100% of the files, since the command grabs files at random and it is not doing any sort of record keeping in terms of which files were grabbed yesterday or last time. Over time, you will likely be checking a large percent of your files due to the randomness, but there is no guarantee.

Yes, that was my understanding too, which is why I was confused by the phrase "full repository scrub every 100 days on average".

According to scie… my googling of science, it will sort-of cover 64% of the data if done for 100 days.
Or inversely, the chance of any one percent of your data not getting tested over 100 days is around 36%

Still, over a year or so, most of the data should get tested, while not spending too much time every day doing the validations.


And the math for "over a year" says the chance of any one percent of your data not getting tested would be only about 2.5%, which seems like pretty good odds of covering it all without excessive tests.
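For anyone who wants to reproduce these figures, here is a small sketch of the arithmetic. It assumes each run picks files independently at random, with no memory of earlier runs, so a given file is missed by one run with probability 1 - p/100 and by all runs with that value raised to the number of runs. `coverage_percent` is a hypothetical helper written for this post, not a kopia command.

```shell
#!/bin/sh
# coverage_percent PERCENT RUNS
# Prints the expected share of files (in %) checked at least once after RUNS
# independent runs that each sample PERCENT% of files at random.
coverage_percent() {
    awk -v p="$1" -v runs="$2" \
        'BEGIN { printf "%.1f\n", (1 - (1 - p / 100) ^ runs) * 100 }'
}

coverage_percent 1 100   # prints 63.4
coverage_percent 1 365   # prints 97.4
```

Under that assumption, a daily 1% run leaves roughly 37% of files unchecked after 100 days and roughly 2.5% after a year, which matches the estimates above.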