I’ve checked several other posts related to “missing blobs” but couldn’t arrive at any conclusions, so I apologize if I missed something.
This is using kopia 0.13 with local sources and repository.
The sequence of events is:
Backup is to a zfs filesystem without redundancy.
zfs reported I/O errors in a few files, which were blobs, and advised deleting them. Since I have all the source files, I thought kopia would recreate the missing blobs at the next backup.
I deleted these blobs.
I ran kopia verify and it indeed shows errors about missing blobs.
I tried to do a new snapshot, but it ended saying that nothing new needed to be backed up.
kopia verify still shows the same missing blobs.
I double-checked that the files being reported to have missing blobs do exist in the source path, and haven’t changed (they are static PDFs).
At this point I’m stuck. Clearly kopia thinks the blobs exist in some internal table. I see a snapshot fix invalid-files option, but it seems it will remove the files that depend on those blobs rather than mark the blobs as missing. I’m concerned that using it, even if the next snapshot is proper, will make all previous snapshots lose those files, as if they had never been there.
Any suggestions appreciated.
EDIT: to add that, after testing fix invalid-files on a mock repository, it indeed populates files with missing blobs with .INVALIDfilename. So that’s not what I’m after.
EDIT: to add that I tried deleting ~/.cache/kopia, passing --force-hash=100 to kopia snapshot create, and:
A new snapshot is taken without reporting errors.
snapshot verify reports errors only in the original snapshot affected, not the newly created one.
Mounting the repository reports I/O errors in the files in all snapshots, not only the one reported by snapshot verify.
First of all… a blob in Kopia does not necessarily contain only one file - in fact blobs will almost always contain multiple smaller files. So, if a blob gets deleted or corrupted, the same blob cannot be re-created again, due to the fact that a blob is subject to re-writing by repo management. So you just can’t re-create a missing blob.
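To make the packing point concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not kopia's real on-disk format: the `ToyRepo` class, the SHA-256 content IDs and the random `p...` blob names are all made up for illustration, and real repositories also split, compress and encrypt data. It shows why a new snapshot of unchanged sources uploads nothing after a blob is deleted: the index still lists every content, so deduplication skips them.

```python
import hashlib
import os


def content_id(data: bytes) -> str:
    """Toy content ID: plain SHA-256 of the data (an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


class ToyRepo:
    """Simplified content-addressed store that packs several contents per blob."""

    def __init__(self):
        self.blobs = {}  # blob_id -> bytes (what actually lives on the backend)
        self.index = {}  # content_id -> (blob_id, offset, length)

    def write_pack(self, contents):
        """Pack unknown contents into one new blob; skip anything already indexed."""
        new = {}
        for data in contents:
            cid = content_id(data)
            if cid not in self.index and cid not in new:
                new[cid] = data
        if not new:
            return None  # everything is deduplicated, nothing to upload
        blob_id = "p" + os.urandom(8).hex()  # blob name does not depend on contents
        buf = bytearray()
        for cid, data in new.items():
            self.index[cid] = (blob_id, len(buf), len(data))
            buf += data
        self.blobs[blob_id] = bytes(buf)
        return blob_id

    def read(self, cid):
        """Return the content bytes; raises KeyError if the pack blob is gone."""
        blob_id, offset, length = self.index[cid]
        return self.blobs[blob_id][offset:offset + length]


repo = ToyRepo()
files = [b"report.pdf bytes", b"invoice.pdf bytes", b"manual.pdf bytes"]
first = repo.write_pack(files)   # first snapshot: one blob holding three files
del repo.blobs[first]            # simulate deleting the damaged blob
second = repo.write_pack(files)  # "new snapshot" of the unchanged sources
```

In this model `second` is `None` because the index still claims all three contents exist, while `repo.read(...)` fails for each of them. That matches the symptoms described in this thread, with the caveat that the model is a simplification.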
Hi @budy, thanks for the answer. I see, I suspected something like that by the amount of small files affected by just a handful of blobs.
After my tests I too think that kopia can’t currently fix those blobs with the available CLI. I was hoping it worked somehow like restic, where rebuild-index + snapshot does indeed fix missing blobs. I guess there are always tradeoffs.
I do wonder if it is possible to implement this kind of recovery in kopia though, since deduplication must rely on the same blobs being obtained for the same input files, even if new files appear in the mix. So I guess some kind of e.g. snapshot fix replay should be doable to fix any missing/damaged blobs, as long as the original sources are available, of course.
Since losing a blob to bitrot or backend malfunction doesn’t seem that far-fetched (yes, I know our backends should be trustworthy), it would be a nice feature to have for extra peace of mind, especially since it can save a lot of snapshot history. Maybe it’s simple enough to implement that the effort/benefit ratio allows it.
Have you tried to only mount the latest snapshot?
I can mount all of them, no problem, and all of them have the same broken files, as expected.
In the end I’ve gone the route of fix invalid-files + new complete snapshot. I know how to reproduce the issue easily though and have compiled kopia from sources in the past, so I’d be willing to help implement/test this feature.
I don’t actually think that you can fix these errors for past snapshots, since you probably won’t be able to recreate the exact blobs anyway. I’d suggest thinking of missing blobs as unreadable parts of a tape backup which runs in incremental mode…
Aaaand… since we’re talking disk-based backup here, having more than one is mandatory anyway if you want to ensure a minimum safety level.
I see what you mean with the tape analogy. Then the solution involves creating new blobs for any lost files and updating the snapshot manifest. This seems doable, as snapshot fix is already updating contents of past snapshots, as long as the hashes of the lost files are in the snapshot metadata and not in the blob itself.
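The idea of creating new blobs for lost contents could look roughly like the sketch below. This is purely hypothetical: kopia has no such command as far as this thread establishes, and `replay_missing`, the dict-based `blobs` and `index`, and the plain SHA-256 content IDs are all assumptions made for illustration. The sketch re-hashes the sources that are still available, finds contents whose index entry points at a blob that no longer exists, writes them into a fresh blob and repoints the index. The original blob is not recreated byte for byte; only the contents become readable again.

```python
import hashlib
import os


def content_id(data: bytes) -> str:
    """Toy content ID: plain SHA-256 of the data (an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def replay_missing(blobs, index, sources):
    """Re-pack contents whose index entry points at a blob that no longer exists.

    blobs:   dict of blob_id -> bytes
    index:   dict of content_id -> (blob_id, offset, length)
    sources: iterable of bytes, the original data still available locally
    Returns the sorted list of content IDs that were repaired.
    """
    buf = bytearray()
    pending = {}  # content_id -> (offset, length) inside the new blob
    for data in sources:
        cid = content_id(data)
        entry = index.get(cid)
        # Skip unknown contents, contents whose blob is intact, and duplicates.
        if entry is None or entry[0] in blobs or cid in pending:
            continue
        pending[cid] = (len(buf), len(data))
        buf += data
    if not pending:
        return []
    new_blob = "p" + os.urandom(8).hex()
    blobs[new_blob] = bytes(buf)
    for cid, (offset, length) in pending.items():
        index[cid] = (new_blob, offset, length)
    return sorted(pending)


a, b, c = b"alpha.pdf", b"beta.pdf", b"gamma.pdf"
blobs = {"p-old": a + b, "p-ok": c}
index = {
    content_id(a): ("p-old", 0, len(a)),
    content_id(b): ("p-old", len(a), len(b)),
    content_id(c): ("p-ok", 0, len(c)),
}
del blobs["p-old"]                               # the damaged blob is gone
repaired = replay_missing(blobs, index, [a, c])  # b's source is no longer available
```

Here only `a` is repaired: `c` was never broken, and `b` stays unreadable because its source is gone. Since snapshots reference contents by ID in this model, older snapshots that include `a` would read it from the new blob without any change to the snapshots themselves. Whether the same holds in kopia depends on its real index and manifest layout.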
Yup, I have multiple backups; cloud-based, and at a separate location, done with different backup programs too. This so happens to be my first serious experiment with kopia. And I must say I’m loving it, with this being the only caveat I’ve found till now.