1 invalid checksum in a huge snapshot

After losing a lot of files because the snapshot was broken and I didn’t know it, I was rebuilding with what I’d salvaged. I spent many hours taking a snapshot, then immediately ran content verify --full for many more hours.

It ran through the night and found one error:

[ … many successful lines … ]
ERROR error content bd24d7b5100ee1ad014705bf74e2721d is invalid: invalid checksum at p4d86ef908128223b30d9ac96a0c54de0-s3790696fb7e5769412f offset 17584579 length 5257669/5257669: decrypt: unable to decrypt content: cipher: message authentication failed
[… many more successful lines …]
Verified 451026 of 488431 contents (92.3%), 1 errors, remaining 1h15m30s, ETA 2025-04-09 05:16:39 PDT
Finished verifying 488431 contents, found 1 errors.

After that, I did a “snapshot verify” (following the instructions from this thread) and it ran successfully.
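For reference, the two verification passes look roughly like this as a script (the flags are the ones used in this thread; the guard just skips the kopia calls on a machine where it isn’t installed):

```shell
# Sketch of the two verification passes described above. The guard lets
# this run harmlessly on a machine without kopia installed.
if command -v kopia >/dev/null 2>&1; then
    # structural pass: walks snapshot manifests and directory entries
    kopia snapshot verify
    # deep pass: downloads and re-verifies every content (slow on ~850GB)
    kopia content verify --full
fi
STATUS="done"
echo "verify sketch: $STATUS"
```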

I haven’t tried “fix invalid-files” yet, because I’d like to avoid re-downloading the entire contents (~850GB, which takes hours). If the error message gives me a hash, is there a way I can specifically fix that one file? I’m still pretty confused about the various verify and fix commands.

edit, to clarify: I had this huge backup in a Wasabi S3 bucket before. I recovered it earlier this year and found that some of the files were broken. I deleted the broken files and kept the rest, and now I’m backing them up to the same S3 bucket, so it’s possible this invalid checksum is the same error from before. (Most of the snapshot was probably cached; I think it only uploaded a small portion of the ~850GB.)

Do you have the ability to check the birth timestamp on that piece of data on either endpoint? E.g.:

stat ~/.vimrc

  File: /home/user/.vimrc
  Size: 2464            Blocks: 8          IO Block: 4096   regular file
Device: 0,51    Inode: 11115395    Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/  user)   Gid: ( 1000/  user)
Access: 2024-07-21 03:01:12.299544022 -0600
Modify: 2024-11-01 08:51:37.726508750 -0600
Change: 2025-03-03 00:25:26.034482989 -0600
 Birth: 2025-03-03 00:25:07.837727108 -0600

(I’m following this thread but don’t think I’d be much help. I have yet to run drills on Kopia.)

fix invalid-files will not download the entire contents; why should it? It merely checks which files reference contents that have vanished from the repo and fixes the snapshots to resolve that. It might download some contents, since it will likely need to rewrite some blobs, but that should do it.

I vaguely remember your thread about this, but I wonder what you mean by “I deleted the broken files”? If you’ve got the repo available locally, and I seem to remember that you do, I’d fix that repo locally and perform a repo sync-to s3 afterwards to get the repo in the S3 bucket in order.
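If the repo really is available locally, the repair-then-sync flow could look something like this sketch; the filesystem path, bucket name, and credential variables are all placeholders, not details from this thread:

```shell
# Hypothetical repair-locally-then-sync flow; the path, bucket, and
# credentials below are placeholders. Guarded so it is a no-op
# without kopia installed.
if command -v kopia >/dev/null 2>&1; then
    # connect to the local copy of the repository
    kopia repository connect filesystem --path=/mnt/backup/kopia-repo
    # fix snapshots locally, then push the repaired state to S3
    kopia snapshot fix invalid-files --commit
    kopia repository sync-to s3 \
        --bucket=my-backup-bucket \
        --access-key="$AWS_ACCESS_KEY_ID" \
        --secret-access-key="$AWS_SECRET_ACCESS_KEY"
fi
STATUS="done"
echo "sync sketch: $STATUS"
```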

I thought, since the error was discovered during a full content verify, it would need to verify the contents again to know where the error is.
Here’s a timeline of events to clarify my situation, because I didn’t do a good job explaining what’s up:

  1. Used KopiaUI for years, backed up my projects to an S3 bucket but didn’t do any verifications lately
  2. This snapshot is actually a Cryptomator volume, because it was previously backed up to Google drive and I wanted it encrypted
  3. That Windows installation was lost, but I figured I had my files backed up to the S3 bucket, so I formatted the local drive and installed Linux on the machine
  4. Restored the snapshot from Kopia and discovered that some files were now incomplete/corrupt/zero-length
  5. That’s when I realized my mistake (KopiaUI doesn’t verify anything, yet?) and made my previous thread about it
  6. I accepted that those files were probably unrecoverable and just deleted them
  7. I created a new snapshot to the same S3 repo and let it run
  8. Most of the data was unchanged, so a lot of it didn’t have to upload again, I think.
  9. Ran content verify --full and found this one checksum error.

Now I want to fix the checksum on the S3 repo to make sure the data is all solid there.

I see… you could always just remove the content ID from the repo using

kopia content delete bd24d7b5100ee1ad014705bf74e2721d --advanced-commands=enabled

I’d assume that repo maintenance would then take care of the unreferenced blob and remove it. The data within that blob should be useless to you anyway and probably ties into the files that got corrupted.
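Put together, the delete-then-clean-up sequence would be roughly this; the content ID is the one from the verify error above, and my assumption is that a full maintenance run is what eventually collects the orphaned blob:

```shell
# Remove the bad content record, then let maintenance collect the orphan.
CONTENT_ID="bd24d7b5100ee1ad014705bf74e2721d"  # from the verify error above
if command -v kopia >/dev/null 2>&1; then
    kopia content delete "$CONTENT_ID" --advanced-commands=enabled
    # full maintenance compacts indexes and drops unreferenced blobs
    kopia maintenance run --full
fi
STATUS="done"
echo "delete sketch: $STATUS"
```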

What does kopia content delete do? I would like to remove the corrupted stuff from the remote s3 bucket and then re-run my snapshot to send the correct data to fix it.

The docs are a bit light for that command, and it sounds like it might delete my local files? I definitely don’t want that.

Why would Kopia ever delete something from your sources, as long as you don’t run a restore command and prompt it to overwrite any data already there?
As the docs state, that command will delete the contents associated with that ID from the repo. The files associated with this content ID in the repo will become unavailable.
From that point onward you could run a

kopia snapshot fix invalid-files

and see what errors you get. If you’re OK with it, repeat that using the --commit flag to update the snapshots. Then perform another snapshot.
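As a sketch, that’s a dry run first to see the report, then the same command with --commit:

```shell
# Dry run reports what would change without touching snapshots;
# --commit actually rewrites them. Guarded so it is a no-op
# without kopia installed.
if command -v kopia >/dev/null 2>&1; then
    kopia snapshot fix invalid-files           # report only
    kopia snapshot fix invalid-files --commit  # apply the fixes
fi
STATUS="done"
echo "fix sketch: $STATUS"
```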

Well, I ran those commands and they didn’t say they fixed or changed anything.

I ran another snapshot, expecting the files corresponding to the deleted content to be uploaded to the repo again, but the snapshot finished instantly.

I followed up with another content verify --full and it finished, this time with no errors found:

Finished verifying 489216 contents, found 0 errors.

So, I guess I’m good? I’ve been relying on the UI to get a full picture of what’s going on, and I don’t feel like I confidently understand the CLI yet, so I definitely look forward to the day when verification and fixing corruption can be done from the UI!

I’ll periodically run content verify and in the future I’ll use this approach to try to fix any errors I get.
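One way to make that periodic check automatic is a crontab entry; the schedule and log path here are just examples, not anything from this thread:

```shell
# Example crontab line: full content verify every Sunday at 03:00,
# with output logged so an error isn't silently lost.
0 3 * * 0  kopia content verify --full >> /var/log/kopia-verify.log 2>&1
```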

Yeah… your repo should be good then. If neither command shows anything amiss, I’d reckon that repo is healthy.
