Where can I find documentation on how to recover from errors, or at least show some information about the affected data?
I’m getting this error message when running maintenance:
2022-03-21T17:12:25.522694Z ContentInfo("4c50f57a81264768a65890663bc8c7f6") - error content not found
2022-03-21T17:12:25.522942Z error processing elasticsearch/nodes/0/indices/531RLnciSzeHZabvH3oW0Q/0/translog/translog.ckp: error verifying 4c50f57a81264768a65890663bc8c7f6: error getting content info for 4c50f57a81264768a65890663bc8c7f6: content not found
user@host:~$ kopia show 4c50f57a81264768a65890663bc8c7f6
ERROR error opening object 4c50f57a81264768a65890663bc8c7f6: content 4c50f57a81264768a65890663bc8c7f6 not found: object not found
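To see which content IDs are affected, the IDs can be pulled out of saved log output. This is only a minimal sketch and not a Kopia feature: `missing_content_ids` is a hypothetical helper name, and the pattern assumes the log wording shown above (`error getting content info for <id>: content not found`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Kopia log text on stdin and print each
# missing content ID once.
# Assumes the wording "error getting content info for <id>: content not found",
# where <id> is 32 hex characters with an optional one-letter prefix.
missing_content_ids() {
  sed -nE 's/.*error getting content info for ([a-z]?[0-9a-f]{32}): content not found.*/\1/p' | sort -u
}

# Usage (maintenance.log is an assumed file holding saved Kopia output):
#   missing_content_ids < maintenance.log
```

The helper only lists IDs; it does not say which snapshots reference them.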
You can explore the diff command too, but my guess is that, since this is Elasticsearch’s transaction log, it was changing during the backup, and that’s why. The best approach is either to take a filesystem snapshot (LVM, ZFS, FreeBSD’s UFS…) to freeze the state before the backup, or to stop Elasticsearch (and likewise any databases) before backing up.
If I understand correctly, this command tells me whether a repo is corrupted, but I already know that something is wrong. Or does it tell me which snapshots are affected?
Data from Elasticsearch isn’t important to me and can be recreated easily.
The issue you are experiencing occurs because you are running the backup over a live, working instance of Elasticsearch (ES), where the transaction log changes while you are backing it up.
Just as you shouldn’t back up a live MySQL or Postgres instance, you shouldn’t do it with ES.
Exclude the live ES location completely in .kopiaignore, take an Elasticsearch snapshot instead, and back up the ES snapshots rather than the live ES data.
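Excluding the live data directory could look like the sketch below. It assumes the snapshot root is the directory that contains `elasticsearch/` (as the path in the error log suggests) and that the `.kopiaignore` file sits in that root; `add_ignore_rule` is a hypothetical helper whose only job is to avoid duplicate entries.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append an ignore pattern to a .kopiaignore file
# unless that exact line is already present.
add_ignore_rule() {
  local ignore_file="$1" pattern="$2"
  touch "$ignore_file"
  # -x: match the whole line, -F: fixed string, -q: no output
  if ! grep -qxF -- "$pattern" "$ignore_file"; then
    printf '%s\n' "$pattern" >> "$ignore_file"
  fi
}

# Usage (the snapshot root path is an assumption):
#   add_ignore_rule /path/to/snapshot-root/.kopiaignore 'elasticsearch/'
```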
Snapshotting ES is pretty easy: just a single curl request (after registering the repository once), like:
HOST=http://localhost:9200
repo=mybackup
snap_name=$(date '+%Y-%m-%dT%H-%M-%S')
repo_path='/path/to/backup'
# Register your repository (you need to do this just once).
# The location must be listed under path.repo in elasticsearch.yml.
curl -s -XPUT "${HOST}/_snapshot/${repo}?pretty=true" \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "fs",
    "settings": {
      "location": "'"${repo_path}/${repo}"'",
      "compress": true
    }
  }'
# Create a snapshot of Elasticsearch (similar to a database dump)
curl -XPUT \
  "${HOST}/_snapshot/${repo}/${snap_name}?wait_for_completion=true&pretty=true"
# Then run kopia on ${repo_path}/${repo} only
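Before handing the snapshot directory to Kopia, it may be worth confirming that Elasticsearch reports the snapshot as complete. A minimal sketch, assuming the response to the snapshot request (with `wait_for_completion=true`) contains a `"state": "SUCCESS"` field; `snapshot_succeeded` is a hypothetical helper, and the `kopia snapshot create` call in the usage comment is just one way to back up that directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the JSON response of the snapshot request on
# stdin and succeed only if it reports the state SUCCESS.
snapshot_succeeded() {
  grep -Eq '"state"[[:space:]]*:[[:space:]]*"SUCCESS"'
}

# Usage, reusing the variables from the script above:
#   curl -s -XPUT "${HOST}/_snapshot/${repo}/${snap_name}?wait_for_completion=true" \
#     | snapshot_succeeded && kopia snapshot create "${repo_path}/${repo}"
```

The check is a plain text match rather than real JSON parsing, which keeps it dependency-free but means it would also match a `"state"` field elsewhere in the response.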
You seem to have misunderstood the question. Kopia maintenance runs are aborted with the error message posted above, and I want to know how to fix or repair the repository.
user@host:~$ kopia maintenance info
Owner: user@host
Quick Cycle:
scheduled: true
interval: 6h0m0s
next run: 2022-04-03 16:45:16 UTC (in 1h11m56s)
Full Cycle:
scheduled: false
Log Retention:
max count: 10000
max age of logs: 720h0m0s
max total size: 1 GiB
Recent Maintenance Runs:
snapshot-gc:
2022-04-02 16:45:13 UTC (13m37s) ERROR: unable to find in-use content ID: error processing snapshot root: error verifying 4c50f57a81264768a65890663bc8c7f6: error getting content info for 4c50f57a81264768a65890663bc8c7f6: content not found
2022-03-26 16:45:11 UTC (7m11s) ERROR: unable to find in-use content ID: error processing snapshot root: error verifying 4c50f57a81264768a65890663bc8c7f6: error getting content info for 4c50f57a81264768a65890663bc8c7f6: content not found
2022-03-24 14:22:22 UTC (3m18s) ERROR: unable to find in-use content ID: error processing snapshot root: error verifying 4c50f57a81264768a65890663bc8c7f6: error getting content info for 4c50f57a81264768a65890663bc8c7f6: content not found
2022-03-24 14:13:06 UTC (6m23s) ERROR: unable to find in-use content ID: error processing snapshot root: error verifying 4c50f57a81264768a65890663bc8c7f6: error getting content info for 4c50f57a81264768a65890663bc8c7f6: content not found
2022-03-24 13:28:11 UTC (9m4s) ERROR: unable to find in-use content ID: error processing snapshot root: error verifying 4c50f57a81264768a65890663bc8c7f6: error getting content info for 4c50f57a81264768a65890663bc8c7f6: content not found
cleanup-epoch-manager:
2022-03-16 12:23:20 UTC (3s) SUCCESS
2022-03-15 10:46:56 UTC (2m12s) SUCCESS
2022-01-23 17:22:37 UTC (0s) SUCCESS
cleanup-logs:
2022-03-16 12:23:20 UTC (0s) SUCCESS
2022-03-15 10:46:51 UTC (4s) SUCCESS
2022-01-23 17:22:37 UTC (0s) SUCCESS
full-delete-blobs:
2022-03-15 10:46:23 UTC (28s) SUCCESS
full-drop-deleted-content:
2022-03-16 12:23:19 UTC (0s) SUCCESS
2022-03-15 10:46:22 UTC (0s) SUCCESS
full-rewrite-contents:
2022-03-16 10:06:38 UTC (2h16m40s) SUCCESS
2022-01-23 17:22:37 UTC (0s) SUCCESS
I wonder if it would help to recover the indices using
kopia index recover
maybe even with --delete-indexes. Also, I think that omitting --commit would not change anything, but I haven’t tried that. Nor have I tried recovering the indices, so use at your own risk. However, I am rather confident that Kopia is very reluctant to destroy its internal data structures… and rightly so.
Since you’re running your repo on an S3 bucket, this will ultimately be a rather slow process, depending on the size of your repo. Maybe you want to try that locally first.
Or maybe, try to drop the content from the index…
kopia index optimize --drop-contents=DROP-CONTENTS
Since that content cannot be found anyway, you won’t lose anything. At least, that is what I think.
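Spelled out for the ID in this thread, the drop could be prepared as follows. The sketch only prints the commands so they can be reviewed before anything touches the index; `print_drop_commands` is a hypothetical helper, and it assumes `--drop-contents` takes a content ID exactly as in the command above. I haven’t run this against a real repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) one drop command per content ID.
print_drop_commands() {
  local id
  for id in "$@"; do
    printf 'kopia index optimize --drop-contents=%s\n' "$id"
  done
}

# Example with the ID from the error message:
#   print_drop_commands 4c50f57a81264768a65890663bc8c7f6
```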
# 1. First make sure no other Kopia instance is running besides this one
kopia maintenance run --full --safety=none
# 2. If that fails
kopia blob delete 4c50f57a81264768a65890663bc8c7f6
This command did not produce any output, but I guess that’s expected because Kopia complained that this blob is missing, hence there is nothing to delete.
I ran that command without --commit and got this: Found 6432604 contents to recover from 221632 blobs, but not committed. Re-run with --commit
I’m currently running it with --commit and will report back later, as there are still 9 hours remaining.
No, I didn’t try that because I couldn’t find any information on when to use this command.
Unfortunately, I have lots of snapshots with errors from multiple Windows clients because of read permission errors. But the one client with Elasticsearch on it does not show any errors in its snapshots.
Running full maintenance...
Looking for active contents...
ERROR error processing 0F8E0C64646F76CA-win10_2022-04-04_Full-00-00.mrimg: error verifying Ixeec3b41bd642f3971d4d8bb231d4fc97: unable to read index: error getting content info for xeec3b41bd642f3971d4d8bb231d4fc97: content not found
Finished full maintenance.
ERROR snapshot GC failure: error running snapshot gc: unable to find in-use content ID: error processing snapshot root: error verifying Ixeec3b41bd642f3971d4d8bb231d4fc97: unable to read index: error getting content info for xeec3b41bd642f3971d4d8bb231d4fc97: content not found
I’ve been getting this (same) error message for a few days now on a different repository (the repository from my original post has since been deleted) and a different server. Any ideas what is causing it and how to recover?