Local metadata cache for incremental backup

CrendKing · September 10, 2021, 11:01am

A few months ago, there was a post about Kopia on Hacker News. Someone mentioned that it’d be nice if Kopia kept metadata files separate from data files, so that he could upload the data files to Amazon S3 while keeping the metadata locally for incremental backup. The point is that some tiers of S3 (e.g. Glacier) are extremely cheap for storage and ingress requests, but very expensive for egress data transfer.

Jarek replied that Kopia can easily cache the metadata locally for those purposes. If I understand correctly, this is the $KOPIA_CACHE_DIRECTORY directory? If I keep this directory locally, is it guaranteed that Kopia will always use the local copies of the n and q files unless there’s a cache miss, in which case a download from the remote repo would happen?

If the cache directory is not for that purpose, is there a mechanism to keep a copy of metadata files locally in the current version?

Side question: if I use a local filesystem repo, the cache would be pointless since Kopia can just access the source directly, right? If so, is there a way to disable the cache for a local filesystem repo?

CrendKing · September 14, 2021, 9:32am

So I cloned the Kopia repo locally, changed the code to support a storage class when calling PutObject on S3, then proceeded to test whether this simple change would work. Unfortunately it doesn’t, because some storage classes do not allow GET.

If I directly supply --storage-class=DEEP_ARCHIVE to the repository create s3 command, it ends with the error “kopia.exe: error: unable to connect to repository: error connecting to repository: unable to read format blob: unable to complete GetBlob(kopia.repository,0,-1) despite 10 retries, last error: The operation is not valid for the object’s storage class, try --help”. It looks like Kopia does some read-after-write operations on S3.
If I pass --storage-class=STANDARD to repository create s3, I can create the repo. All objects are stored in the standard storage class. Now if I change the storage class option in the repo JSON on the fly and then create a snapshot, new objects are indeed stored in the new storage class.
Unfortunately, if the new storage class is one of the Glacier classes, creating a snapshot results in this error: “error running maintenance: error running maintenance: unable to get maintenance params: error looking for maintenance manifest: unable to load manifest contents: error loading manifest content: error getting cached content: unable to complete GetBlob(q872c39f3462b6c40d83b46ced9e7ee78-sbf7032cc1d0a2125108,0,-1) despite 10 retries, last error: The operation is not valid for the object’s storage class”

So at this point, simply adding a storage class option to the S3 repo is probably OK for non-Glacier classes, but not enough for the two Glacier classes. To support Glacier, Kopia probably needs to treat metadata blobs differently from data blobs. Currently it seems all blobs are treated identically in the storage layer.

jkowalski · September 14, 2021, 1:19pm

Have you tried a conditional storage class based on the blob ID prefix? The p blobs contain the bulk of the data while q blobs and others contain metadata, so it might make sense (not sure about cost) to only put p blobs in the Glacier class.
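
A minimal sketch of what such a prefix-based choice could look like, in Go since that is what Kopia is written in. The function name storageClassForBlob and its parameters are made up for illustration and are not part of Kopia's code; the only assumption taken from the thread is that pack blobs with bulk data have IDs starting with "p" and everything else is metadata.

```go
package main

import (
	"fmt"
	"strings"
)

// storageClassForBlob is a hypothetical helper, not a Kopia API.
// It returns dataClass for blobs whose ID starts with "p" (bulk data)
// and metadataClass for every other blob (q, n, xn, _log, ...).
func storageClassForBlob(blobID, dataClass, metadataClass string) string {
	if strings.HasPrefix(blobID, "p") {
		return dataClass
	}
	return metadataClass
}

func main() {
	// Blob IDs taken from the error messages quoted in this thread.
	ids := []string{
		"p87bb2c02fd9b6f1ac925b9b3af2e0762-sdc1d5ce7d406f483108",
		"q872c39f3462b6c40d83b46ced9e7ee78-sbf7032cc1d0a2125108",
		"kopia.repository",
	}
	for _, id := range ids {
		fmt.Printf("%s -> %s\n", id, storageClassForBlob(id, "DEEP_ARCHIVE", "STANDARD"))
	}
}
```

The result would then be passed as the storage class of the PutObject call for that blob, so only p blobs end up in the cold class.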

CrendKing · September 15, 2021, 4:08am

Much better: the only operation that unexpectedly failed is maintenance run --full --safety=none. All other operations work fine.

Here is the output for maintenance:

.\kopia --config-file=S3.config maintenance run --full --safety=none
Running full maintenance...
Looking for active contents...
  Processed 36 contents, discovered 36...
Looking for unreferenced contents...
Rewriting contents from short packs...
unable to rewrite content "64d1928fd03b83a50a18756605646ec56d793cbf6b802f040303e055ecee548d": unable to get content data and info: error getting cached content: unable to complete GetBlob(p87bb2c02fd9b6f1ac925b9b3af2e0762-sdc1d5ce7d406f483108,1182885,2987) despite 10 retries, last error: The operation is not valid for the object's storage class
unable to rewrite content "a357ae7bb288742bd0326fc473e2032a69c49ac524c811c42aaa8ca304fc60c7": unable to get content data and info: error getting cached content: unable to complete GetBlob(p87bb2c02fd9b6f1ac925b9b3af2e0762-sdc1d5ce7d406f483108,6559571,1762591) despite 10 retries, last error: The operation is not valid for the object's storage class
unable to rewrite content "a474dbbc30a3d4c7cf58bf2cb67f1d3840c37d1ddaf5fc234879771302ec3c3a": unable to get content data and info: error getting cached content: unable to complete GetBlob(p87bb2c02fd9b6f1ac925b9b3af2e0762-sdc1d5ce7d406f483108,1114261,23878) despite 10 retries, last error: The operation is not valid for the object's storage class
unable to rewrite content "b2664813e60d10d3b066a56fe4a0a15510eaf6b07a0457690640953633c9f2ec": unable to get content data and info: error getting cached content: unable to complete GetBlob(p87bb2c02fd9b6f1ac925b9b3af2e0762-sdc1d5ce7d406f483108,1270393,6316) despite 10 retries, last error: The operation is not valid for the object's storage class
<skipping many lines>
Finished full maintenance.
default (64KiB) - allocated 48 chunks freed 47 alive 1 max 12 free list high water mark: 11
ERROR: error rewriting contents in short packs: failed to rewrite 37 contents

I guess full maintenance actually reads and rewrites existing blobs in some way?

Anyway, it’s good progress. Here are my questions:

Besides the “p” and “q” blobs, I also noticed many “xn”, “n” and “_log” files, both in this example and in a new repo I recently created after 0.9. They are usually small and few, but they still take up some storage. However, when I blob list an old repo I created long before, those blobs are extremely rare (there were only 3 non-p and non-q blobs in this 2TB repo). Is there an internal doc listing the meaning of all these blob prefixes? Are some of them good candidates for the Glacier class as well? And why am I seeing a dramatic increase in the number of these blobs in recent repos?
Glacier is basically write-only storage, but during backup the metadata will inevitably be read. Can we offer an option to store or cache the non-p blobs locally, so that we can completely eliminate the need for the S3 standard storage class?

CrendKing · September 16, 2021, 7:39am

The “Storage Tiers | Kopia” documentation page basically answered my questions. So it looks like a solution to support AWS S3 Glacier would be:

Add an option that lets the user choose which storage class to store p blobs in. Defaults to standard. Update the S3 documentation to suggest that users turn off full maintenance if they set this to a Glacier class.
Add an option that lets the user choose which storage class to store all other types of blobs in. Defaults to standard. Update the S3 documentation to warn users not to set this to a Glacier class.
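
To make the two options concrete, here is a rough Go sketch of how they might be modelled. Everything in it (the storageClassOptions type, its field names, and the warning texts) is hypothetical and only mirrors the proposal above; it is not how Kopia implements or names anything. GLACIER and DEEP_ARCHIVE are the S3 names of the two Glacier classes discussed in this thread.

```go
package main

import "fmt"

// storageClassOptions is a hypothetical options struct, not Kopia's real one.
type storageClassOptions struct {
	DataClass  string // storage class for "p" blobs
	OtherClass string // storage class for all other blobs
}

// withDefaults fills unset options with STANDARD, as the proposal suggests.
func (o storageClassOptions) withDefaults() storageClassOptions {
	if o.DataClass == "" {
		o.DataClass = "STANDARD"
	}
	if o.OtherClass == "" {
		o.OtherClass = "STANDARD"
	}
	return o
}

// isGlacierClass reports whether class is one of the two Glacier classes
// that reject a plain GET in the tests described earlier in the thread.
func isGlacierClass(class string) bool {
	return class == "GLACIER" || class == "DEEP_ARCHIVE"
}

// warnings returns the documentation-style warnings from the proposal.
func (o storageClassOptions) warnings() []string {
	var w []string
	if isGlacierClass(o.DataClass) {
		w = append(w, "p blobs are in a Glacier class: turn off full maintenance")
	}
	if isGlacierClass(o.OtherClass) {
		w = append(w, "non-p blobs are in a Glacier class: metadata reads will fail")
	}
	return w
}

func main() {
	opts := storageClassOptions{DataClass: "DEEP_ARCHIVE"}.withDefaults()
	fmt.Printf("%+v\n", opts)
	for _, w := range opts.warnings() {
		fmt.Println("warning:", w)
	}
}
```

Keeping the two settings separate means the risky one (non-p blobs) can stay at its default while only the p blob class is changed.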

Anything else? I can do a pull request for these two, if you want.

paulr · October 3, 2021, 12:08am

Anything else?

@CrendKing: what about deletion of files? Have your tests been able to check whether kopia can delete Glacier archives when a blob needs to be deleted? I’m new to kopia, but I’m guessing it will need to delete blobs eventually if a file is removed from the backed-up folder, right?

The AWS docs mention that one needs to retrieve an object and wait before being able to delete it.
And what looks like the kopia line that deletes blobs doesn’t seem to handle that complex use case.

(& your pull request doesn’t mention deletion of blobs either)

And great idea by the way! I’ve been using kopia since that Hacker News thread and it’s great!
I’m facing issues this weekend in one of my repositories, where all my objects have moved to Glacier and broke kopia…

I think an AWS lifecycle rule can do the same as what you’re currently suggesting (it’s possible to move files starting with “p” to Glacier after X days), but the deletion of files is still an issue.
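
For reference, a small Go sketch that prints a lifecycle configuration of that kind as JSON, in the shape S3 accepts for bucket lifecycle configurations. The helper name packBlobLifecycle, the rule ID, and the 30 days used in main are placeholders I made up (the "X days" above is left to the reader), and it assumes the repository sits at the bucket root so that object keys start with "p". It does nothing about the deletion question.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// These types mirror the JSON shape of an S3 bucket lifecycle configuration.
type transition struct {
	Days         int    `json:"Days"`
	StorageClass string `json:"StorageClass"`
}

type filter struct {
	Prefix string `json:"Prefix"`
}

type rule struct {
	ID          string       `json:"ID"`
	Filter      filter       `json:"Filter"`
	Status      string       `json:"Status"`
	Transitions []transition `json:"Transitions"`
}

type lifecycleConfig struct {
	Rules []rule `json:"Rules"`
}

// packBlobLifecycle is a hypothetical helper: it builds one rule that moves
// objects whose key starts with "p" to the given class after `days` days.
func packBlobLifecycle(days int, class string) lifecycleConfig {
	return lifecycleConfig{Rules: []rule{{
		ID:          "kopia-p-blobs-to-cold-storage", // placeholder name
		Filter:      filter{Prefix: "p"},
		Status:      "Enabled",
		Transitions: []transition{{Days: days, StorageClass: class}},
	}}}
}

func main() {
	// 30 is only a placeholder for the "X days" mentioned above.
	out, err := json.MarshalIndent(packBlobLifecycle(30, "GLACIER"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```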
