Questions about Maintenance

Hello!

First I want to start by saying I really enjoy Kopia. It’s an incredible little tool and it convinced me to roll my own backup solution because it has everything that I’m looking for.

I use an object storage provider that has a minimum storage duration policy of 90 days. I’ve been keeping an eye on Kopia and its maintenance runs for a little while and I noticed a couple of things. It seems the most data is deleted/rewritten (both count against me) when I add new clients to the repository. For example, when I started adding data it was just my three clients at home. I did all the uploading/testing during a free trial, and when I was satisfied I switched over to paid and just let Kopia run in the background. Over 11 days my deleted/rewritten data went from 0TB to 0.05TB, which isn’t that bad, and is something I’m okay with.

After adding two new clients, each with about 1.5TB, I went from 0.05TB to 0.08TB over the past three days. Again, the amount on its own isn’t an issue, but it will be if it keeps growing at that rate. If I delete/rewrite 10GB a day and my minimum storage duration is 90 days, that’s 900GB that I’m paying for that is unused.
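To make that arithmetic easy to redo with other numbers, here is a minimal sketch in shell. The function name `billed_unused_gb` is hypothetical, and it assumes a constant daily delete/rewrite rate and that every deleted or rewritten byte stays billable for the full minimum storage duration.

```shell
# Hypothetical helper (not part of Kopia): steady-state amount of
# deleted/rewritten data that is still billed under a minimum
# storage duration policy.
#   $1 = GB deleted or rewritten per day (assumed constant)
#   $2 = minimum storage duration in days
billed_unused_gb() {
  echo $(( $1 * $2 ))
}

# 0.03TB over three days is roughly 10GB/day; with a 90-day minimum:
billed_unused_gb 10 90
```

With the figures from the post, this prints 900, matching the 900GB estimate.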

I’ve kept an eye on maintenance and it mostly seems to be content repacks and then blob deletions.

Is this kind of growth only seen when adding new clients? Should I expect it to stabilize a bit once everything is uploaded and new data is added slowly over time?

If it ever gets out of control, what are some durations that I could set on maintenance to balance minimal rewrites/deletions with repository performance? Right now I’ve kept everything maintenance-wise at defaults.

Thanks for any help, I really do appreciate it.

Right now there are no fine-grained settings to control the rewrites other than the full maintenance interval, which you can set using:

kopia maintenance set --full-interval 24h

or something like that. To minimize rewrites, I would set it to something like 240h (10 days) or more, so that it rewrites things less frequently in bigger chunks, so it won’t needlessly rewrite stuff that’s partially alive.
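As a sketch of that suggestion, the snippet below applies the 240h value and then prints the maintenance settings so the change can be checked. It assumes `kopia` is on the PATH and already connected to the repository; the `FULL_INTERVAL` variable is only illustrative, and the guard simply skips the commands when `kopia` is not installed.

```shell
# Illustrative value taken from the suggestion above (10 days).
FULL_INTERVAL="240h"

# Only run if the kopia CLI is available on this machine.
if command -v kopia >/dev/null 2>&1; then
  # Run full maintenance (content rewrites and blob deletions) less often.
  kopia maintenance set --full-interval "$FULL_INTERVAL"
  # Show the current maintenance settings to confirm the new interval.
  kopia maintenance info
fi
```

Quick maintenance keeps its own schedule, so this only changes how often content rewrites and blob deletions happen.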


Thanks so much for the response! I was wondering if pushing full maintenance off was okay, since it seems like the quick maintenance is the indexes/what helps Kopia work efficiently. I’ll most likely implement that and keep an eye on things.

Really appreciate all the work you’ve done with Kopia, and looking forward to the new repository update!

Yes, that is accurate w.r.t. quick maintenance. In the new version that won’t be strictly necessary, so you can decide to run maintenance very infrequently.


Out of curiosity, are there any plans to create a sort of hybrid model between old/new? Something like “rewrites are fine if packs are over days old” or similar? From the little bit I was able to glean looking through the code it seems like that would be challenging, to say the least, but I figured it might be worth an ask on the off chance it’s been considered.

Rewrites of contents are only necessary if we want to be able to delete blobs (and to keep the number of blobs down, which is mostly cosmetic); they don’t really help performance.

On the other hand, rewrites/compaction of index blobs (n) are necessary for good performance, as Kopia will suffer badly if the number of n blobs gets above a few hundred.

I haven’t fully thought through maintenance in the new index format, but I think we’ll have just one type of maintenance (full), because index compaction is not really required for performance anymore. I’m going to be working on this over the next few weekends.
