PLEASE READ: Don't use --safety=none for routine maintenance

jkowalski · June 8, 2022, 5:38am

I’ve noticed folks here often paste command line examples involving running:

$ kopia maintenance --safety=none

I can’t stress this enough:

This is NOT recommended for most users and could be outright dangerous in some cases.

It is recommended to ALWAYS use default safety settings, which let Kopia apply appropriate safety margins. In the short term it means the most-recently written data in the repository may not be immediately compacted, but when running maintenance regularly the repository will be fully compacted over time.

Safety buffers in kopia serve three purposes:

ensuring proper cache expiration - for good performance Kopia caches a lot of data. For correctness, we must ensure certain information reliably propagates to all the clients that need to see it, and they could be active, so we must wait long enough for all clients to refresh their caches.
clock skew correction - kopia can tolerate clock skew between client(s) and server up to several minutes. --safety=none tells it - “trust me, there is ZERO clock skew”, which is almost always dangerous and can lead to premature blob deletion and data corruption.
provider consistency issues - not all storage providers are created equal. Some providers will have weaker consistency than others. For example sometimes after writing a file, that file may not show up in directory listing for a brief moment (milliseconds, sometimes up to seconds or even minutes for very slow connections). Networked filesystems are typical examples of this, but many providers exhibit similar behaviors to a certain degree.

When running garbage collection during maintenance it is critical for Kopia to see ALL data written up to this point (and default safety margins allow for quite significant provider-level inconsistencies and clock skews), otherwise during its mark/sweep garbage collection it may incorrectly treat blobs as unreferenced and delete them prematurely leading to data loss. This has actually happened to several folks and --safety=none was the leading cause of data loss we’ve seen so far.
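To illustrate why the margin matters, here is a toy mark/sweep sketch in Python. It is not Kopia's implementation: `Blob`, `sweep`, and the 4-hour margin are made-up names and numbers chosen only for illustration. The toy collector deletes a blob only if nothing it has seen references the blob and the blob is older than the safety margin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blob:
    blob_id: str
    written_at: float  # seconds since epoch, according to the writer's clock


def sweep(blobs, referenced_ids, now, safety_margin):
    """Return the IDs of blobs this toy collector would delete.

    A blob is deleted only if no index the collector has seen references it
    AND it is older than safety_margin seconds.
    """
    return sorted(
        b.blob_id
        for b in blobs
        if b.blob_id not in referenced_ids
        and now - b.written_at > safety_margin
    )


now = 1_000_000.0
blobs = [
    Blob("old-garbage", now - 48 * 3600),  # unreferenced for two days
    Blob("fresh-data", now - 30),          # just written by another client
]

# The collector's view is stale: it has not yet seen the index entry
# that references "fresh-data" (cache delay, slow listing, clock skew).
stale_view = set()

with_margin = sweep(blobs, stale_view, now, safety_margin=4 * 3600)
no_margin = sweep(blobs, stale_view, now, safety_margin=0)

print(with_margin)  # only the genuinely old blob is deleted
print(no_margin)    # the freshly written blob is deleted too
```

With a margin, the freshly written blob survives until a later run that can see its reference. With a margin of zero, the stale view is trusted completely and live data is swept.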

ray · December 8, 2023, 9:11am

hello
Sorry for not reading this topic. We tried kopia maintenance --safety=none. Some days later, the kopia connect command no longer succeeds and returns no output, and the same is true of other kopia commands. The kopia process also seems to use a very large amount of memory.
So it seems the data is broken? Is there any way to recover the data?

tobiasBora · January 19, 2025, 3:31pm

My understanding is that this is safe to use for local (filesystem) repositories, can you confirm? And without using --safety=none, how long am I supposed to wait (at most) before the repository size can be shrunk? Are we talking like 2 min or 24 h? Maybe adding some information in the maintenance logs like “in 5 min we can free X GB” would encourage people to wait 5 min instead of using safety=none.

kapitainsky · January 19, 2025, 4:14pm

It is clearly documented:

This means that effects of full maintenance are not immediate - it may take several hours and/or multiple maintenance cycles to remove blobs that are not in use.

The best approach is to run maintenance regularly and not lose sleep over the fact that free space reclamation takes some time. As a bonus, we get a lock-free solution where multiple clients can use the same repo at the same time.

In most cases simply running regular backups is enough. Maintenance will be run automatically according to its default schedule. Full maintenance is run every 24h by default.

tobiasBora · January 19, 2025, 7:09pm

It is clearly documented:

This is not very precise (and also does not clearly answer my question about filesystem repos): of course since maintenance is run every 24h it can take a few hours, but what if I run it manually? Does that cut it down to a few minutes? What if I only have one client? And is the time dependent on the repo’s type?

The use case I have in mind is when I want to do a backup but lack space on my (typically local) repo (which occurs nearly every time I back up my system since my hard drive has little free space). In this scenario I can’t afford to wait several times 24h until maintenance has run enough times, I need to free space within a few minutes (otherwise I’ll never dare do backups if I need a few days to run them). So I need precise guarantees about what is and is not safe to do to get a short maintenance time.

kapitainsky · January 19, 2025, 7:27pm

will free space immediately I think. Or on the second run. The best to test. Either way it will be easy to run it twice if needed.

The only risk is that if you run any other operations in parallel you risk corrupting the whole repository, but as you are the only user, that is fully under your control.

Another problem is that if you run out of space while running a backup you also risk corrupting the whole repo. I would rethink my backup strategy in such a case. Running backups on a shoestring is not the best approach.

tobiasBora · January 19, 2025, 7:34pm

Ok, thanks.

??? Really ??? Any reference to such a statement in the doc/github issue? This seems like a major risk here, one can very easily run out of space without noticing.

My computer has a 2TB drive, like my backup drive, and in practice it cannot hold more than 2-3 snapshots separated by a few months. I could afford a 5TB one, but even in this case, after a few snapshots, I can easily imagine running out of 5TB of space…

kapitainsky · January 19, 2025, 8:51pm

It is your responsibility to plan accordingly, and unfortunately you also need to DIY.
If you expect that your max delta backup can be 200GB (worst case + some fat safety margin) then do not run a backup if you have less than 200GB free. In such a case send a notification and delete the oldest snapshot(s), either automatically or manually, until you have enough space. These are things which kopia could do but does not :) until somebody needs it badly enough to do the development.
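A minimal sketch of such a pre-backup check in Python, assuming a local filesystem repository. The function names and the way the result is reported are illustrative, not part of kopia; the 200GB figure is the example threshold from above. It only uses `shutil.disk_usage` from the standard library.

```python
import shutil

GB = 1000**3  # decimal gigabyte


def free_bytes(path):
    """Free bytes on the filesystem that holds path."""
    return shutil.disk_usage(path).free


def enough_space_for_backup(repo_path, expected_delta_bytes):
    """True if at least expected_delta_bytes are free where the repo lives."""
    return free_bytes(repo_path) >= expected_delta_bytes


def precheck_message(repo_path, expected_delta_bytes):
    """Human-readable verdict, e.g. for a notification."""
    free = free_bytes(repo_path)
    if free >= expected_delta_bytes:
        return f"OK: {free / GB:.1f} GB free, backup can run"
    return (
        f"STOP: only {free / GB:.1f} GB free, "
        f"need {expected_delta_bytes / GB:.1f} GB; delete old snapshots first"
    )


# "." stands in for the repository directory (hypothetical path).
print(precheck_message(".", 200 * GB))
```

A wrapper script could run this before the backup and skip the backup (and send the notification) when the verdict is STOP. Estimating the real delta is up to you, since deduplication makes it hard to predict.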

Also no filesystem performs well when filled up to the brim. Good practice is to keep 15-20% of free space to maintain disk performance and longevity.

dimejo · January 20, 2025, 8:58am

That would mean that you are roughly adding 1TB of NEW data within a few months. Is that true?

dimejo · January 20, 2025, 9:11am

I don’t recall any reports of a corrupted repository because of that. But Kopia can’t run any tasks when your disk is full and that’s when people start to manually delete files and hence corrupting the repository.

Best practice on a system with such tight disk space is to create an empty placeholder file of a few GB, that you can easily delete in case you need some space.
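One way to create such a placeholder, sketched in Python. The function name, path, and size are hypothetical. The data is written out explicitly rather than created as a sparse file, so the space is actually allocated on ordinary filesystems; on filesystems with transparent compression a file of zeros may occupy far less space than its size, in which case random data would be needed instead.

```python
import os


def create_placeholder(path, size_bytes, chunk_bytes=1024 * 1024):
    """Write size_bytes of zeros to path and return the resulting file size."""
    chunk = b"\0" * chunk_bytes
    remaining = size_bytes
    with open(path, "xb") as f:  # "x": refuse to overwrite an existing file
        while remaining > 0:
            n = min(chunk_bytes, remaining)
            f.write(chunk[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # make sure the blocks really hit the disk
    return os.path.getsize(path)


# Example (not executed here): reserve 5 GB next to the repository.
# create_placeholder("/mnt/backup/PLACEHOLDER_DELETE_ME", 5 * 1000**3)
```

When the disk fills up, deleting this one file gives kopia enough room to delete snapshots and run maintenance, without touching any repository files.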

Side note: Please don’t start a Github issue to ask questions. Github issues are meant for bug reports and there are already way too many open (and unanswered) issues. If you need an answer directly from a developer your best bet is the Slack channel.

larryc · January 22, 2025, 6:38pm

This is an excellent idea, thanks! It should be in the FAQ.

It is unfortunate that (as you wrote) “Kopia can’t run any tasks when your disk is full and that’s when people start to manually delete files and hence corrupting the repository.” This happened to me when I exceeded my quota on my cloud storage service. It was a major pain to get kopia running again. I do wish there was a way to engage a “break glass delete old snapshot(s)” that would be able to run in the event of no more disk space or quota.

tobiasBora · January 23, 2025, 8:21am

I think responsibility is shared: it is my responsibility to avoid making mistakes, and it is kopia’s responsibility to minimize the risk of data loss, including when errors occur (from hardware, but also from users): isn’t that the whole point of doing backups? Without errors (hardware/users), there would be no point in doing backups. And given how easily one can forget to check spare disk space (see e.g. @larryc’s experience), risking corruption of one’s whole backup due to such a common mistake is a huge risk. I’m even surprised I have to argue about it…

No, I think you misread my comment, I have maybe ~100GB of new/deleted data per month, for instance by adding/removing ISO images, RAW image files etc. So if my laptop drive is 1.4TB full, say, if my backup is 1.9TB full (e.g. it contains 5 snapshots, one per month), and if I do a backup after 1 month, it will fill the drive. I’d rather either have a warning “not enough free space” before doing the operation so that I can delete a few snapshots, or at least not corrupt the repository when it gets full or lock me with no other choice than corrupting my repository.

I see, that makes sense and is partially reassuring… and having an empty placeholder file is indeed a cool idea, thanks. Yet, it would be great to have this handled automatically by kopia (or at the bare minimum written in the doc & quickstart), as one can easily forget to add such a file (again, let’s try to prevent user mistakes).

Well, I do report a bug on GitHub: kopia’s inability to prevent common user mistakes leading to a corrupted repository.

Exactly the kind of things I’m asking, thanks for showing I’m not the only one finding this important. I was starting to question my own sanity

dimejo · January 23, 2025, 8:52am

I totally agree that such a built-in feature would be great. What you can do now is write a small script to check available disk space by using actions.

github.com/kopia/kopia

FR - Native way for kopia to recover from underlying storage running out of space

opened 07:33PM - 08 Feb 22 UTC

xxxliqu1dxxx

bug help wanted robustness

(Spoke with Jarek on slack and he recommended I log this feature request)

Problem

As of 0.10.4, when storage runs out of space, errors like this are thrown:

DEBUG got error cannot create temporary file: cannot create directory: unexpected error creating directory: mkdir f:\newkopia\xs4\_21: There is not enough space on the disk. when PutBlobInPath:f:\newkopia/xs4/_21/69ab5f009aefb557979503cedf8f16-s27e2e49f1c49d18a-c1.f (#1), sleeping for 150ms before retrying

Impact

When any command is triggered, either snapshot delete, maintenance, or anything really, it seems like Kopia expects some storage to be available on the repo to do temp files. Problem is, when there's no more storage, one would depend on snapshot delete --delete to actually "free" some space but it seems that there's no way for Kopia to natively recover from that situation and is unable to mark snapshots to be deleted and allow maintenance to runs to "clean things up", which would actually free some space.

Repro steps

Use external sd, i.e. 4TB external harddrive for this test
using KopiaUI or CLI, create repo on destination
create snapshots until out of space
reach out of space state

Expected result

I would expect to be able to delete a snapshot then run maintenance to "clean things up"

Actual result

Every command run from kopia CLI or UI is returning very similar errors as below, which was from maintenance run

ERROR error updating maintenance schedule: unable to complete PutBlobInPath:f:\newkopia/kopia.maintenance.f despite 10 retries, last error: cannot create temporary file: open f:\newkopia/kopia.maintenance.f.tmp.6b5e03a0660772e8: There is not enough space on the disk.

Conclusion

Therefore, there does not seem to be a native way to gracefully recover from this state.

The below is not a workaround by any means, but the "only way" I found to actually get things going again was to arbitrarily delete an actual file from the hdd, i.e. df000a557ce9d9967853879c814-s7b3fa8a5996b047e109.f, understanding that this "content" gets deleted "forever", allowing subsequent snapshots delete as well as subsequent maintenance to run and clean things up.

tobiasBora · January 23, 2025, 8:59am

I’m not sure I see how actions are helpful: is there a way to estimate beforehand how much space will be required by kopia to do the backup? Or is the action supposed to create the safety file?

And thanks for pointing to “FR - Native way for kopia to recover from underlying storage running out of space” (Issue #1738, kopia/kopia on GitHub); my report is then just a duplicate.

dimejo · January 23, 2025, 9:16am

Because of deduplication it is hard to tell beforehand how much space is actually required unless you are doing the whole splitting and hashing twice.

Create a safety file or check for space left on your hard disk. The script can do whatever you want it to do!

larryc · January 23, 2025, 2:36pm

I think this is a harsh statement. All systems have failure scenarios and should be designed to account for those. I concur with @tobiasBora who said “responsibility is shared”.

Would you drive a car that didn’t have bumpers, seat belts, airbags, etc? Why not? Aren’t you a safe driver? It’s your responsibility to drive safely.

I’m making an exaggerated point, which is, yes, a driver needs to be safe, but the car also needs to have safety features because accidents do happen. Just like hard drives or storage quotas fill up. And when that happens, Kopia should fail gracefully rather than crash and burn as it currently does.

kapitainsky · January 23, 2025, 4:16pm

I still believe that it is something kopia should not even try to implement. Not today at least. Or until somebody comes with working PR.

It is only easily workable for very specific use case like local storage (where free space calculation is easily available) and single kopia client accessing given repository. And such case is easily manageable outside kopia. General solution is not trivial and I think would require a lot of work including serious compromises.

With your car analogy, it does not make any sense IMO to spend time and money to equip every car with expensive gizmos to brake automatically when the driver tries to drive into an abyss because the road ends. Especially since it would make everybody pay for a few saved drivers.

Similarly with kopia. Dev resources on this project are very thin. I am sure we all can see it. There are many other things kopia could do better before committing resources to try to nanny users from committing some lazy mistakes.

So yes, in an ideal world it would be nice to have such kopia functionality, but in the real one it is not a good idea.

Sure there can be very different opinions on this subject.

In practical terms this is what I do trying to maintain repo size within my defined limits:

After every backup cycle (backup, maintenance) I check the repository size, and if it is above the defined limit I forget the last snapshot. Repo size can go above the defined limit temporarily, but longer term it stays below what I want. I know more or less what I can expect in terms of my delta snapshots’ sizes and I maintain enough free space to accommodate it.
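The size check in that routine could look roughly like this in Python, assuming a repository stored in a local directory. The function names and the limit are hypothetical, and the step of actually forgetting a snapshot is deliberately left out, since which snapshot to drop and how is a decision for the operator.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # file vanished between listing and stat
    return total


def should_forget_snapshot(repo_root, limit_bytes):
    """True if the repository directory is larger than the chosen limit."""
    return directory_size_bytes(repo_root) > limit_bytes


# Example (hypothetical path and limit):
# if should_forget_snapshot("/mnt/backup/kopia-repo", 1500 * 1000**3):
#     print("repository above limit: forget a snapshot, then run maintenance")
```

Because space is only reclaimed after maintenance has run with its safety margins, the measured size can stay above the limit for a while after a snapshot is forgotten, which matches the "temporarily above the limit" behaviour described above.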

larryc · January 23, 2025, 4:56pm

Guess we’ll have to agree to disagree on this. I’m not suggesting that the Kopia devs implement “expensive gizmos”. What I am asking for is simply that when a repository runs out of space, there is some way to allow a user to manually back out of it (e.g., as I put it earlier, a “break glass delete the X most recent snapshots”), versus the current implementation where Kopia will crash and burn, and it’s up to the user to solve the problem by randomly deleting files and then crossing fingers hoping maintenance runs will be able to repair things.

You’ve previously mentioned a number of manual care-taking steps that you perform against your kopia backups, including this above example. Shouldn’t the system be working for you, rather than the other way around?

kapitainsky · January 23, 2025, 7:48pm

I looked at other open source backup programs - restic, rustic, borg. None provides such functionality. They all assume that free space is “unlimited” and leave this aspect for user to manage using other means.

Of course it does not prove anything but shows cloud backup software design mindset.

Such backup programs often are only one piece of the bigger puzzle and building full solution requires additional extra parts and some DIY. And for good or bad some knowledge of used OS and other tools.

I can see here UNIX philosophy influence - create program which does one thing but does it well. To do a new job, build afresh rather than complicate old programs by adding new “features”. And Keep it Simple, Stupid (KISS). I am big fan of such approach myself. I do worry when I see software trying to reinvent a wheel (or do everything for end user) as it often means that there is less focus on core functionality. It is especially true for open source arena.

With ‘kopia’ I think that the existence of a GUI gives the false impression that it is a full solution. Its GUI actually has serious limitations compared to the command-line version, but it lowers the barrier for basic usage.
