Kopia backup strategy and file corruption

Hello everyone…
I’m new to Kopia, but it seems very interesting and promising to me.

I’d like to better understand whether Kopia’s backup strategy is risky or not… I’m puzzled by the fact that Kopia (if I’ve understood correctly) never writes a file to the backup again if the file hasn’t been modified.

I’ve read this topic:

https://kopia.discourse.group/t/very-newbie-questions/279

where Budy stated: "… Incremental means, that the source files get scanned for changes and files which have been changed or have not yet been saved, get saved/backed up. A full backup will always read/process any file present, regardless of it’s state. And this is exactly how Kopia operates. "

If I understand correctly, he means that Kopia always does full backups.
However, in a “traditional” full backup, all files are written to the backup again in full, regardless of whether they have changed. That way, if a file already in the backup was corrupted (for any reason), the new backup rewrites it correctly.
So by doing a full backup every month (for example), I have less risk of ending up with corrupted files in my backup set.
With Kopia’s strategy, instead, if a file was created 10 years ago and has never been changed, and at some point it gets corrupted for some reason, it’s lost.

Can you help me to understand if I’m wrong or not?

Many thanks in advance.
Denis

Kopia (as well as restic, rustic, borg…) doesn’t keep files in a repository (not a “backup”) as exact copies of the files. It splits every file into small chunks and tracks the content separately from file names and metadata, which is why it can do deduplication. Even if there are multiple files with different names and timestamps, it keeps only one copy of the content plus references to the original file names, timestamps and locations. It isn’t a traditional “full” or “incremental” backup, because it is always full and incremental at the same time. If you have worked with rsync, you know it can create hard links to existing files instead of copying the full content of unchanged files again. Kopia does conceptually much the same thing, but it operates on the file’s content split into blocks and tracks references to those blocks. That’s why Kopia uses the term snapshot rather than backup. You get a snapshot in time that can be restored, while the repository stays small because it is always incremental, deduplicated and compressed. Files (their blocks) are removed from the repository based on the retention policy, if needed.

This is an oversimplified explanation, just a quick overview of the concept.
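
To make the chunk-and-reference idea a bit more concrete, here is a minimal Python sketch. It is not Kopia’s code: the `ChunkStore` class, the 4-byte `CHUNK_SIZE` and the SHA-256 keys are all assumptions made for illustration, and real tools split, compress and encrypt content in their own ways. It only shows why a second snapshot of unchanged files adds nothing new to the repository.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed size for the demo (an assumption; real tools differ)


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept once, keyed by its hash."""

    def __init__(self):
        self.chunks = {}     # chunk hash -> chunk bytes
        self.snapshots = []  # one manifest per snapshot: {file name: [chunk hashes]}

    def _put(self, chunk: bytes) -> str:
        key = hashlib.sha256(chunk).hexdigest()
        # Store the chunk only if this content has never been seen before.
        self.chunks.setdefault(key, chunk)
        return key

    def snapshot(self, files: dict) -> dict:
        """Read every file (like a full backup), but store only unseen chunks."""
        manifest = {}
        for name, data in files.items():
            manifest[name] = [
                self._put(data[i:i + CHUNK_SIZE])
                for i in range(0, len(data), CHUNK_SIZE)
            ]
        self.snapshots.append(manifest)
        return manifest

    def restore(self, manifest: dict, name: str) -> bytes:
        """Rebuild a file by following its chunk references."""
        return b"".join(self.chunks[key] for key in manifest[name])


store = ChunkStore()

# Two files with identical content share the same two chunks.
first = store.snapshot({"a.txt": b"AAAABBBB", "copy_of_a.txt": b"AAAABBBB"})
chunks_after_first = len(store.chunks)

# The next snapshot re-reads everything, but only the one new chunk is stored.
second = store.snapshot({
    "a.txt": b"AAAABBBB",
    "copy_of_a.txt": b"AAAABBBB",
    "b.txt": b"BBBBCCCC",
})
chunks_after_second = len(store.chunks)
```

Each manifest is a complete description of all files at that point in time, which is the “full” part, while only previously unseen chunks are written, which is the “incremental” part.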

if for some reason at a certain moment it got corrupted, it’s lost.

No. Kopia has commands that you should run against the repository to check its integrity and health, to make sure it isn’t corrupted. In particular, Kopia can also use error correction, so it can fix small amounts of bit rot in the repository on its own, but obviously not major damage. If you are worried about backup integrity, you should maintain multiple repositories instead of putting all your eggs in one basket (a single repository on a single external hard drive). A good backup that satisfies the 3-2-1 backup rule should have at least 2 copies of the repository, geographically distributed: one local copy and one offsite. Kopia also has a mechanism to keep repositories in sync without re-scanning, re-compressing and re-encrypting the snapshots for each repository. This puts no extra load on the machine being backed up, and it doesn’t waste network traffic, because it syncs only the changes between repositories when they are stored on different media.
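
As a rough illustration of why a file that never changes is not silently lost, here is a small Python sketch of the two ideas above: checking stored content against its hash, and keeping a second copy that receives only what it is missing. Again, this is not how Kopia is implemented. The dict-based repositories and the `verify`, `sync_missing` and `repair` helpers are hypothetical names invented for this example, and Kopia’s real verification and sync commands are not reproduced here.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Content address of a blob (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def verify(repo: dict) -> list:
    """Return the IDs of blobs whose content no longer matches their ID."""
    return [key for key, data in repo.items() if blob_id(data) != key]


def sync_missing(source: dict, target: dict) -> int:
    """Copy only blobs the target lacks; return how many were copied."""
    copied = 0
    for key, data in source.items():
        if key not in target:
            target[key] = data
            copied += 1
    return copied


def repair(damaged_repo: dict, healthy_repo: dict) -> list:
    """Replace corrupted blobs with verified copies from a second repository."""
    repaired = []
    for key in verify(damaged_repo):
        candidate = healthy_repo.get(key)
        if candidate is not None and blob_id(candidate) == key:
            damaged_repo[key] = candidate
            repaired.append(key)
    return repaired


old_content = b"old file written 10 years ago"
local = {blob_id(b): b for b in (old_content, b"recent file")}
offsite = {}

copied_first = sync_missing(local, offsite)  # the empty copy needs everything
copied_again = sync_missing(local, offsite)  # nothing changed, nothing to copy

# Simulate bit rot in the local repository.
old_key = blob_id(old_content)
local[old_key] = b"old file written 10 yeXrs ago"

corrupted = verify(local)          # the damage is detected by the hash mismatch
repaired = repair(local, offsite)  # and fixed from the second copy
still_corrupted = verify(local)
```

Because every blob is identified by the hash of its content, corruption is detectable long before a restore is needed, and a second repository gives you something to repair from.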

Many thanks for your kind and quick answer!
Denis