Snapshot suitable for media files? Space issues

I’ve been struggling to use Kopia to back up media files to cold storage. I keep running out of space even though the source and destination disks both have over 100 GB free, retention is limited to a single snapshot, and the only changes after the initial snapshot are file deletions (no modified or newly added files). I can’t tell what is using this extra 100+ GB.

I’ve been asking around elsewhere but have had only one reply: someone who uses Restic had a similar issue, which turned out to be metadata-related and was fixed by limiting to one snapshot. I’ve already tried that.

I’m wondering whether Kopia can handle snapshots of media files, and I’m also curious whether it has an equivalent of rsync --delete-before, to avoid the destination disk filling up during the mirroring process (not relevant to my particular issue, since my examples only involved file deletions in the subsequent snapshot).

The reason I’m trying to use Kopia with 1 snapshot to mirror a media disk to cold storage is that rsync cannot handle file renames, and Kopia (like other snapshot software) has useful features such as deduplication, built-in encryption and snapshots that may be more appropriate than their filesystem counterparts in terms of performance and efficiency. For example, on Btrfs it’s not possible to resume an interrupted btrfs send/btrfs receive without quirky workarounds, meaning backups are not resumable (you lose all progress and must start again if you need to shut down your system or disconnect your disk for whatever reason).

Can I still use Kopia for my needs? What might be causing the high disk usage I’m seeing? I might be able to keep more than one snapshot if I have lots of space available, but I need a better idea of how the space is used and roughly how much is required, so that a snapshot doesn’t fail for lack of disk space. I would have thought this would be straightforward with only 1 snapshot and only file deletions involved, but somehow the second snapshot always fails. My intuition is that, given this, snapshotting a second time should actually use less disk space, since no diff is kept.

  1. If by cold storage you mean something like AWS Glacier, then you are out of luck. As per the docs:

Kopia does not currently support cloud storage that provides delayed access to your files – namely, archive storage such as Amazon Glacier Deep Archive. Do not try it; things will break.

Neither does restic (support is a work in progress), but rustic (a port of restic to Rust) does.

  2. During a backup, Kopia does not delete anything, so in terms of space used it is always an incremental operation. Data that is no longer needed is only deleted during subsequent maintenance runs. In addition, by default this is a rather slow process: it can take even a few days, given the default of one full maintenance run per day. And here too the process is initially incremental: if some data needs repacking, that requires extra space. Deletion is always the last step; this is by design (as with restic), to keep the repo safe in terms of integrity.

This means that to use Kopia you always need some free space. The exact amount varies and depends on your specific data.
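
To make the space accounting concrete, here is a toy Python model. It is not Kopia's code: the `ToyRepo` class, the file names, the sizes in GB and the single-step `maintenance()` are all assumptions made up for illustration. It only captures the two points above: a snapshot only ever adds data, and space is released by a later maintenance pass.

```python
class ToyRepo:
    """Toy append-only repository (hypothetical, not Kopia's implementation)."""

    def __init__(self):
        self.stored = {}         # content id -> size in GB, everything on disk
        self.referenced = set()  # content ids used by the latest (only) snapshot

    def snapshot(self, files):
        """Record a snapshot of `files` (content id -> size). Never deletes."""
        for content_id, size in files.items():
            # Only content that is not already stored gets written.
            self.stored.setdefault(content_id, size)
        self.referenced = set(files)

    def maintenance(self):
        """Drop content that no snapshot references; return GB freed."""
        dead = [cid for cid in self.stored if cid not in self.referenced]
        return sum(self.stored.pop(cid) for cid in dead)

    def disk_usage(self):
        return sum(self.stored.values())


repo = ToyRepo()
source = {"movie_a": 40, "movie_b": 30, "movie_c": 30}

repo.snapshot(source)
usage_after_first = repo.disk_usage()

del source["movie_c"]                    # the only change is a deletion
repo.snapshot(source)
usage_after_second = repo.disk_usage()   # unchanged: nothing is removed yet

freed = repo.maintenance()
usage_after_maintenance = repo.disk_usage()

print(usage_after_first, usage_after_second, freed, usage_after_maintenance)
```

In this model the second snapshot does not shrink the repository at all; the 30 GB only comes back once maintenance has run, which matches the "deletion is always the last step" behaviour described above.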

Well… it’s not so much the type of files stored as how much the contents of those files change. So a folder holding a collection of movie files will not be a problem in itself, because the contents of those files won’t change. However… Kopia does perform deduplication, which means that any file bigger than the chunk size (4 MB by default) gets split into chunks, which are then used for deduplication, and deduplication is applied repo-wide.
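
As a rough illustration of chunk-level deduplication, the sketch below splits files into chunks, hashes each chunk, and stores every distinct chunk once. It is a simplification under stated assumptions: it uses fixed-size chunks and SHA-256, with a 4-byte chunk standing in for the 4 MB mentioned above, and the function names are hypothetical. Kopia's actual splitter and hashing are not modelled here.

```python
import hashlib


def split_chunks(data: bytes, chunk_size: int):
    """Cut data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def dedup_stats(files, chunk_size):
    """Return (logical bytes, stored bytes) for repo-wide chunk dedup."""
    unique = {}   # chunk hash -> chunk length, one entry per distinct chunk
    logical = 0
    for data in files.values():
        for chunk in split_chunks(data, chunk_size):
            logical += len(chunk)
            unique.setdefault(hashlib.sha256(chunk).hexdigest(), len(chunk))
    return logical, sum(unique.values())


CHUNK = 4  # bytes; a tiny stand-in for a multi-megabyte chunk size
files = {
    "a.bin": b"AAAABBBBCCCC",
    "b.bin": b"AAAABBBBDDDD",  # shares its first two chunks with a.bin
}

logical, stored = dedup_stats(files, CHUNK)
ratio = logical / stored
print(logical, stored, ratio)
```

Here 24 logical bytes are stored as 16, a ratio of 1.5. Already-compressed media rarely share chunks between files, so for a movie collection the ratio would be expected to sit close to 1.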

If you’re backing up other files as well, there could be overlapping/matching chunks, and even if your movie files don’t change, if the others do, Kopia will have to re-write some of the blobs, each of which holds multiple chunks. This can cause a significant decrease in free space on your volume, because Kopia has to write/re-write blobs first, before it can dispose of the old ones. Old blobs will be discarded as per the default maintenance policy/hold time.
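
The temporary space cost of re-writing a blob can be sketched as follows. This is a toy calculation, not Kopia's repacking logic: the chunk ids, the sizes in MB and the `repack` helper are assumptions for illustration only.

```python
def repack(blob, live_chunks):
    """Return a new blob holding only the chunks that are still referenced."""
    return {cid: size for cid, size in blob.items() if cid in live_chunks}


def blob_size(blob):
    return sum(blob.values())


# One blob packing five chunks of 4 MB each; two are no longer referenced.
old_blob = {"c1": 4, "c2": 4, "c3": 4, "c4": 4, "c5": 4}
live = {"c1", "c2", "c5"}

new_blob = repack(old_blob, live)

before = blob_size(old_blob)
# The new blob is written first; the old one is only deleted afterwards,
# so for a while both copies occupy space.
peak = blob_size(old_blob) + blob_size(new_blob)
after = blob_size(new_blob)

print(before, peak, after)
```

The blob ends up smaller (12 MB instead of 20 MB), but the volume briefly has to hold 32 MB, which is why a repository can need free space even when the net effect is to shrink.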

I’d be curious to see what the dedup ratio is in your case…