Kopia appropriate for simple backup of media files?

I currently use only rsync for all my data, 90% of which is media files. It’s simple, but: 1) my drives are formatted ext4 on LUKS for encryption (I’m mostly on Linux); 2) a renamed source file gets treated as a new file and is copied over again (a literal waste of time).

I’m interested in kopia because borg doesn’t support multithreading. I see people talk about using kopia for more complex needs, so I was wondering if kopia would still make sense for my simple use case of backing up media files.

  • If I understand correctly: 1) I will benefit from builtin encryption (i.e. my filesystem doesn’t need to be encrypted and perhaps builtin encryption has less overhead?); 2) renaming source file won’t result in any writes to backups because it smartly handles this?

  • Are there any other features I might benefit from using kopia? Deduplication and snapshots would only be useful for text files and not media files or VM images (both are types of binary files), right? I mostly have external drives, half of which contain my data while the other half contain an exact mirror. I also have an NFS server that I back up similarly.

  • Are there any reasons why kopia might not be suitable? I need to be able to play videos, but I can just mount the entire repo (interact with it like a regular local filesystem) and play them just the same (and expect the same performance) as if I were playing them from a decrypted file, right? I’m not sure whether both the original data and the mirror should be Kopia repos, or whether the source should stay as it is now (data directly on an encrypted filesystem) with only the backup being a Kopia repo (technically this would guard against potential bugs in Kopia), or whether there are any benefits to both being Kopia repo mirrors. I also need a list of the filenames and the tree structure of these files saved locally (to know what media files I’ve downloaded from the web). I guess saving the tree output of the mounted Kopia repo would be the best approach? Actually, I use the fsearch file indexer, so perhaps it would be better to create empty placeholder files on the local system that replicate the file hierarchy, so tools like locate and fsearch can see these filenames for reference as local files.
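For the placeholder idea, a minimal sketch in Python might look like the following. It assumes the files are reachable under some directory (the original drive or a mounted snapshot); `make_placeholders`, `source_root` and `index_root` are made-up names for illustration, not part of Kopia or fsearch.

```python
import os
from pathlib import Path


def make_placeholders(source_root: str, index_root: str) -> int:
    """Mirror the tree under source_root into index_root using
    zero-byte files, and return how many placeholders exist for it."""
    src = Path(source_root)
    dst = Path(index_root)
    count = 0
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = Path(dirpath).relative_to(src)
        # Recreate the directory itself so empty folders are indexed too.
        (dst / rel).mkdir(parents=True, exist_ok=True)
        for name in filenames:
            # touch() creates an empty file; no media data is copied.
            (dst / rel / name).touch(exist_ok=True)
            count += 1
    return count
```

Pointing locate or fsearch at `index_root` would then show the filenames without the media taking up local space. Note that the sketch never deletes stale placeholders, so files removed from the source stay listed until the index directory is rebuilt.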

Welcome @rieje :waving_hand:

Correct. I can’t comment on the overhead part as I’m not too familiar with it.

Correct.

Generally correct. Deduplication might work on some VM images as long as they are neither encrypted nor compressed.

Mounting snapshots is a really great feature but you should not expect the same performance as raw files. And you might run into issues with video files as mounting has limits on file size AFAIK. Best is to test this out yourself.

If you like the rsync approach and want to keep it simple, rsnapshot might be a better fit for you. It uses rsync under the hood and creates snapshots with hard links. A file will only be saved once as long as it hasn’t changed (or been renamed).
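To make the hard-link idea concrete, here is a simplified Python model of it. This is not rsnapshot’s actual implementation: `snapshot` and `_unchanged` are hypothetical helpers, and "unchanged" is assumed to mean same relative path, size and modification time.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def _unchanged(a: Path, b: Path) -> bool:
    """Assumed change test: same size and same mtime (whole seconds)."""
    sa, sb = a.stat(), b.stat()
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def snapshot(source: Path, dest: Path, previous: Optional[Path] = None) -> dict:
    """Create a snapshot of source in dest. Files that match the same
    path in the previous snapshot are hard-linked instead of copied."""
    stats = {"linked": 0, "copied": 0}
    for dirpath, _dirs, files in os.walk(source):
        rel = Path(dirpath).relative_to(source)
        (dest / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            src_file = Path(dirpath) / name
            new_file = dest / rel / name
            old_file = previous / rel / name if previous is not None else None
            if old_file is not None and old_file.is_file() and _unchanged(src_file, old_file):
                os.link(old_file, new_file)  # second name, same data on disk
                stats["linked"] += 1
            else:
                shutil.copy2(src_file, new_file)  # copy2 keeps the mtime
                stats["copied"] += 1
    return stats
```

Because the lookup is by path, a renamed file has no match in the previous snapshot and is copied in full again, which is the same limitation as with plain rsync.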

  1. You should use local encryption for the security of your data, in case a drive gets stolen or similar. This doesn’t change just because kopia also encrypts the data at rest on the remote site. A machine can perfectly well have encrypted filesystems that you unlock at login; the backup client then reads the now-decrypted files and sends them off to a backup system, which may or may not encrypt them again in turn. CPU is cheap these days, so don’t loosen security unless forced to.
  2. It will detect that the contents are 100% identical. It will take a bit of checksumming the first time after the rename, but almost no data will be sent to the kopia repo, only information along the lines of “btw, this file B has exactly the same contents as file A had yesterday. Also, file A is now gone”. The same goes if you “accidentally” copy data to multiple places: there will be a few more references to the same data, but you will not be sending it over the wire multiple times.
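The rename case in point 2 can be illustrated with a toy content-addressed store in Python. This is only a sketch of the general idea, not Kopia’s real data format: `ContentStore` and `take_snapshot` are invented names, and whole files are hashed for simplicity.

```python
import hashlib
from typing import Dict


class ContentStore:
    """Toy repository: data is stored under the hash of its contents."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}
        self.bytes_written = 0  # how much data was actually "uploaded"

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            # Only never-seen content costs storage and transfer.
            self.blobs[digest] = data
            self.bytes_written += len(data)
        return digest


def take_snapshot(store: ContentStore, files: Dict[str, bytes]) -> Dict[str, str]:
    """A snapshot is just a mapping from path to content hash."""
    return {path: store.put(data) for path, data in files.items()}
```

Snapshotting `{"A.mkv": data}` and later `{"B.mkv": data}` leaves `bytes_written` unchanged after the second run; only the small path-to-hash mapping differs, which is the “file B has the same contents as file A” note described above.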

There are certainly times when dedup works for you on binary files too. It’s not super likely, but if you have an ISO file with 32MB of zeroes at the end and a database file with 32MB of zeroes in it for any reason, those two will dedup against each other (and compress really well). The same goes for VM images: if you install from a template and run a few commands here and there, changing only a little bit of data on each instance, then kopia would probably be able to find spots where the data is still identical.
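A small Python sketch of why the zero-filled regions dedup across unrelated files. It assumes fixed-size chunks to keep things short; Kopia uses its own splitter, so chunk boundaries will differ in practice. `CHUNK_SIZE`, `chunk_hashes` and `stored_chunks` are illustrative names.

```python
import hashlib
from typing import List

CHUNK_SIZE = 4096  # illustrative value, far smaller than real-world chunks


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Split data into fixed-size chunks and hash each chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def stored_chunks(*files: bytes) -> int:
    """Number of distinct chunks a deduplicating store would keep."""
    unique = set()
    for data in files:
        unique.update(chunk_hashes(data))
    return len(unique)


# Two unrelated "files" that both happen to contain a long run of zeroes.
iso = bytes(range(256)) * 40 + bytes(32 * 1024)
db = b"\x07" * 5000 + bytes(32 * 1024) + b"\x09" * 3000

total = len(chunk_hashes(iso)) + len(chunk_hashes(db))
print(f"chunks referenced: {total}, chunks stored: {stored_chunks(iso, db)}")
```

Every chunk that falls entirely inside a zero run hashes to the same value, so it is stored once no matter which file it came from. Media files rarely contain such repeated regions, which is why dedup within them is uncommon.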