Miscellaneous questions

  • I’m using kopia for media files. Does it still make sense with e.g. the default policy rules? The drives are nearly 90% full and I can’t see why I need more than 1 snapshot (--keep-annual 0 --keep-daily 0 --keep-hourly 0 --keep-monthly 0 --keep-weekly 0 --keep-latest 1) since I only care about the latest mirror backup, which would mean (as I understand it) that I won’t be able to restore from more than one point in time. But I still get deduplication within that 1 snapshot, right? Or perhaps I should allow more than 1 snapshot, but consider deleting the oldest snapshot depending on the “diff of the source and the latest snapshot”.

  • How to “diff the source and the latest snapshot” for e.g. a list of changed files and some metadata like size differences? Compare the output of kopia snapshot list with some Linux command that generates similar output? I would also like some sort of summary, like approximately how big the size difference between the source and the most recent snapshot is, so that e.g. I know whether I even have enough space to snapshot (otherwise I need to delete the oldest snapshots to free up space). Diffs between existing snapshots can be done with kopia diff.

  • Can you rename a repository, e.g. by simply renaming the directory specified by --path (for a repo hosted locally on the filesystem), or is that path referenced by Kopia so that renaming it could confuse Kopia?

  • How to check whether I’m already connected to a repository? I work with multiple repositories at the same time and want to e.g. add a shell prompt that makes it clear which repository a given shell is connected to. EDIT: I had one repository snapshotting in a terminal instance, then opened another terminal instance and started snapshotting another repository. After the second repository was done, I closed that terminal instance and paused snapshotting the first repository. On resume, it said it was unable to find the second repository, which I had already finished with and unmounted from the system. So can Kopia only work with one repository at a time, or is there a way to keep managing the repos separately via the CLI? My intuition was that a Kopia connection is tied to the shell instance it was made in, so it was surprising when resuming the snapshot of the first repo tried to use the second repo instead (because it was connected later, while the first repo was still snapshotting).

  • Are there any optimizations for repos containing video/image files of up to 15GB on 2-20TB drives, like with --object-splitter (which I assume would affect disk fragmentation and deduplication in terms of both speed and size efficiency)? And for repos containing exclusively text and database files?

  • I ran kopia snapshot create ... without connecting to a repository, and kopia didn’t complain while it was snapshotting. How can it snapshot without being connected to a repository, and where is the data stored? I cancelled the snapshot.

  • I attempted to play a video from a mounted repo while another repo was snapshotting. I then saw the other repo reporting write error: write ~/.cache/kopia/content-logs/...-createshot-create.1.log: no space left on device, and it only resumed snapshotting after I closed the video. It looks like playing the video involves caching a lot of data to ~/.cache/kopia? If this is not a suitable use of a mounted repo, is mounting the repo only for restoring files, limited to copying the mounted files to a destination?

  • kopia snapshot verify does not check for bitrot and the like, as the docs say (there’s --verify-files-percent for that). If there is bitrot, does that prevent the entire repo from being restored? I assume it doesn’t and only affects the files associated with the blocks hit by the bitrot. If such files are media/text files, their content is typically still viewable to humans but maybe missing e.g. video frames or text, right?

Much appreciated.

Hi there @enoryw !

  1. If you’re storing media, you should see a very high benefit from having many snapshots, especially if the media library only sees new additions and no deletions. All chunks of media (roughly 20-30 MB) that are already stored in the repository will not get stored again. If you’re only adding media to your library, you can keep an unlimited number of snapshots and the repository will remain (roughly) the same size as your library.
  2. As far as I know, Kopia has no built-in way to diff the live source against a snapshot at the moment.
  3. I’m not sure I understand your question. Can you please clarify? A repository is generally portable; you can move it to a new location or S3 host without issue.
  4. You can run kopia snapshot list to see if you’re connected. Yes, Kopia generally only works with one repository at a time. There is a mechanism to copy to another repository.
  5. What optimizations are you looking for? Large files are split into smaller chunks and small files are packed into larger chunks. The file size for all storage data is between 20 and 30 MB.
  6. Kopia needs to snapshot to a repository. Can you please give more details?
  7. All the file data will need to be downloaded from the repository. Mounting a repo is generally only for restoring files, yes.
  8. If there is an issue with one of the files, only that portion will fail to restore. The damage is limited to that 20-30 MB chunk.
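
Regarding #2, a rough workaround outside Kopia is to save a manifest of the source tree whenever you snapshot, then compare the live tree against it before the next run. The sketch below is not a Kopia feature: `manifest` and `diff_against_manifest` are made-up helper names, it assumes bash, awk and GNU find (for -printf), it assumes file names without tabs or newlines, and it only notices size changes, so an edit that keeps the size identical is missed.

```shell
#!/usr/bin/env bash
# Sketch only: compare a source tree against a manifest saved at snapshot time.

# Print "relative/path<TAB>size-in-bytes" for every regular file, sorted by path.
# Assumes GNU find; -printf is not available in BSD/macOS find.
manifest() {
  find "$1" -type f -printf '%P\t%s\n' | LC_ALL=C sort
}

# Usage: diff_against_manifest OLD_MANIFEST_FILE SOURCE_DIR
# Prints A (added), M (size changed) and D (deleted) lines, then a summary.
diff_against_manifest() {
  local old="$1" src="$2"
  manifest "$src" | awk -F'\t' -v old="$old" '
    BEGIN {
      # Load the old manifest into prev[path] = size.
      while ((getline line < old) > 0) { split(line, f, "\t"); prev[f[1]] = f[2] }
    }
    {
      if (!($1 in prev))       { print "A\t" $1; changed += $2 }
      else if (prev[$1] != $2) { print "M\t" $1; changed += $2 }
      seen[$1] = 1
    }
    END {
      for (p in prev) if (!(p in seen)) { print "D\t" p; deleted += prev[p] }
      printf "new or changed: %.0f bytes, deleted: %.0f bytes\n", changed, deleted
    }'
}
```

Usage would be `manifest ~/Videos > ~/videos.tsv` right after a snapshot, then `diff_against_manifest ~/videos.tsv ~/Videos` before the next one (both paths are placeholders). The "new or changed" figure is only a rough ceiling on how much new data the next snapshot might add, since deduplication can reduce it, and deleted files do not free repository space while an older snapshot still references them.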

This (#4) resonated with me:

^^ I found this behavior very confusing when starting out with Kopia. With repository connections, it’s like a Notepad editor that doesn’t confirm overwrites on save, or even mention “this command that just ran replaced your old repository config”.
It was frustrating to go through creating new cloud storage API keys to connect again. (I wrote down the repository encryption key, but didn’t write down the API keys because those are generally for one-time entry. For Backblaze B2, the instructions recommend making a scoped key, which requires a main key, so that’s 2 keys to regenerate.)

To do experiments on connecting to repos (without interfering with my main setup), I started using export HOME=./fake-home which works. Just surprising from the outset.

I’m not sure what would’ve helped with discovery. Maybe a prompt (“replace your current connection? [Y/n]”) or requiring the user to run something like “kopia disconnect” before making a new connection. It seems like there could be something to make this less of a foot-gun for new users.
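
Until something like that exists, a small shell wrapper can approximate the confirmation prompt. This is a sketch only: `kopia_connect` and `confirm_replace_connection` are hypothetical names, the config path is the Linux default (it differs on macOS and Windows), and I made the default answer "no" rather than "yes" so that pressing Enter keeps the existing connection.

```shell
#!/usr/bin/env bash
# Sketch only: ask before `kopia repository connect` replaces an existing connection.

# Assumed config location: KOPIA_CONFIG_PATH if set, else the Linux default.
KOPIA_DEFAULT_CONFIG="${KOPIA_CONFIG_PATH:-$HOME/.config/kopia/repository.config}"

# Return 0 if it is fine to proceed, 1 if the user declined.
confirm_replace_connection() {
  local config="$1" answer
  [ -e "$config" ] || return 0   # no existing connection, nothing to overwrite
  printf 'Replace the current connection in %s? [y/N] ' "$config" >&2
  read -r answer
  case "$answer" in
    [Yy]*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Hypothetical wrapper; arguments are passed through unchanged, e.g.
#   kopia_connect filesystem --path /mnt/backup/media
kopia_connect() {
  confirm_replace_connection "$KOPIA_DEFAULT_CONFIG" || { echo 'Aborted.' >&2; return 1; }
  kopia repository connect "$@"
}
```

The wrapper only guards connections made through `kopia_connect`; running `kopia repository connect` directly still replaces the config without asking.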

I think what confused me the most was that it looked like you can specify a config file. I had assumed that multiple “connect” commands would generate multiple config files, with instructions on how to use the non-default connections for each kopia invocation.
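
For what it’s worth, that workflow can be built by hand with the global --config-file flag (or the KOPIA_CONFIG_PATH environment variable), giving each repository its own config file. Below is a sketch of how that might look in bash. The helper names, the KOPIA_CONFIG_DIR variable, the repository names and the paths in the comments are all made up for illustration, and the prompt label is just the name you passed in, not something read back from Kopia (kopia repository status is the way to check what a config really points at).

```shell
#!/usr/bin/env bash
# Sketch only: one config file per repository, selected by a short name.

# Hypothetical layout: <dir>/<name>.config for each repository.
KOPIA_CONFIG_DIR="${KOPIA_CONFIG_DIR:-$HOME/.config/kopia}"

# Map a user-chosen repository name to its config file path.
kopia_config_for() {
  printf '%s/%s.config\n' "$KOPIA_CONFIG_DIR" "$1"
}

# Run any kopia command against the named repository, e.g.
#   kopia_repo media repository connect filesystem --path /mnt/backup/media
#   kopia_repo media snapshot list
kopia_repo() {
  local name="$1"
  shift
  kopia --config-file "$(kopia_config_for "$name")" "$@"
}

# Pin the current shell to one repository and show its name in the prompt.
kopia_use() {
  : "${KOPIA_PS1_ORIG=${PS1-}}"   # remember the original prompt on first use
  export KOPIA_CONFIG_PATH
  KOPIA_CONFIG_PATH="$(kopia_config_for "$1")"
  PS1="[kopia:$1] $KOPIA_PS1_ORIG"
}
```

With this, two terminals can each run `kopia_use` with a different name and keep their connections separate, because each shell’s plain `kopia` invocations pick up that shell’s own KOPIA_CONFIG_PATH.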