Multiple connect

Let’s say I connect to a kopia repository R1 and start a snapshot. Until I disconnect, the files
repository.config and repository.config.kopia-password in ~/.config/kopia belong to R1.

If I then run another kopia connect to a different repository R2 and take a snapshot, disconnecting from R1 will fail, because the repository.config and repository.config.kopia-password files now belong to R2.

What is the best way to handle this?

I have now found the --config-file parameter for kopia repository connect, which should do what I want.
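A minimal sketch of how --config-file could keep R1 and R2 apart. The helper names config_for and kopia_for and the file layout under ~/.config/kopia are my own assumptions for illustration, not something kopia prescribes:

```shell
#!/bin/sh
# Hypothetical helper: map a repository label (R1, R2, ...) to its own config file.
config_for() {
    printf '%s/kopia/%s.config\n' "${XDG_CONFIG_HOME:-$HOME/.config}" "$1"
}

# Hypothetical wrapper: run any kopia subcommand against the named repository
# by passing that repository's config file explicitly.
kopia_for() {
    repo=$1
    shift
    kopia --config-file "$(config_for "$repo")" "$@"
}
```

With this, something like `kopia_for R1 snapshot create /home` and `kopia_for R2 repository disconnect` would each touch only their own config file.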

While you are connected to repository R1, run:

echo '#!/bin/sh' > reconnect-to_R1.sh
kopia repo status -t -s | grep -E '^\$[[:space:]]+kopia' | sed 's/^\$[[:space:]]*//' >> reconnect-to_R1.sh
chmod 750 reconnect-to_R1.sh

You might want to do the same while connected to R2.

Hm, I am not sure this helps.

Let’s say a backup to repository R1 is running when a cron job wants to run a backup to repository R2. Then your example would give me a token for repository R1, which isn’t really helpful, as the cron job wants to connect to R2. But of course it could be that I didn’t understand your example correctly.

What I do is use a randomized config file name when running a backup. That way I am sure I always back up to the repository I want.
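Roughly, the randomized config file idea could look like the sketch below. It assumes a filesystem repository and that the password is supplied through the KOPIA_PASSWORD environment variable; the function name backup_once is hypothetical:

```shell
#!/bin/sh
# Sketch: one throwaway config file per backup run.
# Assumes a filesystem repository and KOPIA_PASSWORD set in the environment.
backup_once() {
    repo_path=$1
    source_dir=$2
    tmpdir=$(mktemp -d) || return 1
    cfg="$tmpdir/repository.config"

    # Connect and snapshot using only this run's private config file.
    kopia --config-file "$cfg" repository connect filesystem --path "$repo_path" &&
        kopia --config-file "$cfg" snapshot create "$source_dir"
    status=$?

    # Disconnect and remove the temporary directory even if the snapshot failed.
    kopia --config-file "$cfg" repository disconnect
    rm -rf "$tmpdir"
    return "$status"
}
```

A cron job would then call, for example, `backup_once /mnt/r2 /home/me`, and a backup to R1 running at the same time would not share its config file.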

I just showed how to export the config of the currently connected repository to a token that can later be used to quickly reconnect to that repository.

If you want a safe way to avoid repositories clashing, there are several ways to check whether kopia is currently running, for example:

# wait until no kopia process is running
run='x'
while [ -n "${run}" ]; do
  run=$(ps aux | grep -v grep | grep kopia)
  [ -n "${run}" ] && sleep 5
done

kopia repo disconnect  # disconnect from R1
sleep 3

# then use the quick-reconnect script for R2 from my previous post
./reconnect-to_R2.sh

# which should contain: kopia repository connect from-config --token xxxxxxxxxxxxxxxxx
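As an alternative to polling ps, the jobs could serialize themselves with a lock. This is a different technique from the one above: a sketch using an atomic mkdir lock, where LOCK_DIR and with_lock are hypothetical names and the lock location is an assumption:

```shell
#!/bin/sh
# Sketch: serialize backup jobs with an atomic mkdir lock instead of polling ps.
# LOCK_DIR is a hypothetical location; pick one that all jobs can reach.
LOCK_DIR="${LOCK_DIR:-${TMPDIR:-/tmp}/kopia-backup.lock}"

with_lock() {
    # mkdir either creates the directory or fails, so only one job gets through.
    until mkdir "$LOCK_DIR" 2>/dev/null; do
        sleep 5
    done
    "$@"
    status=$?
    # Release the lock and pass on the exit status of the wrapped command.
    rmdir "$LOCK_DIR"
    return "$status"
}
```

Each cron job would wrap its whole sequence, for example `with_lock ./backup-to_R2.sh` (a hypothetical script). Note that a job killed mid-run leaves the lock directory behind, so it would have to be removed by hand.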

What I do is use a randomized config file name when running a backup. That way I am sure I always back up to the repository I want.

By the way, if you are concerned about repositories clashing, all you have to do is set three variables:

KOPIA_CONFIG_PATH
KOPIA_LOG_DIR
KOPIA_CACHE_DIRECTORY

# and obviously the repository paths themselves should be different for R1 and R2

each pointing to different paths for R1 and R2. This way you can run two kopia instances in parallel, though that will certainly put a heavy load on the computer and the network. You can combine both methods as a safety measure. This way you also don’t need to disconnect from the repositories, since each has its own space.
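A sketch of how the three variables might be set per repository. The function name use_repo, the KOPIA_BASE_DIR variable and the directory layout are assumptions made for illustration; only the three KOPIA_* variables named above come from kopia itself:

```shell
#!/bin/sh
# Sketch: give each repository its own config, log and cache locations.
# KOPIA_BASE_DIR and the layout below are hypothetical, not a kopia convention.
use_repo() {
    base="${KOPIA_BASE_DIR:-$HOME/.local/share/kopia-multi}/$1"
    mkdir -p "$base/logs" "$base/cache" || return 1
    export KOPIA_CONFIG_PATH="$base/repository.config"
    export KOPIA_LOG_DIR="$base/logs"
    export KOPIA_CACHE_DIRECTORY="$base/cache"
}
```

Calling it inside a subshell, for example `( use_repo R1 && kopia snapshot create /home )`, keeps the settings for R1 from leaking into a job for R2. The variables would need to be set before the initial kopia repository connect as well.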

Looks interesting. Would it be required to set KOPIA_CACHE_DIRECTORY specifically? I would have assumed that kopia takes care of proper cache handling no matter how many instances are running.

Would it be required to set KOPIA_CACHE_DIRECTORY specifically?

Yes. Each cache directory should belong to its own repository. Or, if you don’t want to keep a cache between snapshots, simply run kopia repo disconnect, which deletes the cache directory completely (as well as the contents of the config directory). On the next connection to the repository, kopia will rebuild the cache from scratch, but the backup will then take more time.

I am on Linux. The default kopia cache directory is .cache/kopia/. I thought that kopia takes care to not mangle caches for different repositories.

After running a backup I always do kopia repo disconnect, simply because connect/reconnect looked to me like open/close with files.

I thought that kopia takes care to not mangle caches for different repositories.

I believe that if Jarek has time, he will shed more light on this matter, but personally I don’t want to risk it, especially when I am backing up multiple different remote servers.

After running a backup I always do kopia repo disconnect, simply because connect/reconnect looked to me like open/close with files.

Check the .cache/kopia/ directory after you have disconnected from the repository: it will be empty, as will the config dir. That way you lose the whole point of caching, I think. It is more a concept of secure connect/disconnect than of opening/closing files. It is the same as with version control repositories (git, fossil-scm…): if you work with some particular content constantly, there is no reason to close it each time after you make a commit.

The only reason to disconnect is, for example, that you don’t want someone to see your backup’s config files or to work out roughly what has been done by looking at the cache directory (while it is all encrypted, there are still some indirect objects that might leak something interesting). I do disconnect on remote computers that back up in push mode, to prevent access to sensitive files in case the remote server gets hacked. But since I don’t trust remote servers, I use “pull mode”: I map the remote root directory on the backup machine via sshfs and snapshot it as a local filesystem. This way there is no need to disconnect; it already works much slower due to the network bottleneck, so removing the cache would slow the process down even more.

Thanks. You are right. When using the same config file for a specific repository and not disconnecting, the same cache subdir in ~/.cache/kopia gets re-used.

I have another situation where it is interesting how to deal with cache.

I back up my laptop’s partition to two different external HDDs using kopia.

Is it OK to use the same cache for both repositories (I doubt it a bit), or is it advisable to use two different cache dirs by setting KOPIA_CACHE_DIRECTORY appropriately?

I back up my laptop’s partition to two different external HDDs using kopia.

Instead of doing the backup twice, you can use kopia’s sync-to feature, like:

kopia repository sync-to filesystem --path="${repo_on_second_drive_path}" --delete --must-exist

This way you won’t load the machine with two jobs doing the same thing; you duplicate the existing repository to the second drive without the overhead. You can do offline backups the same way: create the snapshot once, then push the existing repository with sync-to to multiple places (filesystems, sftp, b2, aws…) as offsite encrypted backups.
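Pushing the same repository to several filesystem destinations could be sketched as a loop over the command above. The function name sync_to_all is hypothetical, and the sketch assumes kopia is already connected to the source repository and that every destination already holds a repository (because of --must-exist):

```shell
#!/bin/sh
# Sketch: sync the currently connected repository to several filesystem paths.
# Assumes kopia is already connected to the source repository.
sync_to_all() {
    rc=0
    for dest in "$@"; do
        # Keep going if one destination fails, but remember the failure.
        kopia repository sync-to filesystem --path="$dest" --delete --must-exist || rc=1
    done
    return "$rc"
}
```

For example, `sync_to_all /mnt/drive2 /mnt/drive3` would run one sync per path and return non-zero if any of them failed.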

BTW, I had a conversation yesterday where I had to explain the concept of kopia’s repository and why there is no need to disconnect from and reconnect to it for each snapshot. People accepted my explanation as the best one :), so I will share what I told them here too:

kopia’s repository is kind of like a bank. If you have opened an account at a bank, you don’t need to close and reopen it every time you make a deposit or withdrawal. So think of the repository as a bank that might have multiple branches (offsite backups to B2, AWS, SFTP) where you can do your transactions (snapshot or restore). Running kopia repo disconnect means you freeze your bank account, and to unfreeze it and start doing transactions again, you have to verify with kopia repo connect ... that you are the legal owner of the account.

Good idea to use sync-to but in my case one harddisk is outside in a safe place and the other is at home. Every second week or so I switch them.

Yes, your explanation is nice. BTW, I am thankful that you answered my original posts and started a discussion.

I think you then need to run kopia cache clear just before you disconnect the backup drive, since the cache won’t match between those two drives.

No problem, I am stuck on a very boring business trip and glad if I can help somehow while idling.