Production setup with Repository Server and S3, questions and feedback wanted

Hi everyone,

I’m evaluating Kopia with the Repository Server for a production backup setup. I’ve read the docs and it seems to fit my needs, but I’d like to confirm a few things and hear from people who actually run something similar.

My setup: a central server on Debian 13 running the Repository Server, multiple Linux clients that I have no direct access to from the server (no SSH, no open ports), and an S3-compatible bucket with Object Lock as the storage backend.

The main reason I’m looking at the Repository Server is to keep the S3 credentials in one place. From the docs, clients just authenticate to the server and don’t need to know anything about the storage behind it. Does this work well in practice?
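For reference, my understanding of the client side from the docs is roughly this (URL, fingerprint, and names are placeholders for my environment):

```shell
# On each client: connect to the repository server, not to S3 directly.
# The client only ever sees the server URL + its own credentials.
kopia repository connect server \
  --url=https://backup.example.com:51515 \
  --server-cert-fingerprint=0123456789abcdef... \
  --override-username=client1 \
  --override-hostname=host1

# After that, snapshots go through the server:
kopia snapshot create /data
```

So the S3 endpoint, bucket name, and keys would live only on the server. Is that accurate?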

Since the clients are the ones initiating the connection, I assume the server never needs to reach them. Can anyone confirm?

I also need to be notified after every snapshot. I’m thinking of using --after-snapshot-root-action with a mail script. The docs say this is inherited at global level, so I could define it once for all clients. Does anyone actually do this? Any gotchas with actions in Repository Server mode?
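Concretely, I was planning something like this (script path and mail address are placeholders; per the docs, Kopia exports variables like `KOPIA_SNAPSHOT_ID` and `KOPIA_SOURCE_PATH` to snapshot actions):

```shell
# Hypothetical notification script for the after-snapshot action.
cat > /usr/local/bin/notify-snapshot.sh <<'EOF'
#!/bin/sh
# These environment variables are set by Kopia for snapshot actions:
printf 'Snapshot %s of %s finished\n' "$KOPIA_SNAPSHOT_ID" "$KOPIA_SOURCE_PATH" \
  | mail -s 'kopia snapshot done' backups@example.com
EOF
chmod +x /usr/local/bin/notify-snapshot.sh

# Define the action once at the global policy level so all clients inherit it:
kopia policy set --global \
  --after-snapshot-root-action=/usr/local/bin/notify-snapshot.sh \
  --action-command-mode=async
```

As I read the docs, actions are disabled by default and each client also has to opt in with `--enable-actions` when connecting — is that right?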

Regarding encryption, from what I understand everything is encrypted by default with AES-256 before leaving the client. Is it one password per repository shared by all clients, or is there a way to have per-client encryption keys so that one compromised client doesn’t expose the others’ data?

A few more questions:

  • Is GFS retention doable natively (daily/weekly/monthly/yearly with different keep values)?
  • Any known issues with S3 Object Lock enabled?
  • If the Repository Server dies, can I just rebuild it and reconnect to the S3 repo with the password? Anything else critical to save?
  • All clients would snapshot around the same time. Any problems with concurrent snapshots?
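For the GFS point, my assumption is that the standard retention knobs cover it, set once at the global level (the keep values here are just examples):

```shell
# Hypothetical GFS-style retention, applied globally so all clients inherit it:
kopia policy set --global \
  --keep-latest=3 \
  --keep-daily=7 \
  --keep-weekly=4 \
  --keep-monthly=12 \
  --keep-annual=5
```

If anyone does GFS differently with Kopia, I'd be happy to hear it.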

More generally: is anyone running this kind of setup in production? Are there any features that aren’t fully stable or that I should watch out for?

Thanks!

I have been running multiple Kopia servers for years, at home and in production at work. The answer is yes to almost all of your questions, with the exception of S3 Object Lock: since repo maintenance shuffles blobs around, an automatic locking policy will interfere with the pruning of blobs. However, just having Object Lock enabled on the bucket shouldn’t be an issue.

Each client has its own access password and doesn’t need the repo password - that’s for the Kopia server alone, and the same goes for the S3 credentials. A client can only ever read its own data, so even if a client’s access password gets compromised, only that client’s data is at risk.
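Creating those per-client credentials is straightforward (usernames/hostnames here are just examples):

```shell
# On the server: one user per client, each with its own access password.
# You will be prompted for the password interactively.
kopia server user add client1@host1
kopia server user add client2@host2
```

If the server is already running, you may need to refresh it (e.g. via `kopia server refresh`) or restart it so it picks up the new users.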

To recover a repo on an S3 bucket, you only need the repo password and access to the S3 bucket - everything else (client passwords, policies, etc.) is stored inside the repo, so you can install a fresh Kopia server and have it take over.
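In other words, disaster recovery boils down to roughly this (bucket, endpoint, keys, and cert paths are placeholders):

```shell
# On a freshly installed server: reconnect to the existing repo in S3...
kopia repository connect s3 \
  --bucket=my-backups \
  --endpoint=s3.example.com \
  --access-key=AKIA... \
  --secret-access-key=... \
  --password=REPO_PASSWORD   # the one secret you must keep safe off-site

# ...then start serving again; users and policies come back from the repo:
kopia server start --tls-cert-file=server.crt --tls-key-file=server.key
```

The repo password and the S3 credentials are the only things that cannot be recovered from the repo itself.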

My production system has ingested close to 200 TB of data, and I am running 15 parallel snapshots from our main file server - 15 snapshots simultaneously. The file server’s load shoots up to 50, but the Kopia server only has to deal with the I/O, and that’s not a big deal. There are approx. 560 snapshots per run… and it all happens between 01:00 am and 04:00 am - the fastest backup setup I’ve had in 25+ years…


Thanks a lot for the detailed answer, really helpful to hear from someone running this at that scale.

I’d like to clarify one thing about Object Lock. You mentioned that setting an automatic locking policy would interfere with blob pruning, but just having Object Lock enabled is fine. Does that mean Kopia manages the locks per object itself, or is Object Lock just enabled but not actively used? I’m trying to understand what actually protects the data from being deleted if the server or S3 credentials get compromised.

Also a question about encryption: since the Repository Server needs the repo password to operate, does that mean it can decrypt all clients’ data? If the repository server gets compromised, the attacker can decrypt all backups? Is there any way to have per-client encryption, where each client encrypts its data with its own key before sending it to the server, so that the server only stores opaque blobs it can’t read?

Again, the Object Lock concern comes down to deleting unused blobs, which at some point have to be removed from storage. Blobs become unused when they outlast the repository’s retention policy, or when maintenance has shuffled data around, rewritten blobs, and so on. You need to grasp that blobs != files: a blob is the unit of storage in the repo and holds one or more chunks of data - by default a blob is around 20 MB, so it can hold many chunks. You have to allow Kopia to get rid of unused blobs, otherwise your repo will grow uncontrollably.
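The pruning of unused blobs happens during repository maintenance, so that is where the relevant knobs are — roughly:

```shell
# Check the current maintenance settings and schedule:
kopia maintenance info

# Run a full maintenance cycle (this compacts data and deletes unused blobs):
kopia maintenance run --full

# Or let full maintenance run automatically on an interval:
kopia maintenance set --full-interval=24h
```

If you do want Kopia itself to manage per-object locks, check your version's docs: newer releases can create the repo with `--retention-mode`/`--retention-period` on S3 and extend locks during maintenance, but I haven't run that in production myself.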

If you want to guard against damage/”vandalism”, you should maybe consider a replication solution for your S3 bucket. Whether that’s worth it will mostly depend on the value of your data, I guess…

Regarding the server - sure, the server environment needs to be as secure as you can make it. Your idea of having per-client encryption keys would be the equivalent of each client working against its own repo - which would render e.g. deduplication useless. Also, one Kopia server instance can only serve one repo, so you’d need several server instances for that.