Bucket size when using object lock / ransomware protection

I am using kopia to create various backups to various Backblaze buckets. On all repositories I have enabled Object Locks and a retention period of 90 d. For the backups I am using the default retention policy (10 latest, 48 hourly, 7 daily, 4 weekly, 24 monthly, 3 annual); snapshots are made once an hour.

My problem is that some buckets are now growing very large. In one example, kopia content stats reports a content size of 380.9 GB (231.2 GB compressed), but the corresponding Backblaze bucket has grown to 903.3 GB after 3 months of backups. Notably, some other buckets are not growing much, so I am trying to investigate whether there is anything I can do about those that do.

When I run rclone size on the bucket, the result is 244.3 GB. My guess is that most of the bucket size comes either from old file revisions or from files that have been deleted/hidden but are still retained by the object lock. After superficially browsing through the bucket, it seems that there are no old file revisions there (I still had it set up to keep them), but many files are marked for deletion.
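To put the two numbers side by side, here is a minimal sketch of the arithmetic, using only the figures quoted above. The function name is made up for illustration, and it assumes that everything rclone size does not count is hidden or locked data.

```python
# Figures quoted in the post, in GB.
bucket_total_gb = 903.3   # bucket size reported by Backblaze
visible_gb = 244.3        # size reported by `rclone size`

def hidden_share(total_gb, visible_gb):
    """Return (hidden GB, hidden fraction of the whole bucket).

    Assumption: whatever is not visible is hidden/locked data.
    """
    hidden = total_gb - visible_gb
    return hidden, hidden / total_gb

hidden_gb, fraction = hidden_share(bucket_total_gb, visible_gb)
print(f"{hidden_gb:.1f} GB not visible ({fraction:.0%} of the bucket)")
```

If that assumption holds, roughly three quarters of the bucket would be data that is no longer visible but still billed.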

After thinking about this, my guess is that the large bucket size is the result of a combination of two factors:

  1. Having a retention period of 90 d basically means that all snapshots from the last 90 days are stored in my bucket and take up space there, in my case meaning one snapshot per hour, so at least 2160 snapshots.
  2. The reason why some buckets grow so large while others don’t must be that some backups contain large amounts of data that change frequently.
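As a rough check of these two points, the sketch below counts the snapshots that fall inside the lock window and derives an average amount of new data per snapshot from the figures above. It is only a back-of-the-envelope estimate: it assumes the bucket has been filling for about 90 days, that all non-visible data is churn, and that the churn is spread evenly over the snapshots. The function names are hypothetical.

```python
HOURS_PER_DAY = 24

def snapshots_in_window(lock_days, snapshots_per_day=HOURS_PER_DAY):
    """Number of snapshots taken during the object-lock window."""
    return lock_days * snapshots_per_day

def implied_churn_gb(total_gb, visible_gb, n_snapshots):
    """Average GB of new data per snapshot (assumes evenly spread churn)."""
    return (total_gb - visible_gb) / n_snapshots

n = snapshots_in_window(90)                # hourly snapshots, 90-day lock
churn = implied_churn_gb(903.3, 244.3, n)  # figures from the post, in GB
print(f"{n} snapshots, about {churn * 1000:.0f} MB of new data per snapshot")
```

Under those assumptions the bucket in question would be taking in on the order of a few hundred MB of changed data every hour, which would fit the idea that frequently changing data is what separates the large buckets from the small ones.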

As potential solutions, I am wondering:

  1. Is it possible to enable object lock only for some snapshots? If I could enable it for only one snapshot per day or maybe two per week, the bucket size should be greatly reduced. If this is not possible yet, is this something that would technically even be imaginable with the way that kopia creates snapshots, so would it be worth opening a feature request?
  2. Is there a way to investigate which files cause how much disk usage in a snapshot? I’m thinking about a command that would list the files that have changed in a snapshot along with their file sizes. When a file changes, does kopia store its entire new version, or does it only store a diff? If the latter is the case, is there a way to view the sizes of all diffs of a specific snapshot?

With object lock enabled, your retention policy has no effect on the occupied space until the lock period has expired. In your case, any data you save to your repository will be stored for 90 days.

Nope. Object lock is all or nothing; you cannot enable it for only some snapshots.

But what you could do is have two buckets and two repositories, one with and one without object lock: the frequent backups go to the one without the lock, and you back up to the locked one (or copy snapshots to it from the first one) less frequently.
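To illustrate what "less frequently" could mean, here is a minimal sketch that picks one snapshot per calendar day out of a list of hourly snapshot times. It is not a kopia feature or API: the function name and the sample timestamps are made up, and how the chosen snapshots would actually reach the locked repository is left open.

```python
from datetime import datetime, timedelta

def first_snapshot_per_day(timestamps):
    """Return the earliest timestamp of each calendar day, in time order."""
    chosen = {}
    for ts in sorted(timestamps):
        # setdefault keeps the first (earliest) timestamp seen for each day.
        chosen.setdefault(ts.date(), ts)
    return list(chosen.values())

# Hypothetical example: three days of hourly snapshots.
start = datetime(2024, 1, 1, 0, 0)
hourly = [start + timedelta(hours=h) for h in range(72)]
daily = first_snapshot_per_day(hourly)
print(len(hourly), "hourly snapshots ->", len(daily), "for the locked repository")
```

With a 90-day lock, this kind of selection would leave about 90 locked snapshots instead of 2160, while the unlocked repository keeps following the normal retention policy.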

On the other hand, if you do not care about all these snapshots, why take them in the first place? Maybe it is worth rethinking the whole backup strategy, e.g. using a local medium for the very frequent snapshots and the cloud only for the longer-term ones.
