Kopia Snapshot Retention Policies Demystified

At the suggestion of @JayKay729, I created a dummy backup to test how Kopia chooses which backups to keep, and the experiment taught me much more than I had planned. In this post, I would like to share these discoveries, along with some examples to demonstrate them. This is a supplement to his post, so I’ll jump straight to the points he hasn’t covered yet.

Numbers in Snapshot Retention

Except for Latest Snapshots, Kopia treats the number you enter here as a duration, not a count. For example, in this image, 48 does not mean 48 hourly snapshots; it means retain 1 snapshot per hour within 48 hours of your most recent snapshot. Likewise, 7 means retain 1 snapshot per day within 7 days of your most recent snapshot, with the following results:

Despite the setting of 48 hourly snapshots, only 3 snapshots were tagged as hourly, as only these 3 fit the 48-hour window of 5/25/2024 11:00:00 AM to 5/27/2024 11:00:00 AM. The same applies to the 7 daily snapshots, with only 4 snapshots fitting within the 7-day window of 5/20/2024 to 5/27/2024.

Or in other words, only the most recent snapshots within the duration you specified will be considered for tagging. Anything that is older than the duration you specified will be ignored.
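To make the "duration, not amount" idea concrete, here is a minimal Python sketch of the behavior as I observed it. It is not Kopia's actual code; the function name `hourly_keepers` and the snapshot times are made up for illustration, and the sketch only models the time window, nothing else.

```python
from datetime import datetime, timedelta

def hourly_keepers(snapshots, keep_hourly):
    """Model of the observed behavior (an assumption, not Kopia's code):
    keep the latest snapshot of each clock hour, but only for snapshots
    taken within `keep_hourly` hours of the most recent snapshot."""
    if not snapshots or keep_hourly <= 0:
        return []
    newest = max(snapshots)
    cutoff = newest - timedelta(hours=keep_hourly)
    latest_per_hour = {}
    for ts in sorted(snapshots):
        if ts < cutoff:
            continue  # older than the window: never considered for an hourly tag
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        latest_per_hour[bucket] = ts  # a later snapshot in the same hour replaces an earlier one
    return sorted(latest_per_hour.values(), reverse=True)

# Hypothetical data: one snapshot per day at 11:00 AM, five days in a row.
snapshots = [datetime(2024, 5, day, 11, 0) for day in (23, 24, 25, 26, 27)]
kept = hourly_keepers(snapshots, keep_hourly=48)
print(len(kept))
```

With one snapshot per day, only the three snapshots from 5/25 to 5/27 fall inside the 48-hour window, so only those three would be tagged, no matter that the setting says 48.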

Intervals of Each Snapshot Retention

Have you ever wondered what Kopia considers a week: is it Sun-Sat, Mon-Sun, or 7 days from the most recent snapshot? Or have you been befuddled by Kopia producing more than 1 daily snapshot? Stay tuned, as I break them down in this section.

Hourly Snapshots

Interval = hh:00:00 - hh:59:59

The snapshot tagged with hourly is the latest snapshot within that hour. A snapshot taken even one second into the next hour gets a separate hourly tag, like in this example:

Here, as soon as the clock struck 12:00:00 AM, a new hour began, and the hourly-3 tag was assigned to the 11:55:34 PM snapshot, matching @jkowalski’s earlier response and most users’ intuition.
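The hour boundary can be pictured as a simple grouping key. This is only a sketch of the idea, with a made-up helper name `hour_bucket` and hypothetical dates; I am not claiming this is how Kopia computes it internally.

```python
from datetime import datetime

def hour_bucket(ts):
    """Hypothetical grouping key: everything from hh:00:00 to hh:59:59 shares one key."""
    return ts.strftime("%Y-%m-%d %H")

# Hypothetical dates; only the times of day come from the example above.
before_midnight = datetime(2024, 5, 26, 23, 55, 34)
at_midnight = datetime(2024, 5, 27, 0, 0, 0)

print(hour_bucket(before_midnight))
print(hour_bucket(at_midnight))
```

The two snapshots are less than five minutes apart but land in different buckets, so each can carry its own hourly tag.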

Daily Snapshots

Interval = UTC 12:00:00 AM - 11:59:59 PM?

Based on my tests, it does not follow my system’s time. Nor is it the server’s time, since these are local backups stored directly on my PC. Instead, I presume it follows a standard time like UTC, possibly to avoid conflicts if the source and destination are in different time zones (common in online backups). Here’s an example:

Here, snapshots taken as late as 4:55 AM of the next day are still considered to be within that day, which matches my timezone’s offset from UTC. And this is why you can get more than 1 daily snapshot on the same local date: they fall on different days in UTC.
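Here is a small Python sketch of why two snapshots on the same local date can end up with separate daily tags if the day boundary is in UTC. The UTC+5 offset is an assumption picked to be consistent with the 4:55 AM observation, and `utc_day` is a made-up helper, not something from Kopia.

```python
from datetime import datetime, timedelta, timezone

# Assumed local offset, for illustration only.
LOCAL = timezone(timedelta(hours=5))

def utc_day(ts):
    """Hypothetical daily grouping key: the calendar date of the snapshot in UTC."""
    return ts.astimezone(timezone.utc).date().isoformat()

# Two snapshots taken ten minutes apart on the same local morning.
late_night = datetime(2024, 5, 27, 4, 55, tzinfo=LOCAL)
early_morning = datetime(2024, 5, 27, 5, 5, tzinfo=LOCAL)

print(late_night.date() == early_morning.date())
print(utc_day(late_night), utc_day(early_morning))
```

Both snapshots share the local date of May 27, but in UTC the first one still belongs to May 26 while the second belongs to May 27, so under this model each would be the latest of its own day.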

Weekly Snapshots

Interval = Monday to Sunday, but only checking day, not time

Checking day? Yes, it looks only at the date, not the time; in other words, it follows the system date. Confusing? Check this example:

Notice here that the weekly tag is applied on the latest snapshot of the week of May 20 to 26, where as soon as the date changed to May 27, it’s already considered another week. This is in stark contrast to daily that follows UTC instead, resulting in a mismatch (unless you live in UTC-0).

This means that the gap between daily and weekly tags will differ based on where you live, ranging from an exact match (UTC-0) to a 14-hour difference (UTC+14). Other tags like monthly and annual possibly follow the same scheme, looking at the month or year, respectively.
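The mismatch between the two boundaries can be sketched like this. It assumes, per my observations above, that the week is decided by the local date (Monday to Sunday) while the day is decided in UTC; the helper names and the UTC+5 offset are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed local offset, for illustration only.
LOCAL = timezone(timedelta(hours=5))

def local_week(ts):
    """Hypothetical weekly key: ISO year and week (Monday to Sunday) of the local date."""
    iso = ts.date().isocalendar()
    return (iso[0], iso[1])

def utc_day(ts):
    """Hypothetical daily key: the calendar date in UTC."""
    return ts.astimezone(timezone.utc).date().isoformat()

sunday_night = datetime(2024, 5, 26, 23, 30, tzinfo=LOCAL)
monday_early = datetime(2024, 5, 27, 0, 30, tzinfo=LOCAL)

print(local_week(sunday_night), local_week(monday_early))
print(utc_day(sunday_night), utc_day(monday_early))
```

The two snapshots fall in different weeks (the local date rolled over to Monday, May 27) yet share the same UTC day of May 26, which is the kind of gap between weekly and daily tags described above.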

Interaction Between the Numbers and Intervals of Snapshot Retention

What happens if you don’t rely on latest snapshots and let the other retention policies choose which snapshots to keep? Fortunately, the behavior is what most users expect: as long as you hit the end of an interval, a tag will be assigned, as this example demonstrates:

3rd Snapshot

4th Snapshot

As 3:00:05 PM exceeded the interval for hourly snapshot (2:59:59 PM), the previous snapshot (2:59:07 PM) gets the hourly-2 tag, even if the gap between the oldest and most recent snapshot is less than an hour (which also means that this old issue has been resolved in the more recent Kopia versions). Continuing further on 3 more snapshots:

7th Snapshot

shows that the hourly tag properly protects the snapshot from the previous hour from getting deleted.

Disabling Snapshot Retention

What if you don’t need a certain retention policy? Put “0” on it, like this:

As shown in this pic, all hourly tags disappeared.

DON’T just leave them blank, as they would revert to the previous values you last saved:

Let me know if you have further questions, or have spotted some errors. Feel free to share your opinions too, on whether you find this helpful, or find some parts needing clarification.

So the number acts like an expiry date, not a literal amount. That’s confusing, or at least not what I was expecting. I expected 48 hourly snapshots to be retained. That means IF I shut the system down for, say, a week, I expected to still find those 48 snapshots once it was turned back on, but you showed that’s not the case… And the description really does not help.

I’m not saying that this is a bad approach, but I think the description does not explain this well. Do you think it’s helpful to propose a change?
Thanks for clearing that up.

You meant hourly here

And also thank you for this. Until now I left them blank, my fault for not checking.

So for example 3 annual snapshots means that you will have at least 1 yearly snapshot for the last 3 years, and after those 3 years your snapshots will be deleted? Does this mean that the easiest way to keep old snapshots “forever” is to set Annual to 100?

To keep ALL snapshots forever set all policies to 0 (zero)

This is fantastic, thanks for writing it up.
I would love to see this added to the official documentation. I am happy to take the lead but would welcome collaboration with @rhinomagnetar and @JayKay729.
Would you be interested in working on this?

I’d love to but I’m away till September. Not really in a place to write documentation haha

Enjoy your time away! :grinning: I may take a first crack at it & then ask for your review when you return.

And what about my first question?

How do you keep a snapshot from each of the last 3 years?

How do you keep a snapshot from each past year forever?

Why not pin a snapshot and put a label on it?

Re: your first question. AFAIK if you set 3 annual snapshots you will have:

  1. 2021 Snapshot
  2. 2022 Snapshot
  3. 2023 Snapshot

And then when 2025 comes you will have:

  1. 2022 Snapshot
  2. 2023 Snapshot
  3. 2024 Snapshot

Are you claiming Kopia treats this as a special case, or are you understanding retention tags exactly backward, or am I?

As I understand it, setting all policies to zero would mean none of your backups get tagged for retention, so they all get deleted right away. Wouldn’t it?

The simplest way to settle this argument is to test. It is not really about what you or I understand, but about what the Kopia designers decided :)

The least verbose explanation I came up with which might help others understand retention:

Kopia allows you to set a series of retention flags on a snapshot. Once all those flags have expired, the snapshot is scheduled for deletion.