Retention policies documentation

Hi!

I was trying to set up my backups with kopia. Surprisingly, the biggest problem was understanding how retention policies work. Lack of documentation on this topic doesn’t really help. So I searched the forums and Reddit and all that. I think I read every post regarding that, and I think I kinda understand how it all works, but I’m not 100% sure. Either way, I wrote a “documentation” (more like ELI5) in hope someone will correct me if I’m wrong, or help someone else set the policies up.

I still don’t know when the weekly ones come up, Monday or Sunday :stuck_out_tongue:

Kopia retention policies

Kopia adds a label to a snapshot (day, week, month…) every time it creates one.
Kopia deletes snapshots that do not have a label.

Assume snapshot every 30 mins.

Annual

Let’s say you start creating snapshots on the 1st of June. Kopia adds the Annual-1 label to the first snapshot. When another snapshot is created 30 mins later, Kopia moves the label to that newest snapshot. This repeats every 30 mins until the year finishes: the label always sits on the newest snapshot. When the first snapshot of the following year is created, it gets Annual-1, and the old Annual-1 (the last snapshot of the previous year) is changed to Annual-2. This is what happens every year, until the x in Annual snapshots: x is reached. In other words, the snapshot that would become Annual-(x+1) has the label removed, and it gets deleted unless another label still keeps it.

Monthly

Again, 1st of June. Monthly-1 is added to the 00:01 snapshot and then moved to the newest snapshot each time one is created, until the end of the month, when it’s changed to Monthly-2 and the first new snapshot in July gets Monthly-1. At the end of July, Monthly-2 becomes Monthly-3, Monthly-1 becomes Monthly-2 and the new snapshot gets Monthly-1. All that continues until the x in Monthly snapshots: x is reached. The snapshot that would become Monthly-(x+1) has the label removed, but it is not necessarily deleted, as it may still have other labels (like Annual-y if it’s December).

Weekly

Works the same as above, but everything happens at the end of the week (Monday or Sunday).

Daily

Same thing but at the end of the day.

Hourly

Same thing but at the end of every hour.
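
A rough Python sketch of the "label moves to the newest snapshot in each period" idea described above. This only models the behaviour as I understand it from this post, it is not Kopia's actual code, and the `BUCKETS` table and `label` function are made-up names for illustration. Weekly is left out here because the week boundary is still an open question at this point.

```python
from datetime import datetime

# Bucket key per period: two snapshots compete for the same label
# when their keys match (same year, same month, same day, same hour).
BUCKETS = {
    "annual": lambda t: t.strftime("%Y"),
    "monthly": lambda t: t.strftime("%Y-%m"),
    "daily": lambda t: t.strftime("%Y-%m-%d"),
    "hourly": lambda t: t.strftime("%Y-%m-%d %H"),
}

def label(times, period, keep):
    """Label the newest snapshot in each of the `keep` most recent buckets."""
    labels, seen = {}, []
    for t in sorted(times, reverse=True):  # walk from newest to oldest
        key = BUCKETS[period](t)
        if key in seen:
            continue  # a newer snapshot already holds this bucket's label
        seen.append(key)
        if len(seen) > keep:
            break  # older buckets are beyond the policy limit
        labels[t] = f"{period}-{len(seen)}"
    return labels

snaps = [datetime(2023, 6, 1, 10, 1), datetime(2023, 6, 1, 10, 31),
         datetime(2023, 6, 1, 11, 1), datetime(2023, 6, 1, 11, 31)]
hourly = label(snaps, "hourly", keep=2)
daily = label(snaps, "daily", keep=7)
```

With these four snapshots, only the last snapshot of each hour carries an hourly label, and the single daily label sits on the newest snapshot of the day.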

Latest

Let’s say Hourly snapshots: 1, Daily snapshots: 1 and Latest snapshots: 3.

1st snapshot

1st June - 10:01 - snapshot - (Hourly-1, Daily-1, Latest-1).

2nd snapshot

1st June - 10:01 - snapshot - (Latest-2)
1st June - 10:31 - snapshot - (Hourly-1, Daily-1, Latest-1)

3rd snapshot

1st June - 10:01 - snapshot - (Latest-3)
1st June - 10:31 - snapshot - (Latest-2)
1st June - 11:01 - snapshot - (Hourly-1, Daily-1, Latest-1)

4th snapshot

1st June - 10:31 - snapshot - (Latest-3)
1st June - 11:01 - snapshot - (Latest-2)
1st June - 11:31 - snapshot - (Hourly-1, Daily-1, Latest-1)

As you can see, even though Hourly snapshots: 1, the latest tag keeps them “alive”.
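
The four steps above can be replayed with a small Python sketch. Again, this is just my mental model written down, not how Kopia is implemented, and `labels_for` is a hypothetical helper.

```python
from datetime import datetime

def labels_for(times, keep_latest, keep_hourly, keep_daily):
    """Map each kept snapshot time to its list of labels (model from this post)."""
    newest_first = sorted(times, reverse=True)
    out = {}
    # Latest-N: simply the N newest snapshots.
    for n, t in enumerate(newest_first[:keep_latest], start=1):
        out.setdefault(t, []).append(f"Latest-{n}")
    # Hourly/Daily: newest snapshot in each of the newest N hours/days.
    for name, fmt, keep in (("Hourly", "%Y-%m-%d %H", keep_hourly),
                            ("Daily", "%Y-%m-%d", keep_daily)):
        seen = []
        for t in newest_first:
            key = t.strftime(fmt)
            if key not in seen:
                seen.append(key)
                if len(seen) <= keep:
                    out.setdefault(t, []).append(f"{name}-{len(seen)}")
    return out

snaps = [datetime(2023, 6, 1, h, m)
         for h, m in ((10, 1), (10, 31), (11, 1), (11, 31))]
kept = labels_for(snaps, keep_latest=3, keep_hourly=1, keep_daily=1)
for t in sorted(kept):
    print(t.strftime("%H:%M"), kept[t])
```

Snapshots that do not appear in `kept` have no label left, so in this model they are the ones that get deleted (here, the 10:01 snapshot).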

When you combine all of the above, you see why the most recent snapshot always has (Latest-1, Hourly-1, Daily-1, Weekly-1, Monthly-1, Annual-1)

Summary

Let’s assume 30-min intervals, 8th of June 2023, 12:00PM, and these policies:

  Annual snapshots:                     3
  Monthly snapshots:                   12
  Weekly snapshots:                     4
  Daily snapshots:                      7
  Hourly snapshots:                    24
  Latest snapshots:                    24

This is what the snapshots would look like if we didn’t take any manual snapshots.

31st Dec   2021 - 23:31 - snapshot - (Annual-3)
31st July  2022 - 23:31 - snapshot - (Monthly-12)
...
30th Nov   2022 - 23:31 - snapshot - (Monthly-8)
31st Dec   2022 - 23:31 - snapshot - (Monthly-7, Annual-2)
...
31st March 2023 - 23:31 - snapshot - (Monthly-4)
30th April 2023 - 23:31 - snapshot - (Monthly-3)
21st May   2023 - 23:31 - snapshot - (Weekly-4)
28th May   2023 - 23:31 - snapshot - (Weekly-3)
31st May   2023 - 23:31 - snapshot - (Monthly-2)
 2nd June  2023 - 23:31 - snapshot - (Daily-7)
 3rd June  2023 - 23:31 - snapshot - (Daily-6)
 4th June  2023 - 23:31 - snapshot - (Daily-5, Weekly-2)
 5th June  2023 - 23:31 - snapshot - (Daily-4)
 6th June  2023 - 23:31 - snapshot - (Daily-3)
 7th June  2023 - 12:31 - snapshot - (Hourly-24)
...
 7th June  2023 - 21:31 - snapshot - (Hourly-15)
 7th June  2023 - 22:31 - snapshot - (Hourly-14)
 7th June  2023 - 23:31 - snapshot - (Hourly-13, Daily-2)
 8th June  2023 - 00:01 - snapshot - (Latest-24)
 8th June  2023 - 00:31 - snapshot - (Latest-23, Hourly-12)
 8th June  2023 - 01:01 - snapshot - (Latest-22)
 8th June  2023 - 01:31 - snapshot - (Latest-21, Hourly-11)
...
 8th June  2023 - 10:01 - snapshot - (Latest-4)
 8th June  2023 - 10:31 - snapshot - (Latest-3, Hourly-2)
 8th June  2023 - 11:01 - snapshot - (Latest-2)
 8th June  2023 - 11:31 - snapshot - (Latest-1, Hourly-1, Daily-1, Weekly-1, Monthly-1, Annual-1)

There might be mistakes in dates or times, because my brain was frying typing this table haha

It is actually very handy - thank you!

So it works like an OR operator: as long as a snapshot passes even a single retention policy, it’s kept. It is only deleted if it fails all of them. Did I understand that correctly?

For example, if I only want to retain the last 5 backups, can I set Latest snapshots: 5 and put zero on everything else?

I’m pretty sure that’s how it works :slight_smile:
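
To make the "OR" idea concrete, here is a minimal Python sketch, assuming each policy independently picks a set of snapshots and the final kept set is the union. The helper names are hypothetical and this only models the behaviour discussed here, not Kopia's internals.

```python
from datetime import datetime, timedelta

def newest_per_bucket(times, fmt, keep):
    """Newest snapshot in each of the `keep` most recent strftime buckets."""
    picked, seen = set(), set()
    for t in sorted(times, reverse=True):
        key = t.strftime(fmt)
        if key not in seen and len(seen) < keep:
            picked.add(t)
        seen.add(key)
    return picked

def kept(times, latest=0, daily=0, monthly=0):
    by_latest = set(sorted(times, reverse=True)[:latest])
    by_daily = newest_per_bucket(times, "%Y-%m-%d", daily)
    by_monthly = newest_per_bucket(times, "%Y-%m", monthly)
    # OR semantics: surviving any one policy is enough to be kept.
    return by_latest | by_daily | by_monthly

start = datetime(2023, 6, 1, 1, 0)
snaps = [start + timedelta(days=d) for d in range(8)]  # 1st..8th June, one per day

only_latest = kept(snaps, latest=5)    # everything else left at zero
mixed = kept(snaps, latest=1, daily=3)
```

With everything except `latest` at zero, only the five newest snapshots survive, which matches the "Latest snapshots: 5, zero on everything else" setup asked about.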

Thanks!

About whether the week starts on Monday or Sunday: according to this post, it seems to be based on duration instead? Unless you retain backups that are at least 7 days apart, weekly snapshots would be completely useless. Also, based on the example given by this user, the duration is counted from the oldest snapshot.

I just checked one of my machines and the weekly one is always from Sunday (I run one backup at 1:00 every day):

2024-03-31 01:00:04 UTC k26ff5face32a58d46760765d2ad2785d 2.6 GB drwxr-xr-x files:8367 dirs:537 (monthly-2)
2024-04-14 01:00:05 UTC k76360b4b9ca1eb7021400412a5a9ac06 2.7 GB drwxr-xr-x files:8540 dirs:551 (weekly-4)
2024-04-21 01:00:04 UTC k04c29080fa8b18d735abf83ef7fb4b46 4.4 GB drwxr-xr-x files:8655 dirs:585 (weekly-3)
2024-04-24 01:00:03 UTC k47aa430386f033ae32978c0a0c6451c1 4.4 GB drwxr-xr-x files:8691 dirs:615 (daily-7)
2024-04-25 01:00:04 UTC k70d57e15d472a18438fd0e0deb488520 4.4 GB drwxr-xr-x files:8697 dirs:615 (daily-6)
2024-04-26 01:00:04 UTC kc4ef4c8a615a068df361c346f82e67af 4.4 GB drwxr-xr-x files:8697 dirs:592 (daily-5)
2024-04-27 01:00:05 UTC k16519ba0ffa80b620ae221526052a569 4.4 GB drwxr-xr-x files:8704 dirs:600 (daily-4)
2024-04-28 01:00:08 UTC k584cfc064f1a809fbbc28e8a44084864 4.4 GB drwxr-xr-x files:8738 dirs:604 (daily-3,weekly-2)
2024-04-29 01:00:07 UTC k70a1cd8d45c06d379cde1784c8bbe577 4.4 GB drwxr-xr-x files:8762 dirs:612 (daily-2)
2024-04-30 01:00:03 UTC k65eb8e2031883c467163a37ed6c1b7cd 4.4 GB drwxr-xr-x files:8798 dirs:624 (latest-1,hourly-1,daily-1,weekly-1,monthly-1,annual-1)

Annual snapshots: 3 (defined for this target)
Monthly snapshots: 12 (defined for this target)
Weekly snapshots: 4 (defined for this target)
Daily snapshots: 7 (defined for this target)
Hourly snapshots: 1 (defined for this target)
Latest snapshots: 1 (defined for this target)

This is my setup

But 2024-03-31 is a Sunday. Have you tried what happens if the oldest snapshot isn’t a Sunday?

Oh yeah. That’s the issue with testing this because you actually have to wait a year to find out what “annual” does lol

Honestly, I rewrote this very message you’re reading right now like 5 times in 30 mins (this really shouldn’t be this complicated, haha). Could you please restate your argument from the beginning? I think I don’t know what we’re discussing anymore haha

The weekly snapshots seem to be based not on Sunday or Monday, but on the oldest snapshot. For example, if the oldest snapshot you have retained is from a Wednesday, the next weekly snapshot will not be the Sunday of next week but the Wednesday of next week, and so on.

With the setup you have, I don’t think it’s possible to test this unless you’re willing to delete all of your backups and start from scratch, deliberately taking the very first backup on a day that is neither Sunday nor Monday. Do you have other backups where you can check this behavior?

I’ve checked when the very first snapshot was taken. It was on the 6th of March, which is a Wednesday. It obviously got deleted as it went out of range for the daily, weekly and monthly tags. Theoretically you could run an experiment on a Saturday: run snapshots daily, and check on Monday whether the weekly tag is on the Sunday snapshot or not.

It’s possible that my current snapshots are based off the 31st of March, but I think it’s that the weekly tag is based on the week days.

After several tests, I have confirmed your findings: it was indeed Sunday. Kopia considers Monday to Sunday as a week (as opposed to Sunday to Saturday), and the weekly tag is attached to the snapshot closest to Sunday 11:59 PM.
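
One way to model that finding in Python, assuming the weekly bucket is the ISO week (Monday to Sunday). That is an assumption consistent with the tests above, not something taken from Kopia's source, and `weekly_picks` is a made-up helper.

```python
from datetime import date, timedelta

def weekly_picks(days, keep):
    """Newest day in each of the `keep` most recent ISO weeks (Monday to Sunday)."""
    picks, seen = [], set()
    for d in sorted(days, reverse=True):
        week = tuple(d.isocalendar())[:2]  # (ISO year, ISO week number)
        if week not in seen:
            seen.add(week)
            if len(seen) <= keep:
                picks.append(d)
    return picks

# One backup per day, 1st..30th April 2024 (the 30th is a Tuesday).
days = [date(2024, 4, 1) + timedelta(days=i) for i in range(30)]
picks = weekly_picks(days, keep=4)
```

With daily backups this picks 30th April for the current, unfinished week and then the Sundays 28th, 21st and 14th April, which lines up with the weekly-1 to weekly-4 tags in the listing posted earlier in the thread.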

Btw here’s a more comprehensive post I made regarding retention policies:

Thanks for the details on retention policy. I wonder if this is a bug, or intended behavior, for the daily snapshots to use UTC, but for weekly (and monthly+annual?) to use local time. Seems strange to look at my snapshot history and see the weekly and daily snapshots to not be the same snapshot within a given day. I would expect End-of-Year==End-of-Month==End-of-Day, and End-of-Week==End-of-Day. I don’t have enough snapshot history yet to verify monthly/annual behavior.

I’m not sure what you mean honestly, this is what it looks like for me, running backups everyday at 1am for the past 3 months. Maybe that will help you :sweat_smile:

Looks like you’re only running one backup per day. I’m running one every hour, so my daily snapshots are tagged at 7PM EDT (UTC-4) (23:00 UTC), and my weekly snapshots are at 11PM EDT on Sundays (03:00 UTC on Mondays). See below.
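
To show why the two tags can land on different snapshots, here is a small Python sketch that picks the last hourly snapshot of each day, once with days delimited in UTC and once in local time. It assumes a fixed UTC-4 offset for EDT and says nothing about what Kopia does internally; `last_per_day` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")  # fixed offset, enough for this illustration

def last_per_day(times, tz):
    """Newest snapshot in each calendar day, with days delimited in `tz`."""
    best = {}
    for t in sorted(times):
        best[t.astimezone(tz).date()] = t  # later snapshots overwrite earlier ones
    return sorted(best.values())

start = datetime(2024, 4, 27, 0, 0, tzinfo=timezone.utc)
snaps = [start + timedelta(hours=h) for h in range(72)]  # hourly, on the hour

utc_picks = last_per_day(snaps, timezone.utc)
local_picks = last_per_day(snaps, EDT)

for t in utc_picks:
    print("UTC day ends:  ", t.astimezone(EDT).strftime("%a %H:%M %Z"))
for t in local_picks:
    print("local day ends:", t.astimezone(EDT).strftime("%a %H:%M %Z"))
```

The UTC-based picks fall at 19:00 EDT and the local-time picks at 23:00 EDT (03:00 UTC the next day), which is the same four-hour gap as between my daily and weekly tags.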

Oh yeah, that’s weird, might be a bug

Yes, my main concern is with travel: my annual backup suddenly “changing” from Dec 31 to Jan 1 because I go to Europe or Asia for work, so that I lose one of my annual backups. I think the tags should stick with UTC timing.
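
A tiny Python illustration of the travel concern: the same instant falls in a different calendar year depending on the time zone used to read it. The offsets are arbitrary examples, and whether Kopia actually re-evaluates tags this way is exactly the open question.

```python
from datetime import datetime, timedelta, timezone

# One snapshot taken late on New Year's Eve, stored as a UTC instant.
snapshot = datetime(2023, 12, 31, 23, 30, tzinfo=timezone.utc)

zones = {
    "UTC": timezone.utc,
    "UTC-4": timezone(timedelta(hours=-4)),
    "UTC+8": timezone(timedelta(hours=8)),
}

# Calendar year of the same instant as seen from each zone.
years = {name: snapshot.astimezone(tz).year for name, tz in zones.items()}
print(years)
```

If the annual bucket is computed from local time, this snapshot counts toward 2023 at home but toward 2024 when the machine is set to UTC+8.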

Yeah, that might be a problem. @jkowalski is that a bug or just a setting we don’t see?