Daily snapshot finishes, shows 0B, retries

I have a policy that snapshots a directory daily at 4:30am. Every day, it runs on schedule, takes around 2 hours, then reports that it finished. However, at that point, the snapshot size shows “0 B”, where in reality it’s around 2.2 TB.

Looking at the list of snapshots for that policy, I see that it then retries every 45 mins for an additional 2-3 attempts, then stops retrying.

If I click on “Snapshot Now” in the UI, it runs for 2-4 seconds and completes successfully. Then the snapshot size shows 2.2 TB as I would expect, and the list of snapshots then shows just my two daily snapshots. The “incomplete” entries disappear.

Looking in the logs for the snapshot tasks (“Tasks” tab in UI, drill into relevant row), I don’t see any errors or indications of a problem. I also, curiously, don’t see any tasks for the “retries” (the ones that happen every 45 mins after the initial snapshot attempt, which show up as incomplete).

I have no idea what support information may be helpful, so below are various configs and images I think may be useful. Please let me know what else to share!

List of snapshots showing incomplete retries

Incomplete snapshot showing 0 B size

Policy definition

{
    "retention": {
        "keepHourly": 0,
        "keepDaily": 2,
        "keepWeekly": 0,
        "keepMonthly": 0,
        "keepAnnual": 0
    },
    "files": {
        "ignore": [
            ".DS_Store"
        ]
    },
    "errorHandling": {
        "ignoreFileErrors": true,
        "ignoreDirectoryErrors": false,
        "ignoreUnknownTypes": true
    },
    "scheduling": {
        "timeOfDay": [
            {
                "hour": 8,
                "min": 30
            }
        ]
    },
    "compression": {
        "compressorName": "zstd",
        "neverCompress": [
            ".zst"
        ],
        "minSize": 10240
    },
    "actions": {},
    "logging": {
        "directories": {},
        "entries": {
            "snapshotted": 5,
            "ignored": 5
        }
    },
    "upload": {},
    "noParent": true
}

Repository config

repository.config

{
  "storage": {
    "type": "b2",
    "config": {
      "bucket": "xxx",
      "keyID": "xxx",
      "key": "xxx"
    }
  },
  "caching": {
    "cacheDirectory": "../../.cache/kopia/d129c661fba5f559",
    "maxCacheSize": 5242880000,
    "maxMetadataCacheSize": 5242880000,
    "maxListCacheDuration": 30
  },
  "hostname": "kopia",
  "username": "xxx",
  "description": "Repository in B2: xxx",
  "enableActions": false,
  "formatBlobCacheDuration": 900000000000
}

Repository status

kopia repository status

Config file:         /xxx/.config/kopia/repository.config

Description:         Repository in B2: xxx
Hostname:            kopia
Username:            xxx
Read-only:           false
Format blob cache:   15m0s

Storage type:        b2
Storage capacity:    unbounded
Storage config:      {
                       "bucket": "xxx",
                       "keyID": "xxx",
                       "key": "*******************************"
                     }

Unique ID:           xxx
Hash:                BLAKE3-256
Encryption:          AES256-GCM-HMAC-SHA256
Splitter:            DYNAMIC-1M-BUZHASH
Format version:      3
Content compression: true
Password changes:    true
Max pack length:     21 MB
Index Format:        v2

Epoch Manager:       enabled
Current Epoch: 30

Epoch refresh frequency: 20m0s
Epoch advance on:        20 blobs or 10.5 MB, minimum 24h0m0s
Epoch cleanup margin:    4h0m0s
Epoch checkpoint every:  7 epochs
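
For anyone cross-checking the two dumps above: the raw numbers in repository.config appear to line up with the human-readable status output. The sketch below assumes formatBlobCacheDuration is stored in nanoseconds (an assumption, but one consistent with the "Format blob cache: 15m0s" line) and that maxCacheSize is in bytes. The helper names are made up for illustration.

```python
from datetime import timedelta

# Values copied from the repository.config shown above.
FORMAT_BLOB_CACHE_DURATION = 900_000_000_000  # assumed to be nanoseconds
MAX_CACHE_SIZE = 5_242_880_000                # assumed to be bytes


def ns_to_timedelta(ns: int) -> timedelta:
    """Convert a nanosecond count to a timedelta (microsecond precision)."""
    return timedelta(microseconds=ns // 1_000)


def bytes_to_mib(n: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n / (1024 * 1024)


cache_duration = ns_to_timedelta(FORMAT_BLOB_CACHE_DURATION)
cache_size_mib = bytes_to_mib(MAX_CACHE_SIZE)

print(cache_duration)   # 0:15:00
print(cache_size_mib)   # 5000.0
```

Under those assumptions, the cache duration works out to 15 minutes and each cache limit to 5000 MiB, so nothing in the config looks inconsistent with the status output.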

Giving this post a quick bump, since it’s been hidden for 6 weeks so no one’s had a chance to see it.

This issue is still ongoing; I connect to my server once a day and click “Snapshot Now” on that job to get it to finish. No change in behavior from what I originally described.

I think this is normal behavior. Take a look and see if this FAQ describes your situation.

Hi, it’s normal behaviour. The following snapshots will run fine and the incomplete ones will be deleted.

Cheers

I think I’m missing something. Why does the backup keep retrying every 45 mins and still not mark it complete? And why is it that if I manually click “Snapshot Now”, it only takes a few seconds and then does mark it as complete?

Have you taken a look at the FAQ?

Feel free to ask if you have more questions.

Cheers

Yes, I’ve read that FAQ, but something still seems off. It states that:

If a snapshot takes longer than the predefined checkpoint interval, Kopia creates a temporary incomplete snapshot, preventing the snapshot from being garbage-collected by the maintenance tasks. Kopia will remove incomplete snapshots once a complete snapshot of the files and directories has been created.

The problem is with that second part (“Kopia will remove incomplete snapshots once a complete snapshot […] has been created.”). What I think that means is, if you have a snapshot that runs for more than some amount of time (45 minutes I would guess), then it’ll create “incomplete” snapshots along the way, but once it finishes (assuming it finishes successfully), it will clean those up and you’ll just see the one completed snapshot.

What I’m actually seeing, though, (described above) is that every morning (it’s 100% repeatable), that policy’s snapshot doesn’t appear to have completed. It still has incomplete snapshots listed, and shows a size of 0 bytes. Then if I manually snapshot it, it takes just a few seconds to create a new snapshot and remove the incomplete ones. At that point, looking at that policy’s snapshots, I see two: one from two days ago and one from a few seconds ago (when I manually created a snapshot).

If the automatic (overnight) snapshot really had completed and just didn’t clean up the incomplete snapshots by design, then what I would expect is when I click “Snapshot Now”, it would take a whole new snapshot (which would take a couple of hours), and also presumably clean up the old incomplete snapshots from overnight. I wouldn’t expect it to just clean up that last snapshot and mark it as complete.

So the questions are:

  1. Why is it consistently leaving behind those incomplete snapshots even after it completes?
  2. Could it be that it isn’t actually completing? I don’t see any errors in the logs that I know how to check, but are there logs somewhere that would show errors if that’s what was happening?
  3. Why does manually clicking “Snapshot Now” several hours later just finish that previous snapshot, but not take a new one?
  4. If it is actually completing and just leaving behind those incomplete snapshots, why is its size listed as 0B?

Okay, can you post more details about your repository?

I assume that files are changing frequently and that the snapshot is uploaded to a cloud provider. Is that correct? If so, the daily snapshot may not complete in under 45 minutes.

That might be the reason you are seeing incomplete snapshots every day.

For example, I had a “big” snapshot of over 250 GB. The initial run took longer than 45 minutes, and I was seeing incomplete snapshots back then. After that initial snapshot was created, the newer ones take around 2 minutes.

Cheers,

The repository is backing up to Backblaze B2. I don’t know exactly what details about it would be useful, but the config files are in the original post.

I have 8 policies in that repository; 7 of them complete every time without any issue. They range in size from 8 GB to 3.2 TB. Some have data that hardly ever changes, some are for frequently changing data like daily backups.

The policy that’s having this issue is for a directory where my home server saves its nightly backups of each container and VM. The server’s backup and retention logic deals with keeping the appropriate number of backups in that directory for each container/VM (e.g. 3 dailies, 4 weeklies, etc.), so I just have Kopia keeping 2 daily snapshots of that directory. Each one then contains whatever backups the server was retaining on that day.

The server’s backup logic runs daily at 4 AM and runs for about 20 mins, so I have the Kopia snapshot scheduled at 4:30 AM. The daily amount of new data is about 130 GB, and the total size of that directory is about 2.3 TB. The duration of the Kopia snapshot task varies quite a bit (not sure if that’s due to upload speed, differences in deduplication hits, or what), but seems to range from about 2 hours to 3 1/2 hours.

Given what I read in the FAQ that was shared above, I understand why incomplete snapshots are recorded every 45 mins during that job (if it took 2 hours, I’d expect one at the 45 min mark and another at 1 1/2 hours). I’m still concerned, though, that when it finishes (at the 2 hour mark), it leaves those incomplete snapshots behind.
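
The checkpoint arithmetic in the paragraph above can be sketched as follows. This is only an illustration of the expectation described in this thread, assuming a fixed 45-minute checkpoint interval measured from the start of the job. It is not taken from Kopia's code, and the function name is made up.

```python
from datetime import timedelta


def checkpoint_offsets(run_time: timedelta, interval: timedelta) -> list:
    """Return the offsets from job start at which a checkpoint would fall
    strictly before the job finishes."""
    offsets = []
    t = interval
    while t < run_time:
        offsets.append(t)
        t += interval
    return offsets


interval = timedelta(minutes=45)  # assumed checkpoint interval

# The job reportedly takes between 2 and 3 1/2 hours.
short_run = checkpoint_offsets(timedelta(hours=2), interval)
long_run = checkpoint_offsets(timedelta(hours=3.5), interval)

print([str(o) for o in short_run])  # ['0:45:00', '1:30:00']
print(len(long_run))                # 4
```

So a 2-hour run would be expected to record two incomplete entries and a 3 1/2-hour run four, which is roughly in line with the entries seen in the snapshot list. What this does not explain is why they are still there after the job reports that it finished.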

Hi,

To me, this seems like normal behaviour under these conditions. If the daily snapshot takes 2 hours or more because of the amount of changed data, the upload speed, etc., then it is not unusual to see incomplete snapshots.

What about snapshotting more than once per day?

Cheers,

That’s an interesting idea. One snapshot to actually back up the data, and a second as a workaround to fix the bug preventing the first snapshot from cleaning up when it finishes. That would probably work. I can give that a try.
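
A rough sketch of what that workaround might look like in the policy's "scheduling" section: a second time-of-day entry added next to the existing one. The 12:30 value is purely hypothetical (any time comfortably after the first run has finished would presumably do), and the helper below only builds the JSON. However the policy is actually edited (UI or CLI), the resulting section would be expected to look similar.

```python
import json

# The "scheduling" section copied from the policy definition in the original post.
scheduling = {"timeOfDay": [{"hour": 8, "min": 30}]}


def add_snapshot_time(section: dict, hour: int, minute: int) -> dict:
    """Return a copy of the scheduling section with one more time-of-day
    entry, keeping entries unique and sorted by time."""
    entries = [dict(e) for e in section.get("timeOfDay", [])]
    new_entry = {"hour": hour, "min": minute}
    if new_entry not in entries:
        entries.append(new_entry)
    entries.sort(key=lambda e: (e["hour"], e["min"]))
    return {**section, "timeOfDay": entries}


# Hypothetical second run, four hours after the first.
updated = add_snapshot_time(scheduling, hour=12, minute=30)
print(json.dumps(updated, indent=4))
```

The second run should have very little new data to upload, so it would be expected to finish in seconds, much like the manual “Snapshot Now” click does.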

I still don’t at all agree that this is “normal behavior”. Those “incomplete” snapshots should just be temporary checkpoints along the way during the process, and should be removed upon successful completion of the snapshot. To finish successfully but then “by design” leave behind incomplete snapshots and leave the size showing 0B sounds like an indisputable bug.

There happened to be a code merge within the past day that relates to this issue. It addresses something very similar to your description, with backups scheduled at a fixed time. Your situation adds a further corner case: backups that take a very long time. You may want to chime in and say you have this corner case as well so the developers are aware. I’m sure they would be more than happy to have an additional tester.

Thank you, @elel. I added a comment about this over on GitHub per your suggestion.

Following up on this quickly, that is working for me. So thank you for that suggestion, @ChrisA! Greatly appreciated. Takes away the manual step every day.

Created issue #3347 - Long-running snapshot leave behind incomplete snapshots and shows 0 byte size on GitHub to track this issue.