Snapshot (of very large files) is taking all day since about a month ago

Hi.
This may have nothing to do with Kopia - it could be an issue with my primary local backup (which Kopia snapshots to GCS).

Anyway, for about a month, maybe more, my daily snapshots have been taking ~8 hrs to complete, and a lot of data is being sent to the cloud storage.
I did change the retention policy of my local backup (which Kopia is replicating), but I only changed it to ~30 days instead of whatever it was before. I’m not sure that should cause this.
Previously, every now and then there would be a very long Kopia snapshot (8 hrs), but most days would be done in ~30 minutes. Now pretty much every day is 8 - 12 hrs. Of course there are no ingress data charges and we have 500 Mbps symmetric Internet bandwidth, so I can cope, but I am concerned something is amiss.

The dataset is a small number of very large files: bare metal images of a number of machines, between ~200 GB and 2 TB of data overall.

I get the newest version of Kopia from the yum repo. Running Fedora latest.
Has anything changed in the last version that might cause this?
Thanks,
Carl

Kopia is not yet able to split large files when scanning them. If very large files do change, a scan will take its time.

Not sure how large your source files are, but as a frame of reference, snapshots take less than a minute for me for over 1.2 TB of files. However, the majority of my files do not change, so if your files are regularly changing, Kopia will be doing a lot of uploads.

It is as if my large files are changing daily, but I don’t believe they are. The local backup, which is a Veeam differential, is done in half an hour, so it’s definitely not rewriting the whole 1 TB initial vbk file.

I have the Veeam compression set to ‘de-dupe friendly’ and I haven’t changed this anyway.

I really need to see if I can get Kopia to do a diff or something so I know what it thinks has changed.

You can use a hashing tool (sha256sum, for example) to make sure the file really wasn’t changed. Also, if a source file was “touched” (its modification time changed) without its content actually changing, that is still a signal for backup software to recheck the file. If Veeam touched it before Kopia calculated its diff, that is probably the cause.
Try to take Veeam, and any other program that may access the source files, out of the picture for at least a day.
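
A minimal sketch of that check, assuming GNU coreutils (sha256sum, stat -c) as shipped on Fedora. The function names are made up for illustration. It records the hash, mtime and size of a file, so two readings taken a day apart show whether the content changed or the file was only touched:

```shell
#!/bin/sh

# Print "<sha256> <mtime-epoch> <size-bytes>" for one file.
fingerprint() {
  hash=$(sha256sum "$1" | awk '{print $1}')
  meta=$(stat -c '%Y %s' "$1")
  printf '%s %s\n' "$hash" "$meta"
}

# Compare two fingerprints.
# Prints one of: unchanged, touched-only, content-changed
compare_fingerprints() {
  old_hash=${1%% *}
  new_hash=${2%% *}
  if [ "$old_hash" != "$new_hash" ]; then
    echo content-changed
  elif [ "$1" != "$2" ]; then
    echo touched-only
  else
    echo unchanged
  fi
}
```

Save the output of fingerprint for the big file before the Veeam run, take it again afterwards, and pass both strings to compare_fingerprints. Hashing a 1 TB file will itself take a while.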

Well, the thing is, I fully expect that the large file has been slightly changed, but the idea is that the bulk of it has remained the same and kopia just takes the delta, i.e. changed parts.
It used to be working just fine.
I’ll have to dig deeper.
When it is running for 8 to 12 hrs, I can see the upload bandwidth usage is constantly high, lots of tx data, so it is not just scanning the files, it is uploading to GCS.
I am wondering if Kopia’s delta algo has changed.

The system is: Veeam backs up a few machines to a Linux box with a btrfs volume. Each machine is written to a separate set of files. Snapper takes daily btrfs snapshots. My Kopia before-action script mounts the latest btrfs snapshot to a fixed path called veeam-lastest-snapshot (using --bind to mount an existing mounted btrfs subvol in a second path). Kopia then snapshots /btrfs/veeam-lastest-snapshot over to GCS.
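
For anyone wanting to reproduce that setup, here is a rough sketch of what such a before-action script could look like. It is not the actual script: it assumes snapper's usual <root>/<number>/snapshot layout, the function names are invented, and the source path in the example is a placeholder.

```shell
#!/bin/sh

# Print the path of the highest-numbered snapper snapshot under $1.
# Assumes the <root>/<number>/snapshot layout; fails if none is found.
latest_snapshot() {
  latest=$(ls -1 "$1" | grep -E '^[0-9]+$' | sort -n | tail -n 1)
  [ -n "$latest" ] && printf '%s/%s/snapshot\n' "$1" "$latest"
}

# Bind-mount that snapshot on a fixed path so Kopia always sees the
# same source directory. Needs root.
remount_latest() {
  src=$(latest_snapshot "$1") || return 1
  mountpoint -q "$2" && umount "$2"
  mount --bind "$src" "$2"
}

# Example (commented out; the first path is a placeholder):
# remount_latest /btrfs/veeam/.snapshots /btrfs/veeam-lastest-snapshot
```

Sorting numerically matters here, otherwise snapshot 9 would sort after snapshot 10.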

By chance, did you change the splitter algorithm you are using with Kopia?

Kopia needs to re-hash the file if its metadata (such as modification time or size) has changed since the last snapshot. Only chunks that are different will need to be uploaded.
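
To get a feel for how much of a big file really differs between two versions, chunk-level hashing can be approximated outside Kopia. The sketch below uses fixed-size chunks via GNU split --filter, which is a simplification and not Kopia's splitter: Kopia's dynamic splitters pick chunk boundaries from the content, so they cope with inserted bytes, whereas fixed offsets do not. The function names are invented for the example.

```shell
#!/bin/sh

# Print one SHA-256 per fixed-size chunk of a file, in order.
# Usage: chunk_hashes FILE CHUNK_BYTES
chunk_hashes() {
  split -b "$2" --filter='sha256sum' "$1" | awk '{print $1}'
}

# Count the distinct chunks of NEW whose hash appears nowhere in OLD.
# Usage: new_chunks OLD NEW CHUNK_BYTES
new_chunks() {
  old=$(mktemp) || return 1
  chunk_hashes "$1" "$3" | sort -u > "$old"
  chunk_hashes "$2" "$3" | sort -u | comm -13 "$old" - | awk 'END { print NR }'
  rm -f "$old"
}
```

For example, new_chunks yesterday.vbk today.vbk 4194304 would count the 4 MiB chunks that are new in today's file (both file names are placeholders). Multiplying that count by the chunk size gives a rough upper bound on what a chunk-based backup would have to upload.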

Hashing on modern hardware can achieve 400-500 MB/s per thread, so hashing the 1 TB file alone should take roughly 33-42 minutes.
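
As a back-of-the-envelope check of that estimate, assuming a single thread and decimal units (1 TB = 10^12 bytes), the quoted rates work out as follows. The helper name is made up.

```shell
#!/bin/sh

# Minutes needed to hash SIZE_BYTES at RATE_MB_PER_S, rounded to the
# nearest minute. Uses decimal megabytes (1 MB = 1,000,000 bytes).
hash_minutes() {
  awk -v size="$1" -v rate="$2" \
    'BEGIN { printf "%.0f\n", size / (rate * 1000000) / 60 }'
}

hash_minutes 1000000000000 500   # 1 TB at 500 MB/s -> 33
hash_minutes 1000000000000 400   # 1 TB at 400 MB/s -> 42
```

So hashing alone explains well under an hour per 1 TB file. An 8 to 12 hour run points at upload volume rather than at re-hashing.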

You should be able to use kopia diff to get the difference between two snapshots or kopia snapshot ls --storage-stats to see how much data has changed between snapshots.

This might take forever on huge, old backups, but if one ran the suggested diff between the latest and previous snapshots at the end of each backup and saved it to its own log, it would be faster and, IMO, make the usage pattern easier to track.

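#!/bin/sh

### compare last and previous snapshots sizes

# Field 4 of each "kopia snapshot list -l" row is taken to be the snapshot ID;
# after the last row, p and l hold the previous and the latest ID.
kopia snapshot list -l | awk '{p=l; l=$4}
END{

  # run the diff and read the first line of its output
  cmd = "kopia diff " p " " l
  cmd | getline res
  close(cmd)

  # fields 4 and 5 of that line are taken to be the two sizes in bytes
  split(res,str," ")
  diff = str[5] - str[4]

  print "==========================="
  print "Snapshots size"
  print "---------------------------"
  print "Previous: " str[4]/1000000000 " GB"
  print "  Latest: " str[5]/1000000000 " GB"
  print "    Diff: " diff/1000000000 " GB ( " diff " bytes )"
  print "==========================="

}
'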
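
To actually keep that output in its own log after every backup, a small wrapper along these lines could be used. It is only a sketch: log_run is an invented helper, and the log path and script name in the example are placeholders.

```shell
#!/bin/sh

# Run a command and append its output to a log file under a
# timestamp header, so each backup leaves one dated entry.
# Usage: log_run LOGFILE COMMAND [ARGS...]
log_run() {
  log=$1
  shift
  {
    printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
    "$@"
  } >> "$log" 2>&1
}

# Example (commented out; both paths are placeholders):
# log_run /var/log/kopia-diff.log /usr/local/bin/kopia-diff-sizes.sh
```

Called as the last step of the backup job, this builds up a history that can be scanned for days with an unusually large diff.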

Thanks for all the useful tips everyone. I was able to confirm that the delta was OK, i.e. the bytes uploaded.
I am just updating my system now and I see this in the release notes, so I am hopeful this will make things quicker again