I have created a local repository (using the Kopia Docker image) of about 1 TiB and want to sync it offsite. I have tried both WebDAV and SFTP remotes, and with both I see long waits at the “Looking for BLOBs to synchronize” step.
I have about 40k blobs, and it takes about 1.5 hours to read the blobs in the destination repository. Setting the flag --parallel=8 does not seem to make a difference. Interestingly, the log reports a ListBlobs duration of only about 31 minutes, even though the surrounding timestamps are about 85 minutes apart. Copying the missing ~180 blobs takes about 90 seconds:
2024-04-07T10:26:57.075704Z INFO kopia/cli Looking for BLOBs to synchronize...
2024-04-07T11:51:26.596629Z DEBUG kopia/repo [STORAGE] ListBlobs {"prefix":"","resultCount":44523,"error":null,"duration":"31m42.091803961s"}
2024-04-07T11:51:26.596738Z INFO kopia/cli Found 0 BLOBs to delete (0 B), 44344 in sync (1 TB)
...
2024-04-07T11:53:02.986635Z DEBUG kopia/repo [STORAGE] Close {"error":null,"duration":"1.147µs"}
My command line is: docker-compose exec kopia kopia repository sync-to from-config --file=/app/config/hetzner_sftp.config --parallel=8 --delete
Is there any way to speed up this initial scan of the destination repository? I have tried both webdav and sftp with similar results.
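To make the gap between the log timestamps and the reported ListBlobs duration concrete, here is a minimal Python sketch of the arithmetic. It only uses the two timestamps and the duration string quoted in the log above; the helper names `parse_ts` and `parse_duration` are hypothetical and not part of Kopia.

```python
import re
from datetime import datetime, timedelta


def parse_ts(ts: str) -> datetime:
    """Parse a timestamp of the form 2024-04-07T10:26:57.075704Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def parse_duration(text: str) -> timedelta:
    """Parse a duration such as '31m42.091803961s' (h/m/s units only)."""
    m = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?", text)
    if not text or m is None:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = m.groups()
    return timedelta(
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )


# Values copied from the log excerpt in this post.
started = parse_ts("2024-04-07T10:26:57.075704Z")  # "Looking for BLOBs..."
listed = parse_ts("2024-04-07T11:51:26.596629Z")   # ListBlobs result logged
wall = listed - started                            # elapsed wall-clock time
reported = parse_duration("31m42.091803961s")      # duration field of ListBlobs
unaccounted = wall - reported                      # time not covered by that field

print("wall-clock: ", wall)
print("reported:   ", reported)
print("unaccounted:", unaccounted)
```

With these inputs the wall-clock gap is roughly 84.5 minutes and the reported duration roughly 31.7 minutes, so about 53 minutes of the wait is not covered by the single ListBlobs entry shown. The log excerpt alone does not say where that time goes.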
Well… I don’t know about Hetzner, but I have a Kopia repo of approx. 800 GB / 35k blobs on a Wasabi S3 bucket, and the whole sync-to takes approx. 3 minutes. The first phase of looking up the blobs in the target repo takes no more than a couple of seconds…
Latency will be king here… neither (S)FTP nor WebDAV is known for low-latency behaviour, and the more files you have to deal with, the worse it gets. However, I will admit that 31 minutes for scanning the remote repo really looks abnormal, even for one of those protocols.
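As a rough illustration of why latency matters, the following sketch converts a total listing time into an implied cost per blob, and back. It assumes that listing time scales linearly with the number of blobs, which this thread does not confirm; the function names are hypothetical.

```python
def implied_ms_per_blob(total_seconds: float, blob_count: int) -> float:
    """Average time per listed blob, in milliseconds (linear-cost assumption)."""
    if blob_count <= 0:
        raise ValueError("blob_count must be positive")
    return total_seconds / blob_count * 1000.0


def estimated_listing_seconds(blob_count: int, ms_per_blob: float) -> float:
    """Total listing time for a given per-blob cost (same linear assumption)."""
    return blob_count * ms_per_blob / 1000.0


# Figures from the original post's log: 44523 blobs listed,
# 31m42.09s reported by ListBlobs, about 84m29.5s between the timestamps.
reported_ms = implied_ms_per_blob(1902.091803961, 44523)
wall_ms = implied_ms_per_blob(5069.520925, 44523)

print(f"per blob (reported duration): {reported_ms:.1f} ms")
print(f"per blob (wall-clock gap):    {wall_ms:.1f} ms")
```

For the numbers in the log this works out to roughly 43 ms per blob based on the reported duration, or roughly 114 ms per blob based on the wall-clock gap. Those figures are in the range of one network round trip per blob, but whether the backend really issues one request per blob is an assumption the log does not settle.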
Sorry to revive this, but did you find a solution @nakermann1973 ?
I’m in the same boat with a Hetzner storage box: I have a local filesystem repository and am trying to sync it to Hetzner via SFTP. The comparison takes a while for my ~2 TB, but for me the upload speeds are also slow. I get between 6 and 9 MB/s, while I get 60+ MB/s via rclone sync.
Another thing I don’t understand: I previously uploaded the whole repo via rclone sync and then tried kopia repository sync-to. It sees the previously uploaded blobs but reports that they don’t match the local ones, and it starts uploading all the blobs again.
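One way to narrow down why previously uploaded blobs are reported as not matching is to compare the metadata of the two copies directly. The sketch below is a generic diagnostic, not Kopia's own comparison logic: which attributes sync-to actually checks is not stated in this thread, so comparing size and modification time is an assumption. `scan_dir` expects a filesystem path, so the remote copy would have to be mounted or listed by some other means.

```python
import os
from typing import Dict, List, NamedTuple


class BlobMeta(NamedTuple):
    size: int     # file size in bytes
    mtime: float  # modification time, seconds since the epoch


def scan_dir(root: str) -> Dict[str, BlobMeta]:
    """Collect size and mtime for every file under root, keyed by relative path."""
    out: Dict[str, BlobMeta] = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            out[os.path.relpath(full, root)] = BlobMeta(st.st_size, st.st_mtime)
    return out


def compare(local: Dict[str, BlobMeta],
            remote: Dict[str, BlobMeta],
            mtime_tolerance: float = 1.0) -> Dict[str, List[str]]:
    """Classify each local blob by how its remote counterpart differs."""
    report: Dict[str, List[str]] = {
        "missing_remote": [], "size_differs": [], "remote_older": [], "same": [],
    }
    for name, lm in sorted(local.items()):
        rm = remote.get(name)
        if rm is None:
            report["missing_remote"].append(name)
        elif rm.size != lm.size:
            report["size_differs"].append(name)
        elif rm.mtime + mtime_tolerance < lm.mtime:
            report["remote_older"].append(name)
        else:
            report["same"].append(name)
    return report


# Hypothetical example data; real input would come from scan_dir().
local = {"p1.f": BlobMeta(100, 2000.0), "p2.f": BlobMeta(50, 2000.0),
         "q3.f": BlobMeta(10, 2000.0), "q4.f": BlobMeta(7, 2000.0)}
remote = {"p1.f": BlobMeta(100, 2000.0), "p2.f": BlobMeta(50, 1000.0),
          "q3.f": BlobMeta(11, 2000.0)}
for category, names in compare(local, remote).items():
    print(category, names)
```

If most blobs land in one category, that points at the attribute worth investigating first, for example whether the earlier upload preserved modification times.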
I have not had any luck improving the initial scan speed - it still takes about 2 hours (for a ~1 TiB repo). The actual sync speed is OK at around 35 MiB/s; a direct rclone sync via SFTP runs at around 60 MiB/s.