Upload and hash at the same time?

When I run a backup, Kopia takes a minute to hash some files, then pauses hashing while it waits for the upload. The repository is using rclone.

Is there a way to allow them to run at the same time?

It’s mostly because rclone is slow. Hashing is pipelined with uploads, but sometimes the upload queue grows large enough that hashing has to stop until it drains.
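
To illustrate the stall described above, here is a minimal Go sketch of a hasher feeding a bounded upload queue. It is only a model under stated assumptions, not Kopia’s actual code: `runPipeline`, the queue size, and the sleep-based costs are all hypothetical. When the simulated uploader is slower than the hasher, the queue fills and the hasher blocks.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runPipeline simulates a hasher feeding a bounded upload queue.
// Hypothetical model, not Kopia's code. It returns how many blocks found
// the queue full, i.e. how often hashing had to pause for the uploader.
func runPipeline(blocks, queueSize int, hashCost, uploadCost time.Duration) int {
	queue := make(chan int, queueSize)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // a single uploader draining the queue
		defer wg.Done()
		for range queue {
			time.Sleep(uploadCost) // pretend to upload one block
		}
	}()

	stalls := 0
	for i := 0; i < blocks; i++ {
		time.Sleep(hashCost) // pretend to hash block i
		select {
		case queue <- i: // room in the queue: hashing continues at once
		default:
			stalls++
			queue <- i // queue full: wait until the uploader catches up
		}
	}
	close(queue)
	wg.Wait()
	return stalls
}

func main() {
	fast := runPipeline(20, 4, 2*time.Millisecond, 0)
	slow := runPipeline(20, 4, 0, 5*time.Millisecond)
	fmt.Println("stalls with fast uploads:", fast)
	fmt.Println("stalls with slow uploads:", slow)
}
```

In this model a larger queue only delays the first stall. As long as uploads are slower than hashing on average, the hasher eventually ends up waiting.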

I’ve taken a look at network graphs, and the upload happens in short bursts: about one second of bandwidth use followed by several seconds at zero. By contrast, rclone sync continuously saturates all available bandwidth. Is this due to latency? If so, is there a way to start multiple uploads simultaneously to keep the data flowing? With rclone sync, 8 uploads run in parallel.
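
As a rough illustration of why parallel uploads can hide per-request latency, here is a small Go sketch. It is a simulation only: `uploadAll`, the fixed 10 ms latency, and the worker counts are assumptions made for the example, and it says nothing about how Kopia or rclone actually schedule transfers.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// uploadAll simulates sending `blobs` objects, each costing a fixed
// round-trip `latency`, using `workers` concurrent uploaders.
// Hypothetical model. It returns the total wall-clock time.
func uploadAll(blobs, workers int, latency time.Duration) time.Duration {
	start := time.Now()
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				time.Sleep(latency) // stand-in for one upload round trip
			}
		}()
	}

	for i := 0; i < blobs; i++ {
		jobs <- i // hand each blob to whichever worker is free
	}
	close(jobs)
	wg.Wait()
	return time.Since(start)
}

func main() {
	fmt.Println("1 uploader: ", uploadAll(16, 1, 10*time.Millisecond))
	fmt.Println("8 uploaders:", uploadAll(16, 8, 10*time.Millisecond))
}
```

With a single uploader the latencies add up one after another. With several uploaders they overlap, which matches the steady throughput seen when rclone sync runs its transfers in parallel.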

The major problem here is that Rclone is basically acting as middleware allowing Kopia to connect to storage providers that Kopia does not support natively. Any time you have such middleware, there will always be at least some loss in performance, as Kopia needs to ensure that the files are being transferred correctly by Rclone. So you will never match the full speed that Kopia or Rclone have on their own. Still, perhaps it is possible for Kopia to run Rclone with different parameters that could speed up the process, but someone would need to submit a PR for that.

Sorry, I’m confused. What process is being performed by Kopia and Rclone?

I’ll try to take a look and see if the process can be optimized.

All I meant is that Kopia is essentially using Rclone as middleware, and that will always be slower than if Kopia were uploading directly. Granted, that does not mean that Kopia’s Rclone integration has been fully optimized. You can look at the rclone-specific code (kopia/rclone_storage.go at master in the kopia/kopia repository on GitHub) to see exactly what is going on under the hood.