I noticed a huge discrepancy between upload and download speeds when using a Wasabi S3 socket. While the upload runs at approx. 88 MB/s, the download/restore runs at only about 4 MB/s. Is there anything I can do about that? That’s roughly 20x lower throughput, and it really makes kopia rather useless for backing up large files to S3.
Kopia gets high upload throughput by massively parallelizing the uploads. We have not yet done much in terms of similar optimizations for download throughput, but it’s coming - we have various ideas for doing this, we just have not gotten to it yet.
If you’re looking to restore a significant percentage of the repository very quickly, you should be able to do this by syncing the entire repo to a local filesystem location (the sync can be heavily parallelized) and restoring from there:
Basically:
kopia repo sync-to filesystem --path /some/path --parallel=16
kopia repo connect filesystem --path /some/path
kopia restore ...
On a 500 Mbps downlink, I can get about 45 MB/s sync throughput this way (from Wasabi us-west-1 to Ziply around Seattle).
One more question: are you downloading lots of small files or a small number of very large files?
Yeah… we currently run a 1GbE link, so getting 88 MB/s up seems comparable. As with all my previous tests, I am running on large files, although I have not started to upload my large multi-GB KVM dump files. In this case, these were ISO images of various Linux distros, ranging from a couple of hundred MB to a few GB.
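To put the link speeds (quoted in bits per second) and the observed throughputs (quoted in bytes per second) on the same scale, a small conversion helps. This is only a rough sketch: it assumes decimal units and 8 bits per byte, ignores protocol overhead, and the helper names are made up for illustration.

```python
def mbps_to_mb_per_s(link_mbps: float) -> float:
    """Convert a link speed in megabits/s to megabytes/s (8 bits per byte)."""
    return link_mbps / 8.0


def utilization(observed_mb_per_s: float, link_mbps: float) -> float:
    """Fraction of the theoretical link capacity that the observed rate uses."""
    return observed_mb_per_s / mbps_to_mb_per_s(link_mbps)


if __name__ == "__main__":
    # Figures taken from the posts above; overhead is ignored (assumption).
    cases = [
        ("1GbE uplink, 88 MB/s upload", 1000.0, 88.0),
        ("500 Mbps downlink, 45 MB/s sync", 500.0, 45.0),
    ]
    for label, link_mbps, observed in cases:
        ceiling = mbps_to_mb_per_s(link_mbps)
        share = utilization(observed, link_mbps)
        print(f"{label}: ceiling {ceiling:.1f} MB/s, utilization {share:.0%}")
```

Under these assumptions both the 88 MB/s upload and the 45 MB/s sync sit at roughly 70% of their respective link ceilings, which is why they look comparable.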
As far as syncing a repo to a local file system first, this doesn’t seem feasible to me for multi-TB repos, which is what I’d be dealing with. My current KVM backup repo is already about 19TB in size…
…and just out of curiosity, I did try to replicate the 19GB repo to my local client by running kopia repo sync-to. Using --parallel=32, the data downloaded at approx. 26 MB/s. So syncing a 19TB repo back to some local storage would take up to 9 days - geez…
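The "up to 9 days" estimate can be checked with a quick back-of-the-envelope calculation. This is a sketch under stated assumptions: decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), a constant transfer rate, and a hypothetical helper name.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_tb: float, rate_mb_per_s: float) -> float:
    """Days needed to move size_tb terabytes at a steady rate_mb_per_s."""
    size_bytes = size_tb * 1e12            # decimal terabytes (assumption)
    bytes_per_second = rate_mb_per_s * 1e6  # decimal megabytes (assumption)
    return size_bytes / bytes_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # 19 TB repo at the rates mentioned in this thread.
    for rate in (4.0, 26.0, 45.0):
        print(f"{rate:>5.1f} MB/s -> {transfer_days(19.0, rate):.1f} days")
```

At 26 MB/s this comes out at roughly eight and a half days for 19 TB, in line with the figure above; at the 4 MB/s restore rate it would be closer to two months.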
Yeah. Your numbers seem unexpectedly low. Can you measure raw Wasabi throughput using some other simple upload/download tool (both single-connection and parallelized) and post the results here?
You can probably approximate single-threaded throughput by running repo sync-to filesystem without parallelism (just let it run for 10 minutes or so).
Well… Wasabi has its “own” Ookla speedtest servers at its locations, so you can actually run a speedtest against a Wasabi site of your choice. One issue is definitely that our provider seems to have peering problems with the network that Wasabi’s eu-central-1 site operates on: it only downloads at 80 Mbps while it uploads at up to 450 Mbps. I checked that against a “regular” speedtest, where the difference between upload and download was only marginal.
What also may be good to know is that Wasabi allows up to 100 streams per bucket in parallel before choking the 101st one… but that shouldn’t be an issue for me at the moment…