Slow backup process

Hi everyone,
I am trying out kopia, but backups have become very slow after a few initial snapshots that ran without any problem.
Here is all the relevant information and the actions I have already taken:

  • Full size of the root folder I am backing up is ~9 TB
  • I am using a gdrive repository, connected to a folder in one of my Shared Drives
  • Final aim is to create the first “real” backup of 9 TB by performing multiple snapshots in which I back up only some folders (progressively removing some ignore-rules from the policy). Once I have created the first 9 TB “real” backup, I will keep backing up daily without any ignore-rule
  • I started by performing two snapshots of ~200 GB and ~300 GB respectively and they were completely fine
  • Once the second snapshot was finished, I found out that one can upload at most 400k files to a Shared Drive. This could have become a problem later, so I needed a way to reduce the number of files produced by kopia
  • I found out that I could do this by setting --max-pack-size-mb=40, so I set it.
  • Then I performed two more snapshots, one of ~500 GB and one of ~400 GB. The first one was fine (uploading ~40-50 MB files), but the second one had been running for 29 hours when I “stopped” it
  • “Stopped” means that I pressed Ctrl + C on the running kopia snapshot command. A few seconds after that, kopia started a full maintenance, which has been running since last Friday (3 days ago) and is still ongoing. Specifically, it has been at “Rewriting contents from short packs” since Friday
  • I am monitoring the gdrive folder and uploads are actually happening, but during this last snapshot kopia uploaded less than 1 file per minute on average. Often the timestamps of the two most recently uploaded files are ~5 minutes apart. This happened both during the kopia snapshot command itself and during the full maintenance that followed when I stopped it.
  • I have also monitored the network speed, and it has been stable the whole time at approx. 250 Mbit/s down and 280 Mbit/s up.
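
To put the numbers in the list above side by side, here is a rough back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: the function names are made up for illustration, "MB" in the pack size is taken to mean 2**20 bytes, every pack is assumed to be full, and index or other metadata files, compression and deduplication are ignored. The 20 and 100 values are just illustrative pack sizes, not confirmed kopia defaults.

```python
import math

TOTAL_BYTES = 9 * 10**12            # ~9 TB of source data (decimal TB, assumed)
SHARED_DRIVE_ITEM_LIMIT = 400_000   # item limit quoted in the post
UPLOAD_MBIT_S = 280                 # measured uplink speed from the post


def pack_count(total_bytes, pack_size_mb):
    """Approximate number of pack files if every pack is completely full."""
    return math.ceil(total_bytes / (pack_size_mb * 2**20))


def seconds_per_pack(pack_size_mb, link_mbit_s):
    """Ideal transfer time of one full pack at the given link speed."""
    return pack_size_mb * 2**20 * 8 / (link_mbit_s * 10**6)


def effective_mbit_s(pack_size_mb, packs_per_minute):
    """Throughput implied by uploading N full packs per minute."""
    return pack_size_mb * 2**20 * 8 * packs_per_minute / 60 / 10**6


if __name__ == "__main__":
    for size_mb in (20, 40, 100):
        n = pack_count(TOTAL_BYTES, size_mb)
        verdict = "over" if n > SHARED_DRIVE_ITEM_LIMIT else "under"
        print(f"{size_mb} MB packs: ~{n} files ({verdict} the item limit)")

    print(f"ideal time per 40 MB pack: {seconds_per_pack(40, UPLOAD_MBIT_S):.1f} s")
    print(f"1 pack/minute is only ~{effective_mbit_s(40, 1):.1f} Mbit/s")
```

Under these assumptions a 40 MB pack should take little more than a second on a 280 Mbit/s uplink, while one file per minute corresponds to under 6 Mbit/s, so raw bandwidth does not look like the limiting factor.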

Minor information (maybe needed):

  • It seems that the Google Drive API limits uploads to 750 GB per day, so I am using the policy’s ignore-rules to restrict the scope of each snapshot I take
  • The Shared Drive item-count limit could still be a problem in the future; if so, I will change the pack size again
  • I am attaching a screenshot of where I am now “blocked”
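
As a rough illustration of the staged approach under a 750 GB/day cap, the sketch below estimates the minimum number of upload days and groups folders into daily batches. It is a minimal sketch, not something kopia provides: the function names and folder names are hypothetical, the sizes are borrowed from the snapshot sizes mentioned earlier, and compression and deduplication are ignored (so real uploads may be smaller).

```python
def min_upload_days(total_gb, daily_cap_gb=750):
    """Lower bound on the number of days needed to upload total_gb."""
    return -(-total_gb // daily_cap_gb)  # integer ceiling division


def plan_batches(folder_sizes_gb, daily_cap_gb=750):
    """Group folders, in the given order, into batches that fit the daily cap.

    folder_sizes_gb is a list of (name, size_in_gb) tuples. A folder larger
    than the cap cannot be placed and raises ValueError.
    """
    batches, current, used = [], [], 0
    for name, size in folder_sizes_gb:
        if size > daily_cap_gb:
            raise ValueError(f"{name} ({size} GB) exceeds the daily cap")
        if used + size > daily_cap_gb:
            # The current batch is full: close it and start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    print(min_upload_days(9000))  # ~9 TB expressed in GB
    # Hypothetical folder names with the snapshot sizes from the post.
    folders = [("photos", 200), ("docs", 300), ("video", 500), ("archive", 400)]
    for day, batch in enumerate(plan_batches(folders), start=1):
        print(f"day {day}: snapshot {', '.join(batch)}")
```

Each batch would correspond to one snapshot, with the ignore-rules for those folders removed from the policy before it runs.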

Do you have any suggestion about the sudden slowness of the backup process?
Could it be related to changing --max-pack-size-mb to 40 partway through, even though the first snapshot after this change was completely fine?

Thanks in advance for any help or pointers!

I cannot add more links to the original post, so I have to add this as a reply.

I forgot to add the “latest.log” file, maybe it can help.

How did you create the repo? Is it through Rclone or the native option (see Repositories | Kopia)? My guess is that something in the pipeline is a bottleneck; I’ve seen some threads pointing to Rclone limitations. What platform are you on? Which Kopia version? What are your expected upload/download speeds?

Thanks for the reply.

The repo was created using kopia’s native support for Google Drive, i.e. gdrive, so I would say the threads you mentioned do not apply to my case.
I’m on Ubuntu 18.04 and I’m using the following kopia version: 0.11.3 build: 317cc36892707ab9bdc5f6e4dea567d1e638a070 from: kopia/kopia (output of kopia --version)
In terms of up/down speed I would expect something similar to my previous report (approx. 250 Mbit down and 280 Mbit up) as this is the network speed I have everywhere else.

Interesting. I would hazard a guess that Google itself is causing the throttling. It may be worth filing a bug report on the GitHub repo. Unfortunately, I cannot confirm this (I don’t have that much space on G Drive).

Thank you again for the reply.
I would not say the problem is Google, since the logs seem to report continuous activity.
Anyway, I will try to file a bug on the GitHub repo and hope for some help.

Thank you again!

Where can we use that --max-pack-size-mb flag? Is it a policy setting? I couldn’t find it in the help for policy set or snapshot create.

kopia repository set-parameters --help

has it.

Do keep in mind that if you change the size, the next maintenance (or at least the next full maintenance) will start to resize the blobs. This takes quite a lot of RAM, since it holds a number of old-size blobs in RAM to build each new-size blob on the remote side. We bumped our S3 packs to a larger size (better for the S3 service) and were a bit surprised that the smallish VM that “owns” the repo got quite loaded. We thought it would be a bit more … serial in nature, but it wasn’t. It was not too bad and we managed to pull through, but take it as a small hint that setting this early or right after creation is a good thing.
