Maintenance run crashing minio/s3 backend
I have a repository with around 400 GB / 14000 blobs backed up to a (self-hosted) minio server. I had not had any issues until the first maintenance run. The routine makes it to the point where it is looking for unreferenced blobs, where it stays for a while until it encounters an error.

The cli log contains numerous lines like this:

2021-12-01T09:11:35.951849Z DEBUG kopia/repo [STORAGE] ListBlobs {"prefix": "pe", "resultCount": 0, "error": "Get \"\": net/http: timeout awaiting response headers", "duration": "10m0.165193131s"}

The underlying reason is that the minio server gets killed by the OS as it exceeds the maximum number of threads. Usually (snapshotting, verification, repository sync-to) it is running at far below 100 minio threads. Same for the initial content listing during maintenance. However, when looking for unreferenced blobs, threads shoot up in batches of ~2000 every couple of minutes until they eventually hit 11446, which is the hard limit currently imposed by the OS.
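For anyone hitting the same wall: on Linux, the system-wide thread ceiling that minio ran into can be inspected (and temporarily raised) via sysctl. This is just a workaround sketch, not a fix for the underlying behavior, and the exact PID and whether raising the limit is safe on a small box are up to you; note that systemd's TasksMax can impose its own cap on top of this:

```shell
# Inspect the current system-wide thread ceiling
# (this is the kind of limit the ~11446 figure comes from)
cat /proc/sys/kernel/threads-max

# Per-process limits for a running minio process
# (replace <pid> with minio's actual PID)
# grep 'Max processes' /proc/<pid>/limits

# Temporarily raise the ceiling (requires root; resets on reboot)
# sysctl -w kernel.threads-max=20000
```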

Now, I have never had issues with minio before, and it might very well be a bug on their side. However, at first glance it looks like kopia is leaking threads here. Or is this intentional? It appears that a thread is kept alive for every single one of the ~14000 blobs.
I understand that my minio instance is extremely underpowered compared to any commercial s3 service. It just seems suspicious that it never broke a sweat before, no matter what it had to handle in this private-use-case scope, yet is so exorbitantly overchallenged by the kopia maintenance run.

Thanks a lot!

Forgot the actual error reported to stdout:

2021-12-01T09:23:10.794863Z ERROR kopia/cli error deleting unreferenced blobs: error looking for unreferenced blobs: error iterating blobs: Get "[ DOMAIN ]/kopia/?delimiter=%2F&encoding-type=url&fetch-owner=true&list-type=2&prefix=p7": net/http: timeout awaiting response headers

As expected, this points to the IterateUnreferencedBlobs() function.

Great catch. Please file an issue on GitHub so we can prioritize fixing it.

So I did. Thanks once more for the quick reply!

For reference: