I made a foolish mistake when creating a new backup and have run out of storage space on the remote SFTP server where I keep my Kopia repository. I am seeking advice on how to proceed.
I am running Kopia on Rocky Linux 8.9 and use the web UI to manage everything. The repository is located on a remote SFTP server with 1TB of space that is currently full. Today I noticed that the web UI says Maintenance has been running for about a day and a half and is getting errors about being unable to create new files.
14:57:44.375 got error cannot create temporary file: unrecognized error when creating temp file on SFTP: /home/ssc/kopia/p/a28/cf9a30cf47f2d3b9c5c15ed5a8cc1-s02bfee4d49bab31d128.f.tmp.7842e54475255a61: sftp: "Failure" (SSH_FX_FAILURE) when PutBlob(pa28cf9a30cf47f2d3b9c5c15ed5a8cc1-s02bfee4d49bab31d128) (#9), sleeping for 3.844335937s before retrying
14:57:48.219 unable to rewrite content "62adce4d8552140922221499515b70e8": unable to flush old pending writes: error writing previously failed pack: error writing pack: can't save pack data blob pa28cf9a30cf47f2d3b9c5c15ed5a8cc1-s02bfee4d49bab31d128: error writing pack file: unable to complete PutBlob(pa28cf9a30cf47f2d3b9c5c15ed5a8cc1-s02bfee4d49bab31d128) despite 10 retries: cannot create temporary file: unrecognized error when creating temp file on SFTP: /
I have tried pressing the cancel button in the web UI, and the UI has updated to show “Cancelling”. However, it’s been a couple of hours now and it has not canceled.
My plan was to stop this maintenance task if I can, mark a few snapshots to be deleted, and manually run full maintenance to clear the old blobs.
If you cannot recover from being out of space, you can always transfer the whole repo elsewhere, e.g. to a local disk, then delete snapshots, run maintenance etc. there and transfer it back. As it is an SFTP site, you can use any tool you prefer: FileZilla, rclone etc.
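A rough sketch of that round trip, assuming rclone does the copying. The remote name `sftpremote`, the local path `/mnt/scratch/kopia` and the snapshot ID are placeholders I made up; `/home/ssc/kopia` is the repository path from the log above. With `DRY_RUN=1` (the default here) it only prints the commands so you can review them first.

```shell
#!/usr/bin/env bash
# Sketch only: pull the repository to local disk, prune there, push it back.
# REMOTE and LOCAL are hypothetical names; adjust them to your setup.
REMOTE="${REMOTE:-sftpremote:/home/ssc/kopia}"
LOCAL="${LOCAL:-/mnt/scratch/kopia}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

rescue_repo() {
  local snapshot_id="$1"   # manifest ID taken from `kopia snapshot list`

  # 1. Copy every blob from the full SFTP server to local disk.
  run rclone sync "$REMOTE" "$LOCAL"

  # 2. Point kopia at the local copy (it will ask for the repository password).
  run kopia repository connect filesystem --path "$LOCAL"

  # 3. Drop a snapshot you no longer need, then run full maintenance.
  run kopia snapshot delete "$snapshot_id" --delete
  run kopia maintenance run --full

  # 4. Push the result back. --delete-before removes stale blobs on the
  #    remote first, so the upload is not blocked by the full disk.
  run rclone sync --delete-before "$LOCAL" "$REMOTE"
}

# Example: rescue_repo <snapshot-id>
```

Stop the kopia server / web UI before copying so nothing writes to the repository mid-transfer. Also keep in mind that kopia maintenance applies safety delays, so space may not be freed on the very first run; check the kopia docs on maintenance safety before trying to bypass that.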
And my hint for the future: create a dummy 1GB placeholder file on your remote. This is what I always do. Despite best efforts, it happens that some software fills all available space to the brim. In such cases I always have 1GB of rescue space that I can get back by deleting the “safety” file.
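For what it's worth, here is one way to make such a file, as a minimal sketch. The function name and the file name are just my choices. It writes real zero bytes with `dd` so the space is actually allocated (a sparse file would not reserve anything). If you only have SFTP access, create it locally and upload it.

```shell
#!/usr/bin/env bash
# Sketch: create a placeholder file filled with real (non-sparse) data.
# Usage: make_placeholder <path> <size_in_MiB>
make_placeholder() {
  local path="$1" size_mib="$2"
  # Refuse to overwrite anything that already exists.
  if [ -e "$path" ]; then
    echo "refusing to overwrite $path" >&2
    return 1
  fi
  # dd writes actual bytes, so the space is really allocated.
  dd if=/dev/zero of="$path" bs=1048576 count="$size_mib" 2>/dev/null
}

# Example (not run here): a 1 GiB file, then upload it with your SFTP client.
#   make_placeholder ./DELETE_ME_IN_EMERGENCY 1024
#   sftp user@host   ->   put DELETE_ME_IN_EMERGENCY
```

If the server's file system compresses data, zeros may take up almost no space; in that case reading from `/dev/urandom` instead of `/dev/zero` should do the job.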
Using a robust file system server-side that supports and can enforce quotas is a nice alternative to the 1GB placeholder: you set a quota 1 or 2 GB lower than the available space and raise it in case of emergency.
Also, many FS support a «root reservation» of (as far as I remember) 5%, meaning that only root is able to fill the storage completely. Regular users run out of space early, leaving 5% of free space available only to root. If that reservation exists on your storage, you might be able to clear it to reclaim that space for your backup user.
Last thing: add storage monitoring with proper alerts so you can prevent saturation next time.
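A very small monitoring sketch, assuming you can run `df` against the storage (server-side shell, or a local mount of it). The function names are mine. It parses `df -P` output and returns non-zero once usage reaches a threshold, so it can be dropped into cron and wired to whatever alerting you use.

```shell
#!/usr/bin/env bash
# Sketch: warn when the filesystem holding a path is fuller than a threshold.

# Read `df -P` output on stdin and print the usage as a bare integer (e.g. 93).
parse_df_percent() {
  awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Usage: check_space <path> <threshold_percent>
# Returns 0 if below the threshold, 1 if at or above it, 2 if usage is unknown.
check_space() {
  local path="$1" threshold="$2" used
  used="$(df -P "$path" 2>/dev/null | parse_df_percent)"
  # Bail out if df failed or printed something that is not a number.
  case "$used" in
    ''|*[!0-9]*) echo "cannot determine usage of $path" >&2; return 2 ;;
  esac
  if [ "$used" -ge "$threshold" ]; then
    echo "WARNING: $path is ${used}% full (threshold ${threshold}%)" >&2
    return 1
  fi
  return 0
}

# Example (not run here): check_space /home/ssc/kopia 90 || send-your-alert
```

With SFTP-only access, the OpenSSH `sftp` client has its own `df` command that reports free space when the server supports it, so the same idea can be applied to its output.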
Edit: let me elaborate about the FS reservation.
/dev/splunkdata/lvol0 is an EXT4 FS on a Rocky Linux 8.x server.
I could free this reservation (down to 0%) like this:
tune2fs -m 0 /dev/splunkdata/lvol0
So as long as you are using a FS with reservation (EXT4 but not XFS, afaik) AND that you don’t use root (server-side) to write your backups, you should be able to reclaim some space and solve your problem.
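Before changing anything, you can check how much is actually reserved. A small sketch, assuming the usual `tune2fs -l` field names ("Block count" and "Reserved block count"); the helper name is made up:

```shell
#!/usr/bin/env bash
# Sketch: compute the reserved-block percentage from `tune2fs -l` output on stdin.
reserved_percent() {
  awk '
    /^Block count:/          { total = $3 }
    /^Reserved block count:/ { reserved = $4 }
    END {
      if (total > 0) printf "%.1f\n", 100 * reserved / total
      else exit 1
    }'
}

# Example (needs root on the server, not run here):
#   tune2fs -l /dev/splunkdata/lvol0 | reserved_percent
```

On a 1TB volume, a 5% reservation is roughly 50GB, which would be plenty of room to finish a maintenance run.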
You are right and your trick with the placeholder is nice.
On the other hand, most cloud providers allow customers to add or remove resources as needed. In that case, adding a bit of storage for a few hours / days / weeks (depending on available options) can be the best way to perform repository cleanup before going back to the original 1TB storage size.
If @Analog is stuck in the middle, having neither the advantages of a full server with root nor the advantage of an «elastic» or «semi-elastic» cloud storage, then it might be a good idea to re-think this hosting.
I would use an rclone union served using rclone serve sftp. This way I could temporarily add extra space, e.g. from my local drive, to do all the pruning, and then return to what it was before (after moving all data created in the local part of the union to the original SFTP). It requires a bit of gymnastics but is actually very simple.
So it would be:
union of SFTP + local disk folder -> rclone serve sftp -> kopia
The union has to be configured to write new files to the member with the most free space (so it would be the local disk initially).
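Roughly, the config could look like the sketch below. The remote names (`sftpremote`, `kopia-union`), the local directory, the port and the credentials are all placeholders. `create_policy = mfs` is the rclone union policy that puts new files on the upstream with the most free space; it relies on each upstream being able to report free space, which may not work with every SFTP server.

```shell
#!/usr/bin/env bash
# Sketch: append an rclone config section for a union of the full SFTP remote
# and a local scratch directory. All names below are placeholders.
# Usage: write_union_config <config_file> <sftp_upstream> <local_dir>
write_union_config() {
  local conf="$1" sftp_upstream="$2" local_dir="$3"
  cat >> "$conf" <<EOF
[kopia-union]
type = union
upstreams = ${sftp_upstream} ${local_dir}
create_policy = mfs
EOF
}

# Example (not run here), assuming an existing rclone remote named "sftpremote":
#   write_union_config ~/.config/rclone/rclone.conf \
#       sftpremote:/home/ssc/kopia /mnt/scratch/kopia-overflow
#   rclone serve sftp kopia-union: --addr localhost:2022 --user kopia --pass 'choose-a-password'
# Then connect kopia to localhost:2022 as an SFTP repository and do the pruning.
# Afterwards, move the overflow back once the remote has room again:
#   rclone move /mnt/scratch/kopia-overflow sftpremote:/home/ssc/kopia
```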
Thank you all for the information and suggestions on how to proceed and what to do in the future to avoid this from happening.
Because of my limited ability to control my storage environment at my provider, this solution will be a great fail-safe in the future if I run into similar issues. I will create a dummy ~1GB placeholder file for this purpose.
I will also be doing exactly this. I made the mistake of relying on my provider to send me automated warnings about usage, but in this instance it was too little too late by the time I received notice, so I will be sure to monitor it myself.
This is indeed my setup
I’m on contract with my current provider for a set period of time and amount of space. While I could adjust the plan to get more storage and then revert to my 1TB allotment after completing maintenance, I would unfortunately lose my current plan’s billing agreement and be forced to pay a higher rate. Probably time to look for another provider.
I have opted to transfer the current repository to local storage, attempt to finish maintenance there, and then transfer it back once complete.