Small progress.
I’ve added all the classes and methods to use the native GCS backend with versioning and retention, but I’m fighting the following error:
# ./kopia_fix repository create gcs --credentials-file=/root/.config/auth/test-backup_service-account.json --bucket x --retention-mode GOVERNANCE --retention-period 180d --log-level debug
Enter password to create new repository:
Re-enter password for verification:
Initializing repository with:
block hash: BLAKE2B-256-128
encryption: AES256-GCM-HMAC-SHA256
splitter: DYNAMIC-4M-BUZHASH
ERROR unable to connect to repository: error connecting to repository: repository not initialized in the provided storage
Two files are created, kopia.blobcfg and kopia.repository, and I can see the retention set on both, so at least one of my customizations (the one that sets the retention when saving a blob) is working (180d is referenced only on the Kopia command line, so only Kopia can set a 180d expiry, no other process).
But I don’t know why I’m getting this error; probably with versioning enabled I’m unable to read some files from the backend. Any ideas?
I think I’m very close to having a working GCS implementation, maybe at 95% of the work. Once I’m able to create a repository and a snapshot, I’ll post a PR on GitHub.
I wish I knew, but I have close to zero knowledge of GCS. At least for S3, versioning has no impact on reading. Unless access permissions allow only write but not read?
So, as usual in dev, the last 5% of your project will take 95% of the time :)
Permissions are good, as I’m using an administrator account.
Almost done: the repo is created as expected, extension mode is enabled, and a snapshot is taken properly, BUT I’m fighting a 403 Forbidden error when extending the retention date. I don’t know why; probably I’m sending the wrong JSON, but it looks good to me.
@kapitainsky could you please confirm whether calling Kopia with --credentials-file pointing to a service account exported from Google in JSON format is correct? The API call to change the retention looks correct, but I’m always getting an Access Denied error. I’m starting to think the access token is not being used as expected.
Hello everyone. I’m new to Kopia and I have a similar problem.
I would like to consider two scenarios. I also use S3-compatible storage with compliance mode and versioning. The Kopia repo is configured and compliance mode is set in Kopia. Test snapshots have been created successfully and are displayed (“kopia snapshot list”).
X:\Kopia>kopia snapshot list
2024-03-14 11:02:24 CET k05958bbacdzz926711b3448a9a54e418 1.2 MB drwxrwxrwx files:1 dirs:1 (latest-2)
2024-03-14 11:04:12 CET kdd44auuc3a678ef5944b5d082f3e5c96 9.8 MB drwxrwxrwx files:2 dirs:1 (latest-1,hourly-1,daily-1,weekly-1,monthly-1,annual-1)
Scenario A)
Delete all existing snapshots via the Kopia CLI: “kopia snapshot delete xyz --delete”
Snapshot list → all snapshots are gone
Disconnect Repo
Reconnect with switch --point-in-time=2024-03-14T11:05:20.000Z
Snapshot List: Empty
I’m stuck at this point and came across this forum article. After some trial and error, I stumbled upon the fact that in my case there was a time offset of one hour.
That means:
Reconnecting with the switch --point-in-time=2024-03-14T10:05:20.000Z was successful.
At this point the interesting question is how I get the snapshots back into the main repo. I still have to figure that out.
Scenario B)
Connect directly to the S3 storage and delete all data, including the repo files.
A Kopia connect is then no longer possible:
ERROR error connecting to repository: repository not initialized in the provided storage
Since versioning is active on the storage side, all objects just receive a delete marker. After I removed these delete markers, the objects became active again and I was able to reconnect to the repo with Kopia and see all the snapshots.
I’m still at the beginning and still have some tests to do to understand how Kopia works properly.
This repo has been attacked and damaged, but thanks to the protection in place you can still access its past state.
Connect to your repo using --point-in-time and then copy the whole repo to another one using sync-to. You can also use other tools like rclone to copy a specific point-in-time bucket state somewhere else (it can be another bucket).
From the Kopia functionality perspective, you should use the same method as in A: connect using --point-in-time to a past state you think is OK. This gives you the opportunity to restore all data.
Restoring the whole repo: look at my previous post. Doing it your way only works because you know what was damaged and how. In real-life scenarios you simply have no idea what was deleted and what was maybe changed and then deleted, etc.
But you are doing something very important :) Testing. Document what you did and how, etc. This is exactly what you will need when something happens. Maybe never, but if it does, it will be super handy.
I just saw this discussion, but it is a bit unclear to me how this works with deduplication.
Is the retention period actually reset for reused p blocks in newer snapshots (e.g. if a part of my current system was already there 5 years ago)? Or is the retention period applied to the whole bucket, more like versioning?
If you give an example, it will maybe become clearer what exactly you are asking about…
Object lock is applied (and periodically refreshed) to every file in your repository. So files are retained (as current or past versions) for the specified amount of time.
Hi, I don’t know what you mean by an example, but I can explain, maybe.
This is about how Kopia stores files:
p blobs store the bulk of snapshot data; they are mostly written once and not accessed again, except during compactions as part of full maintenance or when (optionally) testing your snapshots.
Furthermore, the chunks get hashed and deduplicated, so, as a simplified example, if I copy a file “example.zip” that I created 5 years ago to a new file “example_copy.zip”, this file is not stored again; only the snapshot listing is updated. So if I have a retention period of 90 days on “example.zip”, it will be long expired, from what I understand. A new snapshot would then be able to recover neither example.zip nor example_copy.zip if malware deletes them. So the only sensible policy would be to retain for 90 days after deletion/modification (this would rather be NoncurrentVersionExpiration in AWS, I would guess).