When Kopia is creating snapshots, how does it check whether a file has changed?
Is it by comparing the file date?
Does Kopia retain a local or remote database of file hashes?
I assume that if a copy of a file is not present locally and Kopia needs to check it, it has to download the file from the remote to see whether it has changed, which would mean high bandwidth usage.
Does Kopia have a means of checking hashes through a script running on the server, as I presume rsync and borg do?
Well… have a look at the “Advanced” section of the documentation. Once you understand the basic principles Kopia operates on, you will also understand how Kopia determines whether a file has changed or is not yet present in the repo at all.
Kopia uses a cache to not re-download everything over and over again.
You can run kopia snapshot verify to make sure your repository is in good shape and no files are missing. This command only reads the metadata, not the whole content, so it can be used frequently without using much bandwidth. Running kopia snapshot verify is generally not necessary because Kopia runs all of these checks when maintenance is done (every 24 hours by default).
Running kopia snapshot verify with the --verify-files-percent flag instructs Kopia to read the whole content or blob (information on blobs) and therefore needs to download the blob when running against a remote repository.
My preferred way of running Kopia with remote repositories is using a Kopia Repository Server (KRS). KRS acts as a proxy between your clients and the storage. KRS can use every storage location that Kopia supports, including Amazon S3 or rclone. The server running KRS can do all the maintenance and bandwidth-intensive tasks, including kopia snapshot verify --verify-files-percent.
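To make the difference between the two verify modes concrete, here is a small Python sketch that only builds the two command lines. The helper name verify_command is made up for illustration, and the value 10 is just an example percentage. Nothing is executed against a repository.

```python
import shlex
from typing import List, Optional


def verify_command(files_percent: Optional[float] = None) -> List[str]:
    """Build the argv for `kopia snapshot verify` (hypothetical helper)."""
    cmd = ["kopia", "snapshot", "verify"]
    if files_percent is not None:
        if not 0 <= files_percent <= 100:
            raise ValueError("files_percent must be between 0 and 100")
        # With this flag Kopia also reads file contents, so blobs get downloaded.
        cmd.append(f"--verify-files-percent={files_percent:g}")
    return cmd


# Metadata-only check, cheap on bandwidth.
print(shlex.join(verify_command()))
# Additionally read a sample of the file contents.
print(shlex.join(verify_command(10)))
```

On a machine where kopia is installed and connected to a repository, either list could be handed to subprocess.run.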
No, if Kopia decides a file should be sent, it will use the chosen “splitter” algorithm to cut the file into pieces, and then locally (encrypt and compress and) checksum those pieces. Then it “asks” the remote end whether pieces with those checksums already exist. If they do, they are not sent again. If they don’t, they are sent over. Incidentally, this is how it “knows” if you rename or move files around. Their names, paths and permissions will be sent over again, but the contents will quickly be found to already exist on the remote side and not be sent again.
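The idea can be sketched in a few lines of Python. This is a toy model, not Kopia's actual code: the fixed 8-byte splitter, plain SHA-256 and the dict standing in for the remote repository are all assumptions for illustration, and encryption and compression are left out.

```python
import hashlib


def split_chunks(data: bytes, size: int = 8) -> list:
    """Fixed-size splitter, a simple stand-in for a real splitter algorithm."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def upload(data: bytes, repo: dict) -> int:
    """Send only the chunks whose checksum the repo lacks; return how many were sent."""
    sent = 0
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in repo:  # "ask" whether this piece already exists
            repo[digest] = chunk
            sent += 1
    return sent


repo = {}
first = upload(b"hello world, hello kopia", repo)    # new file: every chunk is sent
renamed = upload(b"hello world, hello kopia", repo)  # same content under another name
edited = upload(b"hello world, HELLO kopia", repo)   # only the changed chunks are sent
print(first, renamed, edited)
```

The renamed copy costs no chunk uploads at all, and the edited copy only costs the chunks that actually differ, which is why moving files around is cheap.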