The default sharding options of kopia and kopiaUI are different: kopia uses [1, 3] while kopiaUI uses [3, 3].
The problem is that this creates lots of folders for s-files and x-files (session files and index files), and nearly every action lists these folders, while listing many folders on an HDD or network drive is very slow. (Using the --flat option puts many files in one folder, which is slow too.)
I think the default .shards file could be:
{
  "default": [1, 2],
  "maxNonShardedLength": 20,
  "overrides": [
    { "prefix": "q", "shards": [1, 1] },
    { "prefix": "s", "shards": [1] },
    { "prefix": "x", "shards": [1] }
  ]
}
The folder structure will be:
p\                  # data blocks
    04\
    25\
    a6\
    ...\
q\                  # metadata blocks
    1\
    3\
    a\
    ...\
s\                  # session files
    ****.f
x\                  # index blocks
    n23_****.f
    n24_****.f
    r0_7_****.f
    r8_15_****.f
    s0_****.f
    s1_****.f
    ...
    s22_****.f
kopia_****.f
xw****.f
xe23.f
xe24.f
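To make the mapping from the .shards settings to this folder structure concrete, here is a rough Python sketch. It reflects my reading of the layout above and is not kopia's actual code: the function name shard_path, the sample blob IDs, and the exact cut-off rule for short names (length <= maxNonShardedLength) are assumptions.

```python
# Proposed .shards settings from this post.
CONFIG = {
    "default": [1, 2],
    "maxNonShardedLength": 20,
    "overrides": [
        {"prefix": "q", "shards": [1, 1]},
        {"prefix": "s", "shards": [1]},
        {"prefix": "x", "shards": [1]},
    ],
}


def shard_path(blob_id, config=CONFIG):
    """Return the assumed relative path of a blob (hypothetical helper)."""
    # Assumption: short names such as "xe23" stay unsharded in the repo root.
    if len(blob_id) <= config["maxNonShardedLength"]:
        return blob_id + ".f"

    # Pick the override whose prefix matches, else the default shards.
    shards = config["default"]
    for override in config["overrides"]:
        if blob_id.startswith(override["prefix"]):
            shards = override["shards"]
            break

    # Peel off one folder name per shard length; the rest is the file name.
    parts = []
    rest = blob_id
    for length in shards:
        if len(rest) <= length:
            break
        parts.append(rest[:length])
        rest = rest[length:]
    parts.append(rest + ".f")
    return "/".join(parts)  # "/" used here; Windows would show "\"


# Made-up blob IDs, only to show the shape of the result.
print(shard_path("p04" + "0" * 30))
print(shard_path("xn23_" + "b" * 30))
print(shard_path("xe23"))
```

With these settings a p-file lands two levels deep (p, then a two-character folder), an x-file sits directly under x, and short names stay in the root, which matches the tree above.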
The sharding strategy is based on some assumptions:
- The more files and folders a folder contains, the slower it is to read files from it and to create new files in it.
- Listing files is faster when they are in a single folder than when they are spread across many folders.
- P-files and q-files are only listed during maintenance; otherwise they are created or read directly (with the help of the indexes), so these files can be grouped into folders to boost speed.
- X-files are listed on every operation, so they go in a single folder.
- Although s-files are only listed during maintenance, they are created and deleted on every operation and there are very few of them. It is therefore better to place them in a single folder than to create many folders and then empty them; that way the extra folders are never created or listed.
- Maintenance runs regularly, so the numbers of x-files and q-files stay small.
- A ~400GB repository contains about 40000 p-files and 200 q-files, so grouping them into 256 folders of about 150 files and 16 folders of about 13 files is better than using 4096 folders.
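The numbers in the last point can be checked with a quick calculation. This is only a sketch: it assumes the characters after the prefix letter are hex digits and that files are spread evenly across folders, and folders_and_files is a made-up helper name.

```python
import math

HEX = 16  # assumption: each sharded character is a hex digit


def folders_and_files(num_files, hex_chars):
    """Return (folder count, files per folder rounded up) for one file type."""
    folders = HEX ** hex_chars
    return folders, math.ceil(num_files / folders)


print(folders_and_files(40000, 2))  # p-files with the proposed [1, 2]
print(folders_and_files(200, 1))    # q-files with the proposed [1, 1]
print(folders_and_files(40000, 3))  # p-files with kopia's default [1, 3]
```

With [1, 3] the same 40000 p-files are spread so thinly that each of the 4096 folders holds only about ten files, so the cost is dominated by the number of folders rather than by their size.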
Here is the same issue I created on GitHub:
After exploring the forum, I found a hidden command that converts a local repository to the shard layout described above:
kopia blob shards modify --path=/path/to/repo --i-am-sure-kopia-is-not-running --default-shards=1,2 --override=q=1,1 --override=x=1 --override=s=1 --unsharded-length 20
It moves the files to their new locations. This may take a long time, and you cannot use the repo until it is done.
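If you want to try a different layout, the flags can be derived from the .shards JSON instead of typed by hand. The sketch below only builds the argument list and does not run kopia; it uses just the flags from the command above, and modify_command is a hypothetical helper name.

```python
import json

SHARDS_JSON = """
{
  "default": [1, 2],
  "maxNonShardedLength": 20,
  "overrides": [
    { "prefix": "q", "shards": [1, 1] },
    { "prefix": "s", "shards": [1] },
    { "prefix": "x", "shards": [1] }
  ]
}
"""


def modify_command(shards_json, repo_path):
    """Build the argument list for 'kopia blob shards modify' (not executed)."""
    cfg = json.loads(shards_json)
    args = [
        "kopia", "blob", "shards", "modify",
        "--path=" + repo_path,
        "--i-am-sure-kopia-is-not-running",
        "--default-shards=" + ",".join(str(n) for n in cfg["default"]),
    ]
    # One --override flag per prefix, in the order they appear in the JSON.
    for override in cfg["overrides"]:
        lengths = ",".join(str(n) for n in override["shards"])
        args.append("--override=" + override["prefix"] + "=" + lengths)
    args += ["--unsharded-length", str(cfg["maxNonShardedLength"])]
    return args


print(" ".join(modify_command(SHARDS_JSON, "/path/to/repo")))
```

The printed line should match the command above except for the order of the --override flags, which follows the order in the JSON.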