borg2: which commands shall do a complete rebuild of the ChunkIndex? #8476

ThomasWaldmann · 2024-10-15T21:04:48Z

For high-latency stores (sftp: and much of what's available via rclone:), a full index rebuild needs to list all objects in the store, and there can be a lot of objects.

For sftp, this also requires listing all 65536+256 nesting directories, and sftp is relatively slow.
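
To get a feeling for the cost, here is a rough back-of-the-envelope estimate. It assumes that 65536+256 means 256 first-level plus 256*256 second-level directories and that each directory listing costs one round trip. The latency value is made up for illustration, not measured.

```python
# Rough cost estimate for listing every nesting directory over a slow link.
# Assumption: 256 first-level directories plus 256 * 256 second-level ones.
TOP_LEVEL_DIRS = 256
SECOND_LEVEL_DIRS = 256 * 256  # 65536

# Hypothetical per-listing round-trip time in seconds (not a measurement).
ASSUMED_LATENCY_S = 0.05


def estimate_listing_seconds(latency_s: float = ASSUMED_LATENCY_S) -> float:
    """Time spent on round trips alone, one listing per directory."""
    return (TOP_LEVEL_DIRS + SECOND_LEVEL_DIRS) * latency_s


if __name__ == "__main__":
    total_dirs = TOP_LEVEL_DIRS + SECOND_LEVEL_DIRS
    minutes = estimate_listing_seconds() / 60
    print(f"{total_dirs} directory listings, roughly {minutes:.0f} minutes")
```

This ignores the time needed to transfer the actual listing contents, so a real rebuild would likely take longer.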

IIRC, currently these do a complete rebuild:

compact (because it removes unused/unreferenced objects)
check (repository part, before doing anything else - just to make sure)
first operation after a borg check --repair (because repair might have removed objects and thus invalidated the chunks cache)

Most commands will try these in order, falling back to the next one if a step is not possible:

use their locally cached chunks index (if its hash is still the same as what's in repo/cache/chunks_hash)
fetch a fresh index from repo/cache/chunks
rebuild the chunks index the slow way, by listing all objects (then write it to repo/cache/chunks)
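
The lookup order above could be sketched roughly like this. This is not borg's actual code: the index type, the hashing and all function names are hypothetical, and the callables stand in for whatever the store access looks like.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple

# Simplified, hypothetical index: chunk id -> stored size.
ChunkIndex = Dict[bytes, int]


def index_hash(index: ChunkIndex) -> str:
    """Hash over a canonical serialization of the index (illustrative only)."""
    h = hashlib.sha256()
    for chunk_id in sorted(index):
        h.update(chunk_id)
        h.update(index[chunk_id].to_bytes(8, "big"))
    return h.hexdigest()


def load_chunk_index(
    local_index: Optional[ChunkIndex],
    repo_chunks_hash: Optional[str],
    fetch_repo_index: Callable[[], Optional[ChunkIndex]],
    list_all_objects: Callable[[], ChunkIndex],
    store_repo_index: Callable[[ChunkIndex], None],
) -> Tuple[ChunkIndex, str]:
    """Return the index plus a label saying which route produced it."""
    # 1. Local cache is usable if its hash matches repo/cache/chunks_hash.
    if local_index is not None and repo_chunks_hash is not None:
        if index_hash(local_index) == repo_chunks_hash:
            return local_index, "local"
    # 2. Otherwise fetch the cached index from repo/cache/chunks.
    fetched = fetch_repo_index()
    if fetched is not None:
        return fetched, "fetched"
    # 3. Slow route: list all objects, then write the result back.
    rebuilt = list_all_objects()
    store_repo_index(rebuilt)
    return rebuilt, "rebuilt"
```

Only the third route touches every object in the store, which is why it hurts so much on high-latency backends.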

So the question is:

When shall we rely on an existing cached ChunkIndex (repo/cache/chunks) being in a good state, and when shall we instead take the slow but safe route and build a fresh one?

For borg create:

It's not a big problem if the index does not have all objects that exist in the repo. If that happens, borg create will just store something to the repo that's already there. After it has finished creating the archive, it will store an updated index to repo/cache/chunks.
It would be a severe problem, though, if the index falsely said "we have that object" and borg did not store it to the repo. The archive would then reference a non-existent object.
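
The asymmetry between the two cases can be shown with a toy model. This is a simplified sketch, not how borg create is implemented: the repo is a plain dict, the index is a set of chunk ids, and both function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def create_archive(
    chunks: Dict[bytes, bytes], index: Set[bytes], repo: Dict[bytes, bytes]
) -> Tuple[List[bytes], int]:
    """Store chunks the index does not know; return (references, redundant writes)."""
    refs: List[bytes] = []
    redundant_writes = 0
    for chunk_id, data in chunks.items():
        if chunk_id not in index:
            if chunk_id in repo:
                # Index was incomplete: wasted upload, but nothing breaks.
                redundant_writes += 1
            repo[chunk_id] = data
            index.add(chunk_id)
        # If the index claims the chunk exists, it is only referenced.
        refs.append(chunk_id)
    return refs, redundant_writes


def missing_references(refs: List[bytes], repo: Dict[bytes, bytes]) -> List[bytes]:
    """Chunk ids the archive references that the repo does not contain."""
    return [chunk_id for chunk_id in refs if chunk_id not in repo]


if __name__ == "__main__":
    # Case 1: index misses an object that exists in the repo (harmless).
    repo = {b"a": b"data"}
    refs, redundant = create_archive({b"a": b"data"}, set(), repo)
    print(redundant, missing_references(refs, repo))

    # Case 2: index falsely claims an object exists (archive is broken).
    repo = {}
    refs, redundant = create_archive({b"a": b"data"}, {b"a"}, repo)
    print(redundant, missing_references(refs, repo))
```

In the first case the only cost is a redundant write; in the second the archive ends up with a dangling reference.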


ThomasWaldmann · 2024-11-16T17:07:27Z

Note:

borg compact emits some stats, and for that it needs the stored sizes of the repository objects. We usually do not have these in the chunks index, so listing all objects and their sizes is required for that.

ThomasWaldmann · 2024-11-19T01:54:48Z

Hmm, borg check could just try to work with the existing chunks index and not rebuild a new one.

If borg check then runs into problems, borg check --repair would rebuild the index.

ThomasWaldmann · 2024-11-23T18:45:42Z

#8561 - borg compact only needs to rebuild the chunks index IF --stats is given (if repo space usage stats before/after compaction are a must-have).

Without --stats it might be much faster now as it will re-use an existing cached chunks index from the repo and thus does not have to slowly list all repo objects.
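
The decision described here might look roughly like the following. This is only a sketch with hypothetical names, not the code from #8561. It additionally assumes that a missing cached index also forces the slow route.

```python
from typing import Callable, Dict, Optional, Tuple

# Simplified, hypothetical index: chunk id -> stored size.
ChunkIndex = Dict[bytes, int]


def index_for_compact(
    want_stats: bool,
    cached_index: Optional[ChunkIndex],
    list_all_objects: Callable[[], ChunkIndex],
) -> Tuple[ChunkIndex, bool]:
    """Return (index, rebuilt); rebuilt is True if all objects were listed."""
    # Stats need the stored sizes, which the cached index usually lacks,
    # so --stats forces the slow listing of all repo objects.
    if want_stats or cached_index is None:
        return list_all_objects(), True
    # Without --stats, re-use the cached index and skip the listing.
    return cached_index, False
```

With want_stats=False and a cached index available, list_all_objects is never called, which is where the speedup comes from.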

ThomasWaldmann added this to the 2.0.0b13 milestone Oct 15, 2024

ThomasWaldmann changed the title ~~borg2: which commands shall rebuild a completely new ChunkIndex?~~ borg2: which commands shall do a complete rebuild of the ChunkIndex? Oct 15, 2024

ThomasWaldmann added cmd: check cmd: compact labels Oct 17, 2024

ThomasWaldmann modified the milestones: 2.0.0b13, 2.0.0b14 Oct 26, 2024

ThomasWaldmann modified the milestones: 2.0.0b14, 2.0.0b15 Nov 16, 2024
