-
Notifications
You must be signed in to change notification settings - Fork 170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rough draft of SectorSet #333
Conversation
@@ -221,7 +221,107 @@ Bitfields are a set of bits encoded using a custom run length encoding: rle+. r | |||
|
|||
#### `SectorSet` | |||
|
|||
TODO | |||
The `SectorSet` is an integer set implemented with a simple array mapped tree. | |||
Integer indexes range from 0 to infinity (TODO practical bounds / bounds implied by encoding?). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 64 bit int limitation OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth setting some maximum based on the maximum number of sectors we expect to impose on miners
the `Pointer`. There is also a `RootNode`. The `Node` is as follows: | ||
```go | ||
type Node struct { | ||
Pointers [S]*Pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a new node always allocates S
empty Pointers
``` | ||
The `Pointer` is serialized as a cbor object (major type 5). The `Value` | ||
is serialized { in some way } with `v` as its object key. The `SectorMeta` | ||
type carries all needed sector metadata (TODO, define a type for commD commR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this (commR, commD)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @laser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SectorMeta
will need:
- CommR so that we can generate PoSts (the
generate_post
routines takes as input all CommRs in the miner's proving set). - CommD for piece inclusion proof verification.
of nodes until reaching the leaf node and set the node's pointer Value at | ||
`index % childRange` to the insert value. | ||
|
||
##### Delete Value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lack of bitfields and constant sized arrays might make this prohibitively expensive as we have search the whole array for emptyness (we could include a simple count to make this better). Also note that as specced out here users can arrive at two identical sets with different representations. Id appeciate comments on the importance of addressing these issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iterating over some small constant number of items shouldnt be prohibitively expensive. Since we have to parse bytes into a datastructure anyways, we can keep track of the item count in memory (but not for the serialized version)
`index % childRange` to the insert value. | ||
|
||
##### Delete Value | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not covering range deletes here. Let me know if that's a requirement for v1.
the root that the `SectorSet` has capacity for its existing indices and `b`. | ||
Pointers are then updated in these new nodes so that there is a path from | ||
the new node with the highest height to the existing node pointed to by root. | ||
Finally the root node updates to point to the node with highest height. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that as the SectorSet grows the first insert into the new region of the tree will start to require more and more overhead. Making this scale better probably requires a redesign of the datastructure (we could do something closer to the HAMT with more complexity to improve this). Is the current design acceptable or should we try to address this now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logarithmic scaling is about optimal here.
NodePointer Cid | ||
} | ||
``` | ||
The `RootNode` is serialized as a cbor object (major type 5) where `NodePointer` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont' think the rootnode needs to be serialized ever.
is serialized { in some way } with `v` as its object key. The `SectorMeta` | ||
type carries all needed sector metadata (TODO, define a type for commD commR | ||
pair and serialization). A `Pointer` is conceptually a union type, it can | ||
carry a `Link` or a `Value` but never both in a well formed `SectorSet` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whats the object key used for the Link?
Pseudocode: | ||
```go | ||
func (rn *RootNode) Lookup(index int) (v Anything, error) { | ||
return recLookup(rn.MaxDepth, index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why doesnt this mention the expansion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, this is just for lookup, i see
@@ -221,7 +221,107 @@ Bitfields are a set of bits encoded using a custom run length encoding: rle+. r | |||
|
|||
#### `SectorSet` | |||
|
|||
TODO | |||
The `SectorSet` is an integer set implemented with a simple array mapped tree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets call the data structure AMT
and say that a
type SectorSet {UInt:SectorMeta}<AMT>
Moved content to ipld/specs#137 and the definition of the type into #367 |
The big overhaul in Sept/Oct means none of these changed files exist anymore. But, [at least some of] this prose is still valuable in these new locations: Rendering: https://filecoin-project.github.io/specs/#systems__filecoin_mining__sector__sectorset Proposal: let's create a new PR with the prose that's still true & useful from this, and close this one. @ZenGround0 do you prefer to do it yourself? @sternhenri/I can also create a first pass for you to review. Also, #367 contains followup work. |
I think this became AMT regardless this is very out of date |
Plowing ahead with these assumptions I've drafted a rough draft of the SectorSet spec. Please take a look.
The prose is not great or final-spec material but I'm hoping it gets across the basics of what I'm planning to implement based on #116 so that we can sync state and catch anything I'm missing or address inefficiencies.