ID generator/flake #57

Open
parkan opened this issue Nov 1, 2016 · 6 comments

@parkan

parkan commented Nov 1, 2016

If we're accepting "mediachain first" data, we need to be able to hand out IDs. An appropriate technique for this is *flake (after Twitter's snowflake), which combines a timestamp, a node ID, and a per-node sequence number to make roughly k-ordered IDs.

Explainer: http://yellerapp.com/posts/2015-02-09-flake-ids.html
Erlang impl: https://github.com/boundary/flake
Clojure impl: https://github.com/maxcountryman/flake
2 Go impls: https://github.com/davidnarayan/go-flake + https://github.com/casualjim/flakeid (not sure if we should use these or handroll)

We have a perfectly good "node id" in the form of peerId, though it's possible to introduce a collision by running multiple nodes with the same identity -- I think this is a degenerate case that we don't care about.

This is more appropriate than v1 UUIDs (timestamp plus MAC address) or v4 UUIDs (totally random). v3 UUIDs (name-based, MD5) could potentially be used, but I don't think the semantics are quite right.
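
To make the proposal concrete, here is a minimal sketch of a flake-style generator in Go. It assumes the 128-bit layout that Boundary's flake uses (64-bit millisecond timestamp, 48-bit node ID, 16-bit sequence) and, as a further assumption, derives the node ID from the first 6 bytes of SHA-256 over the peerId string. `FlakeID`, `Generator`, `NewGenerator`, and `Next` are hypothetical names, not an existing mediachain API.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
	"math"
	"sync"
	"time"
)

// FlakeID is 16 bytes: 8-byte big-endian millisecond timestamp,
// 6-byte node ID, 2-byte big-endian sequence number.
type FlakeID [16]byte

// String renders the ID as hex; hex strings sort in the same order as the bytes.
func (id FlakeID) String() string { return hex.EncodeToString(id[:]) }

// Generator hands out IDs for one node (hypothetical type).
type Generator struct {
	mu     sync.Mutex
	node   [6]byte
	lastMs uint64
	seq    uint16
	now    func() uint64 // milliseconds since the Unix epoch; swappable for tests
}

// NewGenerator derives the 48-bit node ID from SHA-256(peerID).
// Truncating the hash is an assumption made for this sketch.
func NewGenerator(peerID string) *Generator {
	sum := sha256.Sum256([]byte(peerID))
	g := &Generator{
		now: func() uint64 { return uint64(time.Now().UnixNano() / int64(time.Millisecond)) },
	}
	copy(g.node[:], sum[:6])
	return g
}

// Next returns the next ID, or an error if the clock went backwards
// or the sequence is exhausted within a single millisecond.
func (g *Generator) Next() (FlakeID, error) {
	g.mu.Lock()
	defer g.mu.Unlock()

	ms := g.now()
	switch {
	case ms < g.lastMs:
		return FlakeID{}, errors.New("clock moved backwards")
	case ms == g.lastMs:
		if g.seq == math.MaxUint16 {
			return FlakeID{}, errors.New("sequence exhausted for this millisecond")
		}
		g.seq++
	default:
		g.lastMs, g.seq = ms, 0
	}

	var id FlakeID
	binary.BigEndian.PutUint64(id[0:8], ms)
	copy(id[8:14], g.node[:])
	binary.BigEndian.PutUint16(id[14:16], g.seq)
	return id, nil
}

func main() {
	g := NewGenerator("QmExamplePeerID") // placeholder peer ID
	a, _ := g.Next()
	b, _ := g.Next()
	fmt.Println(a)
	fmt.Println(b)
}
```

Because the timestamp leads, IDs from one node sort by creation time, and IDs from different nodes are ordered up to clock skew. Two nodes sharing one identity would produce the same node bytes, which is the degenerate case mentioned above.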

@vyzo

vyzo commented Nov 2, 2016

That's what we use for statement id generation: publisher-id:timestamp:counter
The counter is meant to disambiguate statements published in the very same second; in our impl it increments with every statement and goes back to 0 on node restarts.
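
For readers following along, the scheme described here could look roughly like the sketch below. The `publisher-id:timestamp:counter` shape and the reset-on-restart counter come from the comment above; the type and method names, the decimal formatting, and the use of Unix seconds are assumptions for illustration, not the actual mediachain code.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// StatementIDs issues string IDs of the form publisher-id:timestamp:counter.
// The counter lives only in memory, so it starts again at 0 after a restart.
type StatementIDs struct {
	counter   uint64 // kept first so atomic access is 64-bit aligned on 32-bit platforms
	publisher string
}

// Next formats an ID for the given Unix timestamp (seconds) and
// advances the counter atomically.
func (s *StatementIDs) Next(unixSeconds int64) string {
	n := atomic.AddUint64(&s.counter, 1) - 1
	return fmt.Sprintf("%s:%d:%d", s.publisher, unixSeconds, n)
}

func main() {
	ids := &StatementIDs{publisher: "QmExamplePeerID"} // placeholder publisher ID
	fmt.Println(ids.Next(time.Now().Unix()))
	fmt.Println(ids.Next(time.Now().Unix()))
}
```

Since the counter never resets within a process, two statements published in the same second still get distinct IDs.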

@denisnazarov

@parkan do we need IDs on the actual "objects" that users are adding, or can it just fall back to the statement ID if they omit an --idSelector?

@parkan

parkan commented Nov 2, 2016

@denisnazarov the metadata object should have an identifier, yes (think about the multi-statement case)

@parkan

parkan commented Nov 2, 2016

@vyzo Am I correct in understanding that these are strings? We probably want to pack them into ints.

@vyzo

vyzo commented Nov 3, 2016

yes, they are strings.
You can pack them to ints by hashing and keeping a bunch of bits.
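
One possible reading of "hashing and keeping a bunch of bits", sketched in Go: hash the string ID and keep the first 64 bits as an integer. The choice of SHA-256, the 64-bit width, and the name `PackID` are assumptions for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// PackID hashes a string ID with SHA-256 and keeps the first 8 bytes
// of the digest as a big-endian uint64.
func PackID(id string) uint64 {
	sum := sha256.Sum256([]byte(id))
	return binary.BigEndian.Uint64(sum[:8])
}

func main() {
	// The input is a made-up statement ID in the publisher:timestamp:counter shape.
	fmt.Println(PackID("QmExamplePeerID:1478000000:0"))
}
```

Note that the packed value no longer sorts by time, and truncating to 64 bits makes collisions possible (unlikely until the number of IDs approaches the order of 2^32), so this trades the ordering property for a fixed-width integer.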

@parkan

parkan commented Nov 3, 2016

@vyzo yeah, I would literally use the snowflake approach (which is more or less that)
