Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The great function cleanup/rework #380

Closed
3 of 17 tasks
JustinDrake opened this issue Jan 1, 2019 · 5 comments
Closed
3 of 17 tasks

The great function cleanup/rework #380

JustinDrake opened this issue Jan 1, 2019 · 5 comments
Labels
general:enhancement New feature or request

Comments

@JustinDrake
Copy link
Contributor

JustinDrake commented Jan 1, 2019

Global organisation

  • 1. Organise functions into three categories:
    • Math and crypto: hash, hash_tree_root, repeat_hash, integer_squareroot, shuffle, split, merkle_root, verify_merkle_branch, etc.
    • Read state: get_active_validator_indices, get_shuffling, get_shard_committees_at_slot, get_block_root, get_beacon_proposer_index, get_attestation_participants, get_effective_balance, etc.
    • Write state: process_deposit, process_ejections, process_penalties_and_exits, activate_validator, initiate_validator_exit, exit_validator, etc.
  • 2. Segregate functions at the top of the spec, after the constants and data structures. The remaining high-level "glue logic" will be easier to follow, as well as the details inside individual functions.
  • 3. All function definitions listed in the table of contents for easy access.
  • 4. Functions that read from and/or write to state have state: BeaconState as the first argument.
  • 5. Functions read from state where possible to minimise the number of inputs. For example get_active_validator_indices is currently given validators: [ValidatorRecord], slot: int when it suffices to pass state: BeaconState.
  • 6. Replace pseudocode bulletpoints (e.g. in "Per-block processing") with functions. Both cleaner and more precise. For example add verify_proposer_signature, process_randao, ...
  • 7. Pull out inlined function definitions (e.g. get_ancestor, repeat_hash).
  • 8. Avoid huge lists of arguments. For example, process_deposit has 8 inputs (state, pubkey, amount, proof_of_possession, withdrawal_credentials, randao_commitment, poc_commitment) where the last 7 inputs can be replaced with a single DepositData object.

Naming

  • 1. Use the get_ prefix exclusively for functions that read state. (E.g. rename get_initial_beacon_state, get_domain, get_fork_version to something else.)
  • 2. Be consistent with verbs (e.g. validate_proof_of_possession, verify_slashable_vote_data should both use "validate" or "verify").
  • 3. Rename prepare_validator_for_withdrawal to initiate_validator_withdrawal for consistency with initiate_validator_exit.

Local cleanups

  • 1. Typehints for all
  • 2. Docstring for all
  • 3. Whitespace linter (e.g. spacing between operators, no trailing whitespace)
  • 4. Format linter (e.g. typehint formatting, break long lines)
  • 5. No hardcoded constants
  • 6. Judicious comments

@djrtwo, @hwwhww, @vbuterin Open to add/remove cleanup items from the above. It may make sense for me and/or @djrtwo to do the work when there are no open PRs to avoid merge conflict hell.

@JustinDrake JustinDrake added the general:enhancement New feature or request label Jan 1, 2019
@JustinDrake JustinDrake added this to the January Release milestone Jan 1, 2019
@hwwhww
Copy link
Contributor

hwwhww commented Jan 1, 2019

Naming 1. Use the get_ prefix exclusively for functions that read state. (E.g. rename get_initial_beacon_state, get_domain, get_fork_version to something else.)

IMO the get_ prefix implies that “this function won’t mutate anything”. For the functions that are used for generating a new object, it can be named with a prefix create_ or generate_.

It may make sense for me and/or @djrtwo to do the work when there are no open PRs to avoid merge conflict hell.

I’d support that we make it into several small PRs instead of a huge one. 😃

get_ancestor

It won’t be 100% complete since the object store is only abstractly described as the data storage interface that design by the clients, which I think it’s fine, but we can add more notes on that.

@djrtwo
Copy link
Contributor

djrtwo commented Jan 1, 2019

Very in favor of this in general :)

I'm a little torn on removing the explicit params from the get_ functions and changing just to passing in state. In practice, it is really nice when these pure functions specify the explicit subset of state required so that they can really easily be tested in isolation. That said, this is a spec so we need to find the balance. It might make sense to just pass in state but be more verbose in the header comment. Thinking.

I'd like to take a stab at [1] maybe right after we merge in #374 (reviewing now!)

@arnetheduck
Copy link
Contributor

What's helpful in general is minimizing order-dependence of the various parts of the spec - a concrete example of this would be what #348 points out - there's a dependency there on state.slot that complicates processing, testing, etc. This is also the main advantage of passing in all relevant state explicitly - it declares these dependencies and makes them visible - but as long as the property itself is maintained, the difference is kind of cosmetic.

As far as implementation goes, the direction we're heading in is to do as much validation work as possible up front ("validate that there are at most X attestations") before making any state changes - this to minimize the cost of discarding faulty blocks (including costly state rewinds) - that's made harder if validation depends on partial state updates, because we then have to rewrite the validation code with as-if logic that pulls us further from the spec text.

@vbuterin
Copy link
Contributor

vbuterin commented Jan 1, 2019

Generally looks good to me!

  1. Use the get_ prefix exclusively for functions that read state. (E.g. rename get_initial_beacon_state, get_domain, get_fork_version to something else.)

I think get for pure functions that take things other than the state as input is fine too; the important thing is not using get for impure functions.

  1. Replace pseudocode bulletpoints (e.g. in "Per-block processing") with functions. Both cleaner and more precise. For example add verify_proposer_signature, process_randao, ...

I think there's a balance here. The absolute extreme is that the spec is literally just a highly commented python implementation. It's probably worth thinking on a case-by-case basis, and also where we remove English text outside the python, add it back inside the python via comments.

Another thing is that whereas some functions as "utility" functions that are used throughout the spec, others are highly specialized to a specific task. Those functions should probably NOT go at the top of the spec; they should go inline in whatever the relevant section is. We shouldn't be requiring readers to constantly jump back and forth between different parts of the page as much as possible.

@JustinDrake
Copy link
Contributor Author

Closing for now as the focus is on content (e.g. fixing bugs) rather than presentation. A presentational overhaul will come in the "Transparent Paper" after the spec is finalised.

Anyone keen to do the cleanups feel free to reopen :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
general:enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants