SIMD-0118: Partitioned Epoch Rewards, amend/extend design #118
60b1507 to f027d0c
@t-nelson (can't request as reviewer)
Looks really good! Thanks for writing this up and consolidating everything.
left a couple clarifying suggestions. the changes lgtm otherwise. thanks for committing the copy of the original simd with the proposed modifications, it made review very pleasant!
i did notice some other, non-technical changes that might make the document more clear, but probably couldn't justify their own simd. do we want to entertain those here?
> i did notice some other, non-technical changes that might make the document more clear, but probably couldn't justify their own simd. do we want to entertain those here?
I say yes. Let's get this as complete and useful as possible. I will keep things separated by commit.
i left quite a bit of eye twitch intact for the sake of brevity. these suggestions knock out most of the stumbling i came across that actually brought confusion
Handling the new field is much easier than recomputing. And it is a one-time cost: after it is implemented, you don't need to worry about it in the future. However, adding another code path also introduces a maintenance cost in the future.
the snapshot shortcut is only "easier" in the short term. it is punting the problem down the road to snapshot encoding, (de)ser, storage and distribution. success requires long term thinking, not taking the easy way out.
Regarding the original partitioned reward design, in which the sysvar account carries the pending reward balance: I think Anatoly raised a concern that the total capital did not stay stable during the epoch. Anatoly had an idea of merkling all the rewards and depositing them into a reserve account at the epoch boundary. A stake account, when withdrawing its rewards from the reserve account, would then submit a merkle proof to verify the reward payment before the balance is transferred from the reserve account. Later on, we simplified it a bit and removed the merkle tree, but we still keep the reward balance in the sysvar so that the total capital is stable during the epoch, which addresses the above concern.
Total capital is not stable during an epoch now, as
How much balance does the incinerator account get per epoch? And where does it get the balance?
I don't know how much it receives; the runtime isn't logging that data. I suppose we could write a geyser plugin that would report that (or an indexer might know already). Anyone can transfer lamports to the incinerator to burn them.
i'm not sure how total capitalization or the incinerator are relevant here? we just need to be able to recover remaining partition distributions from minimal state when loading snapshots. if we have total&distributed epoch reward lamports and total&distributed epoch credits, Delegations are in accounts (or snapshotted stakes cache? 🫠). what else do we need to do this practically?
One thing people may start noticing is that the It is more noticeable than a one-time increase at the epoch boundary. Probably, people would want an explanation for that. A slow and gradual increase in the total supply of SOL per block may make people worry about inflation...
The capital change due to the incinerator is very minimal. It doesn't make a material impact on the total SOL, while reward changes are much more noticeable than incinerator burning.
Because that's related to one of the main changes for this SIMD. If I read the SIMD correctly, there are two major changes:
Because that is user-dependent, we absolutely cannot rely on that assumption. However, it sounds like epoch-capitalization stability is not a technical concern in the first place; just a comms issue. Since there will need to be communication about things like stake withdrawals being unavailable during the rewards period anyway, we can also explain the new shape of supply increases.
r+ one nit.
Looks great. I have two nits. (Sorry am on vacation so can't use my work account @ripatel-fd)
Overall, this looks like a very good direction
feeb48e to d71daa0
d71daa0 to a7f0dbf
…rds sysvar, happen before tx processing each block
The distribution of epoch rewards at the start block of an epoch becomes a
significant bottleneck due to the rising number of stake accounts and voting
nodes on the network.

To address this bottleneck, we propose a new approach for distributing the
epoch rewards over multiple blocks.
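As a rough illustration of what "distributing over multiple blocks" means, here is a minimal Rust sketch that splits a list of rewards into per-block partitions. Everything in it is hypothetical: the `Reward` tuple, the `max_per_block` limit, and the round-robin assignment are stand-ins chosen for brevity. The SIMD defines its own assignment rule, which this sketch does not attempt to reproduce.

```rust
/// Hypothetical reward entry: (stake account id, lamports).
type Reward = (u32, u64);

/// Number of partitions needed so that no partition exceeds `max_per_block`
/// entries. Always at least 1, even when there are no rewards.
fn num_partitions(reward_count: usize, max_per_block: usize) -> usize {
    assert!(max_per_block > 0, "max_per_block must be non-zero");
    // Ceiling division, clamped to a minimum of one partition.
    ((reward_count + max_per_block - 1) / max_per_block).max(1)
}

/// Assign rewards to partitions round-robin (an illustrative rule only).
/// Partition `i` would then be credited in the `i`-th block of the
/// distribution phase instead of all rewards landing in a single block.
fn partition_rewards(rewards: &[Reward], max_per_block: usize) -> Vec<Vec<Reward>> {
    let n = num_partitions(rewards.len(), max_per_block);
    let mut partitions = vec![Vec::new(); n];
    for (i, reward) in rewards.iter().enumerate() {
        partitions[i % n].push(*reward);
    }
    partitions
}
```

With five rewards and `max_per_block = 2`, this yields three partitions, so the write-back cost is spread over three blocks while the total lamports paid out stay the same.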
Just curious - have we measured that it is the distribution (write-back) versus the calculation that is the bottleneck?
Yes, definitely, although it doesn't seem like we have that data summarized concisely in any one place. It seems to be mostly in various places in the #proj-epoch-boundary-optimization channel on discord. For instance, here's a comment about the calculation time: https://discord.com/channels/428295358100013066/960593861732884520/1096146390037561414 (64ms for 550K stake accounts)
Whereas distribution is more like 5-10s.
When booting from a snapshot, a node must check the EpochRewards sysvar account
to determine whether the distribution phase is active. If so, the node must
rerun the rewards partitioning using the `EpochRewards::num_partitions` and
`EpochRewards::parent_blockhash` sysvar fields and determining the upcoming
partitions by comparing its current block height to
`EpochRewards::distribution_starting_block_height`. Then the runtime must
recalculate the remaining rewards using the `EpochRewards::total_points` and
`EpochRewards::total_rewards` sysvar fields, as well as the `EpochStakes` in the
snapshot. The recalculated rewards can be confirmed by comparing a sum of the
rewards remaining (those partitions expected to not yet have been distributed)
with the difference between the `EpochRewards::total_rewards` and
`EpochRewards::distributed_rewards` fields. Partitions for blocks prior to the
current block height can be discarded.
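The consistency check described in this hunk can be sketched roughly as follows. This is an assumption-laden illustration, not the runtime's code: the `EpochRewards` struct below is a hypothetical mirror containing only the fields the check needs, partitions are modelled as plain lists of lamport amounts that have already been recomputed, and partition `i` is assumed to be distributed at block height `distribution_starting_block_height + i`.

```rust
/// Hypothetical mirror of the sysvar fields used by the check.
#[derive(Debug, Clone)]
struct EpochRewards {
    distribution_starting_block_height: u64,
    num_partitions: u64,
    total_rewards: u64,
    distributed_rewards: u64,
}

/// Index of the first partition that has not been distributed yet.
/// Assumption: partition `i` belongs to block height `start + i`, and
/// partitions for blocks prior to `block_height` are discarded.
fn first_remaining_partition(sysvar: &EpochRewards, block_height: u64) -> u64 {
    block_height
        .saturating_sub(sysvar.distribution_starting_block_height)
        .min(sysvar.num_partitions)
}

/// Sum the recomputed rewards in the remaining partitions and confirm the
/// sum equals `total_rewards - distributed_rewards`.
fn verify_remaining(
    sysvar: &EpochRewards,
    partitions: &[Vec<u64>],
    block_height: u64,
) -> Result<u64, String> {
    if partitions.len() as u64 != sysvar.num_partitions {
        return Err(format!(
            "expected {} partitions, recomputed {}",
            sysvar.num_partitions,
            partitions.len()
        ));
    }
    let expected = sysvar
        .total_rewards
        .checked_sub(sysvar.distributed_rewards)
        .ok_or_else(|| "distributed_rewards exceeds total_rewards".to_string())?;
    let first = first_remaining_partition(sysvar, block_height) as usize;
    let remaining: u64 = partitions[first..].iter().flatten().sum();
    if remaining == expected {
        Ok(remaining)
    } else {
        Err(format!("remaining {remaining} != expected {expected}"))
    }
}
```

A mismatch here would indicate that the recalculation from snapshot state diverged from what the cluster computed at the epoch boundary, which is the failure mode raised in the comments that follow.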
The rewards calculation assumes that the calculation is done at the epoch boundary (for example, using `VoteState::epoch_credits`). We need to make sure that re-calculating rewards when booting off a snapshot doesn't break the calculation, as this assumption is no longer true. I'm a bit worried that calculating rewards not at an epoch boundary will produce a different result.
I might be wrong but just wanted to flag as something to watch for when implementing this 😄
In particular, I think `solana_stake_program::stake_state::calculate_stake_points_and_credits` might end up being tweaked slightly, as well as anywhere which uses `VoteState::epoch_credits`.
Yes, this is certainly the most sensitive aspect of the implementation. However, are you asking for any SIMD changes here?
No, not in the SIMD, as I don't think this will be fully known until it's implemented. Just wanted to flag 😄
Approved
cdda8cd
Co-authored-by: Trent Nelson <[email protected]>
Looks like consensus has been reached between the Firedancer and Anza teams. Thank you @CriesofCarrots for championing this!
To clarify, with this change, would the various stake pool programs (spl, marinade) need to upgrade to have their epoch update crank instruction fail if
@billythedummy, I'm not personally familiar with the various stake-pool programs or the instruction you reference, but if they have operations that depend on rewards distribution being complete, then the answer is yes.
@CriesofCarrots each stake pool program basically has crank instructions that, on every epoch, read the new balances of the stake accounts it owns, which are supposed to have increased at the epoch boundary due to staking rewards, and update the exchange rate between SOL and the pool's LST. These instructions don't mutate stake accounts in all cases, so i think we would need the
Tagging stake pool program maintainers here for visibility @joncinque @ochaloup
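To make the concern concrete, here is a minimal sketch of a crank that refuses to refresh pool balances while rewards are still being distributed. It is purely illustrative: `Pool`, `CrankError`, `update_pool_balance`, and the `rewards_active` flag are hypothetical names, and the flag stands in for whatever signal the runtime ends up exposing. It does not reflect the actual spl or marinade stake pool code.

```rust
/// Hypothetical error returned when the crank runs too early.
#[derive(Debug, PartialEq)]
enum CrankError {
    RewardsDistributionActive,
}

/// Hypothetical pool state: only the field needed for the example.
#[derive(Debug)]
struct Pool {
    total_lamports: u64,
}

/// Refresh the pool's recorded balance from its stake accounts.
/// Assumption: `rewards_active` is true while partitioned distribution is
/// in progress, so some stake accounts may not have been credited yet.
fn update_pool_balance(
    pool: &mut Pool,
    rewards_active: bool,
    stake_balances: &[u64],
) -> Result<(), CrankError> {
    if rewards_active {
        // Reading balances now could record a partially rewarded total
        // and skew the SOL/LST exchange rate for the rest of the epoch.
        return Err(CrankError::RewardsDistributionActive);
    }
    pool.total_lamports = stake_balances.iter().sum();
    Ok(())
}
```

Failing early leaves the previous balance untouched, so the crank can simply be retried once distribution has finished.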
This SIMD supersedes SIMD-0015 with some new design elements.
I have begun by copying in the original SIMD so that the changes can be seen more easily in subsequent commits.
#116 may be useful as a reference, as it describes the existing Labs implementation.