-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: compute gas outputs #67
Conversation
28f683c
to
c6b5923
Compare
} | ||
|
||
// LeaseGasOutputsMessages leases a set of messages that have receipts for gas output processing. minHeight and maxHeight define an inclusive range of heights to process. | ||
func (d *Database) LeaseGasOutputsMessages(ctx context.Context, claimUntil time.Time, batchSize int, minHeight, maxHeight int64) (derived.GasOutputsList, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function contains a query that joins messages to receipts to only return messages that have corresponding receipts in the database. It also joins to block_headers to get the parent_base_fee. Is this the correct value to use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function contains a query that joins messages to receipts to only return messages that have corresponding receipts in the database
It should never be the case that a message without a receipt exists in the tables produced by visor since it operates on valid blocks, I believe this join on block_headers is unnecessary if its only for ensuring messages are returned that have a receipt. (spec-ref)
It also joins to block_headers to get the parent_base_fee. Is this the correct value to use?
"The ParentBaseFee is used when calculating how much a sender burns when executing a message. Burning simply refers to sending attoFIL to a dedicated, unreachable account. A message causes ParentBaseFee * GasUsed attoFIL to be burnt" - (lotus-ref).
The ParentBaseFee
value being extracted looks right to me, here is where lotus gets it from: https://github.com/filecoin-project/lotus/blob/2bf42d5318d089741719a213c32e8d3fd03db139/chain/stmgr/call.go#L135
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the join between messages and receipts is necessary so as to ensure we only process messages that have received a receipt, which happens at some time in the future. The indexer will add messages and then some time later will find the receipts for those messages so there will be a period of time where we have messages with no receipts in the database.
Plus we need the gas_used value that is only in the receipt. The join to block headers it to retrieve the parent_base_fee. Both of these are needed for the call to ComputeGasOutputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@frrist Thanks for the response. The join on block_headers
is for returning parent_base_fee
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, this query looks good to me.
"golang.org/x/xerrors" | ||
) | ||
|
||
type GasOutputs struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what is persisted in the new derived_gas_outputs
table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for duplicating the message and receipt data in this table? Is it to avoid complicating our ingestion strategy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to be query optimized (indexed, normalized) format for supporting analysis and monitoring. This (query-optimized) schema may eventually live in another place so we shouldn't assume the presence of our ingestion schema. It also cleanly isolates our ingestion concerns from our analysis concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Columns are as requested in #63
c6b5923
to
ae4e8ae
Compare
tasks/message/gasoutputs.go
Outdated
return xerrors.Errorf("parse gas premium: %w", err) | ||
} | ||
|
||
outputs := vm.ComputeGasOutputs(item.GasUsed, item.GasLimit, baseFee, feeCap, gasPremium) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we express this within the Lens abstraction? This will keep our lotus deps within that package and make it easy to diversify our sources later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few changes and concerns to address. 🤝
}, | ||
&cli.IntFlag{ | ||
Name: "gasoutputs-batch", | ||
Aliases: []string{"gob"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for i := 0; i < cctx.Int("gasoutputs-workers"); i++ { | ||
scheduler.Add(schedule.TaskConfig{ | ||
Name: fmt.Sprintf("GasOutputsProcessor%03d", i), | ||
Task: message.NewGasOutputsProcessor(rctx.db, rctx.api, cctx.Duration("gasoutputs-lease"), cctx.Int("gasoutputs-batch"), heightFrom, heightTo), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for applying --to
and --from
here.
model/derived/gasoutputs.go
Outdated
ExitCode int64 `pg:",use_zero"` | ||
GasUsed int64 `pg:",use_zero"` | ||
Return []byte | ||
ParentBaseFee string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need Params
, Idx
, or Return
. I don't really mind them being present, but they do take up space. I think I mind them. Let's remove. I'd be willing to consider including them if they were parsable, but we can follow up with that later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
"golang.org/x/xerrors" | ||
) | ||
|
||
type GasOutputs struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to be query optimized (indexed, normalized) format for supporting analysis and monitoring. This (query-optimized) schema may eventually live in another place so we shouldn't assume the presence of our ingestion schema. It also cleanly isolates our ingestion concerns from our analysis concerns.
"refund" text, | ||
"gas_refund" bigint, | ||
"gas_burned" bigint, | ||
PRIMARY KEY ("cid") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indexes to include here:
from
(using Hash)to
(using Hash)method
(using BTree)exit_code
(using BTree)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
// LeaseActors leases a set of tipsets containing messages to process. minHeight and maxHeight define an inclusive range of heights to process. | ||
func (d *Database) LeaseTipSetMessages(ctx context.Context, claimUntil time.Time, batchSize int, minHeight, maxHeight int64) (visor.ProcessingMessageList, error) { | ||
var messages visor.ProcessingMessageList | ||
// LeaseTipSetMessages leases a set of tipsets containing messages to process. minHeight and maxHeight define an inclusive range of heights to process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch.
} | ||
|
||
// LeaseGasOutputsMessages leases a set of messages that have receipts for gas output processing. minHeight and maxHeight define an inclusive range of heights to process. | ||
func (d *Database) LeaseGasOutputsMessages(ctx context.Context, claimUntil time.Time, batchSize int, minHeight, maxHeight int64) (derived.GasOutputsList, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@frrist Thanks for the response. The join on block_headers
is for returning parent_base_fee
.
|
||
GasRefund int64 | ||
GasBurned int64 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all of these columns should be notnull
(even the cases which use_zero
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
gas_outputs_completed_at = ?, | ||
gas_outputs_errors_detected = ? | ||
WHERE cid = ? | ||
`, completedAt, useNullIfEmpty(errorsDetected), cid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
useNullIfEmpty
Love it.
tasks/message/gasoutputs.go
Outdated
return xerrors.Errorf("parse gas premium: %w", err) | ||
} | ||
|
||
outputs := vm.ComputeGasOutputs(item.GasUsed, item.GasLimit, baseFee, feeCap, gasPremium) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we express this within the Lens abstraction? This will keep our lotus deps within that package and make it easy to diversify our sources later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iand In the interest of allowing this to move along before I'm up, I'll put a ✅ now. I know you'll see the feedback is applied appropriately. 🤝
FYI, restarts are very aggressive.
Additionally, the higher worker counts on a pretty beefy box end up resulting in a lot of this, which is also exacerbated by the aggressive restart.
|
ae4e8ae
to
387bb05
Compare
This change adds a gas outputs processing task. The results of the task are written to a new table derived_gas_outputs. The task fetches batches of messages that have receipts and calls the lotus vm ComputeGasOutputs method, persisting the results. This change also performs a significant refactor of the visor processing tables. Previously the chain indexers would write duplicate rows to visor_processing_statechanges and visor_processing_messages. Each of those tables contained claimed_until, completed_at and errors_detected columns. They were used by the ActorStateChange and Message tasks. However if we were to add another task that processed tipsets we would need to add another processing table and then reindex the entire chain to populate it. To avoid this, the visor_processing_statechanges and visor_processing_messages tables have been combined into a single visor_processing_tipsets table. Each task now has a set of three columns in this table to track its work: statechange_claimed_until/statechange_completed_at/ statechange_errors_detected and message_claimed_until/message_completed_at/message_errors_detected. Adding a new tipset-oriented task is then a matter of adding three columns to this table. The visor_processing_messages table has been repurposed to track work that needs to be performed on each message seen. The GasOutputs task is the first task to use this table. It uses the gas_outputs_claimed_until, gas_outputs_completed_at and gas_outputs_errors_detected columns. Additional message-oriented tasks can be supported by adding new columns to this table, avoiding the need to read every message in the chain again.
387bb05
to
8ff8682
Compare
Moved ComputeGasOutputs into the node abstraction as suggested by @placer14 but github won't let me reply to that comment |
This change adds a gas outputs processing task. The results of the task are written to
a new table derived_gas_outputs. The task fetches batches of messages that have receipts
and calls the lotus vm ComputeGasOutputs method, persisting the results.
This change also performs a significant refactor of the visor processing tables. Previously
the chain indexers would write duplicate rows to visor_processing_statechanges and
visor_processing_messages. Each of those tables contained claimed_until, completed_at and
errors_detected columns. They were used by the ActorStateChange and Message tasks. However
if we were to add another task that processed tipsets we would need to add another processing
table and then reindex the entire chain to populate it.
To avoid this, the visor_processing_statechanges and visor_processing_messages tables have
been combined into a single visor_processing_tipsets table. Each task now has a set of three
columns in this table to track its work: statechange_claimed_until/statechange_completed_at/
statechange_errors_detected and message_claimed_until/message_completed_at/message_errors_detected.
Adding a new tipset-oriented task is then a matter of adding three columns to this table.
The visor_processing_messages table has been repurposed to track work that needs to be performed
on each message seen. The GasOutputs task is the first task to use this table. It uses the
gas_outputs_claimed_until, gas_outputs_completed_at and gas_outputs_errors_detected columns.
Additional message-oriented tasks can be supported by adding new columns to this table, avoiding
the need to read every message in the chain again.
fixes #63