feat: implement vm message extraction #1027
Codecov Report
@@          Coverage Diff          @@
##          master   #1027   +/-   ##
=====================================
  Coverage   35.6%   35.6%
=====================================
  Files         44      44
  Lines       2879    2881    +2
=====================================
+ Hits        1025    1027    +2
  Misses      1750    1750
  Partials     104     104
schemas/v1/8_vm_messages.go
Outdated
ALTER TABLE ONLY {{ .SchemaName | default "public"}}.vm_messages ADD CONSTRAINT vm_messages_pkey PRIMARY KEY (height, state_root, cid, source);
CREATE INDEX vm_messages_height_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BRIN (height);
CREATE INDEX vm_messages_cid_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH (cid);
CREATE INDEX vm_messages_source_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH (source);
CREATE INDEX vm_messages_from_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH ("from");
CREATE INDEX vm_messages_to_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH ("to");
CREATE INDEX vm_messages_actor_code_method_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BTREE (actor_code, method);
Sticky approve. Nice and clean! 🤝
@@ -178,8 +182,9 @@ var TableFieldComments = map[string]map[string]string{
"DealID": "Identifier for the deal.",
"EndEpoch": "The epoch at which this deal with end.",
"Height": "Epoch at which this deal proposal was added or changed.",
"IsString": "Related to FIP: https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0027.md",
I don't want to read the FIP to get the gist.
- "IsString": "Related to FIP: https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0027.md",
+ "IsString": "Indicates if the value of Label is a string. Assume binary blob otherwise. Related to FIP: https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0027.md",
@@ -197,6 +202,21 @@ var TableFieldComments = map[string]map[string]string{
ParsedMessage: {},
InternalMessage: {},
InternalParsedMessage: {},
VmMessage: {
"ActorCode": "ActorCode of To (receiver)",
- "ActorCode": "ActorCode of To (receiver)",
+ "ActorCode": "ActorCode of To (receiver).",
// Cid of the message.
Cid string `pg:",pk,notnull"`
// On-chain message triggering the message.
Source string `pg:",pk,notnull"`
I don't think we need `source` to be included in the primary key. `cid` should have enough uniqueness within a height/stateroot.
- Source string `pg:",pk,notnull"`
+ Source string `pg:",notnull"`
Source must remain a primary key as VM message CIDs can conflict at the same height/stateroot.
Isn't `cid` the CID of the intermediate (internal) message? In other words, one `source` has many `cid`s? If so, then the index should go on the column with the higher cardinality (between `cid` and `source`), which I expect to be the intermediate message CIDs, and I understand that to be the `cid` column.
> Isn't `cid` the CID of the intermediate (internal) message? In other words, one `source` has many `cid`s?

Yes, correct.
Are you proposing we drop `source` from the primary key?
Yes. Just remove it from the PK to reduce the size of that index. We can use the standalone hash index for `source` if we ever need to condition on that column. It should reduce the size of the PK index without any performance cost.
Okay. After some discussion with @frrist, it seems that the `cid` is not unique, so `source` is required in the PK after all. 😢
I would probably recommend removing the hash-based `source` index. I think both indices are redundant, and if the PK index isn't getting a trim, let's drop the other.
+1 on dropping the `source` hash index, especially if it's part of the PK.
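To make the conclusion of this thread concrete, here is a minimal Go sketch of why `source` has to stay in the primary key. It is not code from the PR: the `row` and `pkWithoutSource` types, the `conflicts` helper, and all values are hypothetical placeholders. It simply counts how many rows would be rejected as duplicates under each candidate key when two on-chain messages produce an internal message with the same CID at the same height and state root.

```go
package main

import "fmt"

// row holds only the columns relevant to the primary-key question.
// The field names follow the PR's model; the values used below are placeholders.
type row struct {
	Height    int64
	StateRoot string
	Cid       string
	Source    string
}

// pkWithoutSource is the shorter candidate key proposed in review.
type pkWithoutSource struct {
	Height    int64
	StateRoot string
	Cid       string
}

// conflicts counts rows that would collide with an earlier row under
// (height, state_root, cid) and under (height, state_root, cid, source).
func conflicts(rows []row) (withoutSource, withSource int) {
	seenShort := map[pkWithoutSource]bool{}
	seenFull := map[row]bool{}
	for _, r := range rows {
		short := pkWithoutSource{r.Height, r.StateRoot, r.Cid}
		if seenShort[short] {
			withoutSource++
		}
		seenShort[short] = true
		if seenFull[r] {
			withSource++
		}
		seenFull[r] = true
	}
	return withoutSource, withSource
}

// sampleRows models two different sources yielding the same internal CID.
func sampleRows() []row {
	return []row{
		{Height: 100, StateRoot: "root-a", Cid: "cid-x", Source: "source-1"},
		{Height: 100, StateRoot: "root-a", Cid: "cid-x", Source: "source-2"},
		{Height: 100, StateRoot: "root-a", Cid: "cid-y", Source: "source-1"},
	}
}

func main() {
	short, full := conflicts(sampleRows())
	fmt.Printf("conflicts without source: %d, with source: %d\n", short, full)
}
```

With the sample rows, the three-column key rejects one row while the four-column key accepts all three, which matches the outcome of the discussion above.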
model/messages/vm.go
Outdated
// Value attoFIL contained in message.
Value string `pg:"type:numeric,notnull"`
// Method called on To (receiver)
Method uint64 `pg:",use_zero"`
Not null? (I know this is inconsequential given we write our own migrations, but it's good context that doesn't hurt to include, and it keeps consistency with other fields' usage of the same tag.)
- Method uint64 `pg:",use_zero"`
+ Method uint64 `pg:",notnull,use_zero"`
model/messages/vm.go
Outdated
// ActorCode of To (receiver)
ActorCode string `pg:",notnull"`
// ExitCode of message execution.
ExitCode int64 `pg:",use_zero"`
- ExitCode int64 `pg:",use_zero"`
+ ExitCode int64 `pg:",notnull,use_zero"`
model/messages/vm.go
Outdated
// ExitCode of message execution.
ExitCode int64 `pg:",use_zero"`
// GasUsed by message.
GasUsed int64 `pg:",use_zero"`
- GasUsed int64 `pg:",use_zero"`
+ GasUsed int64 `pg:",notnull,use_zero"`
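For context on the three tag suggestions above: as far as I know, go-pg marshals Go zero values (such as `0`) as SQL NULL by default, and `use_zero` tells it to write the zero value instead, so pairing it with `notnull` documents that the column never holds NULL. The sketch below does not use go-pg at all; it only reads the struct tags with the standard `reflect` package to show which options each field carries. The trimmed `VmMessage` struct and the `pgOptions` and `hasOption` helpers are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// VmMessage is a trimmed copy of the reviewed model, for illustration only.
// GasUsed deliberately keeps its pre-review tag to show the difference.
type VmMessage struct {
	Cid      string `pg:",pk,notnull"`
	Method   uint64 `pg:",notnull,use_zero"`
	ExitCode int64  `pg:",notnull,use_zero"`
	GasUsed  int64  `pg:",use_zero"`
}

// pgOptions returns the options listed in a field's `pg` tag, skipping the
// leading column-name slot. It returns nil for unknown or untagged fields.
func pgOptions(model interface{}, field string) []string {
	f, ok := reflect.TypeOf(model).FieldByName(field)
	if !ok {
		return nil
	}
	parts := strings.Split(f.Tag.Get("pg"), ",")
	if len(parts) < 2 {
		return nil
	}
	return parts[1:]
}

// hasOption reports whether want appears in opts.
func hasOption(opts []string, want string) bool {
	for _, o := range opts {
		if o == want {
			return true
		}
	}
	return false
}

func main() {
	for _, name := range []string{"Cid", "Method", "ExitCode", "GasUsed"} {
		opts := pgOptions(VmMessage{}, name)
		fmt.Printf("%-8s notnull=%v use_zero=%v\n",
			name, hasOption(opts, "notnull"), hasOption(opts, "use_zero"))
	}
}
```

A check like this could be turned into a unit test that flags any numeric field carrying `use_zero` without `notnull`, which is the inconsistency the review comments point out.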
schemas/v1/8_vm_messages.go
Outdated
returns jsonb
);
ALTER TABLE ONLY {{ .SchemaName | default "public"}}.vm_messages ADD CONSTRAINT vm_messages_pkey PRIMARY KEY (height, state_root, cid, source);
CREATE INDEX vm_messages_height_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BRIN (height);
I think BRIN indexing is dependent on write order, which is use-case specific. BTREE is the better generalized index here.
- CREATE INDEX vm_messages_height_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BRIN (height);
+ CREATE INDEX vm_messages_height_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BTREE (height);
Indeed, will use btree instead.
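The write-order concern can be illustrated with a toy model. A BRIN index (with the default min/max operator class) keeps only a minimum and maximum per range of physical blocks, so a lookup has to visit every range whose interval might contain the value. The Go sketch below simulates that summary over an in-memory slice; it is not PostgreSQL code, and the names `summarize`, `candidateRanges`, `orderedWrites`, and `interleavedWrites`, as well as the four-worker backfill scenario, are assumptions chosen for illustration.

```go
package main

import "fmt"

// blockRange is the only thing a min/max BRIN-style summary keeps per range.
type blockRange struct{ min, max int64 }

// summarize splits heights (in physical write order) into fixed-size ranges
// and records the min and max height seen in each.
func summarize(heights []int64, rowsPerRange int) []blockRange {
	var ranges []blockRange
	for i := 0; i < len(heights); i += rowsPerRange {
		end := i + rowsPerRange
		if end > len(heights) {
			end = len(heights)
		}
		r := blockRange{min: heights[i], max: heights[i]}
		for _, h := range heights[i+1 : end] {
			if h < r.min {
				r.min = h
			}
			if h > r.max {
				r.max = h
			}
		}
		ranges = append(ranges, r)
	}
	return ranges
}

// candidateRanges counts the ranges a lookup for height would have to scan.
func candidateRanges(ranges []blockRange, height int64) int {
	n := 0
	for _, r := range ranges {
		if r.min <= height && height <= r.max {
			n++
		}
	}
	return n
}

// orderedWrites models rows inserted in ascending height order.
func orderedWrites(total int) []int64 {
	out := make([]int64, total)
	for i := range out {
		out[i] = int64(i)
	}
	return out
}

// interleavedWrites models several workers backfilling different height
// windows and taking turns writing rows.
func interleavedWrites(total, workers int) []int64 {
	per := total / workers
	out := make([]int64, 0, per*workers)
	for i := 0; i < per; i++ {
		for w := 0; w < workers; w++ {
			out = append(out, int64(w*per+i))
		}
	}
	return out
}

func main() {
	const total, rowsPerRange, height = 1000, 100, 500
	ordered := summarize(orderedWrites(total), rowsPerRange)
	mixed := summarize(interleavedWrites(total, 4), rowsPerRange)
	fmt.Printf("ordered writes:     %d of %d ranges\n", candidateRanges(ordered, height), len(ordered))
	fmt.Printf("interleaved writes: %d of %d ranges\n", candidateRanges(mixed, height), len(mixed))
}
```

With ordered writes, a lookup for height 500 touches 1 of 10 ranges; with the interleaved pattern it touches all 10, at which point the summary no longer narrows the search. A B-tree does not depend on physical row order in this way, which is the argument for BTREE as the general-purpose choice here.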
COMMENT ON COLUMN {{ .SchemaName | default "public"}}.vm_messages.method IS 'The method number invoked on the recipient actor. Only unique to the actor the method is being invoked on. A method number of 0 is a plain token transfer - no method execution';
COMMENT ON COLUMN {{ .SchemaName | default "public"}}.vm_messages.actor_code IS 'The CID of the actor that received the message.';
COMMENT ON COLUMN {{ .SchemaName | default "public"}}.vm_messages.exit_code IS 'The exit code that was returned as a result of executing the message.';
COMMENT ON COLUMN {{ .SchemaName | default "public"}}.vm_messages.gas_used IS 'A measure of the amount of resources (or units of gas) consumed, in order to execute a message.';
Curious: Do you know what resources this would be other than gas?
Reusing the description of gas here that we use elsewhere, e.g.: https://github.com/filecoin-project/lily/blob/v0.10.0/schemas/v1/base.go#L391
I suspected it was copypasta. Just wondering if there is anything other than gas that it could have been referring to; I couldn't think of anything.
schemas/v1/8_vm_messages.go
Outdated
CREATE INDEX vm_messages_height_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BTREE (height);
CREATE INDEX vm_messages_cid_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH (cid);
CREATE INDEX vm_messages_source_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH (source);
CREATE INDEX vm_messages_from_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH ("from");
CREATE INDEX vm_messages_to_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH ("to");
CREATE INDEX vm_messages_actor_code_method_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BTREE (actor_code, method);
If we want a write-optimized schema, I'd recommend getting rid of several of these indices, since all of them need to be updated whenever an `insert` or `update` happens.
I propose the following:
- CREATE INDEX vm_messages_height_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BTREE (height);
- CREATE INDEX vm_messages_cid_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH (cid);
- CREATE INDEX vm_messages_source_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH (source);
- CREATE INDEX vm_messages_from_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH ("from");
- CREATE INDEX vm_messages_to_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH ("to");
- CREATE INDEX vm_messages_actor_code_method_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BTREE (actor_code, method);
+ CREATE INDEX vm_messages_height_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BTREE (height);
+ CREATE INDEX vm_messages_from_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH ("from");
+ CREATE INDEX vm_messages_to_idx ON {{ .SchemaName | default "public"}}.vm_messages USING HASH ("to");
Regarding:
CREATE INDEX vm_messages_actor_code_method_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BTREE (actor_code, method);
This only makes sense if we have queries with a filter like `where actor_code = x and method = b` or `where actor_code = x`. This wouldn't work well with filters like `where method = b and actor_code = x` or `where method = b`.
Therefore, I'd recommend, based on Mike's comment here:
CREATE INDEX vm_messages_actor_code_idx ON {{ .SchemaName | default "public"}}.vm_messages USING BTREE (actor_code);
> This only makes sense if we have queries that have a filter that's `where actor_code = x and method = b`

This will be a common query since the method number depends on the actor code, i.e. method numbers only have meaning when compared with the actor they were called on.
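The leading-column behaviour under discussion can be sketched with a sorted slice standing in for the composite B-tree on `(actor_code, method)`. This is a simplified model, not PostgreSQL's implementation; the `entry` type, the helper names, and the actor labels are hypothetical. One caveat: in PostgreSQL the textual order of predicates in a `WHERE` clause does not affect index use, so `method = b AND actor_code = x` can use the index just like `actor_code = x AND method = b`. What matters is whether the leading column, `actor_code`, is constrained at all.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is one index entry; actor labels below are placeholders, not real actor codes.
type entry struct {
	actorCode string
	method    uint64
}

// buildIndex returns a copy sorted by (actorCode, method), mimicking the
// leaf order of a composite B-tree on those two columns.
func buildIndex(es []entry) []entry {
	idx := append([]entry(nil), es...)
	sort.Slice(idx, func(i, j int) bool {
		if idx[i].actorCode != idx[j].actorCode {
			return idx[i].actorCode < idx[j].actorCode
		}
		return idx[i].method < idx[j].method
	})
	return idx
}

// lookupActorMethod seeks to the first matching entry by binary search and
// then reads only the contiguous matches.
func lookupActorMethod(idx []entry, actor string, method uint64) (matches, scanned int) {
	start := sort.Search(len(idx), func(i int) bool {
		e := idx[i]
		return e.actorCode > actor || (e.actorCode == actor && e.method >= method)
	})
	for i := start; i < len(idx) && idx[i].actorCode == actor && idx[i].method == method; i++ {
		matches++
		scanned++
	}
	return matches, scanned
}

// lookupMethodOnly has no leading-column constraint, so matches are scattered
// and every entry has to be inspected.
func lookupMethodOnly(idx []entry, method uint64) (matches, scanned int) {
	for _, e := range idx {
		scanned++
		if e.method == method {
			matches++
		}
	}
	return matches, scanned
}

// sampleEntries shows the same method number appearing under different actors.
func sampleEntries() []entry {
	return []entry{
		{"actor-a", 2}, {"actor-b", 2}, {"actor-a", 3},
		{"actor-b", 16}, {"actor-a", 2}, {"actor-c", 0},
	}
}

func main() {
	idx := buildIndex(sampleEntries())
	m, s := lookupActorMethod(idx, "actor-a", 2)
	fmt.Printf("actor_code+method: %d matches, %d entries scanned\n", m, s)
	m, s = lookupMethodOnly(idx, 2)
	fmt.Printf("method only:       %d matches, %d entries scanned\n", m, s)
}
```

In the sample data, method 2 appears under two different actors, which reflects the point that a method number only has meaning together with its actor code. The combined lookup reads 2 entries for 2 matches, while the method-only lookup reads all 6 entries to find 3.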
@kasteph, I addressed your feedback, can you give this another look and a ✅ if it looks okay?
@frrist looks good, approved and thank you!
What
This PR addresses #978 by adding:
- a `vm_messages` model and schema for tracking internal messages (off-chain) sent by the VM during on-chain message execution

An example of the queries this allows:
- Select a message sent to a multisig actor that proposes sending a method with the below `params`
- Select all VM messages (via `source`) resulting from the execution of this proposed method.

We see that the proposed message was used to withdraw a balance from a miner actor by calling the miner's `WithdrawBalance` method (via `bafy2bzaceaixcm7hv6qtgqb7ccw7tayxcu64c4qcdcqtf332uk3vrwol6ozrw`), which then resulted in the miner actor performing a send of the requested amount (via `bafy2bzacebhlcwggejv4mgb6hgvoqmxvaouzpysdtdwlgg4gnbdgqnzkwak72`).
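The original query text did not survive in this page, so the following is only a hedged sketch of what the second query (all VM messages for a given `source`) might look like when issued from Go. The column names come from the schema reviewed above; the selected column list, the ordering, the `bySourceQuery` helper, and the placeholder CID are assumptions. It builds the statement and its arguments for a `$1`-style PostgreSQL placeholder without opening a database connection.

```go
package main

import (
	"errors"
	"fmt"
)

// vmMessagesBySource selects the internal messages triggered by one on-chain
// message. The column list is an illustrative subset of the vm_messages table.
const vmMessagesBySource = `SELECT height, cid, "from", "to", method, actor_code, exit_code
FROM vm_messages
WHERE source = $1
ORDER BY height`

// bySourceQuery returns the statement and its positional arguments, ready to
// pass to a database/sql-style Query call.
func bySourceQuery(source string) (string, []interface{}, error) {
	if source == "" {
		return "", nil, errors.New("source CID must not be empty")
	}
	return vmMessagesBySource, []interface{}{source}, nil
}

func main() {
	// Placeholder value: substitute the CID of the on-chain message of interest.
	query, args, err := bySourceQuery("on-chain-message-cid")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(query)
	fmt.Println("args:", args)
}
```

Passing the CID as a bound argument rather than formatting it into the string keeps the statement safe to reuse for any `source` value.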