-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add nonce column to actor_states #1105
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #1105 +/- ##
======================================
Coverage 32.8% 32.8%
======================================
Files 46 46
Lines 3156 3156
======================================
Hits 1037 1037
Misses 2023 2023
Partials 96 96 |
I suspect this migration will not apply cleanly and I would really prefer to test this on a copy of that table in our production DB before we accept/merge this. |
Honestly, the view approach seems best here. We're making the schema a "lie" by indicating this column is nullable (which is impossible in the domain). This will require those who consume this schema to understand the semantics of And we're saying there's "overhead" in a view, but if we rename the existing table and put it and the new able behind a view called |
Thanks for taking the time to review this and the feedback!
Yes.
If there are multiple rows with the same In general, a unique constraint is violated if there is more than one row in the table where the values of all of the columns included in the constraint are equal. source
That's a good catch, thanks. I'll add the
That sounds good.
Do you mind expanding what you mean by the semantics of
It is much simpler to include in the column comment that this column will have null values before this change is applied because it was not in the application code rather than create a new table, rename the old table, create a view that unifies these tables, and then document that. The method you describe is seamless to the consumer, certainly, but there are also people who operate Lily for whom such tasks is just extra work (and sometimes not easy to find and can only be found in the |
@@ -79,6 +79,8 @@ type ActorState struct { | |||
Code string `pg:",pk,notnull"` | |||
// Top level of state data as json. | |||
State string `pg:",type:jsonb,notnull"` | |||
// The next Actor nonce that is expected to appear on chain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're already making changes here, let's add a state_root field too. It will make the table re-org safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, let me think harder about the schema then because I don't wanna overcomplicate things but also make sure it's correct.
` | ||
ALTER TABLE {{ .SchemaName | default "public"}}.actor_states | ||
ADD COLUMN nonce BIGINT, | ||
ADD CONSTRAINT actor_states_uniq_nullable_nonce UNIQUE (height,head,nonce); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should replace height with state_root as it's reorg safe.
The schema is all there is as far as a def of the shape of data. NULL is a placeholder for an empty value for us here because we don't know retroactively what these values are suppose to be. We are adding this piece of semantic knowledge as a comment on a column (maybe) but this knowledge does not appear in other use cases where it is important and easily overlooked (for the same reason it would be lost in the CHANGELOG). (Such as how an ORM would interpret the schema automagically, or how certain query will behave in the face of a NULLable column vs not, or even writing the query in a way that gives us the answer we expected in this specific context (which could be surprisingly different from others). Ideally, the semantics of our internal operations should influence this schema as little as possible. That said, if we're introducing these semantics to simplify dev/op efforts, that's one thing and we've made such concessions in the past. But we've also tried to make schemas mostly read-only and using new models to represent schema changes is the most honest way of tracking how our exports evolve over time. We attempted this with the If we are going to deprecate this schema in favor of something slimmer in the future once IDL work lands, then I'm much less concerned about these things in the short term. I wanted to raise this concern (as I have in the past) and make sure we are aware of the costs we are paying. |
Also, let's not trivialize the differences here. Creating a comment is easy for this PR, but I will need to spend cycles to verify production doesn't break/degrade due to this migration. We are trading off avoidance of one complexity for taking on another one at runtime. (Which, also as a operator, I'm not a fan of.) |
@kasteph I hope this doesn't come off as nitpicky. I appreciate you submitting the PR for this and I won't block the merge if others want to go this way. Please let me know once the migration is ready for testing and I'll run it in staging for a bit to measure/test. |
There is a risk that this migration is expensive when we roll out to production and ends up interrupting services while the database blocked writing to this table. To mitigate this risk, I would like to dupe the We could also handle the rollout manually by turning off actor tasks during migration and then gapfilling them after it completes. This would only affect data from the disabled tasks. I'm not sure about the scope of impact of data missing from this task on alerts and monitoring/analysis capabilities, but I'm certain we use this data and is non-zero. Lastly, we could avoid testing if this were a new table and not an If you want to abort the migration testing on a new instance, please ping me and I'll tear it down. |
Looks like restoring a production backup to a duplicate instance is broken in the UI. (!!!) I have an open request to support to investigate that. |
We discussed that this migration is not necessary. In order to avoid work needed to test/land this, we will postpone discussion on this improvement until some time later if needed. I have abandoned the migration testing work I mentioned previously. |
This PR:
actor_states
table and;nonce
column to theactor_states
table in the schema.nonce
column has aUNIQUE
constraint on it but still nullable (postgres docs)actor_states
) and (actor_states_with_nonce
).Resolves #1104