-
Notifications
You must be signed in to change notification settings - Fork 251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #3650 (participation format in BeaconState result is out of spec) #3776
Conversation
if not result.data.add(uint8(parsed)): | ||
reader.raiseUnexpectedValue("The participation flags list size exceeds limit") | ||
|
||
RestJson.useCustomSerialization(AltairBeaconState.current_epoch_participation): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not make ParticipationFlags a distinct type instead? this custom serialization stuff is quite fragile because the default is wrong: you don't get an error from the compiler if it's missing from the compile (same reason why Rest in general must never automatically serialize types)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
notably, this format for flags is also required when reading yaml files (in tests) - it's a distinct type in the REST spec, not a "custom field" - ie any time participation flags are serialized, they should be serialized as numbers - this is not specific to these fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would require changing all usage sites, tweaks in SSZ, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, yes - that's the point, a bit - it's a different type, per one of the specs for which it's used, so that should be reflected in our code as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serializing by field maintains the existing semantic mismatch - sooner or later, the semantic mismatch will lead to a bug similar to the one that's being fixed here - custom field serialization is not the tool for the job here (it rarely is)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arguably, we should also remove the list[byte-like]
overloads for serialization because they also are overly permissive, and if another field type is added that looks like a byte but is a number, it should not be picked up automatically either - instead, there should be a compile-time error so the right serialization can be chosen deliberately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, how would these non type-based serializers look like? If you have a spec where the same type is serialized in different ways in two different places, I would say "slightly botched" is an accurate description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well .. it's not the same type, that's the point - it's a different type in one of the specs that is relevant to serialization and we've interpreted it wrong.
The pedantic solution here, if we were to not use the spec types for REST, would be a RestBeaconState
and RestParticipationFlags
. This would be "correct" but also truly a lot of work. The distinct type is not a lot of work, comparatively.
Using custom field serializers is ugly hack, and we can do it for this release perhaps, but it should be fixed immediately after.
Non-type-based serializers are simple functions that load data field by field without relying on type matching magic - typically, they'd be generated from the oapi spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing this discussion is a bit pointless, but why should the REST API use a different definition of the BeaconState
type? Where is this "intention" documented? I would argue the inconsistency was an oversight and an unfortunate design flaw because it brings no benefits and increases the cost of the implementation significantly as you admit yourself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this "intention" documented?
I asked the people more heavily heavily involved than us in beacon api spec writing. For them, this is not a flaw: it is the most natural way to represent the flags. We're looking at it from a SSZ-centric point of view (because our types were first developed for SSZ) - they're looking at it form a JSON-centric one, where encoding lists of numbers as .. lists of numbers is more natural.
More formally, it's documented here: https://github.com/ethereum/beacon-APIs/blob/master/types/altair/epoch_participation.yaml#L6
but why should the REST API use a different definition of the BeaconState type
Well, we use Nim types to represent both REST types and SSZ types - the REST types are more information-rich than SSZ, and we must preserve that information: SSZ is a subset of REST, type-richness-wise, so we can save ourselves some work by reducing REST to SSZ and not the other way around. We're already reducing Nim types to REST and to SSZ: neither of these two standards match Nim exactly - this is important to remember, because the Nim types we use in the spec code will always be a compromise between the three use cases, as long as we reuse them for both Nim code, SSZ and REST.
Effectively, this becomes a constraint: it means we also can't use all features of Nim in our spec code - ie some Nim features don't map 1:1 to SSZ/REST and vice versa - we're left with the lowest common denominator between these use cases. It's a "cost" of serialization frameworks in general, reusing types that you use in code as "serialization specifications" which inevitably ends up having semantic differences.
@@ -79,7 +79,7 @@ type | |||
ParticipationFlags* = uint8 | |||
|
|||
EpochParticipationFlags* = | |||
HashList[ParticipationFlags, Limit VALIDATOR_REGISTRY_LIMIT] | |||
distinct HashList[ParticipationFlags, Limit VALIDATOR_REGISTRY_LIMIT] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is something of an odd choice - it's the ParticipationFlags
type that has a different serialization than the other "byte-like" types (UInt8
in the beacon API) - if later we have an array[ParticipationFlags]
it will be serialized as a list of numbers. we can leave as-is for the release but this spec deviation should be noted in the code
see https://github.com/ethereum/beacon-APIs/blob/e3a3f6e1edb25371a0641d16097d13ee369589bc/types/primitive.yaml#L51 where Bytes32
and Root
is what makes a list be serialized as a string - ie our code is not representative of what goes on in the spec.
Notable sub-module changes:
handle trailing comma properly when deserializing seq and inline table nim-toml-serialization#47