Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V2 API calls and SSZ encoded response. #167

Closed
cheatfate opened this issue Sep 8, 2021 · 16 comments · Fixed by #170
Closed

V2 API calls and SSZ encoded response. #167

cheatfate opened this issue Sep 8, 2021 · 16 comments · Fixed by #170

Comments

@cheatfate
Copy link

Calls like https://ethereum.github.io/beacon-APIs/#/Beacon/getBlockV2 and https://ethereum.github.io/beacon-APIs/#/Debug/getStateV2 has discriminator field for application/json encoding, but its unclear how response should be encoded for application/octet-stream (SSZ) encoding - should the response include version field too?
Because without discriminator it could be pretty hard to decoded SSZ encoded SignedBeaconBlock or BeaconState object.

@ajsutton
Copy link
Contributor

ajsutton commented Sep 8, 2021

I don't think it makes any sense to add the discriminator to the SSZ encoded form as then it wouldn't be parsable as a block or block directly by clients. The primary use for the SSZ encoding form is to get a state for use with checkpoint sync or a state and block to use with manual state transition testing.

I think it's reasonable to expect that anyone requesting the data as SSZ has knowledge of the spec and thus would know the epoch forks activated at etc so could select the right schema to decode with. And you'd need to know the schema in order to parse the version field out of the SSZ response anyway (you could assume it will always be the first x bytes but could just as easily find the slot from the block or state using that method as well then lookup the right schema from there).

@zah
Copy link

zah commented Sep 8, 2021

While this argument is reasonable for requests by slot, it's a bit less clear whether somebody requesting blocks or states by their hash tree root will have knowledge regarding the returned object type. The SSZ union type could be employed to eliminate any ambiguity here (in version 3 of the API).

@mpetrunic
Copy link
Collaborator

You could fetch block header by root to figure out slot. Only viable solution would probably be to return version as response header

@ajsutton
Copy link
Contributor

ajsutton commented Sep 8, 2021

Returning the version as a header would make a lot of sense. The utility of these methods is badly broken if the SSZ data isn't just the state or block.

@michaelsproul
Copy link
Collaborator

While this argument is reasonable for requests by slot, it's a bit less clear whether somebody requesting blocks or states by their hash tree root will have knowledge regarding the returned object type.

The caller's SSZ library should be able to handle this for them, because it's possible to read the slot from the returned bytes and then decide what to decode based on that. This is what Lighthouse, Teku and probably most of the other clients do.

@arnetheduck
Copy link
Contributor

Returning the version as a header would make a lot of sense.

this argument applies for json as well - it's annoying to have to make two passes of the json - once to figure out what type it is and the second time to actually decode the payload - I'm assuming here streaming parsers are used, where you don't decode to a dynamic json object in-between (this is generally extremely slow and wasteful)

@arnetheduck
Copy link
Contributor

The caller's SSZ library should be able to handle this for them

This doesn't fly if there are no differences to grab on to - it's quite possible that there will be a spec change that results in types with the same serialized representation - for example, an empty list being serialized doesn't leave a trace.

@cheatfate
Copy link
Author

While this argument is reasonable for requests by slot, it's a bit less clear whether somebody requesting blocks or states by their hash tree root will have knowledge regarding the returned object type.

The caller's SSZ library should be able to handle this for them, because it's possible to read the slot from the returned bytes and then decide what to decode based on that. This is what Lighthouse, Teku and probably most of the other clients do.

I'm sorry, but i can't find such client implementations in any of the clients, because this V2 calls are not used in validator_client.
For example i have found only such version of client in Teku and it definitely uses only application/json encoding.
https://github.com/ConsenSys/teku/blob/08467e305ea67c947d982a23b03d4801599e2ede/acceptance-tests/src/testFixtures/java/tech/pegasys/teku/test/acceptance/dsl/TekuNode.java#L310-L317

@mcdee
Copy link
Contributor

mcdee commented Sep 24, 2021

I could get behind a versioning header in the response. It still allows users to either double parse (inefficient, but we're already using JSON so efficiency isn't top of our list of priorities) or pick through the SSZ (which may end up requiring heuristics some number of iterations down the line), but gives a fairly clean alternative for those that want to use it.

Regarding the specifics of the header: the value could match that of the version field in the returned JSON (so either PHASE0or ALTAIR at current), which should be relatively simple as servers already have that data around. This could be added to the content type response header as a parameter, e.g. Content-type: application/json; version=PHASE0 or Content-type: application/octet-stream; version=ALTAIR.

@cheatfate
Copy link
Author

cheatfate commented Sep 24, 2021

@mcdee sorry, but double parsing for 100+ Mb JSON file is extremely ineffective solution. getState() call generates such big json responses for prater network.

@michaelsproul
Copy link
Collaborator

michaelsproul commented Sep 24, 2021

Sorry, I'd missed the previous suggestion of using an HTTP response header. I see no issue with adding a header to both the SSZ and JSON endpoints. Implementations can choose to use it or ignore it if they wish. Lighthouse's contextual decoding for SSZ is here for reference.

@cheatfate
Copy link
Author

Sorry, I'd missed the previous suggestion of using an HTTP response header. I see no issue with adding a header to both the SSZ and JSON endpoints. Implementations can choose to ignore it they wish. Lighthouse's contextual decoding for SSZ is here for reference.

What about BeaconState?

@arnetheduck
Copy link
Contributor

contextual decoding

This is somewhat sad because it relies on the slot field remaining at the same offset in the encoding across all fork versions, creating an additional constraint to deal with when changing the data structures between forks - it certainly works for now and for some types, but it's somewhat .. sad. A designated header field for the purpose frees the future us from having to consider such hacks.

@mpetrunic
Copy link
Collaborator

I could get behind a versioning header in the response. It still allows users to either double parse (inefficient, but we're already using JSON so efficiency isn't top of our list of priorities) or pick through the SSZ (which may end up requiring heuristics some number of iterations down the line), but gives a fairly clean alternative for those that want to use it.

Regarding the specifics of the header: the value could match that of the version field in the returned JSON (so either PHASE0or ALTAIR at current), which should be relatively simple as servers already have that data around. This could be added to the content type response header as a parameter, e.g. Content-type: application/json; version=PHASE0 or Content-type: application/octet-stream; version=ALTAIR.

I would actually propose separate header just for the sake of siplicity of documenting and using it.

Something like "Eth-Beacon-Version": ALTAIR

@mcdee
Copy link
Contributor

mcdee commented Sep 26, 2021

I would actually propose separate header just for the sake of siplicity of documenting and using it.

A content type parameter would be more accurate, and I would hope that there are enough libraries out there that can parse the header for it to not be an issue, but it isn't something I'm willing to fight over much.

@mpetrunic
Copy link
Collaborator

#170

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants