Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start account and sequence numbers from 1 #6949

Closed
wants to merge 1 commit into from

Conversation

aaronc
Copy link
Member

@aaronc aaronc commented Aug 5, 2020

ref: #6213

In ADR 020 (#6111) we decided to start account and sequence numbers from 1 to avoid needing to deal with differences in how
protobuf implementations encode 0.

This PR makes in-state account numbers start from 2 and uses 1 as the "zero" value that is used for genesis transactions (and potentially fee grant transactions in the future https://github.com/cosmos/cosmos-sdk/pull/5768/files#r465244086)

I do want to note that I have reservations about merging this. It has been noted that there could be unintended consequences to this change. For instance, we would likely need to increment all existing account numbers. So I think before we move forward with this we should really think through what all the consequences of this are and if it worth it.


Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer
  • Review Codecov Report in the comment section below once CI passes

@codecov
Copy link

codecov bot commented Aug 5, 2020

Codecov Report

Merging #6949 into master will increase coverage by 5.48%.
The diff coverage is 45.45%.

@@            Coverage Diff             @@
##           master    #6949      +/-   ##
==========================================
+ Coverage   61.82%   67.31%   +5.48%     
==========================================
  Files         520      151     -369     
  Lines       32129    10343   -21786     
==========================================
- Hits        19865     6962   -12903     
+ Misses      10656     2836    -7820     
+ Partials     1608      545    -1063     

@webmaster128
Copy link
Member

Thank you for bringing this up @aaronc. I'm still a strong supporter of the change because the handling of zero values and protobuf leads to a lot of problems in interface design. Look at this random current example of GET http://localhost:1317/auth/accounts/cosmos1pkptre7fdkl6gfrzlesjjvhxhlc3r4gmmk8rs6:

{
  "height": "0",
  "result": {
    "type": "cosmos-sdk/BaseAccount",
    "value": {
      "address": "cosmos1pkptre7fdkl6gfrzlesjjvhxhlc3r4gmmk8rs6",
      "account_number": "1"
    }
  }
}

Why are pub_key and sequence missing here? They just get lost somewhere in the process and as a consequence, the API does not behave like what you expect from a JSON API. The height clearly shows that 0s are encoded as "0", right.

I understand that this might be a bit of work in the state machine and migrations, but many people using very different technologies will have to deal with the signing interface.

This PR makes in-state account numbers start from 2 and uses 1 as the "zero" value that is used for genesis transactions (and potentially fee grant transactions in the future https://github.com/cosmos/cosmos-sdk/pull/5768/files#r465244086)

I just briefly looked at the use case – makes sense. I would just not consider it a "zero value" but a "special value". It is a value that has to be treated differently then the normal numbers, but it is no extrapolation of the regular values to zero. You just need to map to any place in the uint64 range that avoids conflicts. One solution is to reserve a range for special values (like [1, 1] (your suggestion), [1, 9] or [1, 999]). A different solution is what the C++ standard library does to handle position or not found in an usize type:

position = str.find(str2);

if (position != string::npos)
    cout << "first 'needle' found at: " << int(found) << endl;

where the constant string::npos is the last possible value of usize, i.e. 2^64-1 on a 64 bit system. This is conflict-resistent because you cannot produce a real position with such a high value.

@aaronc
Copy link
Member Author

aaronc commented Aug 5, 2020

Thanks for chiming in @webmaster128 ! So one change I would like to make with proto json is to emit default values in client facing endpoints. There is a flag we can use to enable this across the board. I think that would help with the json issue you shared above.

I would also like to hear thoughts from people in the ecosystem about this change to understand if there are potential negative side effects we should know about or not. /cc @ebuchman @zmanian @alexanderbez @alessio @jackzampolin

I do want to make sure the client UX is as good as possible and also make sure we are doing that wisely and with all the information.

@alexanderbez
Copy link
Contributor

Do we honestly have any alternatives @aaronc? From what I understand, this is mainly important with respect to signing msgs and txs, not the client-facing APIs and data. Is that correct? Might help to quickly outline why 0 can be a problem here.

@webmaster128
Copy link
Member

webmaster128 commented Aug 6, 2020

Might help to quickly outline why 0 can be a problem here.

Let me try: protobuf seralization is not unique (i.e. there exist an unlimited number of valid binary representations for a protobuf document). For signature verification, signer and verifier need to agree on the same serialization of a SignDoc. This is a relatively simple document with a TxBody embedded as bytes, chain ID, account number and sequence. In order to avoid the need for specifying and implementing a sophisticated protobuf normalization algorithm it is valuable to avoid the existence of empty values in SignDoc. Then the serialization rules are as simple as: 1. no extra fields and 2. fields ordered by field number. It is very likely that those rules are followed by default in common protobuf encoders.

@aaronc
Copy link
Member Author

aaronc commented Aug 6, 2020

As I have noted in the past, writing a hand generated serializer for SignDoc should be fairly trivial. All that is really needed is a function to encode varints which all protobuf implementations provide. There are only five fields to encode - three byte/strings and two unit64's. This is maybe 10-15 lines of code and we can provide a canonical reference implementation.

If we took that approach, zero assumptions need to be made about the underlying proto implementation even field ordering.

So I wouldn't consider the complexity of hand-encoding SignDoc to be a sufficient rationale. Maybe there are other reasons to avoid 0, but anyway...

@alexanderbez
Copy link
Contributor

@aaronc you have my support to take that route and close this PR then.

@cwgoes
Copy link
Contributor

cwgoes commented Aug 6, 2020

As I have noted in the past, writing a hand generated serializer for SignDoc should be fairly trivial. All that is really needed is a function to encode varints which all protobuf implementations provide. There are only five fields to encode - three byte/strings and two unit64's. This is maybe 10-15 lines of code and we can provide a canonical reference implementation.

If we took that approach, zero assumptions need to be made about the underlying proto implementation even field ordering.

So I wouldn't consider the complexity of hand-encoding SignDoc to be a sufficient rationale. Maybe there are other reasons to avoid 0, but anyway...

Just a quick question - will a custom serialisation function cause issues for hardware signers (e.g. the Ledger)? We should check to see if this will require an app update and let the Ledger developers know if so.

@aaronc
Copy link
Member Author

aaronc commented Aug 6, 2020

@cwgoes for one thing, the ledger will continue to use amino json for now and then be replaced with SignModeTextual. Also, as I understand it the ledger isn't actually responsible for generating a SignDoc, just reading it.

@webmaster128
Copy link
Member

As I have noted in the past, writing a hand generated serializer for SignDoc should be fairly trivial. All that is really needed is a function to encode varints which all protobuf implementations provide. There are only five fields to encode - three byte/strings and two unit64's. This is maybe 10-15 lines of code and we can provide a canonical reference implementation.

This sounds like the most hacky of all possible options. It is not only 10-15 lines of code. We then need a spec specific to that one document and test vectors as well. This creates almost all of the work but cannot be reused. Encoding protobuf is not rocket science, but not trivial either.

If you are not willing to work around the empty value problem of protobuf, we should have a properly specified canonicalization, like canonical-proto3 without the JSON part. This then can be reused if needed.

@aaronc
Copy link
Member Author

aaronc commented Aug 6, 2020

As I have noted in the past, writing a hand generated serializer for SignDoc should be fairly trivial. All that is really needed is a function to encode varints which all protobuf implementations provide. There are only five fields to encode - three byte/strings and two unit64's. This is maybe 10-15 lines of code and we can provide a canonical reference implementation.

This sounds like the most hacky of all possible options. It is not only 10-15 lines of code. We then need a spec specific to that one document and test vectors as well. This creates almost all of the work but cannot be reused. Encoding protobuf is not rocket science, but not trivial either.

With all due respect, I think this is simply not true. Let's all take a deep breath and avoid knee jerk reactions.

Here is some pseudocode that encodes to a buffer that has two functions WriteBytes and WriteUVarInt64 which all protobuf implementations must have anyway:

if( !signDoc.body_bytes.empty() ) {
  buf.WriteUVarInt64(0xA) // wire type and field number for body_bytes
  buf.WriteUVarInt64(signDoc.body_bytes.length())
  buf.WriteBytes(signDoc.body_bytes)
}

if( !signDoc.auth_info.empty()) {
 buf.WriteUVarInt64(0x12) // wire type and field number for auth_info
  buf.WriteUVarInt64(signDoc.auth_info.length())
  buf.WriteBytes(signDoc.auth_info)
}

if( !signDoc.chain_id.empty()) {
  buf.WriteUVarInt64(0x1a) // wire type and field number for chain_id
  buf.WriteUVarInt64(signDoc.chain_id.length())
  buf.WriteBytes(signDoc.chain_id)
}

if signDoc.account_number != 0 {
  buf.WriteUVarInt64(0x20) // wire type and field number for account_number
  buf.WriteUVarInt(signDoc.account_number)
}

if signDoc.account_sequence != 0 {
  buf.WriteUVarInt64(0x28) // wire type and field number for account_sequence
  buf.WriteUVarInt(signDoc.account_sequence)
}

So I stand corrected 18 lines of significant code if you check for empty values for body_bytes, etc. which I think you're assuming will never be empty (so it would be 15 without those).

We can argue about what is more hacky, but one could argue that it's hacky to make assumptions about all protobuf implementations encoding fields in order or changing account/sequence numbers to accomodate encoding. Let's just assess the situation and make the best decision! In order to actually change acount/sequence numbers I actually need green lights from other folks too (ping @ebuchman).

By the way, I am aware that weave uses pseudocode to describe transaction signing: https://weave.iov.one/docs/weave/weave-api-spec/tx-sign-spec#signing-transactions. I don't necessarily see how this is much different and we do actually have a spec - we're basically saying encode a protobuf document in field order with no defaults. If you have a compliant implementation, just use it. If not, then hand code it... which is still true even with acc num/seq starting from 1 because encoding fields in order is another assumption, just saying...

@webmaster128
Copy link
Member

webmaster128 commented Aug 6, 2020

My primary concern is another case of do what the Go code does, which you need to reverse engineer in order to write a compliant implementation. If we want a good interface, I think it is necessary to think about the interface first, not how it is implemented in any technology.

Given your code snippet above, I assume you favour the encoding rules

  1. no extra fields
  2. fields ordered by field number
  3. empty fields are omitted

Am I missing something? If not, this is a relatively simple unique encoding schema that can be documented and reused. If it gets a nice name (I suggest "regencoding"), everyone can easily refer to the spec. It is only one more rule as in the avoid empty approach. It is likely you get this encoding for free in many cases. If not, you can implement it easily by either configuring an existing encoder or writing your own.

In my previous comment I confused canonicalization with unique encoding, but we e.g. don't need to be able to tell if a serialization is canonical or not. Sorry for that.

I'm not promoting Weave, any decision we made in Weave or the Weave documentation. Weave is unmaintained for quite some time and its original inventor did not work on it for well over a year now. I think we should aim higher in terms of accessibility by more people using more different technologies. But the first difference is that there is a complete spec above any code snippet and the code you see there only backs up what was defined before. The second difference is that the encoding is completely custom and not an existing encoding with extra rules. This makes things like numbers significatly simpler as we you can just used fixed length encodings.

and we do actually have a spec - we're basically saying encode a protobuf document in field order with no defaults

If that is what we want, good. I just don't see where it is specified in ADR-0020. What we have right now is in the verification: Create a SignDoc and serialize it. Due to the simplicity of the type it is expected that this matches the serialization used by the signer.

@webmaster128
Copy link
Member

webmaster128 commented Aug 6, 2020

By the way, the simplest way to follow rule number 3. in protobufjs is to encode like this:

    const bytes = SignDoc.encode({
      // encoder sorts fields by field number
      accountNumber: accountNumber || null, // normalize 0 to unset
      accountSequence: accountSequence || null, // normalize 0 to unset
      authInfoBytes: authInfoBytes,
      chainId: chainId,
      bodyBytes: bodyBytes,
    }).finish();

@ethanfrey
Copy link
Contributor

Here is some pseudocode that encodes to a buffer that has two functions WriteBytes and WriteUVarInt64 which all protobuf implementations must have anyway:

if( !signDoc.body_bytes.empty() ) {
buf.WriteUVarInt64(0xA) // wire type and field number for body_bytes
buf.WriteUVarInt64(signDoc.body_bytes.length())
buf.WriteBytes(signDoc.body_bytes)
}
...

This is not what you want to have people to be forced to implement in each language, especially without a spec.

I think the only viable options are either:

  1. document empty fields (0, "", []) are not encoded, ensure that is done in Go, and then this puts the burden of checks on all the client library (which can relatively easily be done (eg || null), but is an annoying gotcha.
  2. ensure we never have valid "zero values" which were actually intentionally set but then ignored by go. Empty pubkey bytes can be not-encoded in any case, but a 0 amount is generally intended as a 0.

btw, I think account numbers do start with 1, it is just sequences.

Adding yet more custom go code leads us towards the same snakepit that was go-amino. Not that it was a bad codec, but it was unportable. Ensuring portability should be the first goal (or second after correctness).

@aaronc
Copy link
Member Author

aaronc commented Aug 6, 2020

For the record this has absolutely nothing to do with golang specifics. Gogo proto just happens to do the logical thing. In the intial version of ADR 020 I explained that 1) fields need to be serialized in order and 2) that default values need to be omitted. This is just the logical choice for attempting to canonicalize a protobuf message and it's documented here: https://github.com/regen-network/canonical-proto3. So nothing to do with golang. That documentation was linked in earlier versions but since we've tried to simplify signing to the utmost it's not even referenced, but we probably should reference it or even include the relevant parts in ADR 020.

There is no way to avoid some special encoding rules for SignDoc because the protobuf spec does not define them. So they are necessary. We could have chosen to not omit default values or to serialize fields in reverse order! Neither would have been logical choices IMHO but they are valid protobuf.

Like I said either a) you have protobuf implementation that happens to follow these rules or b) you hand encode SignDoc. I don't see any way around some clients needing to hand code stuff because somewhere there's probably a protobuf impl that doesn't encode fields in order...

None of this is to say that we shouldn't consider avoiding 0 but so far the biggest reason seems to be to accommodate how protobuf.js does things which I would note is also tying a spec to a language. And so far I haven't heard any feedback on potential downsides or lack thereof from anyone else here and I can't proceed without that...

@webmaster128
Copy link
Member

but we probably should reference it or even include the relevant parts in ADR 020.

There is no way to avoid some special encoding rules for SignDoc because the protobuf spec does not define them. So they are necessary.

Agreed. And at that point it probably does not really matter if we have 2 or 3 encoding rules on top of protobuf. I'd inline the encoding rules because most of canonical-proto3 is not needed here.

the biggest reason seems to be to accommodate how protobuf.js does things which I would note is also tying a spec to a language

For me it is not about better supporting one language or library but about the question whether or not we can assume serializations match by accident. protobuf.js is just one example that shows we either need to specify zero handling or work around it.

@zmanian
Copy link
Member

zmanian commented Aug 7, 2020

I'm pretty strongly opposed to something that will require so much user education.

This is computers! Sequences start with "0"

@webmaster128
Copy link
Member

btw, I think account numbers do start with 1, it is just sequences.

Turns out account numbers start with 0 as well, you just hardly see it. This is the account of the delegator address or our validator in local simapp setup:

$ curl -sS http://localhost:1318/auth/accounts/cosmos1lv5lq7qjr42attsdma622whwa7v0rc2d4pful8 | jq
{
  "height": "0",
  "result": {
    "type": "cosmos-sdk/BaseAccount",
    "value": {
      "address": "cosmos1lv5lq7qjr42attsdma622whwa7v0rc2d4pful8",
      "public_key": "61rphyEDts8E3PNwuSD0PqxJQimrqo6G1ZMqckQt5//nJl+vjEc=",
      "sequence": "1"
    }
  }
}

@webmaster128
Copy link
Member

Given protobufjs/protobuf.js#1357 (comment), I think it is pretty clear that serializing default values for numbers is disallowed by the spec and a bug in the encoder. This invalidates the original motivation for this change and I think we can close this PR.

@aaronc aaronc closed this Aug 7, 2020
@amaury1093 amaury1093 deleted the aaronc/6213-acc-num-seq-from-1 branch September 3, 2020 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants