Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(test): avoid failing getblocktemplate acceptance test when the chain tip changes #6091

Merged
merged 11 commits into from
Feb 3, 2023

Conversation

arya2
Copy link
Contributor

@arya2 arya2 commented Feb 2, 2023

Motivation

We need to ensure that block templates remain valid for repeated proposal checks.

See #5963 (comment).

Closes #6079.

I was hoping to add/retain test coverage but this grew more complicated than expected so let me know if one of the alternative designs on that issue would be preferable.

Complex Code or Requirements

This PR adds some concurrency in try_validate_block_template:

  • A spawned task for making long poll requests and sending new templates with submit_old: false to long_poll_result_rx until receiving a message on done_rx
  • A select! for cancelling block proposals and making new requests when receiving messages from long_poll_result_rx

Solution

  • Spawns a thread that calls getblocktemplate with long polling
  • Interrupts and retries block proposal check when new templates with a submit_old: false are received

Related changes:

  • Increases required number of successful proposals and decreases the interval to test long polling

Review

Let me know if one of the alternative designs would be preferable.

Reviewer Checklist

  • Will the PR name make sense to users?
    • Does it need extra CHANGELOG info? (new features, breaking changes, large changes)
  • Are the PR labels correct?
  • Does the code do what the ticket and PR says?
    • Does it change concurrent code, unsafe code, or consensus rules?
  • How do you know it works? Does it have tests?

Follow Up Work

  • Address review comments

@arya2 arya2 added C-enhancement Category: This is an improvement P-High 🔥 I-integration-fail Continuous integration fails, including build and test failures C-testing Category: These are tests A-rpc Area: Remote Procedure Call interfaces labels Feb 2, 2023
@arya2 arya2 requested a review from a team as a code owner February 2, 2023 02:22
@arya2 arya2 self-assigned this Feb 2, 2023
@arya2 arya2 requested review from dconnolly and removed request for a team February 2, 2023 02:22
@github-actions github-actions bot added the C-bug Category: This is a bug label Feb 2, 2023
@arya2 arya2 requested a review from teor2345 February 2, 2023 02:23
teor2345
teor2345 previously approved these changes Feb 2, 2023
Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, let's run it a few times to make sure that it works?

zebrad/tests/common/rpc_client.rs Outdated Show resolved Hide resolved
@teor2345 teor2345 added the C-trivial Category: A trivial change that is not worth mentioning in the CHANGELOG label Feb 2, 2023
@arya2
Copy link
Contributor Author

arya2 commented Feb 2, 2023

Looks good, let's run it a few times to make sure that it works?

It ran successfully a few times but I changed the constants to make 300 block proposals at 5 second intervals to be sure. I'll try 3000 if that succeeds.

Update: It succeeded with 300.

Update: 3000 failed with "not based on the current best chain tip", possibly because it was dropping the send futures.

Update: 1000 succeeded. I'm going to try a shorter EXTRA_LONGPOLL_WAIT_TIME because I'd like to be sure that longpolling is working as expected too.

Update: a shorter extra longpoll wait time worked too (with 2500 iterations).

@codecov
Copy link

codecov bot commented Feb 2, 2023

Codecov Report

Merging #6091 (419f3e8) into main (a51bc2e) will decrease coverage by 0.06%.
The diff coverage is n/a.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6091      +/-   ##
==========================================
- Coverage   78.08%   78.02%   -0.06%     
==========================================
  Files         304      304              
  Lines       39035    39035              
==========================================
- Hits        30482    30459      -23     
- Misses       8553     8576      +23     

mergify bot added a commit that referenced this pull request Feb 2, 2023
Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you're dropping some futures, but you might want to be awaiting them?
https://github.com/ZcashFoundation/zebra/actions/runs/4071294784/jobs/7012969586#step:8:970

Because send is cancel-safe, you might not be sending any messages on these channels at all:
https://docs.rs/tokio/latest/tokio/sync/mpsc/struct.Sender.html#method.send

Depending on the behaviour you want, you could await the futures or use blocking_send(). If there is any risk of deadlocks or hangs, then try_send() or send_timeout() might be better.

@arya2
Copy link
Contributor Author

arya2 commented Feb 2, 2023

Looks like you're dropping some futures, but you might want to be awaiting them? https://github.com/ZcashFoundation/zebra/actions/runs/4071294784/jobs/7012969586#step:8:970

Because send is cancel-safe, you might not be sending any messages on these channels at all: https://docs.rs/tokio/latest/tokio/sync/mpsc/struct.Sender.html#method.send

Depending on the behaviour you want, you could await the futures or use blocking_send(). If there is any risk of deadlocks or hangs, then try_send() or send_timeout() might be better.

🤦Thank you. I somehow missed the async on that fn. Updated my rust-analyzer settings to check clippy on save for next time.

@arya2 arya2 requested a review from teor2345 February 2, 2023 21:05
@arya2
Copy link
Contributor Author

arya2 commented Feb 2, 2023

I updated the constants to run the proposal test 1000x at 300ms interval. The log for an interrupt by a template with submit_old: false is usually visible in CI now:

2023-02-02T22:12:38.388229Z INFO got longpolling response with submitold of false before result of proposal tests

https://github.com/ZcashFoundation/zebra/actions/runs/4078725058/jobs/7030039421#step:8:1096

Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some concerns about the reliability of the timings used here, but I'm not sure if they are blockers.

@arya2
Copy link
Contributor Author

arya2 commented Feb 3, 2023

I have some concerns about the reliability of the timings used here, but I'm not sure if they are blockers.

These are excellent notes that have helped me notice some additional mistakes in this PR, but I'd like to handle these in a new PR so that the issue doesn't cause any more improper CI failures in the meantime.

Your concerns are perceptive, the timings - particularly BLOCK_PROPOSAL_INTERVAL - are wrong or flawed and in need of better explanations.

@teor2345
Copy link
Contributor

teor2345 commented Feb 3, 2023

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Feb 3, 2023

rebase

❌ Base branch update has failed

Git reported the following error:

Rebasing (1/9)
Auto-merging zebra-rpc/src/methods/get_block_template_rpcs/types/hex_data.rs
CONFLICT (content): Merge conflict in zebra-rpc/src/methods/get_block_template_rpcs/types/hex_data.rs
error: could not apply 0d8b0256... adds basic usage of long polling in gbt test
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply 0d8b0256... adds basic usage of long polling in gbt test

err-code: 46491

@teor2345
Copy link
Contributor

teor2345 commented Feb 3, 2023

@arya2 did you want to fix the merge conflict then I can approve?

@arya2 arya2 requested a review from teor2345 February 3, 2023 03:52
Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's give this a shot!

@teor2345
Copy link
Contributor

teor2345 commented Feb 3, 2023

In your next PR, would you mind reducing the verbose test output, if it isn't already fixed in this PR?

2023-02-03T03:52:28.131089Z INFO got getblocktemplate response, might not have transactions response_text="{"jsonrpc":"2.0","result":{"capabilities":["proposal"],"version":4,"previousblockhash":"00000000003a0c30f40de72e22e59a07e33adb56c9185b82b57a35ce415112d8","blockcommitmentshash":"3eb2e1161099c57fc790abdddf84d13aadcb857873036a9b847f6e0006f55126","lightclientroothash":"3eb2e1161099c57fc790abdddf84d13aadcb857873036a9b847f6e0006f55126","finalsaplingroothash":"3eb2e1161099c57fc790abdddf84d13aadcb857873036a9b847f6e0006f55126","defaultroots":{"merkleroot":"252af63d99a964648e66b8b0e2c4fb3ec4c9a78dacc5ba093d5a6079f151955b","chainhistoryroot":"010537ba23f1a4932eae08cc122fa203afede82cbb939087b4ee8b9014c9e93f","authdataroot":"dc40ac2b3a4ae92e4aa0d42abeea6934ef91e6ab488772c0466d7051180a4e83","blockcommitmentshash":"3eb2e1161099c57fc790abdddf84d13aadcb857873036a9b847f6e0006f55126"},"transactions":[{"data":"040000808520...

https://github.com/ZcashFoundation/zebra/actions/runs/4080594267/jobs/7033830311#step:8:199

Some of these transactions, blocks, or templates can contain 4MB of hex data. And sometimes GitHub won't load the logs if they are too big.

@mergify mergify bot merged commit 0aab3df into main Feb 3, 2023
@mergify mergify bot deleted the fix-bad-prevblk-in-test branch February 3, 2023 05:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rpc Area: Remote Procedure Call interfaces C-bug Category: This is a bug C-enhancement Category: This is an improvement C-testing Category: These are tests C-trivial Category: A trivial change that is not worth mentioning in the CHANGELOG I-integration-fail Continuous integration fails, including build and test failures
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ignore expected "proposal is not based on the current chain tip" error in getblocktemplate acceptance test
2 participants